Generation of Textual Description

From FDHwiki
=Introduction=
 
=Motivation=
 
=Deliverables=
 
= Project Timeline & Milestones =
 
{|class="wikitable"
! style="text-align:center;"|Timeframe
! Task
! Completion
|-
| align="center" |Week 4
|
* Exploring the dataset
* Exploring in-context learning models for text summarization
| align="center" | ✓
|-
| align="center" |Week 5
|
* Identify patterns and edge cases in the dataset (e.g., missing fields, "odd" values)
* Define summarization formats accordingly, to be used for in-context learning
* Explore the connection between the Catastici and Sommarioni datasets
| align="center" | ✓
|-
| align="center" |Week 6
|
* Refine summarization formats
* Construct a pipeline connecting translation generation, summarization and validation
| align="center" | ✓
|-
| align="center" |Week 7
|
* Evaluate summarization results
| align="center" |
|-
| align="center" |Week 8
|
* TBD
| align="center" |
|-
| align="center" |Week 9
|
* TBD
| align="center" |
|-
| align="center" |Week 10
|
* TBD
| align="center" |
|-
| align="center" |Week 11
|
* TBD
| align="center" |
|-
| align="center" |Week 12
|
* TBD
| align="center" |
|-
| align="center" |Week 13
|
* TBD
| align="center" |
|-
| align="center" |Week 14
|
* TBD
| align="center" |
|}
 
=Methodology=
==Generating Summarization Formats for In-context Learning==
 
 
==Choosing NLP Model for Translation==
We tested several state-of-the-art (SOTA) models for translation. The results were often unsatisfactory because the source text is written in an archaic form of the language that is specific to the Venetian dialect. Preliminary results on 10 examples:
 
{| class="wikitable" style="background-color:#ffffff; color:#383838;"
|-
! Original text
! mBART
! Google Translate
! GPT-4
|- style="vertical-align:middle;"
| style="background-color:#dae8fc;" | casa e bottega da barbier
| and a barber pole
| House and Bottega da Barbier
| House and barber shop
|-
| style="background-color:#dae8fc;" | casa
| style="vertical-align:middle;" | home
| house
| style="vertical-align:middle;" | House
|-
| style="background-color:#dae8fc;" | bottega da strazariol
| style="vertical-align:middle;" | a strawberry bottle
| Bottega da Strazariol
| style="vertical-align:middle;" | Rag dealer's shop
|-
| style="background-color:#dae8fc;" | casa e bottega da tentor
| style="vertical-align:middle;" | home and pushbutton
| House and Bottega da Tentor
| style="vertical-align:middle;" | House and dyer’s shop
|-
| style="background-color:#dae8fc;" | magazen
| style="vertical-align:middle;" | warehouse
| magazen
| style="vertical-align:middle;" | Warehouse
|-
| style="background-color:#dae8fc;" | mezà
| style="vertical-align:middle;" | Eight
| mezà
| style="vertical-align:middle;" | Halfway house or mezzanine level
|-
| style="background-color:#dae8fc;" | cas vuota
| style="vertical-align:middle;" | empty house
| Cas empty
| style="vertical-align:middle;" | Empty house
|-
| style="background-color:#dae8fc;" | casa a pepian
| style="vertical-align:middle;" | the pepian house
| House in Pepian
| style="vertical-align:middle;" | House on the ground floor
|-
| style="background-color:#dae8fc;" | bottega da confetti
| style="vertical-align:middle;" | Packaging bottle
| Bottega da sugaredi
| style="vertical-align:middle;" | Confectioner’s shop
|-
| style="background-color:#dae8fc;" | casa e bottega
| style="vertical-align:middle;" | home and doorbell
| House and Bottega
| style="vertical-align:middle;" | House and shop
|}
Based on these preliminary results, we will use GPT-4 for translation.
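
One rough way to quantify part of the comparison above is to measure how many source words each system leaves untranslated. The sketch below is only an illustration and not part of the project pipeline: the metric and the helper names (<code>tokens</code>, <code>leak_ratio</code>, <code>mean_leak</code>) are assumptions introduced here, and the rows are five of the ten examples copied from the table.

```python
import re

# Rows copied from the comparison table: (source, mBART, Google Translate, GPT-4).
ROWS = [
    ("casa e bottega da barbier", "and a barber pole",
     "House and Bottega da Barbier", "House and barber shop"),
    ("bottega da strazariol", "a strawberry bottle",
     "Bottega da Strazariol", "Rag dealer's shop"),
    ("casa e bottega da tentor", "home and pushbutton",
     "House and Bottega da Tentor", "House and dyer’s shop"),
    ("magazen", "warehouse", "magazen", "Warehouse"),
    ("mezà", "Eight", "mezà", "Halfway house or mezzanine level"),
]
SYSTEMS = ["mBART", "Google Translate", "GPT-4"]


def tokens(text):
    """Lowercase word tokens (Unicode-aware, so accented words stay whole)."""
    return re.findall(r"\w+", text.lower())


def leak_ratio(source, translation):
    """Fraction of source tokens that reappear unchanged in the translation."""
    src = tokens(source)
    out = set(tokens(translation))
    return sum(tok in out for tok in src) / len(src)


def mean_leak(rows, system_index):
    """Average leak ratio of one system (0 = mBART, 1 = Google, 2 = GPT-4)."""
    ratios = [leak_ratio(row[0], row[system_index + 1]) for row in rows]
    return sum(ratios) / len(ratios)


for i, name in enumerate(SYSTEMS):
    print(f"{name}: {mean_leak(ROWS, i):.2f}")
```

This metric only detects words that were copied through without translation, which is the main failure mode of Google Translate in the table. It says nothing about correctness: mBART leaves no source words in place yet produces wrong translations such as "a strawberry bottle", so a check like this can complement manual review but not replace it.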
 
=Results=
 
=Limitations and further work=
 
=Conclusion=
 
=Credits=

Latest revision as of 15:46, 2 November 2024
