VenBioGen: Difference between revisions

From FDHwiki

Revision as of 12:54, 19 November 2020

Introduction

In this project, we use generative models to produce fictional biographies of Venetians who lived before the 20th century. Our original motivation was to observe how such a model picks up on underlying relationships among Venetian actors in earlier centuries, as well as their relationships with people in the rest of the world. These relationships may not surface in every generated biography, but the model has the potential to offer fresh perspectives on historical tendencies.

Motivation

This project explores the intersection of the Humanities and Natural Language Processing (NLP), specifically text generation.

By generating fictional biographies of Venetians up to and including the 19th century, the model lets us, on the one hand, evaluate the current state, progress and limitations of text generation. On the other hand, and more importantly, it lets us explore whether the model learns underlying historical tendencies, most commonly in the form of international or inter-familial relationships.

Project Plan and Milestones

Milestone 1 (up to 1 October)

  • Brainstorm project ideas
  • Prepare presentation to present two ideas for the project

Milestone 2 (up to 14 November)

  • Finalize project planning
  • Build generative model, using Python with TensorFlow 1 (to guarantee compatibility with pre-built GPT-2 and BERT models)
    • Fine-tune GPT-2 model
    • Fine-tune BERT model
  • Connect both models to create generative pipeline
  • Examine the generated biographies and the acceptance threshold, and consider further improvements
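
The plan above mentions connecting the two models and an acceptance threshold. One way such a pipeline could be wired is sketched below: a generator proposes candidate biographies and a scorer accepts only those at or above the threshold. This is a minimal sketch under assumptions, not the project's implementation. The roles assigned to the models (GPT-2 as generator, BERT as scorer), the function names, the default threshold of 0.5 and the toy stand-ins are all hypothetical.

```python
from itertools import cycle
from typing import Callable, List


def generate_accepted(
    generate: Callable[[], str],
    score: Callable[[str], float],
    threshold: float = 0.5,
    n_wanted: int = 2,
    max_attempts: int = 20,
) -> List[str]:
    """Sample candidates until n_wanted of them score at or above threshold."""
    accepted: List[str] = []
    for _ in range(max_attempts):
        if len(accepted) >= n_wanted:
            break
        candidate = generate()
        if score(candidate) >= threshold:
            accepted.append(candidate)
    return accepted


# Hypothetical stand-ins: a real pipeline would call the fine-tuned GPT-2
# model in toy_generate and the fine-tuned BERT model in toy_score.
_CANDIDATES = cycle([
    "Born in Venice in 1650, he traded glass with merchants in Constantinople.",
    "lorem ipsum",
    "A Venetian painter, she worked in Padua before returning to Venice in 1712.",
])


def toy_generate() -> str:
    # Returns the next canned candidate instead of sampling from a model.
    return next(_CANDIDATES)


def toy_score(text: str) -> float:
    # Toy heuristic, not a trained model: reward texts that mention Venice.
    return 1.0 if "Venice" in text else 0.0


biographies = generate_accepted(toy_generate, toy_score, threshold=0.5, n_wanted=2)
for bio in biographies:
    print(bio)
```

Passing the generator and scorer as callables keeps the acceptance loop independent of the models, so the threshold can be examined and tuned without touching either one. The max_attempts cap prevents the loop from running forever when the threshold is set too high.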

Milestone 3 (up to 20 November)

  • Create frontend for interface
  • Create backend for interface, and connect it to the generative module
  • Finalize a RESTful API for communication between both ends

Milestone 4 (up to 5 December)

  • Add post-processing date adjustment module
  • Add historical realism enhancement module
  • Assess the feasibility and cost of outsourcing evaluation to a platform such as Amazon Mechanical Turk

Milestone 5 (up to 11 December)

  • Reach a decision regarding the evaluation procedure
  • Conduct model evaluation
  • Review code
  • Finalize documentation
  • Finalize report

Interface

Generative module

Data processing pipeline

Text generation

GPT2

BERT

Enhancements module

Date adjustments

Adding historical realism

Quality assessment

Results