Main Page
Revision as of 08:39, 16 September 2021
Welcome to the wiki of the course Foundation of Digital Humanities (DH-405).
Contact
Professor: Frédéric Kaplan
Assistants: Paul Guhennec, Didier Dupertuis, Beatrice Vaienti
Rooms: Wednesday (CM1110) and Thursday (BC03)
Links
Summary
This course gives an introduction to the fundamental concepts and methods of the Digital Humanities, both from a theoretical and applied point of view. The course introduces the Digital Humanities circle of processing and interpretation, from data acquisition to new understandings and services. The first part of the course presents the technical pipelines for digitising, analysing and modelling written documents (printed and handwritten), maps, photographs and 3d objects and environments. The second part of the course details the principles of the most important algorithms in particular deep learning approaches (for document analysis and image generation) and knowledge modelling (semantic web, ontologies, graph databases). The third part of the course focuses on platform management from the points of view of data, users and bots. Students will practise the skills they learn by engaging in a class-wide collective project.
Plan
Part I : Concepts
Week 1 : What are Digital Humanities?
22.09 (2h) Welcome and Introduction to the course
- FDH-0 (1h) Introduction to the course and Digital Humanities, structure of the course. Introduction to Framapad with a simple exercise. Principle of collective note taking and its use in the course. State of the Digital Humanities at EPFL, in Switzerland and in Europe. Video recording link (2020).
23.09 (4h) What are Digital Humanities? What is their object of study?
- FDH-1-1 (1h) What Are Digital Humanities : Digital Humanities, Digital Studies, Humanities Computing and Studies about Digital Culture. Digital Humanism vs. Digital Humanities. Why digital methods tend to dissolve traditional disciplinary frontiers. A focus on practice. Translation issues. Video recording link (2020).
- FDH-1-2 (1h) Digital Humanities as a field : Big Data Digital Humanities vs Small Data Digital Humanities. The 3 circles. Exercise on relationship between elements in Digital Culture schema. Video recording link.
- FDH-1-3 (2h) Big Data of the Past. Data acceleration regime. Inferred Patterns. Redocumentation. Fictional Spaces. Video recording link (2020).
Week 2 : Patrimonial Capitalism and Commons
29.09 :
- FDH 1-4 Patrimonial Capitalism (1h) Introduction to the DH circle linking the digitisation of sources, their processing, their analysis, visualisation and the creation of societal value (insight, culture) leading ultimately to the digitisation of new sources. Presentation of some sustainable DH circles (genealogy, image banks). Patrimonial capitalism and the risk of monopolistic companies. Parallelism with the race for sequencing the Human Genome. Video recording link.
- FDH 1-5 The Commons (1h) What are the commons? What is the public domain? History and evolution. Copyright overreach. Frontal collision. Governing with the commons. Video recording link.
30.09
- FDH 1-6 Anatomy of a large-scale project (1h) Venice Time Machine. European Time Machine. Video recording link (pre-recorded). Video recording link (live)
- Past projects presentation. Video recording link.
- FDH 1-7 Projects. See also the Projects page. Video recording link (pre-recorded). Video recording link (live).
Part II : Pipelines
Week 3: Digitisation
06.10
- FDH 2-1 Introduction to the Digitization Process. The Story of Google books. Document digitization as a problem of conversion of dimensions. Digitization is logistic optimization. Alienation. Digitization on demand. Video recording link
07.10
- (2h) FDH 2-2 Document Structure. General presentation of the pipeline. Content and Structure. Circulation. Standards. Open Annotation Data Model. Shared Canvas. IIIF. Synchronic patterns and diachronic homology. Video recording link
- (2h) Projects presentations. 5' per project with max 3 slides. Fill out the group table before the course. You can find a group using the framapad. Video recording link
Week 4: Writing Systems and Text Encoding
13.10 (2h) FDH 2-3 : Writing Systems Video recording link
14.10
- (2h) FDH 2-4 : Text Encoding Video recording link
- (2h) Work and support on projects
Week 5: Text Processing and Understanding
20.10 (2h) FDH 2-5 Text Processing : Diachronic and synchronic analysis. n-grams, TF-IDF, Topic Modeling, Word Space Models and Word embeddings. Video recording link.
21.10 (2h) FDH 2-6 Text Understanding : Close, surface, distant and machine reading, Information extraction, Named Entities, Resources, Large-Scale Projects. Video recording link. Work on project (2h).
Week 6: Images
21.10 (2h) FDH 2-7 : Image systems. Video recording link
22.10 (2h) FDH 2-8 : Image processing Video recording link Time machine search engine (2h) Work on project
(FDH 2-9 : Image understanding not done this year)
Week 7: Maps
26.10 (2h) FDH-2-10 Map systems Video recording link
27.10 (2h) FDH-2-11 Map processing (2h) Video recording link. Work on project
Week 8: Architecture and Objects
03.11 (2h) FDH-2-12: Architecture and Object Systems. Video recording link
04.11 (2h) FDH-2-13: Architecture and Object Processing: Modelling vs Sampling : Model-based Procedural methods. Architectural grammars. Class I and Class II elements. The question of realism. Video recording link. (2h) Work on project
Part III : Knowledge modelling and processing
Week 9 : Semantic modelling
11.11
- (1h) FDH-3-0 Summary of the concepts covered so far and introduction to Part III Video recording link
- (1h) FDH-3-1 Semantic modelling. RDF, Metaknowledge Video recording link
12.11 Midterm presentation (10%)
Time | Project name | Group nº |
---|---|---|
10:20-10:35 | Biography Generation | Group 6 |
10:35-10:50 | Paintings / Photos geolocalisation | Group 7 |
10:50-11:05 | Procedural Venice | Group 8 |
11:05-11:20 | Austrian cadastral map | Group 9 |
11:20-11:35 | Rolandi Librettos | Group 10 |
Time | Project name | Group nº |
---|---|---|
13:20-13:35 | Biography Generation | Group 1 |
13:35-13:50 | Opera Rolandi archive | Group 2 |
13:50-14:05 | Terzani online museum | Group 3 |
14:05-14:20 | Photorealistic Venice (GANs) | Group 4 |
14:20-14:35 | Deciphering Venetian handwriting | Group 5 |
Week 10 : Ontologies, Constraints and Rule systems
18.11 (2h) FDH 3-2 Universal Ontologies Video recording link
19.11 (2h) FDH 3-3 Rule systems, simulations and parallel worlds Video recording link
-- Project plan and milestones deliverable on the Wikipage of each project (10%)
Week 11 : Non conceptual knowledge systems and topological data science
25.11 (2h) FDH 3-4 Non conceptual knowledge systems Video recording link (Part 1) Video recording link (Part 2)
26.11 (2h) FDH 3-5 Topological data science Video recording link
Part IV : Platforms
Week 12 : Data, User and Bot Management
02.12 (2h) Data Management : FAIR principles, Creative Commons, Data Management models, Sustainability, Right to be Forgotten. Management of uncertainty, incoherence and errors. Iconographic principle of precaution. Video recording link
03.12 (2h) User Management : Part I: Persona. Part II: Motivation and onboarding dynamics. Three case studies: Twitter. Quora. Wikipedia. Part III: "Wisdom" of the crowds. Collectivism vs Liberalism. Open source as a form of liberalism for engineering. The ambiguity of the fork. Part IV: The "power" of the crowds. Mechanical Turk. Crowdflower. Crowdfunding. Video recording link
(2h) Bot Management : Three case studies on bot management : Twitter, Wikipedia, Google. Video recording link
Week 13 : Work on projects
09.12 (2h) Work on project
10.12 (4h) Work on project
Week 14 : Exam
14.12 (5pm)
-- Deadline for GitHub repository (10%)
-- Deadline for Report writing (40%)
16.12 (2h) Final project presentation Part 1 (20%)
Time | Project name | Group nº |
---|---|---|
13:15-13:40 | VenBioGen | Group 1 |
13:40-14:05 | Procedural Venice | Group 8 |
14:05-14:30 | Rolandi Librettos | Group 10 |
14:30-14:55 | Photorealistic rendering of painting + Venice Underwater | Group 4 |
17.12 (2h) Final project presentation Part 2 (20%)
Time | Project name | Group nº |
---|---|---|
10:15-10:40 | Opera Rolandi archive | Group 2 |
10:40-11:05 | Paintings / Photos geolocalisation | Group 7 |
11:05-11:30 | Deciphering Venetian handwriting | Group 5 |
11:30-11:55 | WikiBio | Group 6 |
Time | Project name | Group nº |
---|---|---|
13:15-13:40 | Austrian cadastral map | Group 9 |
13:40-14:05 | Terzani online museum | Group 3 |
Resources
Assessment and Notation grid
- 2 oral presentations (30%)
- 1 midterm presentation of the project (10%)
- 1 final presentation discussing the project results (20%)
- Written deliverables (Wiki writing) (40%)
- Quality of the project (30%)
2 collective oral presentations (30%)
Midterm presentation of the project plan (10%)
10' max presentation + 5' questions
Notation grid :
- The presentation contains a plan (4)
- + 0.5 The slides are clear and well presented
- + 0.5 The oral presentation is dynamic and fluid
- + 0.5 The plan is realistic
- + 0.5 The students answer the questions well
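As an illustration only, the grid above can be read as a base score plus a fixed bonus per satisfied criterion. The sketch below assumes that reading (a base of 4 once the required content is present, +0.5 for each of the four criteria, so a maximum of 6); the function name `grid_score` and the example values are hypothetical, not part of the course rules.

```python
def grid_score(criteria_met, base=4.0, bonus=0.5):
    """Return the base score plus a fixed bonus per satisfied criterion.

    Assumption: the "(4)" in the grid is a base score and each "+ 0.5"
    item is awarded independently of the others.
    """
    return base + bonus * sum(bool(c) for c in criteria_met)


# Hypothetical midterm: clear slides, fluid delivery, good answers,
# but a plan judged unrealistic (third criterion not met).
midterm = grid_score([True, True, False, True])
print(midterm)  # 5.5
```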
Final presentation discussing the project results (20%)
10-15' for presentation and 5-10' for questions
Notation grid :
- The presentation presents the results of the project (4)
- + 0.5 The slides are clear and well presented
- + 0.5 The oral presentation is dynamic and fluid
- + 0.5 The results are well discussed
- + 0.5 The students answer the questions well
Written deliverables (Wiki writing) (40%)
- Project plan and milestones (10%) (>300 words)
- Motivation and description of the deliverables (10%) (>300 words)
- Detailed description of the methods (10%) (>500 words)
- Quality assessment and discussion of limitations (10%) (>300 words)
The indicated word counts are minimums. The detailed description of the methods, in particular, can be extended if needed.
Production (30%)
- Quality of the realisation 20%
- Code deliverable on github 10%
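A minimal sketch of how the percentages above could combine into one grade, assuming the final grade is a plain weighted average of component scores on a common scale. The page only states the weights; the averaging rule, the dictionary keys and the example scores below are assumptions made for illustration.

```python
# Weights taken from the assessment breakdown above.
WEIGHTS = {
    "midterm_presentation": 0.10,
    "final_presentation": 0.20,
    "written_deliverables": 0.40,
    "production": 0.30,
}


def course_grade(scores):
    """Weighted average of component scores (assumed combination rule)."""
    if abs(sum(WEIGHTS.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 100%")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical component scores, all on the same scale.
example = {
    "midterm_presentation": 5.5,
    "final_presentation": 5.0,
    "written_deliverables": 5.0,
    "production": 5.5,
}
print(round(course_grade(example), 2))  # 5.2
```

The check on the weights confirms that the four components listed on this page add up to 100%.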