Main Page: Difference between revisions

From FDHwiki
Revision as of 12:00, 24 September 2020

Welcome to the wiki of the course Foundation of Digital Humanities (DH-405).

Contact

Professor: Frédéric Kaplan

Assistant: Raphaël Barman

Rooms: Wednesday (CM1113) and Thursday (BC03)

Links

Summary

This course gives an introduction to the fundamental concepts and methods of the Digital Humanities, from both a theoretical and an applied point of view. The course introduces the Digital Humanities circle of processing and interpretation, from data acquisition to new understandings and services. The first part of the course presents the technical pipelines for digitising, analysing and modelling written documents (printed and handwritten), maps, photographs, and 3D objects and environments. The second part of the course details the principles of the most important algorithms, in particular deep learning approaches (for document analysis and image generation) and knowledge modelling (semantic web, ontologies, graph databases). The third part of the course focuses on platform management from the points of view of data, users and bots. Students will practise the skills they learn by engaging in a class-wide collective project.

Plan

Introduction

Week 1 : What are Digital Humanities?

16.09 (2h) Welcome and Introduction to the course

  • FDH-0 (1h) Introduction to the course and Digital Humanities, structure of the course. Introduction to Framapad with a simple exercise. Principle of collective note taking and its use in the course. State of the Digital Humanities at EPFL, in Switzerland and in Europe. Video recording link.

17.09 (4h) What are Digital Humanities? What is their object of study?

  • FDH-1-1 (1h) What Are Digital Humanities : Digital Humanities, Digital Studies, Humanities Computing and Studies about Digital Culture. Digital Humanism vs. Digital Humanities. Why digital methods tend to dissolve traditional disciplinary frontiers. A focus on practice. Translation issues. Video recording link.
  • FDH-1-2 (1h) Digital Humanities as a field : Big Data Digital Humanities vs Small Data Digital Humanities. The 3 circles. Exercise on relationship between elements in Digital Culture schema. Video recording link.
  • FDH-1-3 (2h) Big Data of the Past. Video recording link.

Week 2 : Patrimonial Capitalism and Great Commons

23.09 :

  • FDH 1-4 Patrimonial Capitalism (1h) Introduction to the DH circle linking the digitisation of sources, their processing, their analysis, visualisation and the creation of societal value (insight, culture) leading ultimately to the digitisation of new sources. Presentation of some sustainable DH circles (genealogy, image banks). Patrimonial capitalism and the risk of monopolistic companies. Parallelism with the race for sequencing the Human Genome. Video recording link.
  • FDH 1-5 The Commons (1h) Video recording link.

24.09

Part I : Pipelines

Week 3: Written Documents (2D) pipeline

30.09 Introduction to the Digitization Process (1h). Document digitization as a problem of conversion of dimensions. Digitization is logistic optimization. Alienation. Digitization on demand. (1h) Pipeline for written documents. Part I: Standards. Open Annotation Data Model. Shared Canvas. Part II: Regulated Representations, Homologous points, and the construction of hypothetical realities.

01.10 (4h) Project presentations. 5' per project, with a maximum of 3 slides. Fill out the group table before the course. You can find a group using the framapad (https://annuel2.framapad.org/p/fdh).

Week 4: Projects

07.10 (2h) Presentation of the Gallica wrapper.

08.10 (4h) Presentation of BNF XML ALTO and PAGE XML, Transkribus and VGG Image Annotator (VIA). Work on projects.

Week 5: Image (2D) Pipeline

14.10 (2h) Pipeline for artwork photographs. Image banks and photo archives. Photography as documentation. Scanning techniques for photographs. Segmentation. Visual similarity vs visual connections.

15.10 (4h) Introduction to deep learning analysis of image similarity. Exercises with the Replica database and search engine. 5 min presentation. Work on projects.

Week 6: Maps (2D) Pipeline

21.10 (2h) What are cartographic documents? Exercise on ancient maps. History of cartography. Odometry. Triangulation. Coordinate systems. Metric systems. Projection. Cadaster. Aerial photography.

22.10 (1-2h) Introduction to GIS. Points, lines, polygons. Coordinate systems. Georeferencing exercise. (2-3h) Work on projects.
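
As a rough illustration of the three vector primitives named above, the following sketch stores a point, a line and a polygon as plain coordinate tuples and measures them. The coordinates, variable names and helper functions are hypothetical and not taken from the course material; the sketch also assumes planar (projected) coordinates in metres, since these formulas do not apply directly to latitude/longitude pairs.

```python
import math

# A point is an (x, y) pair in a projected, planar coordinate system (assumption).
Point = tuple[float, float]


def line_length(line: list[Point]) -> float:
    """Sum of the straight segment lengths along a polyline."""
    return sum(math.dist(a, b) for a, b in zip(line, line[1:]))


def polygon_area(ring: list[Point]) -> float:
    """Area of a simple (non self-intersecting) polygon via the shoelace formula."""
    twice_area = 0.0
    # Pair every vertex with the next one, wrapping around to close the ring.
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical features, invented for illustration.
well: Point = (2.0, 1.0)
street = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]
parcel = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]

print(line_length(street))   # length of the street in metres
print(polygon_area(parcel))  # area of the parcel in square metres
```

A GIS stores the same three primitives together with a coordinate reference system, which is what makes measurements like these meaningful.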

Week 7: Object/Environment (3D) Pipeline

26.10 (2h) Pipeline for 3D spaces. Modelling vs Sampling. Part I: Modelling. Photogrammetry. SketchUp demo. Model-based procedural methods. Architectural grammars. Class I and Class II elements. The question of realism. Part II: Sampling. Principles of photogrammetry.

27.10 (2h) Photogrammetry tutorial. Video to 3D pipelines. 3D models from past years. (2h) Introduction to the Mirror Worlds concept.

Part II : Algorithms

Week 8 : Deep Learning algorithms

02.11 (2h) A panorama of deep learning methods. Successes. Fundamental principles. Neurons. Receptive fields. Hierarchical representation / texture. Gradient descent. Credit Assignment Path. Most important architectures. Convolutional neural networks. Recurrent neural networks. Siamese networks. Word2Vec. Generative Adversarial Networks. Style transfer. Importance of deep learning for Digital Humanities. Can deep learning networks and Big Data of the Past lead to new forms of Artificial Intelligence?
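
Gradient descent, listed among the fundamental principles above, can be sketched in a few lines. This is a minimal illustration on a one-dimensional quadratic and not the course's own material: the function name, the learning rate and the example function f(x) = (x - 3)^2 are all assumptions chosen for clarity.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from `start`."""
    x = start
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x


# Hypothetical objective: f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
# Its unique minimum is at x = 3.
minimum = gradient_descent(lambda x: 2.0 * (x - 3.0), start=0.0)
print(minimum)
```

Neural networks apply the same update rule to millions of parameters at once, with the gradient obtained by backpropagation rather than written by hand.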

03.11 (2h) Computer vision and deep learning tutorial (Sofia). Conditional Random Fields (CRF) tutorial. Both are available on this GitHub repository.

Week 9 : Project

10.11 (2h) Midterm presentation (10%)

Time | First student name | Second student name | Third student name | Project name
10:15-10:30 | - | - | - | -
10:30-10:45 | - | - | - | -
10:45-11:00 | - | - | - | -
11:15-11:30 | - | - | - | -
11:30-11:45 | - | - | - | -


11.11 (4h) Project development

Week 10 : Knowledge modelling

18.11 (2h) The beauty of knowledge modelling. Tables. Databases. Semantic web, ontologies, URI, RDF, CIDOC-CRM. How to encode events, places and influence. Metaknowledge. The Hypergraph.

19.11 (4h) (a) Exercise in semantic modelling and inference (Maud). Graph writing. Presentation of some interesting ontologies: SKOS, VIAF, GeoNames, TGN, W3C Time Ontology. SPARQL and SPARQL endpoints. Exercise on SPARQL endpoints: DBpedia [1], Talk of Europe [2], Persée [3], Le Temps Archive [4], available on this GitHub repository. (b) Work on projects.
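
The core idea behind a SPARQL query, matching a triple pattern against a graph of subject-predicate-object statements, can be sketched without any endpoint. The triples, the `ex:` prefix and the `match` helper below are hypothetical and invented for illustration; a real endpoint such as DBpedia would be queried with SPARQL itself rather than with this toy function.

```python
# Hypothetical RDF-like triples (subject, predicate, object); prefixes are not resolved.
TRIPLES = [
    ("ex:Venice", "rdf:type", "ex:City"),
    ("ex:Venice", "ex:locatedIn", "ex:Italy"),
    ("ex:Paris", "rdf:type", "ex:City"),
    ("ex:Paris", "ex:locatedIn", "ex:France"),
]


def match(pattern, triples=TRIPLES):
    """Return the triples that fit the pattern; None acts as a variable (wildcard)."""
    return [
        triple
        for triple in triples
        if all(p is None or p == value for p, value in zip(pattern, triple))
    ]


# Roughly what `SELECT ?s WHERE { ?s rdf:type ex:City }` asks for in SPARQL.
cities = [subject for subject, _, _ in match((None, "rdf:type", "ex:City"))]
print(cities)
```

SPARQL generalises this by joining several such patterns on shared variables.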

-- Project plan and milestones deliverable on the Wikipage of each project (10%)

Part III : Platform management

Week 11 : Work on Project

25.11 (2h) Work on project

26.11 (4h) Work on project

Week 12 : Data, User and Bot Management

02.12 (2h) Data Management : FAIR principles, Creative Commons, data management models, sustainability, right to be forgotten. Management of uncertainty, incoherence and errors. Iconographic principle of precaution.

03.12 (2h) User Management : Part I: Persona. Part II: Motivation and onboarding dynamics. Three case studies: Twitter, Quora, Wikipedia. Part III: "Wisdom" of the crowds. Collectivism vs Liberalism. Open source as a form of liberalism for engineering. The ambiguity of forking. Part IV: The "power" of the crowds. Mechanical Turk. Crowdflower. Crowdfunding. (2h) Bot Management : Three case studies on bot management: Twitter, Wikipedia, Google.

Week 13 : Work on projects

09.12 (2h) Work on project

10.12 (4h) Work on project

-- Deadline for GitHub repository (10%)

-- Deadline for Report writing (40%)

Week 14 : Exam

16.12 (2h) Final project presentation (20%)

17.12 (2h) ----


Resources

Assessment and grading grid

  • 2 oral presentations (30%)
    • 1 midterm presentation of the project (10%)
    • 1 final discussing the project result (20%)
  • Written deliverables (Wiki writing) (40%)
  • Quality of the project (30%)

2 collective oral presentations (30%)

Midterm presenting the project planning (10%)

10' max presentation + 5' questions

Grading grid :

  • The presentation contains a plan (4)
  • + 0.5 The slides are clear and well presented
  • + 0.5 The oral presentation is dynamic and fluid
  • + 0.5 The plan is realistic
  • + 0.5 The students answer the questions well

Final discussing the project result (20%)

10-15' for presentation and 5-10' for questions

Grading grid :

  • The presentation presents the results of the project (4)
  • + 0.5 The slides are clear and well presented
  • + 0.5 The oral presentation is dynamic and fluid
  • + 0.5 The results are well discussed
  • + 0.5 The students answer the questions well

Written deliverables (Wiki writing) (40%)

  • Project plan and milestones (10%) (>300 words)
  • Historical introduction to the source(s) (5%) (>200 words)
  • Detailed description of the methods (10%) (>500 words)
  • Quality assessment (10%) (>300 words)
  • Motivation and description of the website (5%) (>200 words)

Production (30%)

  • Quality of the realisation 20%
  • Code deliverable on github 10%
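
The arithmetic implied by the grids and weights above can be sketched as follows. The weights and the "4 plus 0.5 per criterion" rule come from this page; everything else is an assumption made for illustration: that the base requirement of each presentation is met, that written deliverables and production are graded on the same scale, and that the components are combined as a simple weighted sum. The function names and example grades are hypothetical.

```python
# Weights as listed on this page (they sum to 100%).
WEIGHTS = {
    "midterm_presentation": 0.10,
    "final_presentation": 0.20,
    "written_deliverables": 0.40,
    "production": 0.30,
}


def presentation_grade(bonus_criteria_met: int) -> float:
    """Base grade of 4 plus 0.5 for each of the four bonus criteria met.

    Assumes the base requirement (a plan, or the project results) is present.
    """
    if not 0 <= bonus_criteria_met <= 4:
        raise ValueError("there are exactly four bonus criteria")
    return 4.0 + 0.5 * bonus_criteria_met


def weighted_grade(grades: dict[str, float]) -> float:
    """Weighted sum of the component grades (an assumed combination rule)."""
    return sum(WEIGHTS[name] * grades[name] for name in WEIGHTS)


# Hypothetical grades for one group.
example = {
    "midterm_presentation": presentation_grade(3),
    "final_presentation": presentation_grade(4),
    "written_deliverables": 5.0,
    "production": 5.0,
}
print(weighted_grade(example))
```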