Main Page

From FDHwiki

Revision as of 20:21, 31 October 2017

Welcome to the wiki of the course Foundation of Digital Humanities (DH-405).

Contact

Professor: Frédéric Kaplan

Assistants: Vincent Buntinx and Lia Costiner

Rooms: Wednesday (CMN1113) and Friday (CM1104)

Links

Summary

This course gives an introduction to the fundamental concepts and methods of the Digital Humanities, both from a theoretical and applied point of view. The course introduces the Digital Humanities circle of processing and interpretation, from data acquisition to new understandings and services. The first part of the course presents the technical pipelines for digitising, analysing and modelling written documents (printed and handwritten), maps, photographs and 3D objects and environments. The second part of the course details the principles of the most important algorithms for document processing (layout analysis, deep learning methods), knowledge modelling (semantic web, ontologies, graph databases), and generative models and simulation (rule-based inference, deep learning based generation). The third part of the course focuses on platform management from the points of view of data, users and bots. Students will practise the skills they learn by directly analysing and interpreting cultural datasets from ongoing large-scale research projects (Venice Time Machine, Swiss newspaper archives).

Plan

Introduction

Week 1 : Structural tensions in Digital Humanities

20.09 (2h) Introduction to the course and Digital Humanities, structure of the course. Introduction to Framapad with a simple exercise. Principle of collective note taking and its use in the course. State of the Digital Humanities at EPFL, in Switzerland and in Europe. Structuring tensions 1: Digital Humanities, Digital Studies, Humanities Computing and Studies about Digital Culture. Digital Humanism vs. Digital Humanities. Why digital methods tend to dissolve traditional disciplinary frontiers. A focus on practice. Translation issues.

22.09 (2h) (a) Structuring tensions 2: Big Data Digital Humanities vs Small Data Digital Humanities. The 3 circles. Exercise on the relationships between elements in the Digital Culture schema. (2h) (b) Practical session: Introduction to MediaWiki. Objective: Learning the basic syntax of MediaWiki. Getting a first experience of collaborative editing. Learning to write from a neutral point of view. Creation of the articles by the students, followed by peer review by another student (enriching, completing references). Each student picks a DH person and a DH concept and writes a Wiki page for each (30 min + 30 min). Each student then chooses another person and another concept among the ones already covered and enriches them with complementary information and references (20 min + 20 min).

(25.09 2pm : Experiment with Digital Art History interface INN116)

Week 2 : Patrimonial capitalism and common goods

27.09 (1h) Introduction to the DH circle linking the digitisation of sources, their processing, their analysis, visualisation and the creation of societal value (insight, culture) leading ultimately to the digitisation of new sources. Presentation of some sustainable DH circles (genealogy, image banks). Patrimonial capitalism and the risk of monopolistic companies. Parallelism with the race for sequencing the Human Genome. Introduction to the Time Machine FET Flagship and mutualised infrastructure approach. (1h) General presentation of the Time Machine pipeline at the Datasquare / ArtLab pavilion.

29.09 (4h) Forum ArtTech (Rolex Learning Center). Mining Big Data of the Past. Patrimonial capitalism and business opportunities. Examples of FamilySearch, MyHeritage, Corbis.

Part I : Pipelines

Week 3: Documents pipeline

04.10 (2h) The Digitisation Process and Pipelines. What is a document? What is a digital image? Exercise on book scanner typologies. Document digitisation as a problem of conversion of dimensions. Digitisation is logistic optimisation. Alienation. Digitisation on demand. Fedorov's notion of optimal experiment.

06.10 (2h) Pipeline for Written documents. Part I: Standards. Open Annotation Data Model. Shared Canvas. Part II: Regulated Representations. DHCanvas. (2h) Presentation of the Projects. Presentation of the main databases used in the course and of the ClioWire platform.

Week 4: Artworks Pipeline

11.10 (2h) Pipeline for Artwork photographs. Image banks and photo archives. Scanning techniques for photographs. Segmentation. Visual similarity vs visual connections.

13.10 (4h) Introduction to deep learning approaches. Exercises with the Replica database and search engine. 5 min presentation. Work on projects. Formation of the groups.

Week 5: Maps Pipeline

18.10 (2h) What are cartographic documents? Exercise on ancient maps. History of cartography. Odometry. Triangulation. Coordinate systems. Metric systems. Projection. Cadaster. Aerial photography.

20.10 (4h) (a) Introduction to GIS. Points, Lines, Polygons. Coordinate Systems. QGIS Hands On. Exercise on Venetian cadaster (Bastien). (b) Oral presentation of project plan

Week 6: 3D Pipeline

25.10 (2h) Pipeline for 3D spaces. Modelling vs Sampling : Part I : Modelling. Photogrammetry. SketchUp demo. Model-based Procedural methods. Architectural grammars. Class I and Class II elements. The question of realism.

27.10 (4h) Part II : Sampling. Photogrammetric tutorial (Nils)

Part II : Algorithms

Week 7 : Deep Learning algorithms

01.11 (2h) A panorama of Deep learning methods

03.11 (4h) (a) Machine vision tutorial (Benoit, Sofia). Introduction to Jupyter. Deep learning in practice.

-- Deadline Bibliography and discussion of the state of the art (10%)

08.11 (2h) Algorithms for Knowledge modelling : Semantic web, ontologies, graph database, homologous points, disambiguation.

10.11 (4h) (a) Exercise in semantic modelling and inference (Maud) (b) Midterm presentation with project planning and prior art (10%)

15.11 (2h) Algorithms for Generative models and simulation : Rule-based inference. Discussion on new regimes of visibility.

17.11 (4h) Project development

-- Deadline for individual essay (30%)

Part III : Platform management

22.11 (2h) Data Management  : Computing infrastructure, Data Management models, Sustainability. Apps. Management of uncertainty, incoherence and errors. Iconographic principle of precaution. Example of Wikipedia and Europeana.

24.11 (4h) VENICE

29.11 (2h) User Management : Representation, Rights, Traceability, Vandalism, Motivation, Negotiation spaces. Right to be forgotten.

01.12 (4h) Testing phase

-- Deadline peer-grading (5%)

06.12 (2h) Bot Management : Versioning. Open source repositories.

08.12 (4h) Testing phase and report writing

13.12 (2h) Report writing

-- Deadline for GitHub repository (5%)

-- Deadline Detailed description of the methods (10%) (>500 words)

-- Deadline Quantitative analysis of the performances (10%) (>300 words)


15.12 (4h) Final project presentation (20%) and Discussion of the collective paper

References

Key Figures

Identity map (Cardon)

Maps for Big Data Digital Humanities (Kaplan)

Semiotic Triangle (McCloud)

Image similarity

Uncanny Valley (Mori)

Databases

(Page to be created indicating characteristics, quantity and copyright)

Le Temps Archives

Cini Photoarchive

Venice Time Machine documents

Scans of academic books and journals about Venice

Linked Book

Assessment and Notation grid

The final grade is based on 65% collective work and 35% individual work

a) 2 collective oral presentations

  • 1 midterm presenting the project planning and prior art (10%)
  • 1 final discussing the project result (20%)

b) Collective Written deliverables (Wiki writing)

  • Bibliography and discussion of the state of the art (10%) (>300 words)
  • Detailed description of the methods (10%) (>500 words)
  • Quantitative analysis of the performances (10%) (>300 words)

c) Collective Code deliverable

  • Organisation of the GitHub repository (5%)

d) Individual essay (Word or Open Office)

  • Introduction/Motivation: on the relevance of ClioWire in the Digital Humanities landscape and beyond (15%) (>500 words)
  • Discussion and Future Work (15%) (>500 words)
  • Peer-grading (5%)
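
As a quick sanity check of the grid above, the following minimal Python sketch adds up the listed weights and shows how a final grade could be computed as a weighted average. The weights are copied from the grid. The dictionary names, the `final_grade` helper, and the choice of a plain weighted average are assumptions made for illustration only, since the page does not say how component grades are combined.

```python
# Weights in percent, copied from the assessment grid above.
COLLECTIVE = {
    "midterm presentation": 10,
    "final presentation": 20,
    "bibliography and state of the art": 10,
    "description of the methods": 10,
    "quantitative analysis": 10,
    "GitHub repository": 5,
}
INDIVIDUAL = {
    "essay: introduction/motivation": 15,
    "essay: discussion and future work": 15,
    "peer-grading": 5,
}


def final_grade(scores):
    """Hypothetical helper: weighted average of per-component scores.

    `scores` maps each component name to a grade on any common scale.
    The aggregation rule is an assumption, not stated on this page.
    """
    weights = {**COLLECTIVE, **INDIVIDUAL}
    total_weight = sum(weights.values())
    return sum(weights[name] * scores[name] for name in weights) / total_weight


if __name__ == "__main__":
    print("collective:", sum(COLLECTIVE.values()), "%")
    print("individual:", sum(INDIVIDUAL.values()), "%")
```

With these weights the collective components add up to 65% and the individual ones to 35%, which matches the split stated at the top of the grid.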