Extracting Toponyms from Maps of Jerusalem

From FDHwiki

Revision as of 14:45, 5 December 2023

Project Timeline

Timeframe Task Completion
Week 4
  • Finalize and present project proposals.
    • Toponym extraction project selected.
Week 5
  • Survey state-of-the-art (SOTA) toponym extraction tools.
Week 6
  • Port MapKurator's Spotter tool and model weights to a Windows-based Python environment.
  • Select two (later four) maps to use when implementing, evaluating, and fine-tuning MapKurator's model.
Week 7
  • Create ground truth labels for first map with VIA's online interface.
Week 8
  • Create ground truth labels for second map.
  • Implement 1:1-matched precision and recall via IoU (geometry) and normalized Levenshtein (text).
  • Calculate baseline accuracy statistics.
Week 9
  • Implement multi-layer pyramid application of MapKurator's Spotter.
Week 10
  • Create ground truth labels for third map.
  • Implement toponym rectification and amalgamation on pyramid-derived toponyms.
Week 11
  • Calculate pyramid accuracy statistics.
  • Fine-tune toponym rectification and amalgamation.
  • Deliver midterm presentation.
Week 12
  • Launch Wiki.
  • Group words into toponyms via polygon size and location.
  • Apply NLP tools to correct toponyms based on MapKurator strategy.
Week 13
  • Create ground truth labels for fourth map.
  • Calculate final accuracy statistics.
  • Hierarchize final toponyms and develop Voronoi map.
Week 14
  • Prototype toponym-disagreement visualizer.
  • Finalize Wiki and deliver presentation.
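
The Week 8 evaluation item above (1:1-matched precision and recall via IoU for geometry and normalized Levenshtein for text) can be illustrated with a minimal sketch. This is not the project's actual implementation: the `Label` class, the greedy matching strategy, the case-insensitive text comparison, and the thresholds `iou_min` and `text_min` are all assumptions made for illustration, and axis-aligned boxes stand in for the polygon labels to keep the example dependency-free.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


@dataclass
class Label:
    """Hypothetical container for one detected or ground-truth word."""
    box: Box
    text: str


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def levenshtein(s: str, t: str) -> int:
    """Edit distance (insert, delete, substitute) via dynamic programming."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (cs != ct)))
        prev = cur
    return prev[-1]


def text_similarity(s: str, t: str) -> float:
    """1 minus the edit distance normalized by the longer string's length."""
    longest = max(len(s), len(t))
    return 1.0 if longest == 0 else 1.0 - levenshtein(s, t) / longest


def precision_recall(preds: List[Label], truths: List[Label],
                     iou_min: float = 0.5, text_min: float = 0.8):
    """Greedy 1:1 matching by descending IoU (thresholds are assumed values).

    A matched pair counts as a true positive only if its text is similar enough.
    """
    candidates = sorted(
        ((iou(p.box, g.box), pi, gi)
         for pi, p in enumerate(preds)
         for gi, g in enumerate(truths)),
        reverse=True,
    )
    used_preds, used_truths, true_pos = set(), set(), 0
    for score, pi, gi in candidates:
        if score < iou_min:
            break  # remaining pairs overlap too little to match
        if pi in used_preds or gi in used_truths:
            continue  # enforce the 1:1 constraint
        used_preds.add(pi)
        used_truths.add(gi)
        if text_similarity(preds[pi].text.lower(), truths[gi].text.lower()) >= text_min:
            true_pos += 1
    precision = true_pos / len(preds) if preds else 0.0
    recall = true_pos / len(truths) if truths else 0.0
    return precision, recall
```

Greedy matching is the simplest way to enforce the 1:1 constraint. An optimal assignment (for example, the Hungarian algorithm) could pair labels differently when several boxes overlap.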

Introduction & Motivation

A sample of the linguistic, geometrical, and typographical diversity in 19th-century maps of Jerusalem.

Methodology

MapKurator

Pyramid

Word Rectification

Word Amalgamation

Word Combination

Single Line
Multiple Lines
Curved Line

Evaluation

Results

Limitations

Future Work

GitHub Repository

Jerusalem Maps EPFL DH405

References

Literature

  • Kim, Jina, et al. "The mapKurator System: A Complete Pipeline for Extracting and Linking Text from Historical Maps." arXiv preprint arXiv:2306.17059 (2023).
  • Li, Zekun, et al. "An automatic approach for generating rich, linked geo-metadata from historical map images." Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (2020).

Webpages