Switzerland Road extraction from historical maps

From FDHwiki

Introduction

Historical maps provide valuable information about how the landscape has changed over time. This project, based on historical maps of Switzerland, aims to vectorize the road network and land cover and to visualize their transformation using a machine vision library developed at the DHLAB.

The main data source of this project is GeoVITe (Geodata Versatile Information Transfer environment), a browser-based service providing access to geodata for research and teaching, operated by the Institute of Cartography and Geoinformation of ETH Zurich (IKG) since 2008.

Figure 1: Dufour map of Switzerland divided by sequential grids (1:25000)

Motivation

Historical maps contain rich information that is useful for urban planning, historical studies, and other humanities research. Digitizing the large body of printed documents is a significant step before further research. However, most historical maps are scanned as raster images. To use the geographic data extracted from these maps conveniently in GIS software, vectorization is needed.

However, vectorization has always been a challenge because the maps were drawn by hand. In this project, we try to use the dhSegment tool for automatic vectorization. With 60 high-resolution patches (5 km × 5 km) as the training dataset, the model is tested on randomly selected patches and is intended to approximate the idealized main roads of the Dufour map for the selected region, Solothurn in Switzerland.

Plan and Milestones

Milestone 1:

  • Choose the topic for the project, present the first ideas. Get familiar with the data provided.
  • Define the final subject of the project. Find reliable sources of information. Get familiar with the dhSegment tool.

Milestone 2: (Now)

  • Prepare a small dataset (60 samples) for training and testing the convolutional neural network that forms the main part of the dhSegment tool. This dataset should consist of small patches extracted from the Dufour Map of Switzerland, their versions in jpeg format, and binary labels created in GIMP.
  • Determine how to download a large number of samples from GeoVITe.
  • Prepare for the mid-term presentation and write the project plan and milestones.
  • Try the dhSegment tool with the created dataset. Evaluate the results and modify the algorithm. Draw conclusions about using this tool for road extraction, including the advantages and disadvantages of this approach.

Milestone 3:

  • Download the final dataset automatically using a Python script: patches with corresponding coordinates that completely cover different regions of Switzerland.
  • Test the dhSegment tool on the final dataset.

Milestone 4:

  • Get the vectorised map of roads with skeletonization in Python using OpenCV.
  • Visualise the results and prepare a final presentation.
Deadline Task Completion
By Week 4
  • Brainstorm, select project, and raise bold and feasible ideas
  • Present preliminary proposal and modify it according to the feedback
By Week 6
  • Review the literature on road extraction and define the most appropriate way of labeling the data
  • Experiment on possible solutions
  • Determine overall methodology
By Week 8
  • Prepare the training dataset of square patches of the Dufour Map downloaded from GeoVITe
  • Create labels, binary masks with black background and white roads, using GIMP
By Week 10
  • Prepare the mid-term presentation
  • Test the pre-trained model provided by DHLAB on the created small dataset
  • Improve and modify the algorithm of the segmentation tool
...
By Week 12
  • Download all patches automatically with the prepared Python script and create the main dataset covering different regions of Switzerland
  • Work on road extraction for different regions: apply the segmentation process to the main dataset
  • Finish georeferencing and finalize the vectorised map
...
By Week 14
  • Determine how to implement the visualisation
  • Sort out all the data in the GitHub repository
  • Finish the Wiki page and prepare for the final presentation
...

Methodology

Dataset:

  • Dufour Map from GeoVITe
  • The 1:100 000 Topographic Map of Switzerland was the first official series of maps that encompassed the whole of Switzerland. It was published in the period from 1845 to 1865 and thus coincides with the creation of the modern Swiss Confederation.
  • Classification : Main roads
  • Layer: Topographic Raster Maps - Historical Maps - Dufour Maps
  • Coordinate system: CH1903/LV03
  • Predefined Grids: 1:25000
  • Patch Size: 1000 by 1000 pixels
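
The figures above imply a ground resolution: with the 5 km by 5 km patches mentioned in the Motivation section and a patch size of 1000 by 1000 pixels, one pixel covers 5 m. The following is a minimal sketch of how patch bounding boxes in CH1903/LV03 could be enumerated for a region. The function name `patch_grid`, the example region and the north-to-south ordering are illustrative assumptions, not the project's actual download script.

```python
from typing import Iterator, Tuple

PATCH_SIZE_M = 5000   # patch edge in metres (5 km, from the project description)
PATCH_SIZE_PX = 1000  # patch edge in pixels
METRES_PER_PIXEL = PATCH_SIZE_M / PATCH_SIZE_PX  # ground resolution of one pixel

# (east_min, north_min, east_max, north_max) in LV03 metres
BBox = Tuple[int, int, int, int]


def patch_grid(region: BBox, size_m: int = PATCH_SIZE_M) -> Iterator[BBox]:
    """Yield square patch bounding boxes covering `region`,
    row by row from north to south and west to east."""
    east_min, north_min, east_max, north_max = region
    north = north_max
    while north > north_min:
        east = east_min
        while east < east_max:
            yield (east, north - size_m, east + size_m, north)
            east += size_m
        north -= size_m


# Hypothetical 10 km x 10 km region given in LV03 coordinates
region = (600000, 220000, 610000, 230000)
patches = list(patch_grid(region))
```

Each bounding box can then be stored next to its patch so that pixel positions can later be mapped back to map coordinates.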

dhSegment:

dhSegment is a generic framework for historical document processing based on a deep learning approach, created by Benoit Seguin and Sofia Ares Oliveira at DHLAB, EPFL.

Data preparation:

  • GeoVITe: automatic data crawling or manual data access
  • Swisstopo: black-and-white images, which are difficult to annotate because of their low resolution

Labeling:

60 patches (1000 × 1000 pixels) were labeled with GIMP for training and testing the model, and are kept in three formats:

  • Original patches with spatial information: tiff
  • Patches for training: jpeg
  • Labels in black and white: png


Figure 2: A region of Bischofszell
Figure 3: Label of the sample patch

OpenCV:

OpenCV is used for skeletonization, which reduces the foreground regions of a binary image to a thin skeleton, typically 1 pixel wide. The skeleton largely preserves the extent and connectivity of the original region while discarding most of the original foreground pixels, so later processing can run quickly and accurately on the light skeleton instead of on the large, memory-intensive original image. The procedure relies on two basic morphological operations, dilation and erosion (an opening is an erosion followed by a dilation):

  • Start with an empty skeleton.
  • Compute the opening of the current image.
  • Subtract the opening from the current image to obtain a temporary image (temp).
  • Take the union of the current skeleton and temp to refine the skeleton.
  • Erode the current image.
  • Repeat the steps above until the image is completely eroded.


Gdal:

Limitation

The main limitation of our project comes from the data source platform. GeoVITe only allows small patches to be downloaded, while automatic downloading from other sources yields images of unsatisfactorily low quality.

Results

Reference

Links

  • GitHub repository: [1]
  • GeoVITe: [2]
  • dhSegment: [3]