Paris Metropolitan, an evolution: Difference between revisions

From FDHwiki
No edit summary
No edit summary
 
(126 intermediate revisions by 2 users not shown)
Line 1: Line 1:
== Definition of the project ==
== Definition of the project ==


The group first selected a range of different maps showing the Paris Metropolitan at different years. In total, we collected from [https://gallica.bnf.fr/accueil/fr/content/accueil-fr?mode=desktop ''Gallica''] a set of two maps of the planning of the metro, from the definition of the routes to the addition of stations, a first map from 1908 of the actual metro after its construction in 1990, a second map from 1915, with already visible impacts of the first war, and a third map from 1950, a more contemporain look at the metro as we know it today. Our first idea was to analyse these maps in order to understand the evolution of the Paris Metropolitan, how different areas of major cultural attractions evolved around or hand in hand with the metro stations and how it was impacted by catastrophic events such as wars. However, as the goal of the project is to produce a working interface within a short amount of time, we decided to reduce the work of extraction of data to one map, which is the first map from 1908. Based on this map, we thus intend to build a superposition of a current map from Paris and the metro network extracted from the old map. From this visual display, will be able to see the evolution of the Paris Metropolitan from 1908 until nowadays and important historical explanations will be linked to corresponding stations. The result should take the form of a website page that displays the interactive map. The users would have the possibility to display the different layers, namely the layer of the old map, its metro network, with stations and lines, and this on top of a current map of Paris. Popup windows on each stations would display specific historical information about the station, which would be linked to the sources.<br>
The group first selected a range of different maps showing the Paris Metropolitan, also simply called the metro, in different years of the last century. In total, we collected from [https://gallica.bnf.fr/accueil/fr/content/accueil-fr?mode=desktop ''Gallica''] a set of two maps of the planning of the metro, from the definition of the routes to the addition of stations, a first map from 1908 of the actual metro after its construction in 1900, a second map from 1915, with already visible impacts of the First World War, and a third map from 1950, a more contemporary look at the metro as we know it today. Our first idea was to analyze these maps in order to understand the evolution of the Paris Metropolitan, how different areas of major cultural attractions evolved around or hand in hand with the metro stations and how it was impacted by catastrophic events such as wars. However, as the goal of the project is to produce a working interface within a short amount of time, we decided to reduce the work of extraction of data to three maps, which are the first map from 1908, the map from 1915 and the last one from 1950. Based on these maps, we thus intend to build an overlay of a current map of Paris and the metro network extracted from the old maps. From this visual display, users will be able to see the evolution of the Paris Metropolitan from 1908 until today, and important historical explanations will be linked to the corresponding stations. The result should take the form of a web page that displays the interactive map. The users would have the possibility to display the different layers, namely the layer of the old maps, their metro networks, with stations and lines, and this on top of a current map of Paris. Popup windows on each station would display specific historical information about the station, which would be linked to the sources.<br>
It is based on this prototype that the possibility to conduct similar data extractions on the other maps selected will be considered.
It is based on this prototype that the possibility to conduct similar data extractions on other maps can be considered. The project thus intends to provide a solid foundation, with a database and a display framework, for future similar data extractions from old metro maps.
 
==GitHub repository==
In order to achieve this project, a [https://github.com/yev111/Paris_Metropolitan_an_evolution GitHub repository] was created as well as a [http://valentinebernasconi.ch/paris_metropolitan/index.html website] to display the results. However, as the project handles very large files, namely the three old maps, some files used for the data extraction in QGIS could not be added to the GitHub repository.


== Main steps ==
== Main steps ==
Line 14: Line 17:
* Determine or extract coordinates of the different stations
* Determine or extract coordinates of the different stations
* Create a Database with all the information gathered for each station from 1908
* Create a Database with all the information gathered for each station from 1908
* Compare the first path and stations from 1908 with the actual built one
* Compare the first path and stations from 1908 with later representations of the Paris Metropolitan
* Compare evolution of cultural attractions listed above, if new institutions appeared and if new metro lines were created due to them
* Create a website in order to display the maps in an interactive way
* Create a website in order to display the maps in an interactive way
**  pop up windows on specific points, such as stations with strong historical backgrounds
**  pop up windows on specific points, such as stations with strong historical backgrounds
** an overlay of maps in order to better see the evolution from 1908 to nowadays
** an overlay of maps in order to better see the evolution from 1908 to more recent years


== Milestones ==
== Milestones ==


'''Week 9''': Georeference, alignment with contemporain map and extraction of the paths and stations<br>
'''Week 9''' (14.11 - 16.11): <br>
'''Week 10''': Finish Database and analyse of the evolution of the Metropolitan Paris based on the information gathered<br>
*Georeference, alignment with contemporary map and extraction of the paths and stations
'''Week 11''': Creation of the website to display the data<br>
*Preparation of the structure of a GeoJSON database
'''Week 12''': Finalization of the project<br>
*Preparation of the midterm presentation
*Finalisation of milestones
'''Week 10''' (21.11 - 23.11): <br>
*Analysis of the evolution of the Paris Metropolitan based on the information gathered
*Finish Database in GeoJSON
*Finish writing the description of the extraction methods
*Planning for the creation of a website
'''Week 11''' (28.11 - 30.11):<br>
*Creation of the website to display the data
*Implementation of an interactive map
'''Week 12''' (5.12 - 7.12): <br>
*Finalization of the project
*Finish writing the report (historical introduction to the map, analysis of the performance of extraction, motivation and description of the services)


== Historical introduction to the map ==
== Historical introduction to the map ==
The maps used for the project were all published by A. Taride. The publishing house was founded by Alphonse Taride in 1852 <ref>Babelio, [https://www.babelio.com/auteur/-Taride/293708 ''Taride Babelio''], last accessed on 2018-11-13</ref> in Paris and was one of the first to create road, tourist and school maps. The group grew in 1895 when, helped by the "Union vélocipédique de France" and engineers from "Les ponts et chaussées", they printed their first maps at the scale of 1/25 000 000 <ref>Corpus Cartographique Etampois, [https://www.babelio.com/auteur/-Taride/293708 ''Alphonde Taride, Carte routière de l'Etampois en 1914''], last accessed on 2018-11-13</ref>. These maps were considered a reference in Europe and North Africa until 1930 <ref>Corpus Cartographique Etampois, [https://www.babelio.com/auteur/-Taride/293708 ''Alphonde Taride, Carte routière de l'Etampois en 1914''], last accessed on 2018-11-13</ref>. Alongside their collection of cycling ("vélocipédiques") maps, they also offered a range of Paris metropolitan maps and tourist guides, translated into different languages <ref>Babelio, [https://www.babelio.com/auteur/-Taride/293708 ''Taride Babelio''], last accessed on 2018-11-13</ref>. Unfortunately, the firm declined after the Second World War and reduced its output to Paris maps and globes. The firm was apparently later bought by the last French producer of globes. <br>
Based on the maps, we can see that the publishing house was first located at 18 and 20 Boulevard Saint Denis (as stated on the 1908 and 1915 maps) and then moved to 154 Boulevard Saint Germain, as we can see on the 1950 map. The maps are all printed in color and display the city of Paris overlaid by the metropolitan network. They all provide a legend with the definition of the different paths and stations. The map from 1908 is at a scale of 1/8000, the map from 1915 at 1/21 000 and the one from 1950 at 1/33 000. <br><br>
All the maps chosen represent the path of the Metropolitan of Paris overlaid on the city map. The first line of the Metropolitan of Paris was built in 1900 for the Paris Exposition Universelle. However, the race to build a network of railway transport started well before, around 1845, as many capital cities were considering the possibility of developing such transport systems, such as London, which opened the world's first underground railway in 1863. Unfortunately, the project remained a subject of discussion for several years, as the city of Paris wanted to build its own network for its inhabitants while the national railway services wanted to extend existing transport for people around the metropolis. The upcoming Exposition Universelle and the development of such innovative means of transport in big capitals finally tipped the balance in favor of an underground railway network specifically for the city of Paris <ref>Peter Hall, ''Underground as City Maker: London Versus Paris, 1863–2013'', 2013, pp. 177-183, last accessed on 2018-11-13</ref>. The construction of the network was entrusted to the engineer Fulgence Bienvenüe, the "Père Métro", who devoted almost his entire career to the project. The first plans of the Metropolitan provided for a total of six lines, labelled from A to F, and 18 stations<ref>Michel Dansel, ''Paris-Metro'', Editions du Dauphin, 1975, p. 23</ref>. The construction of the first line, from Porte de Vincennes to Porte Dauphine, which was called line A at the planning stage, started in November 1898 <ref>Michel Dansel, ''Paris-Metro'', Editions du Dauphin, 1975, pp. 23-24</ref>. According to the law from March 1898 enabling the construction of the Metropolitan<ref>Michel Dansel, ''Paris-Metro'', Editions du Dauphin, 1975, p. 23</ref>, the tracks had to cross one above the other, the total length of the trains was set to a maximum of 72 meters and the platforms to 75 meters. 
The city was transformed into a big construction site and the first trains were already tested in December 1899. However, it was only the following year, on July 19th 1900, after the first months of the Exposition Universelle, that the inauguration of the first line discreetly took place. Despite its official introduction away from the crowds, the press craze the morning after enabled the Metropolitan to benefit from a successful start. At rush hours, the underground network had a frequency of one train every 10 minutes with a constant speed of 25 km per hour, whereas other means of transport at the surface could not go faster than 10 km per hour<ref>Michel Dansel, ''Paris-Metro'', Editions du Dauphin, 1975, p. 28</ref>. The construction of the following lines, from B-1 and B-2 to line C (the present-day line 3, which at the time passed through different business districts), and of the remaining stations of the first plans took place in the following years and finished before the First World War. Unfortunately, due to the latter, further improvements of the network were slowed down and construction only resumed in 1925<ref>Michel Dansel, ''Paris-Metro'', Editions du Dauphin, 1975, p. 54</ref>. Another big upheaval was the creation of the Régie Autonome des Transports Parisiens (RATP) on January 1st 1949<ref>Michel Dansel, ''Paris-Metro'', Editions du Dauphin, 1975, pp. 57-58</ref>. The creation of this famous public authority, which is still in charge of the network today, was due to important political and social changes that occurred at the end of the Second World War.
The chart below displays the construction of the Paris Metropolitan from its foundation to the last map selected in 1950. It can be observed that the vast majority of the metro stations were constructed before the First World War as well as between the two wars. Construction during the wars was significantly reduced, as resources may have been diverted to the military.
<div><ul>
<li style="display: inline-block;">[[File:Paris construction.png|thumb|none|550px| Distribution of the construction of the Paris Metropolitan based on major historical events like WW1 and WW2]] </li>
</ul></div>


== Detailed description of the extraction methods ==
== Detailed description of the extraction methods ==
In order to extract information from the map, we decided to work with the open source software [https://www.qgis.org/fr/site/ ''QGIS3''].<br>
In order to extract information from the map, we decided to work with the open source software [https://www.qgis.org/fr/site/ ''QGIS3''].<br>
This software is a powerful GIS platform which enables to georeference maps, draw points, lines and polygones in order to create new and interactive maps. It benefits from a good community of users, with even a specific associative [https://www.qgis.ch/fr ''platform''] for Switzerland.<br>
This software is a powerful GIS platform which makes it possible to georeference maps and to draw points, lines and polygons in order to create new and interactive maps or to enhance and analyze existing ones. It benefits from a good community of users, with even a specific associative [https://www.qgis.ch/fr ''platform''] for Switzerland.<br>
The extraction of the information from the Paris map of 1908 required different steps, from the georeferencing to the extraction of the metropolitan network itself.
The extraction of the information from the Paris map of 1908 required different steps, from the georeferencing to the extraction of the metropolitan network itself.


===Georeferencing===
===Georeferencing===
The first step of the work with the 1908 map consists of georeferencing it, namely register it with a coordinate system in order to map it with a location on the surface of the earth <ref>GIS Ressources, [http://www.gisresources.com/georeferencing-2/ ''What is Georeferencing?''], last accessed on 2018-11-05</ref>. Indeed, the basic map of the Metropolitan of Paris is just an image file that does not contain any geographic information such as coordinates. In order to create these coordinates, we had to work with a shapefile of the buildings of the city of Paris. These ressources are easily found [https://opendata.paris.fr/explore/dataset/volumesbatisparis2011/table/ "online"], as their are freely made available by the city of Paris <ref>Open Data [https://opendata.paris.fr/explore/dataset/volumesbatisparis2011/table/ "Open Data | Volumes bâtis - Données géographiques"], last accessed on 2018-11-05</ref>. This shapefile is first imported in the new project created in QGIS. Then, the CRS (Coordonate Reference System) we are working with in the context of the project has to be correctly set to the french CRS, called RGF93. After having synchronized the layers with this CRS, the plugin Géoréférenceur GDAL<ref>QGIS 2.14 documentation [https://docs.qgis.org/2.14/fr/docs/user_manual/plugins/plugins_georeferencer.html "Extension de géoréférencement"], last accessed on 2018-11-05</ref> is used. It consists of a window, from which our image, the Paris map of 1908, is imported as a raster and displayed. A raster is an image file to which is added georeferencing information <ref>EMSE [https://www.emse.fr/tice/uved/SIG/Glossaire/co/Raster_format.html "Glossaire des SIG - Raster (Format)"], last accessed on 2018-11-05</ref>. In order to determine the right coordinates on the image of 1908, a tool from the plugin enables to find matching points from the image to the shapefile of the city of Paris. 
The work consists of selecting a point on the image and to then find its visual correspondance on the shapefile of the buildings of Paris. We chose these points according to recognizable buildings which exists on both maps. In the end, a total of 6 coordinate points in order to launch the georeferencing. The transformation we then performed was generated with a Polynomiale 1 and a .tiff file was created. This .tiff file contains both the image and the coordinate information of the raster and can be used as a foundation for the following steps of the project.
The first step of the work with the 1908 map consists of georeferencing it, namely registering it with a coordinate system in order to map it to a location on the surface of the earth <ref>GIS Resources, [http://www.gisresources.com/georeferencing-2/ ''What is Georeferencing?''], last accessed on 2018-11-05</ref>. Indeed, the basic map of the Metropolitan of Paris is just an image file that does not contain any geographic information such as coordinates. In order to create these coordinates, we had to work with a shapefile of the buildings of the city of Paris. These resources are easily found [https://opendata.paris.fr/explore/dataset/volumesbatisparis2011/table/ "online"], as they are made freely available by the city of Paris <ref>Open Data [https://opendata.paris.fr/explore/dataset/volumesbatisparis2011/table/ "Open Data | Volumes bâtis - Données géographiques"], last accessed on 2018-11-05</ref>. This shapefile is first imported into the new project created in QGIS. Then, the CRS (Coordinate Reference System) we are working with in the context of the project has to be correctly set to the French CRS, called RGF93. After having synchronized the layers with this CRS, the plugin Géoréférenceur GDAL<ref>QGIS 2.14 documentation [https://docs.qgis.org/2.14/fr/docs/user_manual/plugins/plugins_georeferencer.html "Extension de géoréférencement"], last accessed on 2018-11-05</ref> is used. It consists of a window, from which our image, the Paris map of 1908, is imported as a raster and displayed. A raster is an image file to which georeferencing information is added <ref>EMSE [https://www.emse.fr/tice/uved/SIG/Glossaire/co/Raster_format.html "Glossaire des SIG - Raster (Format)"], last accessed on 2018-11-05</ref>. In order to determine the right coordinates on the image of 1908, a tool from the plugin makes it possible to find matching points between the image and the shapefile of the city of Paris. 
The work consists of selecting a point on the image and then finding its visual correspondence on the shapefile of the buildings of Paris. We chose these points according to recognizable buildings which exist on both maps. In the end, a total of 6 coordinate points were selected in order to launch the georeferencing. The transformation we then performed was a first-order polynomial (Polynomiale 1) and a .tiff file was created. This .tiff file contains both the image and the coordinate information of the raster and can be used as a foundation for the following steps of the project.
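
As an illustration of what a first-order polynomial transformation computes from such control points, the following minimal Python sketch fits the six coefficients that map pixel positions (column, row) to map coordinates by least squares. It is not the implementation used by the QGIS georeferencer. The function names are hypothetical, and the six control points are made-up values expressed in metres relative to an arbitrary local origin, not the points used in the project.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(a, b):
    """Solve the 3x3 linear system a * x = b with Cramer's rule."""
    d = det3(a)
    solution = []
    for col in range(3):
        replaced = [row[:] for row in a]
        for r in range(3):
            replaced[r][col] = b[r]
        solution.append(det3(replaced) / d)
    return solution


def fit_first_order(pixels, targets):
    """Least-squares fit of X = a0 + a1*col + a2*row and Y = b0 + b1*col + b2*row."""
    # Each control point contributes one row [1, col, row] to the design matrix.
    design = [[1.0, col, row] for col, row in pixels]
    # Normal equations: (A^T A) c = A^T t, solved once per output axis.
    ata = [[sum(r[i] * r[j] for r in design) for j in range(3)] for i in range(3)]
    coeffs = []
    for axis in (0, 1):
        atb = [sum(r[i] * t[axis] for r, t in zip(design, targets)) for i in range(3)]
        coeffs.append(solve3(ata, atb))
    return coeffs


def to_map(coeffs, col, row):
    """Apply the fitted transformation to one pixel position."""
    (a0, a1, a2), (b0, b1, b2) = coeffs
    return a0 + a1 * col + a2 * row, b0 + b1 * col + b2 * row


# Hypothetical control points: pixel positions and their map coordinates.
PIXELS = [(0, 0), (1000, 0), (0, 800), (1000, 800), (500, 400), (250, 600)]
TARGETS = [(100 + 0.5 * c, 900 - 0.5 * r) for c, r in PIXELS]

coeffs = fit_first_order(PIXELS, TARGETS)
print(to_map(coeffs, 100, 200))
```

Three non-collinear points already determine such a transformation. With six points the system is overdetermined, so the least-squares fit averages out small errors made when clicking the matching points.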


===Overlaying the old map on a contemporain basemap===
===Overlaying the old map on a contemporary basemap===
Now that we have our old map as a raster, its image can be easily placed on top of a contemporain map with the help of its coordinates. QGIS3 proposes a feature called XYZ Tiles and which enables to easily add a corresponding basemap from the [https://www.openstreetmap.org/#map=5/51.500/-0.100 "openstreetmap website"].
Now that we have our old map as a raster, its image can be easily placed on top of a contemporary map with the help of its coordinates. QGIS3 offers a feature called XYZ Tiles, which makes it easy to overlay the raster on a tile layer, a corresponding basemap, from the [https://www.openstreetmap.org/#map=5/51.500/-0.100 "openstreetmap website"].


===Extracting information===
===Extracting information===
Based on the raster created, the extraction of information can be performed. In order to extract the different stations, a Shapefile has to be created. A Shapefile, already introduced previously for the georeferencing, is a set of different files, .shp, .dbf, .shx, .prj, which all contains information that enables to create a map, with a coordinate system and a visual display via points, vectors or polygons. When creating a new layer Shapefile with QGIS, the name has to specified as well as the type of geometry aforesaid that can either be points, lines or polygons. Then, a list of fields can be define. This list of field is later filled for each point created and can constitute a database.  
Based on the raster created, the extraction of information can be performed. In order to extract the different stations, a shapefile has to be created. A shapefile, already introduced previously for the georeferencing, is a set of different files, .shp, .dbf, .shx, .prj, which all contain information that makes it possible to create a map, with a coordinate system and a visual display via points, vectors or polygons. When creating a new shapefile layer with QGIS, the name has to be specified as well as the type of geometry, which can be points, lines or polygons. Then, a list of fields can be defined. This list of fields is later filled in for each point created and can constitute a database.  
====Railway stations====
====Railway stations====
For the layer containing information about the stations, the type of geometry used is the point. We also added two fields of information. A first string field called Name, in order to store the name of the station, and a second string field called Ligne, in order to store the name of the line the station belongs to. Then, each station visible on the old Metropolitan map from Paris was selected with a point and given a Name and a Ligne accordingly.
For the layer containing information about the stations, the type of geometry used is the point. We also added different fields of information for a future database discussed [[#Creation of a database|below]]. Among others, we added a first string field called stop_name, in order to store the name of the station, and a second string field called stop_line, in order to store the name of the line the station belongs to. Then, each station visible on the old Metropolitan map of Paris was marked with a point and given a name and a line accordingly.
====Railway lines====
====Railway lines====
We then created a second layer of Shapefile for the lines of the metro. The geometry used for this purpose is the line and the string field Name was created. In order to map the vectors of the line with the exact coordinate of the stations previously set and create a coherent path, the snapping feature of QGIS3 <ref>QGIS 2.18 Documentation [https://docs.qgis.org/2.18/en/docs/user_manual/working_with_vector/editing_geometry_attributes.html "Editing"], last accessed on 2018-11-05</ref> was enabled. The snapping enables to select a perimeter of anchor of a point on a layer. When drawing a line, existing points are then working as magnets and attract the line in order to make it pass through it. Thus, the 6 different lines of the Metropolitan of Paris from 1908, namely line 1, 2 Nord and 2 Sud, 3, 4, 5, 6, where created and passing through the exact coordinates of the stations previously established.
We then created a second shapefile layer for the lines of the metro. The geometry used for this purpose is the line and the string field name was created. In order to match the line segments to the exact coordinates of the stations previously set and create a coherent path, the snapping feature of QGIS3 <ref>QGIS 2.18 Documentation [https://docs.qgis.org/2.18/en/docs/user_manual/working_with_vector/editing_geometry_attributes.html "Editing"], last accessed on 2018-11-05</ref> was enabled. Snapping makes it possible to set an anchoring radius around the points of a layer. When drawing a line, existing points then work as magnets and attract the line in order to make it pass through them. Thus, the 6 different lines of the Metropolitan of Paris from 1908, namely line 1, 2 Nord and 2 Sud, 3, 4, 5, 6, were created so that they pass through the exact coordinates of the stations previously established.
 
Note that only metro lines were extracted from the maps, even if other train lines are present on the maps. For instance, the 1915 map displays a newly constructed '''Nord-Sud''' line, which was not part of the Paris Metropolitan until it was merged later, as visible on the 1950 map. Also, certain lines displayed on the 1915 map were still under construction and hence are not yet displayed in the corresponding map on the [http://valentinebernasconi.ch/paris_metropolitan/index.html website].
 
===Implementing a process for further maps===
As the extraction method presented above is not an expensive process in terms of time, mainly due to the scale of the data, the information from the last two maps from 1915 and 1950 presented in [[#Selected maps|Selected maps]] was also extracted. In order to create a coherent database and a reusable process for later maps, as discussed [[#Creation of a database|below]], the new stations and lines of both maps were extracted in chronological order. We thus first started with the map from 1915 and determined the new stations that were created between 1908 and 1915 thanks to the georeferencing of the map and the possibility to overlay the previously extracted data. The process consists of three steps:
#Create a new shapefile layer for the stations of the map (here we started with 1915)
#Look for the name changes or station mergers. Add the points of the new stations as well as the different fields of information already used for the first extractions. Same applies with new lines.
#Then, for each station whose name changed between the current map and the previous georeferenced one (here between 1908 and 1915), a new point is added to the current shapefile layer with the changed information in the corresponding fields.
Thus, each shapefile layer corresponding to each different map (and consequently to different years) contains both the geolocation of the newly built stations and the stations whose name changed compared to the last georeferenced map. Layer after layer, we can thus see the metropolitan network growing.
 
== Creation of a database ==
 
A significant aspect of this project is to create a database in order to further work with the data extracted, such as creating interactive maps, and to set the foundation of a reusable tool for later maps and contemporary changes of the Paris Metropolitan. In order to create this database, we used the information added manually for each point entry in QGIS. It was therefore important to create the required fields of information from the beginning, in preparation for the database. For the purpose of consistency, the naming convention of the [https://data.ratp.fr/explore RATP] GeoJSON database was implemented and extended for the database of this project. In order to create a coherent database that can handle the addition of new stations, name or line changes and station mergers, the following fields were created:
* '''stop_id''' ''<int>'': this corresponds to the identification number of each platform of a station. Indeed, when a station is affiliated to multiple metro lines, a new entry is created for each line, decomposing the station into different platforms. When a station point is created on a metro line, a stop_id is given with the following format: YYYYE, YYYY being the year of the map and E the unique entry number of the specific point in the shapefile. When the name of the station is changed, the platform keeps its identification number.
* '''stop_name''' ''<string>'': this corresponds to the name of the station as written on the map the station was extracted from. When the name changes on a new map, the entry corresponding to the station is duplicated, the stop_id remains the same and the stop_name field is changed to the corresponding information.
* '''stop_line''' ''<string>'': this corresponds to the name of the line to which the station is affiliated. If there are multiple lines passing through a station, then an entry is created for each platform within the station, as explained above, with a unique stop_id for each of them.
* '''stop_info''' ''<string>'': historical information about the station
* '''start_map''' ''<int>'': this number has the form YYYY and corresponds to the year of creation of the map from which the station was retrieved. If the station changed name, this field corresponds to the year of the map where the name change occurred.
* '''end_map''' ''<int>'': this number has the form YYYY and corresponds to the year of the map where the station name disappears, either because of a merger or because of a name or a line change.
* '''open_date''' ''<date_format>'': this is the actual opening date of the station, independently of the map from which it was retrieved. It has the format YYYY-MM-DD.
* '''close_date''' ''<date_format>'': this is the actual date of closure or name change of the station, if it occurred, independently of the map from which it was retrieved. It has the format YYYY-MM-DD, or is NaN if the station is still in use.
* '''ratp_id''' ''<int>'': this id corresponds to the identification number of the station according to the database provided by the [https://data.ratp.fr/explore RATP]. By adding this identification, we enable the project to be used in varied contexts and to be affiliated to the RATP itself.
* '''coordinates''' ''[<float>, <float>]'': this corresponds to the coordinate of the georeferenced point of the maps with respect to present coordinates of Paris.<br>
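
To make the schema concrete, the following minimal Python sketch builds one station entry as a GeoJSON feature with the fields listed above. The helper names, the reading of the YYYYE format as the map year followed by the entry number, and all sample values (entry number, approximate coordinates, empty fields) are illustrative assumptions and are not taken from the project's database. JSON has no NaN value, so <code>None</code> (exported as <code>null</code>) stands in for it here.

```python
import json


def make_stop_id(map_year, entry_number):
    # Assumed reading of the YYYYE format: map year followed by the entry number.
    return int(f"{map_year}{entry_number}")


def make_station_feature(map_year, entry_number, name, line, lon, lat):
    """Build one platform entry as a GeoJSON Feature (hypothetical helper)."""
    return {
        "type": "Feature",
        # In GeoJSON the coordinates live in the geometry, as [longitude, latitude].
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {
            "stop_id": make_stop_id(map_year, entry_number),
            "stop_name": name,
            "stop_line": line,
            "stop_info": "",
            "start_map": map_year,
            "end_map": map_year,
            "open_date": None,   # YYYY-MM-DD once known
            "close_date": None,  # stays empty while the station is in use
            "ratp_id": None,     # filled in when a matching RATP entry is found
        },
    }


# Placeholder values for illustration only.
feature = make_station_feature(1908, 12, "St. Paul", "1", 2.3608, 48.8552)
collection = {"type": "FeatureCollection", "features": [feature]}
print(json.dumps(collection, ensure_ascii=False, indent=2))
```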
 
Thanks to the software QGIS, we were able to store the data for each station when creating its corresponding geolocated point on the shapefile. The whole can then be exported in various formats, such as GeoJSON and csv, two formats widely used for databases containing geographic references. We thus decided to export the database of each shapefile in both GeoJSON and csv formats in order to enable various uses of the data collected. As we had a set of three different databases, each corresponding to one of the three maps from which the data was extracted, we decided to manually group them in one single file and make a few changes either manually or with the help of a short program written in Python. Indeed, there were important name changes for a few lines that occurred from 1908 to 1950, for example the line 2 Sud, which became line 5 between 1908 and 1915 and line 6 before 1950. As we did not want to manually recreate a point at each location of the stations on the line for each map, the entries corresponding to that line were duplicated and the corresponding '''start_map''' and '''end_map''' were corrected with a few lines of code in Python.<br><br>
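
The kind of correction described here can be sketched as follows. This is not the script used in the project but a minimal illustration under stated assumptions: entries are plain dictionaries carrying the fields defined above, the function name <code>rename_line</code> is hypothetical, and the sample station names are placeholders.

```python
def rename_line(entries, old_line, new_line, change_year, last_old_year):
    """Duplicate the entries of a renamed line and correct start_map and end_map.

    change_year is the year of the first map showing the new line name and
    last_old_year the year of the last map showing the old one.
    """
    result = []
    for entry in entries:
        entry = dict(entry)  # work on a copy so the input list is left untouched
        result.append(entry)
        if entry["stop_line"] == old_line and entry["end_map"] >= change_year:
            duplicate = dict(entry)            # same stop_id, same stop_name
            duplicate["stop_line"] = new_line
            duplicate["start_map"] = change_year
            entry["end_map"] = last_old_year   # the old name stops here
            result.append(duplicate)
    return result


# Placeholder entries: one platform on line 2 Sud and one on line 1.
entries = [
    {"stop_id": 19081, "stop_name": "Station A", "stop_line": "2 Sud",
     "start_map": 1908, "end_map": 1950},
    {"stop_id": 19082, "stop_name": "Station B", "stop_line": "1",
     "start_map": 1908, "end_map": 1950},
]

# Line 2 Sud appears as line 5 on the 1915 map and as line 6 on the 1950 map.
step1 = rename_line(entries, "2 Sud", "5", change_year=1915, last_old_year=1908)
step2 = rename_line(step1, "5", "6", change_year=1950, last_old_year=1915)
for entry in step2:
    print(entry["stop_id"], entry["stop_line"], entry["start_map"], entry["end_map"])
```

Applying the function once per renaming keeps the '''stop_id''' of each platform unchanged while producing one entry per line name and map period.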
 
As a result, we obtain two databases, one for the stations and one for the lines. The latter is implemented in the same way but exists only in GeoJSON, as this format handles polylines better. The database works as follows:
Each station was given a '''stop_id''' when first appearing on a map. The '''start_map''' corresponds to the year of the map where the information was retrieved and the '''end_map''' to the year of the last map where the information was seen. Each station entry is affiliated to a unique line whose name is stored under '''stop_line'''. If multiple lines pass through a station, then there is a new entry for each line, thus decomposing the station into different platforms. This is why it is said that the '''stop_id''' corresponds to a platform for a specific metro line within a station.
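
Under this scheme, the entries to display for one of the maps can be selected by comparing the map year with '''start_map''' and '''end_map'''. The following minimal Python sketch illustrates the idea; the function name and the sample entries are hypothetical.

```python
def entries_on_map(entries, map_year):
    """Return the entries whose validity period includes the given map year."""
    return [e for e in entries if e["start_map"] <= map_year <= e["end_map"]]


# Placeholder entries: one platform that was renamed and one that appears later.
entries = [
    {"stop_id": 19081, "stop_name": "Old Name", "stop_line": "1",
     "start_map": 1908, "end_map": 1908},
    {"stop_id": 19081, "stop_name": "New Name", "stop_line": "1",
     "start_map": 1915, "end_map": 1950},
    {"stop_id": 19153, "stop_name": "Later Station", "stop_line": "7",
     "start_map": 1915, "end_map": 1950},
]

for year in (1908, 1915, 1950):
    names = [e["stop_name"] for e in entries_on_map(entries, year)]
    print(year, names)
```

Since both fields refer to map years, the filter is only meaningful for the years of the georeferenced maps, here 1908, 1915 and 1950.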
===Adding new entries to the database===
If new entries are added to the databases, then they must take into account the following rules:
#If the station is completely new, then a new entry is created with a new '''stop_id''' and '''stop_name'''. The '''start_map''' corresponds to the map from which the information is retrieved. The '''end_map''' corresponds to the last map from which the information is retrieved. This field should also be updated in the other existing entries if the stations with the same lines still appear on the last map.
#If the station is still in operation today, the '''ratp_id''', when found, is added to the station. If the station is a ghost station - not in operation anymore - the '''ratp_id''' entry would be empty.
#If the name of a station is changed, then the existing entry is duplicated. The '''stop_id''' remains, the '''stop_name''' is changed, the '''start_map''' is set to the current map from which the information is retrieved and '''end_map''' to the last map in which the information appears.
#If the name of the metro line is changed, then the same applies as above, except that it is the '''stop_line''' that changes.
#If a station was merged, then the '''end_map''' of the station remains as the year of the last map from which the information was seen. The '''end_map''' of the main station which absorbed the other is changed to the last map from which the information was retrieved. The '''close_date''', if known, should be set to the accurate year the station was merged.
 
The whole process might be more understandable with the following graph:
<div><ul>
<li style="display: inline-block;">[[File:Database_graph.png|none|800px|]] </li>
</ul></div>
 
===Challenges encountered===
While trying to be consistent with the [https://data.ratp.fr/explore RATP] GeoJSON database, some challenges had to be addressed regarding the differences between that database and the maps used for georeferencing.
* The '''stop_name''' was not always identical between the RATP database and the station name displayed on the maps. This is partially due to the fact that some stations have changed names, or that existing names received add-ons, possibly following the merging of stations. Another reason is that the RATP database includes not only the Paris Metropolitan but also the RER stations, which today extend far beyond the Paris city limits into the suburbs, as well as all the bus stops. While the bus stops in the database are written in capital letters (e.g. OPERA), the metro stations are written in ordinary case (e.g. Opéra). However, some metro stations had add-ons that followed no apparent logic and, given that there was no legend attached to the database, there was no way to explain the choice of station name. For instance, the metro station St. Paul on line 1 had the '''stop_name''' Saint-Paul (Le Marais). It is not clear why (Le Marais) was added to the station name, since all the maps display St. Paul only. The different naming conventions used by the RATP were challenging, as they made it non-trivial to retrieve the '''ratp_id''' for this project's database. Stations that did not have identical names in the RATP database and on the maps had to be searched manually to retrieve the corresponding '''ratp_id'''.
* The RATP database uses a distinct '''stop_id''' for each direction and each line of the same station. This means that if a metro station has two lines passing through it, it has four distinct '''stop_id''' values, depending on the direction in which the train is heading (for example east-west or north-south). As the RATP IDs were retrieved systematically to fill our '''ratp_id''' field, one of these distinct IDs was used, in no specific order. The purpose of the '''ratp_id''' is to link our database to the RATP's stations, and any of these IDs links an entry to the corresponding RATP metro station.
* There are metro stations that have been closed and never reopened, as well as metro stations that changed their name. In this project's database, it was decided to avoid the complexity of adding a '''rename_date''' in addition to the '''close_date'''. Instead, if a station is renamed, the entry with the old name receives a '''close_date''' that corresponds to the date of the renaming, and the entry with the new '''stop_name''' receives an '''open_date''' corresponding to that same date; both entries keep the same '''stop_id''' and coordinates. If a station closed and never reopened, it receives the corresponding closing date, with no duplicate '''stop_id'''.
* The RATP database allocates various coordinates to the same metro station as there are multiple entry points to the station. However, on metro maps, there is only one single point for each station. Therefore, for the project's database, the coordinate of the georeferenced point of the maps is taken for the '''coordinates''' entry.
 
== Quantitative analysis of the performances of extraction ==
In order to determine the accuracy of the information retrieved from the three maps, two contemporary databases containing geolocation information of RATP stations were used. The first one comes from the [https://data.iledefrance.fr/page/home/ open data] portal of the Île-de-France region, which is run by the Conseil régional d'Île-de-France. The second set of data was retrieved from the [https://dataratp2.opendatasoft.com RATP open data], managed by the RATP itself. We decided to use two sets of data because they contain different geolocations for the stations of the Metropolitan. Furthermore, it seems that the database from the RATP also contains bus stops and all the entrances of each Metropolitan station. In order to better determine the accuracy of our results, we compared them with both sources.<br><br>
The comparison was made with the help of the software QGIS. Both datasets come as GeoJSON files and can be imported as shapefiles. Then, based on these shapefiles, a new layer made of buffers was generated for each of them with the MMQGIS plugin. These buffers are circles created around each point of the source layer. The radius of these circles was set to 50 meters in order to allow a margin of error, as geolocation points already differ from one source to another. The comparison itself was made manually, with an Excel file used to store the results. If the station extracted from the old maps is located inside the 50 meter perimeter generated around the station location from the official database, then the extraction is considered successful. Otherwise, the extraction is considered incorrect. From this information gathered for each official dataset, the two pie charts below were generated. ''Correct'' represents the percentage of stations that were successfully geolocated, whereas ''incorrect'' covers the stations that were not within the perimeter created around the official sources. ''Impossible to check'' covers the stations retrieved from the old maps that did not have a matching station in the idf or RATP databases. This is mainly due to the fact that some stations closed or were merged with other stations.
<div><ul>
<li style="display: inline-block;">[[File:quantitative_idf_output.png|thumb|none|500px| Results for the performances of geolocations extraction based on the idf dataset]] </li>
<li style="display: inline-block;">[[File:quantitative ratp output.png|thumb|none|500px| Results for the performances of geolocations extraction based on the RATP data]] </li>
</ul></div>
As we can see from these charts, there is a great difference between the two sets. The idf file relates exclusively to the Metropolitan stations and, measured against it, the extraction from the old maps cannot be considered successful. On the other hand, the RATP file provides many different positions for each station and also includes bus stops. As it is very difficult to differentiate them, see [[#Challenges encountered | Challenges encountered]], all locations were used. Thanks to these many locations, the data extracted from the maps seem more successful, with a total of 67% correct entries. However, if we expand the perimeter around the geolocated points of the idf database to 100 meters, the following results are obtained.
<div><ul>
<li style="display: inline-block;">[[File:Quantitative_idf_100_output.png|thumb|none|500px| Results for the performances of geolocations extraction based on the idf database and a 100 meter perimeter]] </li>
</ul></div>
The success percentage increases from 47% to 70%, which shows that most entries extracted from the maps suffer from a lack of precision rather than from a gross misplacement.
In general, we can say that either the way we extracted the data by georeferencing the maps in QGIS was not accurate enough, or the maps themselves were not very precise. But when we checked the accuracy of the different entries, the proportion of mistakes was practically the same for the map of 1950 (52 incorrect entries out of 156, about 33%) as for the oldest one from 1908 (40 incorrect entries out of 118, about 34%). We therefore cannot conclude that the maps themselves were inaccurate, as one would expect the map from 1950 to be more precise than the first one. However, we saw that the largest errors, namely stations located much further away from the official positions, all concerned peripheral stations and that the center of the maps generally benefited from a better accuracy. Another hypothesis that could explain these mistakes and their correlation with the position on the maps is the Coordinate Reference System (CRS) that was used for the project. Indeed, as we were working with French documents, we used RGF93, which is the French standard. However, the RGF93 system was officially adopted in 1993<ref>Institut National de l'Information Géographique et forestière (IGN), [https://geodesie.ign.fr/index.php?page=rgf93 ''Le RGF93 | Géodésie''], last accessed on 2018-12-14</ref> (hence 93), which means that the three maps, as they were created before this convention, were not drawn in the French CRS that we used for the project. It is therefore all the more important to georeference carefully, with a sufficient number of points. After a brief analysis of the file containing the different points used for the georeferencing, we noticed that most of the corresponding points used to place the old maps in RGF93 space were located at the center of the maps. We worked with central locations because this is where most famous buildings are, and they are easier to match with the current representation of the city.
The combination of these elements could thus explain the misplacement of the peripheral stations and emphasizes the importance of a good georeferencing, with points spread over various positions on the old map.<br><br>
The accuracy of the Paris Metropolitan station names was determined by taking a random sample of 45 stations from the 1950 map and comparing them to the corresponding names in the database. With a strict comparison of the station names, the accuracy is 71.1%. This relatively low accuracy is due to various reasons:
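The buffer test described above was done in QGIS and checked by hand. As a rough equivalent, the sketch below classifies an extracted point by its great-circle (haversine) distance to the official positions of the same station, which swaps the projected QGIS buffers for a distance computed directly on longitude and latitude. The function names and the input layout are assumptions made for illustration.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, in meters


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def classify(extracted, official, radius_m=50.0):
    """Classify one extracted station against its official positions.

    extracted: (lon, lat) of the point drawn on the old map.
    official:  list of (lon, lat) positions of the matching station,
               empty when the station has no match in the dataset.
    """
    if not official:
        return "impossible to check"
    nearest = min(haversine_m(*extracted, *ref) for ref in official)
    return "correct" if nearest <= radius_m else "incorrect"
```

Calling <code>classify</code> with <code>radius_m=100.0</code> reproduces the widened perimeter; a point about 67 meters from its reference moves from ''incorrect'' to ''correct''.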
*The map might have shortened certain names: ''Saint-Martin'' as ''St-Martin'', or ''Notre-Dame des Champs'' as ''N.D. des Champs''.
*Certain stations have accents on top of certain letters while the database does not, and vice versa.
When shortened names and accents are ignored, the accuracy increases to 97.8%.
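The two accuracy figures are consistent with 32 and 44 matching names out of 45. The relaxed comparison can be sketched as follows; this is an illustration and not the procedure actually used, and the abbreviation table is an assumption limited to the examples given above.

```python
import unicodedata

# Assumed abbreviation table, limited to the cases mentioned in the text.
ABBREVIATIONS = {
    "st": "saint", "st.": "saint",
    "ste": "sainte", "ste.": "sainte",
    "n.d.": "notre dame",
}


def normalize(name):
    """Lowercase a station name, drop accents and expand abbreviations."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    tokens = no_accents.lower().replace("-", " ").split()
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def accuracy(pairs, strict=True):
    """Share of (map name, database name) pairs considered identical."""
    if strict:
        matches = sum(1 for a, b in pairs if a == b)
    else:
        matches = sum(1 for a, b in pairs if normalize(a) == normalize(b))
    return matches / len(pairs)
```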
 
== Motivation and description of the demonstration of the data collected ==
The service provided by this extraction of data and the creation of a database consists of interactive maps, available on a website, that enable the user to see the evolution of the Metropolitan of Paris based on the three key dates that are 1908, 1915 and 1950. Indeed, as we have seen on the three maps, the Metropolitan evolves quickly between 1908 and 1915, as these years correspond to the period leading up to the First World War and already reflect the first changes due to it, such as the name change of the Jean-Jaurès station. The last year studied, 1950, is interesting as it shows how the network evolved and grew after undergoing two World Wars and the German occupation during the second one. Names of stations changed in honor of soldiers or because of the aftereffects of political dissatisfaction. These changes over time can also be clearly seen thanks to the third map, which displays the creation of the stations chronologically. The Metropolitan of Paris is moreover a reflection of the evolution of the city itself, and the rest of the information provided by the maps has to be taken into account as well, hence the possibility to have a side-by-side synchronized visualization of the current map of Paris and the three old maps.<br>
===Creation of interactive maps===
The creation of interactive maps is made possible by the structure of the database, which stores for each piece of information the year of the maps from which it was retrieved. A total of three maps were generated in Python with the Folium library, which is built on Leaflet, and later edited in JavaScript for integration into HTML. Indeed, the final goal was to display the maps on an HTML page, and JavaScript code had to be generated in order to display the interactive information in a web browser.<br><br>
As explained above, the first Leaflet map overlays the three networks retrieved from each historical document on an OpenStreetMap basemap. The user can thus select a specific network from 1908, 1915 or 1950, and also superimpose them in order to compare the growing networks and to determine which stations closed or whether a path changed. In order to differentiate them, each network has a specific color: green for the network of 1908, blue for 1915 and red for 1950. These colors were chosen arbitrarily and simply help to separate the three sets of data visually. In order to display the information, each station is represented as a marker on the map and has a popup window that becomes visible when the user clicks on it. The popup window displays the name of the station, the name of the line it is attached to, its official opening date, which does not correspond to the map date, the historical information of the station if it exists and the corresponding RATP identification number.<br><br>
The second map is built on the first one. It is split vertically in two, with, on the left side, the overlay of networks and, on the right side, the possibility to display one of the three historical maps. Both windows work as one in terms of interactions: zooming or panning in one of the maps affects both. With this map, the user can thus explore an old map and compare in real time the data it contains with the contemporary map of the left window and the information retrieved from all the other maps. In the specific case of the maps chosen, this feature is really interesting, as the old maps contain a city plan of Paris. In addition to seeing the evolution of the Paris Metropolitan, the user can also explore the changes that occurred in the buildings of the city.<br><br>
The third and last map was created with a plugin called TimestampedGeoJson, which reveals the different stations in chronological order. The map is provided with a control at the bottom left of the window, which displays the date of the information currently shown. There is a play button that launches the progression through time and a slider bar that the user can move to the desired date. As the plugin is not working very well yet, the database had to be sorted chronologically and only the names of the stations are displayed in the popup window. The date presented at the bottom left of the window only includes the year and the month, as the display of the day was not working. It is important to acknowledge that the dates used here are not retrieved from the maps, but were added a posteriori to the database, based on Wikipedia sources. The accuracy of this information is not certified, but the overall display helps to better understand major periods of construction, such as the years before the First World War and different intervals between the two World Wars. It also helps to visualize the way the network grew and which parts of the city were prioritized.
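The preparation of the markers can be sketched in plain Python as follows. This is an illustration under assumptions, not the project's code: the <code>history</code> property name and both helper names are hypothetical, and the resulting HTML string and color would then be handed to the Folium marker and popup objects.

```python
import html

# Colors of the three network layers, as described in the text.
LAYER_COLORS = {1908: "green", 1915: "blue", 1950: "red"}


def popup_html(props):
    """Build the popup content of one station, skipping empty fields."""
    rows = [
        ("Station", props.get("stop_name")),
        ("Line", props.get("stop_line")),
        ("Opened", props.get("open_date")),
        ("History", props.get("history")),  # hypothetical property name
        ("RATP id", props.get("ratp_id")),
    ]
    return "<br>".join(
        f"<b>{label}:</b> {html.escape(str(value))}"
        for label, value in rows
        if value not in (None, "")
    )


def layers_for(props):
    """Return the map years (layers) on which the entry must be drawn."""
    return [
        year for year in LAYER_COLORS
        if props["start_map"] <= year <= props["end_map"]
    ]
```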
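The chronological sorting can be sketched as below. It assumes that '''open_date''' is stored as an ISO string (YYYY-MM-DD), so that sorting the strings sorts the dates, and that the plugin reads the dates from a <code>times</code> list and the popup text from a <code>popup</code> property, as in Folium's documented examples; both points should be checked against the plugin version in use. The function name is hypothetical.

```python
def to_timestamped_collection(features):
    """Build a chronologically sorted FeatureCollection for a time slider.

    Entries without an open_date are left out. Only the station name is
    kept for the popup, as in the map described above.
    """
    dated = [f for f in features if f["properties"].get("open_date")]
    dated.sort(key=lambda f: f["properties"]["open_date"])
    timed = []
    for feature in dated:
        props = feature["properties"]
        timed.append({
            "type": "Feature",
            "geometry": feature["geometry"],
            "properties": {
                "times": [props["open_date"]],  # one date per point
                "popup": props["stop_name"],
            },
        })
    return {"type": "FeatureCollection", "features": timed}
```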


===Creation of a website===
Thanks to the creation of a website, anyone can access the data extracted and, as a further possible development of the project, immerse themselves in the history of the Paris Metropolitan. The website is also a good platform to exchange and to raise public interest. A future goal would be to find new collaborators and to extend the database with other, more recent maps and further information for the existing stations. The project is as such not finished, and what we propose here is a foundation that is waiting for more contributions.


The website is thus organized around four pages. The index page has a direct link to this wiki page, which describes the project in depth. A second page, called '''An Evolution''', displays the first map, which shows the three networks as overlays or separately. The third page, '''Navigation Through Time''', presents on a full page the second map, which is divided into two synchronized windows, with, on the right, the possibility to see one of the old maps. Thanks to the comparison with the contemporary map of Paris displayed in the left window, the visitor can easily determine the changes that occurred over time from the perspective of the Metropolitan, but also other urban changes, such as buildings, as the old maps also contain this information. The last page, called '''A Growing Network''', shows the last map with the appearance of the stations through time. This last page can be considered a nice conclusion to the overall project and to the observations the visitor made while navigating through the previous maps. It offers the possibility to step back after having navigated through old and recent maps, and to simply contemplate the evolution in a passive way. Certain stations are displayed more than once to reflect the multiple lines passing through the station.<br><br>
The website is now [http://valentinebernasconi.ch/paris_metropolitan/index.html online].


== Selected maps ==


== References ==
* Michel Dansel, ''Paris-Metro'', Editions du Dauphin,  1975.
* Julian Pepinster, ''Le métro de Paris'', Editions La Vie du Rail,  2010.
* Armand Bindi, Daniel Lefeuvre, ''Le Métro de Paris, Histoire d'hier à demain'', Editions Ouest-France,  1990.
* Le Nevez, C., Christopher Pitts, and Nicola Williams. Paris, 2017.
* Robb, Graham. Parisians: An Adventure History of Paris. WW Norton & Company, 2010.
* Ladonne, Jennifer, Linda Hervieux, Nancy Heslin, Victoria Tang, and Jack Vermee. Fodor's 2015 Paris, 2014
* RATP, METRO.PARIS. Retrieved from http://metro.paris/en/
* Régie autonome des transports parisiens (RATP). Découvrez notre patrimoine. Retrieved from https://www.ratp.fr/lignesdhistoires/
* New York Times. 100 Dead in the Paris Disaster. New York Times, 1903. Retrieved from https://timesmachine.nytimes.com/timesmachine/1903/08/12/102017333.pdf
*  "Historique du métro parisien" [archive], on histoire-en-ligne.com via web.archive.org, article of October 22, 2002, modified on September 17, 2006. Retrieved from https://web.archive.org/web/20071014110917/http://www.histoire-en-ligne.com/spip.php?article17

Latest revision as of 20:38, 14 December 2018

Definition of the project

The group first selected a range of different maps showing the Paris Metropolitan, also simply called metro, at different years of the last century. In total, we collected from Gallica a set of two maps of the planning of the metro, from the definition of the routes to the addition of stations, a first map from 1908 of the actual metro after its construction in 1900, a second map from 1915, with already visible impacts of the First World War, and a third map from 1950, a more contemporary look at the metro as we know it today. Our first idea was to analyze these maps in order to understand the evolution of the Paris Metropolitan, how different areas of major cultural attractions evolved around or hand in hand with the metro stations and how it was impacted by catastrophic events such as wars. However, as the goal of the project is to produce a working interface within a short amount of time, we decided to reduce the work of data extraction to three maps, namely the first map from 1908, the map from 1915 and the last one from 1950. Based on these maps, we thus intend to build an overlay of a current map of Paris and the metro network extracted from the old maps. From this visual display, users will be able to see the evolution of the Paris Metropolitan from 1908 until today, and important historical explanations will be linked to the corresponding stations. The result should take the form of a web page that displays the interactive map. The users would have the possibility to display the different layers, namely the layer of the old maps and their metro networks, with stations and lines, on top of a current map of Paris. Popup windows on each station would display specific historical information about the station, which would be linked to the sources.
It is on the basis of this prototype that similar data extractions on other maps can be considered. The project thus intends to offer a solid foundation, with a database and a display framework, for future similar data extractions from old metro maps.

GitHub repository

In order to carry out this project, a GitHub repository was created as well as a website to display the results. However, as the project handles very large files, namely the three old maps, some files used for the data extraction in QGIS could not be added to the GitHub repository.

Main steps

  • Download in high resolution the different maps
  • Create a list of all stations from the first map of 1908
  • Determine the main cultural attractions around these stations
  • Georeference the map from 1908
  • Align the maps with a contemporary map of Paris
  • Extract paths and stations from the map
  • Determine or extract coordinates of the different stations
  • Create a Database with all the information gathered for each station from 1908
  • Compare the first path and stations from 1908 with later representations of the Paris Metropolitan
  • Create a website in order to display the maps in an interactive way
    • pop up windows on specific points, such as stations with strong historical backgrounds
    • an overlay of maps in order to better see the evolution from 1908 to more recent years

Milestones

Week 9 (14.11 - 16.11):

  • Georeferencing, alignment with a contemporary map and extraction of the paths and stations
  • Preparation of the structure of a GeoJSON database
  • Preparation of the midterm presentation
  • Finalisation of milestones

Week 10 (21.11 - 23.11):

  • Analysis of the evolution of the Paris Metropolitan based on the information gathered
  • Finish Database in GeoJSON
  • Finish writing the description of the extraction methods
  • Planning for the creation of a website

Week 11 (28.11 - 30.11):

  • Creation of the website to display the data
  • Implementation of an interactive map

Week 12 (5.12 - 7.12):

  • Finalization of the project
  • Finish writing the report (historical introduction to the map, analysis of the performance of extraction, motivation and description of the services)

Historical introduction to the map

The maps used for the project were all published by A. Taride. The publishing house was founded by Alphonse Taride in 1852 [1] in Paris and was one of the first to create road, tourist and school maps. The group grew in 1895 when, helped by the "Union vélocipédique de France" and engineers from "Les ponts et chaussées", they printed their first maps at the scale of 1/25 000 000 [2]. These maps were considered a reference in Europe and North Africa until 1930 [3]. Alongside their collection of cycling (vélocipédiques) maps, they also offered a range of Paris metropolitan maps and tourist guides, translated into different languages [4]. Unfortunately, the firm declined after the Second World War and reduced its production to Paris maps and globes. The firm was apparently later bought by the last French producer of globes.
Based on the maps, we can see that the publishing house was first located at 18 and 20 Boulevard Saint Denis (described on the 1908 and 1915 maps) and then moved to 154 Boulevard Saint Germain, as we can see on the 1950 map. They are all printed in color and display the city of Paris overlaid by the metropolitan network. They all provide a legend with the definition of the different paths and stations. The map from 1908 is at a scale of 1/8000, the map from 1915 of 1/21 000 and the one from 1950, at a scale of 1/33 000.

All the maps chosen represent the path of the Metropolitan of Paris overlaid on the city map. The first line of the Metropolitan of Paris was built in 1900 for the Paris Exposition Universelle. However, the race to build a network of railway transport started long before, around 1845, as many capital cities were considering the possibility of developing such transport systems, such as London, which opened the world's first underground railway in 1863. Unfortunately, the project remained a subject of discussion for several years, as the city of Paris wanted to build its own network for its inhabitants while the national railway services wanted to extend existing transport for people living around the metropolis. The upcoming Exposition Universelle and the development of such innovative means of transport in other big capitals finally tipped the balance in favor of an underground railway network built specifically for the city of Paris [5]. The construction of the network was entrusted to the engineer Fulgence Bienvenüe, the "Père Métro", who devoted almost his entire career to the project. The first plans of the Metropolitan provided for a total of six lines, labelled from A to F, and 18 stations[6]. The construction of the first line, from Porte de Vincennes to Porte Dauphine, which was called line A at the planning stage, started in November 1898 [7]. According to the law from March 1898 enabling the construction of the Metropolitan[8], the tracks had to cross one above the other, the total length of the trains was set to a maximum of 72 meters and that of the platforms to 75 meters. The city was transformed into a big construction site and the first trains were already tested in December 1899. However, it was only the following year, on July 19th 1900, after the first months of the Exposition Universelle, that the inauguration of the first line discreetly took place.
Despite its official introduction away from the crowds, the press craze the morning after gave the Metropolitan a successful start. At rush hour, the underground network ran one train every 10 minutes at a constant speed of 25 km per hour, whereas other means of transport at the surface could not go faster than 10 km per hour[9]! The construction of the following lines, from B-1 and B-2 to line C, today's line 3, which at the time passed through different business districts, and of the remaining stations of the first plans took place in the following years and was finished before the First World War. Unfortunately, due to the latter, further improvements of the network were slowed down and construction only resumed in 1925[10]. Another big upheaval was the creation of the Régie autonome des transports parisiens (RATP) on January 1st 1949[11]. The creation of this famous authority, which is still in charge of the network today, is due to important political and social changes that occurred at the end of the Second World War.

The chart below displays the construction of the Paris Metropolitan from its foundation to the last map selected, from 1950. It can be observed that the vast majority of the metro stations were constructed before the First World War and between the two wars. Construction was significantly reduced during the wars, possibly because resources were diverted to the military.

  • Distribution of the construction of the Paris Metropolitan based on major historical events like WW1 and WW2

Detailed description of the extraction methods

In order to extract information from the map, we decided to work with the open source software QGIS3.
This software is a powerful GIS platform which makes it possible to georeference maps and to draw points, lines and polygons in order to create new and interactive maps or to enhance and analyze existing ones. It benefits from a good community of users, with even a dedicated association for Switzerland.
The extraction of the information from the Paris map of 1908 required different steps, from the georeferencing to the extraction of the metropolitan network itself.

Georeferencing

The first step of the work with the 1908 map consists of georeferencing it, namely registering it in a coordinate system in order to map it to a location on the surface of the earth [12]. Indeed, the basic map of the Metropolitan of Paris is just an image file that does not contain any geographic information such as coordinates. In order to create these coordinates, we had to work with a shapefile of the buildings of the city of Paris. These resources are easily found online, as they are made freely available by the city of Paris [13]. This shapefile is first imported into the new project created in QGIS. Then, the CRS (Coordinate Reference System) used in the context of the project has to be correctly set to the French CRS, called RGF93. After having synchronized the layers with this CRS, the plugin Géoréférenceur GDAL[14] is used. It consists of a window in which our image, the Paris map of 1908, is imported as a raster and displayed. A raster is an image file to which georeferencing information is added [15]. In order to determine the right coordinates on the image of 1908, a tool from the plugin makes it possible to match points of the image with points of the shapefile of the city of Paris. The work consists of selecting a point on the image and then finding its visual correspondence on the shapefile of the buildings of Paris. We chose these points according to recognizable buildings which exist on both maps. In the end, a total of 6 coordinate points were used to launch the georeferencing. The transformation we then performed was of type Polynomiale 1 (first-order polynomial), and a .tiff file was created. This .tiff file contains both the image and the coordinate information of the raster and can be used as a foundation for the following steps of the project.
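A first-order polynomial transformation is an affine mapping from pixel positions to map coordinates, fitted to the control points by least squares. The sketch below illustrates that computation in plain Python; it is not what QGIS executes internally, and the function names are hypothetical.

```python
def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    a = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            factor = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= factor * a[col][c]
    x = [0.0, 0.0, 0.0]
    for r in reversed(range(3)):
        known = sum(a[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (a[r][3] - known) / a[r][r]
    return x


def fit_affine(pixels, coords):
    """Least-squares fit of X = a*px + b*py + c and Y = d*px + e*py + f.

    pixels: list of (px, py) positions picked on the scanned map.
    coords: list of matching (X, Y) positions in the target CRS.
    At least three non-collinear control points are required.
    """
    rows = [(px, py, 1.0) for px, py in pixels]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    params = []
    for axis in range(2):
        atb = [sum(r[i] * c[axis] for r, c in zip(rows, coords)) for i in range(3)]
        params.append(solve3(ata, atb))
    return params


def apply_affine(params, px, py):
    """Convert one pixel position to map coordinates."""
    (a, b, c), (d, e, f) = params
    return a * px + b * py + c, d * px + e * py + f
```

With control points concentrated in one area, the fitted parameters describe that area well but extrapolate less reliably towards the edges of the image.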

Overlaying the old map on a contemporary basemap

Now that we have our old map as a raster, its image can easily be placed on top of a contemporary map with the help of its coordinates. QGIS3 offers a feature called XYZ Tiles, which makes it easy to overlay the raster on a tile layer, a corresponding basemap, from the OpenStreetMap website.

Extracting information

Based on the raster created, the extraction of information can be performed. In order to extract the different stations, a shapefile has to be created. A shapefile, already introduced above for the georeferencing, is a set of different files (.shp, .dbf, .shx, .prj) which together contain the information needed to create a map, with a coordinate system and a visual display via points, lines or polygons. When creating a new shapefile layer with QGIS, the name has to be specified as well as the type of geometry, which can be points, lines or polygons. Then, a list of fields can be defined. These fields are later filled in for each point created and can constitute a database.

Railway stations

For the layer containing information about the stations, the type of geometry used is the point. We also added different fields of information for a future database, discussed below. Among others, we added a first string field called stop_name, to store the name of the station, and a second string field called stop_line, to store the name of the line the station belongs to. Then each station visible on the old Metropolitan map of Paris was marked with a point and given a name and a line accordingly.

Railway lines

We then created a second shapefile layer for the lines of the metro. The geometry used for this purpose is the line, and a string field called name was created. In order to align the vertices of each line with the exact coordinates of the stations previously set and to create a coherent path, the snapping feature of QGIS3 [16] was enabled. Snapping makes it possible to define an anchoring perimeter around the points of a layer. When a line is drawn, existing points then work as magnets and attract the line so that it passes through them. Thus, the 6 different lines of the Metropolitan of Paris from 1908, namely line 1, 2 Nord and 2 Sud, 3, 4, 5, 6, were created so that they pass through the exact coordinates of the stations previously established.
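
The snapping behaviour described above can be illustrated with a short sketch. This is not QGIS code. It only shows the idea of pulling each vertex of a drawn line onto the nearest station when that station lies within a tolerance. The function names and the tolerance value are assumptions.

```python
import math


def snap_vertex(vertex, stations, tolerance):
    """Return the nearest station position within tolerance, else the vertex itself."""
    best, best_distance = None, tolerance
    for sx, sy in stations:
        distance = math.hypot(vertex[0] - sx, vertex[1] - sy)
        if distance <= best_distance:
            best, best_distance = (sx, sy), distance
    return best if best is not None else vertex


def snap_line(vertices, stations, tolerance):
    """Snap every vertex of a polyline to the station points, where close enough."""
    return [snap_vertex(v, stations, tolerance) for v in vertices]


# Two stations and a roughly drawn line: the end vertices are pulled onto the
# stations, the middle vertex is too far from both and stays where it was drawn.
stations = [(0, 0), (10, 0)]
drawn = [(0.4, 0.3), (5, 5), (9.8, -0.1)]
snapped = snap_line(drawn, stations, tolerance=1.0)
```

Coordinates are assumed to be in a projected system, so that the tolerance is expressed in the same unit as the map.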

Note that only metro lines were extracted from the maps, even if other train lines are present on them. For instance, the 1915 map displays a newly constructed Nord-Sud line, but this line was not part of the Paris Metropolitan until it merged with it later, as visible on the 1950 map. Also, certain lines displayed on the 1915 map were still under construction and hence are not displayed in the corresponding map on the website.

Implementing a process for further maps

As the extraction method presented above is not an expensive process in terms of time, mainly due to the scale of the data, the information from the last two maps, from 1915 and 1950, presented in Selected maps, was also extracted. In order to create a coherent database and a reusable process for later maps, as discussed below, the new stations and lines of both maps were extracted in chronological order. We thus started with the map from 1915 and determined the new stations that were created between 1908 and 1915, thanks to the georeferencing of the map and the possibility of overlaying the data previously extracted. The process consists of three steps:

  1. Create a new shapefile layer for the stations of the map (here we started with 1915).
  2. Look for name changes or station mergers. Add the points of the new stations, with the same fields of information as those used for the first extraction. The same applies to new lines.
  3. Then, for each station whose name changed between the current map and the previous georeferenced one (here between 1908 and 1915), add a new point to the current shapefile layer, with the changed information in the corresponding fields.

Thus, each shapefile layer, corresponding to one map (and consequently to one year), contains both the geolocation of the newly built stations and the stations whose name changed compared to the previous georeferenced map. Layer after layer, we can thus see the metropolitan network growing.

Creation of a database

A significant aspect of this project is the creation of a database, in order to work further with the extracted data, for instance to create interactive maps, and to lay the foundation of a reusable tool for later maps and contemporary changes of the Paris Metropolitan. To create this database, we used the information added manually for each point entry in QGIS. It was therefore important to create the required fields of information from the beginning, in preparation for the database. For the sake of consistency, the naming convention of the RATP GeoJSON database was adopted and extended for the database of this project. In order to create a coherent database that can handle the addition of new stations, name or line changes and station mergers, the following fields were created:

  • stop_id <int>: the identification number of each platform of a station. When a station is served by multiple metro lines, a separate entry is created for each line, decomposing the station into different platforms. When a station point is created on a metro line, it is given a stop_id in the format YYYYE, YYYY being the year of the map and E the unique entry number of that point in the shapefile. When the name of the station changes, the platform keeps its identification number.
  • stop_name <string>: the name of the station as written on the map the station was extracted from. When the name changes on a new map, the entry corresponding to the station is duplicated, the stop_id remains the same and the stop_name field is changed accordingly.
  • stop_line <string>: the name of the line to which the station belongs. If multiple lines pass through a station, an entry is created for each platform within the station, as explained above, with a unique stop_id for each of them.
  • stop_info <string>: historical information about the station.
  • start_map <int>: a number in the form YYYY, corresponding to the year of creation of the map from which the station was retrieved. If the station changed name, this field corresponds to the year of the map where the name change occurred.
  • end_map <int>: a number in the form YYYY, corresponding to the year of the map where the station name disappears, either because of a merger or because of a name or line change.
  • open_date <date_format>: the actual opening date of the station, independently of the map it was retrieved from. It is in the format YYYY-MM-DD.
  • close_date <date_format>: the actual date of closure or name change of the station, if it occurred, independently of the map it was retrieved from. It is in the format YYYY-MM-DD, or NaN if the station is still in use.
  • ratp_id <int>: the identification number of the station according to the database provided by the RATP. This identifier allows the project to be used in varied contexts and to be linked to the RATP data itself.
  • coordinates [<float>, <float>]: the coordinates of the georeferenced point on the maps, with respect to present-day coordinates of Paris.
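
As an illustration of these fields, the sketch below builds one station entry as a GeoJSON feature. The helper names and all example values (station name, entry number, coordinates) are placeholders chosen for this sketch, not data from the project. The coordinates are stored in the GeoJSON geometry, in longitude, latitude order, and None stands in for the NaN mentioned above, since NaN is not valid JSON.

```python
import json


def make_stop_id(map_year, entry_number):
    """Build a stop_id in the YYYYE form: map year followed by the entry number."""
    return int(f"{map_year}{entry_number}")


def make_station_feature(*, map_year, entry_number, name, line, lon, lat,
                         end_map, open_date=None, close_date=None,
                         info="", ratp_id=None):
    """Return one platform entry as a GeoJSON Feature with the fields listed above."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {
            "stop_id": make_stop_id(map_year, entry_number),
            "stop_name": name,
            "stop_line": line,
            "stop_info": info,
            "start_map": map_year,
            "end_map": end_map,
            "open_date": open_date,    # "YYYY-MM-DD" when known
            "close_date": close_date,  # None while the station is still in use
            "ratp_id": ratp_id,        # None when no RATP match was found
        },
    }


# Placeholder entry: 12th point drawn on the 1908 map, still present in 1950.
example = make_station_feature(
    map_year=1908, entry_number=12, name="Example station", line="1",
    lon=2.3690, lat=48.8530, end_map=1950,
)
print(json.dumps(example, ensure_ascii=False, indent=2))
```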

Thanks to QGIS, we were able to store the data for each station when creating its geolocated point in the shapefile. The whole can then be exported in various formats, such as GeoJSON and csv, two formats widely used for databases containing geographic references. We thus decided to export the database of each shapefile in both GeoJSON and csv, in order to allow various uses of the data collected. As we had a set of three different databases, each corresponding to one of the three maps from which the data was extracted, we decided to group them in one single file and to make a few changes, either manually or with the help of a short Python program. Indeed, a few lines underwent important name changes between 1908 and 1950, for example line 2 Sud, which became line 5 between 1908 and 1915 and line 6 before 1950. As we did not want to manually recreate a point at the location of each station of the line for each map, the entries corresponding to that line were duplicated and the corresponding start_map and end_map were corrected with a few lines of Python code.
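
The original script is not reproduced here. The following is a minimal sketch of what such a duplication step could look like, assuming entries are plain dictionaries with the fields described above. The function name, its parameters and the stop_id values in the example are hypothetical.

```python
def rename_line(entries, stop_ids, old_line, new_line,
                last_map_old, first_map_new, last_map_new):
    """Duplicate the platforms in stop_ids that sit on old_line under new_line.

    The old entry is closed at last_map_old. The copy keeps the same stop_id
    and is valid from first_map_new to last_map_new. A new list is returned
    and the input entries are left untouched.
    """
    result = []
    for entry in entries:
        if entry["stop_id"] in stop_ids and entry["stop_line"] == old_line:
            closed = dict(entry, end_map=last_map_old)
            renamed = dict(entry, stop_line=new_line,
                           start_map=first_map_new, end_map=last_map_new)
            result.extend([closed, renamed])
        else:
            result.append(dict(entry))
    return result


# Hypothetical platforms: A is on line 2 Sud in 1908, B is on line 5 throughout.
entries = [
    {"stop_id": 19081, "stop_name": "A", "stop_line": "2 Sud",
     "start_map": 1908, "end_map": 1908},
    {"stop_id": 19082, "stop_name": "B", "stop_line": "5",
     "start_map": 1908, "end_map": 1915},
]
# 2 Sud becomes line 5 on the 1915 map, then line 6 on the 1950 map.
step1 = rename_line(entries, {19081}, "2 Sud", "5", 1908, 1915, 1915)
step2 = rename_line(step1, {19081}, "5", "6", 1915, 1950, 1950)
```

Restricting the change to an explicit set of stop_id values avoids renaming platforms that were already on the target line before the change, such as B in this example.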

As a result, we obtain two databases, one for the stations and one for the lines. The latter is implemented in the same way but exists only in GeoJSON, as this format handles polylines better. The database works as follows. Each station was given a stop_id when it first appeared on a map. The start_map corresponds to the year of the map from which the information was retrieved, and the end_map to the year of the last map where the information was seen. Each station is affiliated with a unique line, whose name is stored under stop_line. If multiple lines pass through a station, there is a separate entry for each line, thus decomposing the station into different platforms. This is why the stop_id is said to correspond to a platform for a specific metro line within a station.
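
Under this reading of start_map and end_map (first and last map on which an entry is seen), the network of a given map can be recovered with a simple filter. The sketch below assumes entries are dictionaries carrying the fields described above. The function names are hypothetical.

```python
from collections import defaultdict


def entries_on_map(entries, map_year):
    """Return the entries that are valid on the map of the given year."""
    return [e for e in entries if e["start_map"] <= map_year <= e["end_map"]]


def lines_by_station(entries, map_year):
    """Group the platforms of a map by station name: name -> list of lines."""
    grouped = defaultdict(list)
    for entry in entries_on_map(entries, map_year):
        grouped[entry["stop_name"]].append(entry["stop_line"])
    return dict(grouped)
```

A station crossed by two lines then appears once in the result of lines_by_station, with both line names attached, even though it is stored as two platform entries.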

Adding new entries to the database

If new entries are added to the databases, then they must take into account the following rules:

  1. If the station is completely new, then a new entry is created with a new stop_id and stop_name. The start_map corresponds to the map from which the information is retrieved. The end_map corresponds to the last map from which the information is retrieved. This field should also be updated on the other existing entries if the stations with the same lines still appear on the last map.
  2. If the station is still in operation today, the ratp_id, when found, is added to the station. If the station is a ghost station, that is, no longer in operation, the ratp_id entry is left empty.
  3. If the name of a station is changed, then the existing entry is duplicated. The stop_id remains, the stop_name is changed, the start_map is set to the current map from which the information is retrieved and the end_map to the last map in which the information appears.
  4. If the name of the metro line is changed, then the same applies as above, except that it is the stop_line that changes.
  5. If a station was merged, then the end_map of the station remains the year of the last map on which the information was seen. The end_map of the main station which absorbed the other is changed to the last map from which the information was retrieved. The close_date, if known, should be set to the exact year the station was merged.

The whole process might be more understandable with the following graph:

  • Database graph.png

Challenges encountered

While trying to be consistent with the RATP GeoJSON database, some challenges had to be addressed between that database and the maps used for georeferencing.

  • The stop_name was not always identical between the RATP database and the station name displayed on the maps. This is partially due to the fact that some stations have changed names, or that existing names received add-ons due to possible mergers of stations. Another reason is that the RATP database not only includes the Paris Metropolitan but also the RER stations, which today go far beyond the Paris city limits into the suburbs, as well as all the bus stops. While the bus stops in the database are written in capital letters (e.g. OPERA), the metro stations are written in ordinary case (e.g. Opéra). However, some metro stations had add-ons with no apparent logic, and as there was no legend attached to the database, there was no way to explain the choice of station name. For instance, the metro station St. Paul on line 1 had the stop_name Saint-Paul (Le Marais). It is not clear why (Le Marais) was added to the station name, since all maps display St. Paul only. The different naming conventions used by RATP made it non-trivial to retrieve the ratp_id for this project's database. Stations whose names were not identical in the RATP database and on the maps had to be searched manually to retrieve the corresponding ratp_id.
  • The RATP database uses a distinct stop_id for each direction and each line number of the same station. This means that if a metro station has two lines passing through it, it has four (4) distinct stop_id values, depending on the direction the train is heading (east-west or north-south). When systematically retrieving the RATP identifiers to fill our ratp_id, one of these distinct IDs was used, in no specific order. The purpose of the ratp_id is to link our database to the RATP's stations, and any of these IDs links the ratp_id to the corresponding RATP metro station.
  • There are metro stations that have been closed and never reopened, as well as metro stations that changed their name. In this project's database, it was decided to avoid the complexity of adding a rename_date in addition to close_date. Instead, if a station is renamed, the entry with the old name receives a close_date corresponding to the date of the renaming, and the entry with the new stop_name receives an open_date corresponding to that same date. Both entries keep the same stop_id and coordinates. If a station closed and never reopened, it receives the corresponding closing date, with no duplicate stop_id.
  • The RATP database allocates several coordinates to the same metro station, as there are multiple entry points to a station. On metro maps, however, there is a single point for each station. Therefore, in the project's database, the coordinates entry holds the coordinates of the georeferenced point on the maps.

Quantitative analysis of the performances of extraction

In order to determine the accuracy of the information retrieved from the three maps, two contemporary databases containing geolocation information on RATP stations were used. The first one comes from the open data of the Île-de-France region, which is led by the communication department of the Conseil régional d'Île-de-France. The second set of data was retrieved from the RATP open data, managed by the RATP itself. We decided to use two sets of data because they contain different geolocations for the stations of the Metropolitan. Furthermore, it seems that the RATP database also contains bus stops and all the entrances of each Metropolitan station. In order to better determine the accuracy of our results, we compared them with both sources.

The comparison was made with the help of QGIS. Both data sets are GeoJSON files and can be imported as shapefiles. Then, based on these shapefiles, a new layer made of buffers was generated for each of them with the MMQGIS plugin. These buffers are circles created around each point of the source layer. The radius of these circles was set to 50 meters in order to leave a margin of error, as geolocation points already differ from one source to another. The comparison itself was made manually, with an Excel file to store the results. If the station extracted from the old maps is located inside the 50 meter perimeter generated around the station location from the official database, the extraction is considered successful. Otherwise, the extraction is considered incorrect. From this information, gathered for each official dataset, the two sector charts below were generated. Correct represents the percentage of stations that were successfully geolocated, whereas incorrect represents the stations that were not within the perimeter created around the official sources. Impossible to check covers the stations retrieved from the old maps that had no matching station in the idf or RATP databases. This is mainly due to the fact that some stations closed or were merged with other stations.
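
The comparison was done by hand in QGIS, but the buffer test itself can be expressed in a few lines. The sketch below is an assumed automated equivalent, not the procedure that was used. It measures the great-circle distance between an extracted point and the reference positions of the matching station, with coordinates given as (longitude, latitude) in degrees, and applies the three categories described above. The function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, in meters


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two (lon, lat) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def classify(extracted, references, radius_m=50.0):
    """Classify one extracted station against the reference positions of its match.

    references is the list of (lon, lat) positions found for the same station in
    the official dataset. It is empty when no matching station exists.
    """
    if not references:
        return "impossible to check"
    if any(haversine_m(*extracted, *ref) <= radius_m for ref in references):
        return "correct"
    return "incorrect"
```

Passing several reference positions for one station mirrors the RATP dataset, where each entrance has its own coordinates, while the idf dataset would give a single position per station.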

  • Results for the performances of geolocations extraction based on the idf dataset
  • Results for the performances of geolocations extraction based on the RATP data

As we can see from these charts, there is a great difference between the two sets. The idf file relates exclusively to the Metropolitan stations, and against it the extraction from the old maps cannot be considered successful. The RATP file, on the other hand, provides many different positions for each station and also includes bus stations. As it is very difficult to tell them apart (see Challenges encountered), all locations were used. Thanks to these many locations, the data extracted from the maps appear more successful, with a total of 67% correct entries. However, if we expand the perimeter around the geolocated points of the idf database to 100 meters, the following results are obtained.

  • Results for the performances of geolocations extraction based on the idf database and a 100 meter perimeter

The success percentage increases from 47 to 70%, which shows that most entries extracted from the maps suffer from a lack of precision rather than from a large mispositioning. In general, we can say that either the way we extracted the data by georeferencing the maps in QGIS was not accurate enough, or the maps themselves were not very precise. However, when we checked the accuracy of the different entries, the 1950 map had proportionally no fewer mistakes (52 incorrect entries out of 156) than the oldest one from 1908 (40 incorrect entries out of 118). We therefore cannot conclude that the maps themselves were inaccurate, since one would expect the 1950 map to be more precise than the first one. We did see, however, that the largest mistakes, namely stations located much further away from the official sources, were all peripheral stations, and that the center of the maps generally benefited from better accuracy. Another hypothesis that could explain these mistakes and their correlation with the position on the maps is the Coordinate Reference System (CRS) used for the project. As we were working with French documents, we used RGF93, which is the French standard. However, the RGF93 system was officially adopted in 1993 [17] (hence 93), which means that the three maps, as they were created before that convention, were not drawn in the French CRS that we used for the project in the first place. This makes a good georeferencing, with sufficient points, all the more important. After a brief analysis of the file containing the points used for the georeferencing, we noticed that most of the corresponding points used to place the old maps in RGF93 space were located at the center of the maps. We worked with central locations because this is where most famous buildings are, and they are easier to match with the current representation of the city.
The combination of these elements could thus explain the mispositioning of the peripheral stations, and it emphasizes the importance of a good georeferencing, with points taken from various positions on the old map.
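
The comparison between the two maps can be checked with a short computation based on the counts given above. The variable names are chosen for this sketch.

```python
# (incorrect entries, total entries) per map, as reported in the text.
incorrect_counts = {1908: (40, 118), 1950: (52, 156)}

# Share of incorrect entries per map, as a percentage rounded to one decimal.
error_rates = {
    year: round(100 * incorrect / total, 1)
    for year, (incorrect, total) in incorrect_counts.items()
}
print(error_rates)
```

The two rates are about 33.9% for 1908 and 33.3% for 1950, a difference of less than one percentage point, which supports the observation that the more recent map is not noticeably more accurate.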

The accuracy of the Paris Metropolitan station names was determined by taking a random sample of 45 stations from the 1950 map and comparing them to the names in the database. Under a strict comparison of the station names, the accuracy is 71.1%. This relatively low accuracy is due to various reasons:

  • The map might have shortened certain names: Saint-Martin as St-Martin, or Notre-Dame des Champs as N.D. des Champs.
  • Certain stations have accents on top of certain letters while the database does not, and vice versa.

When shortened names and accents are ignored, the accuracy increases to 97.8%.
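
A relaxed comparison of this kind can be sketched as follows. The normalization rules below (accent stripping, expansion of St and N.D., hyphens treated as spaces) are assumptions chosen to cover the two cases listed above. They are not the procedure actually applied, which is not documented here.

```python
import re
import unicodedata


def normalize(name):
    """Lowercase a station name, strip accents and expand common abbreviations."""
    decomposed = unicodedata.normalize("NFD", name)
    text = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn").lower()
    text = re.sub(r"\bn\.\s*d\.", "notre dame", text)  # N.D. -> Notre-Dame
    text = re.sub(r"\bst\b\.?", "saint", text)          # St or St. -> Saint
    text = re.sub(r"[-\s]+", " ", text)                 # hyphens and spaces are equivalent
    return text.strip()


def accuracy(pairs, strict=True):
    """Share of (map_name, database_name) pairs that match, strictly or after normalizing."""
    key = (lambda s: s) if strict else normalize
    matches = sum(1 for map_name, db_name in pairs if key(map_name) == key(db_name))
    return matches / len(pairs)
```

On a sample of 45 names, the reported figures are consistent with 32 strict matches (32/45 ≈ 71.1%) and 44 matches once abbreviations and accents are ignored (44/45 ≈ 97.8%).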

Motivation and description of the demonstration of the data collected

The service provided by this extraction of data and the creation of a database takes the form of interactive maps, available on a website, that enable the user to see the evolution of the Metropolitan of Paris based on three key dates: 1908, 1915 and 1950. Indeed, as we have seen on the three maps, the Metropolitan evolved quickly between 1908 and 1915. These years correspond to the period before the First World War, and the 1915 map already reflects the first changes due to it, such as the name change of the Jean-Jaurès station. The last year studied, 1950, is interesting as it shows how the network evolved and grew after undergoing two World Wars and the German occupation during the second one. Names of stations changed in honor of soldiers or because of the aftereffects of political dissatisfaction. These changes over time can also be seen clearly thanks to the third map, which displays the creation of the stations chronologically. The Metropolitan of Paris is moreover a reflection of the evolution of the city itself, and the rest of the information provided by the maps has to be taken into account as well, hence the possibility of a side-by-side synchronized visualization of the present-day map of Paris and the three old maps.

Creation of interactive maps

The creation of interactive maps is made possible by the structure of the database, which emphasizes storing the year of the map from which each piece of information was retrieved. A total of three maps were generated in Python with the Folium library, which is built on Leaflet, and later edited in JavaScript for HTML integration. Indeed, the final goal was to display the maps on an HTML page, and JavaScript code had to be generated in order to display the interactive information in a web browser.

As explained above, the first Leaflet map makes it possible to overlay the three networks retrieved from each historical document on an OpenStreetMap basemap. The user can thus select a specific network from 1908, 1915 or 1950, and also superimpose them in order to compare the growing networks and determine which stations closed or whether a path changed. In order to differentiate them, each network has a specific color: green for the network of 1908, blue for 1915 and red for 1950. These colors were chosen arbitrarily and simply help to separate the three sets of data visually. To display the information, each station is represented as a marker on the map, with a popup window that becomes visible when the user clicks on it. The popup window displays the name of the station, the name of the line it is attached to, its official opening date (which does not correspond to the map date), the historical information about the station if it exists, and the corresponding RATP identification number.
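
A minimal sketch of such an overlay with Folium is given below. It is not the project's code. It assumes that entries are dictionaries carrying the database fields, with coordinates stored as [longitude, latitude], and it uses circle markers so that each network can take its color. The function names, layer names and map center are assumptions.

```python
import html

# Colors of the three networks, as described in the text.
NETWORK_COLORS = {1908: "green", 1915: "blue", 1950: "red"}


def popup_html(entry):
    """Build the popup text of a station, skipping fields that are empty."""
    rows = [
        ("Station", entry.get("stop_name")),
        ("Line", entry.get("stop_line")),
        ("Opened", entry.get("open_date")),
        ("History", entry.get("stop_info")),
        ("RATP id", entry.get("ratp_id")),
    ]
    return "<br>".join(
        f"<b>{label}:</b> {html.escape(str(value))}"
        for label, value in rows
        if value not in (None, "")
    )


def build_overlay_map(entries):
    """Return a folium.Map with one toggleable layer per historical network."""
    import folium  # third-party library, imported here so popup_html works without it

    overlay = folium.Map(location=[48.8566, 2.3522], zoom_start=12, tiles="OpenStreetMap")
    for year, color in NETWORK_COLORS.items():
        layer = folium.FeatureGroup(name=f"Network {year}")
        for entry in entries:
            if entry["start_map"] <= year <= entry["end_map"]:
                lon, lat = entry["coordinates"]
                folium.CircleMarker(
                    location=[lat, lon],  # Leaflet expects latitude first
                    radius=4,
                    color=color,
                    popup=folium.Popup(popup_html(entry), max_width=300),
                ).add_to(layer)
        layer.add_to(overlay)
    folium.LayerControl().add_to(overlay)  # checkboxes to show or hide each network
    return overlay
```

Calling build_overlay_map(entries).save("networks.html") would write a standalone HTML page, which can then be edited or embedded as described above. The file name is only an example.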

The second map builds on the first one. It is split vertically in two, with the overlay of networks on the left side and, on the right side, the possibility of displaying one of the three historical maps. Both windows work as one in terms of interaction: zooming on one of the maps affects both, as does panning within a map. With this map, the user can thus explore an old map and compare, in real time, the data it contains with the contemporary map of the left window and the information retrieved from all the other maps. For the maps chosen, this feature is particularly interesting, as the old maps contain a city plan of Paris. In addition to seeing the evolution of the Paris Metropolitan, the user can also explore the changes that occurred in the buildings of the city.

The third and last map was created with a plugin called TimestampedGeoJson, which makes it possible to reveal the stations in chronological order. The map comes with a slider at the bottom left of the window, which displays the date of the information currently shown. A play button launches the progression through time, and the user can move the slider bar to the desired date. As the plugin did not yet work very well, the database had to be sorted chronologically, and only the names of the stations are displayed in the popup window. The date presented at the bottom left of the window only includes the year and the month, as the display of the day was not working. It is important to acknowledge that the dates used here are not dates retrieved from the maps, but information that was added a posteriori to the database, based on Wikipedia sources. The accuracy of this information is not certified, but the overall display helps to better understand the major periods of construction, such as the years just before the First World War and different intervals between the two World Wars. It also helps to visualize the way the network grew and which parts of the city were prioritized.
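
The preparation of the data for this map can be sketched as follows. The function below is an assumption about how the chronologically sorted input could be built from the database entries. It produces a GeoJSON feature collection in which each station carries its opening date in a times property and its name in a popup property, which is the kind of input the TimestampedGeoJson plugin of Folium works with.

```python
def timestamped_features(entries):
    """Build a chronologically sorted FeatureCollection from dated station entries."""
    dated = [e for e in entries if e.get("open_date")]
    # ISO dates (YYYY-MM-DD) sort chronologically when compared as strings.
    dated.sort(key=lambda e: e["open_date"])
    return {
        "type": "FeatureCollection",
        "features": [
            {
                "type": "Feature",
                "geometry": {"type": "Point", "coordinates": list(e["coordinates"])},
                "properties": {
                    "times": [e["open_date"]],  # one timestamp per point
                    "popup": e["stop_name"],    # only the name is shown
                },
            }
            for e in dated
        ],
    }
```

The resulting dictionary could then be passed to folium.plugins.TimestampedGeoJson, for example with a monthly period such as "P1M", which would match the year and month granularity of the slider described above. Entries without an opening date are left out, since they cannot be placed on the timeline.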

Creation of a website

Thanks to the website, anyone can access the extracted data and, as a possible further development of the project, immerse themselves in the history of the Paris Metropolitan. The website is also a good platform for exchange and for raising public interest. A future goal would be to find new collaborators and to extend the database with other, more recent maps and with further information on the existing stations. The project is as such not finished, and what we propose here is a foundation that is waiting for more contributions.

The website is organized around four pages. The index page has a direct link to this wiki page, which describes the project in depth. A second page, called An Evolution, displays the first map, which shows the three networks as overlays or separately. The third page, Navigation Through Time, presents the second map on a full page, divided into two synchronized windows, with the possibility of seeing one of the old maps on the right. Thanks to the comparison with the contemporary map of Paris displayed in the left window, the visitor can easily identify the changes that occurred over time, from the perspective of the Metropolitan but also in terms of other urban changes, such as buildings, as the old maps also contain this information. The last page, called A Growing Network, shows the last map, with the appearance of the stations through time. This last page can be considered a conclusion to the overall project and to the observations the visitor made while navigating through the previous maps. It offers the possibility of stepping back after having navigated through old and recent maps, and of simply contemplating the evolution in a passive way. Certain stations are displayed more than once, to reflect the multiple lines passing through them.

The website is now online.

Selected maps

The different maps selected for the project are the following:

References

  1. Babelio, Taride Babelio, last accessed on 2018-11-13
  2. Corpus Cartographique Etampois, Alphonse Taride, Carte routière de l'Etampois en 1914, last accessed on 2018-11-13
  3. Corpus Cartographique Etampois, Alphonse Taride, Carte routière de l'Etampois en 1914, last accessed on 2018-11-13
  4. Babelio, Taride Babelio, last accessed on 2018-11-13
  5. Peter Hall, Underground as City Maker: London Versus Paris, 1863–2013, 2013, pp. 177-183, last accessed on 2018-11-13
  6. Michel Dansel, Paris-Metro, Editions du Dauphin, 1975, p. 23
  7. Michel Dansel, Paris-Metro, Editions du Dauphin, 1975, pp. 23-24
  8. Michel Dansel, Paris-Metro, Editions du Dauphin, 1975, p. 23
  9. Michel Dansel, Paris-Metro, Editions du Dauphin, 1975, p. 28
  10. Michel Dansel, Paris-Metro, Editions du Dauphin, 1975, p. 54
  11. Michel Dansel, Paris-Metro, Editions du Dauphin, 1975, pp. 57-58
  12. GIS Resources, What is Georeferencing?, last accessed on 2018-11-05
  13. Open Data "Open Data | Volumes bâtis - Données géographiques", last accessed on 2018-11-05
  14. QGIS 2.14 documentation "Extension de géoréférencement", last accessed on 2018-11-05
  15. EMSE "Glossaire des SIG - Raster (Format)", last accessed on 2018-11-05
  16. QGIS 2.18 Documentation "Editing", last accessed on 2018-11-05
  17. Institut National de l'Information Géographique et forestière (IGN), Le RGF93 | Géodésie, last accessed on 2018-12-14