Switzerland and the Transatlantic Slavery: Difference between revisions
Yichen.wang (talk | contribs) |
Amina.matt (talk | contribs) |
||
(94 intermediate revisions by 2 users not shown) | |||
Line 2: | Line 2: | ||
In the last decade, the narrative that Switzerland has nothing to do with slave trade, slavery and colonialism has been severely challenged.<ref> David, Thomas, Bouda Etemad, and Janick Marina Schaufelbuehl. 2005. La Suisse et l'esclavage des Noirs. Lausanne (Suisse): Société d'histoire de la Suisse romande. </ref> <ref>Fässler, Hans, and Hans Fässler. 2007. Une Suisse esclavagiste: voyage dans un pays au-dessus de tout soupçon. Paris: Duboiris.. </ref> | In the last decade, the narrative that Switzerland has nothing to do with slave trade, slavery and colonialism has been severely challenged.<ref> David, Thomas, Bouda Etemad, and Janick Marina Schaufelbuehl. 2005. La Suisse et l'esclavage des Noirs. Lausanne (Suisse): Société d'histoire de la Suisse romande. </ref> <ref>Fässler, Hans, and Hans Fässler. 2007. Une Suisse esclavagiste: voyage dans un pays au-dessus de tout soupçon. Paris: Duboiris.. </ref> | ||
Between the 16th and the 19th centuries, there were a number of Swiss involved in slavery, the slave trade, and colonialism activities. Swiss trading companies, banks, city-states, family enterprises, mercenary contractors, soldiers, and private individuals participated in and profited from the commercial, military, administrative, financial, scientific, ideological, and publishing activities necessary for the creation and the maintenance of the Transatlantic slavery economy. In this project, focusing on the [https://en.wikipedia.org/wiki/Caribbean_Community Caribbean Community (CARICOM)] member states, we are interested in discovering the details of the colonial past of Switzerland. | |||
Our primary source is the '''[https://louverture.ch/cca/ CARICOM Compilation Archive]''' written by Hans Fässler, MA Zurich University, a historian from St.Gallen (Switzerland). | Our primary source is the '''[https://louverture.ch/cca/ CARICOM Compilation Archive]''' written by Hans Fässler, MA Zurich University, a historian from St.Gallen (Switzerland). | ||
==Motivation== | ==Motivation== | ||
The CCA(CARICOM Compilation Archive) archive is a single-page website with contents categorized by | The CCA (CARICOM Compilation Archive) archive is a single-page website with contents categorized by colonial location. In the body of the text, each entry concerns a different actor and starts with an arrow. The author Hans Fassler started the compilation about all the Swiss involvements to convince the [https://caricomreparations.org CARICOM Reparations Commission (CRC)] with arguments and material. In June 2019, the CRC was convinced to recommend the heads of the Caribbean Community to add Switzerland to the list for reparation of the colonial activities. | ||
Hans continues updating information about CARICOM and expanding his research to North America and East India and other places. He discussed with us the issue that the website provider is warning about the growing content of CCA. Although the archive is a very informative source about the colonial past of Switzerland, it certainly creates an obstacle for potentially interested readers to learn from it in depth. | |||
The motivation of this project is to discover the previously less known history of Switzerland and provide a framework to visualize the content of the archive in a more accessible and more interactive way. The creation of a structured dataset path the way to quantitative analysis of the data provided by the archive. | |||
In our project, we will extract the following information about each entry in the archive: | |||
* Person's name | * Person's name | ||
* City of origin in Switzerland | * City of origin in Switzerland | ||
Line 16: | Line 18: | ||
* Date of birth and death of the person or the active date in the location | * Date of birth and death of the person or the active date in the location | ||
* Colonial activities that this person was involved | * Colonial activities that this person was involved | ||
The above set of properties has been validated as relevant and valuable information by Hans Fassler. | |||
As we discussed with Hans, he keeps the full content of each entry because it contains more detailed information. We would like to build the map visualization based on the information we extract. This would allow the entries to be easily understandable and interpretable since the map provides geographic information to help readers identify the places. The reader can also have the visual connection between the origin in Switzerland and the colonial locations. Also, based on the information we would like to extract, we can analyze the involvement of the Swiss in the colonial era. | As we discussed with Hans, he keeps the full content of each entry because it contains more detailed information. We would like to build the map visualization based on the information we extract. This would allow the entries to be easily understandable and interpretable since the map provides geographic information to help readers identify the places. The reader can also have the visual connection between the origin in Switzerland and the colonial locations. Also, based on the information we would like to extract, we can analyze the involvement of the Swiss in the colonial era. | ||
Line 26: | Line 29: | ||
Step II : Visualize the connection between Switzerland and Caribbean colonies | Step II : Visualize the connection between Switzerland and Caribbean colonies | ||
Step III : Highlight the material traces | <s>Step III : Highlight the material traces </s> (not enough time to work on) | ||
{|class="wikitable" | {|class="wikitable" | ||
Line 67: | Line 70: | ||
* Set up a website for visualization, mapping information based on geographic location. | * Set up a website for visualization, mapping information based on geographic location. | ||
* Look at pattern in relationship btw colonial and Swiss location over time | * Look at pattern in relationship btw colonial and Swiss location over time | ||
| align="center" |✓ | |||
| align="center" | | |||
|- | |- | ||
|By Week 12 | |By Week 12 | ||
(09.12) | (09.12) | ||
| | | | ||
* Discuss the project with the author of this archive, historian Hans Fässler, get feedback from him. | |||
Step II | Step II | ||
* Work on visualization, link individual/companies colonial location with origin | * Work on visualization, link individual/companies colonial location with origin | ||
* Add a feature to see only the items with material traces | * <s>Add a feature to see only the items with material traces</s> | ||
| align="center" |✓ | |||
| align="center" | | |||
|- | |- | ||
|By Week 13 | |By Week 13 | ||
Line 91: | Line 91: | ||
* Assess the approach of the project. | * Assess the approach of the project. | ||
* Write the report. | * Write the report. | ||
| align="center" | | | align="center" |✓ | ||
|- | |- | ||
|By Week 14 | |By Week 14 | ||
Line 98: | Line 98: | ||
Overall | Overall | ||
* Finish project and website, final presentation | * Finish project and website, final presentation | ||
| align="center" | | | align="center" |✓ | ||
|- | |- | ||
|} | |} | ||
==Methodology== | ==Methodology== | ||
The methodology of our project is divided into three steps: text processing, data enrichment with geographical | The methodology of our project is divided into three steps: text processing, data enrichment with geographical databases and data visualization and analysis. | ||
===Text processing=== | ===Text processing=== | ||
The [https://louverture.ch/cca/ text source] is organized into sections, and within each section the reader can find a list of entries. Most items are separated by a return and an arrow as starting string (=>). Each item references a different actor of colonial entreprise. The first step is to retrieve each item separately and appends its section index. This index is used for colonial location retrieval. Indeed the table of contents is mainly organized by colonial location (some sections don't refer explicitly to geographical location, and are treated separately). | |||
[[File:caricom_sample.png|700px|center|thumb|Extract of the CCA archive webpage. A section title and several items are shown. ]] | |||
The | The processing of the text item itself is done with Natural Language tools as '''NLTK''' for tokenization and '''Stanford NER''' for Named Entities recognition and BIO taggings. | ||
NER-tagging | '''Named Entities recognition (NER)''' is a text processing method that recognizes and tags words referring to named entities. In our case we are interested in using the 'PERSON' tag for (person name or person last name), as well as location (city, region, country) for place of origin and date. In addition, we run [https://en.wikipedia.org/wiki/Inside–outside–beginning_(tagging) '''BIO tagging'''] where the NE are labeled based on their position ('BEGING-INSIDE-OUTSIDE') with respect to others NE. This allows grouping of successive similar tags into a single string. An example for both steps is given below. | ||
'''NER-tagging''' | |||
('=', 'O'), | ('=', 'O'), | ||
('>', 'O'), | ('>', 'O'), | ||
Line 130: | Line 132: | ||
BIO-tagging | '''After BIO-tagging''' | ||
('=', 'O'), | ('=', 'O'), | ||
('>', 'O'), | ('>', 'O'), | ||
Line 143: | Line 145: | ||
The NER isn't completely reliable, and we can already notice some mislabeling, the limitations of NER are discussed in the limitations below. | The NER isn't completely reliable, and we can already notice some mislabeling, the limitations of NER are discussed in the limitations below. | ||
Retrieving relevant informations requires to define which of the persons, locations and date tags are related to the main | Retrieving relevant informations requires to define which of the persons, locations and date tags are related to the main protagonist. Indeed, in the description multiple persons, as relatives or bosses are mentioned, and multiple locations as the location of origin but also brother's baptized location or other less relevant places can be found. In order to sort amongst the possibilities we use pattern matching to match the structure of the different words and tags to a syntax pattern (note that we don't rely only on the tags for pattern matching as their accuracy is low). Our model contains the two schemas described below. With these two schemas we can recognized around 75% of the item retrieved. | ||
[[File:Schemas_pic.png.001.png]] | [[File:Schemas_pic.png.001.png|800px|center|thumb|Syntax structure for our two schemas. The location and date indices are retrieved with pattern matching, the person index is found using NER tag. ]] | ||
The pattern matching is so efficient that it is used solely to retrieve the origin location information. We find the first occurence of the word 'from' and retrieve the next strings as location. Our model accounts for several variation (e.g. | The pattern matching is so efficient that it is used solely to retrieve the origin location information. We find the first occurence of the word '''from''' and retrieve the next strings as location. Our model accounts for several variation (e.g. Lausanne, the City of Geneva, Le Locle). In a similar manner the extraction of the date is also easier with pattern matching. Either we have schema II and the date is the second string in the text, or we have schema I and the date is in between parenthesis between person and location. For both schemas, we retrieve date indicated as range, starting date or ending date. Our model works for a large range of date formatting, 1850-1855, born 1790, after 1878, b. 1989 etc.. | ||
The other information relevant to our dataset is the activities | The other information relevant to our dataset is the activities of the main protagonist. The categorization is difficult as many characters are involved in multiple activities and often their relatives' activities are also related in the description. Family indicators are used to skip irrelevant activities information. The following categories are pertinent, based on our discussion with Hans Fassler and the study of our primary sources : | ||
trading = ['company', 'companies', 'merchants', 'merchant'] | trading = ['company', 'companies', 'merchants', 'merchant'] | ||
Line 157: | Line 159: | ||
slave_owner = ['slaves', 'slave', 'slave-owner'] | slave_owner = ['slaves', 'slave', 'slave-owner'] | ||
racist = ['racism', 'racist', 'races'] | racist = ['racism', 'racist', 'races'] | ||
''Activities categories and their related words'' | |||
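The keyword lookup described above can be sketched as follows. This is a minimal illustration, not the project's actual code: only the three keyword lists visible above are reproduced, the `FAMILY_INDICATORS` set and the rule "skip any sentence that mentions a relative" are assumptions made for the example, and the sample sentences are invented.

```python
import re

# Keyword lists as given above; further categories exist but are not shown here.
ACTIVITY_KEYWORDS = {
    "trading": ["company", "companies", "merchants", "merchant"],
    "slave_owner": ["slaves", "slave", "slave-owner"],
    "racist": ["racism", "racist", "races"],
}

# Assumption: a small hypothetical set of family indicators.
FAMILY_INDICATORS = {"uncle", "brother", "father", "son", "cousin"}


def categorize(text):
    """Return the sorted activity categories whose keywords occur in the text.

    Sentences that mention a relative are skipped (assumed rule), so that the
    activities of family members are not attributed to the main protagonist.
    """
    found = set()
    for sentence in re.split(r"(?<=[.;])\s+", text):
        words = set(re.findall(r"[a-z]+(?:-[a-z]+)*", sentence.lower()))
        if words & FAMILY_INDICATORS:
            continue
        for category, keywords in ACTIVITY_KEYWORDS.items():
            if words & set(keywords):
                found.add(category)
    return sorted(found)


# Invented sample description, for illustration only.
sample = "He was a merchant in Basel. His brother owned slaves abroad."
print(categorize(sample))
```

Skipping whole sentences is deliberately coarse: it avoids attributing a relative's activity to the protagonist, at the cost of missing activities that are mentioned in the same sentence as a relative.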
The last category is related to the structural contributions | The last category is related to the structural contributions, it includes participation in ''Anti-Black Racism and Ideologies Relevant to Caribbean Economic Space | ||
'', | '', ''Marine Navigation'' and ''African and European Logistics''. The Marine Navigation section concerns primarly the development of navigation tools for colonial powers and the logistics contributions are related to banking or insurance companies. | ||
Finally, the description contains many detailed that are worth keeping | Finally, as the full description contains many detailed that are worth keeping, we attach it to the dataset once the relevant information for data analysis and visualization are extracted. An example is given below. | ||
[[File:dataset_example_1.png|1300px|center|thumb| Sample of the structured dataset extracted from the CCA.]] | |||
====Levels of confidence==== | |||
For origin, | For the origin location, the date and the person name we calculate an accuracy value that indicates what is the level of confidence we have in the retrieved attribute. Note that there isn't any confidence level for the colonial location property as it comes directly from the table of contents and is unambiguous. | ||
'''Origin accuracy''' The origin location is found according to the schemes presented above. However, multiple locations exist in the same portion of text thus the actual location that we are looking for might be further away in the text. By counting the total of Swiss cities present in the text we can compute a level of confidence inversely proportional to it. | '''Origin accuracy''' The origin location is found according to the schemes presented above. However, multiple locations exist in the same portion of text thus the actual location that we are looking for might be further away in the text. By counting the total of Swiss cities present in the text we can compute a level of confidence inversely proportional to it. | ||
'''Date accuracy''' The date accuracy is calculated by counting how many date instances (i.e. four-digit strings) occur in total in the text. | |||
'''Person accuracy''' For both the date and the person (the latter retrieved based on the NER tags), the accuracy levels are calculated from the number of occurrences. Following the argument presented above, accuracy is calculated as the inverse of the number of occurrences. | |||
Mostly, the accuracy levels give an indication of how intricate the text is with respect to people, dates and locations. In this sense, they tell us how complex the description is and how many intricacies exist that might confuse our interpretation. | |||
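As an illustration of the three confidence levels, the sketch below computes each one as the inverse of an occurrence count. The exact formula `1 / count` is an assumption (the text only states that confidence is inversely proportional to the count), and the function names and the short city list are hypothetical.

```python
import re


def inverse_count_confidence(count):
    """Assumed formula: 1/count, and 0.0 when nothing was found."""
    return 1.0 / count if count > 0 else 0.0


def date_confidence(text):
    """Count four-digit strings as date candidates."""
    return inverse_count_confidence(len(re.findall(r"\b\d{4}\b", text)))


def origin_confidence(text, swiss_cities):
    """Count every mention of a known Swiss city in the text."""
    count = sum(
        len(re.findall(rf"\b{re.escape(city)}\b", text)) for city in swiss_cities
    )
    return inverse_count_confidence(count)


def person_confidence(grouped_tags):
    """Count grouped (string, tag) pairs labelled PERSON."""
    count = sum(1 for _, tag in grouped_tags if tag == "PERSON")
    return inverse_count_confidence(count)


text = "Jean Huguenin (1685–1740) from Le Locle"
print(date_confidence(text))
print(origin_confidence(text, ["Le Locle", "Geneva", "Basel"]))
print(person_confidence([("Jean Huguenin", "PERSON"), ("from", "O")]))
```

A value of 1.0 means the attribute had a single candidate in the text; lower values indicate that several candidates competed for the same attribute.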
===Dataset enrichment with geographical databases=== | ===Dataset enrichment with geographical databases=== | ||
We add geographical information for both colonial and | One of the goal of this work is to visualize the archive content on a geographical map. We add geographical information for both colonial and origin location (Switzerland) using the following methods. | ||
'''For colonial location''' Colonial locations are retrieved from the table of contents, which organises the corpus mainly by '''countries'''. A few exceptions are '''regions''' from the Caribbean economic space, '''states''' for North America and other indications for '''structural contributions'''. Our model geolocalizes countries and US states based on their capitals' geographical coordinates. Two different datasets are used, respectively for [https://github.com/yaph/geonamescache countries] and [https://www.britannica.com/topic/list-of-state-capitals-in-the-United-States-2119210 US states]. For the regions of the Caribbean economic space, the French West Indies are mapped to Guadeloupe and the Danish West Indies to the U.S. Virgin Islands. Based on the content of the Southern Africa section, we used South Africa as the reference region, and finally the East Indies are mapped to Indonesia. The structural contributions are more difficult to map; indeed, as mentioned by the archive author, ''they cannot be assigned to one single Caribbean country''. We decided to map them to Switzerland, in order to highlight that some contributions didn't take place abroad but were still part of the European colonial project. A finer-grained retrieval would allow extraction of more specific locations for descriptions in sections concerning several locations. | |||
'''For origin locations (Switzerland)''' | |||
The geolocalization of origin locations is made at the level of cities. We use an [https://simplemaps.com/data/world-cities additional dataset] to map each origin location to a Swiss city with its geographical coordinates. | |||
===Data Visualization=== | |||
[[File:map01.png| | We used '''Javascript''', '''HTML''', and '''CSS''' to implement the visualisation. In order to display the map and draw the connections between places, Javascript library '''Leaflet.js''' is used. We store the extracted information in '''GeoJSON''' format for the map implementation because is a simple open standard format to store both geographical and non-spatial features ([https://ych-wang.github.io/Colonial-heritage-in-Switzerland/ see website]). | ||
[[File:map01.png|700px|center|thumb|This visualization screenshot shows all the connections between Switzerland and colonial locations in our dataset .]] | |||
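To illustrate how one connection could be stored in GeoJSON, here is a minimal sketch. The field names of `entry` and the property names are hypothetical, the person is a placeholder, and the coordinates are approximate values chosen for the example. GeoJSON orders coordinates as longitude first, then latitude.

```python
import json


def connection_feature(entry):
    """Build one GeoJSON Feature: a line from the Swiss origin to the colonial location."""
    return {
        "type": "Feature",
        "geometry": {
            "type": "LineString",
            # GeoJSON positions are [longitude, latitude].
            "coordinates": [
                [entry["origin_lon"], entry["origin_lat"]],
                [entry["colonial_lon"], entry["colonial_lat"]],
            ],
        },
        # Non-spatial attributes travel alongside the geometry.
        "properties": {
            "person": entry["person"],
            "origin": entry["origin"],
            "colonial_location": entry["colonial_location"],
        },
    }


def to_feature_collection(entries):
    return {
        "type": "FeatureCollection",
        "features": [connection_feature(e) for e in entries],
    }


# Placeholder entry with approximate coordinates, for illustration only.
example = {
    "person": "Example Person",
    "origin": "Le Locle", "origin_lat": 47.06, "origin_lon": 6.75,
    "colonial_location": "Haiti", "colonial_lat": 18.54, "colonial_lon": -72.34,
}
print(json.dumps(to_feature_collection([example]), indent=2))
```

A collection of this shape can be passed to Leaflet's `L.geoJSON` layer, which draws each `LineString` as a line on the map.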
When the "Show All" button is clicked, the map displays all the connections, then click on each line, key information about the entry will popup. | When the "Show All" button is clicked, the map displays all the connections, then click on each line, key information about the entry will popup. | ||
With the dropdown on the text panel on the left, user can filter the list based on the origin city in Switzerland. | With the dropdown on the text panel on the left, the user can filter the list based on the origin city in Switzerland. Click on the name of the two places will zoom the corresponding location on the map. Click on the arrow will draw the line between the two places. | ||
[[File:Mao03.png| | [[File:Mao03.png|700px|center|thumb|This visualisation screenshot shows one connection of a given entry.]] | ||
==Results== | ==Results== | ||
Overall, we extract '''464 text items''' from the division of the initial page. | Overall, we extract '''464 text items''' from the division of the initial page. | ||
With the combination of NER and BIO tagging with syntax structure pattern matching, we can retrieve '''75% of the entries'''. | |||
More precisely, in this set, 117 items have no usable person name or location, which makes them irrelevant (49 entries have no person defined, in 16 entries neither the person nor the location could be defined, and in 52 cases the person and location are in the wrong order with respect to our schemas). We are left with '''327 entries.''' | |||
The average confidence levels are respectively 52%, 52%, and 38% for person, origin, and date. The date average is low, but this means that many dates occur in the text. Indeed, as we used syntax matching, we are fairly confident that this number indicates a high number of date occurrences and the complexity of the text rather than bad text processing. This argument holds for the other indicators too. | |||
With this data, we can highlight some features of Swiss involvement in transatlantic slavery. Important cities such as '''Zurich''' and '''Bâle (Basel)''' show many involvements in activities participating in transatlantic slavery. Surprisingly, smaller localities such as Neuchâtel and Le Locle (which is also in the Canton of Neuchâtel) have major contributions too. Interestingly enough, on the colonial side, the third country with the most contributions is Switzerland, which, as we defined earlier, stands for contributions that cannot be localized in a single country: '''structural contributions'''. | |||
[[File:origins_dist_barh.png| 800px|center|thumb|Origin locations (Switzerland) distribution for all structured data.]] | |||
[[File:colonial_locs_dist_barh.png| 800px|center|thumb|Colonial locations distribution for all structured data.]] | |||
The extracted information also allows further investigation of prominent Swiss figures of the transatlantic business. An analysis of the most frequent last names reveals that the '''Flournoy''', '''Zinzendorf''' and '''Zollicoffer''' '''families''' are the most cited in the archive. The archive author insists on ''how interconnected the slavery-economies of North America, the Caribbean and Brazil (and beyond) were, is also demonstrated by the fact that several Swiss families globalized into more than one space''; the Flournoy family is cited in this passage. | |||
[[File: dataset_flournoy.png|1300px| center|thumb| Information extracted concerning Flournoy family.]] | |||
Finally, another important piece of data extracted from the archive is the activities in which people were involved. The distribution describes well the context of plantations in the Americas and how Swiss persons were usually owners of such enterprises and thus owners of slaves. | |||
[[File: activities_dist_barh.png|800px| center|thumb| Activities distribution for all structured data.]] | |||
In the end, 106 entries are visualized on the website; the significant drop is due to the lack of geographical coordinates for some Swiss locations. Adding these coordinates would be the first step needed to significantly increase the data available for visualisation. | |||
However, the structured data definitely shows the numerous connections that Switzerland has across the Atlantic in the context of transatlantic slavery, and it makes analysis and further study of the data possible. | |||
==Limitations== | |||
The limitations presentation follows the methodology steps. | |||
===Text processing=== | |||
The complete archive has 464 items, i.e. entries about different actors. However, retrieving information such as the name and origin of the actor, as well as his activities and the location of the activities, is difficult. The texts can be quite complex and intricate, "''as were the implications of Switzerland in Black Slavery''"<ref>Hans Fässler</ref>. The following entry is an example: | |||
*''David Louis Agassiz (1737–1807), uncle of the racist and glaciologist Louis Agassiz (1807–1873), was a financier who left Switzerland for France in 1747 with his friend Jacques Necker in order to work in the Parisian branch of the Thellusson et Vernet bank (investments in colonial companies, links with the slave trade). Until 1770, David Louis Agassiz cooperated with Pourtalès of Neuchâtel via the company «Joseph Lieutaud et Louis Agassiz». Necker was to become Louis XVI’s Minister of Finance, whereas David Louis Agassiz left for Britain where he acquired a considerable fortune and anglicised his name to Arthur David Lewis Agassiz. He was naturalised by a private Act of Parliament in 1766. Agassiz dealt in cotton, silk, sugar, cocoa, coffee, tobacco, and cochineal and had business relations with France, Spain, Portugal, Italy, Germany, Belgium, Denmark, the Netherlands, Sweden, Switzerland, Russia, North and South America and the East and West Indies. In 1776, Francis Anthony Rougemont (1713–1788) from a Neuchâtel family joined the partnership under the name of «Agassiz, Rougemont et Cie.», a company which had close ties with «MM Pourtalès et Cie.» from Neuchâtel (ownership of plantations on Grenada, indiennes industry, banking). Arthur David Lewis Agassiz’s son Arthur Agassiz (1771–1866), cousin of the racist Louis Agassiz, took over the family business, and later formed a company «Agassiz, Son & Company». In 1823, Arthur Agassiz was working in Port-au-Prince (Haiti) with «Jean Robert Bernard et Cie.».'' | |||
'''Limitation of NER versus pattern recognition.''' The results of NER processing are not reliable for all tags. For person names, Stanford NER performance is reliable and visual inspection shows good results. However, Stanford NER misses a lot of locations: most of them are either not recognized or miscategorized as organizations. Similarly, the dates aren't well recognized. These limitations made it worthwhile to use pattern matching and develop our own model. This is required to match the author's style, but it makes our model sensitive to changes in the author's writing conventions. | |||
===Dataset enrichment with geographical databases=== | |||
The colonial locations don't always have the same level of definition: even when using only the TOC, some are regions, others countries or states. Our method introduces some artifacts linked to the model's decisions: for example, the East Indies definitely cover more than just Indonesia. To overcome these limitations, we would need to retrieve the colonial location with another method. We suggest that a list of countries and cities could be searched for in each text, with a default value assumed based on the TOC. | |||
For origin locations, a lot of values have no geographical coordinates because the localities are too small (Saint-Aubin, Bournens, Bourmens); this could be fixed by using an additional dataset. | |||
===Data analysis and visualisation=== | |||
It is worth noting that this data comes from observational studies. Therefore, we have no control over how the database was constituted, and it cannot be taken as representative of all transatlantic slavery implications. However, it definitely shows the numerous connections Switzerland has across the Atlantic. | |||
==Links== | ==Links== | ||
Line 238: | Line 251: | ||
Github repository: '''[https://github.com/ych-wang/Colonial-heritage-in-Switzerland Colonial-heritage-in-Switzerland]''' | Github repository: '''[https://github.com/ych-wang/Colonial-heritage-in-Switzerland Colonial-heritage-in-Switzerland]''' | ||
Primary source: '''[https://louverture.ch/cca/ | Primary source: '''[https://louverture.ch/cca/ Caricom Compilation Archive]''' | ||
Website : [https://ych-wang.github.io/Colonial-heritage-in-Switzerland/ Switzerland and the Transatlantic Slavery Website ] | |||
Secondary sources: | Secondary sources: geonamescaches, uscapitals. | ||
==References== | ==References== | ||
<references /> | <references /> |
Latest revision as of 15:32, 22 December 2021
Introduction
In the last decade, the narrative that Switzerland has nothing to do with slave trade, slavery and colonialism has been severely challenged.[1] [2]
Between the 16th and the 19th centuries, a number of Swiss actors were involved in slavery, the slave trade, and colonial activities. Swiss trading companies, banks, city-states, family enterprises, mercenary contractors, soldiers, and private individuals participated in and profited from the commercial, military, administrative, financial, scientific, ideological, and publishing activities necessary for the creation and the maintenance of the Transatlantic slavery economy. In this project, focusing on the Caribbean Community (CARICOM) member states, we set out to uncover the details of Switzerland's colonial past.
Our primary source is the CARICOM Compilation Archive written by Hans Fässler, MA Zurich University, a historian from St. Gallen (Switzerland).
Motivation
The CARICOM Compilation Archive (CCA) is a single-page website whose contents are categorized by colonial location. In the body of the text, each entry concerns a different actor and starts with an arrow. The author, Hans Fässler, started compiling all Swiss involvements in order to provide the CARICOM Reparations Commission (CRC) with arguments and material. In June 2019, the CRC was convinced to recommend that the heads of the Caribbean Community add Switzerland to the list for reparations for colonial activities.
Hans continues to update the information about CARICOM and is expanding his research to North America, East India and other places. He told us that the website provider is warning about the growing content of the CCA. Although the archive is a very informative source about the colonial past of Switzerland, its single-page format and growing length certainly create an obstacle for potentially interested readers who want to learn from it in depth. The motivation of this project is to uncover a previously less-known part of Swiss history and to provide a framework to visualize the content of the archive in a more accessible and more interactive way. The creation of a structured dataset paves the way for quantitative analysis of the data provided by the archive.
In our project, we will extract the following information about each entry in the archive:
- Person's name
- City of origin in Switzerland
- Colonial location
- Date of birth and death of the person or the active date in the location
- Colonial activities in which this person was involved
The above set of properties has been validated as relevant and valuable information by Hans Fässler.
As we discussed with Hans, he keeps the full content of each entry because it contains more detailed information. We would like to build the map visualization based on the information we extract. This makes the entries easier to understand and interpret, since the map provides geographic information that helps readers identify the places. The reader can also see the connection between the origin in Switzerland and the colonial locations. In addition, the extracted information lets us analyze the involvement of the Swiss in the colonial era.
Project Plan and Milestones
Based on the feedback from the midterm presentation, the objectives have been revised. The material traces have been left for further work, and some data analysis on the existing dataset has been suggested instead.
Step I : Information extraction with NLP tools (Stanford NER, NLTK)
Step II : Visualize the connection between Switzerland and Caribbean colonies
Step III : Highlight the material traces (not enough time to work on)
Date | Task | Completion |
---|---|---|
By Week 4 (07.10) | | ✓ |
By Week 6 (21.10) | | ✓ |
By Week 10 (25.11) | Step I; Step II | ✓ |
By Week 11 (02.12) | Step I; Step II | ✓ |
By Week 12 (09.12) | Step II | ✓ |
By Week 13 (16.12) | Step II; Overall | ✓ |
By Week 14 (22.12) | Overall | ✓ |
Methodology
The methodology of our project is divided into three steps: text processing, data enrichment with geographical databases and data visualization and analysis.
Text processing
The text source is organized into sections, and within each section the reader can find a list of entries. Most items are separated by a line break and start with an arrow string (=>). Each item references a different actor of the colonial enterprise. The first step is to retrieve each item separately and append its section index. This index is used for colonial location retrieval. Indeed, the table of contents is mainly organized by colonial location (some sections don't refer explicitly to a geographical location and are treated separately).
The processing of the text item itself is done with natural language tools such as NLTK for tokenization and Stanford NER for named entity recognition, followed by BIO tagging.
Named entity recognition (NER) is a text processing method that recognizes and tags words referring to named entities. In our case we are interested in the 'PERSON' tag (person name or person last name), as well as in location tags (city, region, country) for the place of origin and in date tags. In addition, we run BIO tagging, where the named entities (NE) are labeled based on their position ('BEGIN-INSIDE-OUTSIDE') with respect to other NE. This allows grouping of successive similar tags into a single string. An example for both steps is given below.
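The splitting step can be sketched as follows. This is a minimal illustration under stated assumptions: the page is assumed to be already cut into `(section_title, section_text)` pairs in table-of-contents order, the function name `split_items` is hypothetical, and the sample text is invented apart from the Jean Huguenin entry shown later on this page.

```python
import re


def split_items(sections):
    """Split each section into entries that start with an arrow ("=>").

    `sections` is a list of (section_title, section_text) pairs; each entry
    is returned together with the index of the section it came from.
    """
    items = []
    for index, (title, text) in enumerate(sections):
        chunks = re.split(r"(?m)^\s*=>\s*", text)
        # The first chunk is whatever precedes the first arrow (section
        # introduction or empty string), so it is not an entry.
        for chunk in chunks[1:]:
            chunk = " ".join(chunk.split())  # collapse line breaks and spaces
            if chunk:
                items.append(
                    {"section_index": index, "section": title, "text": chunk}
                )
    return items


sections = [
    ("Haiti", "Intro line\n=> Jean Huguenin (1685–1740) from Le Locle\n=> Second entry\ncontinued on a new line"),
    ("Marine Navigation", "=> Third entry"),
]
for item in split_items(sections):
    print(item["section_index"], item["text"])
```

Keeping the section index with every item is what later allows the colonial location to be looked up from the table of contents rather than from the entry text.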
NER-tagging
('=', 'O'), ('>', 'O'), ('Jean', 'PERSON'), ('Huguenin', 'PERSON'), ('(', 'O'), ('1685–1740', 'O'), (')', 'O'), ('from', 'O'), ('Le', 'O'), ('Locle', 'ORGANIZATION'), ('(', 'ORGANIZATION'), ('Canton', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('Neuchâtel', 'ORGANIZATION'), (')', 'O'),
After BIO-tagging
('=', 'O'), ('>', 'O'), ('Jean Huguenin', 'PERSON'), ('(', 'O'), ('1685–1740', 'O'), (')', 'O'), ('from', 'O'), ('Le', 'O'), ('Locle ( Canton of Neuchâtel', 'ORGANIZATION'),
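The NER isn't completely reliable, and some mislabeling is already visible in this example: 'Le' is left untagged and the rest of the place name is tagged as an organization. The limitations of NER are discussed in the Limitations section below.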
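The grouping step illustrated above can be sketched as a single pass that merges runs of consecutive tokens sharing the same non-'O' tag. This is a simplified stand-in (it does not emit explicit B-/I- prefixes), and the function name `merge_entities` is an assumption, not necessarily how the project implements it.

```python
def merge_entities(tagged):
    """Merge runs of consecutive tokens that share the same non-'O' tag."""
    merged = []
    for token, tag in tagged:
        if merged and tag != "O" and merged[-1][1] == tag:
            # Same entity type as the previous token: extend the previous string.
            previous_token, _ = merged[-1]
            merged[-1] = (previous_token + " " + token, tag)
        else:
            merged.append((token, tag))
    return merged

tagged = [
    ("=", "O"), (">", "O"), ("Jean", "PERSON"), ("Huguenin", "PERSON"),
    ("(", "O"), ("1685–1740", "O"), (")", "O"), ("from", "O"), ("Le", "O"),
    ("Locle", "ORGANIZATION"), ("(", "ORGANIZATION"), ("Canton", "ORGANIZATION"),
    ("of", "ORGANIZATION"), ("Neuchâtel", "ORGANIZATION"), (")", "O"),
]
merged = merge_entities(tagged)
```

Applied to the tagged example, this yields the same grouped strings as the "After BIO-tagging" listing, including the mislabeled place name.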
Retrieving relevant information requires deciding which of the person, location and date tags relate to the main protagonist. A description often mentions several persons, such as relatives or employers, and several locations, such as the place of origin but also a brother's place of baptism or other less relevant places. To sort among the possibilities, we use pattern matching to match the structure of the words and tags to a syntax pattern (note that we don't rely only on the tags for pattern matching, as their accuracy is low). Our model contains the two schemas described below. With these two schemas we can recognize around 75% of the items retrieved.
Pattern matching is efficient enough that it alone is used to retrieve the origin location. We find the first occurrence of the word "from" and retrieve the following strings as the location. Our model accounts for several variations (e.g. Lausanne, the City of Geneva, Le Locle). Similarly, dates are easier to extract with pattern matching. Either the item follows schema II and the date is the second string in the text, or it follows schema I and the date is in parentheses between the person and the location. For both schemas, we retrieve dates given as a range, a start date or an end date. Our model handles a wide range of date formats: 1850-1855, born 1790, after 1878, b. 1989, etc.
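The two extraction rules described above can be approximated with regular expressions. The patterns below are assumptions written for this sketch (the project's own patterns are not shown in this page); they cover the variations listed in the text, namely "Lausanne", "the City of Geneva", "Le Locle", date ranges, and the qualifiers "born", "b." and "after".

```python
import re

# Optional qualifier, a four-digit year, and an optional end year.
DATE_RE = re.compile(
    r"(?:\b(born|b\.|after)\s+)?\b(\d{4})\b(?:\s*[–-]\s*(\d{4})\b)?"
)

# After "from": optional "the", optional "City of", then capitalised words.
ORIGIN_RE = re.compile(
    r"\bfrom\s+((?:the\s+)?(?:City\s+of\s+)?"
    r"[A-ZÀ-Ý][\w'’-]*(?:\s+[A-ZÀ-Ý][\w'’-]*)*)"
)

def extract_origin(text):
    """Return the location following the first 'from', or None."""
    match = ORIGIN_RE.search(text)
    return match.group(1) if match else None

def extract_date(text):
    """Return the first date expression as a qualifier/start/end dict, or None."""
    match = DATE_RE.search(text)
    if match is None:
        return None
    qualifier, start, end = match.groups()
    return {
        "qualifier": qualifier,
        "start": int(start),
        "end": int(end) if end else None,
    }

entry = "Jean Huguenin (1685–1740) from Le Locle (Canton of Neuchâtel)"
origin = extract_origin(entry)
date = extract_date(entry)
```

Because the origin pattern stops at the first token that is not capitalised, the parenthesised canton is left out of the match; a real implementation would need further rules for names containing lowercase particles.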
The other information relevant to our dataset is the activities of the main protagonist. Categorization is difficult, as many actors were involved in several activities and their relatives' activities are often also recounted in the description. Family indicators are used to skip irrelevant activity information. Based on our discussion with Hans Fässler and the study of our primary sources, the following categories are pertinent:
trading = ['company', 'companies', 'merchants', 'merchant']
military = ['soldier', 'captain', 'lieutenant', 'commander', 'regiment', 'rebellion', 'troops']
plantation = ['plantation', 'plantations']
slave_trade = ['slave ship', 'slave-ship']
slave_owner = ['slaves', 'slave', 'slave-owner']
racist = ['racism', 'racist', 'races']
Activities categories and their related words
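A minimal sketch of how these keyword lists could be applied to a description is given below. The whole-word matching and the function name `categorize` are assumptions for illustration; the skipping of relatives' activities through family indicators is not reproduced here.

```python
import re

CATEGORIES = {
    "trading": ["company", "companies", "merchants", "merchant"],
    "military": ["soldier", "captain", "lieutenant", "commander",
                 "regiment", "rebellion", "troops"],
    "plantation": ["plantation", "plantations"],
    "slave_trade": ["slave ship", "slave-ship"],
    "slave_owner": ["slaves", "slave", "slave-owner"],
    "racist": ["racism", "racist", "races"],
}

def categorize(description):
    """Return every category with at least one keyword in the description."""
    text = description.lower()
    found = []
    for category, keywords in CATEGORIES.items():
        # Whole-word match so that, e.g., "company" does not match "accompany".
        if any(re.search(r"\b" + re.escape(k) + r"\b", text) for k in keywords):
            found.append(category)
    return found

labels = categorize("He was a merchant and owned a plantation with slaves.")
```

Note that, with these lists, any text mentioning a "slave ship" also matches the `slave_owner` keyword "slave", so an entry can legitimately receive several labels.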
The last category relates to structural contributions. It includes participation in Anti-Black Racism and Ideologies Relevant to Caribbean Economic Space, Marine Navigation, and African and European Logistics. The Marine Navigation section primarily concerns the development of navigation tools for colonial powers, and the logistics contributions relate to banking or insurance companies.
Finally, as the full description contains many details that are worth keeping, we attach it to the dataset once the information relevant for data analysis and visualization has been extracted. An example is given below.
Levels of confidence
For the origin location, the date and the person name, we calculate an accuracy value that indicates our level of confidence in the retrieved attribute. There is no confidence level for the colonial location, as it comes directly from the table of contents and is unambiguous.
Origin accuracy The origin location is found according to the schemas presented above. However, several locations may appear in the same portion of text, so the location we are looking for might be further along in the text. By counting the total number of Swiss cities present in the text, we can compute a level of confidence inversely proportional to it.
Date accuracy The date accuracy is calculated by counting how many date instances (i.e. four-digit strings) appear in the text in total.
Person accuracy For both the date and the person, retrieved based on the NER tags, the accuracy levels are calculated using the tag occurrences. Following the argument above, accuracy is calculated as the inverse of the number of tag occurrences.
Above all, the accuracy levels indicate how intricate the text is with respect to people, dates and locations: they tell us how complex the description is and how many entities might confuse our interpretation.
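The inverse-count idea can be sketched as below. The exact formula is an assumption: the text only states that the confidence is the inverse of (or inversely proportional to) the number of occurrences, so `1 / count` is used here, with `0.0` when nothing is found. All function names are hypothetical.

```python
import re

def inverse_count_confidence(count):
    """Return 1/count, or 0.0 when no candidate was found."""
    return 1.0 / count if count > 0 else 0.0

def date_confidence(text):
    # Every standalone four-digit string counts as a candidate date.
    return inverse_count_confidence(len(re.findall(r"\b\d{4}\b", text)))

def origin_confidence(text, swiss_cities):
    # Count how many of the known Swiss city names appear in the text.
    return inverse_count_confidence(sum(1 for city in swiss_cities if city in text))

def tag_confidence(tagged, tag):
    # Count the (merged) entities that carry the given NER tag.
    return inverse_count_confidence(sum(1 for _, t in tagged if t == tag))

sample = "David Louis Agassiz (1737-1807) left Switzerland for France in 1747"
sample_date_confidence = date_confidence(sample)
```

A description with a single date scores 1.0, while the sample above, which contains three four-digit strings, scores one third; this is why long, intricate entries pull the averages down.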
Dataset enrichment with geographical databases
One of the goals of this work is to visualize the archive content on a geographical map. We add geographical information for both the colonial location and the origin location (Switzerland) using the following methods.
For colonial location Colonial locations are retrieved from the table of contents, which organises the corpus mainly by country. A few exceptions are regions of the Caribbean economic space, states for North America, and other headings for structural contributions. Our model geolocalizes countries and US states based on the geographical coordinates of their capitals. Two different datasets are used, one for countries and one for US states. For regions of the Caribbean economic space, the French West Indies are mapped to Guadeloupe and the Danish West Indies to the U.S. Virgin Islands. Based on the content of the Southern Africa section, we used South Africa as the reference region, and the East Indies are mapped to Indonesia. The structural contributions are more difficult to map since, as mentioned by the archive author, they cannot be assigned to one single Caribbean country. We decided to map them to Switzerland, in order to highlight that some contributions didn't take place abroad but were still part of the European colonial project. A finer-grained retrieval would allow more specific locations to be extracted for descriptions in sections that concern several locations.
For origin locations (Switzerland) The geolocalization of origin locations is made at the level of cities. We use an additional dataset to map each origin location to a Swiss city with its geographical coordinates.
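The mapping decisions described above can be written down as a small lookup. This is a sketch only: the exact section titles used as keys are assumptions, and the coordinate table is left to the caller rather than hard-coded.

```python
# Region-level sections mapped to a single reference country, as described above.
REGION_TO_REFERENCE = {
    "French West Indies": "Guadeloupe",
    "Danish West Indies": "U.S. Virgin Islands",
    "Southern Africa": "South Africa",
    "East Indies": "Indonesia",
}

# Assumed titles of the structural-contribution sections.
STRUCTURAL_SECTIONS = {
    "Anti-Black Racism and Ideologies Relevant to Caribbean Economic Space",
    "Marine Navigation",
    "African and European Logistics",
}

def reference_place(section_title):
    """Return the country or state used to place a section on the map."""
    if section_title in STRUCTURAL_SECTIONS:
        return "Switzerland"
    return REGION_TO_REFERENCE.get(section_title, section_title)

def geolocate(section_title, capital_coordinates):
    """Return the coordinates of the reference place's capital, or None.

    `capital_coordinates` maps a country or US state name to coordinates and
    is supplied by the caller (e.g. loaded from the two datasets).
    """
    return capital_coordinates.get(reference_place(section_title))
```

Keeping the aliases in one table makes the modelling choices (for instance East Indies as Indonesia) explicit and easy to revise.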
Data Visualization
We used JavaScript, HTML, and CSS to implement the visualisation. The JavaScript library Leaflet.js is used to display the map and draw the connections between places. We store the extracted information in GeoJSON format for the map implementation because it is a simple open standard format for storing both geographical and non-spatial features (see website).
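As an illustration of the storage format, one connection could be serialised as a GeoJSON `LineString` feature as sketched below. The property names and the helper `connection_feature` are assumptions, and the coordinates in the example are approximate values used only for demonstration. GeoJSON positions are ordered longitude first, then latitude.

```python
import json

def connection_feature(origin, origin_lonlat, colony, colony_lonlat, properties):
    """Build a GeoJSON Feature linking an origin place to a colonial location."""
    return {
        "type": "Feature",
        "geometry": {
            "type": "LineString",
            # GeoJSON positions are [longitude, latitude].
            "coordinates": [list(origin_lonlat), list(colony_lonlat)],
        },
        "properties": {"origin": origin, "colonial_location": colony, **properties},
    }

feature = connection_feature(
    "Neuchâtel", (6.93, 46.99),   # approximate, illustrative coordinates
    "Haiti", (-72.34, 18.54),     # approximate, illustrative coordinates
    {"person": "Arthur Agassiz", "activities": ["trading"]},
)
collection = {"type": "FeatureCollection", "features": [feature]}
geojson_text = json.dumps(collection, ensure_ascii=False, indent=2)
```

A file in this shape can be passed to Leaflet's GeoJSON layer, and the non-spatial fields in `properties` are then available for popups and filtering.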
When the "Show All" button is clicked, the map displays all the connections; clicking on a line opens a popup with key information about the entry.
With the dropdown in the text panel on the left, the user can filter the list by origin city in Switzerland. Clicking on the name of either place zooms the map to the corresponding location. Clicking on the arrow draws the line between the two places.
Results
Overall, we extract 464 text items from the division of the initial page. With the combination of NER and BIO tagging with syntax structure pattern matching, we can retrieve 75% of the entries.
More precisely, 117 items of this set lack a usable person's name or location, which makes them irrelevant: 49 entries have no person defined, in 16 entries neither the person nor the location could be determined, and in 52 cases the person and location are in the wrong order with respect to our schemas. We are left with 327 entries.
The average confidence levels are 52%, 52%, and 38% for person, origin, and date respectively. The date average is low, but this reflects the fact that many dates occur in the text. Since we used syntax matching, we are fairly confident that this number indicates a high number of date occurrences and the complexity of the text rather than poor text processing. The same argument applies to the other indicators.
With this data, we can highlight some features of Swiss involvement in transatlantic slavery. Major cities such as Zurich and Bâle (Basel) have many links to activities that contributed to transatlantic slavery. Surprisingly, smaller localities such as Neuchâtel and Le Locle (which is also in the Canton of Neuchâtel) have major contributions too. Interestingly, on the colonial side, the country with the third most contributions is Switzerland which, as defined earlier, stands for contributions that cannot be localized in a single country: the structural contributions.
The extracted information can also support further investigation of prominent Swiss figures in the transatlantic business. An analysis of the most frequent last names reveals that the Flournoy, Zinzendorf and Zollicoffer families are the most cited in the archive. The archive author insists that the interconnection of the slavery economies of North America, the Caribbean and Brazil (and beyond) is also demonstrated by the fact that several Swiss families globalized into more than one space; the Flournoy family is cited as an example in that passage.
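The last-name analysis can be sketched with a simple frequency count. This is an assumption about how such a count could be done, not the project's code; taking the final token as the last name is a simplification that would need refinement for names with particles.

```python
from collections import Counter

def most_common_last_names(person_names, n=3):
    """Count last names, taking the final token of each extracted name."""
    last_names = [name.split()[-1] for name in person_names if name.strip()]
    return Counter(last_names).most_common(n)

# Names taken from the examples quoted on this page, for illustration only.
names = ["Jean Huguenin", "David Louis Agassiz", "Louis Agassiz", "Arthur Agassiz"]
ranking = most_common_last_names(names, n=2)
```

Run over the full set of extracted person names, the same count gives the ranking of families discussed above.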
Finally, another important piece of data extracted from the archive is the activities in which people were involved. The distribution reflects the context of plantations in the Americas and shows that Swiss individuals were often owners of such enterprises and thus owners of slaves.
In the end, 106 entries are visualized on the website. The significant drop is due to the lack of geographical coordinates for some Swiss locations. Adding these coordinates would be the first step needed to significantly increase the amount of data available for visualisation.
However, the structured data clearly shows the numerous connections that Switzerland has across the Atlantic in the context of transatlantic slavery, and makes analysis and further study of the data possible.
Limitations
The limitations are presented following the steps of the methodology.
Text processing
The complete archive has 464 items, i.e. entries about different actors. However, retrieving information such as the name and origin of the actor, as well as his activities and their location, is difficult. The texts can be quite complex and intricate, "as were the implications of Switzerland in Black Slavery" [3].
- David Louis Agassiz (1737–1807), uncle of the racist and glaciologist Louis Agassiz (1807–1873), was a financier who left Switzerland for France in 1747 with his friend Jacques Necker in order to work in the Parisian branch of the Thellusson et Vernet bank (investments in colonial companies, links with the slave trade). Until 1770, David Louis Agassiz cooperated with Pourtalès of Neuchâtel via the company «Joseph Lieutaud et Louis Agassiz». Necker was to become Louis XVI’s Minister of Finance, whereas David Louis Agassiz left for Britain where he acquired a considerable fortune and anglicised his name to Arthur David Lewis Agassiz. He was naturalised by a private Act of Parliament in 1766. Agassiz dealt in cotton, silk, sugar, cocoa, coffee, tobacco, and cochineal and had business relations with France, Spain, Portugal, Italy, Germany, Belgium, Denmark, the Netherlands, Sweden, Switzerland, Russia, North and South America and the East and West Indies. In 1776, Francis Anthony Rougemont (1713–1788) from a Neuchâtel family joined the partnership under the name of «Agassiz, Rougemont et Cie.», a company which had close ties with «MM Pourtalès et Cie.» from Neuchâtel (ownership of plantations on Grenada, indiennes industry, banking). Arthur David Lewis Agassiz’s son Arthur Agassiz (1771–1866), cousin of the racist Louis Agassiz, took over the family business, and later formed a company «Agassiz, Son & Company». In 1823, Arthur Agassiz was working in Port-au-Prince (Haiti) with «Jean Robert Bernard et Cie.».
Limitations of NER versus pattern recognition. The results of NER processing are not reliable for all tags. For person names, Stanford NER performs reliably and visual inspection shows good results. However, Stanford NER misses many locations: most are either not recognized or miscategorized as organizations. Similarly, dates aren't well recognized. These limitations made it worthwhile to use pattern matching and develop our own model. This is required to match the author's style, but it makes our model sensitive to changes in writing style.
Dataset enrichment with geographical databases
The colonial locations don't all have the same level of definition: even when using only the TOC, some are regions, others countries or states. Our method introduces some artifacts linked to the model's decisions: for example, the East Indies definitely cover more than just Indonesia. To overcome this limitation, we would need to retrieve the colonial location with another method. We suggest searching each text for a list of countries and cities, and falling back on a default value based on the TOC. For origin locations, many values have no geographical coordinates because the localities are too small (Saint-Aubin, Bournens, Bourmens); this could be fixed by using an additional dataset.
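The suggested improvement could look like the sketch below: search the description for known place names and fall back on the TOC-derived value when none is found. The function name, the choice of the earliest mention, and the plain substring matching are all assumptions; this is a proposal, not something implemented in the project.

```python
def find_colonial_location(text, known_places, toc_default):
    """Return the known place mentioned earliest in the text, else the TOC default."""
    hits = [(text.find(place), place) for place in known_places if place in text]
    # min() picks the hit with the smallest position, i.e. the earliest mention.
    return min(hits)[1] if hits else toc_default

description = (
    "In 1823, Arthur Agassiz was working in Port-au-Prince (Haiti); "
    "the related company owned plantations on Grenada."
)
location = find_colonial_location(description, ["Grenada", "Haiti"], "Caribbean")
```

Plain substring matching can produce false positives when one place name is contained in another, so a production version would likely match on word boundaries.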
Data analysis and visualisation
It is worth noting that this data is observational. We therefore have no control over how the database was constituted, and it cannot be taken as representative of all involvement in transatlantic slavery. However, it clearly shows the numerous connections Switzerland has across the Atlantic.
Links
Github repository: Colonial-heritage-in-Switzerland
Primary source: Caricom Compilation Archive
Website : Switzerland and the Transatlantic Slavery Website
Secondary sources: geonamescaches, uscapitals.
References
- ↑ David, Thomas, Bouda Etemad, and Janick Marina Schaufelbuehl. 2005. La Suisse et l'esclavage des Noirs. Lausanne (Suisse): Société d'histoire de la Suisse romande.
- ↑ Fässler, Hans. 2007. Une Suisse esclavagiste: voyage dans un pays au-dessus de tout soupçon. Paris: Duboiris.
- ↑ Hans Fässler