Social Dynamics in Urban Context
Open tools, models and data - Paris and its suburbs, 1789-1950

Tools and Web services

Semi-automatic annotation tool

A semi-automatic text annotation tool has been developed by the project. It takes PDF documents as input and processes them automatically in three steps:

  • layout detection,
  • optical character recognition (with PERO OCR),
  • named entity recognition (fine-tuned CamemBERT model).

Users can then check and manually correct each automatically detected and processed text section.
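
The three automatic steps and the manual correction pass can be pictured as a small pipeline. The sketch below is only an illustration, not the project's implementation: `Section`, `annotate`, `correct` and the `toy_*` stand-ins are hypothetical names, the entity labels `PER` and `LOC` are assumptions, and the sample directory entries are invented. In a real setting the stand-ins would be replaced by calls to a layout detector, PERO OCR and the fine-tuned CamemBERT model.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Entity = Tuple[str, str]  # (label, surface text); the labels are assumptions


@dataclass
class Section:
    """One detected text section, enriched step by step."""
    region: str                                   # output of layout detection
    text: str = ""                                # output of OCR
    entities: List[Entity] = field(default_factory=list)  # output of NER
    corrected: bool = False                       # True once a user has edited it


def annotate(document: str,
             detect_layout: Callable[[str], List[str]],
             ocr: Callable[[str], str],
             ner: Callable[[str], List[Entity]]) -> List[Section]:
    """Chain layout detection, OCR and NER over one document."""
    sections = []
    for region in detect_layout(document):
        text = ocr(region)
        sections.append(Section(region=region, text=text, entities=ner(text)))
    return sections


def correct(section: Section, text: str, entities: List[Entity]) -> None:
    """Record a manual correction made by a user on one section."""
    section.text = text
    section.entities = entities
    section.corrected = True


# Toy stand-ins (hypothetical) so that the sketch runs without any model.
def toy_layout(document: str) -> List[str]:
    # Treat every non-empty line as one layout region.
    return [line for line in document.splitlines() if line.strip()]


def toy_ocr(region: str) -> str:
    # The "image" is already text here, so OCR only trims whitespace.
    return region.strip()


def toy_ner(text: str) -> List[Entity]:
    # Naive rule: text before the first comma is a name, the rest an address.
    name, _, rest = text.partition(",")
    entities = [("PER", name.strip())]
    if rest.strip():
        entities.append(("LOC", rest.strip()))
    return entities


# Invented sample entries, for illustration only.
sample = "Dupont, rue de la Paix 12\n\nMartin, boulevard Voltaire 3\n"
sections = annotate(sample, toy_layout, toy_ocr, toy_ner)

# A user fixes a misread street number in the second section.
fixed_text = "Martin, boulevard Voltaire 8"
correct(sections[1], fixed_text, toy_ner(fixed_text))
```

Passing the three steps in as functions keeps the sketch independent of any particular model, and the `corrected` flag separates purely automatic output from sections a user has reviewed.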

SODUCO corpus of directories
The trade directories from the 19th century are a challenging dataset with very heterogeneous layouts, fonts, and contents. Source: / Bibliothèque nationale de France
SODUCO text annotation tool

Geohistorical geocoding tool

SODUCO geohistorical geocoding tool
The historical geocoder takes both addresses and dates into account.
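
One way to picture a geocoder that uses dates as well as addresses is a gazetteer in which every address record carries a validity interval. The sketch below is an assumption about how such matching could work, not a description of the SODUCO geocoder: `AddressRecord`, `geocode`, the ranking by temporal distance, and all sample records and coordinates are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AddressRecord:
    street: str
    number: int
    valid_from: int   # first year for which the source attests the address
    valid_to: int     # last year, inclusive
    x: float          # coordinates in an arbitrary planar system
    y: float
    source: str       # which map or register the record comes from


def _normalise(street: str) -> str:
    # Case-insensitive comparison with collapsed whitespace.
    return " ".join(street.lower().split())


def geocode(gazetteer: List[AddressRecord], street: str, number: int,
            year: int) -> Optional[AddressRecord]:
    """Return the record for this address that is closest in time to `year`."""
    candidates = [r for r in gazetteer
                  if _normalise(r.street) == _normalise(street)
                  and r.number == number]
    if not candidates:
        return None

    def temporal_distance(record: AddressRecord) -> int:
        # Zero inside the validity interval, otherwise years to its nearest end.
        if record.valid_from <= year <= record.valid_to:
            return 0
        return min(abs(year - record.valid_from), abs(year - record.valid_to))

    return min(candidates, key=temporal_distance)


# Invented records: the same address located differently in two periods.
gazetteer = [
    AddressRecord("Rue de l'Exemple", 12, 1800, 1860, 100.0, 200.0, "map A"),
    AddressRecord("Rue de l'Exemple", 12, 1861, 1950, 140.0, 260.0, "map B"),
]

early = geocode(gazetteer, "rue de l'exemple", 12, 1845)
late = geocode(gazetteer, "Rue de l'Exemple", 12, 1900)
missing = geocode(gazetteer, "Rue Inconnue", 1, 1845)
```

In this toy gazetteer the same street and number resolve to different coordinates depending on the year of the query, which is the behaviour a purely address-based geocoder cannot provide.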

Old maps vectorisation tool

SODUCO vectorisation tool
A sample vectorisation output

Vector data validation and correction tool

A collaborative tool for validating and editing geospatial data has been developed to improve data quality through human validation of any type of geospatial data. Users can create, remove, modify or validate any feature, including both its geometry and its attributes.

SODUCO validation tool

  • General view of the tool with uploaded data
  • Edit mode, creation of the geometry of a new feature
  • Edit mode, change of attributes of an existing feature
  • Status mode to see what features were created, removed or modified
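
The create, remove, modify and validate actions, together with a status view, can be modelled by attaching a status to every feature. The following is a minimal sketch under stated assumptions, not the tool's actual data model: `Status`, `Feature`, `Layer` and their methods are hypothetical names, point geometries stand in for any geometry type, and the sample features are invented.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]


class Status(Enum):
    UNCHECKED = "unchecked"   # uploaded, not yet reviewed
    CREATED = "created"
    MODIFIED = "modified"
    REMOVED = "removed"
    VALIDATED = "validated"


@dataclass
class Feature:
    geometry: Point
    attributes: Dict[str, str] = field(default_factory=dict)
    status: Status = Status.UNCHECKED


class Layer:
    """A set of features whose edits are tracked through their status."""

    def __init__(self) -> None:
        self.features: Dict[int, Feature] = {}
        self._next_id = 1

    def _add(self, geometry: Point, attributes: Dict[str, str],
             status: Status) -> int:
        fid = self._next_id
        self._next_id += 1
        self.features[fid] = Feature(geometry, dict(attributes), status)
        return fid

    def load(self, geometry: Point, attributes: Dict[str, str]) -> int:
        # Uploaded data starts as unchecked.
        return self._add(geometry, attributes, Status.UNCHECKED)

    def create(self, geometry: Point, attributes: Dict[str, str]) -> int:
        return self._add(geometry, attributes, Status.CREATED)

    def modify(self, fid: int, geometry: Optional[Point] = None,
               attributes: Optional[Dict[str, str]] = None) -> None:
        feature = self.features[fid]
        if geometry is not None:
            feature.geometry = geometry
        if attributes:
            feature.attributes.update(attributes)
        feature.status = Status.MODIFIED

    def remove(self, fid: int) -> None:
        # Soft delete: the feature stays listed so the removal can be reviewed.
        self.features[fid].status = Status.REMOVED

    def validate(self, fid: int) -> None:
        self.features[fid].status = Status.VALIDATED

    def by_status(self, status: Status) -> List[int]:
        return [fid for fid, f in self.features.items() if f.status is status]


# Invented sample session.
layer = Layer()
a = layer.load((2.35, 48.85), {"name": "shop A"})
b = layer.load((2.36, 48.86), {"name": "shop B"})
c = layer.load((2.37, 48.87), {"name": "shop C"})
d = layer.create((2.38, 48.88), {"name": "shop D"})

layer.modify(a, attributes={"name": "shop A (renamed)"})
layer.remove(b)
layer.validate(c)
```

Keeping removed features with a `REMOVED` status rather than deleting them is a design choice made here so that a status view can list every kind of change.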

Data and historical sources catalogue

A catalogue has been developed to store, reference and retrieve the archival records and digital data used and produced throughout the project.

SODUCO catalogue