{ "cells": [ { "cell_type": "code", "execution_count": 1, "id": "569a0682", "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "try:\n", " import IPython\n", "except:\n", " !pip install IPython\n", " import IPython \n", "from IPython.core.display import HTML\n", "# add stylesheet for notebook\n", "HTML(\"\"\"\"\"\")" ] }, { "cell_type": "markdown", "id": "302c49d7", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "\n", "

Enriching Collections Data with Linked Art

\n", "\n", "The Linked Art reconciliation exemplar steps through the process of reconciling geographical place names that occur in the titles of artworks by the artist John Ruskin. The aim is to enrich the collections data with an additional representation: the geographical coordinates of the place depicted in each artwork. \n", "\n", "From Wikipedia:\n", "\n", "
\"John Ruskin (8 February 1819 – 20 January 1900) was an English writer, philosopher, art critic and polymath of the Victorian era. He wrote on subjects as varied as geology, architecture, myth, ornithology, literature, education, botany and political economy.\"
\n", "\n", "John Ruskin travelled extensively in Europe and was a prolific artist, creating drawings and paintings whose titles often included place names for the locations depicted. \n", "\n", "#### Artwork Title contains Place Name\n", "The artwork title is recorded in the title field of many of the collection data records, and this field is used as the basis for the reconciliation process shown here.\n", "\n", "#### OpenRefine Tool to Reconcile Data\n", "The place names are reconciled with the Getty Thesaurus of Geographic Names (TGN), using the OpenRefine tool.\n", "\n", "#### The Getty Thesaurus of Geographic Names (TGN)\n", "Reconciliation with the Getty Thesaurus of Geographic Names (TGN) allows additional information to be associated with the artwork: \n", "- an authoritative global identifier for the geographical location depicted \n", "- geographical coordinates\n", "\n", "#### Input Data Files\n", "\n", "The input files are Linked Art files created with the `01-06-Transform-John-Ruskin` Jupyter notebook.\n", "\n", "\n", "### Further Reading\n", "\n", "- The Getty Thesaurus of Geographic Names® Online (TGN) http://www.getty.edu/research/tools/vocabularies/tgn\n", "- John Ruskin Wikipedia entry https://en.wikipedia.org/wiki/John_Ruskin\n", "\n" ] }, { "cell_type": "markdown", "id": "348d9614", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Reconciliation Process\n", "\n", "1. Create CSV file from Linked Art JSON-LD\n", "2. Identify place name in title\n", "3. Use OpenRefine to reconcile place names\n", "4. Define geolocation representation in Linked Art\n", "5. Add place name and coordinates into Linked Art JSON-LD files" ] }, { "cell_type": "markdown", "id": "aec2095c", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## 1. 
Create CSV file from Linked Art JSON-LD\n", "\n", "To reconcile the place names in the artwork titles:\n", "- create a CSV file from the JSON-LD Linked Art files\n", "- the CSV file contains the `id` and `_label` properties of each artwork\n", "\n", "The script gets a list of all files in a selected directory using `os.listdir()` and iterates over them.\n", "\n", "- `json.load` is used to deserialize each Linked Art JSON-LD file to a Python dictionary object. \n", "  - `json.load` uses the following conversion table https://docs.python.org/3/library/json.html#json-to-py-table \n", "\n", "Finally, the script uses `csv.DictWriter` \n", "- to create an object that maps the Python dictionaries onto output rows. \n", "- `DictWriter.writeheader()` writes a row with the field names (as specified in the constructor) to the writer’s file object. \n", "- `DictWriter.writerows()` writes all elements in rows to the writer’s file object.\n", "\n", "\n", "#### Further Reading\n", "- os Python library https://docs.python.org/3/library/os.html\n", "- os.listdir() tutorial https://www.tutorialspoint.com/python/os_listdir.htm\n", "- json Python library https://docs.python.org/3/library/json.html\n", "- csv Python library https://docs.python.org/3/library/csv.html" ] }, { "cell_type": "code", "execution_count": 2, "id": "81e0aaea", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "# import the standard library modules used below (no installation needed)\n", "import csv\n", "import json\n", "import os\n", "\n", "# list of rows (one dictionary per artwork) to write to the CSV file\n", "artworkCSV = []\n", "\n", "# Linked Art JSON-LD file location\n", "artworkFileDir = \"./data/ruskin/output/json/\"\n", "artworkFileList = os.listdir(artworkFileDir)\n", "\n", "# iterate over Linked Art JSON-LD files\n", "for artworkFile in artworkFileList:\n", "    # read the file and append a row for it to artworkCSV\n", "    with open(os.path.join(artworkFileDir, artworkFile)) as artworkFileContents:\n", "\n", "        # deserialize the file into a Python dictionary\n", "        artworkObjJSON = json.load(artworkFileContents)\n", "\n", "        # skip files that have no \"_label\" property\n", "        if \"_label\" not in artworkObjJSON:\n", "            continue\n", "\n", "        # append the artwork properties as a new CSV row\n", "        artworkCSV.append({\n", "            \"id\": artworkObjJSON[\"id\"],\n", "            \"place\": artworkObjJSON[\"_label\"],\n", "            \"place_modified\": \" \",\n", "            \"coords\": \" \"\n", "        })\n", "\n", "# create CSV file\n", "artworkCsvFile = \"./data/ruskin/ruskin-places.csv\" # file location\n", "\n", "# newline='' is recommended by the csv module documentation for files passed to a writer\n", "with open(artworkCsvFile, 'w', newline='') as f:\n", "    # write column headings\n", "    w = csv.DictWriter(f, [\"id\",\"place\",\"place_modified\",\"coords\"])\n", "    w.writeheader()\n", "    # write rows with artwork properties\n", "    w.writerows(artworkCSV)" ] }, { "cell_type": "markdown", "id": "5825b0e1", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Result - CSV File with Place Names\n", "\n", "The contents of the resulting CSV file are shown below for illustration.\n", "\n", "The CSV file is read into a `pandas` dataframe. \n", "
`Pandas` is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language.
\n", " \n", " \n", "A `pandas dataframe` is a pandas data structure containing \n", "
two-dimensional, size-mutable, potentially heterogeneous tabular data.
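
As a rough illustration of that definition (this is not part of the original exemplar), the sketch below builds a small dataframe with the same four column names as the CSV file. The two sample rows, the `example.org` identifiers and the names `sample_rows`, `sample_frame` and `title_length` are made up for the example.

```python
import pandas as pd

# Hypothetical sample rows using the same column names as the CSV file
sample_rows = [
    {'id': 'https://example.org/object/1', 'place': 'View of Bologna',
     'place_modified': ' ', 'coords': ' '},
    {'id': 'https://example.org/object/2', 'place': 'Study of a Venetian Capital',
     'place_modified': ' ', 'coords': ' '},
]

# Two-dimensional: one row per dictionary, one column per key
sample_frame = pd.DataFrame(sample_rows)

# Size-mutable: a column can be added after the dataframe is created
sample_frame['title_length'] = sample_frame['place'].str.len()

# Heterogeneous: text columns and an integer column sit side by side
print(sample_frame.shape)
print(sample_frame.dtypes)
```

The same operations apply to the dataframe read from `ruskin-places.csv` in the next cell; only the source of the rows differs.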
\n", "\n", "The pandas dataframe allows easy manipulation of two-dimensional tabular data. \n", "\n", "The `IPython` library is also used to display the contents of the CSV file \n", " \n", "#### Further Reading \n", "- pandas https://pandas.pydata.org/\n", "- pandas dataframe https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html\n", "- IPython use in Jupyter notebooks https://coderzcolumn.com/tutorials/python/how-to-display-contents-of-different-types-in-jupyter-notebook-lab" ] }, { "cell_type": "code", "execution_count": 3, "id": "85365962", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table>\n",
"  <thead>\n",
"    <tr><th></th><th>id</th><th>place</th><th>place_modified</th><th>coords</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>https://collections.ashmolean.org/collection/1...</td><td>Engraving of Ruskin's Drawing of the Petal Vau...</td><td></td><td></td></tr>\n",
"    <tr><th>1</th><td>https://collections.ashmolean.org/collection/1...</td><td>Enlarged Study of a Prawn's Rostrum</td><td></td><td></td></tr>\n",
"    <tr><th>2</th><td>https://www.harvardartmuseums.org/collections/...</td><td>Study of a Venetian Capital</td><td></td><td></td></tr>\n",
"    <tr><th>3</th><td>https://collections.ashmolean.org/collection/1...</td><td>Autumnal Cloud filling the Valley of Geneva, t...</td><td></td><td></td></tr>\n",
"    <tr><th>4</th><td>https://collections.ashmolean.org/collection/1...</td><td>Axmouth Landslip from Dolands Farm</td><td></td><td></td></tr>\n",
"    <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"    <tr><th>274</th><td>https://collections.ashmolean.org/collection/1...</td><td>The Head of a Kite, from Life</td><td></td><td></td></tr>\n",
"    <tr><th>275</th><td>https://www.harvardartmuseums.org/collections/...</td><td>Part of a Sketch of the Northwest Porch of St....</td><td></td><td></td></tr>\n",
"    <tr><th>276</th><td>http://www.rijksmuseum.nl/nl/collectie/nl-RP-T...</td><td>Gezicht op S. Anastasia te Verona, over de Adige</td><td></td><td></td></tr>\n",
"    <tr><th>277</th><td>https://collections.ashmolean.org/collection/2...</td><td>Architectural detail: stone bracket</td><td></td><td></td></tr>\n",
"    <tr><th>278</th><td>https://collections.ashmolean.org/collection/1...</td><td>Study of the Marble Inlaying on the Front of t...</td><td></td><td></td></tr>\n",
"  </tbody>\n",
"</table>\n",
"<p>279 rows × 4 columns</p>\n",
"</div>
" ], "text/plain": [ " id \\\n", "0 https://collections.ashmolean.org/collection/1... \n", "1 https://collections.ashmolean.org/collection/1... \n", "2 https://www.harvardartmuseums.org/collections/... \n", "3 https://collections.ashmolean.org/collection/1... \n", "4 https://collections.ashmolean.org/collection/1... \n", ".. ... \n", "274 https://collections.ashmolean.org/collection/1... \n", "275 https://www.harvardartmuseums.org/collections/... \n", "276 http://www.rijksmuseum.nl/nl/collectie/nl-RP-T... \n", "277 https://collections.ashmolean.org/collection/2... \n", "278 https://collections.ashmolean.org/collection/1... \n", "\n", " place place_modified coords \n", "0 Engraving of Ruskin's Drawing of the Petal Vau... \n", "1 Enlarged Study of a Prawn's Rostrum \n", "2 Study of a Venetian Capital \n", "3 Autumnal Cloud filling the Valley of Geneva, t... \n", "4 Axmouth Landslip from Dolands Farm \n", ".. ... ... ... \n", "274 The Head of a Kite, from Life \n", "275 Part of a Sketch of the Northwest Porch of St.... \n", "276 Gezicht op S. Anastasia te Verona, over de Adige \n", "277 Architectural detail: stone bracket \n", "278 Study of the Marble Inlaying on the Front of t... 
\n", "\n", "[279 rows x 4 columns]" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "try:\n", " import pandas as pd\n", "except:\n", " %pip install pandas\n", " import pandas as pd\n", "\n", "try:\n", " import IPython\n", "except:\n", " %pip install IPython\n", " import IPython \n", " \n", "from IPython.display import display, HTML, Javascript\n", "\n", " # CSV file location\n", "artworkCsvFile = \"./data/ruskin/ruskin-places.csv\"\n", "\n", "# read CSV file into pandas dataframe \n", "dataFrame = pd.read_csv(artworkCsvFile,low_memory=False)\n", "\n", "# define how many columns and rows to display == all\n", "pd.options.display.max_columns = len(dataFrame.columns)\n", "#pd.options.display.max_rows = len(dataFrame.index)\n", "\n", "# use IPython display to display the contents of CSV file\n", "display(dataFrame)" ] }, { "cell_type": "markdown", "id": "931ce294", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## 2. Identify Place Names in Title\n", "\n", "Next, get place name from artwork title\n", "- extract possible place names from the artwork title field, to help with the reconciliation process. \n", "- a list of possible place names is used to help identify place names in the field. \n", "- add extracted place names to `place_modified` column \n", "- update CSV file\n", "\n", "A list of place names `placeNames` is created to help with extracting place names from the artwork title. 
It was compiled by reviewing the values in the `place` column.\n", "\n", "\n" ] }, { "cell_type": "code", "execution_count": 4, "id": "bdcc6a8f", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "artworkCsvFile = \"./data/ruskin/ruskin-places.csv\" # file location\n", "\n", "\n", "# read CSV file into pandas dataframe\n", "dataFrame = pd.read_csv(artworkCsvFile,low_memory=False)\n", "\n", "\n", "# A list of place names `placeNames` is created to help with extracting place names from the artwork title.\n", "placeNames = [\n", "\"Florence\",\"Bologna\",\"Lucca\",\"Alps\",\"Oxford\",\"Rome\", \"Venice\",\"Fribourg\",\"Neuchâtel\",\"Sestri\",\"Visp\",\"Chamonix\",\n", "\"Abbeville\",\"Verona\",\"Vorarlberg\",\"Baden\",\"Schaffhausen\",\"Faido\",\"Normandy\",\"Genève\",\"Geneva\",\n", "\"Gloucester\",\"Basel\",\"Luzern\",\"Padua\",\"Habsburg\",\"Rhine\",\"Zug\",\"Aix-la-Chapelle\",\"Siena\",\"Mont Blanc\",\"Lago di Como\",\n", "\"Bellinzona\",\"Lake of Lecco\"\n", "]\n", "\n", "# variant strings in titles that should all be recorded as `Venezia`\n", "places = {\"Venezia\":[\"Venice\",\"Venetian\",\"St Mark\",\"St. 
Mark\"]}\n", "\n", "\n", "# iterate over dataframe\n", "for index,row in dataFrame.iterrows():\n", "\n", "    # iterate over place names\n", "    # check if any place name in placeNames is present in row\n", "    for place in placeNames:\n", "        # if place name found, add to place_modified column\n", "        if place in row[\"place\"]:\n", "            dataFrame.at[index,\"place_modified\"] = place\n", "\n", "    # iterate over place names for Venice\n", "    for place in places[\"Venezia\"]:\n", "        # if place found add `Venezia` to place_modified column\n", "        if place in row[\"place\"]:\n", "            dataFrame.at[index,\"place_modified\"] = \"Venezia\"\n", "\n", "\n", "# remove records where place_modified is blank\n", "dataFrame = dataFrame[dataFrame.place_modified != \" \"]\n", "dataFrame.to_csv(artworkCsvFile, index=False) \n" ] }, { "cell_type": "markdown", "id": "d0814adc", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Result - CSV File containing Place Name for use in OpenRefine\n", "\n", "The result of this process is a CSV file with the column `place_modified` containing a place name string that will be used for reconciliation in the OpenRefine tool.\n", "\n", "Records where a place name has not been identified have been removed from the CSV file." ] }, { "cell_type": "code", "execution_count": 5, "id": "5dfbd5dd", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table>\n",
"  <thead>\n",
"    <tr><th></th><th>id</th><th>place</th><th>place_modified</th><th>coords</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>https://www.harvardartmuseums.org/collections/...</td><td>Study of a Venetian Capital</td><td>Venezia</td><td></td></tr>\n",
"    <tr><th>1</th><td>https://collections.ashmolean.org/collection/1...</td><td>Autumnal Cloud filling the Valley of Geneva, t...</td><td>Geneva</td><td></td></tr>\n",
"    <tr><th>2</th><td>https://www.harvardartmuseums.org/collections/...</td><td>Tom Tower, Christ Church, Oxford</td><td>Oxford</td><td></td></tr>\n",
"    <tr><th>3</th><td>https://www.harvardartmuseums.org/collections/...</td><td>Study of a Venetian Capital</td><td>Venezia</td><td></td></tr>\n",
"    <tr><th>4</th><td>https://www.tate.org.uk/art/artworks/13033</td><td>View of Bologna</td><td>Bologna</td><td></td></tr>\n",
"    <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"    <tr><th>102</th><td>https://collections.ashmolean.org/collection/1...</td><td>Sketch of the Oak Spray in Mantegna's Fresco o...</td><td>Padua</td><td></td></tr>\n",
"    <tr><th>103</th><td>https://www.nga.gov/collection/72870</td><td>The Garden of San Miniato near Florence</td><td>Florence</td><td></td></tr>\n",
"    <tr><th>104</th><td>https://www.harvardartmuseums.org/collections/...</td><td>Part of a Sketch of the Northwest Porch of St....</td><td>Venezia</td><td></td></tr>\n",
"    <tr><th>105</th><td>http://www.rijksmuseum.nl/nl/collectie/nl-RP-T...</td><td>Gezicht op S. Anastasia te Verona, over de Adige</td><td>Verona</td><td></td></tr>\n",
"    <tr><th>106</th><td>https://collections.ashmolean.org/collection/1...</td><td>Study of the Marble Inlaying on the Front of t...</td><td>Venezia</td><td></td></tr>\n",
"  </tbody>\n",
"</table>\n",
"<p>107 rows × 4 columns</p>\n",
"</div>
" ], "text/plain": [ " id \\\n", "0 https://www.harvardartmuseums.org/collections/... \n", "1 https://collections.ashmolean.org/collection/1... \n", "2 https://www.harvardartmuseums.org/collections/... \n", "3 https://www.harvardartmuseums.org/collections/... \n", "4 https://www.tate.org.uk/art/artworks/13033 \n", ".. ... \n", "102 https://collections.ashmolean.org/collection/1... \n", "103 https://www.nga.gov/collection/72870 \n", "104 https://www.harvardartmuseums.org/collections/... \n", "105 http://www.rijksmuseum.nl/nl/collectie/nl-RP-T... \n", "106 https://collections.ashmolean.org/collection/1... \n", "\n", " place place_modified coords \n", "0 Study of a Venetian Capital Venezia \n", "1 Autumnal Cloud filling the Valley of Geneva, t... Geneva \n", "2 Tom Tower, Christ Church, Oxford Oxford \n", "3 Study of a Venetian Capital Venezia \n", "4 View of Bologna Bologna \n", ".. ... ... ... \n", "102 Sketch of the Oak Spray in Mantegna's Fresco o... Padua \n", "103 The Garden of San Miniato near Florence Florence \n", "104 Part of a Sketch of the Northwest Porch of St.... Venezia \n", "105 Gezicht op S. Anastasia te Verona, over de Adige Verona \n", "106 Study of the Marble Inlaying on the Front of t... Venezia \n", "\n", "[107 rows x 4 columns]" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "artworkCsvFile = \"./data/ruskin/ruskin-places.csv\" # file location\n", "\n", "dataFrame = pd.read_csv(artworkCsvFile,low_memory=False)\n", "\n", "# display table for illustration\n", "display(dataFrame)" ] }, { "cell_type": "markdown", "id": "a10d8ccc", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## 3. 
Use OpenRefine to Reconcile Place Names\n", "\n", "Next\n", "- use OpenRefine to match the values in the `place_modified` field with place names in the name authority, the Getty Thesaurus of Geographic Names® Online (TGN).\n", "\n", "`OpenRefine` is a tool for working with messy data, and includes support for reconciliation with external data such as name authorities.\n", "\n", "`The Getty Thesaurus of Geographic Names® Online (TGN)` is one of several Getty Vocabularies, which provide structured resources that can be used to improve access to information about art, architecture, and material culture. From the website:\n", "\n", "
Through rich metadata and links, the Getty Vocabularies provide powerful conduits for knowledge creation, research, and discovery for digital art history and related disciplines.\n", "\n", "TGN is a thesaurus. TGN is not a geographic information system (GIS), although it may be linked to existing major, general-purpose, geographic databases and maps. While most records in TGN include coordinates, these coordinates are approximate and are intended for reference (\"finding purposes\") only (as is true of coordinates in most atlases and other resources, including NGA (formerly NIMA) databases).\n", "\n", "
\n", "\n", "\n", "#### Further Reading\n", "\n", "- OpenRefine https://openrefine.org/\n", "- The Getty Thesaurus of Geographic Names® Online (TGN) http://www.getty.edu/research/tools/vocabularies/tgn" ] }, { "cell_type": "code", "execution_count": 6, "id": "fce5be0d", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "

Open Refine Website

" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from IPython.display import IFrame, HTML\n", "\n", "display(HTML(\"

Open Refine Website

\"))\n", "\n", "display(IFrame('https://openrefine.org/documentation.html', '100%', '600px'))" ] }, { "cell_type": "markdown", "id": "1ad62f9a", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Method\n", " \n", "The method used to reconcile the place names:\n", "- Download and install OpenRefine https://openrefine.org/download.html\n", "- Open OpenRefine and create a project\n", "- Upload the places CSV file\n", "- Reconcile place names in `place_modified`\n", "- Choose the TGN service to reconcile data with\n", "- Review Reconciliation Search Results\n", "- Add a Column Containing Entity TGN Identifiers further to Reconciliation Process\n", "- Manual Reconciliation\n", "\n", "#### Further Reading\n", "- OpenRefine https://openrefine.org\n" ] }, { "cell_type": "markdown", "id": "0b6973cb", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Download and Install OpenRefine\n", "\n", "- Download OpenRefine at https://openrefine.org/download.html\n", "- Installation instructions at https://docs.openrefine.org/manual/installing" ] }, { "cell_type": "markdown", "id": "f8f3537a", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Open OpenRefine and Create a Project\n", "\n", "The following video illustrates how to create a project in OpenRefine using a CSV file on the local drive." ] }, { "cell_type": "code", "execution_count": 7, "id": "0b70b3c9", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "

Video - OpenRefine - Create Project

" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "image/jpeg": "", "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from IPython.display import YouTubeVideo, HTML\n", "\n", "display(HTML(\"

Video - OpenRefine - Create Project

\"))\n", "\n", "\n", "# video of project creation in OpenRefine\n", "\n", "YouTubeVideo('h1aLc5uvdck', width=1024, height=576)" ] }, { "cell_type": "markdown", "id": "955da136", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Reconcile Place Names in `place_modified`\n", "\n", "- Right-click on `place_modified` column header\n", "- Select `Start reconciling`\n", "\n", "" ] }, { "cell_type": "markdown", "id": "4a068754", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Choose the TGN Service to Reconcile Data With\n", "\n", "- Choose the Getty Vocabulary Reconciliation Service that includes the TGN \n", "\n", "The video shows the following process:\n", "- select a column to reconcile \n", "- select a service to reconcile with\n", "- review options\n", "- start reconciliation\n", "\n", "#### Further Reading \n", "- Reconciliation services known to Wikidata - https://reconciliation-api.github.io/testbench/\n", "- TGN https://www.getty.edu/research/tools/vocabularies/tgn/" ] }, { "cell_type": "code", "execution_count": 8, "id": "d7e1102b", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "

Video - OpenRefine - Start Reconciliation

" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "image/jpeg": "", "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from IPython.display import YouTubeVideo, HTML\n", "\n", "display(HTML(\"

Video - OpenRefine - Start Reconciliation

\"))\n", "\n", "\n", "YouTubeVideo('Zm0woMobjpI', width=1024, height=576)\n", "\n" ] }, { "cell_type": "markdown", "id": "8cc9e363", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Review Reconciliation Search Results\n", "\n", "Once the reconciliation process has completed, the results need to be reviewed. The TGN contains many places in the United States of America that have the same names as locations in Italy, so the top match is not always the place depicted. \n", "\n", "Each match therefore needs to be reviewed. Once a correct match has been identified, it can be applied to all cells with the same place name. " ] }, { "cell_type": "code", "execution_count": 9, "id": "ab8d959a", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "

Video - OpenRefine - Review Reconciliation Results

" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "image/jpeg": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wCEABALDBoYFhoaGRoeHRsfIygmIyIhJC4qKCcvLi0zMC0yLS41SFBCNzhLOi4yRWFFS1NWW11bMkFlbWRYbFBZW1cBERISGRYZLxsaMF09Nz1XV1dXV1dXXVdXV1dXV1ddV1dXV1dXV1dXV1dXV1dXV1dXV1dXV1dXV1dXV1dXXVddV//AABEIAWgB4AMBIgACEQEDEQH/xAAbAAEAAwEBAQEAAAAAAAAAAAAAAQMEBQIGB//EAEYQAAEDAgIFBgwEBQUAAQUAAAEAAhEDIRIxBBNBUWEFFCJxkaEWMlJUYoGjscHR0vAjQnKSFTNTguEGQ6Ky8TQkY3OTwv/EABkBAQEBAQEBAAAAAAAAAAAAAAABAgMEBf/EACYRAQACAgEEAQUAAwAAAAAAAAABEQISUQMTIWFBFCIxkeGh0fD/2gAMAwEAAhEDEQA/APz9ERAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREF3NX+T3hOav8AJ7wuvo2mCmHiGuxti8WvPZvCpxiZkKjnc1f5PeE5q/ye8Lp1KwdGQgQtB5Q/CNMEiWhpGIYIBmYix9aDhnR3jMd4Ual25dDGA5rrGCDEr1U0iXNIvh8oyT9/NRHN1Ltyg0nbu8Lqc5aQZY3IxltjgrdH5Q1eTSWmJAdhNiDYgWy2byg42qP2Qmqdu7wu43lVweHkF0CAC/e/Gdm2AIEDDI2yvLeUcIogM/lTHSzBzBtu77orjal27vCal27vC6bdJhoEbCM9/qXrnYObZuDmDwPaEHK1Ltyal27vXTbpLZ/lt27vv/3qhzkAyGAdUb53Ijmal25NS7cumdIb5DY9XyR2ktmQxo7PkiuZqXbu8KNS7d3hdSnpOH8u2bGBkp55aMIyaM4yRHL1Lt3eFGpdu7wuodLJiQDvvmZB+HevR0z0R+bbvRXJ1Lt3eFOpduXW59fxBnv96yIMmpduTUu3LWitIyal25NS7ctaIMmpduTUu3LXCQgyal25NS7ctcJCDJqXbk1Lty1pCDJqXbk1Lty1pCDJqXbk1Lty1pCDJqXbk1Lty1pCDJqXbk1Lty1ooMmpduTUu3LWiDJqXbk1Lty1ogyal25NS7ctaIMmpduTUu3LWiDJqXbk1Lty1ogyal25NS7cunolcUyZ2xkAZF5F9hkdippODXNJAIBEgwQRtF0GIaO47O8L1zWp5PeF0K9YVH4oDbCbiSd5MCV1eQ+Wm6IaksFQPABBIAi/CZy7+CD5rmr/ACe8JzV/k94XV0fSdXUDxhkEmNnerNN07XFhywtg9KZMkl3CSclRyDoFWMWrdG/Z2r1S5OrPnCwmM8l26nKY1LqbXPh7KTXAkYRqwIw9cSqdG0zALOAvO8H5KSsOMdEqCxb3hOav8nvC6ulVxUe51hPyhRpFcPeXTnvMnKM0H63pVejSjGAJ4LzzqhqTW6OrAJJjdmI38FGnaPSrtwv2ZEZhS6jRNLU4Rq8JbhFrfe1dJjHXx+XKN95uq+FLOVdEJgvY1wJBa4QQWlwI65Y7rwmFFTlfRACQ5jow2aL9ItA/7t6sQVDuRNFdUe5wxNc0DDJsZeS6ZnETVdfZNlZU5I0Rzi5zSSYmXvvGDZP/ANtn7VipdXo8raLhaWjGXYIY1vS6YBEgxFiDfeN4U0uVtEc0OxsbLQ6DEgENOzOz25TmFFHkzRWeK1w6TXeO/Nohpz3
SWhoDxPRxOt1EgHbAHDNe9EoVm1QXnoBrwOlOZYRI32d6iOK0jDQ0PTgwNNRjThAJaRmcGJwERixGq79o2lXVtG0t1CkA5utGMuJdIDi12C+G8Eg7PFXXRWhx62gV6mjCjUcHk1gXOJE6sPxDZEkACMrnqVFPk3SmCWFge5z3PdaZfUBMWz1YAm3uXfRKHFq6Jpv5asy4zLgMIGENIOHMjE48YGUr2NG0wAnWYiWvcBiAaHknA3KS0CB2ncuuiUOHoOiaRR1dIPcW0g6xs0sGEUhiwxJwmReA48F2aTS1jQSSQACSZJ6yIXtEBZq7KuupOY6KYDhUaRnMQZ2EEd60ogIiKjMabb/jEG/5ss16dTENGtdcC4uTh2zuvmq9bSktwOJvMd5zUuqU4DcBIDC6OGEGOruss+UDRp2OuMC04uqbq6kzC2C/FLpBzkWA++KqqVKN24cQzMZGeM3yHv4q2jUY4dEGGnBB9Sk3Sw9UWCZaZ+4++pXKmk8TAbEgd4J+farlhoQIgQEREGhERAREQEREBERAREQEREGWr/NH6fivS81f5o/T8VKsgVKgqVAREQEREBERAREQEREBERAREQECKAglERAVDg6BhIB4ifiFeqCCQIMeqVvFJecNTy2fsPD0uv7z9DEAPFcdpu0Z7BfYowv8sft/yowP8sft/wArSLGzAmx2qVz9J0OoapqMc0HC0ZkTAqZwMpe0/wBqjRdHqsqDE+ZMvg2PQY0TkZlpOUQSg6KKJUqgiIgIiICIiAudyhUcNI0UAkAudIBsbDNdFczlL/5Oh/qd7goNoNWZwNAynbHapa+vaWt4wfd7lD9He6YqkA7Iy33XrUPwxrTN7xvj/PasTK09B9QsEth15AVjJI6Wf+VRzd+FwNUmQALZb1NOi8EE1CeEcVlWhERAQIoCCUREGhERAREQEREBERAREQEREGdwmsP0fFedLpVHUi2m7A8xB3Xkr3/vf2fFXbVZR5MwN654oaWC38RrthmN2eX3K6R2KUiaVh1WkdHptkATG07dn3CrFHS5nWM6iJHu+/f0kV2HPbR0qf5jYJvtIHZH32DT0rDAfTxSb7I2CIXQRNhzxR0qB+K2bk2kTe2WWXevJpaZ0oqU75SMjbh939XSRNhidT0iAA9vEx1bI6+711MZpRElzQcoJHbYEXv1WzuF0lCWOeyjpQJmowtPC+ewkG33xRlHSgI1jPf8Ovuy29FE2HOraLpF8FXYYLjl0pFgN0dnXJmi6Ri6dUFvCR+ebD9Mj18F0UTYQpRFkQsVSlX1mJjxhtYnO2WVtvFbUbkEGGkzSQ44nNMg32AxDR3z6uK9U6VcES9pl8unyYAgW69y2ogwClpPR6bAIAdtJjM3GZ+5QtJAhxaeEfFblhc0kCHFvVHxBW8Ul51Tv6juxvy+5TVO/qO7G/L7hNS7+o/sZw9H7lNS7+q//h9P3C0hq3f1HdjfksdTQHl73MeGFxnIm8ET3i0xabFbNS7+o/8A4fJS2mQZxuPA4YNtsAIMD9CqYi9r+lcNgk4Z1eZJuPwzO0yuk2MgcuMn1rlVtCfSGLWwzGKlQk4ACB0yTNmmx4XzmzmFWKYZWc5sXeHHyWjFmZyJ9cbZQdZFzXcmVD/vOH7jF2m0u2Qe1aaD2UxgdVaXTkXXEkQACZ/MP3Dgg0ovL3holxDRYSTAuYHevSoIiIC5vKX/AMnQ/wBTvcF0lz+UKbjpGikNJAc6SBYWGe5Qdml4o6l6lc7VUjM1TN5GLK17LQ5rIbLiQA2+eWRniucwsNIKlU6MGwcJkSrlFEREBQFKhqCUREBERAREQEREBERAREQEREFH+9/Z8VdtVP8Avf2fFXbVZQOxSoOxSooiIgIiICIiAiIgIiICIiAiIghG5BEbkEEoiIIWOpoznACXNja0gStqKxNDncxd/Uq/ubw4cO8pzF39Sr+5vHhx7guiiu0pTn8xd/UqfuG/qUcwd/Uq/ubujcuiibSU5z+Ty5t
RjnPc17XNILhk6Zi2d1XX5IFTN1QQ3CC18ECCO3pFdVFNpKcl3IzTPjiXB1n5QIAG4betK3I7Xvc/ptc6MWB8AkRBjKRhic4susibFOQeRWE/n2WxWsZ94V+i6DqmlrcRBM9J07APh710FCu0lM+qduUat25aVDsim0lMpBmIPcvLnwCYMCy0Pa7FY2VT21Q0welPDL1/fWrsigVacXou4nCFY6u0R+GYAECL5WtltIz4KQ/SNzO3Ne3Pq4QWtGKMpEAzfuySRSzT2tF6b2yQIj1CFpbpEtBDTcEweCpq84xEtDCMwD1D4z2qQ7SJEtZEib7JM90KTEKvpVcU2iP8q1QpWVFDVKhqCUREBERAREQEREBERAREQEREFH+9/Z8VdtVP+9/Z8VdtVlA7FKg7FKiiIiAiIgIiICIiAiIgIiICIiCEbkERuQQSiIgIiICIiAiIgIiICIiAiIgKHZFSodkUHh1YAxtWfSHsew4i4AEZcbX7Va9xxRhniqjU6Jlgz9W0yc927NaQNWlYY9w4yPdmvdGrTaAA4HFEe74LO+tTAI1Djwwi8hexUphocKLujMQ0SIPzPvU8HlcNNpn8wHXZWNrtIJDgYjLjksYfThxFAy0A3aBM5QV7o1WTDaTmg5mABY2TweWltZpMfAqxeBTbMxde1FFDVKhqCUREBERAREQEREBERAREQEREGao6Ko/T8V40vTdUx1RwkNiY616q/wA0fp+KmFZSHo1dqxN5bp2nEJ3ha1BYCQSBIy4JFfKs7uV6YAcT0SAZjKd/rt6woHLVImMV5Au0i5yHwWqBuUwr9oy/xinOGTi3YSSvJ5apXg4oGKw2bLrXCQNyXjwMo5apnIki9w21vuUdy1SES7P0T1R1rVA3JhG4JePAzfxil5UgZw02Xo8rUxh6U4m4hAOWavhQWAgggQRHqTwMx5ZpxM2iRbOygctUztM7o+wtTWAAAAACwEZKYG5Lx4GX+NUonEYtJwm05ffBDyzTBAJOU5cY++paoG5QWA5gFLx4Gb+M0yLGc+GWecbDK8fxxmIAtcDE5cJW3CNwQAbkvHgZaHLDaj2tAMuJAMWsJ+BXt3KTWuc0g9GLx6OL3K+EAGcKTXwMreWaRi5BIJjCchn2L1/F6eF5kkMw4rZYiAPetEDckDcoM9TlVjZJnCGhxdhJFyWgW2yFX/G6ciZEicuMFbISBuQVUeUWveWDMbxxIjrsq6PK7HljRIc+7QRn68lpgbkhBlHLNPbIzi24SeyCPUo/jlK1zBEgxneOta4SBuQV6PyiyoYab3NwRkYPvWjWHgqtWMWKBiIidsL0g96w8E1h4LwiD3rDwTWHgvCIPesPBQ6oYOS8qDkgoq8qNY5zXSC3O09XbIXh3LNMGBJvBMWGfy2LQ6ZyVbsYaYaCZy+PuKtI80eVaby0NM4suiR97O1a9YVhL3ktJoAkEwZyPYpZXq2BpZzJmAFdS23WFNYVmp1nOYHYYJJsZ2T8l7pucTcR9j/PYsqv1hUawryiD1rCgqFeUCD1rCp1hXhEGhERAREQEREBERAREQEREGWr/NH6fipUVf5o/T8VKs/CBUqCpUUREQEREBERAREQEREBERAREQECIEBERAREQEREBERAREQEREBERAUHJSoOSDy5snOyqrUyGHp4bi8xt3+v/wAVzn3+/eqazmlvTbImINthzOS1CPLS8OBdUbG6Ym3/AKUwPmW1Rhnrtu67qgv0dpmDIjf6lbTqUWMJbIbMnM3Ef4VmYSkNZVMfjAngLXyttVlNlYObicHDbkPh615Gk0oLhJwgEi/UM+tWM0tjiACb8FNlpeiIsqIEQICIiDQiIgIiICIiAiIgIiICIoQZqn80fp+KmoQ1pcSYG4TtjYvR/nf2fFWYBlFoyVlFFGo2o3E02ndGyfcQfWqWafRMfiZ72kLY2k1vitAncIVbtCpEgmm2QZFgkV8qq5zT6PT8YAixyOS8c+ozGtA67LVzWnEYGx1
BQNDpf02ftCeBRzylfpxBi4OfUpOlUg3FrG4ZInZIzVw0SkCCKbJGXRCczpf02ftCeBnGm0f6gFzmDs+HFOfUbzVAjObbvn79y0jRacAatkC3ij72rydCpERqmR+kK/aKzpNIAHWCDOzd/wCjtG9RzulPj9x22H3xG9XnRqdug22VgvDNAotECm2JJuJueJ+7KeBSzTqJJAqCd0XPUNoRum0SJ1g9YP3/AOhaOaUv6bP2hBolIZU2ftCv2jK/lGiJBfln0TZeqWm0nuDGv6RyEcJ+C0u0Wmc6bTs8UKWaNTaQQxoIygDdHxPangNVxU6rivalZFWr4rOdJptcWudBbEki1xPuWxV83YTiLGknMwN0e5BnOlU4BxeMYFjJM4feobpdIuLdYA7FhggiTuE5rSNGpjDDG9HKwtebetOb05nA2ZmYGe/rQZ3aVSGGagBdBAIzkwO9OdUsLXCoHBxhsCZvHvV3M6Uzq2T+kL0NGp9HoN6Pi2FttkFDtIpggF4BMwNpgxl1heqVVjzDXgmJ9X2R2jerDolK/wCGy5k9EbbqaejsYZaxrTlYAIJ1XFNVxViIK9VxTVcVYiCvVcU1XFWIgr1XFRquKtRBVquKOp2N1avLsigyVtJp0wS94aASJI3XKHSaWEu1gIBLTAmCNi9VaDC8nVNJzktH3K8BjAAHUm4cWWG0wb/CeKtJaXaRTF3PDbx0hF4B9xC9Mq03ENDxJmBHb7j2HcqdbTdLjo5k5yy5tCuZTYMDm0mgkTlBGQPvSYpbXarj3JquPclCqXC7YP8A78laoK9VxTVcVYiCvVcVAp8Vaoag8arimq4qxEBFClAREQEREBERARFCCVCIgp/3v7Pirtqp/wB7+z4q7arKB2KVB2KVFEREBERAREQEREBERAREQEREEI3IIjcgglERAREQEREBERAREQEREBERAXl2RXpeXZFBU8HFmAF5ZjGRxdK99l/8KagbiuT1ercqZaACHEdK1pzB2D1rXwy9NOkRcU59ce/7srunDcgYvHWPhKyNpCI5wTFpLv8AP37rajLAGrBALZOd4452SVhfQD46d/s/4ViwUKcdI15A9LeIE3+4VlBjWsHTxBzsUmb3hSYLa1Kz0KYBlrpt8PvsWhRRQ1SoaglERB+PeG3KXnPs6f0p4bcpec+zp/SuAiDv+G3KXnPs6f0p4bcpec+zp/SuAiDv+G3KXnPs6f0p4bcpec+zp/SuAiDv+G3KXnPs6f0p4bcpec+zp/SuAiDv+G3KXnPs6f0p4bcpec+zp/SuAiDv+G3KXnPs6f0p4bcpec+zp/SuAiDveGfKM4ucXiP5dP6V68NuUvOfZ0/pXz6IPoPDblLzn2dP6U8NuUvOfZ0/pXz6IPoPDblLzn2dP6U8NuUvOfZ0/pXz6IPoPDblLzn2dP6U8NuUvOfZ0/pXz6IPoPDblLzn2dP6U8NuUvOfZ0/pXz6IPoPDblLzn2dP6U8NuUvOfZ0/pXz6IPoPDblLzn2dP6U8NuUvOfZ0/pXz6IPoPDblLzn2dP6U8NuUvOfZ0/pXz6IPoPDblLzn2dP6U8NuUvOfZ0/pXz6IPoPDblLzn2dP6U8NuUvOfZ0/pXz6IPoPDblLzn2dP6U8NuUvOfZ0/pXz6IPoPDblLzn2dP6U8NuUvOfZ0/pXz6IPoPDblLzn2dP6U8NuUvOfZ0/pXz6IPoPDblLzn2dP6U8NuUvOfZ0/pXz6IPoPDblLzn2dP6U8NuUvOfZ0/pXz6IPoPDblLzn2dP6U8NuUvOfZ0/pXz6IPoPDblLzn2dP6U8NuUvOfZ0/pXz6IPoPDblLzn2dP6U8NuUvOfZ0/pXz6IO8f9Z8ozPOL/wD46f0ryf8AV/KBEc42z/Lpi/7Vw0Qdkf6q06Z1wmI/l08v2qX/AOrdPd42kTl+RmzLYuKitylOy3/VWnAECsADExTp7Mvyqwf6y5RsOcZWH4dP6eC4SKWrvN/1pyiMtI9nT+levDblLzn
2dP6V8+iD6Dw25S859nT+lPDblLzn2dP6V8+iD6Dw25S859nT+lPDblLzn2dP6V8+iAiIgIiICIiAiIgIiICIiAiIgIiICIiAiIgIiICIiAiIgIiICIiAiIgIiICIiAiIgIiICIiAiIgIiICIiAiIgIiICIiAiIgIiICIiAiIgIiICIiAiIgIiICIiAiIgIiICIiAiIgIiICIiAiIgIiICIiAiIgIiICIiAiIgIiICIiAiIgIiICIiAiIgIiICIiAiIgIiICIiD//2Q==", "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from IPython.display import YouTubeVideo, HTML\n", "\n", "display(HTML(\"

Video - OpenRefine - Review Reconciliation Results\"))\n", "\n", "YouTubeVideo('pT0b0vsPRJ0', width=1024, height=576)\n", "\n", "\n" ] }, { "cell_type": "markdown", "id": "bb324586", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Add a Column Containing TGN Entity Identifiers after Reconciliation\n", "\n", "- Create a new column to hold the TGN identifiers\n", "\n", "" ] }, { "cell_type": "code", "execution_count": 10, "id": "81b01078", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "

Video - OpenRefine - Add Entity Identifier Column

" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "image/jpeg": "", "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from IPython.display import YouTubeVideo, HTML\n", "\n", "display(HTML(\"

Video - OpenRefine - Add Entity Identifier Column\"))\n", "\n", "YouTubeVideo('PNPhs_7MQ6o', width=1024, height=576)\n" ] }, { "cell_type": "markdown", "id": "2bc1b729", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Manual Reconciliation\n", "\n", "Some place names required additional manual reconciliation, carried out with the TGN search form.\n", " \n", "\n", "\n", "#### Further Reading\n", "- TGN search form http://www.getty.edu/research/tools/vocabularies/tgn\n", " " ] }, { "cell_type": "markdown", "id": "cf10d529", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Result - CSV file with TGN Identifiers\n", "\n", "The result of the reconciliation process is a new column containing TGN name authority identifiers for the place names identified in the artwork titles. \n", "\n", "The CSV file is created with the following steps:\n", "- export the CSV file from OpenRefine\n", "- save it as [data/ruskin/ruskin-places-rec.csv](data/ruskin/ruskin-places-rec.csv)" ] }, { "cell_type": "code", "execution_count": null, "id": "b8e984f7", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "from IPython.display import YouTubeVideo, HTML\n", "\n", "display(HTML(\"

Video - OpenRefine - Export Results as CSV\"))\n", "\n", "YouTubeVideo('0tBjqr5AEmA', width=1024, height=576)" ] }, { "cell_type": "markdown", "id": "bbd90fbb", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#### Remove rows without TGN identifier\n", "\n", "- A final step removes rows that do not have a TGN identifier.\n", "- The resulting dataset is shown in tabular format below." ] }, { "cell_type": "code", "execution_count": null, "id": "419bea9c", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "reconciledRuskinPlaces = \"data/ruskin/ruskin-places-rec.csv\" \n", "\n", "# read file into pandas dataframe\n", "df = pd.read_csv(reconciledRuskinPlaces,low_memory=False)\n", "\n", "# remove rows that have a missing or empty tgn field value\n", "# (pandas reads empty CSV cells as NaN, so both cases are tested)\n", "df = df[df.tgn.notna() & (df.tgn.astype(str).str.strip() != \"\")]\n", "\n", "# write dataframe to file \n", "df.to_csv(reconciledRuskinPlaces, index=False) \n", "\n", "# for illustration display dataframe\n", "\n", "display(df)" ] }, { "cell_type": "markdown", "id": "25a0c234", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Add Geographical Coordinates to CSV file\n", "\n", "The next step is to associate geographical coordinates with the Linked Art artwork representations. The TGN identifiers are used to query the TGN web service, which returns the geographical coordinates.\n", "\n", "Steps:\n", "- request a JSON file from http://vocab.getty.edu/tgn/ for each TGN identifier\n", "- extract the geocoordinates from the response\n", "- add the geocoordinates to the CSV file" ] }, { "cell_type": "code", "execution_count": null, "id": "cce77b47", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "import requests\n", "\n", "latprop = \"http://www.w3.org/2003/01/geo/wgs84_pos#lat\"\n", "lngprop = \"http://www.w3.org/2003/01/geo/wgs84_pos#long\"\n", "\n", "\n", "display(HTML(\"

Geographical coordinates retrieved from TGN web service

\")) \n", "\n", "reconciledRuskinPlaces = \"./data/ruskin/ruskin-places-rec.csv\" \n", "reconciledRuskinPlacesCoords = \"./data/ruskin/ruskin-places-rec-coords.csv\" \n", "\n", "# create dataframe from CSV file containing reconciled data including TGN identifiers\n", "dataFrameRuskinPlaces = pd.read_csv(reconciledRuskinPlaces,low_memory=False)\n", "\n", "# set type for column 'coords' as string in dataframe\n", "dataFrameRuskinPlaces['coords'] = dataFrameRuskinPlaces['coords'].astype(str)\n", "\n", "\n", "display(HTML(\"

Retrieving geocoordinates from vocab.getty.edu TGN API. Please wait for task to complete.\"))\n", "\n", "# collect one row per TGN identifier; the rows become a dataframe after the loop\n", "# (DataFrame.append was removed in pandas 2.0)\n", "geoRows = []\n", "\n", "# iterate through reconciled data containing place names and TGN identifiers\n", "for identifier_tgn in dataFrameRuskinPlaces['tgn'].unique():\n", "\n", "    # print . to indicate progress\n", "    print(\".\", end='')\n", "\n", "    # create query string for web service - get tgn id using .split()\n", "    query = \"http://vocab.getty.edu/tgn/\" + identifier_tgn.split(\"tgn/\",1)[1] +\"-place.json\"\n", "\n", "    # use requests.get() to query TGN web service using TGN identifier to return geo coordinates\n", "    resultsJSON = requests.get(query).json()\n", "\n", "    # get lat lng from the first record in the results that carries both coordinate properties\n", "    latlng = None\n", "    for record in resultsJSON:\n", "        if latprop in resultsJSON[record] and lngprop in resultsJSON[record]:\n", "            lat = resultsJSON[record][latprop][0][\"value\"]\n", "            lng = resultsJSON[record][lngprop][0][\"value\"]\n", "\n", "            # create string for lat lng\n", "            latlng = str(lat) + \",\" + str(lng)\n", "            break\n", "\n", "    # record TGN identifier and lat lng (None if no coordinates were returned)\n", "    geoRows.append({'tgn': identifier_tgn, 'latlng': latlng})\n", "\n", "# create dataframe to hold geographical coordinates with columns tgn and latlng\n", "dataFrameGeo = pd.DataFrame(geoRows, columns=['tgn', 'latlng'])\n", "\n", "# for illustration display dataFrameGeo with addition of geo coords\n", "display(dataFrameGeo)" ] }, { "cell_type": "markdown", "id": "15c96dc7", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Update CSV File with Geographical Coordinates \n", "\n", "The following code \n", "- merges `dataFrameRuskinPlaces` with `dataFrameGeo` containing the geocoordinates\n", "- removes `coords` from `dataFrameRuskinPlaces` \n", "- renames `latlng` to `coords` in `dataFrameRuskinPlaces` \n", "- drops rows that have no coordinates\n", "- writes `dataFrameRuskinPlaces` to a CSV file" ] }, { "cell_type": "code", "execution_count": null, "id": "107a351d", "metadata": {}, "outputs": [], "source": [ "reconciledRuskinPlacesCoords = \"./data/ruskin/ruskin-places-rec-coords.csv\" \n", "\n", "\n", "# merge dataframe with coords with dataframe from csv\n", "dataFrameRuskinPlaces = dataFrameRuskinPlaces.merge(dataFrameGeo, on='tgn') \n", "\n", "# drop column coords (keyword form; the positional axis argument was removed in pandas 2.0)\n", "dataFrameRuskinPlaces = dataFrameRuskinPlaces.drop(columns='coords')\n", "\n", "# rename column latlng to coords\n", "dataFrameRuskinPlaces.rename(columns={'latlng': 'coords'}, inplace=True)\n", "\n", "# drop rows that have na value in coords column (dropna returns a new dataframe, so assign it)\n", "dataFrameRuskinPlaces = dataFrameRuskinPlaces.dropna(subset=['coords'])\n", "\n", "# write to CSV file\n", "dataFrameRuskinPlaces.to_csv(reconciledRuskinPlacesCoords, index=False)\n", "\n", "display(HTML(\"

CSV file with Geographical Coordinates\"))\n", "# display dataframe\n", "display(dataFrameRuskinPlaces)\n" ] }, { "cell_type": "markdown", "id": "1ce006eb", "metadata": {}, "source": [ "## 4. Define Geolocation Representation in Linked Art\n", "\n", "The next step is to define a Linked Art representation for the geographical coordinates of a place depicted in an artwork. The relevant parts of the Linked Art model are: \n", "\n", "- Depiction\n", "- Geospatial approximation\n", "- Depiction of place with approximate location\n", "\n", "\n", "### Linked Art Data Model - Depiction\n", "\n", "From https://linked.art/model/object/aboutness/#depiction\n", "\n", "
Many sorts of artwork depict things that can be pointed out in the artwork. These could be identifiable entities, such as a known Person or Object with a name or identifier, or unidentifiable (perhaps fictional) instances of a class of entity, such as a depiction of a battle but not any particular battle. For example a portrait depicts the person sitting for it, or a sketch of a generic landscape depicts a place even if it's not a particular, known location. The depiction pattern describes what is in the artwork's image.
\n", "\n", "
This is modeled using the `represents` property on the VisualItem, which refers to the entity that is being depicted.\n", "\n", "\n", "The following representation will be used for places depicted in Ruskin's artworks:\n", "\n", "\n", "`{\n", "  \"@context\": \"https://linked.art/ns/v1/linked-art.json\",\n", "  \"id\": \"https://linked.art/example/object/34\",\n", "  \"type\": \"HumanMadeObject\",\n", "  \"_label\": \"artwork title including place name\",\n", "  \"shows\": [\n", "    {\n", "      \"type\": \"VisualItem\",\n", "      \"represents\": [\n", "        {\n", "          \"type\": \"Place\",\n", "          \"_label\": \"place name\"\n", "        }\n", "      ]\n", "    }\n", "  ]}`\n" ] }, { "cell_type": "markdown", "id": "1b781b3c", "metadata": {}, "source": [ "### Linked Art Data Model - Geospatial Approximation\n", "\n", "The Linked Art data model describes how to represent the geospatial approximation of a place.\n", "\n", "From https://linked.art/model/place/#geospatial-approximation\n", "\n", "
All recorded locations are approximate to some degree. It may be desirable to capture this approximation separately from the actual place, especially when that approximation is very uncertain. Especially if the place is the exact location of several events, and perhaps an address or other information is known, but not the exact geospatial coordinates.
\n", " \n", "\n", "
Secondly, as a place is defined by exactly one definition, but there might be multiple approximations such as a polygon as well as the central point, the real place that an activity occured at can be related to multiple approximate places to capture these different approximations.\n", "\n", "\n", "Example Linked Art representation of geospatial approximation:\n", "\n", "`{\n", "  \"@context\": \"https://linked.art/ns/v1/linked-art.json\",\n", "  \"id\": \"https://linked.art/example/place/4\",\n", "  \"type\": \"Place\",\n", "  \"_label\": \"True Auction House Location\",\n", "  \"approximated_by\": [\n", "    {\n", "      \"type\": \"Place\",\n", "      \"_label\": \"Auction House Location Approximation\",\n", "      \"defined_by\": \"POINT(-0.0032937526703165 51.515107154846)\"\n", "    }\n", "  ]\n", "}`\n", "\n", "\n", "### Linked Art Data Model - Depiction of Place with Approximate Location\n", "\n", "Combining the depiction pattern with the geospatial approximation pattern gives the following representation for places depicted in Ruskin's works. The `defined_by` value shown here is the placeholder point from the example above, not the coordinates of Lucca; in the generated files it holds the coordinates retrieved from TGN.\n", "\n", "\n", "`{\n", "  \"@context\": \"https://linked.art/ns/v1/linked-art.json\",\n", "  \"id\": \"https://linked.art/example/object/34\",\n", "  \"type\": \"HumanMadeObject\",\n", "  \"_label\": \"artwork title including place name\",\n", "  \"shows\": [\n", "    {\n", "      \"type\": \"VisualItem\",\n", "      \"represents\": [\n", "        {\n", "          \"type\": \"Place\",\n", "          \"_label\": \"Lucca\",\n", "          \"approximated_by\": [\n", "            {\n", "              \"type\": \"Place\",\n", "              \"_label\": \"Lucca - Location Approximation\",\n", "              \"defined_by\": \"POINT(-0.0032937526703165 51.515107154846)\"\n", "            }\n", "          ]\n", "        }\n", "      ]\n", "    }\n", "  ]}`\n", "\n", "\n", "\n", "\n", "#### Further Reading\n", "\n", "- Depiction https://linked.art/model/object/aboutness/#depiction\n", "\n", "- Geospatial approximation https://linked.art/model/place/#geospatial-approximation" ] }, { "cell_type": "markdown", "id": "c292ee0c", "metadata": {}, "source": [ "

Visualisation - Geographical Coordinates of Place Depicted in Artwork

\n", "

Below is a visualisation of the Linked Art JSON-LD representation of the geographical coordinates of a place depicted in an artwork. \n", " \n", "Further information\n", "- explore the representation by clicking on nodes\n", "- the visualisation is rendered as SVG\n", "- built with D3.js, as a modified version of code available in the JSON-LD Playground codebase\n", "\n", "\n", "#### Further Reading\n", "\n", "- d3.js https://d3js.org/\n", "- jsonld-vis https://github.com/science-periodicals/jsonld-vis\n", "- jsonld playground https://json-ld.org/playground and https://json-ld.org/playground/jsonld-vis.js \n" ] }, { "cell_type": "markdown", "id": "687ef899", "metadata": {}, "source": [ "

" ] }, { "cell_type": "code", "execution_count": null, "id": "d1ccb958", "metadata": {}, "outputs": [], "source": [ "from IPython.core.display import Javascript\n", "\n", "code2 = \"var file = './data/examples/geolocation.json';\"\\\n", " \"var selector = '#vis';\" \\\n", " \"visjsonld(file, selector); \" \n", "\n", "with open('./src/js/visld.js', 'r') as _jscript:\n", " code = _jscript.read() + code2\n", "\n", "Javascript(code)" ] }, { "cell_type": "markdown", "id": "1dfa884a", "metadata": {}, "source": [ "## 5. Add Place Name and Coordinates into Linked Art JSON-LD Files\n", "\n", "The final step is to add place names and geocoordinates to the original Linked Art files. \n", "\n", "The updated Linked Art files, including the geocoordinates, will later be used in a storymap visualisation of the artworks of John Ruskin, mapping the artworks to the locations that they depict, using the geocoordinates.\n", "\n", "The `cromulent` Python library is used to create the JSON-LD representation." 
] }, { "cell_type": "code", "execution_count": null, "id": "d37dd072", "metadata": {}, "outputs": [], "source": [ "try:\n", "    import cromulent\n", "except ImportError:\n", "    %pip install cromulent\n", "    import cromulent\n", "\n", "from cromulent.model import factory, Actor, Production, BeginningOfExistence, EndOfExistence, TimeSpan, Place\n", "from cromulent.model import InformationObject, Phase, VisualItem \n", "from cromulent.vocab import Painting, Drawing,Miniature,add_art_setter, PrimaryName, Name, CollectionSet, instances, Sculpture \n", "from cromulent.vocab import aat_culture_mapping, AccessionNumber, Height, Width, SupportPart, Gallery, MuseumPlace \n", "from cromulent.vocab import BottomPart, Description, RightsStatement, MuseumOrg, Purchase\n", "from cromulent.vocab import Furniture, Mosaic, Photograph, Coin, Vessel, Graphic, Enamel, Embroidery, PhotographPrint\n", "from cromulent.vocab import PhotographAlbum, PhotographBook, PhotographColor, PhotographBW, Negative, Map, Clothing, Furniture\n", "from cromulent.vocab import Sample, Architecture, Armor, Book, DecArts, Implement, Jewelry, Manuscript, SiteInstallation, Text, Print\n", "from cromulent.vocab import TimeBasedMedia, Page, Folio, Folder, Box, Envelope, Binder, Case, FlatfileCabinet\n", "from cromulent.vocab import HumanMadeObject,Tapestry,LocalNumber\n", "from cromulent.vocab import Type,Set\n", "from cromulent.vocab import TimeSpan, Actor, Group, Acquisition, Place\n", "from cromulent.vocab import Production, TimeSpan, Actor\n", "from cromulent.vocab import LinguisticObject,DigitalObject, DigitalService\n", "from cromulent import reader\n", "\n", "try:\n", "    import pandas as pd\n", "except ImportError:\n", "    %pip install pandas\n", "    import pandas as pd\n", "\n", "# os and json are part of the Python standard library, so no install step is needed\n", "import os\n", "import json\n", "\n", "artwork = {}\n", "cnt=1\n", "\n", "# directory that will contain updated Ruskin artwork representations including geo coords\n", "storyvisdir = \"data/ruskin/storyvis/json\"\n", "\n", "# file containing reconciled data with coordinates\n", "filecoord = \"./data/ruskin/ruskin-places-rec-coords.csv\" \n", "# open file containing reconciled data with geo coordinates\n", "dataframeGeo = pd.read_csv(filecoord,low_memory=False)\n", "\n", "\n", "# directory containing Ruskin artworks represented in Linked Art JSON-LD\n", "ruskindir = \"data/ruskin/output/json\"\n", "file_list=os.listdir(ruskindir)\n", "\n", "# for each linked art json file\n", "for file in file_list:\n", "    # open file\n", "    with open( ruskindir + \"/\" + file) as json_file:\n", "\n", "        # get json object from file object with json.load() https://www.geeksforgeeks.org/json-load-in-python/\n", "        artwork = json.load(json_file)\n", "\n", "        # if the artwork id occurs in the id column of the data file containing geographical coordinates, update the file\n", "        if artwork[\"id\"] in dataframeGeo[\"id\"].tolist():\n", "\n", "            display(HTML(artwork[\"_label\"]))\n", "            # get rows in dataframeGeo where id == artwork id from JSON-LD file\n", "            # Access a group of rows and columns by label(s) or a boolean array\n", "            # https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.loc.html\n", "            rows = dataframeGeo.loc[dataframeGeo['id'] == artwork[\"id\"]]\n", "\n", "            print(\"Matching row in geographical coordinates file for artwork\")\n", "            display(rows)\n", "\n", "            # get first row https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.iloc.html\n", "            row=rows.iloc[0]\n", "\n", "            # get place name and coords from geocoords file\n", "            placeName = row[\"place\"]\n", "            coords = row[\"coords\"]\n", "            # replace comma in coords with space\n", "            coords = coords.replace(\",\", \" \")\n", "\n", "            # increment counter\n", "            cnt = cnt+1\n", "\n", "            # use cromulent to create Linked Art representation of place depicted\n", "            # https://github.com/thegetty/crom\n", "\n", "            approx_place = Place()\n", "            approx_place._label = placeName\n", "            approx_place.defined_by = \"POINT(\" + coords + \")\"\n", "\n", "            place = Place()\n", "            place._label = placeName\n", "            place.approximated_by = approx_place\n", "\n", "            visualItem = VisualItem()\n", "            visualItem.represents = place\n", "\n", "            # set the new representation as the `shows` value of the artwork json object\n", "            artwork[\"shows\"] = factory.toJSON(visualItem)\n", "\n", "            print(\"Geographical coordinates representation to be added:\")\n", "            print(json.dumps(factory.toJSON(visualItem), indent=2))\n", "\n", "            # open output file, write the updated artwork, and close the file on exit\n", "            with open(storyvisdir + \"/\" + str(cnt) + \".json\", \"wt\") as text_file:\n", "                text_file.write(json.dumps(artwork,indent=2))\n", "            print(\"File updated\" )\n", "\n", "HTML(\"

Files updated

\")" ] }, { "cell_type": "markdown", "id": "3b0c6139", "metadata": {}, "source": [ "## Example Linked Art JSON-LD including Geographical Identifier and Coordinates \n", "\n", "The following is an example JSON-LD Linked Art representation, updated to include geographical coordinates.\n", " \n", "### Image Title: Study of the Marble Inlaying on the Front of the Casa Loredan, Venice\n", "\n", "\n", "\n", "\n", "\n", "Ashmolean Museum artwork page\n", "\n", "\n", "### JSON-LD Representation" ] }, { "cell_type": "code", "execution_count": null, "id": "09c99833", "metadata": {}, "outputs": [], "source": [ "print(json.dumps(artwork,indent=2))" ] }, { "cell_type": "markdown", "id": "9b5abb3e", "metadata": {}, "source": [ "

Visualisation - Artwork Description with Geographical Coordinates of Place Depicted

\n", "\n", "To view a different file, change the filepath assigned to `file` in the code below (examples: 1.json .. 89.json)" ] }, { "cell_type": "code", "execution_count": null, "id": "a89ebe43", "metadata": {}, "outputs": [], "source": [ "from IPython.display import Javascript\n", "\n", "code2 = \"var file = './data/ruskin/storyvis/json/3.json';\"\\\n", "        \"var selector = '#vis2';\" \\\n", "        \"visjsonld(file, selector); \" \n", "\n", "with open('./src/js/visld.js', 'r') as _jscript:\n", "    code = _jscript.read() + code2\n", "\n", "Javascript(code)" ] }, { "cell_type": "markdown", "id": "815637d1", "metadata": {}, "source": [ "
" ] }, { "cell_type": "markdown", "id": "9d3dcc2a", "metadata": {}, "source": [ "## Next Steps\n", "\n", "### Jupyter Notebooks\n", "\n", "Look at the Jupyter notebooks that:\n", "- create the input files\n", "  - `01-06-Transform-John-Ruskin`\n", "- create a StoryMap data visualisation from the updated JSON-LD files\n", "  - `03-04-Visualise-John-Ruskin-Story-Map.ipynb`\n" ] }, { "cell_type": "markdown", "id": "d3fc4dc6", "metadata": {}, "source": [ "## View other Linked Art JSON-LD files from Ruskin dataset" ] }, { "cell_type": "code", "execution_count": null, "id": "2474c801", "metadata": {}, "outputs": [], "source": [ "import ipywidgets\n", "\n", "from ipywidgets import Layout, FileUpload \n", "from IPython.display import display, IFrame, HTML, Image, Javascript\n", "import os\n", "import json\n", "\n", "\n", "\n", "# directory that contains updated Ruskin artwork representations including geo coords\n", "storyvisdir = \"data/ruskin/storyvis/json\"\n", "\n", "file_list=os.listdir(storyvisdir)\n", "\n", "selectOptions = []\n", "selectOptions.append((\"Please select an artwork\", \"\"))\n", "\n", "\n", "# for each linked art json file\n", "for file in file_list:\n", "    # open file\n", "    with open( storyvisdir + \"/\" + file) as json_file:\n", "        artwork = json.load(json_file)\n", "        title = artwork[\"_label\"] + \" (\" + file + \")\"\n", "\n", "        selectOptions.append((title,file))\n", "\n", "def dropdown_eventhandler(change):\n", "    # ignore the placeholder option, which has an empty value\n", "    if not change.new:\n", "        return\n", "\n", "    with open('./src/js/visld.js', 'r') as _jscript:\n", "        code = _jscript.read() + \"var file = './data/ruskin/storyvis/json/\" + change.new + \"';var selector = '#vis3';visjsonld(file, selector); \"\n", "        display(Javascript(code))\n", "\n", "    with open( storyvisdir + \"/\" + change.new) as json_file:\n", "\n", "        artwork = json.load(json_file)\n", "        if (\"representation\" in artwork):\n", "            image = artwork[\"representation\"][0][\"id\"]\n", "            display(Javascript(\"document.getElementById('artwork').src = '\" + image + \"';\"))\n", "        else:\n", "            display(Javascript(\"document.getElementById('artwork').src = '';\"))\n", "\n", " " ] }, { "cell_type": "code", "execution_count": null, "id": "ae6f10d3", "metadata": {}, "outputs": [], "source": [ "selectObject = ipywidgets.Dropdown(options=selectOptions)\n", "selectObject.observe(dropdown_eventhandler, names='value')\n", "\n", "display(selectObject)" ] }, { "cell_type": "markdown", "id": "bdfe0f81", "metadata": {}, "source": [ "
\n", "\n", "
" ] }, { "cell_type": "code", "execution_count": null, "id": "8adc43a8", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "celltoolbar": "Slideshow", "finalized": { "timestamp": 1650979482926, "trusted": true }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.8" }, "require": { "paths": { "d3": "https://d3js.org/d3.v7.min" }, "shim": {} } }, "nbformat": 4, "nbformat_minor": 5 }