User:Jonathan Groß/tasks

Mix'n'match: matching external datasets with Wikidata items

Catalogues which I created or feel responsible for maintaining:

Biography and Prosopography

Heidelberg Academy for Sciences and Humanities

Catalogus Professorum Halensis

Teuchos-Prosopographie

AcademiaNet

Sächsische Biografie

Database of Classical Scholars

Still to check:

Magdeburger Biographisches Lexikon

Greek myth and mythology

Hederichs gründliches mythologisches Lexicon

I proposed this property (Hederich encyclopedia article (P2272)) in October 2015 while I was working on my PhD thesis (link to discussion). At the time, this was the only comprehensive database of Greek mythical characters I knew of. Magnus Manske then created a mix'n'match catalogue using a webscraper (link). This catalogue has over 9000 entries to be matched with Wikidata items. Note that many of those are cross-references, which in my opinion are not useful to Wikidata (apart from providing alternative headings, maybe). I have classified these cross-references as "not relevant for Wikidata" (2811 so far).

To this day, this catalogue is only partially matched with Wikidata. About 3500 items (39%) are manually matched. The bulk of this work was apparently done by User:Melderick in 2018 with nearly 3200 matches; after that come User:Varina (687 matches), myself (300 matches) and User:Tusculum (211 matches).

There are still about 1400 automatically matched entries which need to be confirmed or corrected by a human, and 1267 entries not matched at all. So there is still work to be done. 15:09, 2 July 2023 (UTC)

EDIT, a few days later: I've found a few mistakes and realised that the manual matches will have to be reviewed as well. *sigh* Jonathan Groß (talk) 13:57, 6 July 2023 (UTC)

UPDATE: Single value and Unique value violations can be used to cross-check the matches. So far matching in the catalogue seems reliable. Jonathan Groß (talk) 10:40, 11 July 2023 (UTC)
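
The cross-check mentioned above can be sketched roughly as follows. This is only a minimal illustration, assuming the matches have been exported as pairs of catalogue entry ID and Wikidata QID; the function name and the sample data are hypothetical and not part of Mix'n'match or Wikidata.

```python
from collections import defaultdict


def find_violations(matches):
    """Check (entry_id, qid) pairs for constraint-style conflicts.

    Returns two dicts:
      single_value: QIDs that carry more than one catalogue entry ID
      unique_value: entry IDs that are attached to more than one QID
    """
    ids_by_item = defaultdict(set)
    items_by_id = defaultdict(set)
    for entry_id, qid in matches:
        ids_by_item[qid].add(entry_id)
        items_by_id[entry_id].add(qid)
    single_value = {q: sorted(ids) for q, ids in ids_by_item.items() if len(ids) > 1}
    unique_value = {e: sorted(qs) for e, qs in items_by_id.items() if len(qs) > 1}
    return single_value, unique_value


# Hypothetical sample data, not real catalogue matches.
sample = [
    ("entry-a", "Q1"),
    ("entry-b", "Q1"),  # Q1 carries two IDs: single-value conflict
    ("entry-c", "Q2"),
    ("entry-c", "Q3"),  # entry-c sits on two items: unique-value conflict
    ("entry-d", "Q4"),
]
single, unique = find_violations(sample)
print(single)
print(unique)
```

Hits from such a check are candidates for review, not proven errors: some items may legitimately correspond to more than one article.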

UPDATE: While checking the 2800+ entries marked as "not applicable to Wikidata", I found that most of them were actually applicable. So I undid a lot of work done by User:Melderick and opened these entries up for matching again. Jonathan Groß (talk) 07:31, 12 July 2023 (UTC)

Challenges and Solutions
  1. Hederich is very dated and although most learned and useful for its time, cannot be considered an authoritative source for Greek mythology any more. Using its data to create a Who's Who of Greek mythology is problematic.
    Solution: Let us consider Hederich, a standard work of its time, as a standard in and of itself. Even if its information is incomplete and skewed, it rests on solid study of Greek and Latin sources.
  2. Hederich uses Latin headings while Wikidata sometimes has Greek, sometimes Latin. As the lemmata use only capital letters, they do not differentiate between I and J or U and V. Hence, Zeno.org transliterated every I as I/i (no problem here) and every V as V/v (good heavens!).
    Solution: Be patient. When creating new items, respect common practice ("To each their own"). Anglophone readers prefer Latin, Germans either Greek or Latin (or 'German' versions).
  3. Hederich takes care to differentiate between persons of the same name. However, Greek myth is notorious for having competing variants of personal names, places and events. This leads to ambiguity in attributing Hederich articles to Wikidata items.
    Solution: Carefully judge case by case. Document and discuss the problematic cases and use them as learning experiences for our ontology of myth.
  4. Lots remains to be done.
    Solution: Get to it! And make our progress visible.
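
For challenge 2, one way to compare a transliterated heading with a Wikidata label despite the I/J and U/V problem is to fold both to a common key first. The following is a minimal sketch under that assumption; the function names and the example headings are illustrative only and not taken from the catalogue.

```python
import unicodedata


def normalize_heading(heading):
    """Fold a Latin heading to a comparison key.

    The key is lower-case, has diacritics stripped, and treats J as I
    and V as U, so spelling variants collapse to the same string.
    """
    decomposed = unicodedata.normalize("NFD", heading)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower().replace("j", "i").replace("v", "u")


def same_heading(a, b):
    """True if two headings differ only by case, diacritics, I/J or U/V."""
    return normalize_heading(a) == normalize_heading(b)


# Illustrative headings (assumed spellings, not checked against Zeno.org).
print(same_heading("Ivno", "Juno"))
print(same_heading("Vlysses", "Ulysses"))
print(same_heading("Iuno", "Iris"))
```

The key is meant for comparison only. It should not be stored as a label, since it deliberately discards the distinction that readers expect to see.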
Status
Issues

Greek and Latin literature

...

Paulys Realenzyklopädie der klassischen Altertumswissenschaft

I'm going to maintain the items relating to the over 17,000 articles from Paulys Realenzyklopädie der klassischen Altertumswissenschaft (RE) featured on the German Wikisource project.

After creating 15756 new items for these articles on May 27th and 28th, there's much to do (adding statements, descriptions and labels):

  1. English description: "article from Pauly-Wissowa’s RE, a comprehensive encyclopedia on classical antiquity";
  2. German description: "Artikel aus Paulys Realenzyklopädie der klassischen Altertumswissenschaft (Pauly-Wissowa)"
  3. instance of (P31) encyclopedia article (Q17329259)
  4. published in (P1433) Paulys Realenzyklopädie der klassischen Altertumswissenschaft (Q1138524)
  5. For cross references (not full articles), it would be better to add instance of (P31) cross-reference (Q1302249)
  6. main subject (P921) will be the most important statement, but it may be difficult to fill this automatically.
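
Items 1 to 5 of this list could be turned into batch commands along the following lines. This is a rough sketch, assuming the tab-separated QuickStatements (version 1) command format; the function name and the QID in the example are placeholders, while the property and item IDs are the ones named in the list above.

```python
EN_DESC = "article from Pauly-Wissowa’s RE, a comprehensive encyclopedia on classical antiquity"
DE_DESC = "Artikel aus Paulys Realenzyklopädie der klassischen Altertumswissenschaft (Pauly-Wissowa)"

ENCYCLOPEDIA_ARTICLE = "Q17329259"
CROSS_REFERENCE = "Q1302249"
PAULY_RE = "Q1138524"


def re_statements(qid, is_cross_reference=False):
    """Build tab-separated command lines for one RE item.

    Cross-references get instance of (P31) cross-reference,
    full articles get instance of (P31) encyclopedia article.
    """
    instance = CROSS_REFERENCE if is_cross_reference else ENCYCLOPEDIA_ARTICLE
    rows = [
        (qid, "Den", f'"{EN_DESC}"'),  # English description
        (qid, "Dde", f'"{DE_DESC}"'),  # German description
        (qid, "P31", instance),        # instance of
        (qid, "P1433", PAULY_RE),      # published in
    ]
    return ["\t".join(row) for row in rows]


# "Q4115189" is a placeholder QID used for illustration only.
for line in re_statements("Q4115189"):
    print(line)
```

main subject (P921) is left out on purpose, since it cannot be derived from the article category alone.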

So far, this is only a ToDo list. Jonathan Groß (talk) 12:28, 28 May 2015 (UTC)

Done so far:

  1. instance of (P31) encyclopedia article (Q17329259) for all articles. Replaced with instance of (P31) cross-reference (Q1302249) for cross-references.
  2. published in (P1433) Paulys Realenzyklopädie der klassischen Altertumswissenschaft (Q1138524) for all articles and cross-references.

Left to do:

  1. Adding labels.
  2. Adding author (P50) according to the authors’ categories.
  3. Figuring out how to read the RE template to give volume and column numbers.
  4. Adding volume (P478) and page(s) (P304) (or section, verse, paragraph, or clause (P958)?) to published in (P1433).
  5. main subject (P921) ... maybe with Wikipedia articles from the RE template?
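
For item 3, reading volume and column numbers out of the template might look roughly like this. It is a minimal sketch, assuming the template is called `REDaten` and uses parameters named `BAND`, `SPALTE_START` and `SPALTE_END`; these names, the helper functions and the sample wikitext are assumptions for illustration and should be checked against the actual pages.

```python
import re


def parse_template(wikitext, name):
    """Return the named parameters of the first {{name|...}} call as a dict.

    Simplification: nested templates or links containing '|' inside the
    parameter values are not handled.
    """
    pattern = r"\{\{\s*" + re.escape(name) + r"\s*\|(.*?)\}\}"
    match = re.search(pattern, wikitext, re.DOTALL)
    if match is None:
        return {}
    params = {}
    for part in match.group(1).split("|"):
        if "=" in part:
            key, value = part.split("=", 1)
            params[key.strip()] = value.strip()
    return params


def column_range(params):
    """Format start and end column as one string, e.g. for page(s) (P304)."""
    start = params.get("SPALTE_START", "")
    end = params.get("SPALTE_END", "")
    if end and end != start:
        return f"{start}–{end}"
    return start


# Hypothetical sample wikitext.
sample = """{{REDaten
|BAND=I,1
|SPALTE_START=5
|SPALTE_END=6
}}
Text of the article ..."""

params = parse_template(sample, "REDaten")
print(params.get("BAND"), column_range(params))
```

A regular expression is enough for flat templates like this one; for pages with nested templates a proper wikitext parser would be the safer choice.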

Jonathan Groß (talk) 09:15, 30 May 2015 (UTC)

To keep up with article creation, I will regularly continue with the following:

  1. [1] Creating new items for entries in cross-reference and article categories.
  2. [2] Adding published in (P1433) Paulys Realenzyklopädie der klassischen Altertumswissenschaft (Q1138524) to all lemmata (articles and cross-references)
  3. [3] Adding instance of (P31) cross-reference (Q1302249) to cross-references
  4. [4] Adding instance of (P31) encyclopedia article (Q17329259) to articles

Jonathan Groß (talk) 09:14, 9 June 2015 (UTC)

Members of the Hellenic Philological Society of Constantinople

I started adding Property:P463 (member of) with qualifiers "start time" and "as" (to qualify the type of membership, e.g. honorary members, corresponding members, ordinary members) to members of the Hellenic Philological Society of Constantinople (1861–1922). They are listed in the front matter of the society’s journal.

Most, but not all volumes of the journal are available online. Digitised volumes are listed on the Greek Wikisource page.

As most of the ordinary and corresponding members are not eligible for Wikipedia articles, I focus on the society’s honorary members, who are high-ranking Ottoman civil servants, Greek Orthodox patriarchs, foreign diplomats and scientists, sometimes bankers and physicians from Constantinople.

The members’ names are given only by surname and initial, but in combination with the stated profession and workplace, identification is possible. Slight errors in the year of membership are to be expected (especially if the volume was published a long time after the election of a member). The main problem with identifying the members is that the lists in the journal have a lot of spelling errors, mainly with foreign names written in Latin letters.

For example: Conrad Bursian is listed in vol. 6 (1871/72) as Boursion C., Καθηγ. τοῦ Πανεπιστ. Βιέννης (= University of Vienna, which is an error for Jena). In the next volume, the surname is "corrected" to Boursian and the university is given as Ἰένη. More examples: Wilhelm Henzen is listed in vol. 6 (1871/72) as Hengen W., which has never been corrected; Hermann Sauppe is listed as "Hermann Sauppre" (under letter H) in vol. 10 (1875/76). Friedrich August Eckstein is listed as "Eckstreïz" in vol. 14 (1878/79).
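
Misprints like these can be narrowed down with simple string similarity. The sketch below uses `difflib` from the Python standard library to rank known surnames against a printed form; the function name is hypothetical, and the candidate list is just the four names from the examples above.

```python
from difflib import SequenceMatcher


def best_match(printed, candidates):
    """Return (candidate, score) for the surname most similar to the printed form.

    The score is difflib's similarity ratio between 0 and 1,
    computed on case-folded strings.
    """
    def score(candidate):
        return SequenceMatcher(None, printed.casefold(), candidate.casefold()).ratio()

    best = max(candidates, key=score)
    return best, score(best)


known_surnames = ["Bursian", "Henzen", "Sauppe", "Eckstein"]
for printed in ["Boursion", "Hengen", "Sauppre", "Eckstreïz"]:
    name, similarity = best_match(printed, known_surnames)
    print(f"{printed} -> {name} ({similarity:.2f})")
```

A similarity score only proposes a candidate. The stated profession and workplace still have to confirm the identification, as the Vienna/Jena mix-up shows.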

Progress so far:

  •   OK 2015-01-30 checked volumes 6 and 10 for members.
  •   OK 2015-02-04 checked volumes 3, 7–9 and 11–12 for members.
  •   OK 2015-02-06 checked volume 14 for members.
  •   OK 2015-02-10 checked volumes 16–20 for members. Total number now: 330

Sodales Academiae Latinitati Fovendae

The members of the Academia Latinitati Fovendae (ALF) are Latin scholars from all over the world. Many of them have articles on lawiki, some also on other Wikipedias. The ALF has a list of its members as of 2012-04-20. This list also has the names of the founding members (1967-04-18) along with the people who were declared "founding members" in 1983, i.e. members who were elected to the ALF up to 1983. Jonathan Groß (talk) 13:44, 6 February 2015 (UTC)

I could not find the following people on Wikidata as of today:

  • Mauro Agosto (2007), Italy
  • Marco Buonocore (2012), Italy
  • Neil Coffee (2011), United States of America [5]
  • Lucienne Deschamps (1990), France [6]
  • Giorgio Di Maria (2013), Italy [7]
  • Gérard Freyburger (1998), France [8]
  • Dimitrios Koutroubas (1998), Greece VIAF
  • Bruno Luiselli (1984), Italy VIAF
  • José María Maestre Maestre (2004), Spain [9]
  • Piergiorgio Parroni (1982), Italy [10], VIAF
  • Françoise Licoppe-Deraedt (2013), Belgium, Honorary Member
  • Giancarlo Rossi (2011), Italy, Honorary Member
  • Jane O’Neil (2001), United States of America, Honorary Member
  • Nicolae Barbu (1967), Romania VIAF
  • Edoardo Coleiro (1967), Malta VIAF
  • Giuseppe Del Ton (1967), Vatican City VIAF
  • Walthère Derouau S.J. (1967), Belgium/Burundi
  • Κωνσταντίνος Γρόλλιος (1967), Greece VIAF
  • Alfons Isnenghi (1967), Austria, Dr. phil., teacher in Salzburg
  • Jan Kábrt (1967), Czechoslovakia
  • Stéphane Kresic (1967), Canada VIAF
  • William Stuart Maguinness (1967), United Kingdom
  • José Maria Mir (1967), Spain VIAF
  • Ottorino Morra (1967), Italy
  • Vandick Londres da Nóbrega (1967), Brazil VIAF
  • Guerino Pacitti (1967), Italy
  • Virgilio Paladini (1967), Italy
  • Faruk Zeki Perek (1967), Turkey [11]
  • Pierre Schmid (1967), Switzerland
  • Vincenzo Ussani d’Escobar (1967), Italy
  • Madeleine Bonjour (1982), France
  • José Jimenez Delgado (1983), Spain
  • Anton Daniel Leeman (prior to 1980), Netherlands
  • Alain Michel (1973), France
  • Isaj Mihajlovič Nahov (1980), USSR
  • Boleslaw Povsic (1982), United States of America
  • Michel Rambaud (1974), France
  • José Ruysschaert (prior to 1983), Belgium VIAF
  • Amleto Tondini (1969), Vatican City
  • Gavin B. Townend (1982), United Kingdom
  • Antonio Traglia (1976), Italy
  • Rodolphe Verdiere (1973), Belgium
  • Jan Wikariak (prior to 1983), Poland

Jonathan Groß (talk) 15:52, 6 February 2015 (UTC)