Main Page
Latest revision as of 12:58, 19 September 2023
Unofficial Wikibase sandbox for the national library of The Netherlands.
Disclaimer
This Wikibase sandbox is for playing, experimenting and learning about the functionalities, possibilities and limitations of Wikibase, and for knowledge sharing activities resulting from these learnings. It is maintained by Olaf Janssen, the Wikimedia coordinator of the KB, mainly as a tool for experimentation, learning, skill-building, knowledge sharing and consultancy about Wikibase, aimed at KB staff, (inter)national Wiki(base) communities, GLAMs, KB network partners and other third parties. It does not contain any official data(sets) of the KB, and the data can be changed or deleted at any time. This instance is not supported by the KB, nor is it part of the official KB IT infrastructure. If you are interested in the official Wikibase instance of the KB, please visit xxx.
Quick links
Properties (namespace 122)
Basic properties for persons
- is a unique example of (value: Q-number of human)
- NTA label
- given name
- given name (as string)
- family name
- family name (as string)
- sex or gender (value: Q-number of male or female)
- country of nationality
- country of nationality (two-letter country code as string)
- date of birth (EDTF)
- place of birth
- place of birth (as string)
- date of death (EDTF)
- place of death
- place of death (as string)
- occupation
- occupation (as string)
- PPN identifier for person name
- VIAF identifier
- DBNL identifier for author
- ISNI identifier
- LCAuth identifier
- Wikidata URI
- Wikidata Qid
Basic properties for works
- is a unique example of
- subclass of
- title
- description
- language of work
- language of work (as string)
- genre
- genre (as string)
- author
- author (as string)
- illustrator
- illustrator (as string)
- publisher
- publisher (as string)
- place of publication
- place of publication (as string)
- date of publication (EDTF)
- in collection
- in collection (as string)
- inventory number
- material used
- material used (as string)
- PPN identifier for the original physical work
- Wikidata URI
- Wikidata Qid
Items (namespace 120)
Basic items, general
Basic items for persons
Basic items for works
Terminology
- https://www.wikidata.org/wiki/Help:Statements
- Entity: an Item (Q-number) or a Property (P-number)
- A statement consists of a property (P) and a value (often a Q-item), for example, "location: Germany".
- Statements can also be expanded upon, annotated, or contextualized with the addition of optional qualifiers, references, and ranks.
- The core part of a statement, without references and ranks, is also called a claim. A claim without qualifiers is also referred to as a snak (the simplest property-value pair).
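The nesting of these terms can be sketched as a small data structure. This is only an illustration of the definitions above, not the actual Wikibase JSON layout; the class names, field names and the IDs P100 and Q200 are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Snak:
    """Simplest property-value pair, e.g. "location: Germany"."""
    prop: str   # P-number of the property
    value: str  # Q-number of an item, or a literal such as a string or date


@dataclass
class Claim:
    """A main snak plus optional qualifiers."""
    mainsnak: Snak
    qualifiers: List[Snak] = field(default_factory=list)


@dataclass
class Statement:
    """A claim plus optional references and a rank."""
    claim: Claim
    references: List[List[Snak]] = field(default_factory=list)
    rank: str = "normal"


# Hypothetical IDs: P100 stands for "location", Q200 for "Germany".
location = Statement(Claim(Snak("P100", "Q200")))
print(location)
```

A statement without qualifiers, references or a non-default rank therefore carries no more information than its main snak.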
Example items
These are for testing and improving our KB data models (and writing them via the API). They are first, naive sketches, "discussion pieces" to help with the data modelling.
Catchpenny prints (centsprenten)
- Q20 - Adam en Eva uit het paradijs verjaagd - a basic Q-item describing a physical KB catchpenny print.
- Food for discussion:
- Note: Q20 describes the physical catchpenny print, see nbc.bibliotheek.nl/32226085X and KB-catalogus/32226085X.
- There is also a PPN for the digitised catchpenny print, see nbc.bibliotheek.nl/380352362 and KB-catalogus/380352362, but that is a different, digital object: a digital file, a jpg.
- The conceptual question we need to answer in the context of EMMA is how the physical and digitised objects should relate to each other, i.e. how we want to record that relation in our Wikibase. Creating 2 separate Q-items, 1 for the physical print (PPN=32226085X) and 1 for the jpg file (PPN=380352362), as is currently done in the GGC and the KB catalogue, is in my (Olaf's) opinion not desirable, but we (in particular Danielle and I) need to sit down with the KB metadata experts to discuss this.
- The GGC/KB catalogue is not clean in this respect either. If you look, for example, at the description of the digital thing ("the jpg") at https://webggc.oclc.org/cbs/DB=2.37/XMLPRS=Y/PPN?PPN=380352362, it still contains all kinds of fields that relate to the physical thing. For instance, under "Jaar van vervaardiging" (year of creation), the year in which the jpg was made, I would expect "2009" instead of "[tussen 1869 en 1886]" ([between 1869 and 1886]), because nobody could make jpgs in the 19th century. In the description of the digital file I would expect properties such as resolution, colour depth, file format (jpg, png, tiff..), EXIF info, URL etc.
Persons
Basic Q-items describing flesh-and-blood people
- Q10 - Louis Auguste Gustave Doré
- Q29 - Theun de Vries
- Open issues (Phabricator):
- Allow custom ordering of properties in wikibase.cloud instances --> Resolved, the ordering is now customisable. See the solution in https://phabricator.wikimedia.org/T310899 ; Olaf created https://kbtestwikibase.wikibase.cloud/wiki/MediaWiki:Wikibase-SortedProperties and purged the page cache via Special:Purge.
- Clustering of external ID properties in Qitems in wikibase.cloud (as in Wikidata) --> related to the issue above
Open questions:
- Which data model underlies the persons we are going to describe in our Wikibase? Should/can we align with (reuse) existing data models for persons and person names?
SPARQL
Setup
- SPARQL queries can be run via https://kbtestwikibase.wikibase.cloud/query/
- For now, prefixes must be supplied explicitly (modelled on the prefixes used in Wikidata):
PREFIX wd: <https://kbtestwikibase.wikibase.cloud/entity/>
PREFIX wds: <https://kbtestwikibase.wikibase.cloud/entity/statement/>
PREFIX wdv: <https://kbtestwikibase.wikibase.cloud/value/>
PREFIX wdt: <https://kbtestwikibase.wikibase.cloud/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX p: <https://kbtestwikibase.wikibase.cloud/prop/>
PREFIX ps: <https://kbtestwikibase.wikibase.cloud/prop/statement/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
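Because the prefixes have to be supplied with every query, it can help to prepend them programmatically. The following is a minimal sketch using only the Python standard library. The helper names are hypothetical, the endpoint path `/query/sparql` is an assumption that is not taken from this page, and P3 ("is a unique example of") is used only as an example property.

```python
import json
import urllib.parse
import urllib.request

BASE = "https://kbtestwikibase.wikibase.cloud"

# Subset of the prefixes listed above: only the ones the example query needs.
PREFIXES = {
    "wd": f"{BASE}/entity/",
    "wdt": f"{BASE}/prop/direct/",
    "wikibase": "http://wikiba.se/ontology#",
}


def with_prefixes(query_body: str) -> str:
    """Prepend explicit PREFIX declarations to a SPARQL query body."""
    header = "\n".join(f"PREFIX {name}: <{iri}>" for name, iri in PREFIXES.items())
    return f"{header}\n{query_body.strip()}"


def run_query(query: str, endpoint: str = f"{BASE}/query/sparql") -> dict:
    """Send the query via HTTP GET and return the parsed JSON result.

    The default endpoint path is an assumption; adjust it if it differs.
    """
    url = endpoint + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Build (but do not send) a query for items with a P3 statement.
query = with_prefixes("SELECT ?item ?class WHERE { ?item wdt:P3 ?class } LIMIT 5")
print(query)
```

Calling `run_query(query)` would send the request; it is left out here so that the sketch runs without network access.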
Queries
- See these examples of SPARQL queries - the queries on that page are shown in the Examples dropdown at https://kbtestwikibase.wikibase.cloud/query/.
Interaction via the API
Endpoint / base URL
Via URLs
- An example of requesting item Q1 as JSON: https://kbtestwikibase.wikibase.cloud/w/api.php?action=wbgetentities&ids=Q1&format=json
- Q1 and Q2 together as JSON: https://kbtestwikibase.wikibase.cloud/w/api.php?action=wbgetentities&ids=Q1%7CQ2&format=jsonfm (or as XML)
Via Python
- https://github.com/KBNLresearch/wikibase-api/tree/master/examples-kb - Examples on how to read, write, delete, update Ps, Qs, claims, qualifiers and references in https://kbtestwikibase.wikibase.cloud using Python
- Docs/manual are on https://wikibase-api.readthedocs.io/en/latest/.
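Independently of the wikibase-api library, the wbgetentities URLs shown under "Via URLs" can also be built and fetched with the Python standard library alone. This is a minimal sketch; the function names are hypothetical and no authentication is involved, so it only covers reading.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://kbtestwikibase.wikibase.cloud/w/api.php"


def entities_url(ids, fmt="json"):
    """Build a wbgetentities URL for one or more entity IDs."""
    params = {"action": "wbgetentities", "ids": "|".join(ids), "format": fmt}
    # urlencode percent-encodes the "|" separator as %7C.
    return API_URL + "?" + urllib.parse.urlencode(params)


def get_entities(ids):
    """Fetch the entities and return the "entities" part of the JSON response."""
    with urllib.request.urlopen(entities_url(ids)) as response:
        return json.load(response)["entities"]


print(entities_url(["Q1", "Q2"]))
```

`get_entities(["Q1", "Q2"])` would perform the actual request and return a dictionary keyed by entity ID.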
To look into further, probably relevant
- https://github.com/LeMyst/WikibaseIntegrator. See also the further import options at https://www.mediawiki.org/wiki/Wikibase/Importing
Reconciliation service for this Wikibase
https://github.com/KBNLresearch/OpenRefine-Wikibase
- Recon service based on Docker: https://openrefine-wikibase.readthedocs.io/en/latest/install.html
- Example config file: https://github.com/KBNLresearch/OpenRefine-Wikibase/blob/main/config.py
- Configuration of OpenRefine: add URLs of recon service and manifest to OpenRefine
- Example manifest: https://github.com/KBNLresearch/OpenRefine-Wikibase/blob/main/kb-test-wikibase-cloud-manifest.json
Speed of bulk data import
- Fast Bulk Import Into Wikibase - Importing a large amount of Items into Wikibase can be a challenge. This post provides a high level overview of different importing approaches and their performance.
Wikibase.cloud project updates
Wikibase.cloud documentation
Our Wikibase.cloud unboxing experience and first findings
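The text below is a transcript of the presentation KB's Wikibase.cloud Unboxing Experience (https://commons.wikimedia.org/wiki/File:KB_Wikibase.cloud_Unboxing_Experience,_Netherlands_Wikibase_Knowlegde_Group,_22-07-2022.pdf)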
In June-July 2022 the KB created and configured this instance on Wikibase.cloud. Many of the steps we needed to take in the setup process were as to be expected when working with fresh MediaWiki instances in general, and Wikidata in particular. However, some steps were not immediately logical, manifest, clearly documented or simply beyond our (then) knowledge and experience. We created a presentation in which we share experiences and first impressions of unboxing, setting up, configuring and tweaking our Wikibase.cloud sandbox instance. We also present solutions for most of the issues we encountered.
Below is an overview of the most important issues we encountered. The numbering refers to the corresponding slides in the presentation:
- We successfully created our test instance at https://kbtestwikibase.wikibase.cloud via the personal Wikibase.cloud dashboard that every user gets, as described here. This went without any problems, as expected.
- ISSUE: We would have liked to be able to create another Wikibase.cloud instance from this dashboard, but currently (15th Aug 2022) there is a hard limit of one instance per account.
- SOLUTION: There is no solution yet.
- As can be seen from the users list, there is a preset default admin User:Ookgezellig account for this instance, resulting from the global wikibase.cloud creation account (ookgezellig@gmail.com)
- Creating additional, regular user accounts on this Wiki instance was as expected. We created User:OlafJanssen with normal, restricted privileges.
- ISSUE: Regular user accounts with default privileges are not allowed to add URLs to items (see Phabricator T310419, T86453 and T310421).
- SOLUTION: In order to be able to add the Wikidata URL to (for instance) P3 ("is a unique example of"), we used the User:Ookgezellig admin account to temporarily upgrade the User:OlafJanssen account to admin privileges as well.
- IMPORTANT: the Wikidata URL in P3 must be formatted as http://www.wikidata.org/entity/P31, and NOT as https://www.wikidata.org/entity/P31, and NOT as https://www.wikidata.org/wiki/Property:P31, or any other mix of these incorrect URL syntaxes. This is important for correctly functioning, abbreviated display of these URLs in the Wikibase query service ("wd:P31", as can be seen in the query result https://tinyurl.com/2kd5dthd).
- ISSUE: When adding statements to a Q-item such as Theun de Vries, by default the statements are (and remain) displayed in the same order as you originally added them. This is different from Wikidata, where in the corresponding Wikidata item about Theun de Vries the statements are displayed in a custom, fixed order, with "Instance of" and "Image" on top, and a list of external identifiers at the bottom of the page, irrespective of the exact order in which these statements were originally added to this Wikidata item.
- SOLUTION: This issue was solved by adding the page MediaWiki:Wikibase-SortedProperties to the Wikibase, in which the custom order of statements can be specified. This is the same as in Wikidata, see MediaWiki:Wikibase-SortedProperties and the manual on Wikidata. Please note that the required page cache purging needed for this functionality to kick in can be actively triggered via Special:Purge.
- We created a small number of SPARQL query examples on this Wikibase.
- ISSUE: the Wikidata SPARQL template for query syntax highlighting is not implemented in this Wikibase.
- SOLUTION: Instead one must use a <sparql tryit="1">...</sparql> wrapper, as demonstrated in this example.
- IMPORTANT: in Wikidata prefixes don't need to be specified explicitly in SPARQL queries. However, in our Wikibase.cloud instance, prefixes must be explicitly stated, as this example shows. Please note and mind the distinction between PREFIX kbwdt, kbwd, wdt and wd.
- IMPORTANT: The exact URL https://kbtestwikibase.wikibase.cloud/wiki/Project:SPARQL/examples, as well as the correct formatting of this page, are important to correctly display these query examples in the query services interface (click the Examples button at the top)
- Via the regular MediaWiki Action API (api.php) the full item on Theun de Vries can be requested as JSON or as XML as well.
- An alternative way of requesting full Wikibase items directly from the Q-number is via the Special:EntityData URL. The output can be obtained in seven different formats: HTML, JSON, RDF/XML, NT, TTL, N3 and PHP.
- Or equivalently, using a format argument, e.g. Special:EntityData?id=Q29&format=json for a JSON response.
- For simple interaction with the Wikibase API via Python, the wikibase-api Python library can be used. Read the docs for more details.
- Examples on how to use this library are available at https://github.com/KBNLresearch/wikibase-api/tree/master/examples-kb + https://github.com/kbnlresearch/wikibase-api/blob/master/examples-kb/wikibase-api.py
- ISSUE: Regular user accounts with default privileges are not allowed to add URLs to items
- SOLUTION: See discussion above
- ISSUE: Creation of common.js and common.css pages on Wikibase.cloud instances is not allowed. On Wikidata, users (eg. User:OlafJanssen) can create personal common.js and common.css pages for adding custom functionalities and layout to their Wikidata interfaces. On Wikibase.cloud, this is currently not possible.
- SOLUTION: No solution yet, see Phabricator T310787.
- ISSUE: Default data import speed is rather limited. When importing items into a Wikibase using REST API-based tools such as QuickStatements, pywikibot or WikidataIntegrator, the maximum import speed is about 3-10 items per second, as explained in the article Fast Bulk Import Into Wikibase by The Wikibase Consultancy. This is an OK speed for small datasets, but if you (like the KB) plan to import millions of items, this is obviously too slow.
- SOLUTION: PHP scripts or direct SQL tools such as RaiseWikibase, as further outlined in the article. At the moment these solutions are not considered nor used by the KB.
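The IMPORTANT note above on the formatting of Wikidata URLs (http://www.wikidata.org/entity/P31 and none of the other variants) lends itself to a small check before data is written. The following is a minimal sketch, assuming that only the URL variants named in that note need to be handled; the function name is hypothetical.

```python
import re

# Accepts http or https, an /entity/ or /wiki/ path, and an optional
# "Property:" namespace in front of the P- or Q-number.
_WIKIDATA_ID = re.compile(
    r"^https?://www\.wikidata\.org/(?:entity|wiki)/(?:Property:)?([PQ]\d+)$"
)


def canonical_wikidata_uri(url: str) -> str:
    """Rewrite a Wikidata entity URL to the http://www.wikidata.org/entity/ form."""
    match = _WIKIDATA_ID.match(url.strip())
    if match is None:
        raise ValueError(f"Not a recognised Wikidata entity URL: {url!r}")
    return "http://www.wikidata.org/entity/" + match.group(1)


for variant in (
    "http://www.wikidata.org/entity/P31",
    "https://www.wikidata.org/entity/P31",
    "https://www.wikidata.org/wiki/Property:P31",
):
    print(canonical_wikidata_uri(variant))
```

All three variants are rewritten to the same canonical form, so the abbreviated display in the query service does not depend on how the URL was originally copied.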
- Wikibase.cloud issues on Phabricator: https://phabricator.wikimedia.org/tag/wikibase.cloud/
- Wikibase.cloud updates: https://meta.wikimedia.org/wiki/Wikibase/Wikibase.cloud
- Wikibase.cloud user documentation: https://www.mediawiki.org/wiki/Wikibase/Wikibase.cloud
- Telegram for quick help: https://t.me/joinchat/FgqAnxNQYOeAKmyZTIId9g
- Wikibase-cloud mailing list: https://lists.wikimedia.org/postorius/lists/wikibase-cloud.lists.wikimedia.org/
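The Special:EntityData route with a format argument, described in the overview above, can be sketched as a small URL builder. The function name is hypothetical, and the lower-case spellings of the format values (in particular "rdf" for RDF/XML) are an assumption that should be checked against the instance.

```python
import urllib.parse

ENTITY_DATA = "https://kbtestwikibase.wikibase.cloud/wiki/Special:EntityData"

# Assumed parameter spellings for the seven formats named in the overview.
FORMATS = {"html", "json", "rdf", "nt", "ttl", "n3", "php"}


def entity_data_url(entity_id: str, fmt: str = "json") -> str:
    """Build a Special:EntityData URL using the id and format arguments."""
    if fmt not in FORMATS:
        raise ValueError(f"Unsupported format: {fmt!r}")
    query = urllib.parse.urlencode({"id": entity_id, "format": fmt})
    return f"{ENTITY_DATA}?{query}"


print(entity_data_url("Q29"))
```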