This is a guest blog by Axel Petterson from Wikimedia Sweden.
Do you have a bunch of monuments ready to use in a “Wiki Loves” competition? Great! You have organized them in some kind of list already? Even better!
But what does it take to turn that list into structured data that can be added to Wikidata, so that the list is easier to create on all language versions of Wikipedia and all monuments can be put on a map for photographers to find?
You can do this yourself, but if you’d like help, the Content Partnership Hub is available for you!
We would like to give you a short walk-through of the initial steps needed for importing a data set into Wikidata.
The first important thing is to make sure that your list is up to date, so that we don’t start with old, invalid data. We also need to make sure the license is compatible with Wikidata: the data must be available under the CC0 license or be in the public domain, so that no issues arise later that might force us or the community to remove the data.
Another thing you want to sort out early is what to do with the existing monument lists on Wikipedia, which are mostly available as templated tables on the wiki. Having monument data available on both Wikidata and Wikipedia often leads to updates in one but not the other, which makes it hard to keep track of the latest updates and correct information. It is possible to replace the static lists on Wikipedia with dynamic lists powered by Wikidata, like this one from Australia, but some Wikipedia versions do not allow such lists in the main namespace. Make sure you know what the community consensus is, or open it up for discussion before implementing new lists for the first time.
When that is out of the way, the next step is to organize the list in some structured way: you can either do that yourself, or the Content Partnership team is happy to help. A spreadsheet such as an Excel file is fine, and so is JSON or some other output from a database. It is important to mention the source of the data, so that it can be added as a reference for each monument when adding the information to Wikidata.
As for how much information about each monument is needed, more is usually better than less, but the minimum is:
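As a rough illustration of keeping the source together with each monument, here is a minimal Python sketch that reads a CSV export and attaches the source to every row. The column names, the sample rows and the source label are hypothetical; your own export will look different.

```python
import csv
import io


def load_monuments(csv_text, source):
    """Parse CSV text into a list of dicts and record the data source on each row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row, source=source) for row in reader]


# Hypothetical sample data; a real export would be read from a file instead.
csv_text = "name,identifier\nOld Town Hall,1001\nHarbour Statue,1002\n"
monuments = load_monuments(csv_text, source="Example Heritage Register (hypothetical)")
```

Each row now carries a `source` value that can later be used as the reference when the monument is added to Wikidata.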
- Name of the monument
- Unique identifier (often a number), preferably from the authority who defined the monument.
- What kind of monument is it – a building, a statue, a temple?
- Where is it located – city or other administrative unit, address, or even better: coordinates!
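If you want to check your list against this minimum before sending it, a small script can flag incomplete rows. The sketch below is only one way to do it and assumes hypothetical field names (`name`, `identifier`, `monument_type`, `location`); adjust them to match your own columns.

```python
# Hypothetical field names for the four minimum pieces of information.
REQUIRED_FIELDS = ("name", "identifier", "monument_type", "location")


def missing_fields(record):
    """Return the required fields that are absent or blank in one record."""
    return [f for f in REQUIRED_FIELDS if not str(record.get(f) or "").strip()]


def check_list(records):
    """Map the index of each incomplete row to the fields it is missing."""
    problems = {}
    for index, record in enumerate(records):
        missing = missing_fields(record)
        if missing:
            problems[index] = missing
    return problems


# Made-up sample rows: the second one lacks an identifier and a location.
sample = [
    {"name": "Old Town Hall", "identifier": "1001",
     "monument_type": "building", "location": "Exampletown"},
    {"name": "Harbour Statue", "identifier": "",
     "monument_type": "statue", "location": None},
]
problems = check_list(sample)
```

Rows that show up in `problems` are not necessarily blockers, but they are good candidates for a second look.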
After the list is structured it will be matched with what is already available at Wikidata: again, you can either do this yourself, with help from the Content Partnership team, or we can do this for you. Depending on how much data is available the process may take a few hours, but it can take more time if a lot of ‘cleaning’ is needed or if there are open questions before the list can be turned into structured data. This might be the case when there already is a large number of monuments from your country available in Wikidata and we have to make sure not to create duplicates.
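To give an idea of what avoiding duplicates means in practice, here is a simplified Python sketch that splits a list into monuments that already have a Wikidata item and monuments that would need a new one, based on the unique identifier. The `existing_ids` mapping and the item IDs in it are placeholders; in real work that mapping would come from Wikidata itself, and matching often needs more than an exact identifier comparison.

```python
def split_by_existing(records, existing_ids):
    """Separate records whose identifier is already known from those that are new.

    existing_ids maps an authority identifier to the ID of the existing item.
    """
    matched, new = [], []
    for record in records:
        key = record["identifier"].strip()
        if key in existing_ids:
            matched.append((record, existing_ids[key]))
        else:
            new.append(record)
    return matched, new


# Placeholder data: neither the identifiers nor the item IDs are real.
existing_ids = {"1001": "Q_PLACEHOLDER_A"}
records = [
    {"name": "Old Town Hall", "identifier": "1001"},
    {"name": "Harbour Statue", "identifier": " 1002 "},
]
matched, new = split_by_existing(records, existing_ids)
```

Matched monuments can then be updated on their existing items, while only the new ones are created, which keeps duplicates out of Wikidata.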
In this step language is an important factor: working on data that is not in an Indo-European language, especially if it uses a script other than Latin or Cyrillic, will require more assistance from you or other community members to help us match the data available and find the new information in your data set.
The sooner your lists can be made available to us, the more time there is to plan the work among other tasks and to sort out potential issues.
Even if the data is not yet perfect, you might still want to have it available on Wikidata and continue improving it directly there. For example: if you don’t have coordinates of all the monuments yet, you can invite the local Wikipedia community to fill in the gaps once the dynamic lists are up, and make the Wikidata items more usable for Wiki Loves Monuments.
The initial datasets uploaded do not have to be perfect or complete: the four bullet points mentioned above are a good way to get started.
For any questions about moving your monument data to Wikidata, please send an email to the Content Partnership Helpdesk at email@example.com. We’ll be happy to explore together what data you have and what we can do to support your team in adding structured lists for Wiki Loves Monuments. Since the same process applies to other datasets, we could possibly also follow up with nature reserves in your country for Wiki Loves Earth.
Read more about the Content Partnership Helpdesk on Wikimedia Foundation’s weblog Diff.