Data is created, analyzed, circulated and used in large quantities and many forms, for a variety of purposes. Think of photo sharing on social media, but also of the registration of personal data by government agencies. When data is used, its origin and quality are not always known. Conversely, when data is created, it is not always clear how it will be used. In a constitutional state that is digitalizing, society and government need to be able to trust data and its quality. Data lineage can contribute to this, according to a new WODC report.

Understanding the life cycle of data by recording origins, changes and end uses is called data lineage. This can be done by adding information about the data, also called metadata.
As an example of data lineage, suppose someone takes a photo on vacation: the photographer's name is added to the file, as well as the date and location of the photo. Before the photo is shared on social media, a photo editor is used to remove tourists who are in the way. This operation is also added to the photo's metadata.
If someone then views this photo on social media and requests its metadata, it becomes clear that the photo has been edited. That person can then judge for themselves whether the photo is trustworthy. Context also plays a role here. The photo is an appropriate source for seeing where someone has been on vacation, but, because of the editing, not necessarily for estimating how crowded the location was.
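The photo example can be sketched in a few lines of Python. This is only an illustration of the idea, not something taken from the WODC report: the class names, field names, purpose labels and sample values below are all hypothetical, and real photo metadata standards are considerably richer.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LineageEvent:
    """One step in the life cycle of a piece of data (hypothetical structure)."""
    action: str        # e.g. "created" or "edited"
    actor: str         # who performed the step
    detail: str = ""   # free-text description of what happened


@dataclass
class Photo:
    """A photo that carries its own lineage as metadata (illustrative only)."""
    creator: str
    taken_on: str
    location: str
    history: List[LineageEvent] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Record the origin of the data at creation time.
        self.history.append(
            LineageEvent("created", self.creator, f"{self.taken_on}, {self.location}")
        )

    def record_edit(self, actor: str, detail: str) -> None:
        # Every change is appended; nothing is overwritten.
        self.history.append(LineageEvent("edited", actor, detail))

    def was_edited(self) -> bool:
        return any(event.action == "edited" for event in self.history)


def fit_for_purpose(photo: Photo, purpose: str) -> bool:
    """Toy rule: whether the photo is suitable depends on the intended use."""
    if purpose == "locate_vacation":
        # The recorded location is enough, whether or not the image was edited.
        return bool(photo.location)
    if purpose == "estimate_crowds":
        # An edited image may no longer show how busy the location was.
        return not photo.was_edited()
    return False


# Made-up sample values, for illustration only.
photo = Photo(creator="photographer", taken_on="2024-07-01", location="beach")
photo.record_edit("photographer", "removed tourists with a photo editor")

for event in photo.history:
    print(event.action, "by", event.actor, "-", event.detail)
print("usable to locate vacation:", fit_for_purpose(photo, "locate_vacation"))
print("usable to estimate crowds:", fit_for_purpose(photo, "estimate_crowds"))
```

The point of the sketch is that the lineage does not decide for the viewer whether the photo is trustworthy. It only makes the origin and the changes visible, so that the judgment can be made per purpose.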
Because algorithmic and data-driven tools are increasingly used within the Dutch legal system to shape and implement policy, implementing data lineage is crucial. Data generation and exchange within the legal system are fragmented, so it is not always easy to determine where data originated. Tools that offer insight into the life cycle of data are therefore necessary for users of this data to make well-informed judgments.
The WODC report outlines frameworks within which data professionals in the Dutch legal system can start thinking about solutions for data lineage. In addition, the report offers guidance on which (technical) approaches, methods and instruments could be used for this purpose, and on their advantages and disadvantages.
