What is a data infrastructure?

A data infrastructure consists of data assets, the organisations that operate and maintain them, and the processes, policies and guides that describe how to use and manage the data. It can be seen as an ecosystem of technology, processes and actors/organisations needed for the collection, storage, maintenance, distribution and (re)use of data by the different end users in the agricultural sector. As an analogy, a road infrastructure includes not only the road network (hardware and software) but also the resources, people and equipment that maintain the roads, driving regulations, traffic control, emergency services, the drivers and their cars, and even the car dealer and garage. In summary, a road infrastructure is everything needed to get from A to B safely by car. A reliable data infrastructure is sustainably funded and designed so that the use and value of data are maximised by meeting the needs of society.

One of the challenges in open data for agriculture is that datasets are often distributed across different ministries and agencies, sometimes including (semi-)privatised bodies. Government structures vary across the globe, but in general the relevant information for agriculture can be found in:

  • the ministry of agriculture, including associated extension, research or subsidy bodies;
  • other government agencies (which may be semi-privatised) including a meteorological agency for weather and climate data, a mapping agency providing geographical data, and statistical offices conducting population surveys and monitoring; and
  • ministries dealing with water, natural resources, infrastructure, spatial planning, trade and finance.

Box 2: Open Data Standards for Agriculture

Many well-established standards are in use for the collection and management of agriculture-related data. The VEST / AgroPortal map of standards lists over 140 agriculture-specific vocabularies, classification schemes and metadata standards. Many of these have been developed in scientific and specialist domains to support data sharing within particular communities of practice.

Figure: VEST/AgroPortal vocabulary openness by domain

When providing open data, publishers should maximise the number of potential reusers of a dataset. Using accessible, open standards with clear documentation and licensing helps to achieve this.

Research by the GODAN Action programme has found that many current standards lack clear licensing and documentation. This points to the need for greater collaboration in the sector to:

  • identify and agree upon existing open standards for the key datasets listed above;
  • work to develop open standards, building on existing practice, where there are gaps.

As part of the further development of this resource, working with GODAN Action, we aim to contribute to that initial mapping.

Developing a data infrastructure for agriculture is therefore not a matter for a single ministry. Success depends on collaboration between actors and organisations, and aligning shared interests. However, the need for open collaboration may also increase the potential for innovation across multiple sectors. For instance, open weather data will be used by everyone from farmers to the transport industry to individual citizens.

A strong agriculture data infrastructure also requires that different datasets can be combined and used together. Adherence to common open data standards can help. A data standard is a guideline or series of guidelines that defines the way in which data should be collected or structured. By following the standard, similar data can be easily compared over time, across locations, and within and between organisations, and can be easily manipulated to produce visualisations and identify trends. In other words, standards help to make reuse simple. For many of the data categories in this package, open data standards are still under development. For each data category we point to relevant initiatives hosting standards or working on enhancing interoperability.
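To make the idea of a data standard concrete, the following is a minimal sketch in Python, not a real agricultural standard. The field names (`station_id`, `date`, `rainfall_mm`), the two publishers and all values are hypothetical and chosen only for illustration. It shows how records that follow a shared structure can be combined directly, while records that do not follow it have to be set aside for manual conversion.

```python
# Hypothetical shared standard: field names and types are illustrative only,
# not taken from any published agricultural data standard.
RAINFALL_STANDARD = {
    "station_id": str,
    "date": str,          # assumed to be an ISO 8601 date, e.g. "2020-03-01"
    "rainfall_mm": float,
}


def conforms(record, standard=RAINFALL_STANDARD):
    """Return True if the record has exactly the standard's fields and types."""
    return (
        record.keys() == standard.keys()
        and all(isinstance(record[name], kind) for name, kind in standard.items())
    )


# Two made-up publishers releasing rainfall observations.
met_agency = [
    {"station_id": "A1", "date": "2020-03-01", "rainfall_mm": 12.5},
]
ministry = [
    {"station_id": "B7", "date": "2020-03-01", "rainfall_mm": 8.0},
    # This record uses its own field names and units, so it cannot be
    # combined with the others without manual conversion.
    {"station": "B8", "day": "01/03/2020", "rain": "8mm"},
]

all_records = met_agency + ministry
combined = [r for r in all_records if conforms(r)]
rejected = [r for r in all_records if not conforms(r)]

# Conforming records from different publishers can be aggregated directly.
total_mm = sum(r["rainfall_mm"] for r in combined)
```

Here the two conforming records can be summed without any knowledge of who published them, which is the sense in which a standard makes reuse simple. Real standards typically also specify vocabularies, units and metadata, which this sketch leaves out.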