What Is FAIR Data 101? Principles And Why Processes Matter

John Conway
Oct 7, 2019

According to the 2021 State of Open Data report by Digital Science, only 28% of researchers are familiar with the FAIR principles, even though the FAIRness of data is crucial [1]. Although this percentage has increased significantly over the past five years, it still represents only about a quarter of researchers.

Advances in scientific knowledge are fueled by the data that support scientific inquiry. These datasets can shed new light on earlier findings, helping to confirm or refute the scientific record and creating opportunities for further study and understanding. They also hold crucial answers to many of the most important questions that scientists are currently grappling with.

The FAIR Data principles are increasingly being adopted by researchers, funding agencies, and institutions around the world, as a way to improve the transparency, accountability, and reproducibility of research, and to support the creation of new knowledge and innovations.

In this article, we’ll look at the FAIR Data principles and why they matter. We’ll walk through the four principles and the FAIRification process, explain why it is important to FAIRify data, and close with some FAQs.

What Are The FAIR Data Principles?

Published in Scientific Data in 2016, the FAIR Data Principles (Findable, Accessible, Interoperable, and Reusable) are a set of guiding principles proposed by a group of scientists and organizations to enhance the reusability of digital assets [2]. The principles were developed to address the growing problem of data silos, in which data is difficult to find and use because it is scattered across different locations, formats, and systems.

Because humans increasingly rely on computational support to deal with data as its volume, complexity, and speed of creation grow, the principles emphasize machine-actionability: the ability of computational systems to find, access, interoperate, and reuse data with no or minimal human intervention.
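As a rough illustration of machine-actionability, the sketch below prepares a request that asks the DOI resolver for structured metadata instead of a web page. It assumes the DOI was registered with an agency that supports content negotiation for the CSL JSON media type; the function names are hypothetical, and the network call is defined but not executed here.

```python
import json
import urllib.request

# Media type for citation metadata in CSL JSON form (assumed to be supported
# by the registration agency behind the DOI).
CSL_JSON = "application/vnd.citationstyles.csl+json"

def build_doi_request(doi):
    """Prepare (but do not send) a request for machine-readable DOI metadata."""
    return urllib.request.Request(
        f"https://doi.org/{doi}",
        headers={"Accept": CSL_JSON},
    )

def fetch_doi_metadata(doi, timeout=10):
    """Resolve the DOI over the network and return the parsed metadata."""
    with urllib.request.urlopen(build_doi_request(doi), timeout=timeout) as resp:
        return json.load(resp)

# DOI of the 2016 FAIR principles paper in Scientific Data.
request = build_doi_request("10.1038/sdata.2016.18")
print(request.full_url, request.get_header("Accept"))
```

A program that receives such a record can read titles, authors, and dates without a person interpreting a landing page, which is the point of the machine-actionability requirement.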

Although the FAIR principles have their roots in the life sciences, they are applicable to all fields of study. Since they were first published, universities, national funders, the European Union, and others have endorsed them. That support ranges from establishing guidelines for data handling to developing infrastructures and tools for data management. While some FAIR projects and implementations strictly adhere to the original definitions, others are guided by the principles' ethos.

What Is FAIR Metadata?

In simple terms, metadata is data about data: information that describes other information. It is crucial in making your data FAIR. The metadata for your research data should be kept up to date throughout a project, not only at its start and conclusion.

It is preferable to add metadata in accordance with a disciplinary standard, either manually or automatically (for instance, when you take a microscope image, the accompanying software often records metadata as part of it).

Because metadata link research data and publications in the Internet of FAIR Data and Services, and are expected to remain accessible even when the data themselves are not, they are arguably more significant from a FAIR viewpoint than the data.

What Is The Difference Between FAIR Data And Open Data?

Although FAIR data and Open data may occasionally overlap, they are not always the same thing. While open data stresses the free and open access of data to the public, FAIR data concentrates on the technical elements of data management and interoperability.

Data can be open without being FAIR, for example a public file with no identifier, metadata, or license, and FAIR without being open, for example well-described data that is available only on request. Data can also be both FAIR and open, in which case it is both technically well managed and publicly accessible.

Why Are The FAIR Data Principles Important?

The FAIR Data principles are a collection of guidelines designed to ensure that data is described, stored, and shared in a way that lets both people and machines find, access, combine, and reuse it. They are crucial because well-described, well-managed data makes research more transparent and reproducible. They also support the development of trust between organizations and their stakeholders, such as clients, staff members, and the larger community.

Note that FAIR is an acronym, not a statement about fairness in the ethical sense. The principles do not by themselves regulate privacy, consent, or legitimate purpose; those remain matters for ethics review and data protection law. What the principles do call for is clear, ideally machine-readable, information about access conditions and licenses, so that data can be shared responsibly and reused appropriately for research and innovation.

Overall, the FAIR Data principles are crucial because they make data easier to find, access, integrate, and reuse, which maximizes its value.

FAIR Data Principles Explained

Here are the four FAIR data principles that you should know:

Findable

This means that the data can be found by both people and machines. One way to achieve this is to make relevant, machine-actionable metadata and keywords available to search engines and research data catalogs.

The metadata include the identifier of the data they describe, and the data are assigned globally unique and persistent identifiers (such as DOIs or Handles).
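As a minimal sketch of what machine-actionable metadata with a persistent identifier could look like, the snippet below builds a schema.org Dataset description as JSON-LD. The DOI, title, and keywords are hypothetical placeholders, and schema.org is only one of several vocabularies a catalog might expect.

```python
import json

def build_dataset_metadata(doi, title, keywords):
    """Return a schema.org Dataset description as a JSON-LD dictionary."""
    doi_url = f"https://doi.org/{doi}"
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        # The persistent identifier of the data is part of the metadata itself.
        "@id": doi_url,
        "identifier": doi_url,
        "name": title,
        "keywords": list(keywords),
    }

# Hypothetical example values; this is not a registered DOI.
record = build_dataset_metadata(
    doi="10.1234/example.5678",
    title="Example microscopy image set",
    keywords=["microscopy", "cell imaging"],
)
print(json.dumps(record, indent=2))
```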

Accessible

This means that the data can be retrieved using standard technical procedures and have been archived in long-term storage. Information on whether and how the data can be obtained must also be made available.

This does not imply that the data must be publicly accessible to anyone. Data can be labeled, for instance, "Access only with specific permission from the author" and include the author's contact information.

Ideally, the access information can also be read by machines, for instance through machine-readable standard licenses.
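One way to picture machine-readable access information is a small structured record instead of a free-text note. The sketch below is an assumption-laden illustration: the access levels, the AccessInfo class, and the contact address are all hypothetical, and real repositories typically use an established access-rights vocabulary.

```python
from dataclasses import dataclass, asdict
from typing import Optional

# Hypothetical access levels, for illustration only.
ACCESS_LEVELS = {"open", "restricted", "closed"}

@dataclass
class AccessInfo:
    level: str                      # one of ACCESS_LEVELS
    license_url: Optional[str]      # machine-readable license, if any
    contact: Optional[str] = None   # who to ask when access is restricted

    def __post_init__(self):
        if self.level not in ACCESS_LEVELS:
            raise ValueError(f"unknown access level: {self.level}")
        if self.level == "restricted" and not self.contact:
            raise ValueError("restricted data needs a contact for requests")

# "Access only with specific permission from the author", in structured form.
restricted = AccessInfo(
    level="restricted",
    license_url=None,
    contact="author@example.org",
)
print(asdict(restricted))
```

The validation step reflects the idea in the text: restricted data is still Accessible in the FAIR sense as long as the conditions and a route to request access are stated.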

Interoperable

This means that the data can be shared and used across many applications and systems, now and in the future, for example through the adoption of open file formats. It also means that the data can be combined with data from the same research field or from other fields of study.

This is made possible by using controlled vocabularies, standard ontologies, and metadata standards, and by linking the data to associated digital research objects.
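To make the idea of a controlled vocabulary concrete, the sketch below maps free-text species labels to NCBI Taxonomy term IRIs. The lookup table and function name are hypothetical and cover only two species; a real pipeline would draw on the full vocabulary.

```python
# Free-text labels mapped to controlled-vocabulary IRIs (NCBI Taxonomy).
SPECIES_TERMS = {
    "human": "http://purl.obolibrary.org/obo/NCBITaxon_9606",
    "homo sapiens": "http://purl.obolibrary.org/obo/NCBITaxon_9606",
    "mouse": "http://purl.obolibrary.org/obo/NCBITaxon_10090",
    "mus musculus": "http://purl.obolibrary.org/obo/NCBITaxon_10090",
}

def normalize_species(label):
    """Return the controlled term for a free-text label, or None if unmapped."""
    return SPECIES_TERMS.get(label.strip().lower())

raw_values = ["Human", " homo sapiens ", "Mouse", "zebrafish"]
normalized = [normalize_species(value) for value in raw_values]

# Values without a controlled term are reported rather than guessed.
unmapped = [v for v, term in zip(raw_values, normalized) if term is None]
print(normalized)
print("unmapped:", unmapped)
```

Two datasets that both record the same IRI can be merged on that column even if their original spreadsheets used different spellings.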

Reusable

This means that the data are well documented, carefully managed, and accompanied by significant detail on how they were created. The data should follow community standards and carry explicit conditions for access and reuse, preferably through machine-readable standard licenses.

This enables others either to evaluate and confirm the findings of the original research, supporting reproducibility, or to build new projects on those findings, which is data reuse in the strictest sense.

FAIR-ification Process

A quick overview of the FAIRification process follows:

Retrieve Data To Be FAIRified

First, identify and gain access to the data to be FAIRified, and determine its intended use. This happens during the pre-FAIRification stage of the workflow.

Analyze The Retrieved Data

Analyze the data's content to see which concepts are represented. What kind of data structure is there? What connections exist between the data elements? Different data layouts and formats require different techniques of identification and analysis.

If the dataset is stored in a relational database, for instance, the relational schema offers details about the dataset's structure: the field names and types involved, cardinality, and so on.
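For the relational case, the schema can be read programmatically. The sketch below uses Python's built-in sqlite3 module on an in-memory database; the patient and sample tables are hypothetical stand-ins for a real dataset, and other database engines expose the same information through their own catalog views.

```python
import sqlite3

# Hypothetical two-table dataset, created in memory for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patient (id INTEGER PRIMARY KEY, birth_year INTEGER);
    CREATE TABLE sample (
        id INTEGER PRIMARY KEY,
        patient_id INTEGER NOT NULL REFERENCES patient(id),
        tissue TEXT
    );
""")

def describe_schema(connection):
    """Return {table: [(column, declared_type), ...]} for every table."""
    tables = [row[0] for row in connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    schema = {}
    for table in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = connection.execute(f"PRAGMA table_info({table})").fetchall()
        schema[table] = [(col[1], col[2]) for col in columns]
    return schema

schema = describe_schema(conn)

# Foreign keys reveal the relations between data elements.
# PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
links = [(fk[3], fk[2], fk[4])
         for fk in conn.execute("PRAGMA foreign_key_list(sample)")]

print(schema)
print(links)
```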

Define A Semantic Model

Establish a "semantic model" for the dataset that explains the meaning of the elements and relations in the dataset precisely, unambiguously, and in a form that machines can use. Semantic models frequently reuse terms from ontologies and vocabularies that already exist.

Depending on the dataset, developing a suitable semantic model may take a lot of work, even for seasoned data modelers. A good semantic model should represent the general consensus for a specific objective in a specific area, so it is wise to look for pre-existing models first.
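In its simplest form, a semantic model can be written down as a mapping from dataset columns to vocabulary terms. The sketch below is a toy example under stated assumptions: the example.org namespace, the column names, and the helper function are hypothetical, and a real model would reuse terms from an existing community ontology wherever one exists.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"
EX = "http://example.org/vocab/"  # hypothetical namespace for illustration

# Each column is given a predicate (its meaning) and a datatype.
SEMANTIC_MODEL = {
    "class": EX + "Sample",
    "subject_column": "id",
    "properties": {
        "tissue": {"predicate": EX + "tissue", "datatype": XSD + "string"},
        "collected_year": {
            "predicate": EX + "collectedYear",
            "datatype": XSD + "gYear",
        },
    },
}

def unmodeled_columns(columns, model):
    """List dataset columns that the semantic model does not yet explain."""
    covered = set(model["properties"]) | {model["subject_column"]}
    return sorted(set(columns) - covered)

dataset_columns = ["id", "tissue", "collected_year", "notes"]
print(unmodeled_columns(dataset_columns, SEMANTIC_MODEL))
```

Checking coverage this way flags columns whose meaning has not been pinned down before any conversion is attempted.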

Transform Data Into Linkable Data

By applying the semantic model defined in the previous step with Semantic Web and Linked Data technologies, the non-FAIR data can be converted into linkable data. This promotes interoperability and reuse, making it easier to integrate the data with other types of data and systems.
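The conversion into linkable data can be sketched with nothing but the standard library by writing rows out as RDF in N-Triples syntax. The example.org namespace, the row layout, and the function names below are hypothetical; in practice a dedicated RDF library and an agreed semantic model would normally be used.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
BASE = "http://example.org/"  # hypothetical namespace for illustration

def escape_literal(value):
    """Escape the characters that N-Triples does not allow raw in literals."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))

def row_to_ntriples(row):
    """Turn one tabular row into N-Triples statements about a Sample."""
    subject = f"<{BASE}sample/{row['id']}>"
    tissue = escape_literal(str(row["tissue"]))
    return [
        f"{subject} <{RDF_TYPE}> <{BASE}vocab/Sample> .",
        f'{subject} <{BASE}vocab/tissue> "{tissue}"^^<{XSD_STRING}> .',
    ]

rows = [{"id": 1, "tissue": "liver"}, {"id": 2, "tissue": 'bone "marrow"'}]
triples = [line for row in rows for line in row_to_ntriples(row)]
print("\n".join(triples))
```

Because every row now has its own IRI, other datasets can point at individual samples, which is what makes the data linkable.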

Assign A License

Make sure that license information for the data is provided, ideally in machine-readable form; without an explicit license, data reuse may be hindered.
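A license becomes machine-readable when it is recorded as a standard URL rather than a sentence in a README. The sketch below attaches a Creative Commons license URL to a metadata record keyed by its SPDX identifier; the lookup table is deliberately tiny and the function name is hypothetical.

```python
# SPDX identifiers mapped to canonical license URLs (small illustrative subset).
KNOWN_LICENSES = {
    "CC-BY-4.0": "https://creativecommons.org/licenses/by/4.0/",
    "CC0-1.0": "https://creativecommons.org/publicdomain/zero/1.0/",
}

def assign_license(metadata, spdx_id):
    """Return a copy of the metadata with a machine-readable license URL."""
    try:
        license_url = KNOWN_LICENSES[spdx_id]
    except KeyError:
        raise ValueError(f"no license URL known for {spdx_id!r}") from None
    updated = dict(metadata)  # leave the caller's record untouched
    updated["license"] = license_url
    return updated

dataset = {"@type": "Dataset", "name": "Example dataset"}
licensed = assign_license(dataset, "CC-BY-4.0")
print(licensed["license"])
```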

Define Metadata For The Dataset

Describe the data with appropriate and rich metadata, since metadata supports every aspect of FAIR.
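A simple completeness check can catch thin metadata before publication. The required fields below are an assumption chosen for illustration, not a list taken from the FAIR principles or any particular repository; each community standard defines its own.

```python
# Hypothetical minimum set of fields; adjust to the relevant metadata standard.
REQUIRED_FIELDS = ("identifier", "name", "description", "creator", "license")

def missing_fields(metadata):
    """Return required fields that are absent or contain only whitespace."""
    return [
        field for field in REQUIRED_FIELDS
        if not str(metadata.get(field) or "").strip()
    ]

draft = {
    "identifier": "https://doi.org/10.1234/example.5678",  # not a real DOI
    "name": "Example dataset",
    "description": "  ",
    "license": "https://creativecommons.org/licenses/by/4.0/",
}
gaps = missing_fields(draft)
print("missing:", gaps)
```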

Publish The FAIRified Data

Publish or deploy the FAIRified data together with its metadata and license, so that search engines can index the metadata and users can access the data, even when authentication and authorization are required.
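One common way to make metadata indexable is to embed it as JSON-LD in the dataset's landing page. The sketch below generates such a page; the record contents and the function name are hypothetical, and the page is reduced to the bare minimum needed to show the embedding.

```python
import html
import json

SCRIPT_OPEN = '<script type="application/ld+json">'

def landing_page(metadata):
    """Render a minimal HTML landing page with the metadata embedded as JSON-LD."""
    # Escape "</" so the JSON can never close the script element early.
    payload = json.dumps(metadata, indent=2).replace("</", "<\\/")
    title = html.escape(metadata["name"])
    return (
        "<!DOCTYPE html>\n<html>\n<head>\n"
        f"<title>{title}</title>\n"
        f"{SCRIPT_OPEN}\n{payload}\n</script>\n"
        "</head>\n"
        f"<body><h1>{title}</h1></body>\n</html>\n"
    )

example = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Example dataset",
    "license": "https://creativecommons.org/licenses/by/4.0/",
}
page = landing_page(example)
print(page)
```

The data files themselves can still sit behind authentication; only the descriptive page needs to be publicly crawlable.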

Why FAIRify Data?

FAIRifying data at all levels of an organization is urgent because it brings the following benefits:

Reduces Expenses

According to an EU cost-benefit analysis, the lack of FAIR data costs the European economy an estimated €10.2 billion annually [3]. FAIRification is therefore critical to lowering the expenses and risks associated with data discovery, increasing benefits, and producing a long-term return on investment (ROI).

Increases Strategic Value Addition

Strategic value addition, in the context of the FAIR Data principles, refers to the ways in which making data FAIR increases its value to an organization: the data becomes more discoverable, accessible, interoperable, and reusable. This can lead to increased efficiency, innovation, and impact for the organization.

Making data FAIR, for instance, might make it easier for an organization to share that data with other organizations, which could result in new partnerships and collaborations. Additionally, by making data FAIR, the company could be better able to use it for research and analysis, which might result in new insights and possibilities.

Minimizes Data Wrangling

Using FAIRified data could reduce the time, cost, and effort needed to gather, choose, clean, and transform raw data into high-quality, standardized analysis-ready formats.

Accelerates Scientific Discoveries

Accelerating discovery requires a cultural change at work, in which "my data" is replaced with "the corporation's valued asset" and FAIR data sharing across multiple beneficiaries is encouraged. When data can be queried flexibly and ad hoc, answers to scientific questions come more quickly.

FAQs

Who created FAIR data principles?

A group of researchers and other stakeholders, including Mark D. Wilkinson, Michel Dumontier, and Carole Goble, developed the FAIR Data principles. The first publication describing the principles appeared in the journal Scientific Data in 2016.

Is FAIR data in conflict with the GDPR?

No. In general, the FAIR principles and the GDPR are not in conflict with each other. FAIR does not require data to be open: under the Accessible principle, personal data can sit behind authentication and authorization as long as the access conditions are clearly described.

The GDPR requires organizations to process personal data lawfully, fairly, and transparently. Following the FAIR principles does not by itself make an organization GDPR-compliant, but well-documented data with clear metadata, access conditions, and licenses makes it easier to show what data is held and how it may be used.

Final Thoughts: A FAIR Future

The FAIR data principles are a collection of rules designed to improve the findability, accessibility, interoperability, and reuse of data. These guidelines are intended to assist researchers, scientists, and other data professionals in managing and disseminating data more effectively while maximizing its value and impact.

Following the FAIR principles helps ensure that data is documented, preserved, and shared transparently and responsibly. By adhering to them, researchers can encourage the sharing and reuse of data, which can in turn stimulate innovation, discovery, and scientific advancement.

If you’re looking to enhance your company’s offerings and strategy, we urge you to contact us at 20/15 Visioneers. With a Free 30 Minute Consultation, we’ll discuss best practices and help you map out the depth of your issues in a collaborative meeting.

Sources:

[1] Why is fair data important in 2022? EUDAT. (2022, May 4). Retrieved from https://eudat.eu/news/why-is-fair-data-important-in-2022
[2] Wilkinson, M. D., Dumontier, M., Aalbersberg, I. J. J., Appleton, G., Axton, M., Baak, A., Blomberg, N., Boiten, J.-W., da Silva Santos, L. B., Bourne, P. E., Bouwman, J., Brookes, A. J., Clark, T., Crosas, M., Dillo, I., Dumon, O., Edmunds, S., Evelo, C. T., Finkers, R., … Mons, B. (2016, March 15). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3, 160018. Retrieved from https://www.nature.com/articles/sdata201618
[3] Directorate-General for Research and Innovation (European Commission), & PwC EU Services. (2019, January 16). Cost-benefit analysis for FAIR research data: Cost of not having FAIR research data. Publications Office of the European Union. Retrieved from https://op.europa.eu/en/publication-detail/-/publication/d375368c-1a0a-11e9-8d04-01aa75ed71a1