Filtering data by quality

The Finnish Biodiversity Information Facility (FinBIF) compiles datasets from many sources, including government, professional researchers and citizen scientists. Data accuracy varies significantly within and between datasets, and not all data are suitable for every application.

Data compiled by FinBIF sometimes include annotations that add extra indications of quality. Annotations add to the original data but do not replace the information originally supplied. FinBIF does not remove or hide data based on annotations. Instead, it provides the means for data users to filter on annotations so that they can find the data most appropriate for their intended purpose.

Using annotations, data can be flagged as, for example:

  • “check identification” if the identification is uncertain. An expert may classify the observation as confirmed or uncertain using an image or description attached to the observation.
  • “check location” if the location does not correspond to the known distribution of the species. The observer may be notified and might correct any errors.

Filtering observations using annotations

Data annotation can change a record’s taxonomic identification, and by default a record will use the last taxonomic identification it was given. However, if you want to search for occurrence records based on the identification provided by the original source, uncheck “Corrected identifications” under the “Species/Taxa” subheading (with “Advanced” search engaged).
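The default behaviour described above can be pictured with a small sketch. This is an illustration only, not FinBIF's implementation: the record layout (`original_identification`, `annotations`) and the function name `effective_identification` are hypothetical.

```python
def effective_identification(record, use_corrected=True):
    """Return the taxon name a search would match for this record.

    Hypothetical record layout (an assumption for illustration):
      record["original_identification"]: name supplied by the original source
      record["annotations"]: list of dicts in chronological order, each with
                             an optional "identification" key
    """
    if use_corrected:
        # Walk annotations from newest to oldest and use the latest one
        # that actually supplies an identification.
        for annotation in reversed(record.get("annotations", [])):
            if annotation.get("identification"):
                return annotation["identification"]
    # No correcting annotation exists, or corrections are switched off.
    return record["original_identification"]


record = {
    "original_identification": "Parus major",
    "annotations": [
        {"identification": None, "tag": "check identification"},
        {"identification": "Cyanistes caeruleus"},
    ],
}

print(effective_identification(record))                       # Cyanistes caeruleus
print(effective_identification(record, use_corrected=False))  # Parus major
```

Passing `use_corrected=False` plays the role of unchecking “Corrected identifications”: the original name is always kept alongside any later annotation, so either can be searched.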

Quality filters

Under the “Quality” subheading there are several quality-based filters that allow you to filter your results (more quality filters are available if you engage “Advanced” searching).

Observation Reliability

Observation reliability is based on the information from the original source and laji.fi annotations.

  • Expert verified: The record has been verified by an expert.
  • Community verified: The record has been verified by non-experts.
  • Unassessed: No reliability assessment. This is the default for all records.
  • Uncertain: The record has been flagged as uncertain by an expert, or the observation has been marked as uncertain by the original source.
  • Erroneous: The finding has been flagged as incorrect by an expert. Incorrect findings are not completely removed; this prevents them from being re-added to the database.
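For users working with downloaded records, the five reliability categories above could be applied as a simple filter. The following is a minimal sketch: the `Reliability` enum, the `reliability` key and the function name are assumptions made for illustration, not the field names used by laji.fi.

```python
from enum import Enum


class Reliability(Enum):
    # Categories taken from the list above; the string values are illustrative.
    EXPERT_VERIFIED = "expert verified"
    COMMUNITY_VERIFIED = "community verified"
    UNASSESSED = "unassessed"
    UNCERTAIN = "uncertain"
    ERRONEOUS = "erroneous"


def filter_by_reliability(records, accepted):
    """Keep only records whose reliability is one of the accepted categories.

    Records without a reliability value are treated as unassessed,
    mirroring the default described above.
    """
    accepted = set(accepted)
    return [
        r for r in records
        if r.get("reliability", Reliability.UNASSESSED) in accepted
    ]


records = [
    {"id": 1, "reliability": Reliability.EXPERT_VERIFIED},
    {"id": 2},  # no assessment recorded
    {"id": 3, "reliability": Reliability.ERRONEOUS},
]

kept = filter_by_reliability(
    records,
    {Reliability.EXPERT_VERIFIED, Reliability.COMMUNITY_VERIFIED, Reliability.UNASSESSED},
)
print([r["id"] for r in kept])  # [1, 2]
```

Which categories to accept depends on the application: a strict analysis might keep only expert-verified records, while an exploratory one might exclude only erroneous ones.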

Requires Review

Records flagged as “requires review” can be considered potentially uncertain or erroneous, and in some cases the data user may want to filter them out.

Dataset Origin

Data comes from three broad sources:

  • Professionals: people employed as experts in their taxa of interest.
    • e.g., people collecting data for research and monitoring projects or museum collections.
  • Specialists: non-professionals recognized for their expertise.
    • e.g., people collecting data for specialist volunteer monitoring projects.
  • Citizen Scientists: the wider community.
    • e.g., people submitting records to general nature observation systems such as iNaturalist.

Note that the origin of a dataset is not completely indicative of the quality of its records. Observations made by citizen scientists are often accurate, while even professionally collected data may contain errors.

Records in many datasets (e.g., iNaturalist) are frequently annotated, so their data quality improves over time.

Laji.fi quality control tags

By using this filter, observations can be filtered using tags assigned in the process of annotation (e.g. “check identification” or “check coordinates”).

Quality issues

FinBIF subjects all new records to some automated data validations. If errors are found during automated validation, records are flagged as having “quality issues”. Examples of automated validations include:

  • the coordinates match the municipality (for Finnish observations).
  • the date is valid and later than the year 1599.
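The date validation listed above could look roughly like the following. This is a sketch under assumptions, not FinBIF's actual validation code: the function name is hypothetical, and “later than 1599” is read here as the observation year being 1600 or later.

```python
from datetime import date

EARLIEST_YEAR = 1599  # dates must be later than this year (assumed reading of the rule)


def date_has_quality_issue(year, month, day):
    """Return True if the date would be flagged under the rule above."""
    try:
        # date() raises ValueError for impossible dates such as 30 February.
        observed = date(year, month, day)
    except ValueError:
        return True
    # A real calendar date is still flagged if it is not later than 1599.
    return observed.year <= EARLIEST_YEAR


print(date_has_quality_issue(2020, 2, 30))   # True: not a real calendar date
print(date_has_quality_issue(1599, 12, 31))  # True: not later than 1599
print(date_has_quality_issue(1600, 1, 1))    # False
```

The coordinate check is not sketched here because it depends on municipality boundary data that this page does not describe.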