1 What is data management?
Effective data management involves properly documenting, storing, and sharing data and the information derived from them. If the data aren’t usable by researchers, policymakers, or growers, then the time and effort spent collecting and analyzing the samples may be wasted.
The guidelines in this DMP help us achieve our data-driven goals and optimize the value of the data by supporting information sharing and innovation. Our data management policies aim to implement the FAIR (Findable, Accessible, Interoperable, Reusable) principles (Wilkinson et al. 2016) while maintaining data privacy.
1.1 Data life cycle
This graphic explains the data life cycle (U.S. Fish & Wildlife Service 2023), in which each step requires care to ensure transparency, quality, and integrity.
Our adaptation is outlined below, and the following chapters detail the internal processes and standards to follow at each step of the data life cycle.
Plan
Planning includes decisions about data acquisition, management, and quality control, as well as regular reviews of how to improve them. For example, each year we provide an updated spreadsheet template to the Soiltest lab to ensure that measurements are reported with the correct units and in the correct format. Special projects that deviate from our standard operating procedures require additional planning.
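A template is most useful when returned files can be checked against it automatically. The sketch below shows one way such a check might look in Python. The column names, the units, and the assumption that the second row of the file holds units are all hypothetical and do not describe our actual template.

```python
import csv
import io

# Hypothetical template: column name -> required unit (None means unitless).
# These names and units are illustrative only.
EXPECTED_UNITS = {
    "sample_id": None,
    "ph": None,
    "organic_matter": "%",
    "nitrate_n": "mg/kg",
}


def check_template(csv_text: str) -> list[str]:
    """Return a list of problems found in a lab results file.

    Assumes row 1 holds column names and row 2 holds the unit of each column.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if len(rows) < 2:
        return ["file must contain a header row and a units row"]
    header, units = rows[0], rows[1]
    reported = dict(zip(header, units))
    problems = []
    for column, unit in EXPECTED_UNITS.items():
        if column not in reported:
            problems.append(f"missing column: {column}")
        elif unit is not None and reported[column].strip() != unit:
            problems.append(
                f"{column}: expected unit {unit!r}, got {reported[column].strip()!r}"
            )
    return problems
```

An empty list means the file matches the template. Any other result lists what to raise with the lab before the data enter our workflow.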
Acquire
We acquire data by collecting and analyzing new samples, deriving new insights from existing samples, or accepting datasets from collaborators.
Maintain
Maintenance involves processing data for aggregation, analyses, and reporting. We create metadata that facilitates interpretation of the data. We also store a copy of our data in a format that is accessible to our collaborators and future selves.
Access
Access refers to data storage, publication, and security. Raw and processed data with accompanying metadata should be stored, backed up, and available for information sharing with our partners. With PI approval, anonymized and aggregated data that do not compromise growers’ personally identifiable information can be made publicly available in a data repository or through a data product or decision-support tool.
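As an illustration of what "anonymized and aggregated" can mean in practice, the sketch below groups sample records by a coarse key, reports only group-level summaries, and drops groups that are too small to share safely. The field names and the minimum group size are assumptions chosen for the example. They are not a statement of our release policy, which remains subject to PI approval.

```python
from collections import defaultdict
from statistics import mean


def aggregate_for_release(records, group_key, value_key, min_group_size=5):
    """Summarize value_key by group_key, omitting small groups.

    Only the group label, count, and mean are returned, so grower-level
    fields in the input records never reach the output. The default
    min_group_size is an illustrative threshold, not an official rule.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record[group_key]].append(record[value_key])
    return {
        group: {"n": len(values), "mean": round(mean(values), 2)}
        for group, values in groups.items()
        if len(values) >= min_group_size
    }
```

Suppressing small groups matters because a mean computed from one or two fields can still point back to an individual grower.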
Evaluate
We evaluate data while processing and analyzing them to maximize accuracy and productivity and to minimize the costs of errors and tedious data cleaning. Evaluation workflows should be efficient, well-documented, and reproducible. Our evaluated data help us better understand how environmental factors and management decisions impact soil health.
Archive
Properly archiving our results supports the long-term storage and usefulness of our data. While similar to the Access element of the life cycle, archiving focuses on preserving data for historical and long-term access. For example, we archive each year’s raw data for long-term storage and set those files to Read-Only.
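The read-only step above can be scripted so that it is applied the same way every year. The following is a minimal sketch using only the Python standard library. The function name and folder layout are hypothetical, and on Windows the permission change only sets the file's read-only flag.

```python
import shutil
import stat
from pathlib import Path


def archive_raw_file(source: Path, archive_dir: Path) -> Path:
    """Copy a raw data file into archive_dir and mark the copy read-only."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    destination = archive_dir / source.name
    if destination.exists():
        # Refuse to overwrite: an archived file should never change.
        raise FileExistsError(f"{destination} is already archived")
    shutil.copy2(source, destination)  # copy2 also preserves timestamps
    # Grant read permission only, for owner, group, and others.
    destination.chmod(stat.S_IREAD | stat.S_IRGRP | stat.S_IROTH)
    return destination
```

Refusing to overwrite an existing archive copy is a deliberate choice in this sketch: a correction should be saved as a new, clearly named file rather than replacing the archived original.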
Quality Assurance / Quality Control (QA/QC)
Data quality management prevents data defects that hinder our ability to apply data towards our science-based conservation efforts. Defects include incorrectly entered data, invalid data, and missing or lost data. QA/QC processes should be incorporated in every element of the data life cycle.
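Several of the defects named above can be caught with simple automated checks. The sketch below flags missing, non-numeric, and out-of-range values in a list of records. The field name and the range used in the usage note are examples only, and the function is a hypothetical helper rather than part of our established workflow.

```python
def find_defects(rows, valid_ranges):
    """Return (row_index, field, problem) tuples for suspect values.

    rows is a list of dicts, for example from csv.DictReader.
    valid_ranges maps a field name to an inclusive (low, high) pair.
    """
    defects = []
    for index, row in enumerate(rows):
        for field, (low, high) in valid_ranges.items():
            raw = row.get(field)
            if raw is None or str(raw).strip() == "":
                defects.append((index, field, "missing"))
                continue
            try:
                value = float(raw)
            except (TypeError, ValueError):
                defects.append((index, field, "not numeric"))
                continue
            if not low <= value <= high:
                defects.append((index, field, "out of range"))
    return defects
```

For example, calling find_defects(rows, {"ph": (0, 14)}) would flag a pH entered as 65 instead of 6.5. A range check of this kind catches many data entry errors, but it cannot detect a plausible value that is nonetheless wrong, so it complements manual review rather than replacing it.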