Yield Calibration & Cleaning
A tutorial on cleaning and calibrating harvesting data, either together or as separate operations. It handles complex cases as well: multiple harvesters, no harvester ID, etc.
Yield data holds immense potential for farmers, but that potential is realized only when the data is accurate. The "Yield Calibration" module refines the raw Yield Dataset by applying mathematical corrections that improve its quality. The end result is a more robust dataset that is ready for in-depth, insightful analyses.
This calibration process is instrumental in:
Ensuring Data Consistency: It's not uncommon for multiple harvesters to work in tandem or across different days. Calibration brings their readings into agreement with one another.
Homogenizing Data: Yield data can be varied; the calibration ensures it is smooth and consistent, without unwanted spikes or drops.
Filtering Out Noise: Like any data, yield data can have its share of 'noise' or irrelevant info. We make sure it doesn't muddy your insights.
Streamlining Geometries: Any turnarounds or odd geometric data patterns can skew real insights. The calibration is designed to iron these out, ensuring the data truly mirrors field realities.
Cropping by Field Boundary: Harvesters often operate across adjacent areas. For accurate analytical results, it's essential to consider only the data situated within the specified boundary.
The Yield Calibration interface leverages the associated GeoPard API (LINK), specifically integrating with the CALIBRATE and CLEAN operations. This functionality is accessible through the GeoPard User Interface and can be programmatically invoked via the GeoPard API.
In the realm of agriculture, corrupted yield datasets can pose significant challenges. Below, you can find real-world examples where such datasets were encountered. Through GeoPard's advanced calibration and cleaning algorithms, these datasets were effectively refined and optimized.
To address areas lacking logged Yield Data and achieve completeness of the Yield Map, consider utilizing the GeoPard Synthetic Yield Map approach. This method effectively restores missing data, ensuring a comprehensive yield analysis. Learn more about this technique HERE.
When dealing with complex scenarios, a two-step calibration process is recommended for optimal accuracy. Begin by running the initial calibration using the Machine ID attribute. Following that, proceed with a second round of calibration, this time utilizing the Simulated (Synthetic) Machine Paths tickbox. This layered approach ensures a thorough and precise calibration, essential for managing intricate cases effectively.
The Yield Dataset occasionally includes attributes with irregularities in Moisture, Speed, Elevations, or other secondary (non-yield) attributes. During the execution of Clean or Calibrate activities, it is essential to disregard these anomalies. This can be efficiently achieved using the GeoPard Yield Clean-Calibrate interface.
The "Yield Calibrate and Clean" module is initiated directly from the User Interface. The primary requirement is to have an uploaded Yield Dataset. Adjacent to each Yield Dataset, you'll find a button to commence the dataset adjustments.
From there, several options are available for proceeding:
Auto-Processing: Use the default, GeoPard-recommended settings for a one-click calibration.
Clean Only: Configure and execute only the CLEAN operation.
Calibrate & Clean: Choose the sequence of operations and customize the parameters.
Calibrate Only: Configure and execute just the CALIBRATE operation.
The Auto-Processing option is the preferred way to apply GeoPard's recommended settings for calibrating and cleaning the Yield Dataset. The configuration nevertheless remains open for review and modification.
Key configuration parameters include:
Choosing an attribute as a basis for calibration, typically represented by machines operating in the field or a timestamp.
Identifying attributes to be calibrated.
Selecting an attribute that represents target yield values for cleaning.
Hint for Abnormal Values Sometimes Inherent to Yield Datasets.
If an attribute selected for calibration or cleaning predominantly contains zero values across the majority of geometries, these geometries will be excluded from the final Yield Dataset.
To ensure integrity, attributes with such anomalies should be excluded from the list of attributes to be calibrated (2).
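The check described in this hint can be approximated before a run. The following is a minimal sketch, not GeoPard's implementation: the function names, the record layout (one dictionary per geometry), and the 50% cutoff used for "predominantly zero" are all assumptions made for illustration.

```python
def zero_share(values):
    """Return the fraction of values that are zero or missing."""
    if not values:
        return 0.0
    zeros = sum(1 for v in values if v is None or v == 0)
    return zeros / len(values)


def attributes_to_skip(records, attributes, max_zero_share=0.5):
    """List attributes whose zero share exceeds the (assumed) cutoff."""
    skip = []
    for name in attributes:
        column = [rec.get(name) for rec in records]
        if zero_share(column) > max_zero_share:
            skip.append(name)
    return skip


# Hypothetical sample: Moisture is zero for three of four geometries.
records = [
    {"WetMass": 9.8, "Moisture": 0.0},
    {"WetMass": 10.4, "Moisture": 0.0},
    {"WetMass": 10.1, "Moisture": 14.2},
    {"WetMass": 9.9, "Moisture": 0.0},
]
suspect = attributes_to_skip(records, ["WetMass", "Moisture"])
print(suspect)
```

Attributes flagged this way are candidates for removal from the list of attributes to be calibrated, so that their zero values do not cause geometries to be dropped.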
Once configured, click Run to apply the logic.
The processed results will be displayed alongside the original dataset, marked with Calibrated and/or Cleaned labels, accompanied by the version of the algorithm used.
From version 3.0 of the Clean/Calibrate algorithm onward, GeoPard introduces the Crop by Field Boundary feature. This keeps only geometries located within the Field Boundary and results in more accurate statistical data distribution.
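To illustrate what cropping by a boundary means, here is a minimal sketch using the standard ray-casting point-in-polygon test. It is not GeoPard's implementation; it assumes planar coordinates, a boundary given as a simple polygon without holes, and yield geometries reduced to single points. All names are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: count boundary edges crossed by a ray going right."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can cross.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def crop_by_boundary(points, polygon):
    """Keep only the points that fall inside the boundary polygon."""
    return [p for p in points if point_in_polygon(p["x"], p["y"], polygon)]


# Hypothetical square field and three yield points, one of them outside.
field = [(0, 0), (10, 0), (10, 10), (0, 10)]
points = [
    {"x": 5, "y": 5, "yield": 9.7},
    {"x": 12, "y": 5, "yield": 8.1},
    {"x": 2, "y": 8, "yield": 10.3},
]
kept = crop_by_boundary(points, field)
print([p["yield"] for p in kept])
```

Dropping the point logged in the adjacent area is what keeps the field statistics from being skewed by data that belongs to a neighboring field.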
Starting with version 4.0, the Clean/Calibrate algorithm in GeoPard now incorporates a feature for calibration based on Average or Total Values across any attribute. A prevalent application of this enhancement is the calibration of WetMass, which can now be adjusted by the known measured Average Yield for a specific Field.
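The idea behind calibrating to a known average or total can be sketched as a simple rescaling. This is an assumption about the general technique, not a description of GeoPard's algorithm: the functions below are hypothetical and use an unweighted mean, whereas a real field average would normally be weighted by the area of each geometry.

```python
def calibrate_to_average(values, target_average):
    """Scale values so that their unweighted mean equals target_average."""
    current = sum(values) / len(values)
    if current == 0:
        raise ValueError("cannot rescale values whose mean is zero")
    factor = target_average / current
    return [v * factor for v in values]


def calibrate_to_total(values, target_total):
    """Scale values so that their sum equals target_total."""
    current = sum(values)
    if current == 0:
        raise ValueError("cannot rescale values whose sum is zero")
    factor = target_total / current
    return [v * factor for v in values]


# Hypothetical WetMass readings with a logged mean of 10.0,
# adjusted to a measured field average of 9.0.
wet_mass = [8.0, 10.0, 12.0]
calibrated = calibrate_to_average(wet_mass, 9.0)
print(sum(calibrated) / len(calibrated))
```

Because every value is multiplied by the same factor, the spatial pattern of high and low areas is preserved while the overall level matches the measured figure.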
The Calibrate & Clean option offers comprehensive manual configuration of the calibration and cleaning processes. It's ideal for those seeking full control over the algorithm, since every operation is exposed in a transparent, white-box manner. The Calibrate Only and Clean Only alternatives are essentially individual components of the Calibrate & Clean process.
Specify the sequence you prefer: first "Calibrate", then "Clean" or the reverse.
Hint for Data Anomalies
If a user encounters anomalies in the data, such as values at or near zero, or unusually large values (for instance, an average of 10 with a maximum of 8000), the Clean & Calibration workflow is advised.
Prioritizing data Cleaning before Calibration ensures the removal of errors, missing values, or inconsistencies, thereby enhancing data quality and accuracy.
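To see why cleaning first helps in such a case, consider a minimal outlier filter based on the median absolute deviation (MAD). This is a generic statistical technique chosen for illustration, not the method GeoPard uses; the function name and the cutoff k=3.0 are assumptions.

```python
import statistics


def remove_outliers(values, k=3.0):
    """Drop values further than k scaled MADs from the median."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        # No spread to measure against; leave the data unchanged.
        return list(values)
    # 1.4826 makes the MAD comparable to a standard deviation
    # for normally distributed data.
    limit = k * 1.4826 * mad
    return [v for v in values if abs(v - med) <= limit]


# Hypothetical readings around 10 with one zero and one extreme spike.
readings = [9.5, 10.2, 0.0, 10.8, 9.9, 8000.0, 10.1]
cleaned = remove_outliers(readings)
print(cleaned)
```

A single spike such as 8000 would dominate any mean-based calibration factor, which is why removing it beforehand leads to a more trustworthy calibration.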
Hint for Data without Initial Errors
For datasets initially free from errors, missing values, or inconsistencies, and when multiple harvesters are known to be involved, consider the Calibration & Clean workflow.
Cleaning the data post-calibration helps to refine the dataset further by potentially eliminating any artifacts introduced during calibration.
For the Calibrate step, configuration parameters include:
A smoothing level to mitigate sudden fluctuations in values.
Choose a calibration type: Pathwise, Average/Total, or Conditional.
Attributes to calibrate.
The calibration basis attribute often relates to the machinery path in the field or timestamps. In the absence of real machinery paths, simulated paths can be utilized.
Option to manually input Average/Total or Conditional values.
Hint for Abnormal Values Sometimes Inherent to Yield Datasets.
If an attribute selected for calibration or cleaning predominantly contains zero values across the majority of geometries, these geometries will be excluded from the final Yield Dataset.
To ensure integrity, attributes with such anomalies should be excluded from the list of attributes to be calibrated (3).
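One way to picture calibration against a basis attribute such as a machine ID is mean matching: each machine's values are scaled so that its mean equals the mean of the whole dataset. The sketch below is an assumption made for illustration and does not reproduce GeoPard's Pathwise calibration; the function name and the MachineId and WetMass field names are hypothetical.

```python
from collections import defaultdict


def calibrate_by_group(records, group_key, attribute):
    """Scale each group's values so its mean matches the overall mean."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[group_key]].append(rec[attribute])

    all_values = [rec[attribute] for rec in records]
    overall_mean = sum(all_values) / len(all_values)

    factors = {}
    for group, values in groups.items():
        group_mean = sum(values) / len(values)
        # A group with a zero mean cannot be rescaled; leave it as is.
        factors[group] = overall_mean / group_mean if group_mean else 1.0

    # Return new records so the input list stays untouched.
    return [
        {**rec, attribute: rec[attribute] * factors[rec[group_key]]}
        for rec in records
    ]


# Hypothetical data: harvester A reads systematically higher than B.
records = [
    {"MachineId": "A", "WetMass": 10.0},
    {"MachineId": "A", "WetMass": 12.0},
    {"MachineId": "B", "WetMass": 7.0},
    {"MachineId": "B", "WetMass": 9.0},
]
calibrated = calibrate_by_group(records, "MachineId", "WetMass")
for rec in calibrated:
    print(rec["MachineId"], round(rec["WetMass"], 3))
```

When no machine ID is available, the same grouping could be applied to a timestamp-derived or simulated path identifier, as the basis attribute description above suggests.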
The Clean step's configuration involves:
Attributes representing target yield values.
Exclusion parameters that determine attributes exempt from the cleaning operation (optional).
Setting conditions to discard attributes based on min/max thresholds (optional).
Hint for Abnormal Values Sometimes Inherent to Yield Datasets.
If an attribute selected for calibration or cleaning predominantly contains zero values across a majority of geometries, these geometries will be excluded from the final Yield Dataset.
To ensure integrity, attributes with such anomalies should be excluded from the list of attributes to be cleaned (2).
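The optional min/max conditions can be pictured as a plain threshold filter. The sketch below assumes that a record failing any threshold is discarded as a whole; that interpretation, the function name, and the Yield and Speed field names are assumptions for illustration rather than GeoPard's actual behavior.

```python
def clean_by_thresholds(records, thresholds):
    """Keep records whose values fall within every (min, max) pair.

    thresholds maps an attribute name to a (min, max) tuple;
    either bound may be None to leave that side open.
    """
    kept = []
    for rec in records:
        ok = True
        for name, (low, high) in thresholds.items():
            value = rec.get(name)
            if value is None:
                ok = False
            elif low is not None and value < low:
                ok = False
            elif high is not None and value > high:
                ok = False
            if not ok:
                break
        if ok:
            kept.append(rec)
    return kept


# Hypothetical records: one with near-zero yield, one logged while
# the harvester was almost stationary.
records = [
    {"Yield": 9.8, "Speed": 5.1},
    {"Yield": 0.2, "Speed": 4.9},
    {"Yield": 10.4, "Speed": 0.3},
    {"Yield": 10.1, "Speed": 5.0},
]
limits = {"Yield": (1.0, 20.0), "Speed": (1.0, None)}
kept = clean_by_thresholds(records, limits)
print([r["Yield"] for r in kept])
```

Attributes that should not influence cleaning are simply left out of the thresholds mapping, which mirrors the optional exclusion parameters listed above.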
Click Run to initiate the process.
After processing, the outcomes are shown next to the original dataset, distinctly marked with "Calibrate" and/or "Clean" labels, as well as the algorithm version used.
From version 3.0 of the Clean/Calibrate algorithm onward, GeoPard introduces the Crop by Field Boundary feature. This keeps only geometries located within the Field Boundary and results in more accurate statistical data distribution.
Starting with version 4.0, the Clean/Calibrate algorithm in GeoPard now incorporates a feature for calibration based on Average or Total Values across any attribute. A prevalent application of this enhancement is the calibration of WetMass, which can now be adjusted by the known measured Average Yield for a specific Field.