Yield Calibration & Cleaning
This tutorial shows how to run cleaning and calibration of harvesting (yield) data, either together or separately. The workflow also handles complex cases such as multiple harvesters or a missing harvester ID.
Yield data holds immense potential for farmers, but that potential is realized only when the data is accurate. The "Yield Calibration" module refines the raw Yield Dataset, correcting it with mathematical methods to improve its quality. The result is a more robust dataset that is ready for in-depth analysis.
This calibration process is instrumental in:
Ensuring Data Consistency: It's not uncommon for multiple harvesters to work in tandem or across different days. This feature ensures that their data sings in harmony.
Homogenizing Data: Yield data can be varied; the calibration ensures it is smooth and consistent, without unwanted spikes or drops.
Filtering Out Noise: Like any data, yield data can have its share of 'noise' or irrelevant info. We make sure it doesn't muddy your insights.
Streamlining Geometries: Any turnarounds or odd geometric data patterns can skew real insights. The calibration is designed to iron these out, ensuring the data truly mirrors field realities.
Cropping by Field Boundary: Harvesters often operate across adjacent areas. For accurate analytical results, it's essential to consider only the data situated within the specified boundary.
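As an illustration of cropping by field boundary, here is a minimal Python sketch that keeps only the points falling inside a boundary polygon, using the classic ray-casting test. It is not GeoPard's implementation. The function names, the `(x, y, yield)` tuple layout, and the use of planar coordinates are assumptions made for the example.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test. `polygon` is a list of (x, y) vertices (assumed planar)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def crop_to_boundary(points, boundary):
    """Keep only (x, y, yield) records located inside the boundary polygon."""
    return [p for p in points if point_in_polygon(p[0], p[1], boundary)]


# Hypothetical square field and three logged points; only the first is inside.
boundary = [(0, 0), (10, 0), (10, 10), (0, 10)]
points = [(5, 5, 8.2), (12, 5, 7.9), (-1, -1, 3.0)]
cropped = crop_to_boundary(points, boundary)
```

Points that lie exactly on the boundary may be classified either way by this test, so a production workflow would normally rely on a dedicated geometry library.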
Quick Overview






Real-World Examples
Corrupted yield datasets are a common challenge in agriculture. The real-world examples below show such datasets and how GeoPard's calibration and cleaning algorithms refined them.
Multiple Harvesters Working Together



J-turns, Stops, Half Equipment Width Used


Abnormally Large Logged Values





Data Beyond Field Boundary

Calibration Using Provided Average Yield Value

Clean Yield Attributes Ignoring Attributes with Anomalies
The Yield Dataset occasionally includes attributes with irregularities in Moisture, Speed, Elevations, or other secondary (non-yield) attributes. During the execution of Clean or Calibrate activities, it is essential to disregard these anomalies. This can be efficiently achieved using the GeoPard Yield Clean-Calibrate interface.


USDA Clean Protocol


Explanation of Calibration Logics
Pathwise Calibration
USE Pathwise Calibration when a field is harvested by multiple machines or over several days, specifically to correct systematic differences like striping or banding. It is ideal for scenarios where varying machine settings, operators, or environmental conditions cause consistent over- or under-estimation across different paths.
Crucially, the AI requires variation - such as distinct paths, machine IDs, or harvest dates - to learn and calibrate effectively.

DO NOT USE this method for single-machine harvests in one continuous session or if the yield map lacks visible spatial patterns. Additionally, avoid it if the data is sparse or if you only possess total field-level yield values without machine-level differences.
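To make the idea of correcting systematic differences between paths or machines more concrete, the sketch below applies one multiplicative factor per path so that every path ends up with the same mean yield. This is a deliberately simplified stand-in, not GeoPard's AI-based Pathwise algorithm. The function name and the `(path_id, yield)` record layout are hypothetical. The sketch also assumes that each path covered a representative part of the field, which is rarely exactly true in practice.

```python
from collections import defaultdict
from statistics import mean


def pathwise_mean_match(records):
    """Scale each path so its mean yield equals the field-wide mean.

    `records` is a list of (path_id, yield_value) tuples (hypothetical layout).
    """
    by_path = defaultdict(list)
    for path_id, value in records:
        by_path[path_id].append(value)

    field_mean = mean(value for _, value in records)

    # One multiplicative factor per path; a path with zero mean is left unscaled.
    factors = {}
    for path_id, values in by_path.items():
        path_mean = mean(values)
        factors[path_id] = field_mean / path_mean if path_mean else 1.0

    return [(path_id, value * factors[path_id]) for path_id, value in records]


# Machine "A" reads consistently lower than machine "B" on similar ground.
records = [("A", 8.0), ("A", 10.0), ("B", 10.0), ("B", 12.5)]
calibrated = pathwise_mean_match(records)
```

With only one path or machine ID, the single factor equals 1 and nothing changes, which mirrors the requirement that the data contain variation to calibrate against.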

Average or Total Calibration
Average/Total Calibration IS BEST USED when you have a high level of confidence in your overall field-level yield data, such as records from a weighbridge or storage facility. Instead of adjusting individual paths, this method scales the entire dataset so that the final average or total matches your known reference value. It is often described as the simplest and safest calibration option when the overall numbers are trusted.
When to USE Average/Total Calibration:
Known Reference Values: You should use this logic when you have official total yield records (e.g., from a weighbridge) or a highly reliable average yield for the field.
Global Bias Correction: It is ideal if the spatial distribution in the yield map looks correct, but the values are globally shifted - meaning the yield monitor was likely uncalibrated and is reporting values that are consistently too high or too low across the entire field.
Uniform Harvest Conditions: This method is most effective when harvesting conditions were relatively consistent throughout the operation.
Single-Machine Consistency: It works well for harvests completed by a single machine that performed consistently across the field.

When NOT to USE Average/Total Calibration:
Machine-to-Machine Bias: Do not use this method if different parts of the field were harvested by different machines or on different days that resulted in localized biases. In these cases, scaling the whole field will not fix the underlying discrepancies between machines.
Visible Artifacts: If you see strong striping, banding, or directional artifacts in your data, this method will not resolve them; Pathwise calibration is better suited for those issues.
Incomplete Data: Avoid this logic if only a portion of the field was harvested or if the recorded data is incomplete, as the total/average values would be misleading.
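The scaling described above can be sketched in a few lines of Python: every value is multiplied by the same factor so that the dataset average matches the trusted reference. The function name and sample numbers are illustrative assumptions, not part of GeoPard.

```python
def scale_to_reference_average(yields, reference_average):
    """Scale all yield values by one factor so their mean equals the reference."""
    if not yields:
        raise ValueError("yields must not be empty")
    observed_average = sum(yields) / len(yields)
    if observed_average == 0:
        raise ValueError("cannot scale a dataset whose average is zero")
    factor = reference_average / observed_average
    return [y * factor for y in yields], factor


# The monitor averaged 5.0, but weighbridge records imply 6.0 (made-up numbers).
scaled, factor = scale_to_reference_average([4.0, 5.0, 6.0], 6.0)
```

Because a single factor is applied everywhere, the spatial pattern of the map is unchanged, which is why this approach cannot remove striping between machines. For a known total, the same idea applies with the factor computed as the reference total divided by the observed total, where the observed total depends on the area each record represents.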

Conditional Calibration
Conditional Calibration serves as a safety control by ensuring yield values remain within realistic, pre-defined minimum and maximum ranges.
You SHOULD USE this logic to remove extreme outliers and sensor spikes caused by noise, machine stoppages, or turns. It is ideal for applying specific agronomic expectations - such as "yield cannot exceed X" - without performing a correction.

However, AVOID THIS METHOD if your dataset has a global bias or systematic machine differences, as it does not scale data or fix spatial patterns. Essentially, it keeps values plausible but does not resolve underlying calibration offsets.
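A minimal Python sketch of min/max limits follows. This page does not state whether out-of-range values are dropped or pulled back to the nearest limit, so the sketch shows both behaviours behind a hypothetical `mode` parameter. The function name and thresholds are assumptions for illustration.

```python
def apply_yield_limits(yields, minimum, maximum, mode="drop"):
    """Keep yield values plausible using fixed minimum and maximum limits.

    mode="drop" discards out-of-range values.
    mode="clamp" replaces them with the nearest limit.
    """
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    if mode == "drop":
        return [y for y in yields if minimum <= y <= maximum]
    if mode == "clamp":
        return [min(max(y, minimum), maximum) for y in yields]
    raise ValueError("mode must be 'drop' or 'clamp'")


# A stop (0.0) and a sensor spike (8000.0) among otherwise plausible values.
raw = [0.0, 7.5, 9.1, 8000.0]
dropped = apply_yield_limits(raw, 1.0, 15.0, mode="drop")
clamped = apply_yield_limits(raw, 1.0, 15.0, mode="clamp")
```

Neither variant rescales the in-range values, which is why limits alone do not correct a global bias or machine-to-machine differences.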
First Step
The "Yield Calibrate and Clean" module is initiated directly from the User Interface. The primary requirement is to have an uploaded Yield Dataset. Adjacent to each Yield Dataset, you'll find a button to commence the dataset adjustments.


From there, several options are available for proceeding:
Auto-Processing: Use the default, GeoPard-recommended settings for a one-click calibration.
Clean Only: Configure and execute only the CLEAN operation, including:
GeoPard Cleaning: Smart Cleaning of Yield dataset with AI algorithms.
USDA (United States Department of Agriculture) Cleaning Protocol for yield.
Conditional Cleaning: Filter data based on custom attribute thresholds.
Calibrate Only: Configure and execute just the CALIBRATE operation, including:
Pathwise: Calibrate yield for each individual machine path using AI algorithms.
Average/Total: Adjust yield based on the field's known average or total yield.
Conditional: Modify yield within set minimum and maximum limits to maintain expected ranges.
Calibrate & Clean: Choose the sequence of operations and customize the parameters.
Auto-Processing
The Auto-Processing option is the preferred choice to utilize GeoPard's recommended settings for calibrating and cleaning the Yield Dataset. However, the configuration is always open for review and potential modifications.

Key configuration parameters include:
Identifying attributes to be calibrated.
Selecting an attribute that represents target yield values for cleaning.
Choosing an attribute as a basis for calibration, typically a machine identifier or a timestamp.
Hint for Abnormal Values Sometimes Inherent to Yield Datasets.
If an attribute selected for calibration or cleaning predominantly contains zero values across the majority of geometries, these geometries will be excluded from the final Yield Dataset.
To ensure integrity, attributes with such anomalies should be excluded from the list of attributes to be calibrated (2).
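Before selecting attributes, it can help to check which ones are mostly zero. The Python sketch below flags such attributes in an exported table. The row layout, the attribute names, and the 50% threshold are assumptions, since the exact cutoff GeoPard uses for "predominantly" is not stated here.

```python
def mostly_zero_attributes(rows, attributes, threshold=0.5):
    """Return the attributes whose share of zero values exceeds `threshold`.

    `rows` is a list of dicts, one per geometry (hypothetical export format).
    """
    flagged = []
    for name in attributes:
        zeros = sum(1 for row in rows if row.get(name) == 0)
        if rows and zeros / len(rows) > threshold:
            flagged.append(name)
    return flagged


rows = [
    {"yield": 8.1, "moisture": 0.0, "speed": 5.2},
    {"yield": 7.9, "moisture": 0.0, "speed": 5.0},
    {"yield": 0.0, "moisture": 0.0, "speed": 5.1},
    {"yield": 8.4, "moisture": 14.2, "speed": 4.9},
]
flagged = mostly_zero_attributes(rows, ["yield", "moisture", "speed"])
```

In this made-up sample only `moisture` is flagged, so it would be a candidate to leave out of the attribute list.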

Once configured, click Run to apply the logic.
The processed results will be displayed alongside the original dataset, marked with Calibrated and/or Cleaned labels, accompanied by the version of the algorithm used.

Complete Manual Configuration
The Calibrate & Clean option offers comprehensive manual configuration of the calibration and cleaning processes. It is ideal for those seeking full control over the algorithm: every operation is transparent, in a white-box approach. The Calibrate Only and Clean Only alternatives are essentially individual components of the Calibrate & Clean process.

Specify the sequence you prefer: first "Calibrate", then "Clean" or the reverse.
Hint for Data Anomalies
If a user encounters anomalies in the data, such as values at or near zero, or unusually large values (for instance, an average of 10 with a maximum of 8000), the Clean, then Calibrate sequence is advised.
Prioritizing data Cleaning before Calibration ensures the removal of errors, missing values, or inconsistencies, thereby enhancing data quality and accuracy.
Hint for Data without Initial Errors
For datasets initially free from errors, missing values, or inconsistencies, and when multiple harvesters are known to be involved, consider the Calibrate, then Clean sequence.
Cleaning the data post-calibration helps to refine the dataset further by potentially eliminating any artifacts introduced during calibration.
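The two hints above can be turned into a rough pre-check. The Python sketch below is a heuristic only, not a GeoPard feature: the function name and all three thresholds are assumptions. It compares the maximum against the median rather than the average, because a few extreme spikes can inflate the average of a small sample.

```python
from statistics import median


def suggest_order(yields, spike_ratio=10.0, near_zero=0.1, near_zero_share=0.05):
    """Suggest an operation order from simple statistics of the yield values.

    All thresholds are illustrative assumptions and should be tuned per crop.
    """
    if not yields:
        raise ValueError("yields must not be empty")
    centre = median(yields)
    has_spikes = centre > 0 and max(yields) > spike_ratio * centre
    share_near_zero = sum(1 for y in yields if y <= near_zero) / len(yields)
    if has_spikes or share_near_zero > near_zero_share:
        return ("Clean", "Calibrate")
    return ("Calibrate", "Clean")


noisy_order = suggest_order([9.5, 10.2, 10.1, 9.8, 8000.0])
tidy_order = suggest_order([9.5, 10.2, 10.1, 9.8, 10.4])
```

The first sample contains a spike far above the median, so cleaning comes first. The second sample has no obvious errors, so calibration comes first.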

The Clean step's configuration involves:
Attributes representing target yield values.
Exclusion parameters that determine attributes exempt from the cleaning operation (optional).
Conditions for discarding data based on min/max attribute thresholds (optional).
Hint for Abnormal Values Sometimes Inherent to Yield Datasets.
If an attribute selected for calibration or cleaning predominantly contains zero values across a majority of geometries, these geometries will be excluded from the final Yield Dataset.
To ensure integrity, attributes with such anomalies should be excluded from the list of attributes to be cleaned (2).

For the Calibrate step, configuration parameters include:
A smoothing level to mitigate sudden fluctuations in values.
A calibration type: Pathwise, Average/Total, or Conditional.
Attributes to calibrate.
The calibration basis attribute often relates to the machinery path in the field or timestamps. In the absence of real machinery paths, simulated paths can be utilized.
Option to manually input Average/Total or Conditional values.
Hint for Abnormal Values Sometimes Inherent to Yield Datasets.
If an attribute selected for calibration or cleaning predominantly contains zero values across the majority of geometries, these geometries will be excluded from the final Yield Dataset.
To ensure integrity, attributes with such anomalies should be excluded from the list of attributes to be calibrated (3).
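To give an intuition for what a smoothing level does in the Calibrate step, the Python sketch below applies a centered moving average along a sequence of yield values, with a larger window giving stronger smoothing. GeoPard's actual smoothing method is not described on this page, so the moving average, the function name, and the `window` parameter are assumptions.

```python
def moving_average(values, window=3):
    """Centered moving average; the window shrinks at both ends of the sequence."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - half)
        stop = min(len(values), i + half + 1)
        neighbourhood = values[start:stop]
        smoothed.append(sum(neighbourhood) / len(neighbourhood))
    return smoothed


# A single sudden jump (40) is spread over its neighbours.
smoothed = moving_average([10, 10, 40, 10, 10], window=3)
```

A window of 1 returns the values unchanged, which corresponds to no smoothing.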

Click Run to initiate the process.
Algorithm Versions
After processing, the results appear next to the original dataset, marked with "Calibrate" and/or "Clean" labels and the algorithm version used.



