Yield Calibration & Cleaning

How to clean and calibrate yield monitor data in GeoPard. Includes USDA yield cleaning protocol. Fix outliers, striping, turnarounds, and multi-harvester datasets.

Use GeoPard to clean yield data and calibrate yield monitor datasets, so you get a yield map you can trust for zones, prescriptions, and analytics. This workflow handles outliers, turnarounds, missing attributes, and multi-harvester yield data. It includes the USDA yield cleaning protocol and can serve as an alternative to manual Yield Editor workflows.

This calibration process is instrumental in:

Ensuring Data Consistency: It's not uncommon for multiple harvesters to work in tandem or across different days. Calibration aligns their readings so the combined dataset is consistent.
Homogenizing Data: Yield data can be varied; the calibration ensures it is smooth and consistent, without unwanted spikes or drops.
Filtering Out Noise: Like any data, yield data can have its share of 'noise' or irrelevant info. We make sure it doesn't muddy your insights.
Streamlining Geometries: Any turnarounds or odd geometric data patterns can skew real insights. The calibration is designed to iron these out, ensuring the data truly mirrors field realities.
Cropping by Field Boundary: Harvesters often operate across adjacent areas. For accurate analytical results, it's essential to consider only the data situated within the specified boundary.

The Yield Calibration interface uses the GeoPard API endpoint for Yield Clean/Calibrate (GeoPard API: Calibrate and Clean YieldDataset). It runs the CALIBRATE and CLEAN operations in the UI or via API.

Quick Overview

Real-World Examples

Corrupted yield datasets are a common problem in practice. The real-world examples below show datasets with typical defects and how GeoPard's calibration and cleaning algorithms corrected them.

To address areas lacking logged Yield Data and achieve completeness of the Yield Map, consider utilizing the GeoPard Synthetic Yield Map approach. This method restores missing data for a complete yield analysis. Learn more here.

Multiple Harvesters Working Together

When dealing with complex scenarios, a two-step calibration process is recommended for optimal accuracy. Begin by running the initial calibration using the Machine ID attribute. Following that, proceed with a second round of calibration, this time utilizing the Simulated (Synthetic) Machine Paths tickbox. This layered approach ensures a thorough and precise calibration, essential for managing intricate cases effectively.

J-turns, Stops, Half Equipment Width Used

Abnormally Large Logged Values

Data Beyond Field Boundary

Calibration Using Provided Average Yield Value

Clean Yield Attributes Ignoring Attributes with Anomalies

The Yield Dataset occasionally includes attributes with irregularities in Moisture, Speed, Elevations, or other secondary (non-yield) attributes. During the execution of Clean or Calibrate activities, it is essential to disregard these anomalies. This can be efficiently achieved using the GeoPard Yield Clean-Calibrate interface.

USDA Yield Cleaning Protocol

Use this option when you need a repeatable, standards-based yield editor workflow. It is optimized for yield monitor data cleaning at scale.

Explanation of Calibration Logics

Pathwise Calibration

USE Pathwise Calibration when a field is harvested by multiple machines or over several days, specifically to correct systematic differences like striping or banding. It is ideal for scenarios where varying machine settings, operators, or environmental conditions cause consistent over- or under-estimation across different paths.

Crucially, the AI requires variation - such as distinct paths, machine IDs, or harvest dates - to learn and calibrate effectively.

DO NOT USE this method for single-machine harvests in one continuous session or if the yield map lacks visible spatial patterns. Additionally, avoid it if the data is sparse or if you only possess total field-level yield values without machine-level differences.
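To make the idea of correcting systematic machine-to-machine offsets concrete, here is a minimal sketch. It is not GeoPard's AI-based Pathwise algorithm: it swaps in a much simpler technique, scaling each machine's readings so that every machine's mean matches the field-wide mean. The function name, the `machine_id` and `yield` keys, and the sample values are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def equalize_machine_means(points, group_key="machine_id", value_key="yield"):
    """Scale each group's values so its mean equals the field-wide mean.

    Simplified stand-in for path-level calibration (hypothetical helper).
    Assumes every group has a positive mean yield.
    """
    field_mean = mean(p[value_key] for p in points)

    # Collect the yield readings logged by each machine.
    by_group = defaultdict(list)
    for p in points:
        by_group[p[group_key]].append(p[value_key])

    # One multiplicative correction factor per machine.
    factors = {g: field_mean / mean(v) for g, v in by_group.items()}

    return [{**p, value_key: p[value_key] * factors[p[group_key]]} for p in points]


# Illustrative data: machine B reads consistently higher than machine A.
points = [
    {"machine_id": "A", "yield": 8.0},
    {"machine_id": "A", "yield": 10.0},
    {"machine_id": "B", "yield": 10.0},
    {"machine_id": "B", "yield": 12.0},
]
calibrated = equalize_machine_means(points)
```

The sketch also shows why variation is required: with a single machine there is only one group, its factor is 1, and nothing changes.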

Average or Total Calibration

Average/Total Calibration IS BEST USED when you have a high level of confidence in your overall field-level yield data, such as records from a weighbridge or storage facility. Instead of adjusting individual paths, this method scales the entire dataset so that the final average or total matches your known reference value. It is often described as the simplest and safest calibration option when the overall numbers are trusted.
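The scaling described above can be sketched in a few lines. This is an illustration under stated assumptions, not GeoPard's implementation: it uses a plain arithmetic mean of the logged values, whereas a production workflow would likely weight by harvested area. The function name and the sample numbers are hypothetical.

```python
def scale_to_reference_average(values, reference_average):
    """Multiply every value by one factor so the mean equals the reference.

    Hypothetical helper; assumes a non-empty list with a non-zero mean.
    """
    current_average = sum(values) / len(values)
    factor = reference_average / current_average
    return [v * factor for v in values], factor


# Illustrative readings from an uncalibrated monitor (mean 10.0)
# and a trusted field average of 8.0 from weighbridge records.
wet_mass = [9.0, 10.0, 11.0, 10.0]
scaled, factor = scale_to_reference_average(wet_mass, reference_average=8.0)
```

Because a single factor is applied to all points, the relative spatial pattern is preserved and only the overall level changes.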

When to USE Average/Total Calibration:

Known Reference Values: You should use this logic when you have official total yield records (e.g., from a weighbridge) or a highly reliable average yield for the field.
Global Bias Correction: It is ideal if the spatial distribution in the yield map looks correct, but the values are globally shifted - meaning the yield monitor was likely uncalibrated and is reporting values that are consistently too high or too low across the entire field.
Uniform Harvest Conditions: This method is most effective when harvesting conditions were relatively consistent throughout the operation.
Single-Machine Consistency: It works well for harvests completed by a single machine that performed consistently across the field.

When NOT to USE Average/Total Calibration:

Machine-to-Machine Bias: Do not use this method if different parts of the field were harvested by different machines or on different days that resulted in localized biases. In these cases, scaling the whole field will not fix the underlying discrepancies between machines.
Visible Artifacts: If you see strong striping, banding, or directional artifacts in your data, this method will not resolve them; Pathwise calibration is better suited for those issues.
Incomplete Data: Avoid this logic if only a portion of the field was harvested or if the recorded data is incomplete, as the total/average values would be misleading.

Conditional Calibration

Conditional Calibration serves as a safety control by ensuring yield values remain within realistic, pre-defined minimum and maximum ranges.

You SHOULD USE this logic to remove extreme outliers and sensor spikes caused by noise, machine stoppages, or turns. It is ideal for applying specific agronomic expectations - such as "yield cannot exceed X" - without performing a correction.

However, AVOID THIS METHOD if your dataset has a global bias or systematic machine differences, as it does not scale data or fix spatial patterns. Essentially, it keeps values plausible but does not resolve underlying calibration offsets.
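A minimal sketch of applying minimum and maximum limits follows. The source does not state whether GeoPard clips out-of-range values to the limits or removes them, so the sketch shows both behaviours behind a hypothetical `drop` flag. The function name, limits, and sample values are assumptions.

```python
def apply_limits(values, minimum, maximum, drop=False):
    """Keep values within [minimum, maximum].

    drop=False clips out-of-range values to the nearest limit;
    drop=True removes them instead. Hypothetical helper.
    """
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    if drop:
        return [v for v in values if minimum <= v <= maximum]
    return [min(max(v, minimum), maximum) for v in values]


# Illustrative readings with a stop (0.0) and a sensor spike (80.0).
raw = [0.0, 7.5, 9.2, 80.0, 10.1]
clipped = apply_limits(raw, minimum=2.0, maximum=15.0)
kept = apply_limits(raw, minimum=2.0, maximum=15.0, drop=True)
```

Neither variant changes in-range values, which is why this logic cannot correct a global bias or machine-level offsets.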

Usage Strategy

First Step

The "Yield Calibrate and Clean" module is initiated directly from the User Interface. The primary requirement is to have an uploaded Yield Dataset. Adjacent to each Yield Dataset, you'll find a button to commence the dataset adjustments.

From there, several options are available for proceeding:

Auto-Processing: Use the default, GeoPard-recommended settings for a one-click calibration.
Clean Only: Configure and execute only the CLEAN operation, including
1. GeoPard Cleaning: Smart Cleaning of Yield dataset with AI algorithms.
2. USDA (United States Department of Agriculture) Cleaning Protocol for yield.
3. Conditional Cleaning: Filter data based on custom attribute thresholds.
Calibrate Only: Configure and execute just the CALIBRATE operation, including
1. Pathwise: Calibrate yield for each individual machine path using AI algorithms.
2. Average/Total: Adjust yield based on the field's known average or total yield.
3. Conditional: Modify yield within set minimum and maximum limits to maintain expected ranges.
Calibrate & Clean: Choose the sequence of operations and customize the parameters.
Yield Editor Alternative: Use Clean Only → USDA (or Calibrate & Clean) to match a manual “Yield Editor” cleanup workflow, but at scale. In validation tests, USDA protocol cleaning matched manual Yield Editor results with R² = 0.98 (almost identical output).
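If you want to run a similar comparison between two cleaned outputs on your own data, an R² value can be computed as sketched below. The source does not say which definition of R² was used in the validation tests; this sketch assumes the squared Pearson correlation between paired values. The function name and the sample numbers are made up for illustration and are not validation results.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences.

    Hypothetical helper; assumes neither sequence is constant.
    """
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Made-up paired yield values from two cleaning workflows.
manual_cleaning = [9.0, 10.0, 11.0, 12.0]
automated_cleaning = [9.1, 9.9, 11.2, 11.8]
agreement = r_squared(manual_cleaning, automated_cleaning)
```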

One-Button Solution

Hint for Abnormal Values Sometimes Inherent to Yield Datasets.

If an attribute selected for calibration or cleaning predominantly contains zero values across the majority of geometries, these geometries will be excluded from the final Yield Dataset.

To ensure integrity, attributes with such anomalies should be excluded from the list of attributes to be calibrated.
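Before choosing attributes, it can help to check how many zero values each one contains. The sketch below flags attributes where zeros make up more than a chosen share of records. The 0.5 threshold, the function names, and the sample records are assumptions; the source only says "predominantly" and does not give a numeric cut-off.

```python
def zero_share(records, attribute):
    """Fraction of records whose value for the attribute is exactly zero."""
    zeros = sum(1 for r in records if r.get(attribute) == 0)
    return zeros / len(records)


def attributes_to_skip(records, attributes, threshold=0.5):
    """Attributes whose zero share exceeds the (assumed) threshold."""
    return [a for a in attributes if zero_share(records, a) > threshold]


# Illustrative records: Moisture is zero in three of four geometries.
records = [
    {"WetMass": 9.1, "Moisture": 0.0},
    {"WetMass": 10.4, "Moisture": 0.0},
    {"WetMass": 8.7, "Moisture": 14.2},
    {"WetMass": 9.9, "Moisture": 0.0},
]
skip = attributes_to_skip(records, ["WetMass", "Moisture"])
```

In this example `Moisture` would be left out of the attribute list, so its zero readings do not cause geometries to be dropped.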

Full Guidance

Choose Flow: Hint for Data Anomalies

If a user encounters anomalies in the data, such as values at or near zero, or unusually large values (for instance, an average of 10 with a maximum of 8000), the Clean & Calibration workflow is advised.

Prioritizing data Cleaning before Calibration ensures the removal of errors, missing values, or inconsistencies, thereby enhancing data quality and accuracy.

Choose Flow: Hint for Data without Initial Errors

For datasets initially free from errors, missing values, or inconsistencies, and when multiple harvesters are known to be involved, consider the Calibration & Clean workflow.

Cleaning the data post-calibration helps to refine the dataset further by potentially eliminating any artifacts introduced during calibration.
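The two hints above can be summarized as a simple decision rule. The sketch below is only an illustration: the thresholds for "near zero" and "unusually large", the function name, and the returned labels are assumptions, not GeoPard settings.

```python
from statistics import mean


def suggest_flow(values, multiple_harvesters=False, spike_ratio=100.0, near_zero=0.01):
    """Suggest an operation order from basic statistics (hypothetical rule).

    spike_ratio: maximum-to-average ratio treated as an anomaly (assumed).
    near_zero: absolute value treated as "at or near zero" (assumed).
    """
    average = mean(values)
    has_spikes = average > 0 and max(values) > spike_ratio * average
    has_near_zero = any(abs(v) <= near_zero for v in values)

    if has_spikes or has_near_zero:
        return "clean_then_calibrate"
    if multiple_harvesters:
        return "calibrate_then_clean"
    return "no_preference"


# Average close to 10 with a maximum of 8000, as in the hint above.
anomalous = [2.0] * 999 + [8000.0]
flow_for_anomalous = suggest_flow(anomalous)

# Error-free values logged by several harvesters.
flow_for_clean = suggest_flow([9.0, 10.0, 11.0], multiple_harvesters=True)
```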

Clean Flow: Hint for Abnormal Values Sometimes Inherent to Yield Datasets.

If an attribute selected for calibration or cleaning predominantly contains zero values across a majority of geometries, these geometries will be excluded from the final Yield Dataset.

To ensure integrity, attributes with such anomalies should be excluded from the list of attributes to be cleaned (2).

Calibrate Flow: Hint for Abnormal Values Sometimes Inherent to Yield Datasets.

If an attribute selected for calibration or cleaning predominantly contains zero values across the majority of geometries, these geometries will be excluded from the final Yield Dataset.

To ensure integrity, attributes with such anomalies should be excluded from the list of attributes to be calibrated (3).

Algorithm Versions

After processing, the results are shown next to the original dataset. They are marked with "Calibrate" and/or "Clean" labels, plus the algorithm version.

From version 3.0 of the Clean/Calibrate algorithm onward, GeoPard introduces the Cropping by Field Boundary feature. This keeps only geometries within the Field Boundary and improves statistical distribution.
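Conceptually, cropping by boundary is a point-in-polygon filter. The sketch below uses the standard ray-casting test and is not GeoPard's implementation. It assumes planar coordinates, a single boundary ring without holes, and point geometries stored under hypothetical `x` and `y` keys.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertices; the ring is closed implicitly.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def crop_to_boundary(points, boundary):
    """Keep only the points that fall inside the boundary polygon."""
    return [p for p in points if point_in_polygon(p["x"], p["y"], boundary)]


# Illustrative rectangular field and three logged points, two of them outside.
boundary = [(0, 0), (100, 0), (100, 50), (0, 50)]
points = [
    {"x": 10, "y": 10, "yield": 9.5},
    {"x": 120, "y": 10, "yield": 8.0},
    {"x": 50, "y": 60, "yield": 7.0},
]
cropped = crop_to_boundary(points, boundary)
```

Removing the points logged in adjacent areas is what changes the statistical distribution: only readings from inside the field contribute to averages and histograms.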

Starting with version 4.0, the Clean/Calibrate algorithm supports calibration based on Average or Total values for any attribute. A common application is the calibration of WetMass, which can be adjusted to the known measured Average Yield for a specific Field.

From version 5.0 of the Clean/Calibrate algorithm onward, GeoPard introduces the USDA (United States Department of Agriculture) Cleaning Protocol for yield. USDA provides formal agronomic data standards that govern how yield, moisture, flow, and spatial measurements are normalized, validated, and statistically filtered to produce machine- and field-consistent agricultural datasets.


