84. Mutation: Calibrate and Clean YieldDataset

API calls to clean and calibrate Yield datasets

Calibrating a "YieldDataset" corrects the distribution of its values according to mathematical principles, which improves the overall integrity of the data. The result supports better decision-making and makes the dataset suitable for further in-depth analysis.

Common use cases for this functionality include:

  • Synchronizing data when multiple harvesters have worked either simultaneously or over several days, ensuring consistency.

  • Making the dataset more homogeneous and accurate by smoothing out variations.

  • Removing data noise and extraneous information that can cloud insights.

  • Eliminating turnarounds or abnormal geometries, which may distort the actual patterns and trends in the field.

  • Adjusting yield data to correspond with established averages or totals for each attribute.

For a more detailed exploration and examples, please refer to our Yield Calibration & Cleaning use case.

Original YieldDataset with WetMass attribute

Five harvesters worked in parallel on the 30 ha Field shown below. One harvester was not calibrated in sync with the others, which produces the orange spots and indicates that CALIBRATION is required. In addition, numerous red turnaround spots near the "Field" edges need to be eliminated.

Calibrated and Cleaned YieldDataset with WetMass attribute

The result below shows the dataset after applying automatic CALIBRATE and CLEAN operations using default parameters. The resulting "YieldDataset" has become homogeneous, without outliers or abrupt changes between neighboring geometries.

Pathwise vs Average/Total vs Conditional options

Pathwise calibration corresponds to the machine's tracks. Each machine track is processed as an individual region for calibration purposes. The GeoPard team suggests using this method as the standard approach.

Average/Total calibration focuses on redistributing attribute values. If the geospatial patterns are accurate but the absolute figures deviate from the actuals, this method proves beneficial. For optimal results, GeoPard advises combining it with Pathwise calibration: first applying Pathwise, then adjusting to known Average/Total values.

Conditional calibration adjusts attribute values based on provided min and max thresholds. This method is especially valuable when the geospatial patterns are precise, but the distribution of values requires adjustments, particularly when known min and max values exist. For the best outcomes, GeoPard recommends pairing it with Pathwise calibration: starting with Pathwise, followed by adjustments to align with the known min and max values.
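The idea behind Average/Total calibration can be illustrated with a minimal Python sketch. It assumes the simplest possible redistribution, a single proportional factor applied to every value, so that the spatial pattern (the ratios between points) is preserved while the mean moves to the known value. This section does not describe the algorithm GeoPard actually uses, so treat the function `rescale_to_average` and the sample readings as hypothetical.

```python
from statistics import mean


def rescale_to_average(values, target_average):
    """Scale all values by one factor so that their mean equals target_average.

    Illustrative only: a plain proportional rescaling, not GeoPard's algorithm.
    """
    current = mean(values)
    if current == 0:
        raise ValueError("cannot rescale values whose mean is zero")
    factor = target_average / current
    return [v * factor for v in values]


# Hypothetical WetMass readings whose pattern is trusted but whose level is too high.
wet_mass = [6.0, 9.0, 12.0, 9.0]
calibrated = rescale_to_average(wet_mass, 8.0)

# The mean now matches the known average, and relative differences are unchanged.
print(round(mean(calibrated), 6))
print(round(calibrated[2] / calibrated[0], 6))
```

Because only one factor is applied, a point that was twice as high as another before the adjustment stays twice as high afterwards, which is why this kind of adjustment fits the case where the patterns are accurate but the absolute figures are off.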

Hints

Hint for Data Anomalies

If the data contains anomalies, such as values at or near zero or unusually large values (for instance, an average of 10 with a maximum of 8000), the Clean & Calibration workflow is advised. Configure it with the parameter actions: [CLEAN, CALIBRATE].

Prioritizing data Cleaning before Calibration ensures the removal of errors, missing values, or inconsistencies, thereby enhancing data quality and accuracy.

Hint for Data without Initial Errors

For datasets initially free from errors, missing values, or inconsistencies, and when multiple harvesters are known to be involved, consider the Calibration & Clean workflow. Configure it with the parameter actions: [CALIBRATE, CLEAN].

Cleaning the data post-calibration helps to refine the dataset further by potentially eliminating any artifacts introduced during calibration.
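The two hints can be turned into a small pre-check that suggests an order for actions. The sketch below is an assumption-laden illustration: the function name `suggest_action_order` and both thresholds (`near_zero`, `max_to_mean_ratio`) are hypothetical and are not GeoPard parameters. It only inspects the raw values; whether multiple harvesters were involved is something you need to know from the field records.

```python
from statistics import mean


def suggest_action_order(values, near_zero=1e-6, max_to_mean_ratio=100.0):
    """Suggest the `actions` order based on simple anomaly checks.

    Thresholds are hypothetical; tune them to your own data.
    """
    avg = mean(values)
    # Anomaly 1: values at or near zero.
    has_near_zero = any(abs(v) <= near_zero for v in values)
    # Anomaly 2: a maximum far above the average (e.g. average 10, maximum 8000).
    has_spike = avg > 0 and max(values) / avg >= max_to_mean_ratio
    if has_near_zero or has_spike:
        return ["CLEAN", "CALIBRATE"]
    return ["CALIBRATE", "CLEAN"]


print(suggest_action_order([9.5, 10.2, 0.0, 10.1]))   # contains a zero reading
print(suggest_action_order([9.5, 10.2, 9.9, 10.4]))   # no obvious anomalies
```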

Documentation: Mutation details

The default configuration enables automatic calibration and cleaning of the "YieldDataset".

mutation AutoCalibrateYieldDataset {
  calibrateYieldDataset(input: {
    fieldUuid: "<placeholder_of_field_uuid>"
    yieldDatasetUuid: "<placeholder_of_dataset_uuid>"
    actions: {
      calibrateAction: {
        calibrationAttributes: ["WetMass", "VRYIELDVOL"]
        conditionPathwiseCalibration: {
          calibrationBasis: "WetMass"
          maxHomogeneityRegion: true
          syntheticMachinePath: true
        }
      }
      cleanAction: {
        condtionAutoClean: {
          targetAttribute: "WetMass"
        }
      }
      actions: [CALIBRATE, CLEAN]
    }
  }) {
    uuid
    fieldUuid
    originalUuid
    statusCode
    statusMessage
  }
}
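As a rough sketch of how a client might prepare this mutation for sending, the Python snippet below substitutes the two UUID placeholders and wraps the document in the usual GraphQL-over-HTTP JSON body (`{"query": ...}`). The helper name `build_request_body` is hypothetical, and the check with `uuid.UUID` assumes GeoPard identifiers are standard UUID strings. The endpoint URL and authentication headers are not covered in this section, so the sketch stops short of sending the request.

```python
import json
import uuid
from string import Template

# The sample mutation with $-placeholders for the two UUIDs.
# Field names, including the "condtionAutoClean" spelling, are copied from the sample.
AUTO_CALIBRATE = Template("""
mutation AutoCalibrateYieldDataset {
  calibrateYieldDataset(input: {
    fieldUuid: "$field_uuid"
    yieldDatasetUuid: "$dataset_uuid"
    actions: {
      calibrateAction: {
        calibrationAttributes: ["WetMass", "VRYIELDVOL"]
        conditionPathwiseCalibration: {
          calibrationBasis: "WetMass"
          maxHomogeneityRegion: true
          syntheticMachinePath: true
        }
      }
      cleanAction: {
        condtionAutoClean: {
          targetAttribute: "WetMass"
        }
      }
      actions: [CALIBRATE, CLEAN]
    }
  }) {
    uuid
    fieldUuid
    originalUuid
    statusCode
    statusMessage
  }
}
""")


def build_request_body(field_uuid, dataset_uuid):
    """Return the JSON request body for the auto-calibration mutation."""
    # uuid.UUID raises ValueError for malformed input (assumes standard UUIDs).
    field = str(uuid.UUID(field_uuid))
    dataset = str(uuid.UUID(dataset_uuid))
    query = AUTO_CALIBRATE.substitute(field_uuid=field, dataset_uuid=dataset)
    return json.dumps({"query": query})


# Hypothetical identifiers, for illustration only.
body = build_request_body(
    "123e4567-e89b-12d3-a456-426614174000",
    "123e4567-e89b-12d3-a456-426614174001",
)
```

Validating the identifiers before substitution also keeps arbitrary text out of the GraphQL document, which matters whenever a query is assembled by string replacement.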

A more advanced sample provides manual control of min/max ranges and incorporates additional attributes.

mutation CustomCalibrateYieldDataset {
  calibrateYieldDataset(input: {
    fieldUuid: "<placeholder_of_field_uuid>"
    yieldDatasetUuid: "<placeholder_of_dataset_uuid>"
    actions: {
      calibrateAction: {
        calibrationAttributes: ["WetMass", "VRYIELDVOL"]
        smoothWindowSize: 11
        conditionPathwiseCalibration: {
          calibrationBasis: "Machine"
          maxHomogeneityRegion: true
          syntheticMachinePath: false
        }
        conditionAvgTotalCalibration: [{
          calibrationAttribute: "WetMass"
          average: 8.0
        }]
      }
      cleanAction: {
        condtionAutoClean: {
          targetAttribute: "WetMass"
        }
        conditionMinMaxClean: [{
          cleanAttribute: "Distance"
          min: 1.5
          max: 10.0
        }]
      }
      actions: [CALIBRATE, CLEAN]
    }
  }) {
    uuid
    fieldUuid
    originalUuid
    statusCode
    statusMessage
  }
}

To follow the USDA protocol for the CLEAN operation, you must either mention ALL columns in the cleanAction -> conditionMinMaxClean or specify a portion of them in cleanAction -> conditionMinMaxClean and the remaining ones in condtionAutoClean -> excludedAttributes.
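The coverage rule above can be checked on the client before submitting the mutation. The following Python sketch reads the rule literally: every dataset column must appear either as a cleanAttribute in conditionMinMaxClean or in excludedAttributes. The function name `uncovered_columns` and the column name "Moisture" are hypothetical, and whether the targetAttribute itself counts as covered is not stated here, so the sketch does not treat it specially.

```python
def uncovered_columns(all_columns, min_max_rules, excluded_attributes=()):
    """Return the columns named neither in a min/max rule nor in excludedAttributes."""
    covered = {rule["cleanAttribute"] for rule in min_max_rules}
    covered.update(excluded_attributes)
    return sorted(set(all_columns) - covered)


# Hypothetical dataset columns and rules, mirroring the shape of the sample mutation.
columns = ["WetMass", "Distance", "Moisture"]
rules = [{"cleanAttribute": "Distance", "min": 1.5, "max": 10.0}]

missing = uncovered_columns(columns, rules, excluded_attributes=["Moisture"])
print(missing)  # columns that still need a rule or an exclusion
```

An empty result means the configuration mentions every column in one of the two places.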

Input parameters:

  • actions as an array, allowing you to choose the correcting actions and their sequence of application; supported values include CLEAN and CALIBRATE.

  • calibrateAction as an object containing configuration details related to the CALIBRATE operation.

    • calibrationAttributes as an array of attributes requiring calibration, typically linked to the Yield column.

    • smoothWindowSize as an odd integer setting the size of the smoothing window applied to the result values, which reduces abrupt jumps between them.

    • conditionPathwiseCalibration as an object configuring Pathwise calibration, which follows the machine's tracks: each machine track is processed as an individual region for calibration purposes.

      • calibrationBasis as a string representing the attribute used as the basis for calibration.

      • maxHomogeneityRegion as a boolean that indicates whether the maximum homogeneity region is used as the referenced region for calibration.

      • syntheticMachinePath as a boolean that enables the simulation of machine routes. It is useful when a precise machine path attribute is absent and the path must be simulated from timestamps or a similar attribute.

    • conditionAvgTotalCalibration as an array of objects configuring Average/Total calibration, which redistributes attribute values. It is useful when the geospatial patterns are accurate but the absolute figures deviate from the actuals.

      • calibrationAttribute as a string naming the attribute to be calibrated.

      • average as a number representing the target average of the attribute; the calibrated values should align with this average. Only one option, either average or total, should be used at a time.

      • total as a number representing the total sum of the attribute values; the aggregate of these values should match the total. Only one option, either average or total, should be utilized at a time.

    • conditionMinMaxCalibration as an object configuring Conditional calibration, which adjusts attribute values based on provided min and max thresholds.

      • calibrationAttribute as a string naming the attribute to be calibrated.

      • min as a number representing the minimum value of the attribute, serving as the lower bound for calibration.

      • minIncluded as a boolean indicating whether to include the minimum value.

      • max as a number representing the maximum value of the attribute, serving as the upper bound for calibration.

      • maxIncluded as a boolean indicating whether or not to include the maximum value.

  • cleanAction as an object that includes the configuration specifics tied to the CLEAN operation.

    • conditionAutoClean as an object that includes the configurations specific to the auto-clean algorithm.

      • targetAttribute as a string representing target Yield values.

      • excludedAttributes as an array of strings defining attributes that don't influence the cleaning operation.

    • conditionMinMaxClean as an array of objects describing the cleaning rules. Each object includes the following parameters:

      • cleanAttribute as a string specifying the column name for the rule.

      • min as a number indicating the minimum value.

      • max as a number indicating the maximum value.
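Two of the constraints in the list above are easy to check before a request is sent: smoothWindowSize should be an odd integer, and a conditionAvgTotalCalibration entry should not set both average and total. The Python sketch below shows one way to do that on a plain dictionary shaped like the calibrateAction input. The function name `validate_calibrate_action` is hypothetical, and the server may apply further rules that are not listed here.

```python
def validate_calibrate_action(action):
    """Return a list of problems found in a calibrateAction dictionary."""
    errors = []

    window = action.get("smoothWindowSize")
    if window is not None:
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_int = isinstance(window, int) and not isinstance(window, bool)
        if not is_int or window % 2 == 0:
            errors.append("smoothWindowSize must be an odd integer")

    for i, cond in enumerate(action.get("conditionAvgTotalCalibration", [])):
        if cond.get("average") is not None and cond.get("total") is not None:
            errors.append(
                f"conditionAvgTotalCalibration[{i}]: use either average or total, not both"
            )

    return errors


# Mirrors the calibrateAction of the custom sample mutation.
custom_action = {
    "calibrationAttributes": ["WetMass", "VRYIELDVOL"],
    "smoothWindowSize": 11,
    "conditionAvgTotalCalibration": [
        {"calibrationAttribute": "WetMass", "average": 8.0}
    ],
}
print(validate_calibrate_action(custom_action))
```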

To inspect the input types and the latest available enumeration values (such as operations), use the Altair GraphQL client.

Documentation: YieldDataset query

As a GeoPard API consumer, you can retrieve details regarding the corrections applied to YieldDatasets through the attributes appliedCorrections and appliedCorrectionsVersion. The former provides a list of corrections made (e.g., CALIBRATE and CLEAN), with the order of execution denoted by their sequence in the array. Meanwhile, appliedCorrectionsVersion indicates the version of the algorithm employed.

query DetailsAboutAppliedCalibrations {
  getFields (filter: {fieldUuid: "<placeholder_of_field_uuid>"}){
    fields {
      uuid
      yieldDatasets  {
        uuid
        appliedCorrections
        appliedCorrectionsVersion
      }
    }
  }
}
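A response to this query can be flattened into a per-dataset summary with a few lines of Python. The sketch assumes the standard GraphQL response envelope (`data` at the top level) and a shape that mirrors the selection set of the query above. The function name `corrections_by_dataset`, the identifiers, and the version value in the sample are hypothetical.

```python
def corrections_by_dataset(response):
    """Map each YieldDataset uuid to its applied corrections (in order) and version."""
    summary = {}
    for field in response["data"]["getFields"]["fields"]:
        for dataset in field.get("yieldDatasets") or []:
            summary[dataset["uuid"]] = {
                # The array order is the order in which corrections were executed.
                "order": list(dataset.get("appliedCorrections") or []),
                "version": dataset.get("appliedCorrectionsVersion"),
            }
    return summary


# Hypothetical response, shaped like the query's selection set.
sample_response = {
    "data": {
        "getFields": {
            "fields": [
                {
                    "uuid": "field-1",
                    "yieldDatasets": [
                        {
                            "uuid": "dataset-1",
                            "appliedCorrections": ["CALIBRATE", "CLEAN"],
                            "appliedCorrectionsVersion": "example-version",
                        },
                        {
                            "uuid": "dataset-2",
                            "appliedCorrections": None,
                            "appliedCorrectionsVersion": None,
                        },
                    ],
                }
            ]
        }
    }
}

summary = corrections_by_dataset(sample_response)
print(summary["dataset-1"]["order"])
```

A dataset with an empty order list has not been corrected yet, under the assumption that the API returns null or an empty array in that case.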
