AgriTech / ML

Pond Score: Building a Machine Learning Risk Engine for Aquaculture Farms

March 30, 2026

Water quality data is only useful if you know what it means. We built Pond Score — an XGBoost risk model that turns raw parameter readings into a daily health index per pond, giving farmers early warning of disease risk before symptoms appear.

AquaStackX · XGBoost · Machine Learning · Python · SHAP · Supabase · Time-Series · MLOps

The Problem with Raw Water Data

AquaStackX gave farmers real-time water quality readings. But a reading of "DO: 4.2 mg/L" means nothing to a farm owner without context.

Is 4.2 mg/L dangerous? Compared to what? At what time of day? After what feeding schedule? Alongside what pH and temperature?

Individual parameters in isolation don't predict mortality. It's the combination of parameters — the trend over 48 hours, the time-of-day pattern, the interaction between ammonia and pH, the deviation from that pond's specific baseline — that determines whether a crop is at risk.

Experienced farm technicians have this intuition built up over years. But they can't monitor 40 ponds simultaneously, they're not always on-site, and that knowledge walks out the door when they leave.

We needed to encode that expertise into a model that runs continuously, at scale, in the background — and surfaces actionable risk signals rather than raw numbers.

Pond Score: The Model

Pond Score is a daily risk index (0–100) computed for every active pond in the system. It answers one question:

"Based on everything we know about this pond in the last 7 days, how likely is a significant mortality event in the next 7 days?"

Score range | Interpretation | Action
0–29 | Healthy operation | No action needed
30–69 | Elevated risk | Technician review recommended
70–100 | High risk | Immediate inspection triggered + WhatsApp alert to farm owner
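
As a minimal sketch, the banding above could be expressed as a small lookup function. The names `Band` and `score_to_band` are hypothetical, and the sketch assumes a fractional score such as 29.5 falls into the lower band.

```python
from typing import NamedTuple


class Band(NamedTuple):
    label: str
    action: str


def score_to_band(score: float) -> Band:
    """Map a 0-100 Pond Score to its interpretation and recommended action."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be within 0-100, got {score}")
    if score >= 70:
        return Band("High risk", "Immediate inspection + WhatsApp alert to farm owner")
    if score >= 30:
        return Band("Elevated risk", "Technician review recommended")
    return Band("Healthy operation", "No action needed")
```

For example, `score_to_band(78).label` returns `"High risk"`.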

Feature Engineering

The model consumes features derived from the WaterReport time series. Features fall into four categories:

Absolute parameter values (most recent reading): Dissolved oxygen (DO), pH, temperature, salinity, ammonia, nitrite, alkalinity, turbidity.

Trend features (24h, 48h, 72h windows):

  • Rate of change per parameter (Δ per hour)
  • Standard deviation of DO over the last 12 hours — a volatility proxy for night-time oxygen crash risk
  • Minimum DO in the last 24 hours (the critical overnight reading)
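
The three trend features listed above might be computed roughly as follows. This is a sketch, not the production code: it assumes readings arrive as `(timestamp, value)` pairs, and the function and feature names are illustrative.

```python
from datetime import datetime, timedelta
from statistics import stdev
from typing import Dict, List, Optional, Tuple

Reading = Tuple[datetime, float]  # (timestamp, value); hypothetical shape


def in_window(readings: List[Reading], as_of: datetime, hours: int) -> List[Reading]:
    """Readings from the last `hours` hours up to `as_of`, oldest first."""
    start = as_of - timedelta(hours=hours)
    return sorted(r for r in readings if start <= r[0] <= as_of)


def rate_of_change(readings: List[Reading], as_of: datetime, hours: int) -> Optional[float]:
    """Change per hour between the first and last reading in the window."""
    w = in_window(readings, as_of, hours)
    if len(w) < 2:
        return None  # not enough data; left missing for the model to handle
    (t0, v0), (t1, v1) = w[0], w[-1]
    elapsed_h = (t1 - t0).total_seconds() / 3600
    return (v1 - v0) / elapsed_h if elapsed_h > 0 else None


def do_trend_features(do_readings: List[Reading], as_of: datetime) -> Dict[str, Optional[float]]:
    last_12h = [v for _, v in in_window(do_readings, as_of, 12)]
    last_24h = [v for _, v in in_window(do_readings, as_of, 24)]
    return {
        "do_delta_per_h_24h": rate_of_change(do_readings, as_of, 24),
        "do_std_12h": stdev(last_12h) if len(last_12h) >= 2 else None,
        "do_min_24h": min(last_24h) if last_24h else None,
    }
```

Returning `None` for sparse windows keeps gaps explicit rather than imputing a value the pond never recorded.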

Pattern and interaction features:

  • Time-since-last-reading (gaps indicate technician absence — itself a risk signal)
  • Deviation from pond-specific 30-day rolling baseline: absolute values vary by species and location; deviation from that pond's own normal is more predictive than the absolute value
  • pH × ammonia interaction: proxy for toxic un-ionized ammonia (NH₃), which is highly pH-dependent
  • Temperature × DO saturation index: thermal stress signal
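
To illustrate why the pH and ammonia interaction matters, the sketch below estimates the un-ionized fraction of total ammonia using a commonly cited freshwater approximation (Emerson et al., 1975), alongside a simple baseline-deviation helper. Whether Pond Score uses this formula or a plain pH × ammonia product is not stated; the formula choice, the function names, and the omission of any salinity correction are all assumptions.

```python
from statistics import mean
from typing import Sequence


def unionized_ammonia_fraction(ph: float, temp_c: float) -> float:
    """Fraction of total ammonia present as toxic NH3 (freshwater approximation)."""
    pka = 0.09018 + 2729.92 / (temp_c + 273.15)  # temperature-dependent pKa
    return 1.0 / (1.0 + 10 ** (pka - ph))


def unionized_ammonia(total_ammonia_mg_l: float, ph: float, temp_c: float) -> float:
    """Estimated NH3 concentration in mg/L from total ammonia, pH, and temperature."""
    return total_ammonia_mg_l * unionized_ammonia_fraction(ph, temp_c)


def baseline_deviation(latest: float, history_30d: Sequence[float]) -> float:
    """Difference between the latest reading and the pond's own 30-day mean."""
    return latest - mean(history_30d)
```

At 25 °C the estimated NH₃ fraction rises from roughly 5% at pH 8 to roughly 36% at pH 9, which is why the same total ammonia reading can be harmless in one pond and dangerous in another.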

Contextual features:

  • Days since pond seeding (younger crops are more vulnerable)
  • Species (shrimp vs fish — distinct parameter tolerance profiles)
  • Season / monsoon flag (monsoon season has distinct risk profiles across all parameters)

Model Training

Training data: 18 months of production WaterReport records from AquaStackX, labeled with mortality events (cause, date, and severity as a percentage of stock). This yields approximately 4,200 pond-weeks of labeled data.

Target variable: Binary classification — will this pond have a significant mortality event (>5% of stock) in the next 7 days?
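
A minimal sketch of how that binary label could be derived from the mortality event log. The `MortalityEvent` record and `label_for` function are hypothetical; the 7-day horizon and 5% threshold come from the definition above.

```python
from datetime import date, timedelta
from typing import Iterable, NamedTuple


class MortalityEvent(NamedTuple):
    pond_id: str
    event_date: date
    severity_pct: float  # share of stock lost, in percent


def label_for(
    pond_id: str,
    scoring_date: date,
    events: Iterable[MortalityEvent],
    horizon_days: int = 7,
    min_severity_pct: float = 5.0,
) -> int:
    """Return 1 if the pond has a significant event within the horizon, else 0."""
    horizon_end = scoring_date + timedelta(days=horizon_days)
    return int(any(
        e.pond_id == pond_id
        and scoring_date < e.event_date <= horizon_end  # strictly in the future
        and e.severity_pct > min_severity_pct
        for e in events
    ))
```

Only events after the scoring date count, so the label never depends on information already visible in the features.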

Algorithm: XGBoost gradient boosted classifier. Selected for three reasons:

  1. Handles missing values natively — not all ponds record all parameters at all times
  2. Interpretable feature importance via SHAP values — farm managers need to understand why a score is high, not just that it is
  3. Fast inference: scoring 200 ponds takes under 1 second

Validation: Walk-forward (time-based) split — train on months 1–12, evaluate on months 13–18. No data leakage from future outcomes.
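
The time-based split can be sketched as follows, assuming each training row is a dict with a `week_start` date (a hypothetical field name) and that months are counted from the first month of data.

```python
from datetime import date
from typing import Dict, List, Tuple


def month_index(start: date, d: date) -> int:
    """1-based month number of `d`, counted from the month of `start`."""
    return (d.year - start.year) * 12 + (d.month - start.month) + 1


def time_split(
    rows: List[Dict], start: date, train_months: int = 12
) -> Tuple[List[Dict], List[Dict]]:
    """Put rows from months 1..train_months in train and later rows in test."""
    train, test = [], []
    for row in rows:
        in_train = month_index(start, row["week_start"]) <= train_months
        (train if in_train else test).append(row)
    return train, test
```

A stricter variant would also drop training rows whose 7-day label window extends past the cutoff, so that no training label depends on an outcome from the evaluation period.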

Back-test results:

Metric | Value
Overall accuracy | 91%
Precision at score > 70 | 84%
Recall at score > 70 | 88%
False alarm rate | 16%
AUC-ROC | 0.87
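
For reference, the threshold-based metrics in the table can be computed from back-test labels and scores roughly like this. The reported 16% equals 100% minus the 84% precision, so this sketch assumes "false alarm rate" means the share of alerts that turned out to be false; if it instead means false positives over all negative cases, the formula would differ. The function name is illustrative.

```python
from typing import Dict, Sequence


def alert_metrics(
    labels: Sequence[int], scores: Sequence[float], threshold: float = 70
) -> Dict[str, float]:
    """Precision, recall, accuracy, and false-alert share at a score threshold."""
    alerts = [s > threshold for s in scores]
    tp = sum(1 for y, a in zip(labels, alerts) if y == 1 and a)
    fp = sum(1 for y, a in zip(labels, alerts) if y == 0 and a)
    fn = sum(1 for y, a in zip(labels, alerts) if y == 1 and not a)
    tn = sum(1 for y, a in zip(labels, alerts) if y == 0 and not a)
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "false_alert_share": fp / (tp + fp) if tp + fp else 0.0,  # 1 - precision
        "accuracy": (tp + tn) / len(labels) if labels else 0.0,
    }
```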

A 16% false alarm rate is acceptable here because the costs are asymmetric: a false positive means an unnecessary technician visit (a small cost), while a false negative means a missed mortality event (potentially catastrophic crop loss). The model is therefore tuned to favor recall over precision.

The Scoring Pipeline

Two scoring triggers:

  1. Real-time: When a new WaterReport is submitted, a Supabase database function triggers feature extraction and re-scoring for that pond immediately. The updated score is visible in the app within 30 seconds.
  2. Nightly batch: At 02:00 IST, all active ponds are re-scored using the latest 7-day window. This catches drift in ponds where no new readings were logged that day (absence of readings is itself a feature).
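
The sketch below shows why the nightly batch changes a score even when no new data arrives: the 7-day window slides forward and the time since the last reading keeps growing. The function and field names are hypothetical, and IST is modelled as a fixed UTC+05:30 offset.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Iterable, Optional

# India does not observe daylight saving time, so a fixed offset is sufficient.
IST = timezone(timedelta(hours=5, minutes=30))


def window_summary(
    reading_times: Iterable[datetime], as_of: datetime, days: int = 7
) -> Dict[str, Optional[float]]:
    """Count readings in the scoring window and measure the gap since the last one."""
    start = as_of - timedelta(days=days)
    past = sorted(t for t in reading_times if t <= as_of)
    in_window = [t for t in past if t >= start]
    hours_since_last = (
        (as_of - past[-1]).total_seconds() / 3600 if past else None
    )
    return {
        "n_readings_7d": len(in_window),
        "hours_since_last_reading": hours_since_last,
    }
```

Running this at 02:00 IST on two consecutive nights for a pond with no new readings yields a gap that is 24 hours larger on the second night.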

Explainability: SHAP in Production

Each Pond Score record stores the three features with the largest SHAP contributions alongside the score. When a farm manager sees a risk score of 78, they also see:

"Elevated ammonia trend (+0.3 mg/L over 48h) · Below-baseline DO (−1.8 mg/L vs 30-day average) · pH outside 7.5–8.5 range for 6 consecutive readings"

This is not optional — it's essential for adoption. Farmers and technicians will not act on a black-box number. They act on specific, actionable signals they can verify themselves with a handheld meter.
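
An explanation string like the one quoted above could be assembled along these lines. This sketch assumes the per-feature SHAP values have already been computed and are passed in as a dict, and that "top contributing" means the largest risk-increasing (positive) contributions; the function names, feature keys, and description mapping are hypothetical.

```python
from typing import Dict, List, Tuple


def top_contributors(shap_values: Dict[str, float], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k features that pushed the risk score up the most."""
    positive = [(name, value) for name, value in shap_values.items() if value > 0]
    return sorted(positive, key=lambda item: item[1], reverse=True)[:k]


def explanation(
    shap_values: Dict[str, float], descriptions: Dict[str, str], k: int = 3
) -> str:
    """Join human-readable descriptions of the top contributors with ' · '."""
    top = top_contributors(shap_values, k)
    # Fall back to the raw feature name if no description is configured.
    return " · ".join(descriptions.get(name, name) for name, _ in top)
```

Keeping the readable descriptions in a separate mapping lets the wording be adjusted for farmers without touching the model.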

Integration in the AquaStackX Product

Pond Score is fully embedded, not bolted on:

  • Mobile app: Each pond card shows the current Pond Score with a color indicator (green / amber / red) and the top contributing factor
  • CRM dashboard: 30-day score trend chart per pond, with mortality events overlaid for post-event analysis
  • Technician queue: Ponds are sorted by Pond Score on the technician's daily task list — highest-risk ponds get visited first
  • Alert configuration: Tenants can configure their own alert threshold (some farm owners want alerts at >60; others at >75)
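
The technician queue and per-tenant alert thresholds can be sketched with two small functions. The record fields (`pond_id`, `tenant_id`, `score`), the default threshold of 70, and treating the threshold as inclusive are assumptions made for illustration.

```python
from typing import Dict, List

DEFAULT_ALERT_THRESHOLD = 70  # assumed default, matching the high-risk band


def technician_queue(ponds: List[Dict]) -> List[Dict]:
    """Order ponds so the highest-risk ones are visited first."""
    return sorted(ponds, key=lambda p: p["score"], reverse=True)


def ponds_to_alert(ponds: List[Dict], tenant_thresholds: Dict[str, float]) -> List[Dict]:
    """Select ponds whose score meets their tenant's configured alert threshold."""
    return [
        p for p in ponds
        if p["score"] >= tenant_thresholds.get(p["tenant_id"], DEFAULT_ALERT_THRESHOLD)
    ]
```

With thresholds of 60 for one tenant and 75 for another, two ponds that both score 65 produce an alert only for the first tenant.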

What This Demonstrates

Pond Score is a production ML system running on real farm data — not a notebook, not a proof of concept. It fires alerts that influence real operational decisions, and it gets retrained periodically as new labeled data accumulates.

The architecture here — time-series feature engineering, gradient boosted classification, SHAP-based explainability embedded in the product UI, real-time scoring pipeline via database triggers — is directly applicable to any domain with historical event data and predictive requirements:

  • Equipment failure prediction (manufacturing / industrial IoT)
  • Customer churn scoring (SaaS / financial services)
  • Credit risk assessment (lending / NBFC)
  • Supply chain disruption prediction (logistics)
  • Patient deterioration scoring (healthcare)

The domain changes. The engineering pattern does not.
