Pond Score: Building a Machine Learning Risk Engine for Aquaculture Farms

Water quality data is only useful if you know what it means. We built Pond Score — an XGBoost risk model that turns raw parameter readings into a daily health index per pond, giving farmers early warning of disease risk before symptoms appear.

The Problem with Raw Water Data

AquaStackX gave farmers real-time water quality readings. But a reading of "DO: 4.2 mg/L" means nothing to a farm owner without context.

Is 4.2 mg/L dangerous? Compared to what? At what time of day? After what feeding schedule? Alongside what pH and temperature?

Individual parameters in isolation don't predict mortality. It's the combination of parameters — the trend over 48 hours, the time-of-day pattern, the interaction between ammonia and pH, the deviation from that pond's specific baseline — that determines whether a crop is at risk.

Experienced farm technicians have this intuition built up over years. But they can't monitor 40 ponds simultaneously, they're not always on-site, and that knowledge walks out the door when they leave.

We needed to encode that expertise into a model that runs continuously, at scale, in the background — and surfaces actionable risk signals rather than raw numbers.

Pond Score: The Model

Pond Score is a daily risk index (0–100) computed for every active pond in the system. It answers one question:

"Based on everything we know about this pond in the last 7 days, how likely is a significant mortality event in the next 7 days?"

Score range	Interpretation	Action
0–29	Healthy operation	No action needed
30–69	Elevated risk	Technician review recommended
70–100	High risk	Immediate inspection triggered + WhatsApp alert to farm owner
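
The banding in the table can be written as a small lookup. This is a minimal sketch, not the production code: the function name `score_band` and the tuple it returns are assumptions made for illustration, and treating fractional scores below 30 as healthy is also an assumption, since the table only lists whole-number ranges.

```python
def score_band(score: float) -> tuple[str, str]:
    """Map a 0-100 Pond Score to (interpretation, action) following the table."""
    if not 0 <= score <= 100:
        raise ValueError(f"Pond Score must be in 0-100, got {score}")
    if score < 30:
        return ("Healthy operation", "No action needed")
    if score < 70:
        return ("Elevated risk", "Technician review recommended")
    # 70 and above: the band that triggers inspection and the owner alert.
    return ("High risk", "Immediate inspection + WhatsApp alert to farm owner")
```

For example, `score_band(78)` returns the High risk band and its action.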

Feature Engineering

The model consumes features derived from the WaterReport time series. Features fall into four categories:

Absolute parameter values (most recent reading): Dissolved oxygen (DO), pH, temperature, salinity, ammonia, nitrite, alkalinity, turbidity.

Trend features (24h, 48h, 72h windows):

Rate of change per parameter (Δ per hour) over each window
Standard deviation of DO over the last 12 hours — a volatility proxy for night-time oxygen crash risk
Minimum DO in the last 24 hours (the critical overnight reading)
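
The trend features listed above can be sketched in plain Python for a single parameter (DO). This is an illustrative sketch under stated assumptions, not the production feature code: the reading format (timestamp, value), the function names, the feature keys, and the choice of population standard deviation are all hypothetical. Missing features come back as `None`, which would be passed to the model as missing values.

```python
from datetime import datetime, timedelta
from statistics import pstdev

# Assumed reading shape: (timestamp, dissolved oxygen in mg/L)
Reading = tuple[datetime, float]

def window(readings: list[Reading], now: datetime, hours: int) -> list[Reading]:
    """Readings taken within the last `hours` before `now`, oldest first."""
    start = now - timedelta(hours=hours)
    return sorted((t, v) for t, v in readings if start <= t <= now)

def rate_of_change(readings: list[Reading], now: datetime, hours: int):
    """Change per hour between the first and last reading in the window."""
    w = window(readings, now, hours)
    if len(w) < 2:
        return None  # not enough data to estimate a trend
    (t0, v0), (t1, v1) = w[0], w[-1]
    elapsed_h = (t1 - t0).total_seconds() / 3600
    return (v1 - v0) / elapsed_h if elapsed_h > 0 else None

def do_trend_features(readings: list[Reading], now: datetime) -> dict:
    """Rate of change over 24/48/72 h, 12 h volatility, and 24 h minimum."""
    features = {
        f"do_delta_per_h_{h}h": rate_of_change(readings, now, h)
        for h in (24, 48, 72)
    }
    last_12h = [v for _, v in window(readings, now, 12)]
    last_24h = [v for _, v in window(readings, now, 24)]
    features["do_std_12h"] = pstdev(last_12h) if len(last_12h) >= 2 else None
    features["do_min_24h"] = min(last_24h) if last_24h else None
    return features
```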

Pattern and interaction features:

Time-since-last-reading (gaps indicate technician absence — itself a risk signal)
Deviation from pond-specific 30-day rolling baseline: absolute values vary by species and location; deviation from that pond's own normal is more predictive than the absolute value
pH × ammonia interaction: proxy for toxic un-ionized ammonia (NH₃), which is highly pH-dependent
Temperature × DO saturation index: thermal stress signal
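
A minimal sketch of three of the features above follows. All function names are hypothetical. The source describes the interaction feature only as pH × ammonia; the `unionized_ammonia_fraction` helper is an addition for context, using the commonly cited Emerson et al. (1975) freshwater approximation for the ammonium pKa, and it ignores the effect of salinity. It is included to show why pH matters so much, not as a description of the production feature.

```python
from datetime import datetime
from statistics import mean

def hours_since_last_reading(timestamps: list[datetime], now: datetime):
    """Gap in hours between `now` and the most recent reading (None if none)."""
    if not timestamps:
        return None
    return (now - max(timestamps)).total_seconds() / 3600

def baseline_deviation(current: float, baseline_values: list[float]):
    """Current value minus the mean of the pond's own baseline window.

    `baseline_values` is assumed to hold that pond's readings from the
    last 30 days.
    """
    if not baseline_values:
        return None
    return current - mean(baseline_values)

def ph_ammonia_interaction(ph: float, total_ammonia_mg_l: float) -> float:
    """Simple product feature, as described in the list above."""
    return ph * total_ammonia_mg_l

def unionized_ammonia_fraction(ph: float, temp_c: float) -> float:
    """Fraction of total ammonia present as toxic NH3 (freshwater approximation)."""
    pka = 0.09018 + 2729.92 / (temp_c + 273.15)
    return 1.0 / (1.0 + 10 ** (pka - ph))
```

Under this approximation at 25 °C, the un-ionized fraction is a little over 5% at pH 8.0 and rises steeply from there, which is why the same ammonia reading can be harmless at one pH and dangerous at another.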

Contextual features:

Days since pond seeding (younger crops are more vulnerable)
Species (shrimp vs fish — distinct parameter tolerance profiles)
Season / monsoon flag (monsoon season has distinct risk profiles across all parameters)

Model Training

Training data: 18 months of production WaterReport records from AquaStackX, labeled with mortality events (cause, date, severity as % of stock). Approximately 4,200 pond-weeks of labeled data.

Target variable: Binary classification — will this pond have a significant mortality event (>5% of stock) in the next 7 days?

Algorithm: XGBoost gradient boosted classifier. Selected for three reasons:

Handles missing values natively — not all ponds record all parameters at all times
Interpretable feature importance via SHAP values — farm managers need to understand why a score is high, not just that it is
Fast inference: scoring 200 ponds takes under 1 second

Validation: Walk-forward (time-based) split, training on months 1–12 and evaluating on months 13–18, so that no future outcomes leak into training.
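
The time-based split can be sketched as follows. This is an illustration under assumptions: each pond-week is represented as a dict with a hypothetical `week_start` date, and the helper names are invented for the example. The point is that rows are assigned by calendar position, never shuffled.

```python
from datetime import date

def months_since(start: date, d: date) -> int:
    """1-based month index of `d`, counting the month of `start` as month 1."""
    return (d.year - start.year) * 12 + (d.month - start.month) + 1

def time_based_split(rows: list[dict], start: date,
                     train_months: int = 12, total_months: int = 18):
    """Train on months 1..train_months, evaluate on the months after that."""
    train, holdout = [], []
    for row in rows:
        m = months_since(start, row["week_start"])
        if 1 <= m <= train_months:
            train.append(row)
        elif train_months < m <= total_months:
            holdout.append(row)
        # Rows outside the 18-month range are ignored.
    return train, holdout
```

Because the label looks 7 days ahead, one further precaution worth considering is dropping training rows whose label window crosses the cutoff. The source does not say whether this was done.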

Back-test results:

Metric	Value
Overall accuracy	91%
Precision at score > 70	84%
Recall at score > 70	88%
False alarm rate	16%
AUC-ROC	0.87

A 16% false alarm rate is acceptable here: a false positive means an unnecessary technician visit (small cost), while a false negative means a missed mortality event (potentially catastrophic crop loss). The model is therefore tuned to favor recall over precision.
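
The 84% precision and 16% false alarm rate in the table sum to 100%. That suggests, though the source does not say so, that "false alarm rate" here means the share of alerts that turned out to be false (1 − precision), not the false positive rate over all healthy pond-weeks. The sketch below computes both from confusion-matrix counts so the distinction is explicit. The function name and the counts in the usage note are invented for illustration and are not the back-test data.

```python
def alert_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Metrics for the 'alert fired' decision from confusion-matrix counts."""
    alerts = tp + fp          # every alert that fired
    events = tp + fn          # every real mortality event
    healthy = fp + tn         # every pond-week with no event
    total = tp + fp + fn + tn
    return {
        "precision": tp / alerts if alerts else None,
        "recall": tp / events if events else None,
        # Share of alerts that were false: equals 1 - precision.
        "false_alarm_share": fp / alerts if alerts else None,
        # Share of healthy pond-weeks that still raised an alert.
        "false_positive_rate": fp / healthy if healthy else None,
        "accuracy": (tp + tn) / total if total else None,
    }
```

With made-up counts of 42 true alerts, 8 false alerts, 6 missed events, and 144 correct non-alerts, precision is 0.84 and the false alarm share is 0.16, while the false positive rate is only about 0.05. The two figures answer different questions.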

The Scoring Pipeline

Two scoring triggers:

Real-time: When a new WaterReport is submitted, a Supabase database function triggers feature extraction and re-scoring for that pond immediately. The updated score is visible in the app within 30 seconds.
Nightly batch: At 02:00 IST, all active ponds are re-scored using the latest 7-day window. This catches drift in ponds where no new readings were logged that day (absence of readings is itself a feature).
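
The nightly batch can be sketched as a loop over active ponds that rebuilds each pond's 7-day window and re-scores it, even when the window is empty. This is a sketch under assumptions, not the production job: the record shapes (`id`, `active`, `recorded_at`), the function names, and the `score_fn` callable standing in for feature extraction plus model inference are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

# India Standard Time is a fixed UTC+05:30 offset.
IST = timezone(timedelta(hours=5, minutes=30))

def scoring_window(readings: list[dict], as_of: datetime, days: int = 7) -> list[dict]:
    """Readings recorded in the `days` before `as_of`."""
    start = as_of - timedelta(days=days)
    return [r for r in readings if start < r["recorded_at"] <= as_of]

def nightly_batch(ponds: list[dict], readings_by_pond: dict,
                  score_fn, as_of: datetime) -> dict:
    """Re-score every active pond, including ponds with no new readings."""
    scores = {}
    for pond in ponds:
        if not pond["active"]:
            continue
        recent = scoring_window(readings_by_pond.get(pond["id"], []), as_of)
        # An empty window is still scored: missing readings are a signal.
        scores[pond["id"]] = score_fn(recent, as_of)
    return scores
```

A scheduler would call `nightly_batch` with `as_of` set to 02:00 in `IST` for the current day.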

Explainability: SHAP in Production

Each Pond Score record stores the top 3 contributing features and their SHAP values alongside the score. When a farm manager sees a risk score of 78, they also see:

"Elevated ammonia trend (+0.3 mg/L over 48h) · Below-baseline DO (−1.8 mg/L vs 30-day average) · pH outside 7.5–8.5 range for 6 consecutive readings"

This is not optional — it's essential for adoption. Farmers and technicians will not act on a black-box number. They act on specific, actionable signals they can verify themselves with a handheld meter.
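
Selecting the top contributors can be sketched as below. The sketch assumes per-feature SHAP values have already been computed for one pond (for example with a tree explainer from the `shap` library) and are available as a plain dict. The function name, the feature names in the usage note, and the choice to keep only features that push the risk up are assumptions; the source does not say whether ranking is by signed or absolute contribution.

```python
def top_contributors(shap_values: dict[str, float], k: int = 3) -> list[tuple[str, float]]:
    """Return the k features that raise the risk score the most, largest first."""
    # Keep only features pushing the prediction toward higher risk.
    positive = [(name, value) for name, value in shap_values.items() if value > 0]
    return sorted(positive, key=lambda item: item[1], reverse=True)[:k]
```

Given values such as `{"ammonia_trend_48h": 0.9, "do_vs_baseline": 0.6, "ph_out_of_range": 0.4, "salinity": 0.1}`, the function returns the first three entries in that order. A separate templating step would then turn each feature into the human-readable phrase shown to the farm manager.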

Integration in the AquaStackX Product

Pond Score is fully embedded, not bolted on:

Mobile app: Each pond card shows the current Pond Score with a color indicator (green / amber / red) and the top contributing factor
CRM dashboard: 30-day score trend chart per pond, with mortality events overlaid for post-event analysis
Technician queue: Ponds are sorted by Pond Score on the technician's daily task list — highest-risk ponds get visited first
Alert configuration: Tenants can configure their own alert threshold (some farm owners want alerts at >60; others at >75)
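
The technician queue and the per-tenant alert threshold can be sketched in a few lines. The record fields (`pond_id`, `tenant_id`, `score`) and function names are hypothetical, and the sketch assumes every tenant has a threshold configured. The strict `>` comparison follows the ">60" and ">75" wording above.

```python
def technician_queue(pond_scores: list[dict]) -> list[dict]:
    """Order ponds so the highest-risk ones are visited first."""
    return sorted(pond_scores, key=lambda p: p["score"], reverse=True)

def ponds_to_alert(pond_scores: list[dict], tenant_thresholds: dict[str, float]) -> list[str]:
    """Pond IDs whose score exceeds their tenant's configured threshold."""
    alerts = []
    for pond in pond_scores:
        threshold = tenant_thresholds[pond["tenant_id"]]
        if pond["score"] > threshold:
            alerts.append(pond["pond_id"])
    return alerts
```

The same score of 65 therefore alerts a tenant configured at 60 but not one configured at 75.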

What This Demonstrates

Pond Score is a production ML system running on real farm data — not a notebook, not a proof of concept. It fires alerts that influence real operational decisions, and it gets retrained periodically as new labeled data accumulates.

The architecture here — time-series feature engineering, gradient boosted classification, SHAP-based explainability embedded in the product UI, real-time scoring pipeline via database triggers — is directly applicable to any domain with historical event data and predictive requirements:

Equipment failure prediction (manufacturing / industrial IoT)
Customer churn scoring (SaaS / financial services)
Credit risk assessment (lending / NBFC)
Supply chain disruption prediction (logistics)
Patient deterioration scoring (healthcare)

The domain changes. The engineering pattern does not.
