Lead Intelligence Platform for Automotive Dealerships: 2.8x Improvement in Lead-to-Test-Drive Conversion
Every lead gets a salesperson's attention. The problem is equal attention — treating a Google ad click the same as a walk-in with financing pre-approval. We built a lead scoring platform that told dealerships, within 60 seconds of enquiry, which leads were most likely to buy.
The Automotive Lead Problem
In US automotive retail, leads are expensive. A dealership spending $30,000/month on digital marketing might generate 400–600 leads. Each lead represents real ad spend, real inventory holding cost, and real salesperson time.
The problem: in most dealerships, all 600 leads went into the same queue, processed in the order they arrived. The lead from a person who had been browsing F-150 trim levels for three weeks, had financing pre-approval, and submitted a test drive request at 7pm on Friday got the same treatment — and often a slower response — as someone who clicked an ad by accident and submitted their number to unlock a price.
The result: salespeople spent time chasing cold leads and missing warm ones. Industry-wide lead-to-test-drive conversion hovered around 8–12%. The top 20% of leads by actual purchase intent converted at roughly the same rate as the bottom 20% — not because the leads were equally valuable, but because no one knew which was which.
This case study draws on a previous-company engagement at a platform serving 4,000+ US automotive dealerships. Client details are anonymised.
The Scoring Model
The platform computes a Lead Intelligence Score (0–100) for every inbound lead within 60 seconds of creation, answering one question: "How likely is this person to schedule a test drive and purchase a vehicle within the next 30 days?"
The model was trained on 18 months of historical data from the 4,000+ dealership network: 2.1 million labelled examples with known outcomes (test drive scheduled, test drive completed, purchase, no purchase within 30 days).
Feature Engineering
Features were engineered across three categories:
Enquiry and Source Features
| Feature | Notes |
|---|---|
| Traffic source | Organic, paid search, paid social, third-party portal, referral, direct |
| Enquiry type | Test drive request and financing enquiry are strong purchase intent signals |
| Vehicle type | New, certified pre-owned, used — each has a distinct conversion profile |
| Vehicle price tier | Luxury vs mass market (different urgency and decision cycle) |
| Trim specificity | Enquiry naming a specific trim/package indicates late-stage research |
| Device type | Mobile test drive requests convert at 1.9x desktop |
| Day and hour of submission | Friday evening enquiries convert at 2.1x Monday morning |
Prospect Behavior Features (where session data is available)
| Feature | Notes |
|---|---|
| Return visitor flag | Prior site visits indicate active consideration |
| Time on site before submission | Longer dwell time = higher intent |
| Number of VDP (vehicle detail page) views | Breadth of inventory browsed |
| Financing calculator interaction | Strong intent signal |
| Geographic distance from dealership | Leads within 15 miles convert at 3.1x vs 30+ miles |
Post-Enquiry Response Features (updated in real time)
| Feature | Notes |
|---|---|
| First call answered | A first call answered within 5 minutes predicts conversion at 4.1x |
| Email opened within 2 hours | Engagement with first outreach |
| SMS / text response | Two-way engagement signal |
| Test drive appointment booked, rescheduled, or cancelled | Direct intent signal; rescheduled > no-show > cancelled |
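A minimal sketch of how a raw lead payload might be flattened into a model-ready vector. The field names and feature order here are illustrative, not the production schema; the key point is that absent session data becomes NaN rather than an error, which the gradient-boosted model tolerates natively:

```python
import math

# Illustrative feature order; the production schema is not shown in this write-up.
FEATURE_NAMES = [
    "is_test_drive_request",   # enquiry-type intent signal
    "is_financing_enquiry",
    "is_mobile",               # mobile test drive requests convert at 1.9x desktop
    "is_return_visitor",
    "minutes_on_site",
    "vdp_views",
    "used_financing_calculator",
    "distance_miles",          # leads within 15 miles convert at 3.1x vs 30+ miles
]

def lead_to_features(lead: dict) -> list:
    """Flatten a raw lead payload into a fixed-order numeric vector.

    Session-level fields may be absent (not all enquiry sources provide
    them); they are emitted as NaN so a model that handles missing
    values natively can still score the lead.
    """
    session = lead.get("session") or {}

    def num(value):
        return float(value) if value is not None else math.nan

    return [
        1.0 if lead.get("enquiry_type") == "test_drive" else 0.0,
        1.0 if lead.get("enquiry_type") == "financing" else 0.0,
        1.0 if lead.get("device") == "mobile" else 0.0,
        num(session.get("return_visitor")),
        num(session.get("minutes_on_site")),
        num(session.get("vdp_views")),
        num(session.get("financing_calculator")),
        num(lead.get("distance_miles")),
    ]
```

A lead with no session data still produces a complete vector, just with NaN in the behavioural slots.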
Model Architecture and Validation
Algorithm: XGBoost gradient boosted classifier.
Chosen because:
- Handles missing values natively — not all enquiry sources provide all features; the model degrades gracefully on incomplete data rather than failing
- Interpretable via SHAP — dealership sales managers need to understand what predicts conversion at their specific store, not just receive a score
- Fast inference at serving time — model scoring itself is sub-millisecond, leaving the sub-60-second end-to-end budget for feature extraction and CRM delivery
Validation: Time-based walk-forward split — train on months 1–15, validate on months 16–18. No data leakage from future outcomes.
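The split can be sketched as a simple month-based partition (the record shape is illustrative); every validation outcome postdates the training window, so no future outcomes leak into training:

```python
def walk_forward_split(records, train_months=range(1, 16), valid_months=range(16, 19)):
    """Partition labelled leads by calendar month index.

    Training uses months 1-15 and validation months 16-18 of the
    18-month history, so the validation period strictly follows the
    training period in time.
    """
    train = [r for r in records if r["month"] in train_months]
    valid = [r for r in records if r["month"] in valid_months]
    return train, valid
```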
Model performance:
| Metric | Value |
|---|---|
| AUC-ROC | 0.84 |
| Precision at top decile (score > 90) | 78% |
| Precision at top quintile (score > 70) | 64% |
| Test drive rate for bottom three deciles (score < 30) | 4.1% — near random |
The bottom three deciles converting at near-random rate was the key insight for the business case: salespeople were spending significant time on leads with 4% conversion rates while high-intent leads waited.
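The decile analysis behind that insight is straightforward to reproduce; a sketch over (score, converted) pairs, with decile 1 being the highest-scoring 10% of leads:

```python
def conversion_by_decile(scored_leads):
    """Rank leads by model score, split into ten equal buckets, and
    report the observed test-drive conversion rate per decile.

    `scored_leads` is a list of (score, converted) tuples.
    """
    ranked = sorted(scored_leads, key=lambda x: x[0], reverse=True)
    n = len(ranked)
    rates = []
    for d in range(10):
        bucket = ranked[d * n // 10 : (d + 1) * n // 10]
        converted = sum(1 for _, was_converted in bucket if was_converted)
        rates.append(converted / len(bucket))
    return rates
```

Comparing the bottom three entries of this list against the base rate is exactly the comparison that showed the low deciles converting near-randomly.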
The Scoring Pipeline
End-to-end latency: New lead created → score available in CRM: < 60 seconds at p95 under production load.
Explainability: SHAP at the Point of Decision
Every Lead Intelligence Score is accompanied by the top 3 contributing SHAP factors, rendered in plain English on the lead card:
Score: 91 — Test drive requested · 5.2 miles from dealership · F-150 Lariat (specific trim selected)
Score: 34 — General info request · 47 miles from dealership · First site visit
This is not optional. Salespeople will not act on a number they cannot explain. When a salesperson can read the reason a score is high, they can validate it against their own judgment and act with confidence. The SHAP output is the adoption mechanism for the model.
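A sketch of the rendering step, assuming per-lead SHAP values have already been computed upstream. The feature names and plain-English templates here are illustrative; the production copy is dealership-facing and maintained separately:

```python
# Illustrative templates mapping feature names to lead-card copy.
EXPLANATIONS = {
    "is_test_drive_request": "Test drive requested",
    "distance_miles": "{value} miles from dealership",
    "trim_specificity": "Specific trim selected",
    "is_first_visit": "First site visit",
}

def lead_card_reasons(shap_values: dict, feature_values: dict, top_n=3):
    """Pick the top-N features by absolute SHAP contribution and render
    each through its plain-English template for the lead card."""
    top = sorted(shap_values, key=lambda f: abs(shap_values[f]), reverse=True)[:top_n]
    return [EXPLANATIONS[f].format(value=feature_values.get(f, "")) for f in top]
```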
CRM Integration and the Priority Queue
The default CRM lead view was re-sorted by Lead Intelligence Score, descending. When a salesperson opens their queue in the morning, the 15 leads most likely to convert are at the top, not the 15 most recent.
For leads scoring above 85, an immediate push notification fires to the assigned salesperson with the vehicle, request type, proximity, and a nudge to respond within 5 minutes (the 5-minute response window predicting 4.1x conversion was shared with the sales team to give the nudge credibility).
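Both behaviours are a thin layer on top of the score. A sketch with the threshold and payload fields as described above (the function names and lead shape are hypothetical):

```python
HOT_LEAD_THRESHOLD = 85  # scores above this fire an immediate push notification

def prioritise_queue(leads):
    """Re-sort the CRM lead view by Lead Intelligence Score descending,
    so the most likely-to-convert leads sit at the top of the queue."""
    return sorted(leads, key=lambda lead: lead["score"], reverse=True)

def hot_lead_alerts(leads):
    """Build push-notification payloads for hot leads, nudging the
    assigned salesperson to respond within 5 minutes."""
    return [
        {
            "salesperson": lead["assigned_to"],
            "message": (
                f"Hot lead ({lead['score']}): {lead['vehicle']}, "
                f"{lead['request_type']}, {lead['distance_miles']} mi away. "
                f"Respond within 5 minutes."
            ),
        }
        for lead in leads
        if lead["score"] > HOT_LEAD_THRESHOLD
    ]
```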
Weekly model retraining: A scheduled job retrains the model weekly on a rolling 18-month window. This keeps the model current with seasonal patterns (spring selling season looks different from Q4), inventory mix changes, and shifts in lead source quality as marketing campaigns evolve.
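The data-selection side of that job reduces to a rolling date window; a sketch (window length approximated as 548 days, lead shape hypothetical):

```python
from datetime import date, timedelta

ROLLING_WINDOW_DAYS = 548  # ~18 months

def retraining_window(today: date):
    """Return the (start, end) date bounds of the rolling 18-month
    training window used by the weekly retraining job."""
    return today - timedelta(days=ROLLING_WINDOW_DAYS), today

def select_training_leads(leads, today: date):
    """Keep only labelled leads created inside the rolling window, so
    seasonal patterns and inventory-mix shifts stay current."""
    start, end = retraining_window(today)
    return [lead for lead in leads if start <= lead["created"] <= end]
```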
Feature importance dashboard (Metabase): Each dealership's sales manager sees a dashboard showing:
- Current week's average lead score distribution
- Conversion rate by score decile (their dealership vs network average)
- Top 5 features driving high-conversion leads at their store specifically
- Geographic heatmap of high-intent leads by zip code
This lets sales managers make marketing budget allocation decisions using model insights — shift spend toward the sources generating the highest-intent leads.
Results
Measured over 6 months post-deployment across the dealership network:
| Metric | Before | After | Change |
|---|---|---|---|
| Lead-to-test-drive rate (score > 70 / top quintile) | 11.3% | 31.6% | +2.8x |
| Lead-to-test-drive rate (overall) | 11.3% | 17.8% | +57% |
| Time-to-first-response for top-scored leads | 38 min avg | 6.4 min avg | −83% |
| Salesperson time on top-quintile leads | 28% of day | 60% of day | +2.1x focus |
| Marketing ROI (same spend, more conversions) | Baseline | +34% | Improved |
The overall conversion rate improvement (11.3% → 17.8%) reflects a secondary effect: correctly identifying the bottom 30% of leads as low-priority freed salesperson time to work more deals across the middle tier, lifting overall throughput.
What This Demonstrates
Lead scoring is a well-understood ML problem. What makes this implementation worth studying is the operational integration:
- Sub-60-second scoring latency with a production-grade feature extraction pipeline — this is what makes the push notification actionable rather than a retrospective curiosity
- SHAP explainability surfaced to end users (salespeople, not data scientists) — the insight is presented at the point of decision, in language that matches the domain
- Weekly retraining cadence operationalized — not a one-time model deployment, but a live system that evolves with the business
- CRM priority queue reordering as the primary activation mechanism — the model's output changes how agents start their day, not just what appears in a dashboard
The domain is US automotive. The same pattern — behavioral scoring, sub-minute scoring pipeline, SHAP explainability at the decision point, automated retraining — applies directly to any business with inbound lead volume: real estate, financial services, SaaS, healthcare, insurance, legal services.
Not Sure Where to Start?
Book a free 30-minute strategy session with a senior data architect — no pitch, no obligation.
Schedule Your Free Strategy Session