Lead Intelligence Platform for Automotive Dealerships: 2.8x Improvement in Lead-to-Test-Drive Conversion
Every lead gets a salesperson's attention. The problem is equal attention — treating a Google ad click the same as a walk-in with financing pre-approval. We built a lead scoring platform that told dealerships, within 60 seconds of enquiry, which leads were most likely to buy.
The Automotive Lead Problem
In US automotive retail, leads are expensive. A dealership spending $30,000/month on digital marketing might generate 400–600 leads. Each lead represents real ad spend, real inventory holding cost, and real salesperson time.
The problem: in most dealerships, all 600 leads went into the same queue, processed in the order they arrived. The lead from a person who had been browsing F-150 trim levels for three weeks, had financing pre-approval, and submitted a test drive request at 7pm on Friday got the same treatment — and often a slower response — as someone who clicked an ad by accident and submitted their number to unlock a price.
The result: salespeople spent time chasing cold leads and missing warm ones. Industry-wide lead-to-test-drive conversion hovered around 8–12%. The top 20% of leads by actual purchase intent converted at roughly the same rate as the bottom 20% — not because the leads were equally valuable, but because no one knew which was which.
This case study draws on a previous-company engagement at a platform serving 4,000+ US automotive dealerships. Client details are anonymised.
The Scoring Model
The platform computes a Lead Intelligence Score (0–100) for every inbound lead within 60 seconds of creation, answering one question: "How likely is this person to schedule a test drive and purchase a vehicle within the next 30 days?"
The model was trained on 18 months of historical data from the 4,000+ dealership network: 2.1 million labelled examples with known outcomes (test drive scheduled, test drive completed, purchase, no purchase within 30 days).
Feature Engineering
Features were engineered across three categories:
Enquiry and Source Features
| Feature | Notes |
|---|---|
| Traffic source | Organic, paid search, paid social, third-party portal, referral, direct |
| Enquiry type | Test drive request and financing enquiry are strong purchase intent signals |
| Vehicle type | New, certified pre-owned, used — each has a distinct conversion profile |
| Vehicle price tier | Luxury vs mass market (different urgency and decision cycle) |
| Trim specificity | Enquiry naming a specific trim/package indicates late-stage research |
| Device type | Mobile test drive requests convert at 1.9x desktop |
| Day and hour of submission | Friday evening enquiries convert at 2.1x Monday morning |
Prospect Behavior Features (where session data is available)
| Feature | Notes |
|---|---|
| Return visitor flag | Prior site visits indicate active consideration |
| Time on site before submission | Longer dwell time = higher intent |
| Number of VDP (vehicle detail page) views | Breadth of inventory browsed |
| Financing calculator interaction | Strong intent signal |
| Geographic distance from dealership | Leads within 15 miles convert at 3.1x vs 30+ miles |
Post-Enquiry Response Features (updated in real time)
| Feature | Notes |
|---|---|
| First call answered | A first call answered within 5 minutes predicts conversion at 4.1x |
| Email opened within 2 hours | Engagement with first outreach |
| SMS / text response | Two-way engagement signal |
| Test drive appointment booked, rescheduled, or cancelled | Direct intent signal; rescheduled > no-show > cancelled |
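A minimal sketch of how a raw lead payload might be flattened into a model-ready vector. The field names and feature order here are illustrative, not the production schema; the key point is that absent session data becomes NaN rather than an error, which the gradient-boosted model tolerates natively:

```python
import math

# Illustrative feature order; the production schema is not shown in this write-up.
FEATURE_NAMES = [
    "is_test_drive_request",   # enquiry-type intent signal
    "is_financing_enquiry",
    "is_mobile",               # mobile test drive requests convert at 1.9x desktop
    "is_return_visitor",
    "minutes_on_site",
    "vdp_views",
    "used_financing_calculator",
    "distance_miles",          # leads within 15 miles convert at 3.1x vs 30+ miles
]

def lead_to_features(lead: dict) -> list:
    """Flatten a raw lead payload into a fixed-order numeric vector.

    Session-level fields may be absent (not all enquiry sources provide
    them); they are emitted as NaN so a model that handles missing
    values natively can still score the lead.
    """
    session = lead.get("session") or {}

    def num(value):
        return float(value) if value is not None else math.nan

    return [
        1.0 if lead.get("enquiry_type") == "test_drive" else 0.0,
        1.0 if lead.get("enquiry_type") == "financing" else 0.0,
        1.0 if lead.get("device") == "mobile" else 0.0,
        num(session.get("return_visitor")),
        num(session.get("minutes_on_site")),
        num(session.get("vdp_views")),
        num(session.get("financing_calculator")),
        num(lead.get("distance_miles")),
    ]
```

A lead with no session data still produces a complete vector, just with NaN in the behavioural slots.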
Model Architecture and Validation
Algorithm: XGBoost gradient boosted classifier.
Chosen because:
- Handles missing values natively — not all enquiry sources provide all features; the model degrades gracefully on incomplete data rather than failing
- Interpretable via SHAP — dealership sales managers need to understand what predicts conversion at their specific store, not just receive a score
- Fast inference at serving time — model scoring itself is sub-millisecond, leaving the sub-60-second end-to-end budget for feature extraction and CRM delivery
Validation: Time-based walk-forward split — train on months 1–15, validate on months 16–18. No data leakage from future outcomes.
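The split can be sketched as a simple month-based partition (the record shape is illustrative); every validation outcome postdates the training window, so no future outcomes leak into training:

```python
def walk_forward_split(records, train_months=range(1, 16), valid_months=range(16, 19)):
    """Partition labelled leads by calendar month index.

    Training uses months 1-15 and validation months 16-18 of the
    18-month history, so the validation period strictly follows the
    training period in time.
    """
    train = [r for r in records if r["month"] in train_months]
    valid = [r for r in records if r["month"] in valid_months]
    return train, valid
```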
Model performance:
| Metric | Value |
|---|---|
| AUC-ROC | 0.84 |
| Precision at top decile (score > 90) | 78% |
| Precision at top quintile (score > 70) | 64% |
| Test drive rate for bottom three deciles (score < 30) | 4.1% — near random |
The bottom three deciles converting at near-random rate was the key insight for the business case: salespeople were spending significant time on leads with 4% conversion rates while high-intent leads waited.
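The decile analysis behind that insight is straightforward to reproduce; a sketch over (score, converted) pairs, with decile 1 being the highest-scoring 10% of leads:

```python
def conversion_by_decile(scored_leads):
    """Rank leads by model score, split into ten equal buckets, and
    report the observed test-drive conversion rate per decile.

    `scored_leads` is a list of (score, converted) tuples.
    """
    ranked = sorted(scored_leads, key=lambda x: x[0], reverse=True)
    n = len(ranked)
    rates = []
    for d in range(10):
        bucket = ranked[d * n // 10 : (d + 1) * n // 10]
        converted = sum(1 for _, was_converted in bucket if was_converted)
        rates.append(converted / len(bucket))
    return rates
```

Comparing the bottom three entries of this list against the base rate is exactly the comparison that showed the low deciles converting near-randomly.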
The Scoring Pipeline
End-to-end latency: New lead created → score available in CRM: < 60 seconds at p95 under production load.
Explainability: SHAP at the Point of Decision
Every Lead Intelligence Score is accompanied by the top 3 contributing SHAP factors, rendered in plain English on the lead card:
Score: 91 — Test drive requested · 5.2 miles from dealership · F-150 Lariat (specific trim selected)
Score: 34 — General info request · 47 miles from dealership · First site visit
This is not optional. Salespeople will not act on a number they cannot explain. When a salesperson can read the reason a score is high, they can validate it against their own judgment and act with confidence. The SHAP output is the adoption mechanism for the model.
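A sketch of the rendering step, assuming per-lead SHAP values have already been computed upstream. The feature names and plain-English templates here are illustrative; the production copy is dealership-facing and maintained separately:

```python
# Illustrative templates mapping feature names to lead-card copy.
EXPLANATIONS = {
    "is_test_drive_request": "Test drive requested",
    "distance_miles": "{value} miles from dealership",
    "trim_specificity": "Specific trim selected",
    "is_first_visit": "First site visit",
}

def lead_card_reasons(shap_values: dict, feature_values: dict, top_n=3):
    """Pick the top-N features by absolute SHAP contribution and render
    each through its plain-English template for the lead card."""
    top = sorted(shap_values, key=lambda f: abs(shap_values[f]), reverse=True)[:top_n]
    return [EXPLANATIONS[f].format(value=feature_values.get(f, "")) for f in top]
```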
CRM Integration and the Priority Queue
The default CRM lead view was re-sorted by Lead Intelligence Score, descending. When a salesperson opens their queue in the morning, the 15 leads most likely to convert are at the top, not the 15 most recent.
For leads scoring above 85, an immediate push notification fires to the assigned salesperson with the vehicle, request type, proximity, and a nudge to respond within 5 minutes (the 5-minute response window predicting 4.1x conversion was shared with the sales team to give the nudge credibility).
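Both behaviours are a thin layer on top of the score. A sketch with the threshold and payload fields as described above (the function names and lead shape are hypothetical):

```python
HOT_LEAD_THRESHOLD = 85  # scores above this fire an immediate push notification

def prioritise_queue(leads):
    """Re-sort the CRM lead view by Lead Intelligence Score descending,
    so the most likely-to-convert leads sit at the top of the queue."""
    return sorted(leads, key=lambda lead: lead["score"], reverse=True)

def hot_lead_alerts(leads):
    """Build push-notification payloads for hot leads, nudging the
    assigned salesperson to respond within 5 minutes."""
    return [
        {
            "salesperson": lead["assigned_to"],
            "message": (
                f"Hot lead ({lead['score']}): {lead['vehicle']}, "
                f"{lead['request_type']}, {lead['distance_miles']} mi away. "
                f"Respond within 5 minutes."
            ),
        }
        for lead in leads
        if lead["score"] > HOT_LEAD_THRESHOLD
    ]
```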
Weekly model retraining: A scheduled job retrains the model weekly on a rolling 18-month window. This keeps the model current with seasonal patterns (spring selling season looks different from Q4), inventory mix changes, and shifts in lead source quality as marketing campaigns evolve.
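The data-selection side of that job reduces to a rolling date window; a sketch (window length approximated as 548 days, lead shape hypothetical):

```python
from datetime import date, timedelta

ROLLING_WINDOW_DAYS = 548  # ~18 months

def retraining_window(today: date):
    """Return the (start, end) date bounds of the rolling 18-month
    training window used by the weekly retraining job."""
    return today - timedelta(days=ROLLING_WINDOW_DAYS), today

def select_training_leads(leads, today: date):
    """Keep only labelled leads created inside the rolling window, so
    seasonal patterns and inventory-mix shifts stay current."""
    start, end = retraining_window(today)
    return [lead for lead in leads if start <= lead["created"] <= end]
```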
Feature importance dashboard (Metabase): Each dealership's sales manager sees a dashboard showing:
- Current week's average lead score distribution
- Conversion rate by score decile (their dealership vs network average)
- Top 5 features driving high-conversion leads at their store specifically
- Geographic heatmap of high-intent leads by zip code
This lets sales managers make marketing budget allocation decisions using model insights — shift spend toward the sources generating the highest-intent leads.
Results
Measured over 6 months post-deployment across the dealership network:
| Metric | Before | After | Change |
|---|---|---|---|
| Lead-to-test-drive rate (score > 70 / top quintile) | 11.3% | 31.6% | +2.8x |
| Lead-to-test-drive rate (overall) | 11.3% | 17.8% | +57% |
| Time-to-first-response for top-scored leads | 38 min avg | 6.4 min avg | −83% |
| Salesperson time on top-quintile leads | 28% of day | 60% of day | +2.1x focus |
| Marketing ROI (same spend, more conversions) | Baseline | +34% | Improved |
The overall conversion rate improvement (11.3% → 17.8%) reflects a secondary effect: correctly identifying the bottom 30% of leads as low-priority freed salesperson time to work more deals across the middle tier, lifting overall throughput.
What This Demonstrates
Lead scoring is a well-understood ML problem. What makes this implementation worth studying is the operational integration:
- Sub-60-second scoring latency with a production-grade feature extraction pipeline — this is what makes the push notification actionable rather than a retrospective curiosity
- SHAP explainability surfaced to end users (salespeople, not data scientists) — the insight is presented at the point of decision, in language that matches the domain
- Weekly retraining cadence operationalized — not a one-time model deployment, but a live system that evolves with the business
- CRM priority queue reordering as the primary activation mechanism — the model's output changes how agents start their day, not just what appears in a dashboard
The domain is US automotive. The same pattern — behavioral scoring, sub-minute scoring pipeline, SHAP explainability at the decision point, automated retraining — applies directly to any business with inbound lead volume: real estate, financial services, SaaS, healthcare, insurance, legal services.
Not Sure Where to Start?
Book a free 30-minute strategy session with a senior data architect — no pitch, no obligation.
Schedule Your Free Strategy Session