CRM data quality with AI: contact name cleanup and automated classification in HubSpot
- hubspot
- ai
- crm
A CRM with 14,000 contacts isn’t worth much if 25% of records have wrong, empty or duplicated names. And it isn’t worth much either if every contact is classified as “Other” because nobody has the time to fill in the “contact type” field by hand. The problem isn’t the CRM itself: HubSpot does its job. The problem is the flow upstream. Leads arriving from different forms, imported from trade shows, synced from WhatsApp, collected during events. Every channel has its own way of writing “info@companyxyz.com” in the name field, or putting everything in caps, or leaving the last name blank.
This post describes how we tackled the problem on a real HubSpot CRM in the B2C education/travel sector, combining deterministic rules, a fallback chain and generative AI: four layers of cleanup, a skill for classification into 16 categories, and an eval-driven approach that let us measure quality before going into production.
The actual problem, not the case-study version
The first CRM audit showed three families of dirty data:
- Badly populated names: all caps, wrong accents, “info@” as first name, full name duplicated in both firstname and lastname.
- Empty names: 2,759 contacts out of 14,000 had blank firstname and lastname, but valid email.
- Untyped contacts: 60% of the database with a blank or generically set “function” (contact type) field.
Manual cleanup wasn’t viable. “ChatGPT-style” cleanup without guardrails was dangerous: we would have rewritten thousands of records without being able to show what was changed and why.
Layer 1: deterministic pattern matching on email
Before calling any AI, we tried to extract names from the email local part. Across 2,759 empty contacts, we identified 8 recurring patterns:
- firstname.lastname@domain.com
- firstname-lastname@
- firstname_lastname@
- lastname.firstname@ (reversed dot form)
- f.lastname@ (initial + last name)
- lastname.f@ (last name + initial)
- single token (mark@): match against a names dictionary
- concatenated form (markrossi@): segmentation with dictionary
Each pattern has a priority: the most reliable ones (classic dot form) win over the ambiguous ones (single token). Result: 1,011 names recovered out of 2,759, zero AI in this step, CSV output ready for import.
The point here isn’t magic. It’s that this work is much faster than a manual correction and safer than an open-ended LLM prompt, because every decision is inspectable.
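A minimal sketch of what prioritised matching on the email local part could look like. It is an illustration, not the production code: the function name, the pattern labels and the four-entry names dictionary are assumptions, and the checks simply run from the most reliable pattern to the most ambiguous one.

```python
import re
from typing import Optional, Tuple

# Hypothetical, tiny names dictionary; a real one would hold thousands of entries.
KNOWN_FIRST_NAMES = {"mark", "anna", "luca", "sara"}

def extract_name(email: str) -> Optional[Tuple[str, str, str]]:
    """Return (firstname, lastname, pattern), or None when nothing matches."""
    local = email.split("@", 1)[0].lower()

    # Highest priority: two alphabetic tokens joined by '.', '-' or '_'.
    m = re.fullmatch(r"([a-z]+)([._-])([a-z]+)", local)
    if m:
        left, sep, right = m.groups()
        if len(left) == 1 and len(right) > 1:      # f.lastname@
            return left.upper() + ".", right.capitalize(), "initial_lastname"
        if len(right) == 1 and len(left) > 1:      # lastname.f@
            return right.upper() + ".", left.capitalize(), "lastname_initial"
        if sep == "." and right in KNOWN_FIRST_NAMES and left not in KNOWN_FIRST_NAMES:
            return right.capitalize(), left.capitalize(), "reversed_dot"
        if len(left) > 1 and len(right) > 1:
            return left.capitalize(), right.capitalize(), "separator"
        return None

    if not local.isalpha():
        return None
    # Lower priority: a single token found in the dictionary (mark@).
    if local in KNOWN_FIRST_NAMES:
        return local.capitalize(), "", "single_token"
    # Lowest priority: concatenated form (markrossi@), split on a known first name.
    for name in sorted(KNOWN_FIRST_NAMES, key=lambda n: (-len(n), n)):
        if local.startswith(name) and len(local) > len(name):
            return name.capitalize(), local[len(name):].capitalize(), "concatenated"
    return None
```

Returning the pattern label alongside the name keeps each decision inspectable: the label can go straight into the CSV next to the proposed value, and addresses like info@companyxyz.com fall through every pattern and yield nothing.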
Layer 2: ORG_CONTACT detection
Many records with a “name” like Hotel Belvedere ltd or Lombardi Transport aren’t people: they’re organisations shoved into the person form. Treating them as individual contacts breaks reporting, segmentation and email marketing.
We built a hybrid detection based on:
- match against a dictionary of about 15 ORG_WORDS (hotel, ltd, transport, agency, school, tour, etc.)
- LONG_FN rule: firstname with more than 3 tokens is almost always an organisation
- domain match: if the “name” matches the company domain (bellavista for bellavista.com), it’s an organisation
On 10,772 named contacts, we identified 150 ORGs masquerading as persons. They were routed to a separate branch: tag updated, original name preserved as company.
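The three rules can be sketched as a single function that reports which rule fired. This is an assumption-laden illustration: the function name and the returned labels are made up, only the six ORG_WORDS quoted in this post are included, and the domain check naively takes the first label of the email domain.

```python
import re
from typing import Optional

# Only the six words quoted in the post; the full dictionary is not shown here.
ORG_WORDS = {"hotel", "ltd", "transport", "agency", "school", "tour"}
MAX_PERSON_TOKENS = 3  # LONG_FN: more than 3 tokens in firstname

def detect_org(firstname: str, lastname: str, email: str) -> Optional[str]:
    """Return the name of the rule that fired, or None if it looks like a person."""
    fn_tokens = re.findall(r"[a-z0-9]+", firstname.lower())
    all_tokens = fn_tokens + re.findall(r"[a-z0-9]+", lastname.lower())

    # Rule 1: any token is a known organisation word.
    if any(token in ORG_WORDS for token in all_tokens):
        return "ORG_WORD"
    # Rule 2: an unusually long firstname.
    if len(fn_tokens) > MAX_PERSON_TOKENS:
        return "LONG_FN"
    # Rule 3: the name, with spaces removed, equals the first label of the domain.
    domain_label = email.rsplit("@", 1)[-1].lower().split(".")[0]
    if domain_label and "".join(all_tokens) == domain_label:
        return "DOMAIN_MATCH"
    return None
```

For example, `detect_org("Bellavista", "", "info@bellavista.com")` returns `"DOMAIN_MATCH"`, while `detect_org("Mark", "Rossi", "mark.rossi@example.com")` returns `None`.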
Layer 3: fallback chain with HubSpot data
When email patterns and ORG detection fail, the enrichment chain kicks in:
- LinkedIn URL slug: if the contact has a populated LinkedIn URL, we extract the final URL segment and parse first/last name.
- Associated company: if the record is linked to a company, we pull salutation/owner.
- HubSpot name fields: title, salutation, any custom property.
- Email threads: last fallback, parsing of greetings (Hi Mark) and signatures (-- Mark Rossi) on associated email threads.
Each step has a confidence score; every decision is logged. If nothing reaches the threshold, the contact stays in “manual review”.
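As a rough sketch, an ordered chain with per-step confidence and logging might look like the following. Only two of the four steps are shown, and everything specific is an assumption: the `linkedin_url` and `email_bodies` keys, the confidence values and the 0.5 threshold are illustrative, not the values used in production.

```python
import re
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse

@dataclass
class Guess:
    firstname: str
    lastname: str
    source: str
    confidence: float  # illustrative values, not measured ones

def from_linkedin(contact: dict) -> Optional[Guess]:
    # Take the last URL path segment and keep its purely alphabetic parts.
    slug = urlparse(contact.get("linkedin_url") or "").path.rstrip("/").split("/")[-1]
    parts = [p for p in slug.split("-") if p.isalpha()]
    if len(parts) >= 2:
        return Guess(parts[0].capitalize(), parts[-1].capitalize(), "linkedin_slug", 0.8)
    return None

def from_email_thread(contact: dict) -> Optional[Guess]:
    for body in contact.get("email_bodies", []):
        # A signature line such as "-- Mark Rossi" gives both names.
        sig = re.search(r"^--\s*([A-Z][a-z]+)\s+([A-Z][a-z]+)\s*$", body, re.MULTILINE)
        if sig:
            return Guess(sig.group(1), sig.group(2), "email_signature", 0.7)
        # A greeting such as "Hi Mark" only gives a first name: weaker evidence.
        greeting = re.search(r"\b(?:Hi|Hello|Dear) ([A-Z][a-z]+)", body)
        if greeting:
            return Guess(greeting.group(1), "", "email_greeting", 0.4)
    return None

CHAIN = [from_linkedin, from_email_thread]  # ordered by priority
THRESHOLD = 0.5                             # hypothetical cut-off

def resolve(contact: dict, log: list) -> Optional[Guess]:
    """Run the chain in order; None means the contact goes to manual review."""
    for step in CHAIN:
        guess = step(contact)
        log.append((contact.get("id"), step.__name__, guess))  # every decision is logged
        if guess is not None and guess.confidence >= THRESHOLD:
            return guess
    return None
```

With these numbers, a contact whose only signal is a "Hi Mark" greeting is logged but stays below the threshold, so it ends up in manual review rather than being written with half a name.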
Layer 4: the automated weekly skill
The first 3 layers run in historical batches. For the continuous flow, we packaged the logic into a skill that runs every Monday at 9:00 AM, scans contacts created in the last 7 days and applies one of 4 actions to each: CLEAR_BOTH (impossible-to-clean name), CLEAR_FN_ONLY, TRIM_FN, SKIP. HubSpot writes are non-destructive: the original name is saved to an original_name_backup property before the record is touched.
On a spot check of 180 fixes across 10,772 contacts, zero production errors.
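To make the non-destructive write concrete, here is a small sketch that picks one of the four actions and builds the property changes, always carrying the backup along. The decision rules below are simplified stand-ins chosen for illustration, and the `firstname|lastname` backup format is an assumption; only the action names and the original_name_backup property come from the setup described above.

```python
from typing import Dict, Tuple

def is_junk(value: str) -> bool:
    # Stand-in rule: an email address in a name field, or no letters at all.
    return "@" in value or not any(ch.isalpha() for ch in value)

def plan_update(props: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
    """Return (action, property changes) for one contact; never mutates the input."""
    fn = props.get("firstname") or ""
    ln = props.get("lastname") or ""

    if fn and is_junk(fn):
        if is_junk(ln):
            action, changes = "CLEAR_BOTH", {"firstname": "", "lastname": ""}
        else:
            action, changes = "CLEAR_FN_ONLY", {"firstname": ""}
    else:
        trimmed = " ".join(fn.split())  # collapse stray whitespace
        if trimmed != fn:
            action, changes = "TRIM_FN", {"firstname": trimmed}
        else:
            return "SKIP", {}

    # The backup travels in the same payload as the change itself.
    changes["original_name_backup"] = f"{fn}|{ln}"
    return action, changes
```

For instance, `plan_update({"firstname": "info@companyxyz.com", "lastname": ""})` yields `CLEAR_BOTH` with both fields emptied and `"info@companyxyz.com|"` stored as the backup. Because the function only returns a plan, the same output can feed a dry-run report before any record is written.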
Classification: 16 categories in real time
We solved the second problem, untyped contacts, with a Make + Gemini classifier. The pattern:
- Trigger: Make scenario on new HubSpot contact.
- Fetch: the scenario pulls notes + email engagements from the last 90 days.
- Prompt: Gemini 2.5 Flash Lite receives email, name, company, engagement content, and has to pick 1 of 16 categories with decision logic + tie-breaker.
- Write: the chosen category lands in the custom function field.
The 16 categories are domain-specific (education/travel): Parent, Program Participant, Hotel, Transportation, Activity Provider, School Partner, Press, Vendor, and so on. The decision logic handles ambiguity (“a parent asking info for their program participant child”): tie-breakers based on hierarchical signals (email signature > engagement type > inferred from domain).
Average classification time: about 8 seconds from contact creation.
In parallel, a nightly re-classifier reprocesses all contacts from a rolling 7-day window whose function is blank or “Other”. Contacts that initially didn’t have enough signal (for example, because the first engagement arrived later) get recovered without manual intervention.
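In the setup described here the tie-breaker lives in the prompt, but the same hierarchy can be restated as deterministic code, which is also handy for validating whatever the model returns before it is written to the CRM. The sketch below is illustrative: the signal keys and function names are assumptions, and only the eight categories named in this post plus “Other” are listed.

```python
# The eight categories named in the post, plus "Other"; the remaining ones are omitted.
CATEGORIES = [
    "Parent", "Program Participant", "Hotel", "Transportation",
    "Activity Provider", "School Partner", "Press", "Vendor", "Other",
]
_LOOKUP = {c.lower(): c for c in CATEGORIES}

# Strongest signal first: email signature > engagement type > inferred from domain.
SIGNAL_PRIORITY = ["email_signature", "engagement_type", "domain"]

def validate_label(raw: str) -> str:
    """Map raw model output onto an allowed category, falling back to 'Other'."""
    return _LOOKUP.get(raw.strip().lower(), "Other")

def pick_category(signals: dict) -> str:
    """Return the category suggested by the highest-priority informative signal."""
    for source in SIGNAL_PRIORITY:
        candidate = validate_label(signals.get(source) or "")
        if candidate != "Other":
            return candidate
    return "Other"
```

A parent writing from a school address illustrates the ordering: `pick_category({"email_signature": "Parent", "domain": "School Partner"})` returns `"Parent"`. A label outside the allowed set, such as `validate_label("Travel Agent")`, collapses to `"Other"` and is therefore picked up again by the nightly re-classifier.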
Eval-driven: how we avoided shipping junk
The biggest mistake you can make with AI in the CRM is pushing into production a prompt that “seems to work”. For each skill (cleanup and classifier) we built an evaluation dataset: 50-100 hand-labeled examples, “with-skill” vs “baseline” (manual) runs, benchmark and viewer to inspect the diffs.
On a skill with clear binary decisions (CLEAR_BOTH yes/no), the benchmark revealed 12% false positives on the first prompt iteration. We realised that organisations with names like “Marco Polo Tours” were being classified as people. Three prompt iterations and a new guardrail later, we got down to 2% error. All this before touching a single real record.
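A benchmark for a binary decision like this can be very small. The sketch below assumes each hand-labeled example is a `(name, expected)` pair and that rates are computed over all examples; the example data and the toy predictor in the usage note are made up for illustration and are not taken from the real eval set.

```python
from typing import Callable, List, Tuple

def evaluate(examples: List[Tuple[str, bool]], predict: Callable[[str], bool]) -> dict:
    """Compare predictions with hand labels and collect the disagreements."""
    if not examples:
        raise ValueError("the evaluation set is empty")
    false_pos = false_neg = 0
    diffs = []
    for name, expected in examples:
        got = predict(name)
        if got and not expected:
            false_pos += 1
            diffs.append((name, "false_positive"))
        elif expected and not got:
            false_neg += 1
            diffs.append((name, "false_negative"))
    total = len(examples)
    return {
        "total": total,
        "false_positives": false_pos,
        "false_negatives": false_neg,
        "error_rate": (false_pos + false_neg) / total,
        "diffs": diffs,  # what a viewer would display for inspection
    }
```

Run with a toy predictor such as `lambda name: "@" in name` on a handful of labeled names, the `diffs` list shows exactly which records the rule got wrong, which is the part that drives the next prompt or guardrail iteration.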
Replicable pattern
If you have a HubSpot CRM with dirty names and empty contact type, the pattern goes:
- Quick audit of existing data — how many rows, what kind of dirt.
- Deterministic pattern matching before AI — fish out the easy cases without spending tokens.
- Ordered fallback chain — each step has priority and confidence.
- Non-destructive guardrails — backup property before writing.
- Eval suite — 50 labeled examples, prompt iteration, benchmark.
- Weekly scheduled task — cleanup is continuous, not one-off.
For an ecommerce SME with 5-20k contacts, the investment is 3-5 setup days. The ROI isn’t in the number of cleaned contacts: it’s in the fact that from that moment on marketing, customer care and sales work on data they can trust.