Most bad lists are not a data problem — they're a definition problem. The records were accurate; the filter was wrong. Someone typed "VP Marketing, software, United States" into a search box, exported the first ten thousand rows, and three weeks later the campaign report shows replies from HR coordinators at staffing agencies and a sales team asking why half the accounts are 12-person consultancies.
This post is a field guide to the definition step: how to translate "the right companies, the right people" into filters that actually mean what you meant. The mechanics here apply to any database; the examples use Argorant because counts and previews there are free, so the iterate-before-you-pay workflow recommended here costs nothing to follow.
Start with accounts, not people
The most common structural mistake is starting from the person filter. Titles are the noisiest field in any contact database — self-reported, inflated, localized, and inconsistent across companies — so a list that begins with a title search inherits all of that noise before a single firmographic constraint has been applied.
Invert it. Define the account universe first: which industries, which headcount bands, which geographies. Get a company count, skim the actual company names that match, and only then layer the people criteria on top. Account-level fields are more stable and more verifiable than person-level ones, and a person filter applied to the wrong account universe produces precisely targeted contacts at companies you never wanted.
A useful discipline: write the account definition as one sentence before touching any tool. "Logistics and freight companies, 50–500 employees, Germany and the Netherlands, excluding staffing and recruiting firms." If the sentence is fuzzy, the filters will be fuzzier.
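One way to keep that sentence honest is to write it down as structured data before opening any tool. The sketch below is only an illustration: the `AccountDefinition` class and its field names are assumptions made for this post, not any vendor's schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccountDefinition:
    """Hypothetical container for the one-sentence account definition."""

    industries: tuple[str, ...]
    min_employees: int
    max_employees: int
    countries: tuple[str, ...]
    excluded_industries: tuple[str, ...] = ()

    def as_sentence(self) -> str:
        # Render the structured definition back into its one-sentence form.
        sentence = (
            f"{' and '.join(self.industries)} companies, "
            f"{self.min_employees}-{self.max_employees} employees, "
            f"{' and '.join(self.countries)}"
        )
        if self.excluded_industries:
            sentence += f", excluding {' and '.join(self.excluded_industries)} firms"
        return sentence + "."


definition = AccountDefinition(
    industries=("Logistics", "freight"),
    min_employees=50,
    max_employees=500,
    countries=("Germany", "the Netherlands"),
    excluded_industries=("staffing", "recruiting"),
)
print(definition.as_sentence())
```

If a field is hard to fill in, that is usually the fuzzy part of the sentence showing itself early.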
Industry filters lie more than any other filter
Industry looks like the cleanest filter in the panel and is usually the dirtiest. Three reasons. First, classification is partly self-described: companies pick the label that flatters them, which is why every agency is "technology" and every reseller is "software." Second, taxonomy granularity rarely matches your ICP — "Software" as a category spans a two-person plugin shop and a 40,000-person enterprise vendor, and the code-level taxonomies underneath (the NAICS-style numbering schemes) were designed for government statistics, not for go-to-market. Third, multi-line companies hold one primary label: a conglomerate with a large logistics arm may be filed under "industrial manufacturing" and never surface in your logistics search.
The practical countermeasures are simple. Search by industry segments at the human-readable layer rather than memorizing code numbers, and select a basket of three to six adjacent segments instead of one broad category — for logistics, that might mean freight and cargo, warehousing, supply-chain services, and last-mile delivery as separate picks. Then preview twenty company names from the result. The preview is the test: if four of twenty are staffing agencies or consultancies that serve the industry rather than operate in it, add explicit exclusions for those segments. Service firms orbiting an industry are the single most common contaminant in industry-filtered lists.
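The preview test above can be sketched in a few lines. This assumes you have copied twenty preview rows into (company name, segment) pairs; the segment labels, company names, and the `contaminants_in_preview` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical labels for service firms that orbit the industry.
CONTAMINANT_SEGMENTS = {"staffing", "consulting"}


def contaminants_in_preview(preview, threshold=4):
    """Return the contaminant segments to exclude, or [] if the preview is clean enough.

    `preview` is a list of (company_name, segment) pairs. Exclusions are
    suggested once `threshold` or more rows fall in a contaminant segment.
    """
    hits = Counter(seg for _, seg in preview if seg in CONTAMINANT_SEGMENTS)
    if sum(hits.values()) >= threshold:
        return sorted(hits)
    return []


# Twenty made-up preview rows: 16 operators, 3 staffing agencies, 1 consultancy.
preview = (
    [(f"Carrier {i}", "freight and cargo") for i in range(10)]
    + [(f"Depot {i}", "warehousing") for i in range(6)]
    + [(f"Temp Agency {i}", "staffing") for i in range(3)]
    + [("Supply Advisors", "consulting")]
)
print(contaminants_in_preview(preview))
```

Here four of twenty rows are contaminants, so both segments come back as suggested exclusions.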
Headcount bands deserve the same skepticism, but in one direction only: they're generally reliable at the floor and soft at the ceiling, because a subsidiary is sometimes listed with its own local headcount rather than the parent's. If your ICP truly caps at 500 employees, preview a few of the largest matches and check whether they're independent companies or local arms of giants.
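A rough sketch of that ceiling check, assuming the preview rows expose a `parent` field. That field is an assumption; many exports lack it, in which case the check is a manual lookup. The company names are made up.

```python
def largest_matches(companies, n=3):
    """Return the n largest companies by reported headcount, largest first."""
    return sorted(companies, key=lambda c: c["employees"], reverse=True)[:n]


def looks_like_subsidiary(company):
    # Assumption: a non-empty `parent` value marks a local arm of a larger group.
    return company.get("parent") is not None


matches = [
    {"name": "Nordfracht GmbH", "employees": 480, "parent": None},
    {"name": "Globex Logistics NL", "employees": 450, "parent": "Globex Group"},
    {"name": "Rijn Transport", "employees": 120, "parent": None},
    {"name": "Hafen Spedition", "employees": 75, "parent": None},
]

to_review = [c["name"] for c in largest_matches(matches, n=2) if looks_like_subsidiary(c)]
print(to_review)
```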
Seniority is a pattern set, not a dropdown
Now the people layer. The mistake here is treating seniority as a single dropdown value, when in reality it's a function-by-level matrix expressed through wildly inconsistent title strings. "Vice President" is a senior executive at a 200-person SaaS company and a mid-level individual contributor at a bank, where VP is famously a volume title. "Manager" ranges from a first-line supervisor to the only marketing person in the building. And in European data, the decision-maker often won't carry an English title at all — the German Geschäftsführer, the French directeur général, the Dutch directeur are managing directors who vanish from a naive "CEO OR founder" search.
Build the people filter as two explicit lists. An include set: the title patterns that signal your buyer, across languages and conventions. For operations leadership that might be head of operations, VP operations, COO, operations director, Betriebsleiter. And an exclude set: the patterns that match the include strings but miss the level, such as assistant to, deputy, intern, former, and in this example warehouse operations shift lead, which the substring "operations" will happily catch. The exclude list is where most of the precision lives, and almost nobody writes one.
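As a minimal sketch, the two lists can be expressed as regular expressions and compared against a naive substring search. The patterns below are illustrative, not a complete set, and `is_target_title` is a hypothetical helper name.

```python
import re

# Include set: patterns that signal the buyer (operations leadership here).
INCLUDE = [
    r"head of operations",
    r"vp,? operations",
    r"\bcoo\b",
    r"operations director",
    r"betriebsleiter",
]
# Exclude set: patterns that co-occur with include strings but miss the level.
# Word boundaries keep "intern" from matching "International".
EXCLUDE = [r"assistant to", r"\bdeputy\b", r"\bintern\b", r"\bformer\b", r"shift lead"]


def _compile(patterns):
    return re.compile("|".join(f"(?:{p})" for p in patterns), re.IGNORECASE)


INCLUDE_RE = _compile(INCLUDE)
EXCLUDE_RE = _compile(EXCLUDE)


def is_target_title(title):
    """True if the title matches an include pattern and no exclude pattern."""
    return bool(INCLUDE_RE.search(title)) and not EXCLUDE_RE.search(title)


def naive_match(title):
    # The substring search the exclude list is meant to replace.
    return "operations" in title.lower()


for title in ["VP Operations", "Assistant to the COO", "Warehouse Operations Shift Lead"]:
    print(title, naive_match(title), is_target_title(title))
```

Running real preview titles through both functions is a quick way to find the next exclude pattern to add.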
Calibrate the matrix to company size. At a 2,000-person company, the buyer of an operations tool is a director or VP; at a 60-person company there is no VP of operations, and the same purchase decision sits with a founder or general manager. If your account universe spans both, split the list into a small-company segment targeting founders and owners and a mid-market segment targeting function heads, with copy written for each. One blended list with one blended message underperforms both.
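The split can be sketched as a size-based router. The 100-employee cutoff and the title lists are assumptions chosen for illustration; pick the boundary where the buying role actually changes in your market.

```python
# Hypothetical target titles per segment, following the split described above.
TARGET_TITLES = {
    "small-company": ["founder", "owner", "general manager"],
    "mid-market": ["head of operations", "operations director", "VP operations"],
}


def segment_for(employees, small_max=100):
    """Assign an account to a segment by headcount (the cutoff is an assumption)."""
    return "small-company" if employees <= small_max else "mid-market"


def split_accounts(accounts, small_max=100):
    """Group account names by segment so each gets its own titles and copy."""
    segments = {name: [] for name in TARGET_TITLES}
    for account in accounts:
        segments[segment_for(account["employees"], small_max)].append(account["name"])
    return segments


accounts = [
    {"name": "Sixty Person Freight", "employees": 60},
    {"name": "Two Thousand Logistics", "employees": 2000},
]
print(split_accounts(accounts))
```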
Size the list backward from sending capacity
Before exporting anything, do the arithmetic that determines how big the list should be; the answer is almost always smaller than the temptation. Outbound capacity is bounded by inboxes: a sensibly warmed mailbox sends on the order of 20–40 new cold contacts a day, so two inboxes sending every day support roughly 1,200–2,400 new contacts a month (2 × 20–40 × 30), and fewer if you send on weekdays only. A 50,000-row export against that capacity is not a pipeline; it's an inventory write-off in progress, because B2B contact data decays at an estimated 2–3% per month while it sits in your CRM waiting its turn.
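The arithmetic is short enough to write out. This sketch assumes 30 sending days a month and treats the decay estimate as compounding monthly; both are simplifying assumptions, and the function names are illustrative.

```python
def monthly_capacity(inboxes, per_inbox_per_day, sending_days=30):
    """New cold contacts that can be sent in one month."""
    return inboxes * per_inbox_per_day * sending_days


def still_valid(rows, monthly_decay, months):
    """Rows expected to remain valid, assuming decay compounds monthly."""
    return rows * (1 - monthly_decay) ** months


low = monthly_capacity(2, 20)   # 1200
high = monthly_capacity(2, 40)  # 2400

# Months needed to work through a 50,000-row export at each capacity.
months_at_high = 50_000 / high
months_at_low = 50_000 / low

print(low, high)
print(round(months_at_high, 1), round(months_at_low, 1))
# Share of rows still valid after a year at 2% and 3% monthly decay.
print(round(still_valid(1.0, 0.02, 12), 2), round(still_valid(1.0, 0.03, 12), 2))
```

Even at the top of the range the export takes well over a year to send, by which point a large share of the untouched rows has gone stale.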
Pull four to six weeks of sending capacity at a time and come back for the next tranche when you need it — the database doesn't go anywhere, and on a verified-at-export model each tranche gets a fresh SMTP pass instead of aging in a spreadsheet. Cap contacts per account too: two or three well-chosen people per company nearly always beats eight, both for deliverability optics on the receiving domain and because the fourth-best contact at an account is rarely worth a credit.
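The per-account cap is easy to apply before export. The sketch assumes each contact carries a `rank` you assigned yourself (1 = best fit); that field, like the sample data, is hypothetical.

```python
from collections import defaultdict


def cap_per_account(contacts, max_per_account=3):
    """Keep at most `max_per_account` contacts per domain, best rank first."""
    kept = []
    seen = defaultdict(int)
    for contact in sorted(contacts, key=lambda c: c["rank"]):
        if seen[contact["domain"]] < max_per_account:
            seen[contact["domain"]] += 1
            kept.append(contact)
    return kept


contacts = [
    {"name": "COO", "domain": "a.example", "rank": 1},
    {"name": "Ops Director", "domain": "a.example", "rank": 2},
    {"name": "Ops Manager", "domain": "a.example", "rank": 3},
    {"name": "Founder", "domain": "b.example", "rank": 1},
]
kept = cap_per_account(contacts, max_per_account=2)
print([c["name"] for c in kept])
```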
Iterate on counts and previews before money moves
The workflow that ties this together is a loop, and the loop should be free. Set the account filters, read the count. Preview a sample of companies; fix the industry basket. Add the people layer, read the count again: if 4,800 companies yield only 90 people, your title patterns are too narrow; if they balloon to 40,000, an include pattern is matching something you didn't intend. Preview actual titles, grow the exclude list, repeat. Expect three to five rounds before the previews stop surprising you; a definition that survives previews is quality assurance you get for nothing.
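The two warning signs in the loop amount to comparing the count before the people layer with the count after it. A minimal sketch follows; the thresholds are illustrative assumptions, not benchmarks, and `diagnose` is a hypothetical helper.

```python
def diagnose(before, after, narrow_below=0.05, broad_above=5.0):
    """Classify a count change after adding the people layer.

    `before` is the count from the account step, `after` the count once
    title patterns are applied. Both thresholds are assumptions to tune.
    """
    if before <= 0:
        raise ValueError("before must be positive")
    ratio = after / before
    if ratio < narrow_below:
        return "too narrow"
    if ratio > broad_above:
        return "too broad"
    return "plausible"


print(diagnose(4800, 90))      # collapse: title patterns too narrow
print(diagnose(4800, 40_000))  # balloon: an include pattern is over-matching
```

A "plausible" result is not a pass; it only means the count alone gives no warning, so the title preview still has to be read.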
On Argorant every step of that loop — counts, company previews, title samples — costs zero credits, in the dashboard, over the API and CLI, or through an MCP-connected agent that runs the refinement rounds conversationally. Credits are only spent at the final step, when verified contacts are revealed or exported, and invalid addresses cost nothing even then.
Export hygiene: the last ten percent
A few habits at export time protect the work you just did. Suppress domains you already know: current customers, open opportunities, competitors, and anyone who has opted out — a great filter definition will cheerfully re-discover your own pipeline. Decide explicitly how to treat catch-all results rather than letting a default decide for you; on a flagged-catch-all model you can exclude them entirely for a cold campaign or include them knowingly where coverage matters more than certainty.
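Both habits can be sketched as one filter applied just before export. The `catch_all` flag and the `apply_suppression` helper are assumptions about how your export labels rows, not a documented format.

```python
def apply_suppression(contacts, suppressed_domains, include_catch_all=False):
    """Drop contacts at suppressed domains and, by default, catch-all addresses."""
    suppressed = {d.lower() for d in suppressed_domains}
    kept = []
    for contact in contacts:
        domain = contact["email"].rsplit("@", 1)[-1].lower()
        if domain in suppressed:
            continue  # customer, open opportunity, competitor, or opt-out
        if contact.get("catch_all") and not include_catch_all:
            continue  # explicit policy: no catch-all results in a cold campaign
        kept.append(contact)
    return kept


contacts = [
    {"email": "ana@customer.example", "catch_all": False},
    {"email": "ben@prospect.example", "catch_all": False},
    {"email": "cy@maybe.example", "catch_all": True},
]
print([c["email"] for c in apply_suppression(contacts, {"Customer.example"})])
```

Making `include_catch_all` an explicit argument forces the decision to be written down instead of inherited from a default.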
Finally, save the definition itself — the industry basket, headcount bands, geographies, include and exclude patterns — as text, in a runbook or an agent's project file. The list you exported is a depreciating asset; the definition is the durable one. Next quarter you re-run the same sentence against a re-verified database and get the current state of the market, which is the entire point of building the filter carefully the first time.
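Saving the definition as text can be as simple as serializing it to JSON. The key names below are illustrative assumptions; what matters is that every filter decision is captured.

```python
import json

# Hypothetical saved definition; the key names are illustrative.
definition = {
    "industries": ["freight and cargo", "warehousing", "supply-chain services", "last-mile delivery"],
    "excluded_industries": ["staffing", "recruiting"],
    "employees": {"min": 50, "max": 500},
    "countries": ["Germany", "Netherlands"],
    "include_titles": ["head of operations", "VP operations", "COO", "operations director", "Betriebsleiter"],
    "exclude_titles": ["assistant to", "deputy", "intern", "former"],
    "max_contacts_per_account": 3,
}

# Stable key order keeps diffs readable when the definition changes next quarter.
text = json.dumps(definition, indent=2, sort_keys=True, ensure_ascii=False)
restored = json.loads(text)
print(text)
```

The resulting text can live in a runbook or be committed alongside campaign copy, so that changes to the filter are reviewed like any other change.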