Cold email deliverability, explained.
Deliverability isn’t a trick. It’s a reputation system with knowable rules — and the single biggest input is one most teams treat as an afterthought: the quality of the list itself.
How sender reputation actually works
Every mailbox provider — Google, Microsoft, and the filtering stacks in front of corporate mail — keeps a running score on the infrastructure that sends to it. That score attaches to your sending domain and, to a lesser degree, your IP. It moves on signals the provider can measure directly: how often your mail hard bounces, how often recipients mark it as spam, whether they open and reply, whether you hit spam traps (recycled dead addresses providers monitor precisely to catch careless senders), and whether your volume behaves like a human operation or a cannon.
Two properties of this system matter strategically. First, it is asymmetric: reputation is lost in days and rebuilt in weeks. A single batch of 2,000 sends into a 15%-bounce list can put a domain in the penalty box for a month. Second, it is largely invisible: there is no dashboard where Google shows you your score. You infer it from bounce codes, spam-folder placement tests, and reply-rate collapses — usually after the damage is done. The entire discipline of deliverability is about never needing that inference.
SPF, DKIM, DMARC: the identity layer
Before a provider judges your behavior, it verifies your identity. Three DNS records do that work. SPF lists the servers allowed to send mail for your domain, so a receiver can check that the connection actually came from infrastructure you authorized. DKIM signs each message cryptographically with a key published in your DNS, proving the content wasn’t altered in transit and that the signer controls the signing domain. DMARC ties the two together: it requires that at least one of them pass for a domain aligned with your visible From address, tells receivers what to do with mail that fails that check (none, quarantine, or reject), and says where to send aggregate reports about who is sending as you.
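To make the record format concrete, here is a minimal Python sketch that reads the policy out of a DMARC TXT record. The records shown are illustrative values for the placeholder domain example.com, and the function names are hypothetical; a production check would also fetch the records from DNS and validate more than the two tags inspected here.

```python
from typing import Dict, Optional

# Illustrative TXT records for the placeholder domain example.com (assumptions, not real data).
SPF_RECORD = "v=spf1 include:_spf.example.com ~all"
DMARC_RECORD = "v=DMARC1; p=quarantine; rua=mailto:dmarc-reports@example.com"


def parse_dmarc(record: str) -> Dict[str, str]:
    """Split a DMARC-style TXT record into its tag=value pairs."""
    tags: Dict[str, str] = {}
    for part in record.split(";"):
        part = part.strip()
        if not part:
            continue
        # Split on the first "=" only, so values like "mailto:..." stay intact.
        key, sep, value = part.partition("=")
        if sep:
            tags[key.strip().lower()] = value.strip()
    return tags


def dmarc_policy(record: str) -> Optional[str]:
    """Return the requested policy (none, quarantine, reject), or None if this is not a usable DMARC record."""
    tags = parse_dmarc(record)
    if tags.get("v") != "DMARC1":
        return None
    policy = tags.get("p", "").lower()
    return policy if policy in {"none", "quarantine", "reject"} else None
```

Calling dmarc_policy(DMARC_RECORD) returns the policy string, while passing the SPF record returns None because its version tag is not DMARC1.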
Since the 2024 Google and Yahoo bulk-sender requirements, this stack stopped being optional: senders without aligned SPF, DKIM, and at least a minimal DMARC policy (p=none is enough to start) get filtered or rejected before content is even evaluated. Set all three up on day one, on every sending domain. Then verify them by sending a real test message and reading its authentication results, not by eyeballing DNS: a mistyped DKIM selector, or a signing domain that doesn’t align with your From address, fails silently. Note what this layer does and doesn’t do: authentication proves who you are. It earns you the right to be judged on behavior; it doesn’t make the judgment favorable.
Warm-up and the volume ramp
A new domain has no reputation, and no reputation is treated as suspicious. Warm-up is the process of building history before you need it: weeks of low, human-looking volume with real engagement before any campaign traffic. In practice most teams run dedicated sending domains (never the company’s primary domain — a burned outbound domain should never take your transactional mail down with it), give each new domain two to three weeks of warm-up, and only then introduce campaign sends.
The ramp itself should be boring: start around 10–20 cold sends per mailbox per day, increase gradually week over week, and hold a ceiling around 30–50 per mailbox per day even at cruise. Need more volume? Add mailboxes and domains, don’t push one mailbox harder. Spikes are the tell providers look for — a domain that sent 40 emails yesterday and 4,000 today looks exactly like what it probably is.
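The ramp and the "add mailboxes, not volume" rule can be sketched in a few lines of Python. The starting point and ceiling below sit inside the ranges given above; the fixed step of five sends per week is an assumption chosen for illustration, since the right pace depends on how each domain is performing.

```python
import math
from typing import List


def ramp_schedule(start: int = 15, ceiling: int = 40,
                  weekly_step: int = 5, weeks: int = 8) -> List[int]:
    """Daily cold sends per mailbox for each week: a fixed weekly increase, capped at the ceiling."""
    return [min(start + weekly_step * week, ceiling) for week in range(weeks)]


def mailboxes_needed(target_daily_sends: int, per_mailbox_ceiling: int = 40) -> int:
    """Scale out instead of up: how many mailboxes a daily target requires at the ceiling."""
    return math.ceil(target_daily_sends / per_mailbox_ceiling)
```

With these defaults, ramp_schedule() yields 15, 20, 25, 30, 35, 40, 40, 40, and a target of 1,000 cold sends a day works out to 25 mailboxes rather than one mailbox sending 1,000.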
Bounce rate: the threshold that ends campaigns
Of all the reputation signals, hard bounces are the most mechanical: every one tells the provider you mailed an address you didn’t verify. The working thresholds are well established — under 2% bounces is where you want to live, 2–3% is the warning zone, and sustained rates above that trigger throttling and spam-folder placement that outlasts the campaign. Some receiving systems start degrading treatment at levels a marketer would consider rounding error.
Run the arithmetic on a typical purchased list. B2B email addresses decay at roughly 2–3% a month as people change jobs, so a list verified by the vendor eight months ago can bounce at 15–20% on the first send, at least five times the 3% ceiling of the warning zone. This is not a copywriting problem or a warm-up problem, and no amount of either fixes it. It is a data-freshness problem, and it is solved before the send or not at all.
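The decay arithmetic is easy to reproduce. The sketch below assumes a constant monthly decay rate that compounds, which is a simplification; real lists decay unevenly across industries and seniority levels.

```python
def expected_invalid_share(monthly_decay: float, months: int) -> float:
    """Share of addresses expected to be dead after `months`, assuming constant compounding decay."""
    return 1 - (1 - monthly_decay) ** months


low = expected_invalid_share(0.02, 8)   # about 0.149
high = expected_invalid_share(0.03, 8)  # about 0.216

print(f"Expected dead addresses after 8 months: {low:.1%} to {high:.1%}")
```

Under that assumption an eight-month-old verification stamp implies roughly 15% to 22% dead addresses, which lines up with the range quoted above and sits several multiples past a 3% bounce ceiling.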
List quality is the #1 lever
Teams routinely spend weeks A/B-testing subject lines on lists that were never going to be allowed into the inbox. The order of operations matters: authentication and list quality decide whether you get delivered; copy decides what happens after. A verified, recently probed list with mediocre copy outperforms brilliant copy sent into 12% bounces, because the second campaign stops being delivered at all.
The operational rule is simple: never send to an address whose verification verdict is more than a couple of months old, and never send to unverified addresses, period. This is why verification timing in your data vendor matters so much — a stamp applied at collection decays in inventory, while a live SMTP probe at export gives you a verdict that is minutes old. That distinction, and what “valid,” “catch-all,” and “unknown” actually mean, is covered in how we verify and in our verification guide. Treat catch-all addresses as a separate, deliberate decision: they can deliver, but they carry residual bounce risk, so include them only when the segment is valuable enough to justify it.
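The operational rule translates directly into a pre-send filter. This is a sketch under stated assumptions: the Contact fields and the "invalid" verdict label are hypothetical names, and the 60-day cutoff is one reading of "a couple of months" that you would tune to your own risk tolerance.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

MAX_VERDICT_AGE = timedelta(days=60)  # assumption: "a couple of months"


@dataclass
class Contact:
    email: str
    verdict: str                     # "valid", "catch-all", "unknown", or "invalid"
    verified_at: Optional[datetime]  # None means the address was never verified


def sendable(contact: Contact, now: datetime, include_catch_all: bool = False) -> bool:
    """Apply the rule: never send unverified, never send on a stale verdict."""
    if contact.verified_at is None:
        return False
    if now - contact.verified_at > MAX_VERDICT_AGE:
        return False
    if contact.verdict == "valid":
        return True
    # Catch-all is an explicit opt-in per segment, never the default.
    return include_catch_all and contact.verdict == "catch-all"


def filter_sendable(contacts: List[Contact], now: datetime,
                    include_catch_all: bool = False) -> List[Contact]:
    """Keep only the contacts that pass the freshness and verdict checks."""
    return [c for c in contacts if sendable(c, now, include_catch_all)]
```

Keeping include_catch_all off by default mirrors the advice above: catch-all addresses only enter a send when someone has decided the segment is worth the residual bounce risk.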
Copy and content: the spam-filter basics
Modern filters are engagement-driven more than keyword-driven, but content still moves the needle at the margins. Plain text or near-plain HTML beats heavy templates for cold outreach. Avoid link-shorteners and tracking domains with poor reputations — the domains you link to are scored along with you. Go light on images and attachments (ideally none on a first touch), skip the ALL-CAPS-and-exclamation register entirely, and keep messages short enough that a reply feels proportionate, because replies are the strongest positive signal you can generate.
Personalization helps deliverability indirectly: it raises replies and lowers spam complaints, the two engagement signals with the most weight. And keep complaint hygiene tight: honor opt-outs instantly and suppress entire domains that ask. A spam-complaint rate above roughly 0.3% puts you in bulk-sender enforcement territory at Google, whose own guidance is to stay under 0.1%; well-run cold programs operate far below either figure.
Monitoring: closing the loop
Deliverability degrades silently, so instrument it. Watch bounce rate per campaign and per sending domain (your sequencer reports this; alert at 2%). Read DMARC aggregate reports to catch authentication drift and spoofing. Use Google Postmaster Tools for domain reputation at Gmail, run periodic seed tests to check inbox-versus-spam placement, and check your sending IPs and domains against public blocklists on a schedule. The reply rate itself is your final gauge — a sudden drop with stable volume usually means placement collapsed, not that prospects changed their minds.
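A per-domain bounce alert is the simplest of these checks to automate. The sketch below assumes you can export send events as (sending domain, bounced) pairs; that shape is hypothetical, so adapt it to whatever your sequencer actually reports.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

BOUNCE_ALERT = 0.02  # the 2% alert line used in this guide

SendEvent = Tuple[str, bool]  # (sending_domain, hard_bounced), an assumed export shape


def bounce_rates(events: Iterable[SendEvent]) -> Dict[str, float]:
    """Hard-bounce rate per sending domain."""
    sent: Dict[str, int] = defaultdict(int)
    bounced: Dict[str, int] = defaultdict(int)
    for domain, did_bounce in events:
        sent[domain] += 1
        if did_bounce:
            bounced[domain] += 1
    return {domain: bounced[domain] / sent[domain] for domain in sent}


def domains_to_alert(events: Iterable[SendEvent],
                     threshold: float = BOUNCE_ALERT) -> List[str]:
    """Sending domains whose bounce rate has reached the alert threshold, sorted by name."""
    return sorted(d for d, rate in bounce_rates(events).items() if rate >= threshold)
```

Running this per campaign as well as per domain keeps one bad list from hiding inside a healthy aggregate.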
Every metric in that list improves when the input list is clean. That’s the quiet conclusion of this whole guide: deliverability programs succeed upstream. Start with addresses that were SMTP-verified at the moment of export — where invalid rows are filtered out and cost nothing — and the rest of the discipline gets dramatically easier. Pricing starts under $100/month, and unfamiliar terms from this guide are in the glossary.
Keep your bounce rate under the line.
Every export is SMTP-verified live — invalid addresses filtered out, zero credits charged. Free account, 100 credits included.