Deep dive · July 2026 · 9 min read

Sender Score Explained: What It Measures, What It Misses, and What to Track Instead

Sender Score is a 0-100 reputation rating for a sending IP address, published by Validity (which acquired Return Path, the score's creator, in 2019) and calculated from a rolling 30 days of data seen by Validity's network: complaint rates, spam-trap hits, bounce rates, blocklist appearances, and sending volume patterns. It is one vendor's estimate of one dimension (the IP) of your reputation, which is exactly why it should inform your monitoring rather than define it.

That definition contains the two things senders most often get wrong about Sender Score: it measures the IP, not your domain, and it measures Validity's view, not Gmail's or Microsoft's. This guide covers what the number actually reflects, where it is genuinely useful, where it misleads, why mailbox providers' internal reputation systems matter far more, and the four signals worth tracking instead of (or at least alongside) any single third-party score.

What Sender Score measures

Validity aggregates data from its network of partner ISPs, filtering companies, and trap feeds, then ranks every observed sending IP against the rest. The score is a percentile: a Sender Score of 90 means the IP's measured behavior looks better than roughly 90% of other IPs in the dataset over the trailing 30 days. Inputs include, per Validity's own descriptions:

  • Complaint rates reported by partner networks
  • Spam-trap hits (mail sent to addresses that should not exist on any legitimate list)
  • Bounce and unknown-user rates
  • Appearances on blocklists monitored by Validity
  • Volume consistency and infrastructure signals (rDNS, sustained vs spiky sending)

As rough interpretation bands: above 80 is generally considered healthy, 70-80 warrants attention, and below 70 correlates with meaningful filtering at receivers that use Validity data. Because the window is a rolling 30 days, both damage and recovery show up on a delay measured in weeks.
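
To make the percentile idea and the bands concrete, here is a minimal sketch. Validity's actual model is proprietary, so the per-IP "quality" numbers, the function names, and the band labels below are illustrative assumptions; only the thresholds (80 and 70) come from the text above.

```python
from bisect import bisect_left


def percentile_rank(ip_quality: float, population: list[float]) -> float:
    """Share (0-100) of observed IPs whose quality is strictly worse.

    `population` is a hypothetical list of per-IP quality values where
    higher means better behavior; it stands in for Validity's dataset.
    """
    if not population:
        raise ValueError("population must not be empty")
    ranked = sorted(population)
    worse = bisect_left(ranked, ip_quality)  # count of strictly lower values
    return 100.0 * worse / len(ranked)


def interpretation_band(score: float) -> str:
    """Rough bands from this article: >80, 70-80, <70."""
    if score > 80:
        return "healthy"
    if score >= 70:
        return "warrants attention"
    return "likely filtered where Validity data is used"


# Ten hypothetical IPs with quality values 10, 20, ..., 100.
observed = [float(q) for q in range(10, 101, 10)]
score = percentile_rank(95.0, observed)
print(score, interpretation_band(score))
```

The point of the sketch is that the number is relative: the same behavior can score differently if the rest of the dataset improves or degrades.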

Why Sender Score is IP-based, and why that matters in 2026

Sender Score was designed in an era when the IP address was the primary identity in email. That era is over. Three shifts have eroded the IP's importance:

  1. Shared sending pools. Most senders in 2026 send through ESPs (Mailchimp, Klaviyo, SendGrid, Postmark, Amazon SES and the rest) on shared IPs. On a shared pool, the IP's Sender Score reflects the pool's aggregate behavior, dominated by your ESP's policies and your neighbors' lists, not by you. Your excellent program and a stranger's terrible one blend into one number.
  2. Authentication moved identity to the domain. SPF, DKIM, and DMARC let receivers attribute mail to a domain reliably, so providers increasingly key reputation to the domain, which follows you across IPs and ESPs. Google says this explicitly in its sender guidelines and tracks the two separately.
  3. Providers stopped outsourcing judgment. Gmail, Microsoft, and Yahoo run their own reputation models on their own data, at a scale no third party can observe. Validity sees its partner network; Gmail sees every interaction of well over 1.8 billion mailboxes.

One vendor's view: the visibility problem

Here is the structural limit, stated plainly: no external company can see inside Gmail's or Microsoft's filtering decisions. Sender Score is built from the traffic and traps Validity's network observes, which skews toward the ISPs and filters that partner with Validity. Gmail does not use Sender Score. Microsoft runs its own systems (SNDS and JMRP, its complaint feedback loop, give senders a window into them). Yahoo runs its own. A perfectly respectable 95 tells you Validity's network sees little abuse from that IP; it tells you nothing about whether Gmail has classified your domain reputation as Low because your complaint rate crept over 0.3% last month.

This is not a criticism of Validity specifically; the same caveat applies to every third-party score, including any single number a deliverability tool (ours included) could print. Any vendor-issued reputation score is an outside estimate of systems that are deliberately opaque. Treat all of them as weather forecasts, not thermometers.

Providers' internal reputation is the one that decides placement

The reputation that actually routes your mail lives inside each mailbox provider, and the best window into the biggest one is Google Postmaster Tools. There Google shows, for your verified domain: user-reported spam rate against its published thresholds (stay under 0.1%, never reach 0.3%), domain reputation and IP reputation as four bands (Bad, Low, Medium, High), authentication pass rates, and delivery errors. Microsoft's SNDS offers IP-level data for mail into Outlook and Hotmail. These are first-party measurements from the systems making the filtering decisions, which makes them categorically more decision-worthy than any external percentile. The mechanics of how Google turns these inputs into placement are covered in how the Gmail spam filter works.
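
The two published spam-rate thresholds are easy to encode as a check. The sketch below is an assumption-laden illustration: the function name and labels are hypothetical, and the denominator you use (delivered messages here) may differ from what Google counts internally, so treat the result as an approximation of what Postmaster Tools reports.

```python
def classify_spam_rate(complaints: int, delivered: int) -> str:
    """Compare a user-reported spam rate with the 0.1% and 0.3% thresholds.

    Integer arithmetic avoids floating-point edge cases exactly at a threshold.
    """
    if delivered <= 0:
        raise ValueError("delivered must be positive")
    if complaints < 0:
        raise ValueError("complaints must not be negative")
    if complaints * 1000 >= delivered * 3:   # rate >= 0.3%
        return "over limit"
    if complaints * 1000 >= delivered * 1:   # rate >= 0.1%
        return "above target"
    return "on target"


print(classify_spam_rate(complaints=5, delivered=10_000))   # 0.05%
print(classify_spam_rate(complaints=12, delivered=10_000))  # 0.12%
print(classify_spam_rate(complaints=30, delivered=10_000))  # 0.30%
```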

Sender Score vs provider-internal reputation

                          | Validity Sender Score                         | Provider internal reputation (e.g. Google)
Keyed to                  | IP address only                               | Domain and IP, tracked separately
Data source               | Validity partner network and traps            | The provider's entire mailbox base
Used for actual filtering | Only by receivers using Validity data         | Yes, directly
Visibility to you         | Public 0-100 number                           | Coarse bands and metrics via Postmaster Tools / SNDS
Window                    | Rolling 30 days                               | Months of history, provider-defined
Blind spots               | Gmail, Microsoft, Yahoo internals; your domain | Other providers; exact scoring is opaque

When Sender Score is still worth checking

To be fair to the metric, it has real uses. If you run dedicated IPs, a falling Sender Score is a legitimate early-warning sign that trap hits or complaints are rising somewhere in Validity's field of view, often before you notice placement decay. It is also a reasonable due-diligence check on an ESP's shared pools before you sign, and a data point some smaller receivers and filtering appliances genuinely consult. Check it monthly, note the direction of movement rather than the absolute number, and treat any sudden multi-point drop as a prompt to look at traps and complaints; just do not steer the program by it.

Why your scores disagree with each other

Check the same sending setup across Sender Score, Google Postmaster Tools, Microsoft SNDS, and a couple of other vendor scores and you will routinely get conflicting answers: an 88 here, Medium there, green somewhere else, and a warning from another. This is expected, not a bug in any of them. Each system watches a different slice of traffic (Validity's partner feeds vs Gmail's mailboxes vs Microsoft's), keys reputation to a different identity (IP vs domain vs both), and weighs inputs on different windows (a rolling 30 days vs months of history). The disagreement itself is information: strong Postmaster Tools numbers with a sagging Sender Score usually point to a shared-IP neighbor problem or trap hits inside Validity's network, while the reverse, a healthy Sender Score with Low Gmail domain reputation, almost always means a complaint or engagement problem specific to your Gmail audience. When scores diverge, believe the one closest to the mailboxes you actually deliver to, and use a placement test to see the real outcome per provider.
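
The two divergence patterns described above can be written down as a small decision rule. This is a sketch, not a diagnostic tool: the function name is hypothetical, "healthy" is assumed to mean a Sender Score above 80 (the band used earlier in this article), and "strong" Gmail reputation is assumed to mean High.

```python
def diagnose_divergence(sender_score: int, gmail_domain_reputation: str) -> str:
    """Suggest where to look when Sender Score and Gmail reputation disagree."""
    gmail = gmail_domain_reputation.strip().capitalize()
    if gmail not in {"Bad", "Low", "Medium", "High"}:
        raise ValueError(f"unknown reputation band: {gmail_domain_reputation!r}")

    score_healthy = sender_score > 80        # assumed cut-off
    gmail_strong = gmail == "High"           # assumed meaning of "strong"
    gmail_weak = gmail in {"Bad", "Low"}

    if gmail_strong and not score_healthy:
        return "check shared-IP neighbors and spam-trap hits in Validity's network"
    if score_healthy and gmail_weak:
        return "check complaints and engagement among Gmail recipients"
    if score_healthy and gmail_strong:
        return "signals agree: healthy"
    if not score_healthy and gmail_weak:
        return "signals agree: unhealthy"
    return "inconclusive: run a placement test per provider"


print(diagnose_divergence(72, "High"))
print(diagnose_divergence(95, "Low"))
```

Anything involving the Medium band falls through to "inconclusive" on purpose, since the text gives no rule for it.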

What to track instead: four signals that predict placement

  1. Spam complaint rate, per provider. The single most consequential number in deliverability, with published hard thresholds: Google and Yahoo both cap acceptable user-reported spam at 0.3%, with under 0.1% as the stated target. Watch it weekly in Postmaster Tools; it leads reputation changes by days to weeks.
  2. Placement trends across providers. Reputation scores are abstractions; where your mail actually lands is the outcome. Regular inbox placement tests across Gmail, Outlook, Yahoo, iCloud, GMX, and Zoho show inbox vs Promotions vs spam per provider, and the trend line over weeks is far more informative than any single test.
  3. Blacklist status. A Spamhaus or SpamCop listing moves delivery within hours, faster and harder than any score movement. Automated blacklist checks across 130+ lists turn "we found out three weeks later" into "we got an alert the same day".
  4. DMARC pass rate. Aggregate DMARC reports tell you, from receivers' own measurements, what fraction of mail claiming your domain authenticates and aligns. A dip flags a broken key, a forgotten sending service, or active spoofing. With parsed DMARC monitoring in place, the pass rate should sit at 98%+ for your legitimate streams.
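
As an illustration of the fourth signal, a DMARC pass rate can be computed from a single aggregate report. The sketch below assumes the element names from the RFC 7489 aggregate format (`record`, `row`, `count`, `policy_evaluated`) and a report without XML namespaces; the sample data and the function name are made up for the example.

```python
import xml.etree.ElementTree as ET

SAMPLE_REPORT = """<feedback>
  <record><row><source_ip>192.0.2.10</source_ip><count>980</count>
    <policy_evaluated><disposition>none</disposition>
      <dkim>pass</dkim><spf>fail</spf></policy_evaluated></row></record>
  <record><row><source_ip>198.51.100.7</source_ip><count>20</count>
    <policy_evaluated><disposition>none</disposition>
      <dkim>fail</dkim><spf>fail</spf></policy_evaluated></row></record>
</feedback>"""


def dmarc_pass_rate(report_xml: str) -> float:
    """Fraction of messages in one aggregate report that passed DMARC.

    A message passes when either the aligned DKIM or the aligned SPF
    result in <policy_evaluated> is "pass".
    """
    root = ET.fromstring(report_xml)
    total = passed = 0
    for record in root.iter("record"):
        row = record.find("row")
        if row is None:
            continue  # skip malformed records
        count = int(row.findtext("count", default="0"))
        dkim = row.findtext("policy_evaluated/dkim", default="fail").strip().lower()
        spf = row.findtext("policy_evaluated/spf", default="fail").strip().lower()
        total += count
        if "pass" in (dkim, spf):
            passed += count
    if total == 0:
        raise ValueError("report contains no messages")
    return passed / total


print(f"{dmarc_pass_rate(SAMPLE_REPORT):.1%}")
```

In practice you would aggregate across many reports and split by source IP, so that one forgotten sending service does not hide inside a healthy overall rate.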

Together these four cover cause (complaints), gate (authentication), tripwire (blacklists), and outcome (placement). That combination catches every failure mode a single IP percentile can catch, plus the domain-level and provider-specific ones it cannot. As a cadence: complaints and placement weekly, blacklists and DMARC pass rate continuously via alerts, and any third-party score monthly as a sanity check.
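
For the blacklist tripwire, the underlying lookup is a plain DNS query: reverse the IPv4 octets, append the list's zone, and see whether an address in 127.0.0.0/8 comes back. The sketch below shows that convention (RFC 5782); it is not how any particular monitoring product works, the function names are hypothetical, and the zone is only an example. Many list operators restrict queries from public resolvers or high-volume users, so check each list's usage terms before automating this.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str = "zen.spamhaus.org") -> str:
    """Build the DNS name to query for an IPv4 address on a DNSBL zone."""
    addr = ipaddress.ip_address(ip)
    if addr.version != 4:
        raise ValueError("this sketch handles IPv4 only")
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str = "zen.spamhaus.org") -> bool:
    """Return True if the zone answers with a listing code for this IP.

    A failed lookup is treated as "not listed", which also hides resolver
    errors; a production check should distinguish the two cases.
    """
    try:
        answer = socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    # Assumption: 127.255.255.x answers are error codes, not listings,
    # as Spamhaus documents for its zones.
    return answer.startswith("127.") and not answer.startswith("127.255.255.")


print(dnsbl_query_name("192.0.2.99"))
```

Only `dnsbl_query_name` is pure; `is_listed` needs network access and a resolver the list operator accepts.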

The honest limits of every reputation number

A closing boundary: all reputation measurement, third-party scores and seed-based placement testing alike, is estimation of deliberately secret systems. Our placement tests are directional readings from seed mailboxes, not a guarantee of what your specific subscribers see, and no vendor can honestly promise a score or a test that equals inbox placement. We publish diagnostics and ranked fixes for legitimate, permission-based senders; anyone promising guaranteed placement or a "reputation reset" is selling confidence, not measurement.

If you want the four signals in one place, run a placement test with Inboxes: per-provider placement, blacklist and authentication status, and DMARC pass-rate monitoring, ending in a fix list ranked by what will actually move your mail.