Scoring Model

How Risk Scores Are Calculated

Scores are deterministic, auditable, and transparent. Every input to the model is documented here, so platform members can see exactly why an identity received a given score.

Risk Ratings

Scores range from 0 to 100. Each score maps to a named rating your platform can act on directly.

Rating	Score Range	Recommended Action
clear	0–10	No meaningful history. Treat as unknown; no action required.
flagged	11–30	Some reports on record. Consider light friction or passive monitoring.
cautioned	31–60	A pattern is emerging. Consider additional verification or restricted access.
restricted	61–85	Significant confirmed history. Recommend: deny access or require human review.
blacklisted	86–100	Severe or repeated violations across multiple platforms. Block.
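
The rating table can be applied client-side with a simple lookup. The following is a minimal sketch, not part of the API: the function name `rating_for` is hypothetical, and it assumes each range's upper bound is inclusive, so a fractional score such as 10.5 would fall into the next rating up.

```python
# Inclusive upper bound for each rating, taken from the table above.
RATING_BOUNDS = [
    (10, "clear"),
    (30, "flagged"),
    (60, "cautioned"),
    (85, "restricted"),
    (100, "blacklisted"),
]


def rating_for(score: float) -> str:
    """Return the named rating for a composite score between 0 and 100."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    for upper_bound, name in RATING_BOUNDS:
        if score <= upper_bound:
            return name
    raise AssertionError("unreachable: score was validated above")
```

For example, `rating_for(30)` returns `"flagged"` and `rating_for(31)` returns `"cautioned"`.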

Calculation Overview

A score is calculated from all confirmed reports for an identity. Each report contributes to the dimensional score for its violation category, and the dimensional scores are then combined into a composite 0–100 score.

1. Per-report weight

Each report is assigned a base weight from its severity multiplier, then multiplied by the submitting platform's trust score (0–1), then by a time decay factor based on report age.

2. Diminishing returns per platform

For each platform submitting multiple reports, each additional report carries 0.8× the weight of the previous one. This prevents a single platform from overwhelming the score.

3. Dimensional scores

Weighted report values are accumulated per violation category to produce five independent dimensional scores (0–100 each), then normalized.

4. Composite score

Dimensional scores are combined using category weights (see table below) into a single 0–100 composite score. This is the score returned by GET /v1/scores.
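
The four steps can be summarized in one set of formulas. The notation is an assumption introduced here for illustration. In particular, the normalization function N is not specified on this page, the composite is assumed to be a plain weighted sum, and k_i is assumed to count the earlier reports from the same platform.

```latex
% w_i      : weight of report i
% m(s_i)   : severity multiplier for the report's severity level s_i
% t_{p(i)} : trust score (0 to 1) of the platform that submitted report i
% d(a_i)   : time decay factor for a report of age a_i
% k_i      : number of earlier reports from the same platform (assumed)
w_i = m(s_i) \cdot t_{p(i)} \cdot d(a_i) \cdot 0.8^{\,k_i}

% D_c : dimensional score for category c, with N an unspecified
%       normalization onto the 0-100 range
D_c = N\!\left( \sum_{i \in c} w_i \right)

% S : composite score, \omega_c the category weight, \sum_c \omega_c = 1
S = \sum_{c} \omega_c \, D_c
```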

Category Weights

Five violation categories contribute to the composite score. Weights reflect the severity of harm associated with each category. Weights sum to 1.0.

Category	Weight	Description
harassment	0.30	Direct targeting, threats, sustained unwanted contact
fake_profile	0.25	Identity fraud, impersonation, sockpuppet accounts
explicit_content	0.20	Unsolicited explicit material, non-consensual sharing
unsolicited_dm	0.15	Unsolicited direct messages, repeated unwanted outreach
spam	0.10	Mass unsolicited messages, automated bulk activity
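
As an illustration of how the weights might be applied, the sketch below assumes the composite is a plain weighted sum of the five dimensional scores. This page does not state the exact combination rule, and the function name `composite_score` is hypothetical.

```python
# Category weights from the table above; they sum to 1.0.
CATEGORY_WEIGHTS = {
    "harassment": 0.30,
    "fake_profile": 0.25,
    "explicit_content": 0.20,
    "unsolicited_dm": 0.15,
    "spam": 0.10,
}


def composite_score(dimensional: dict[str, float]) -> float:
    """Combine per-category scores (0-100 each) into one 0-100 score.

    Assumption: a plain weighted sum. Missing categories count as 0.
    """
    unknown = set(dimensional) - set(CATEGORY_WEIGHTS)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    return sum(
        weight * dimensional.get(category, 0.0)
        for category, weight in CATEGORY_WEIGHTS.items()
    )
```

With dimensional scores of 80 for harassment and 40 for spam, this gives 0.30 × 80 + 0.10 × 40 = 28. Note that under a plain weighted sum a single category can contribute at most its weight × 100, so the production rule may differ from this sketch.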

Severity Multipliers

Each report is submitted with a severity level. The severity multiplier scales the report's contribution to the score.

Severity	Multiplier	Use when
low	0.5×	Minor policy violation, first-time, low impact
medium	1.0×	Clear violation, confirmed intent, single incident
high	1.75×	Serious violation, pattern of behavior, or victim impact
critical	3.0×	Extreme violation, illegal content, credible threat, CSAM
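
The per-report weight described in the calculation overview can be sketched as the product of three factors. The function name `report_weight` is hypothetical, and the sketch assumes the decay factor lies between the documented floor of 0.2 and an upper bound of 1.0.

```python
# Severity multipliers from the table above.
SEVERITY_MULTIPLIERS = {
    "low": 0.5,
    "medium": 1.0,
    "high": 1.75,
    "critical": 3.0,
}


def report_weight(
    severity: str, platform_trust: float, decay_factor: float = 1.0
) -> float:
    """Weight of one report: severity multiplier x trust x time decay."""
    if severity not in SEVERITY_MULTIPLIERS:
        raise ValueError(f"unknown severity: {severity}")
    if not 0.0 <= platform_trust <= 1.0:
        raise ValueError("platform_trust must be between 0 and 1")
    if not 0.2 <= decay_factor <= 1.0:  # upper bound of 1.0 is assumed
        raise ValueError("decay_factor must be between 0.2 and 1")
    return SEVERITY_MULTIPLIERS[severity] * platform_trust * decay_factor
```

A fresh `critical` report from a platform at the starting trust of 0.5 has weight 3.0 × 0.5 = 1.5, the same as three fresh `medium` reports' worth of multiplier from that platform before diminishing returns are applied.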

Score Modifiers

Time Decay

Reports older than 365 days carry reduced weight. The decay factor reaches a floor of 0.2 — old reports never drop to zero. Recent confirmed behavior is weighted most heavily.
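
This page gives the 365-day threshold and the 0.2 floor but not the shape of the decay curve. The sketch below fills that gap with loudly labeled assumptions: full weight up to 365 days, then a linear decline that reaches the floor at a hypothetical 1,095 days. Both the linear shape and the `FLOOR_REACHED_DAYS` value are illustrative only.

```python
DECAY_START_DAYS = 365     # from the text: older reports carry reduced weight
DECAY_FLOOR = 0.2          # from the text: the factor never drops below 0.2
FLOOR_REACHED_DAYS = 1095  # hypothetical: age at which the floor is reached


def decay_factor(age_days: float) -> float:
    """Time decay factor for a report, assuming a linear decline."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if age_days <= DECAY_START_DAYS:
        return 1.0
    if age_days >= FLOOR_REACHED_DAYS:
        return DECAY_FLOOR
    # Fraction of the decay window that has elapsed (0 to 1).
    progress = (age_days - DECAY_START_DAYS) / (
        FLOOR_REACHED_DAYS - DECAY_START_DAYS
    )
    return 1.0 - progress * (1.0 - DECAY_FLOOR)
```

Under these assumptions a two-year-old report (730 days) is halfway through the decay window and carries a factor of 0.6.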

Platform Trust

Each report is weighted by the submitting platform's trust score (0–1). All platforms start at 0.5. Platforms that consistently submit accurate reports earn higher trust over time.

Diminishing Returns

Each additional report from the same platform carries 0.8× the weight of the previous one. This prevents a single platform from dominating an identity's score.
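
A minimal sketch of the diminishing-returns rule follows. It assumes the n-th report from a platform has its own weight scaled by 0.8 raised to the number of earlier reports from that platform, and that reports are counted in the order given. The function name and the `(platform_id, base_weight)` input shape are hypothetical.

```python
from collections import defaultdict

DIMINISHING_FACTOR = 0.8  # from the text


def apply_diminishing_returns(
    reports: list[tuple[str, float]],
) -> list[float]:
    """Scale each report's weight by 0.8 per earlier report from its platform.

    `reports` holds (platform_id, base_weight) pairs in counting order.
    Returns the adjusted weights in the same order.
    """
    seen: dict[str, int] = defaultdict(int)
    adjusted = []
    for platform_id, base_weight in reports:
        adjusted.append(base_weight * DIMINISHING_FACTOR ** seen[platform_id])
        seen[platform_id] += 1
    return adjusted
```

Three equal-weight reports from platform `a` with one from platform `b` in between come out at roughly 1.0, 0.8, 1.0, and 0.64. Because the factors form a geometric series, any number of equal-weight reports from one platform totals less than 1 / (1 − 0.8) = 5 times a single report's weight.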

Confidence Levels

Alongside the score, the API returns a confidence level reflecting how much corroborating evidence exists. A high score with low confidence rests on little evidence and should be interpreted with more caution.

Confidence	Condition	Interpretation
low	Fewer than 3 confirmed reports	Limited data. Score reflects few data points — treat with caution.
medium	3+ reports, fewer than 3 contributing platforms	Pattern established but corroboration is limited to one or two sources.
high	3+ reports from 3+ distinct platforms	Well-corroborated. Multiple distinct platforms have independently confirmed violations.
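
The conditions in the table translate directly into a small decision function. This is a sketch with a hypothetical name and inputs, assuming both counts refer to confirmed reports for a single identity.

```python
def confidence_level(report_count: int, platform_count: int) -> str:
    """Map report and platform counts to a confidence level.

    report_count: number of confirmed reports for the identity.
    platform_count: number of distinct platforms that submitted them.
    """
    if report_count < 0 or platform_count < 0:
        raise ValueError("counts must be non-negative")
    if report_count < 3:
        return "low"
    if platform_count < 3:
        return "medium"
    return "high"
```

For example, five reports from a single platform yield `"medium"`, while three reports from three platforms yield `"high"`.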