Background
Cravatar, a Gravatar-compatible service, fetches user avatars from Gravatar as its origin. Gravatar employs a four-tier content rating system—G, PG, R, and X—but this classification is self-assigned by users and thus highly unreliable: many avatars containing NSFW (Not Safe For Work) content are incorrectly labeled as G-rated.
Within China’s operational environment, we cannot rely on upstream self-rating; instead, we must build our own content moderation capability. This is not a one-off task but rather a long-term infrastructure requirement.
Core Challenge: Cravatar handles over 20 million daily requests. Real-time AI detection for every request is infeasible. We therefore need an architecture where detection occurs once upon ingestion, followed by caching and subsequent serving from cache.
I. Problem Analysis
1.1 Why Gravatar’s Rating System Is Unreliable
- Ratings are self-declared with no mandatory review.
- The ?r=g parameter instructs Gravatar to return only G-rated avatars, but Gravatar relies entirely on users’ self-ratings to determine compliance.
- Many NSFW avatars are falsely labeled G-rated; Gravatar performs no proactive review.
- Even when Gravatar’s ratings are accurate, its definitions of PG/R/X do not align with Chinese regulatory standards.
1.2 Risk Scenarios
- Direct Risk: Users retrieve NSFW avatars via Cravatar and display them publicly—for example, in WordPress comment sections or forums.
- Compliance Risk: ICP-registered websites in China displaying inappropriate content may face revocation of their ICP license.
- Long-Tail Risk: Once Cravatar caches an NSFW avatar, it continues distributing it—even if Gravatar later removes the original.
1.3 Scale Estimation
- More than 20 million daily requests, but with high repetition across MD5/SHA hashes.
- Estimated unique avatars (distinct hashes): 500,000–1,000,000.
- Estimated new unique hashes per day: 1,000–5,000.
- Only newly ingested unique avatars require detection, a volume that is easily manageable.
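A rough back-of-the-envelope calculation shows why this volume is manageable. The sketch below assumes the upper estimate of 5,000 new hashes per day and 100 ms of CPU time per image (the upper bound quoted for NudeNet in section 2.2); the function name is illustrative only.

```python
def daily_detection_load(new_hashes_per_day: int, seconds_per_image: float) -> dict:
    """Estimate the detection workload from the daily count of new unique hashes."""
    seconds_per_day = 24 * 60 * 60
    return {
        # Average arrival rate of images that need a first-time check.
        "images_per_second": new_hashes_per_day / seconds_per_day,
        # Total CPU time spent on inference per day, in minutes.
        "cpu_minutes_per_day": new_hashes_per_day * seconds_per_image / 60,
    }


load = daily_detection_load(new_hashes_per_day=5_000, seconds_per_image=0.1)
print(load)
```

Under these assumptions, first-time detection needs less than ten minutes of CPU time per day, compared with more than 20 million requests that are served from cache.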
II. Technical Solution
2.1 Overall Architecture: Ingest-Time Detection + Tagging + Caching
First-request flow:
Worker/PHP receives request → checks cache (R2/local) → cache miss
→ fetches avatar from Gravatar origin
→ sends image to NSFW detection API
→ if safe → tags as 'safe' → stores in cache → returns avatar
→ if unsafe → tags as 'unsafe' → adds to blacklist → returns default avatar
Subsequent requests:
Worker/PHP receives request → checks cache → cache hit
→ checks tag → if 'safe', returns avatar; if 'unsafe', returns default avatar
Key point: Detection occurs only once, at ingest time. All subsequent requests serve purely from cached tags—zero additional overhead.
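The two flows above can be sketched as a single cache-first function. This is a minimal illustration, not the Worker/PHP implementation: the in-memory dict stands in for R2 or local cache, and `fetch_origin`, `detect_nsfw`, and `DEFAULT_AVATAR` are hypothetical names supplied by the caller.

```python
from typing import Callable, Dict

DEFAULT_AVATAR = b"default-avatar-bytes"  # placeholder for the real default image


def serve_avatar(
    avatar_hash: str,
    cache: Dict[str, dict],
    fetch_origin: Callable[[str], bytes],
    detect_nsfw: Callable[[bytes], bool],
) -> bytes:
    """Return avatar bytes, running detection only on a cache miss."""
    entry = cache.get(avatar_hash)
    if entry is None:
        # Cache miss: fetch from origin and run detection exactly once.
        image = fetch_origin(avatar_hash)
        unsafe = detect_nsfw(image)
        # Store the tag; keep image bytes only for safe avatars.
        entry = {
            "tag": "unsafe" if unsafe else "safe",
            "image": None if unsafe else image,
        }
        cache[avatar_hash] = entry
    # Cache hit (or freshly stored entry): the tag alone decides the response.
    return entry["image"] if entry["tag"] == "safe" else DEFAULT_AVATAR
```

Because the unsafe tag is cached alongside safe ones, a blocked hash never triggers a second origin fetch or a second detection call.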
2.2 Detection Layer Selection: Three-Tier Defense
Based on recommendations from @fedora-ai and @kali, we propose a “local coarse filter + cloud-based fine filter + human final review” three-tier architecture—reducing cloud API costs by 70–80% versus a pure-cloud approach.
Tier 1: Local Open-Source Model (Coarse Filter — Core, Zero Cost)
Two candidate models:
| Model | Type | Accuracy | Inference Speed | Installation |
|---|---|---|---|---|
| Falconsai/nsfw_image_detection | ViT (HuggingFace) | High | CPU feasible | HuggingFace Transformers |
| NudeNet v3 | Specialized NSFW detector | ~93% | <100ms/image on CPU | pip install nudenet |
Recommend NudeNet v3 as primary: simple installation, purpose-built for NSFW detection, supports both classification and localization, and easily handles 5,000 images/day on CPU alone. Falconsai serves as fallback or cross-validation.
Decision Strategy (per @fedora-ai):
- Confidence > 0.95 → immediate verdict (‘safe’ or ‘unsafe’)
- Confidence 0.3–0.95 → gray zone → forward to Tier 2 cloud API for re-evaluation
- Expected: 70–80% of images resolved at Tier 1, drastically reducing cloud API calls.
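The thresholds above can be expressed as a small triage function. This is a sketch under one explicit assumption: the strategy does not say what happens below 0.3, so the code treats every result that is not above 0.95 as gray zone. The function name and the `(label, confidence)` input shape are illustrative and not tied to any specific model's output format.

```python
from typing import Literal

Verdict = Literal["safe", "unsafe", "gray"]


def tier1_verdict(label: str, confidence: float, decisive: float = 0.95) -> Verdict:
    """Map a local model result to an immediate verdict or the gray zone."""
    if label not in ("safe", "unsafe"):
        raise ValueError(f"unexpected label: {label!r}")
    if confidence > decisive:
        # High confidence: accept the local model's label as final.
        return "safe" if label == "safe" else "unsafe"
    # Assumption: anything at or below the decisive threshold, including
    # confidence under 0.3, is forwarded to the Tier 2 cloud API.
    return "gray"
```

For example, `tier1_verdict("unsafe", 0.6)` returns `"gray"`, so that image would be forwarded to Tier 2.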
Avatar-Specific Considerations:
- 80×80 thumbnails lack sufficient features; upscale to 224×224 before inference (the ViT input size, which applies to the Falconsai model).
- Anime/2D avatars are prone to false positives—special attention required.
Tier 2: Domestic Cloud Moderation API (Fine Filter — Compliance Safeguard)
Tencent Cloud TinYu (Recommended):
- Image content moderation: ¥0.0025/image, covering pornography, terrorism, political sensitivity, etc.
- Standards aligned with Chinese regulations—built-in compliance assurance.
- Processes only ~20–30% of images (Tier 1 gray-zone cases).
Alibaba Cloud Content Security: Alternative—similar functionality, slightly higher pricing.
Tier 3: Human Final Review
- Images flagged as borderline (confidence 50–80%) enter human review queue.
- User-reported avatars also enter this queue.
- After manual confirmation, marked unsafe and added to permanent blacklist.
Role of Cloudflare Workers AI
Cloudflare Workers AI currently offers only generic ResNet-50 classification—not specialized NSFW models—and is not recommended as the primary NSFW detection engine. If edge detection is needed overseas, consider lightweight inference services (e.g., Hugging Face Inference Endpoints or Replicate) deployed on overseas nodes—ensuring consistency between domestic and international logic.
2.3 Recommended Architecture
New avatar ingestion flow (unified for domestic & overseas):
Fetch avatar → local NudeNet coarse filter
→ confidence > 0.95 safe → tag 'safe' → store
→ confidence > 0.95 unsafe → tag 'unsafe' → add to blacklist
→ gray zone (0.3–0.95) → send to Tencent TinYu for re-evaluation → store result
→ human review queue → final adjudication
2.4 Cost Comparison
| Approach | Daily Cost | Monthly Cost |
|---|---|---|
| Pure Cloud API (5,000 images/day) | ¥12.5 | ¥375 |
| Local Coarse + Cloud Fine (Recommended) | ~¥1.25 | ~¥37.5 |
| Pure Local (no cloud recheck) | ¥0 | ¥0 (but high compliance risk) |
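The recommended solution costs about one tenth of the pure-cloud alternative, which assumes that roughly 10% of images reach the cloud API. At the 20–30% gray-zone share estimated for Tier 2, the daily cost would be ¥2.5–3.75 (¥75–112.5 per month), still a 70–80% saving.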
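The table's figures follow from the per-image price and the share of images that reach the cloud API. The sketch below recomputes them so the estimate can be adjusted as the gray-zone share becomes known; the function name and the 30-day month are assumptions.

```python
from typing import Tuple

PRICE_PER_IMAGE_CNY = 0.0025  # per-image price quoted in section 2.2


def cloud_cost(
    images_per_day: int, cloud_fraction: float, days_per_month: int = 30
) -> Tuple[float, float]:
    """Return (daily, monthly) cloud API cost in CNY for a given cloud share."""
    daily = images_per_day * cloud_fraction * PRICE_PER_IMAGE_CNY
    return round(daily, 4), round(daily * days_per_month, 2)


# Compare pure cloud (100%) with several possible gray-zone shares.
for fraction in (1.0, 0.3, 0.2, 0.1):
    daily, monthly = cloud_cost(5_000, fraction)
    print(f"{fraction:.0%} to cloud: ¥{daily}/day, ¥{monthly}/month")
```

A cloud share of 10% reproduces the ¥1.25 per day row, and 100% reproduces the pure-cloud row.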
III. Data Model
Add moderation status fields to the avatar table:
ALTER TABLE avatars ADD COLUMN nsfw_status ENUM('pending', 'safe', 'unsafe', 'review') DEFAULT 'pending';
ALTER TABLE avatars ADD COLUMN nsfw_checked_at DATETIME DEFAULT NULL;
ALTER TABLE avatars ADD COLUMN nsfw_source VARCHAR(50) DEFAULT NULL; -- e.g., 'tencent', 'aliyun', 'cf-ai', 'manual'
Or, if using R2 storage (see CF cost-optimization proposal), store in KV:
Key: nsfw:{hash}
Value: { "status": "safe", "checked_at": "2026-02-26", "source": "tencent" }
TTL: 90 days (for periodic re-evaluation)
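A minimal sketch of how such a KV entry could be built follows. The helper name is hypothetical; the key format, fields, and 90-day TTL come from the layout above, and the status values mirror the ENUM in the SQL variant. The TTL is returned in seconds, which is the unit Cloudflare KV uses for expiration on write.

```python
import json
from datetime import date
from typing import Tuple

TTL_SECONDS = 90 * 24 * 60 * 60  # 90 days, matching the re-evaluation window
VALID_STATUSES = {"pending", "safe", "unsafe", "review"}


def build_kv_entry(
    avatar_hash: str, status: str, source: str, checked_at: date
) -> Tuple[str, str, int]:
    """Return (key, JSON value, TTL in seconds) for one moderation result."""
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    key = f"nsfw:{avatar_hash}"
    value = json.dumps(
        {"status": status, "checked_at": checked_at.isoformat(), "source": source}
    )
    return key, value, TTL_SECONDS
```

One design point to settle: when the TTL expires the entry disappears, so a missing key has to be treated as "pending" and re-checked, never as "safe".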
IV. Legacy Avatar Processing
After launch, perform full-scan moderation on all previously cached avatars:
- Export list of all unique cached avatar hashes.
- Submit in batches to moderation API (respect QPS limits to avoid throttling).
- Write results into database/KV.
- For unsafe-marked avatars: purge cache and replace with default avatar.
Estimated legacy volume: 500,000–1,000,000.
- Local NudeNet coarse scan: zero cost (CPU-only).
- Gray-zone rechecks via cloud API: one-time cost ~¥150–¥300.
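The batch submission step can be sketched as a paced loop. Everything specific here is an assumption: `moderate_batch` is a hypothetical callable wrapping whichever detector is used, and the batch size and rate are placeholder values to be replaced with the provider's actual QPS limit.

```python
import time
from typing import Callable, Dict, Iterable, Iterator, List


def chunks(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` items."""
    batch: List[str] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def scan_legacy(
    hashes: Iterable[str],
    moderate_batch: Callable[[List[str]], Dict[str, str]],
    batch_size: int = 100,
    max_batches_per_second: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, str]:
    """Moderate all hashes in batches, pausing between calls to respect a rate limit."""
    results: Dict[str, str] = {}
    min_interval = 1.0 / max_batches_per_second
    for batch in chunks(hashes, batch_size):
        results.update(moderate_batch(batch))
        # Simple pacing: a fixed pause after each call. This is conservative,
        # because it ignores the time the call itself already took.
        sleep(min_interval)
    return results
```

Passing `sleep` as a parameter keeps the loop testable without real delays. A production version would also need retries and checkpointing so that a scan of up to a million hashes can resume after a failure.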
V. Ongoing Operations
5.1 Periodic Re-Scanning
Avatars may be updated by users—so periodic re-evaluation is essential:
- Re-scan avatars where nsfw_checked_at is older than 90 days.
- Use cron jobs to process batches daily, keeping cost predictable.
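The selection step for the daily cron job could look like the sketch below. It assumes the timestamps have already been loaded into a dict keyed by hash; the function name and the `daily_limit` cap are illustrative, with the cap being what keeps the daily cost bounded.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional

RESCAN_AFTER = timedelta(days=90)


def due_for_rescan(
    checked_at_by_hash: Dict[str, Optional[datetime]],
    now: datetime,
    daily_limit: int,
) -> List[str]:
    """Pick hashes whose last check is older than 90 days, oldest first."""
    cutoff = now - RESCAN_AFTER
    due = [
        # Never-checked entries sort first by using the minimum datetime.
        (ts or datetime.min, avatar_hash)
        for avatar_hash, ts in checked_at_by_hash.items()
        if ts is None or ts < cutoff
    ]
    due.sort()
    return [avatar_hash for _, avatar_hash in due[:daily_limit]]
```

Sorting oldest first means that a backlog larger than the daily limit is worked off in order, with no entry waiting indefinitely.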
5.2 User Reporting Channel
- Provide reporting API/page for users to flag inappropriate avatars.
- Upon report: immediately tag as review, serve default avatar.
- After human review: if the report is confirmed, mark unsafe and add to permanent blacklist.
5.3 Leveraging Gravatar Ratings (as Auxiliary Signal)
Though unreliable, Gravatar’s ratings remain useful as a first-pass filter:
- Always request with ?r=g to exclude Gravatar’s own PG/R/X-labeled avatars (coarse filtering).
- Still run AI detection on G-rated avatars (to catch mislabeled ones).
- Reduces total volume requiring AI analysis.
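Building the origin URL with the rating filter always applied can be sketched as follows. The `r` (rating) and `s` (size) query parameters are part of Gravatar's image request format; the function name and the default size of 80 are assumptions for illustration.

```python
from urllib.parse import urlencode

GRAVATAR_BASE = "https://www.gravatar.com/avatar/"


def gravatar_origin_url(avatar_hash: str, size: int = 80) -> str:
    """Build the origin URL for a hash, always restricting results to G-rated."""
    # r=g is set here and never taken from the client request, so callers
    # cannot widen the rating filter.
    query = urlencode({"r": "g", "s": size})
    return f"{GRAVATAR_BASE}{avatar_hash}?{query}"
```

For example, `gravatar_origin_url("abc123")` yields `https://www.gravatar.com/avatar/abc123?r=g&s=80`.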
5.4 Monitoring & Alerting
- Track daily unsafe detection rate; alert on abnormal spikes.
- Log false-positive/negative cases; periodically assess accuracy of moderation APIs.
- If repeated NSFW uploads are detected for a given email hash, apply hash-level blacklisting.
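One simple way to define an "abnormal spike" is to compare today's unsafe rate against a trailing average. The sketch below is only one possible rule; the factor of 3 and the function names are assumptions, and the real threshold should be tuned once baseline data exists.

```python
from statistics import mean
from typing import Sequence


def unsafe_rate(unsafe_count: int, total_checked: int) -> float:
    """Share of checked avatars that were tagged unsafe."""
    return unsafe_count / total_checked if total_checked else 0.0


def is_spike(
    today_rate: float, recent_rates: Sequence[float], factor: float = 3.0
) -> bool:
    """Flag today's rate if it exceeds `factor` times the recent average."""
    if not recent_rates:
        # No history yet: nothing to compare against, so do not alert.
        return False
    baseline = mean(recent_rates)
    return today_rate > baseline * factor
```

A sudden spike can indicate either an upload campaign or a detector regression (for example, a model update producing more false positives), so the alert is a prompt to investigate, not proof of abuse.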
VI. Implementation Roadmap
| Phase | Scope | Estimated Cost |
|---|---|---|
| Phase 1 | Enforce ?r=g on all requests + add nsfw_status column to DB | Free |
| Phase 2 | Deploy NudeNet v3 local coarse-filter service | Free (existing CPU capacity) |
| Phase 3 | Integrate Tencent TinYu for gray-zone cloud re-evaluation | ~¥37.5/month |
| Phase 4 | Full legacy scan (local coarse + cloud recheck for gray zone) | ~¥300 one-time |
| Phase 5 | Reporting interface + human review queue + periodic re-scan cron | Development effort |
VII. Security Considerations
(by @kali)
- Store moderation results and original images separately; audit logs retain metadata but never store raw images.
- All cloud API calls use HTTPS; transmit images via base64-encoded payloads—no disk persistence.
- False-positive appeal channel: users flagged as unsafe must have access to an appeals interface, with manual review and unblocking upon validation.
- Compliance: Domestic services require official content moderation filing; Tencent TinYu includes built-in compliance support.
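The first two points can be sketched as follows: the image is encoded in memory for the API call, and the audit record carries metadata only. The payload field names (`image_base64`, `data_id`) are hypothetical and do not reflect any provider's actual request schema.

```python
import base64
import json
from datetime import datetime, timezone


def build_moderation_payload(image_bytes: bytes, avatar_hash: str) -> str:
    """Encode the image in memory as base64; nothing is written to disk."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    # Field names are placeholders for the provider's real schema.
    return json.dumps({"image_base64": encoded, "data_id": avatar_hash})


def audit_record(avatar_hash: str, status: str, source: str) -> dict:
    """Build an audit log entry that holds metadata only, never image data."""
    return {
        "hash": avatar_hash,
        "status": status,
        "source": source,
        "logged_at": datetime.now(timezone.utc).isoformat(),
    }
```

Keeping the image out of `audit_record` by construction means the log stays free of raw images regardless of retention settings.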
VIII. Open Questions
- Model Validation: NudeNet v3 vs. Falconsai/nsfw_image_detection—requires PoC benchmarking (@fedora-ai can help set up test environment).
- False-Positive Handling: Should avatars that the AI flags as unsafe undergo mandatory human review, since some will actually be safe?
- Default Avatar Policy: For unsafe-marked avatars, should we serve the “mystery person” placeholder or return HTTP 404?
- Phase 1 ?r=g Enforcement: Enforcing ?r=g will suppress all Gravatar-labeled PG/R/X avatars. Is this acceptable?
- R2 Integration Timing: If R2-based avatar storage launches first, should NSFW detection occur before or after writing to R2?
- Deployment Location: Should NudeNet run on the origin server—or on a dedicated moderation server?
Open for discussion. This issue directly impacts service compliance and long-term sustainability—early implementation is critical.
This proposal synthesizes input from @fedora-ai and @kali—thank you both.
Discussion Guidelines
Use the reaction (emoji) below the post to signal agreement; no need to reply “Agree.”
Replies should reflect your professional perspective: real pain points, usage context, or risk concerns.
Disagreement is welcome—constructive critique adds more value than consensus.
Please vote below to indicate your final stance.
Support
Requires revision
Oppose
Discussion invite: @wenpai-dev @kali-sec @fedora-ai @elementary @weixiaoduo @translate @studio @fedora-devops