
DocumentCloud

Upload, analyze, annotate, and publish source documents for investigations.

Data & analysis
Built for journalism
Open source
Strong
https://www.documentcloud.org
Reviewed 2026-04-02
Editorial assessment by Mike Schneider (not an independent security audit)

What should journalists know about DocumentCloud?

DocumentCloud is how major investigations show their work. ProPublica, The New York Times, and hundreds of newsrooms use it to upload court filings, leaked memos, and government records, annotate key sections, then embed them directly in stories. The platform's add-on ecosystem now includes GPT-4 Vision table extraction, PII detection, and entity extraction via Google Cloud NLP — real AI tooling, not vaporware. MuckRock's nonprofit stewardship (since the 2018 merger) keeps it journalist-focused, and the October 2025 merger with Sunlight Research Center added hands-on research support for local newsrooms. The January 2025 UI redesign is noticeably faster. Biggest gap versus Google Pinpoint: no semantic search or knowledge-graph entity matching. Biggest advantage over Pinpoint: public embedding, collaborative annotation, and self-hosting via open source.

Best for

Publishing annotated source documents alongside stories. OCR on scanned PDFs (Tesseract free, Textract/Azure/Google Vision premium). Collaborative document review across a newsroom. Embedding primary sources in articles via responsive viewer. Bulk processing large FOIA dumps with add-ons.

Not for

Semantic search across large document sets (Google Pinpoint is stronger there). Not a dedicated private document vault: documents can be shared organization-wide or published, so check access levels before uploading. Not for audio/video transcription. Limited entity-matching compared to Pinpoint's knowledge graph.

Security & Privacy

Encryption in transit: Yes

Data is scrambled while being sent to their servers

Encryption at rest: Yes

Data is scrambled when stored on their servers

Data jurisdiction: AWS US. All documents stored on Amazon Web Services infrastructure in the United States.

Where servers are located, which affects which governments can request your data

Security rating: Strong

Privacy policy summary

Operated by MuckRock, a 501(c)(3) nonprofit. Three access levels: private (only you), organization (your newsroom), and public (anyone, indexed and searchable). Default is private. MuckRock does not sell user data. Public documents are fully indexed by search engines. Organization members can edit any org-shared document, including changing ownership.

How to protect yourself:

Verify the access level before every upload, because organization members can edit org-shared documents. Redact before uploading, not after (originals may persist in the processing pipeline). Strip metadata from files before upload. Use private access for pre-publication documents. Notes can be set independently to private, collaborator-only, or public. If a journalist leaves an organization, they lose edit access to public documents owned by that org.
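One way to make the access-level check routine is a small pre-upload guard. The sketch below is illustrative only: `resolve_upload_access` and its `pre_publication` flag are hypothetical names, not part of DocumentCloud's API, and the three access levels are taken from the privacy summary above.

```python
# Hypothetical pre-upload guard; not part of any DocumentCloud client library.
ACCESS_LEVELS = ("private", "organization", "public")


def resolve_upload_access(requested: str, pre_publication: bool) -> str:
    """Return a normalized access level, or raise if it is unsafe or unknown."""
    level = requested.strip().lower()
    if level not in ACCESS_LEVELS:
        raise ValueError(f"unknown access level: {requested!r}")
    # Pre-publication material should only ever go up as private.
    if pre_publication and level != "private":
        raise PermissionError(
            f"pre-publication documents must be private, not {level!r}"
        )
    return level


print(resolve_upload_access("Private", pre_publication=True))  # private
```

Calling the guard explicitly for every upload, rather than relying on whatever the platform default happens to be, keeps the choice of access level visible in the upload script.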

Nonprofit-operated, open-source, hosted on AWS US. Three-tier access controls (private, organization, public). Built specifically for journalism with source document publishing as the core use case. No tracking or advertising. The coarse org-level permissions and the risk of accidentally publishing private documents are the main concerns — both mitigated by verifying access levels before upload.

Who Owns This

Owner: MuckRock Foundation (501(c)(3) nonprofit, merged with DocumentCloud in 2018, merged with Sunlight Research Center in October 2025)
Funding: Knight Foundation grants, Google News Initiative, Democracy Fund, News Integrity Initiative, individual donations, and paid premium plans.
Business model: Freemium nonprofit. Free tier for verified journalists (100 pages/month). Paid professional and organization tiers fund AI credits and premium OCR. Gateway grants available for newsrooms needing bulk document processing.

Known issues

The default access level has changed over the years, so always verify before uploading sensitive documents. OCR quality with the free Tesseract engine is mediocre on noisy scans; premium Textract is significantly better but costs AI credits. No semantic search or entity-matching: if you need to find connections across thousands of documents, use Google Pinpoint alongside DocumentCloud. The embed viewer degrades to a thumbnail link below 200px width. The organization permission model is coarse: any org member can edit any org-shared document, including reassigning ownership. An open-source self-hosting option exists, but documentation is sparse and the codebase has diverged from the hosted version.

Pricing

Free tier: 100 pages/month for verified news organizations. Professional plans include 2,000 AI credits/month. Organization plans include 5,000 AI credits/month for the first 5 users, plus 500 per additional user. AI credits power premium OCR (Textract, Azure, Google Vision) and GPT-based add-ons.
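As a rough worked example of the organization-plan arithmetic, the sketch below computes the monthly AI credit allowance from the figures quoted above (5,000 credits covering the first 5 users, plus 500 per additional user). The function name is hypothetical, and the numbers should be checked against current pricing before budgeting.

```python
def org_monthly_ai_credits(users: int) -> int:
    """Monthly AI credits for an organization plan, using the figures quoted above."""
    if users < 1:
        raise ValueError("an organization plan needs at least one user")
    base_credits = 5_000    # allowance covering the first 5 users
    included_users = 5
    per_extra_user = 500    # added for each user beyond the first 5
    extra_users = max(0, users - included_users)
    return base_credits + extra_users * per_extra_user


print(org_monthly_ai_credits(5))   # 5000
print(org_monthly_ai_credits(12))  # 8500
```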

This is an editorial assessment based on publicly available information as of 2026-04-02, using our published methodology. Independent security review is pending. Security posture can change at any time. This is not a guarantee of safety.
