DocumentCloud
Upload, analyze, annotate, and publish source documents for investigations.
What should journalists know about DocumentCloud?
DocumentCloud is how major investigations show their work. ProPublica, The New York Times, and hundreds of newsrooms use it to upload court filings, leaked memos, and government records, annotate key sections, then embed them directly in stories. The platform's add-on ecosystem now includes GPT-4 Vision table extraction, PII detection, and entity extraction via Google Cloud NLP — real AI tooling, not vaporware. MuckRock's nonprofit stewardship (since the 2018 merger) keeps it journalist-focused, and the October 2025 merger with Sunlight Research Center added hands-on research support for local newsrooms. The January 2025 UI redesign is noticeably faster. Biggest gap versus Google Pinpoint: no semantic search or knowledge-graph entity matching. Biggest advantage over Pinpoint: public embedding, collaborative annotation, and self-hosting via open source.
Best for: publishing annotated source documents alongside stories; OCR on scanned PDFs (Tesseract free, Textract/Azure/Google Vision premium); collaborative document review across a newsroom; embedding primary sources in articles via the responsive viewer; bulk processing of large FOIA dumps with add-ons.
Not suited for: semantic search across large document sets (Google Pinpoint is stronger there); use as a private document vault by default (check access levels before uploading); audio or video transcription; entity matching at the level of Pinpoint's knowledge graph.
Security & Privacy
Data is scrambled while being sent to their servers
Data is scrambled when stored on their servers
Where servers are located — affects which governments can request your data
Privacy policy summary
Operated by MuckRock, a 501(c)(3) nonprofit. Three access levels: private (only you), organization (your newsroom), and public (anyone, indexed and searchable). Default is private. MuckRock does not sell user data. Public documents are fully indexed by search engines. Organization members can edit any org-shared document, including changing ownership.
How to protect yourself:
Verify the access level before every upload, because organization members can edit org-shared documents. Redact before uploading, not after (originals may persist in the processing pipeline). Strip metadata from files before upload. Use private access for pre-publication documents. Notes can be set independently to private, collaborator-only, or public. A journalist who leaves an organization loses edit access to public documents owned by that organization.
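The checks above can be folded into a small pre-upload guard. This is a minimal sketch in Python, not part of DocumentCloud's tooling: `check_upload` and its parameters are hypothetical names chosen for illustration, and only the three access levels (private, organization, public) come from the text.

```python
# Access levels as described in the privacy summary above.
ACCESS_LEVELS = {"private", "organization", "public"}


def check_upload(access: str, prepublication: bool,
                 redacted: bool, metadata_stripped: bool) -> list[str]:
    """Return a list of problems; an empty list means the checklist passed.

    Hypothetical helper: it only encodes the advice in this section and
    never talks to DocumentCloud.
    """
    problems = []
    if access not in ACCESS_LEVELS:
        problems.append(f"unknown access level: {access!r}")
    elif prepublication and access != "private":
        problems.append("pre-publication documents should use private access")
    if not redacted:
        problems.append("redact before uploading, not after")
    if not metadata_stripped:
        problems.append("strip file metadata before upload")
    return problems


# Example: an unpublished memo about to be shared with the whole newsroom.
for problem in check_upload("organization", prepublication=True,
                            redacted=True, metadata_stripped=False):
    print("BLOCKED:", problem)
```

An empty result only means none of these checks failed; it says nothing about whether the redaction itself is sound.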
Nonprofit-operated, open-source, hosted on AWS US. Three-tier access controls (private, organization, public). Built specifically for journalism with source document publishing as the core use case. No tracking or advertising. The coarse org-level permissions and the risk of accidentally publishing private documents are the main concerns — both mitigated by verifying access levels before upload.
Who Owns This
Known issues
The default access level has changed over the years, so always verify it before uploading sensitive documents. OCR quality with the free Tesseract engine is mediocre on noisy scans; premium Textract is significantly better but costs AI credits. There is no semantic search or entity matching: if you need to find connections across thousands of documents, use Google Pinpoint alongside DocumentCloud. The embed viewer degrades to a thumbnail link below 200px width. The organization permission model is coarse: any org member can edit any org-shared document, including reassigning ownership. An open-source self-hosting option exists, but its documentation is sparse and the codebase has diverged from the hosted version.
Pricing
Free tier: 100 pages/month for verified news organizations. Professional plans include 2,000 AI credits/month. Organization plans include 5,000 AI credits/month for the first 5 users, plus 500 per additional user. AI credits power premium OCR (Textract, Azure, Google Vision) and GPT-based add-ons.
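The organization-plan allowance is simple arithmetic, sketched here in Python. The function name and constants are illustrative and come from the figures quoted above, not from any DocumentCloud API; the sketch also assumes that organizations with fewer than five users still receive the full base allowance.

```python
BASE_CREDITS = 5_000           # monthly AI credits covering the first 5 users
INCLUDED_USERS = 5             # users covered by the base allowance
CREDITS_PER_EXTRA_USER = 500   # added for each user beyond the first 5


def org_monthly_credits(users: int) -> int:
    """Estimate monthly AI credits for an organization plan (hypothetical helper)."""
    if users < 1:
        raise ValueError("an organization needs at least one user")
    extra_users = max(0, users - INCLUDED_USERS)
    return BASE_CREDITS + extra_users * CREDITS_PER_EXTRA_USER


print(org_monthly_credits(5))   # 5000
print(org_monthly_credits(12))  # 5000 + 7 * 500 = 8500
```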
This is an editorial assessment based on publicly available information as of 2026-04-02, using our published methodology. Independent security review is pending. Security posture can change at any time. This is not a guarantee of safety.