Google Pinpoint
AI document analysis for investigative journalism.
What should journalists know about Google Pinpoint?
Pinpoint is the best free tool for searching large document sets. Upload thousands of FOIA pages and it identifies entities, transcribes audio in 100+ languages, and lets you ask questions about your collection using Gemini-powered generative AI. The structured data extraction is genuinely powerful for repetitive FOIA work: highlight fields in one document, and Pinpoint pulls them from similar documents across the collection.

The catch is the one you already know: your documents go to Google's servers. Google says Pinpoint data won't train LLMs, but human reviewers at Google can read samples of your prompts and feedback. The broader Google Privacy Policy permits using data to 'improve existing services' and 'develop new services,' and Google complies with government data requests. For public records this is fine. For leaked documents, whistleblower materials, or anything where the mere existence of your search interest is sensitive, it's a non-starter. Use DocumentCloud (self-hostable, open-source) or Aleph for those.
Best for: FOIA document analysis, public records research, large document set exploration, court filing review, audio/video transcription, and structured data extraction from repetitive document formats.
Not for: leaked or classified documents, whistleblower materials, or anything where your search queries themselves are sensitive. There is no API, so Pinpoint can't be integrated into custom workflows. It is not useful on mobile, and the generative AI features require separate early-access approval.
Security & Privacy
Data is scrambled while being sent to their servers
Data is scrambled when stored on their servers
Where servers are located — affects which governments can request your data
Privacy policy summary
Google's general privacy policy applies. Collections are private by default. Google states uploaded documents will not be used to train LLMs. However, Google human reviewers may read, annotate, and process samples of your Pinpoint data — including prompts and thumbs-up/down feedback on generative AI features. Google explicitly warns against including personally identifiable information (phone numbers, emails, birth dates) in AI prompts. The broader Google Privacy Policy permits using data to 'develop new services' and 'improve existing services.' Google complies with government data requests and publishes a transparency report. Your Pinpoint activity may be correlated with other Google services tied to your account.
How to protect yourself:
Use a dedicated Google account for Pinpoint that's not linked to your personal email, browsing, or Android phone. Don't upload documents that could identify confidential sources. Don't put PII in generative AI prompts — Google's own help docs warn against this. For sensitive document sets, use DocumentCloud (self-hosted option, open-source) or Aleph (occrp.org). Delete collections when analysis is complete. If a collection sits inactive for 4+ months, Gemini features degrade — you'll need to re-upload to a new collection.
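The advice to keep PII out of generative AI prompts can be backed by a quick local check before you paste a question into Pinpoint. The following is a minimal sketch, not a Pinpoint feature: it runs entirely on your own machine, the names `PII_PATTERNS` and `find_pii` are hypothetical, and the regular expressions are illustrative assumptions that only catch common email, North American phone, and numeric date formats. Treat a clean result as a prompt to double-check by eye, not as proof that the text is safe.

```python
import re

# Illustrative patterns only (an assumption of this sketch); they will miss
# many real-world formats, including names, addresses, and spelled-out dates.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(
        r"(?<!\d)(?:\+?\d{1,3}[ .-]?)?(?:\(\d{3}\)|\d{3})[ .-]?\d{3}[ .-]?\d{4}(?!\d)"
    ),
    "date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b|\b\d{4}-\d{2}-\d{2}\b"),
}


def find_pii(prompt: str) -> dict[str, list[str]]:
    """Return the matched strings per category; an empty dict means no hits."""
    hits = {}
    for label, pattern in PII_PATTERNS.items():
        found = pattern.findall(prompt)
        if found:
            hits[label] = found
    return hits


if __name__ == "__main__":
    draft = "Who emailed jane.doe@example.org on 2021-03-04?"
    for label, matches in find_pii(draft).items():
        print(f"Possible {label} in prompt: {matches}")
```

If the check flags anything, rephrase the question around a role or a document reference instead of the identifying detail.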
Strong infrastructure security (Google Cloud encryption, private-by-default collections) but documents are processed on Google's servers under Google's broad privacy policy. Human reviewers can sample your prompts. No journalist-specific data protection guarantees. Use a dedicated account and keep sensitive source materials off the platform entirely.
Who Owns This
Known issues
Entity recognition produces false positives (in one documented case, a transcript of 'sixty frickin' Chiefs' surfaced the Kansas City Chiefs as an organization). Audio transcription does not separate speakers and breaks paragraphs poorly, making quote extraction difficult. Handwriting OCR is unreliable for anything less than perfectly neat writing. OCR accuracy for Hindi and some other non-Latin scripts is weak. Table auto-detection sometimes fails, requiring manual intervention that doesn't scale for large sets.

There is no API: zero programmatic access and no way to integrate Pinpoint into custom pipelines. The generative AI is not conversational; each question is independent, with no follow-up context. Structured data extraction is limited to 100 documents and 5 fields per batch. New features only apply to collections created after the feature ships, so old collections don't get upgrades. Pinpoint is desktop-only in practice; the mobile experience is minimal.

Google has complied with government requests for user data in cases involving journalists.
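Because structured data extraction is capped at 100 documents and 5 fields per batch, a large job has to be split by hand. The sketch below only plans that split on your own machine; it does not talk to Pinpoint, which has no API. The function names `chunk` and `plan_batches` are hypothetical, and the sketch assumes you can run several extraction batches one after another in the web interface.

```python
from itertools import product

MAX_DOCS_PER_BATCH = 100   # per-batch document limit noted under Known issues
MAX_FIELDS_PER_BATCH = 5   # per-batch field limit noted under Known issues


def chunk(items, size):
    """Split a list into consecutive slices of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def plan_batches(documents, fields):
    """Pair every document chunk with every field chunk.

    Each returned (documents, fields) pair stays within both limits, so it
    describes one manual extraction run.
    """
    return list(
        product(
            chunk(documents, MAX_DOCS_PER_BATCH),
            chunk(fields, MAX_FIELDS_PER_BATCH),
        )
    )


if __name__ == "__main__":
    docs = [f"inspection_{i:03d}.pdf" for i in range(250)]
    fields = ["date", "inspector", "site", "violation", "fine", "status", "notes"]
    batches = plan_batches(docs, fields)
    print(f"{len(batches)} manual extraction runs needed")
```

With 250 documents and 7 fields, the plan comes to 3 document chunks times 2 field chunks, or 6 separate runs, which shows how quickly the manual work grows.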
Pricing
Free. No paid tiers. 100GB storage per user, up to 200,000 files per collection, 20,000 uploads per day.
Entirely free for verified journalists and academic researchers.
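The quotas above translate into simple planning arithmetic for a large records release. The following is a rough sketch with a hypothetical helper, `upload_plan`; it assumes the limits apply exactly as stated (100GB per user, 200,000 files per collection, 20,000 uploads per day) and ignores anything already stored in the account.

```python
import math

STORAGE_LIMIT_GB = 100          # storage per user, as listed above
FILES_PER_COLLECTION = 200_000  # file cap per collection, as listed above
UPLOADS_PER_DAY = 20_000        # daily upload cap, as listed above


def upload_plan(file_count: int, total_size_gb: float) -> dict:
    """Estimate whether a document set fits and how long uploading takes."""
    return {
        "fits_storage": total_size_gb <= STORAGE_LIMIT_GB,
        "collections_needed": math.ceil(file_count / FILES_PER_COLLECTION),
        "days_to_upload": math.ceil(file_count / UPLOADS_PER_DAY),
    }


if __name__ == "__main__":
    print(upload_plan(file_count=450_000, total_size_gb=80))
```

For 450,000 files totalling 80GB, the estimate is 3 collections and 23 days of uploading at the daily cap, so the upload limit rather than storage is the constraint in that case.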
This is an editorial assessment based on publicly available information as of 2026-04-02, using our published methodology. Independent security review is pending. Security posture can change at any time. This is not a guarantee of safety.