
Google Pinpoint

AI document analysis for investigative journalism.

Newsgathering
Free for journalists
Built for journalism
Caution
https://journaliststudio.google.com/pinpoint/about/
Reviewed 2026-04-02
Editorial assessment by Mike Schneider (not an independent security audit)

What should journalists know about Google Pinpoint?

Pinpoint is the best free tool for searching large document sets. Upload thousands of FOIA pages and it identifies entities, transcribes audio in 100+ languages, and lets you ask questions about your collection using Gemini-powered generative AI. The structured data extraction (highlight fields in one document, and Pinpoint pulls them from similar documents across the collection) is genuinely powerful for repetitive FOIA work.

The catch is the one you already know: your documents go to Google's servers. Google says Pinpoint data won't train LLMs, but human reviewers at Google can read samples of your prompts and feedback. The broader Google Privacy Policy permits using data to 'improve existing services' and 'develop new services.' Google complies with government data requests. For public records this is fine. For leaked documents, whistleblower materials, or anything where the mere existence of your search interest is sensitive, it's a non-starter. Use DocumentCloud (self-hostable, open-source) or Aleph for those.

Best for

FOIA document analysis, public records research, large document set exploration, court filing review, audio/video transcription, structured data extraction from repetitive document formats.

Not for

Leaked or classified documents, whistleblower materials, anything where your search queries themselves are sensitive. No API — can't integrate into custom workflows. Not useful on mobile. Generative AI features require separate early-access approval.

Security & Privacy

Encryption in transit Yes

Data is scrambled while being sent to their servers

Encryption at rest Yes

Data is scrambled when stored on their servers

Data jurisdiction United States (Google Cloud)

Where servers are located — affects which governments can request your data

Security rating Caution

Privacy policy summary

Google's general privacy policy applies. Collections are private by default. Google states uploaded documents will not be used to train LLMs. However, Google human reviewers may read, annotate, and process samples of your Pinpoint data — including prompts and thumbs-up/down feedback on generative AI features. Google explicitly warns against including personally identifiable information (phone numbers, emails, birth dates) in AI prompts. The broader Google Privacy Policy permits using data to 'develop new services' and 'improve existing services.' Google complies with government data requests and publishes a transparency report. Your Pinpoint activity may be correlated with other Google services tied to your account.

How to protect yourself:

Use a dedicated Google account for Pinpoint that's not linked to your personal email, browsing, or Android phone. Don't upload documents that could identify confidential sources. Don't put PII in generative AI prompts — Google's own help docs warn against this. For sensitive document sets, use DocumentCloud (self-hosted option, open-source) or Aleph (occrp.org). Delete collections when analysis is complete. If a collection sits inactive for 4+ months, Gemini features degrade — you'll need to re-upload to a new collection.
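One way to act on the "no PII in prompts" advice is to screen a prompt locally before pasting it into Pinpoint. The sketch below is a minimal illustration, not anything Google provides: the function name `find_pii` and the three regular expressions are assumptions made up for this example, and they cover only a few common formats of the phone numbers, emails, and dates mentioned above. Treat a clean result as "nothing obvious found," not as proof the prompt is safe.

```python
import re

# Illustrative patterns only (assumptions for this sketch). They are not
# Google's definition of PII and will miss many real-world formats.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    # North American style numbers such as 555-867-5309 or (555) 867 5309.
    "phone": re.compile(
        r"(?<!\d)(?:\+?\d{1,3}[\s.-]?)?(?:\(\d{3}\)|\d{3})[\s.-]?\d{3}[\s.-]?\d{4}(?!\d)"
    ),
    # Numeric dates such as 04/12/1985 or 1985-04-12 (possible birth dates).
    "date": re.compile(r"\b(?:\d{1,2}[/-]\d{1,2}[/-]\d{2,4}|\d{4}-\d{2}-\d{2})\b"),
}


def find_pii(prompt: str) -> dict[str, list[str]]:
    """Return the suspicious substrings found in a prompt, grouped by label."""
    hits: dict[str, list[str]] = {}
    for label, pattern in PII_PATTERNS.items():
        found = [match.group(0) for match in pattern.finditer(prompt)]
        if found:
            hits[label] = found
    return hits


if __name__ == "__main__":
    draft = "Call Jane at 555-867-5309 or jane@example.org"
    print(find_pii(draft))
```

The check runs entirely on your own machine, so the draft prompt never leaves it. If anything is flagged, rewrite the prompt in general terms (for example, "the contractor" instead of a name and number) before using it.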

Strong infrastructure security (Google Cloud encryption, private-by-default collections) but documents are processed on Google's servers under Google's broad privacy policy. Human reviewers can sample your prompts. No journalist-specific data protection guarantees. Use a dedicated account and keep sensitive source materials off the platform entirely.

Who Owns This

Owner Alphabet Inc. / Google LLC
Funding Corporate. Part of Google News Initiative ($300M+ committed since 2018). Journalist Studio is the product suite; Pinpoint is its flagship tool.
Business model Free tool. No direct revenue. Builds Google's relationship with the journalism industry and positions Google infrastructure as the default for newsroom workflows. Classic ecosystem play — free tools create dependency on the platform.

Known issues

Entity recognition produces false positives (in one documented case, a transcript of 'sixty frickin' Chiefs' surfaced the Kansas City Chiefs as an organization). Audio transcription does not separate speakers and breaks paragraphs poorly, making quote extraction difficult. Handwriting OCR is unreliable for anything less than perfectly neat writing. Hindi and some non-Latin script OCR accuracy is weak.

No API: zero programmatic access, no way to integrate into custom pipelines. Not conversational: each generative AI question is independent, with no follow-up context. Table auto-detection sometimes fails, requiring manual intervention that doesn't scale for large sets. Structured data extraction is limited to 100 documents and 5 fields per batch. New features only apply to collections created after the feature ships; old collections don't get upgrades. Desktop-only in practice; the mobile experience is minimal.

Google has complied with government requests for user data in cases involving journalists.

Pricing

Free. No paid tiers. 100 GB storage per user, up to 200,000 files per collection, 20,000 uploads per day.
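These limits interact: at 20,000 uploads per day, filling a single 200,000-file collection takes at least 10 days. A rough planning sketch follows, under stated assumptions. The function `plan_upload` is hypothetical and runs locally (Pinpoint has no API), and "100 GB" is treated as 100 × 10^9 bytes, which is a guess about how Google counts storage.

```python
# Limits as quoted in this review; check Google's help pages for current values.
STORAGE_LIMIT_BYTES = 100 * 10**9   # assumption: decimal gigabytes
MAX_FILES_PER_COLLECTION = 200_000
MAX_UPLOADS_PER_DAY = 20_000


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division without floating point."""
    return -(-a // b)


def plan_upload(file_count: int, total_bytes: int) -> dict:
    """Estimate whether a document set fits and how long uploading takes."""
    return {
        "fits_storage": total_bytes <= STORAGE_LIMIT_BYTES,
        "collections_needed": ceil_div(file_count, MAX_FILES_PER_COLLECTION),
        "days_needed": ceil_div(file_count, MAX_UPLOADS_PER_DAY),
    }


if __name__ == "__main__":
    # A full collection of 200,000 files totalling 50 GB.
    print(plan_upload(200_000, 50 * 10**9))
```

The day estimate is a lower bound: it assumes you hit the daily cap every day and ignores upload speed and processing time.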

Entirely free for verified journalists and academic researchers.

This is an editorial assessment based on publicly available information as of 2026-04-02, using our published methodology. Independent security review is pending. Security posture can change at any time. This is not a guarantee of safety.
