What we collect, what we don't, and why.

Effective May 18, 2026 (v0.7.0 update) · Northbeams Inc

The short version. Northbeams runs on three surfaces: a browser extension, a desktop app (Mac and PC), and the CLI (covered by the same desktop app). The browser extension classifies prompts inside the browser; raw prompt text never leaves the device. The desktop apps watch outbound connection metadata and process names; they never see prompt content, keystrokes, or screen contents. Only category labels (e.g., "credentials"), a redacted snippet, and connection/process metadata are sent to your dashboard. We never sell data, never train models on customer data, and never use customer data to improve our classifier.

1. Who we are

Northbeams Inc ("Northbeams", "we", "our") provides Northbeams, a SaaS product that helps organizations discover and govern AI tool use by their employees. This policy explains how we handle information when you visit our website (northbeams.com), use our dashboard (monitor.northbeams.com), install our browser extension, or install our desktop apps for Mac or PC.

2. Information we collect

Marketing site (northbeams.com)

IP address & user-agent string - captured by our hosting provider (Vercel) in standard server logs for abuse prevention. Retained for at most 30 days.
Marketing analytics & ad-platform pixels (opt-in) - on the marketing site only, we use Google Analytics 4, the Meta (Facebook) Pixel, the LinkedIn Insight Tag, and the Reddit Pixel to measure marketing reach and audience overlap. Everything runs under Google Consent Mode v2 with all storage defaulted to denied. No cookies are written and no data is sent to any of these platforms until you click "Accept" in the consent banner shown on your first visit. You can change your mind at any time by clearing the nb_consent entry in your browser's site storage, by using your browser's "do not track" setting, or by opting out directly with each platform (Your Online Choices, NAI, Meta opt-out). We enable IP anonymization on GA4. We never run any of these pixels on the dashboard (monitor.northbeams.com) or the browser extension. We do not pass form-field contents (email, name, company) to any of these platforms; we only pass aggregated pixel-fire signals that the visit happened.
Live chat (Crisp, functional) - the Crisp chat widget is loaded only on /pricing, /it-lead, and /cfo so visitors can ask sales and account questions. Crisp sets a session cookie to maintain the conversation across page loads. We treat this as a functional service (the same category as a sign-in cookie) and do not gate it behind the analytics consent banner. No chat content is sent to advertising platforms. If you would rather not engage, simply do not open the widget. See Crisp's privacy notice for details.

Dashboard (monitor.northbeams.com)

Account info - email and display name from your Google account, used for authentication via Firebase Auth.
Workspace info - the workspace name you choose at onboarding, plus an internal workspace identifier.
Cookies / session tokens - required for sign-in. No third-party tracking or advertising cookies.

Browser extension

AI tool visit events - when an employee opens an AI tool URL we recognize (e.g., chat.openai.com), we record the hostname, page title (truncated), tool identifier, and timestamp.
Sensitive content findings - when an employee submits a prompt on a supported AI tool site, the extension's in-browser classifier scans the prompt locally. If it matches one or more sensitive-content categories (credentials, PII, source code, customer data, contracts), we record:
- the category labels matched (e.g., ["credentials","sourceCode"]);
- per-pattern match counts (numbers only);
- a redacted snippet ≤200 characters with detected secrets masked as [REDACTED:type];
- tool, hostname, timestamp, prompt char-count, and the user label configured in the extension's settings.
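For illustration only, the shape of such a finding can be sketched as follows. This is a minimal sketch under stated assumptions: the pattern table, the type names, and the buildFinding helper are hypothetical and are not the extension's actual classifier. It shows the idea that secrets are masked before the snippet is cut to 200 characters, so the snippet can never carry an unmasked match.

```typescript
// Hypothetical pattern table; the real classifier's patterns are not described in this policy.
interface Pattern {
  type: string;
  category: string;
  regex: RegExp;
}

const PATTERNS: Pattern[] = [
  { type: "aws-access-key", category: "credentials", regex: /\bAKIA[0-9A-Z]{16}\b/g },
  { type: "email", category: "pii", regex: /\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b/g },
];

interface Finding {
  categories: string[];
  matchCounts: Record<string, number>; // numbers only, never the matched values
  snippet: string;                     // masked, at most SNIPPET_LIMIT characters
  promptCharCount: number;
}

const SNIPPET_LIMIT = 200;

// Returns null when nothing sensitive matched, so nothing is recorded.
function buildFinding(prompt: string): Finding | null {
  const matchCounts: Record<string, number> = {};
  const categories = new Set<string>();
  let masked = prompt;

  for (const p of PATTERNS) {
    const hits = prompt.match(p.regex);
    if (!hits) continue;
    matchCounts[p.type] = hits.length;
    categories.add(p.category);
    masked = masked.replace(p.regex, `[REDACTED:${p.type}]`);
  }

  if (categories.size === 0) return null;

  return {
    categories: Array.from(categories),
    matchCounts,
    // Truncate only after masking, so a cut can never expose part of a secret.
    snippet: masked.slice(0, SNIPPET_LIMIT),
    promptCharCount: prompt.length,
  };
}
```

The original prompt string is used only inside the function and is not part of the returned record.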
How the extension identifies AI tool pages: the extension maintains a catalogue of AI tool hostnames that is refreshed from our servers every 6 hours. To support this live catalogue without requiring a browser-extension update each time a new AI tool is added, the extension requests permission to access all websites (<all_urls>). The in-page classifier script is injected only on pages whose hostname matches the current catalogue. No scripts are injected, and no data is read, on pages that do not match.
What we do NOT collect: the original prompt text, page DOM, full URL paths or query strings, keystrokes, or any data from pages that are not in our AI tool catalogue. Users can disable sensitive-content classification entirely from the extension's options page.
Image and PDF uploads (on-device OCR, Phase 1): when an employee uploads an image or PDF into a supported AI tool's composer, the bytes are processed entirely in the browser using vendored, SHA-256-pinned libraries (pdf.js for PDFs, Tesseract.js for images). The bytes never leave the device on this path. The extracted text feeds the same in-browser classifier; only the same category labels, redacted snippet, and a SHA-256 hash of the bytes (so admins can correlate findings without seeing the file) reach the dashboard.
Cloud OCR fallback (Phase 2, off by default): when the on-device reader cannot extract text from a file and an admin has explicitly turned the feature on, the bytes are sent in a single in-memory request to AWS Bedrock running Anthropic Haiku for text extraction. AWS Bedrock and Anthropic retain nothing by contract. Northbeams never writes the bytes to disk. The full data flow is documented at monitor.northbeams.com/security/ocr-data-flow. Every cloud OCR call is audit-logged with the SHA-256 hash, size, engine, and text length, but never the bytes themselves.
Heuristic AI-tool candidate reports: if a tab loads an unknown hostname that looks AI-adjacent (.ai top-level domain, chat./ai. subdomain, AI keywords), the extension reports just the bare hostname plus a short set of signal labels to your dashboard for admin review. It never reports the full URL, query string, page title, or DOM. Each hostname is reported at most once per workspace per 24 hours.
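As a rough illustration of what a bare-hostname candidate report looks like, here is a minimal sketch. The keyword list, the signal label names, and both function names are assumptions made for this example; the extension's real heuristics are not specified in this policy.

```typescript
// Hypothetical keyword list and signal labels, for illustration only.
const AI_KEYWORDS = ["gpt", "llm", "copilot"];

function aiSignals(hostname: string): string[] {
  const host = hostname.toLowerCase();
  const labels = host.split(".");
  const signals: string[] = [];

  // Signal 1: the top-level domain is .ai
  if (labels[labels.length - 1] === "ai") signals.push("tld-ai");

  // Signal 2: the leftmost label is chat. or ai. (and is a real subdomain)
  if (labels.length > 2 && (labels[0] === "chat" || labels[0] === "ai")) {
    signals.push("ai-subdomain");
  }

  // Signal 3: an AI keyword appears somewhere in the hostname
  if (AI_KEYWORDS.some((k) => host.indexOf(k) !== -1)) signals.push("ai-keyword");

  return signals;
}

// Builds the report from a tab URL. Only the hostname and signal labels are
// kept; the path, query string, and fragment are discarded here.
function candidateReport(rawUrl: string): { hostname: string; signals: string[] } | null {
  const { hostname } = new URL(rawUrl);
  const signals = aiSignals(hostname);
  return signals.length > 0 ? { hostname, signals } : null;
}
```

The once-per-workspace-per-24-hours limit described above is not modelled in this sketch.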
Extension error reporting: uncaught errors in the extension are POSTed to monitor.northbeams.com/api/extension/error under your workspace's identity and forwarded to Sentry for diagnosis. Reports contain only the error message + stack trace from our own code, the extension version, and a small whitelisted context map. They never contain prompt text, OCR text, classifier match values, or any user content. URL-shaped context fields are query-stripped before transmission.
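To make "whitelisted context map" and "query-stripped" concrete, here is a minimal sketch. The allowed key names and both helper functions are hypothetical, chosen only for this example; the extension's actual allow-list is not published in this policy.

```typescript
// Hypothetical allow-list of context keys; any key not listed is dropped.
const ALLOWED_CONTEXT_KEYS = ["phase", "toolId", "pageUrl"];

// Removes the query string and fragment from http(s) URLs.
// Values that are not http(s) URLs are returned unchanged.
function stripQuery(value: string): string {
  let url: URL;
  try {
    url = new URL(value);
  } catch (_err) {
    return value; // not URL-shaped
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return value;
  return url.origin + url.pathname;
}

// Copies only allow-listed keys and query-strips each kept value.
function sanitizeContext(context: Record<string, string>): Record<string, string> {
  const out: Record<string, string> = {};
  for (const key of ALLOWED_CONTEXT_KEYS) {
    const value = context[key];
    if (typeof value === "string") out[key] = stripQuery(value);
  }
  return out;
}
```

Because the function copies from an allow-list rather than deleting from a deny-list, a field that was never anticipated is excluded by default.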

Desktop apps (Northbeams for Mac and Northbeams for PC)

AI tool process events - when an employee runs a recognized AI desktop app (e.g., Claude Desktop, ChatGPT Desktop, Cursor, Granola) or a recognized AI CLI tool (e.g., Claude Code, Aider) on the same laptop, we record the process name (not the full command line), the matched tool identifier, the user label, and a timestamp.
Current document path (optional, opt-in v0.7.0+) - for a small set of editor-class AI tools that expose it (Cursor, VS Code, Sublime Text, BBEdit, Xcode), the daemon resolves the absolute path of the document the tool has open at the moment of the sighting and stores it alongside the event. We never capture contents. The home-directory segment is collapsed to ~ before storage and the field is capped at 500 characters; longer strings are truncated.
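The home-directory collapse and the 500-character cap can be sketched as below. This is an illustrative sketch only: the function name and its parameters are assumptions, and the daemon's real implementation is not described in this policy.

```typescript
const MAX_PATH_CHARS = 500;

// Collapses the home-directory prefix to "~" and caps the stored length.
// homeDir is passed in explicitly so the sketch has no platform dependency.
function sanitizeDocumentPath(absolutePath: string, homeDir: string): string {
  const home = homeDir.replace(/[\\/]+$/, ""); // ignore a trailing separator
  let result = absolutePath;

  // Only collapse on a whole-segment match, so /Users/alicia is not
  // mistaken for a path under /Users/alice.
  const underHome =
    absolutePath === home ||
    absolutePath.startsWith(home + "/") ||
    absolutePath.startsWith(home + "\\");

  if (home.length > 0 && underHome) {
    result = "~" + absolutePath.slice(home.length);
  }

  return result.slice(0, MAX_PATH_CHARS);
}
```

For example, sanitizeDocumentPath("/Users/alice/src/app.ts", "/Users/alice") yields "~/src/app.ts", so the account name in the home directory is not stored.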
Outbound connection events - when the laptop opens a network connection to a recognized AI service host (matched against a bundled catalogue of AI service signatures), we record the destination hostname, the matched tool identifier, and a timestamp. We never see, store, or transmit the contents of the connection.
DNS lookup events (optional, opt-in v0.7.0+) - power users can point their system resolver at the daemon's local DNS proxy. When enabled, the daemon records the lookup hostname for any name that matches the AI-service catalogue. The proxy modifies nothing upstream and causes no resolution failure if it is paused or crashes (it transparently forwards every query upstream). It is off by default and requires manual configuration. We never see, store, or transmit the response bodies.
Apple Intelligence framework loads (heuristic, macOS 26+) - for stock Apple host apps (Mail, Notes, Safari, Messages, System Settings), the daemon checks whether the AppleIntelligence / GenerativeModels private frameworks are present in the process's module list via public vmmap output. This is a heuristic. The presence of a framework load is a strong signal the app has wired up AI surfaces, but it is not proof the user invoked one in this session. The event is recorded with surface "process" and disposition "framework-loaded:<hostApp>".
Bidirectional command channel (v0.7.0+) - workspace admins can enqueue commands (pause / resume a tool, run a self-test, refresh the catalogue) for paired daemons via the dashboard. The daemon long-polls for pending commands and acks each one with status (ok / failed). Every issued command and every ack is recorded in the audit log. The daemon validates each command kind against a hard-coded allow-list. Arbitrary shell commands are never accepted from the server. Commands expire if unclaimed within 5 minutes (configurable up to 30 minutes); the full record is retained for 60 days for forensic review.
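The allow-list check can be pictured with a short sketch. Everything named here (the kind identifiers, the PendingCommand and Ack shapes, and handleCommand) is hypothetical and chosen for illustration; the daemon's real command format is not documented in this policy, and the expiry check is shown on the daemon side only for simplicity.

```typescript
// Hypothetical identifiers for the four command kinds named above.
const ALLOWED_KINDS = ["pause", "resume", "self-test", "refresh-catalogue"] as const;
type CommandKind = (typeof ALLOWED_KINDS)[number];

interface PendingCommand {
  id: string;
  kind: string; // untrusted until validated
  issuedAtMs: number;
}

interface Ack {
  id: string;
  status: "ok" | "failed";
  reason?: string;
}

const DEFAULT_TTL_MS = 5 * 60 * 1000; // 5 minutes

function isAllowedKind(kind: string): kind is CommandKind {
  return (ALLOWED_KINDS as readonly string[]).indexOf(kind) !== -1;
}

// Validates a command and runs it through a fixed handler. The handler only
// ever receives one of the allow-listed kinds, never a free-form string.
function handleCommand(
  cmd: PendingCommand,
  nowMs: number,
  run: (kind: CommandKind) => void,
  ttlMs: number = DEFAULT_TTL_MS,
): Ack {
  if (!isAllowedKind(cmd.kind)) {
    return { id: cmd.id, status: "failed", reason: "kind not in allow-list" };
  }
  if (nowMs - cmd.issuedAtMs > ttlMs) {
    return { id: cmd.id, status: "failed", reason: "expired" };
  }
  try {
    run(cmd.kind);
    return { id: cmd.id, status: "ok" };
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return { id: cmd.id, status: "failed", reason };
  }
}
```

Because the handler is typed to accept only CommandKind, a string such as a shell command cannot reach it without first failing the allow-list check.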
Self-test events (v0.7.0+) - clicking "Run self-test" in the menu bar writes one synthetic prompt-finding entry to the dashboard with a known-safe placeholder credential and a five-minute Firestore TTL. The document auto-deletes once you have had a chance to see the round-trip complete; no real user-generated data is involved.
Device metadata - operating system family (macOS or Windows), OS major version, and an installation identifier we generate so the dashboard can show this laptop as a Connected surface. No hardware serial numbers, no MAC addresses, no user-account names.
What we do NOT collect from desktop: prompt content, AI tool responses, keystrokes, clipboard contents, screen contents, file contents, full command lines, browsing history, or traffic to non-AI hosts.
Optional TLS interception (opt-in, off by default) - administrators can enable an on-device MITM proxy for two hostnames only: api.anthropic.com and api.openai.com. When on, the daemon installs a machine-local CA, the proxy classifies the request body in memory, and only categorical labels (e.g. credentials, pii, source-code) ever leave the device. Raw bodies are dropped the moment the upstream response returns. The menu bar shows a permanent "MITM:" disclosure in the stats line whenever interception is active. Default installs ship with this off; nothing about the desktop app acts as a network proxy unless the admin opts in.

3. How we use your information

To operate the dashboard, show your team's AI usage, and surface sensitive-content findings.
To prevent abuse (rate limiting, anomaly detection).
To communicate with you about your account, billing, and material product changes.

4. How we do not use your information

We do not sell your data, ever.
We do not train AI models - ours or anyone else's - on customer data.
We do not share data with advertising networks.
We do not run third-party trackers or analytics that profile users inside the dashboard, browser extension, or desktop apps. The marketing site uses opt-in Google Analytics 4 plus the Meta, LinkedIn, and Reddit ad-platform pixels for visit measurement only, all gated behind explicit consent (see Section 2).

5. Where data is stored. Sub-processors.

Customer data is stored in Google Cloud's Firestore via the Firebase platform, hosted in the United States. The dashboard and the marketing site are served by Vercel through its global edge. Northbeams does not currently offer EU-region hosting; contact us if your contract requires it.

For EU customers: transfers to the United States rely on the Standard Contractual Clauses with each sub-processor below, plus supplementary technical measures (TLS in transit, AES-256 at rest, server-side identity stamping so a client cannot forge user identity in the audit trail).

The full sub-processor list, kept current at /sub-processors, is:

Google Cloud (Firestore, Firebase Auth, Cloud Functions). Workspace data and authentication.
Vercel. Hosting for the dashboard and the marketing site.
AWS (Bedrock). Cloud OCR fallback only; off by default per workspace.
Anthropic. Cloud OCR fallback model provider, reached via AWS Bedrock with zero retention.
Stripe. Subscription billing. We never store full card numbers.
Resend. Transactional and marketing email.
Crisp. Customer support chat (loaded only on /pricing, /it-lead, /cfo).
Sentry. Application error reporting. Stack traces and sanitized context only; never prompt text, OCR text, classifier match values, or any user content.
Cloudflare R2. Static-asset hosting for the desktop installer downloads.
Scytale. SOC 2 compliance tooling (read-only access to a subset of organization metadata).

We will email workspace admins at least 30 days before adding a new sub-processor that materially changes how customer data flows.

6. Data retention

The canonical retention schedule is published in docs/data-retention.md. The short version:

Tool-visit and prompt-finding incidents: 13 months from the event timestamp. After that, they are deleted from the Firestore hot store via a TTL policy. Aggregated counters (daily / weekly / monthly rollups, no PII) are kept indefinitely.
Tool candidate records: indefinite while the admin has not yet decided; 24 months after the admin marks them verified or dismissed.
Workspace members: lifetime of subscription plus 90 days. You can request immediate deletion at any time.
Audit log (admin actions, policy changes, extension pairings): 24 months, to meet SOC 2 and EU AI Act record-keeping expectations.
Sentry error reports: 90 days (Sentry's default).
Stripe billing records: 7 years (US federal tax law).
Raw prompt text, raw OCR bytes: never stored, anywhere.

Workspace owners can request earlier deletion of any class of customer data by emailing privacy@northbeams.com. We act within 30 days.

7. Your rights

Depending on where you live (e.g., EU/UK GDPR, California CCPA), you may have rights to access, correct, export, or delete the personal information we hold about you. To exercise these rights, email privacy@northbeams.com. We respond within 30 days.

8. Security

Workspace keys (used by the browser extension to authenticate to our backend) are stored only in your local browser via chrome.storage.local. Desktop install tokens are short-lived, signed, and consumed once at first launch; the desktop app then holds a per-device bearer token in the OS keychain (Keychain on Mac, Credential Manager on PC). All bearer tokens live in our backend's secure Firestore collection (admin-SDK access only). All traffic uses TLS. We use Firebase Auth for sign-in and follow Google's recommended security practices.

9. Changes to this policy

We will email customers and update the "Effective" date at the top of this page if we make material changes. Continued use of Northbeams after the effective date constitutes acceptance of the updated policy.

10. Contact

Privacy questions: privacy@northbeams.com
General contact: hello@northbeams.com
Northbeams Inc, 2261 Market Street STE 76418, San Francisco, CA 94114
