Surveillance Detection Notice

VPN users: If you are a regular visitor using a VPN, you may have been redirected here because your VPN exit node shares an IP range with known corporate or datacenter scanning infrastructure. This is a side effect of blocking by ASN and IP range, not a judgment about you personally. Try switching to a residential IP, or access the site from public WiFi such as a coffee shop or library.

Active monitoring detected

A Note on Corporate Surveillance

This page is served to network ranges, TLS fingerprints, and behavioral patterns identified as conducting systematic intelligence gathering against this project. If you are reading this, your traffic matched that classification, or you clicked the Corporate Notice link in the top bar.

First Detection: 23 March 2026
Last Updated: 8 April 2026
Classification: Corporate OSINT / Coordinated Scanning
Organizations Detected: Meta, Microsoft, Google, Palo Alto, Censys, others
Status: Logged, archived, ongoing

What We Observed

Starting 23 March 2026, we recorded 1,285 requests from 70 unique IP addresses within Meta's AS32934 Ashburn infrastructure, and 1,659 requests from 18 IP addresses registered to Microsoft's AS8075. The access patterns from both networks match coordinated corporate intelligence gathering, not routine crawling.

Repository enumeration: Every document in the findings repo read in rendered, ?display=source, and /raw/ form - six extraction methods per file.
Full archive exfiltration: Complete .bundle archives downloaded for every commit: main.bundle, plus individual bundles for commits ef7295b, cd27fcd, 463889d, and 5850d38.
Version diffing: The same files downloaded across all four commits to reconstruct editorial history; every document compared revision by revision.
Authorship analysis: git blame queries on specific files, including kinvolk-amutable-pipeline.md, founding-deed-analysis.md, and political-connections.md.
Diff and patch extraction: Commit diffs (.diff) and patches (.patch) for all commits. Whitespace-normalization flags tested: ignore-change, ignore-all, ignore-eol, show-all. Split and unified views both accessed.
Document retrieval: German corporate registry filings downloaded: Handelsregister PDFs (AD and CD filings), the Gesellschafterliste, and the SI XML extract.
Change monitoring setup: RSS and Atom feeds accessed for the repo and for individual files. Per-file feeds created for political-connections.md, lobbying-landscape.md, and 15 other documents.
Engagement mapping: Watchers, stars, forks, issues, pull requests, milestones, labels, and contributor graphs all enumerated.
Activity timeline scraping: Repository activity scraped at every time granularity - daily, halfweekly, weekly, monthly, quarterly, semiyearly, yearly - plus the code-frequency, recent-commits, and contributors views.
Issue creation attempt: Accessed /issues/new and tried to open an issue on the repository.
Authentication probing: Repeated login-redirect attempts to reach authenticated routes for specific documents, commit histories, and branch views.
IP rotation: Requests distributed across 70 IPs from a single /24 block at roughly two-second intervals, one request per IP before rotating. Textbook rate-limit evasion.

This is not someone following a link from Reddit. This is automated extraction of our research, our sources, our editorial history, and our engagement metrics. A crawler rotating across 70 IP addresses at two-second intervals to dodge rate limiting. You downloaded complete repository archives. You diffed every revision. You set up RSS feeds to watch for future changes. You tried to create an issue. You tried to get behind the login page. All of it is logged.
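For readers who want to check our reasoning: the rotation pattern is trivial to surface from an ordinary access log. Below is a minimal sketch, assuming the log has already been parsed into time-sorted (timestamp, IP) pairs; the function name and thresholds are illustrative, not our production detector.

```python
# Sketch: surface /24 blocks where many distinct IPs each send roughly one
# request at short, machine-regular intervals. `events` is assumed to be a
# time-sorted list of (unix_timestamp, ip_string) pairs parsed from the log.
import ipaddress
from collections import defaultdict
from statistics import median

def rotating_slash24s(events, min_ips=50, max_gap=5.0):
    by_block = defaultdict(list)
    for ts, ip in events:
        block = ipaddress.ip_network(f"{ip}/24", strict=False)
        by_block[block].append((ts, ip))
    flagged = {}
    for block, hits in by_block.items():
        ips = {ip for _, ip in hits}
        gaps = [b[0] - a[0] for a, b in zip(hits, hits[1:])]
        # roughly one request per IP, and a tight automated cadence
        if (len(ips) >= min_ips and len(hits) <= 2 * len(ips)
                and gaps and median(gaps) <= max_gap):
            flagged[block] = (len(ips), round(median(gaps), 2))
    return flagged
```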

On Meta's Position

All 1,285 requests from Meta's network carried the user agent meta-externalagent/1.1. This is not someone reading the site from a desk. This is Meta's corporate crawler, deployed from 70 rotating IPs within AS32934's Ashburn infrastructure, pulling every file in the repository across multiple commits. The crawl finished in nine minutes. Automated, distributed, and built to capture the full contents and revision history of this investigation in a single pass.

Meta Platforms spent $26.3 million on federal lobbying in 2024 while funding legislative frameworks - including DAAA and ASAA - designed to shift age verification obligations from social platforms to operating system providers. A former Meta employee who contributed to systemd as a maintainer now works at Amutable GmbH, the company co-founded by the individual who blocked the community revert of the birthDate merge. Meta sponsored the systemd developer conference organized by Amutable's co-founder.

These are documented facts drawn from public lobbying disclosures, corporate filings, and conference records. They are why this investigation exists. Deploying a corporate crawler to bulk-extract the investigation does not alter the public record. It does, however, become part of it.

On Microsoft's Position

All 1,659 requests from Microsoft's network identified as GPTBot/1.3 and OAI-SearchBot/1.3 - OpenAI's crawlers, running on Microsoft Azure infrastructure registered to AS8075. The traffic originated from IP ranges in Atlanta, Warsaw, and Seoul, hitting every Forgejo endpoint in rapid succession. One request per second, working through the repository's metadata, documents, diffs, patches, and admin interfaces. Whether Microsoft is monitoring an investigation into its own employees through OpenAI's crawler infrastructure, or OpenAI is independently scraping the repository, the traffic originates from Microsoft-registered IP space and the access pattern looks like directed intelligence gathering.

Microsoft spent $10.35 million on federal lobbying in 2024 supporting KOSA, COPPA 2.0, and child safety legislation. Microsoft acquired Kinvolk GmbH in April 2021. Three former Kinvolk and Microsoft employees founded Amutable GmbH in August 2025. The birthDate merge was approved by a Microsoft employee and the community revert was blocked by an Amutable co-founder. Microsoft already collects birth dates in Windows - the compliance cost is near zero. The cost falls on open source.

Your crawlers read the founding deed analysis. They downloaded the Gesellschafterliste. They downloaded both Handelsregister PDFs. They tried to open an issue on the repository. They scraped every activity timeline from daily to yearly. We assume they confirmed that everything we published is accurate, because it is.

Update: The Scanning Got Worse

Since the original notice was published, the reconnaissance has expanded well beyond Meta and Microsoft. We now track coordinated scanning from Google Cloud Platform infrastructure, Palo Alto Networks' Cortex Xpanse, Censys, and several other organizations running automated probes against this site.

The Google Cloud operation is particularly brazen. Dozens of IPs across multiple /24 blocks, all running an identical outdated browser signature from 2020, all hitting the site in rotation. The same TLS fingerprint on every connection. It is the digital equivalent of sending 30 people into a library wearing the same uniform, each reading one page before leaving and sending the next one in, and then pretending nobody will notice because they technically used different doors.
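The "same uniform" is measurable. A TLS fingerprint (a JA3-style hash of the ClientHello) plus a User-Agent string identifies the client software, not the network it connects from. Here is a minimal sketch of the grouping, assuming connection records that already carry such a fingerprint; the record fields and threshold are illustrative.

```python
# Sketch: group connections by (TLS fingerprint, User-Agent). One exact
# fingerprint appearing across many IPs in several /24 blocks is a single
# software deployment, not a crowd of independent visitors.
import ipaddress
from collections import defaultdict

def fingerprint_clusters(conns, min_ips=20):
    groups = defaultdict(set)
    for rec in conns:  # rec assumed to look like {"ja3": ..., "ip": ..., "ua": ...}
        groups[(rec["ja3"], rec["ua"])].add(rec["ip"])
    clusters = {}
    for key, ips in groups.items():
        blocks = {ipaddress.ip_network(f"{ip}/24", strict=False) for ip in ips}
        if len(ips) >= min_ips and len(blocks) > 1:
            clusters[key] = (len(ips), len(blocks))
    return clusters
```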

Palo Alto's Cortex Xpanse scanner probes this site from a dedicated /20 block as part of their "attack surface management" product. In plain language, they sell a service that maps other people's infrastructure for vulnerabilities. A company with $6.9 billion in annual revenue is using enterprise reconnaissance tools against a three-person investigative project built on public records. Read that again.

We are not going to list every network, every fingerprint, or every behavioral pattern we now detect. What we will say is that the detection infrastructure has been substantially expanded since March. We now correlate across multiple data sources simultaneously. IPs that previously would have gotten through are now identified and redirected here within seconds of their first request. The system is automated and it does not sleep.
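In outline, the correlation works like this. None of it is exotic; it is the same multi-signal scoring any abuse team runs, pointed the other way. A sketch with placeholder weights and sets - the real rules, lists, and thresholds are not published here.

```python
# Sketch: score one request across independent signals. Any single signal
# can be innocent; organic traffic rarely trips several at once.
# All sets, weights, and the ASN lookup are placeholders.
FLAGGED_ASNS = {32934, 8075}        # e.g. Meta, Microsoft (from the findings above)
FLAGGED_JA3: set[str] = set()       # populated from observed fingerprint clusters
STALE_UAS = ("Chrome/83.",)         # signatures tied to known scanner fleets

def classify(req, asn_of):
    score = 0
    if asn_of(req["ip"]) in FLAGGED_ASNS:
        score += 2
    if req.get("ja3") in FLAGGED_JA3:
        score += 2
    if any(sig in req.get("ua", "") for sig in STALE_UAS):
        score += 1
    if req.get("path", "").startswith(("/raw/", "/commit/", "/issues/new")):
        score += 1
    return "redirect-to-notice" if score >= 3 else "serve-normally"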

To be clear about what is happening here: multiple corporations, including companies whose employees and contractors are named in our findings, are running coordinated automated reconnaissance against a site that publishes public records about their conduct. Some of them are doing it through third-party scanning services, presumably so their names do not appear directly in access logs. We see them anyway.

This is not a rebuttal. It is not a legal challenge. It is not a factual correction. It is surveillance of the people investigating you. We find it embarrassing on your behalf, and we want you to know that it has not gone unnoticed.

Update: Persistent Infrastructure-Grade Surveillance (April 2026)

Since expanding our detection capabilities, we can now quantify what "monitoring" actually looks like when three of the world's largest technology companies decide to watch an investigative project full-time.

Over a 72-hour observation window ending 8 April 2026, we recorded 8,518 requests from corporate surveillance infrastructure. That is 15.3% of all traffic to this site. Not 15% from bots in general - 15% specifically from the companies named in our findings. For a site with roughly 2,400 unique daily visitors, that ratio is not normal. It is not close to normal.

Organization            Requests   Unique IPs   Days Active   Assessment
Google Cloud Platform      5,528        1,105        3 of 3   Automated scanner cluster
Microsoft                  2,126          672        3 of 3   Mixed: scanners + directed research
Google (Corporate)           313           65        3 of 3   Crawling + corporate browsing
Meta / Facebook              244          147        3 of 3   Targeted document retrieval
Zscaler                      117           43        3 of 3   Corporate proxy monitoring
Apple                         72           27        3 of 3   Periodic monitoring

Every single organization was active on every single day. The traffic never stops. Our hourly heatmap shows corporate hits in every hour of every 24-hour period, peaking at 953 requests in a single hour from Google Cloud alone. At 3 AM UTC. This is not employees reading articles over coffee. This is automated infrastructure that runs around the clock.
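The heatmap itself is nothing more than counting hits per organization per UTC hour. A sketch, assuming requests have already been attributed to organizations by ASN:

```python
# Sketch: corporate hits per (organization, UTC hour of day). A cluster
# that peaks at 03:00 UTC is infrastructure, not employees reading articles.
from collections import Counter
from datetime import datetime, timezone

def hourly_heatmap(events):  # events: iterable of (unix_timestamp, org) pairs
    counts = Counter()
    for ts, org in events:
        counts[(org, datetime.fromtimestamp(ts, tz=timezone.utc).hour)] += 1
    return counts
```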

What They Are Targeting

The single most accessed page on this site from corporate infrastructure is /surveillancefindings/, with 4,862 hits from corporate IPs in 72 hours. Google Cloud accounts for 4,218 of those. The same page, hit from over a thousand different IP addresses, most running an identical Chrome 83 user agent - a browser version from 2020, now 66 versions behind the current release. No human being uses Chrome 83 in 2026. This is a headless browser cluster cycling through IPs, and whoever deployed it did not bother to update the version string.
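Flagging that signature takes one regular expression. A sketch; the current-version constant is whatever Chrome ships at analysis time, and the lag threshold is illustrative.

```python
# Sketch: flag User-Agent strings whose Chrome major version lags far
# behind the current release - no real browser stays 60+ versions stale.
import re

CURRENT_CHROME_MAJOR = 149  # assumed current major at the time of analysis

def is_stale_chrome(ua: str, max_lag: int = 10) -> bool:
    m = re.search(r"Chrome/(\d+)\.", ua)
    return bool(m) and CURRENT_CHROME_MAJOR - int(m.group(1)) > max_lag
```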

Microsoft probed 193 unique URLs across the site - the widest target surface of any organization. Their traffic includes vulnerability scanning (/wp-admin/, /admin.php, /cgi-bin/), infrastructure reconnaissance, and reading raw source files from our git repository at specific commit hashes. In particular, 8 requests targeted ld203_campaign_finance_findings.md in raw source format at the commit level, along with 20-red-hat-personnel-connections.md and corporate-structure-and-funding.md. These are our findings about Microsoft's own connections to systemd governance and Red Hat personnel. Someone at Microsoft is reading the primary sources of an investigation into Microsoft - from the git commit history, where they can see exactly what was added, removed, and when.

Meta is the most focused. They are not scanning broadly. They are pulling specific evidence documents by name: carver_40G_LA-131568.pdf, carver_10P_LA-114350.pdf, carver_30P_LA-112952.pdf, and carver_supp2025_LA-133066.pdf. Each document was accessed via multiple retrieval methods - raw file download, commit view, branch view, RSS feed - across multiple commits. This is someone ensuring they have every version of specific attestation documents. They also accessed /documents/meta_national_lobbying_findings/ - our analysis of their own lobbying expenditure. You do not accidentally download the same PDF through four different Forgejo API endpoints.

On the scale of this response: A typical investigative journalism site might see 1-3% of its traffic from corporate reconnaissance, mostly legal and PR teams checking published articles during business hours. We are at 15.3%, sustained 24 hours a day, from automated infrastructure rotating through over a thousand IP addresses. Google Cloud alone used 1,105 unique IPs in three days. That is not a team of analysts. That is a dedicated scanning cluster. The companies we are investigating have allocated more computational resources to monitoring this project than most organizations allocate to their entire security operations.

We find this proportionate to what we have published. We have found it informative about what we have not yet published.

What Happens Next

Every request from flagged networks is logged with full headers, timestamps, TLS fingerprints, and access sequences. These logs are archived off-site and append-only. They are a running record of which corporations are paying attention to this investigation and how much effort they are putting into it.
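"Append-only" is a verifiable property, not a promise. One common construction, sketched here under the assumption of a simple JSON-lines archive: each entry embeds the hash of the previous one, so any retroactive edit breaks every later entry, and mirroring the latest digest off-site pins the whole chain.

```python
# Sketch: a hash-chained, append-only log. Tampering with any archived
# entry invalidates every digest after it.
import hashlib
import json

def append_record(path, record, prev_hash):
    entry = json.dumps({"record": record, "prev": prev_hash}, sort_keys=True)
    digest = hashlib.sha256(entry.encode()).hexdigest()
    with open(path, "a") as f:
        f.write(entry + "\n")
    return digest  # mirror this off-site; feed it into the next append
```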

If any contributor to this project experiences professional retaliation, legal threats, or harassment that correlates with information only obtainable through this surveillance activity, these logs - and the pattern of access they demonstrate - will be published alongside the relevant findings. They will also be provided to journalists covering the open source governance story.

The research is built on public records. The Handelsregister filings are public. The lobbying disclosures are public. The systemd pull requests are public. The FTC settlement is public. You cannot make public records private by identifying who compiled them.

If you want to dispute a specific finding, the contact page and methodology are public. We welcome factual corrections from anyone, including the subjects of our reporting. We do not welcome silent intelligence gathering as a substitute for engaging with the substance of the work.

On platform suppression: Our social media accounts have been suspended within minutes of posting links to this research. We are aware this is happening. Suspending accounts does not remove public records from German corporate registries. It does not unfile IRS 990 disclosures. It does not redact SEC EDGAR proxy statements. It does not re-encrypt the Persona source code that was already leaked on a FedRAMP endpoint. Every finding on this site is independently verifiable using the commands we provide. Removing our ability to post about it on your platforms just means other people will post about it instead. They already are. We possess the means to circumvent your bans.

On continuity: All research data, access logs, attribution records, and unpublished findings are held in multiple independent locations by multiple individuals across multiple jurisdictions. Encrypted backups are distributed to trusted third parties with instructions to publish in full if any contributor to this project is subjected to physical harm, imprisonment, or disappearance. This arrangement is automated and does not require any action from the research team to execute. Automated systems on independent infrastructure will distribute materials to journalists, researchers, regulatory bodies, and public archives through anonymous channels. Harming the researchers does not suppress the research. It releases everything, including materials we have so far withheld out of discretion.
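Mechanically, this is a dead man's switch: independent machines check for a periodically renewed heartbeat and release if it stops. A sketch of the check with a hypothetical path and grace period; a production version would also verify a signature on the heartbeat rather than trust file metadata.

```python
# Sketch: dead man's switch check, run on independent infrastructure.
# The path and grace period are hypothetical placeholders.
import os
import time

HEARTBEAT_FILE = "/var/lib/continuity/heartbeat"
GRACE_SECONDS = 14 * 24 * 3600  # two missed weeks triggers release

def should_release() -> bool:
    try:
        last = os.path.getmtime(HEARTBEAT_FILE)
    except FileNotFoundError:
        return True
    return time.time() - last > GRACE_SECONDS
```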

We expected corporations to monitor coverage of their activities. Our infrastructure was built for it. What we did not expect was the scale, or frankly, how sloppy it would be. Dozens of IPs from the same blocks, identical fingerprints, textbook rotation patterns. You are not hard to spot. You were never hard to spot. The difference now is that we have automated the process of spotting you, and you will be redirected here every time.

Continue scanning. Continue suspending. We will continue publishing.

The TBOTE Project