Bran (Brandon) Myers
Public-Interest Investigation · Follow-up · June 2026

Statistics or Operations? What the UK's Own Documents Admit

A follow-up to The Quiet Joining-Up. Built entirely from public government documents. Every claim carries a confidence level; what was not checked is marked as not checked, not as cleared.

I didn't set out to be angry about this. I set out to be wrong — to find the line in a government document that says, plainly, this linked-data machine only ever counts populations; it never reaches down and touches a named person.

I couldn't find that line. What I found instead was the government drawing the line itself — and then, in a separate document, stepping over it.

So this piece asks one narrow question, and refuses to inflate it: did the UK's linked-administrative-data infrastructure stay statistical and research-only, or are there public documents showing it moving toward operational, near-real-time decisions about identifiable individuals? Not "is it illegal." Just: which way is it moving, and who says so.

The honest answer is two-track — and the split is in their own words.

The verdict in one paragraph

The named research programme (MoJ Data First, funded by ADR UK, accessed only inside the ONS Secure Research Service) really is de-identified, accredited-researcher-only, and retrospective. The Ministry of Justice's own Algorithmic Transparency Record says so in so many words. But the same linkage tool, Splink, appears in a different transparency record running operationally on identifiable people: in courts to pull a defendant's probation record for sentencing, in "both batch and real-time deployments," in a piloted real-time Core Person Record, and in a daily feed that sends probation supervisees' police numbers to the police. A third, separate strand (DWP/HMRC fraud and the new Eligibility Verification powers) is unambiguously operational and person-level. The research infrastructure held the line it drew. The wider machinery, built on the same tooling, has stepped past it.

Track 1 — the research line (research-only, as labelled)

Data First makes person- and case-level justice records — magistrates' and Crown courts, custody, probation, offender assessments, family and civil courts — available de-identified, to vetted researchers only, inside controlled "safe settings." Its stated purpose is to spot trends and inform policy. The MoJ states it directly:

The Data First datasets are not integrated into a decision-making process… not for operational decision-making.

Source: MoJ Algorithmic Transparency Record, moj-data-first-splink (17 Dec 2024). HIGH confidence.

Take that at face value. For the Data First datasets, I found nothing contradicting it. That part of the system did what it said.

Track 2 — the operational line (the same tool, identifiable people)

This is the finding. Splink is not only a research instrument. A second transparency record — the Splink “Master Record” — describes the same tool in live justice operations:

It is used in courts to find probation records associated with individuals coming to court… It is being piloted as part of Core Person Record, a product that aims to create a unique identifier for persons across prisons, probation and the criminal courts… It is used to find Police National Computer (PNC) numbers associated with individuals, in order to request relevant arrest information from the police.

Source: MoJ Algorithmic Transparency Record, moj-splink-master-record (6 Oct 2025). HIGH confidence. The GDS overview confirms the same tool runs “in both batch and real-time deployments.”

Read it again. In courts. For individuals. In real time. With a daily line to the police. That is not counting a population. That is resolving a specific person and acting on them.

A separate strand makes the operational turn even plainer. Under the Public Authorities (Fraud, Error and Recovery) Act 2025, banks will be required to match benefit-recipient accounts against DWP eligibility indicators and hand back identifiable details:

…match these accounts to specific eligibility indicators… specified details about the account holder(s) (such as their name(s) and date(s) of birth)… considered by DWP for further inquiry.

Source: DWP Eligibility Verification factsheet, and the DWP/HMRC fraud Data Usage Agreement — “a risking API will generate a risk score for an attribute bank account” (both 22 Jul 2025). HIGH confidence. DWP states a human is always involved; implementation from 2026. This is a SEPARATE programme from Data First.

What I will not claim (the part that keeps this honest)

I do not claim the Data First research datasets are used operationally. The MoJ denies it and I found no document contradicting that. “Splink-the-tool is operational” is true; “Data First datasets are operational” is not something the evidence supports. Conflating the two is the mistake that would discredit everything else.

NHS, Home Office immigration enforcement, and the ONS Integrated Data Service — I went back and checked. The verdicts are in the follow-up at the end of this piece: NHS came back mixed (care-direction operational, not enforcement), the Home Office operational but historical (withdrawn 2018), and ONS IDS still unconfirmed either way. A gap is not an acquittal — so rather than leave it a gap, I went and got the documents.

No evidence of fully automated decisions with no human. Every DWP document I read asserts a human in the loop. I take that as stated, and note it is the agencies' own description.

The Core Person Record and the daily police feed are described as pilots, not confirmed national rollout. The trajectory is the story; the full deployment is an open question.

The shape of it, on a timeline

Jul 2021 — Splink published as MoJ open-source, framed as research/insight.

Sep 2022 — Splink-linked datasets are shared with accredited researchers via ADR UK Data First.

Dec 2024 — Data First transparency record: “not for operational decision-making.”

Jul 2025 — DWP/HMRC fraud agreement + Eligibility Verification factsheet: risk scoring, bank-account matching.

Oct 2025 — Splink “Master Record” transparency record: courts, real-time, Core Person Record pilot, daily PNC feed to police.

Dec 2025 — Fraud, Error & Recovery Act receives Royal Assent; verification powers live from 2026.

The research framing was published in 2024. The operational record was published in 2025. Same tool. Under ten months apart.

Why it matters

A statistical system that counts a population and an operational system that identifies a person and pulls a police record are not the same thing wearing two hats. They carry different duties — of notification, of redress, of the right to object. The public conversation, the safe-settings, the Five Safes vocabulary, the entire reassurance apparatus, was built for the first. The documents now describe the second. Nobody appears to have re-asked the consent question on the way across.

That's the whole point. Not outrage — the record. Their record.

↗ Full repository, evidence & supporting investigations (GitHub)

Verification: 5 search angles → 19 primary sources fetched → 87 candidate claims → 25 adversarially verified (2-of-3 refutation vote required to kill a claim) → 24 confirmed, 1 killed → synthesized. The killed claim — “Splink is not used operationally” — is exactly what splits this into two tracks: it is true of the Data First datasets, false of Splink the tool. Public sources only; no non-public system was accessed.
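
The kill rule is simple enough to sketch. What follows is a minimal illustration in Python of a 2-of-3 refutation vote, not the actual tooling behind this piece: the function name, the data layout and the sample votes are all assumptions made for the example.

```python
from typing import Dict, List

KILL_THRESHOLD = 2  # refutation votes needed to kill a claim (2 of 3)


def adjudicate(votes: Dict[str, List[bool]]) -> Dict[str, str]:
    """Map each claim to 'killed' or 'confirmed'.

    votes[claim] holds three booleans, one per independent check.
    True means that check refuted the claim.
    """
    verdicts = {}
    for claim, refutations in votes.items():
        if len(refutations) != 3:
            raise ValueError(f"expected 3 votes for {claim!r}")
        refuted = sum(refutations) >= KILL_THRESHOLD
        verdicts[claim] = "killed" if refuted else "confirmed"
    return verdicts


# Hypothetical votes with placeholder claim names, for illustration only.
sample_votes = {
    "claim-a": [False, False, False],  # no check refutes it
    "claim-b": [True, False, False],   # one dissent is not enough to kill
    "claim-c": [True, True, False],    # two refutations kill the claim
}
verdicts = adjudicate(sample_votes)
```

The design choice worth noticing is the asymmetry: a single refutation does not kill a claim, so one bad check cannot sink a true statement, but a claim that two of three checks reject does not survive.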

Follow-up: the three I hadn't checked

When I first published this, I flagged NHS, Home Office immigration enforcement, and the ONS Integrated Data Service as unsearched — not cleared. That bothered me, so I ran a second pass under the same rule: only their own documents, only claims that survive an adversarial check. Here is what came back.

NHS — verdict: MIXED / two-track

NHS England runs the same split the Ministry of Justice does. On the research side, its Data Science team is “implementing a probabilistic linkage model using Splink, in order to improve linkage outcomes, and by extension, patient outcomes” — and NHS staff published the method themselves (IJPDS art. 3271, Aug 2025), confirming an in-house Splink build that links health data to the Personal Demographics Service as a national linkage spine.

But the operational side is explicit. The Federated Data Platform, built on Palantir Foundry, in NHS England's own words “gives frontline staff quick access to patient information… view the most current patient data, manage waiting lists, schedule operations… it will not be used for external research.” It carries live triage tools that prioritise “those with the most urgent needs.”

Sources: NHS England FDP FAQs; NHS England Data Linkage Hub; IJPDS art. 3271. HIGH confidence.

The fair reading matters here. NHS operational linkage is pointed at care — shorter waits, fewer duplicate records, faster triage — not enforcement. That is a different thing from a daily feed to the police, and I won't blur the two. The honest finding is narrower: the same probabilistic-linkage machinery now sits on both a research track and a real-time, identifiable, frontline track — and patients can't opt out of the operational one, because NHS England classes it as direct care.
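
For readers who want to see what "probabilistic linkage" means mechanically: Splink's documentation describes it as an implementation of the Fellegi-Sunter model, in which each compared field adds a match weight of log2(m/u) to the prior odds that two records are the same person. The sketch below is plain Python, not Splink's API and not a description of any deployed system. The field names, the m and u probabilities, the prior, the threshold and the records are all invented for illustration.

```python
import math

# Hypothetical parameters, invented for this example.
# m = P(field agrees | same person), u = P(field agrees | different people).
PARAMS = {
    "surname": {"m": 0.95, "u": 0.01},
    "date_of_birth": {"m": 0.98, "u": 0.003},
    "postcode": {"m": 0.80, "u": 0.001},
}
PRIOR = 1e-4      # assumed chance that a random pair of records is one person
THRESHOLD = 0.95  # assumed cut-off for accepting a link


def match_probability(a: dict, b: dict) -> float:
    """Start from the prior odds, add one log2 match weight per field."""
    log2_odds = math.log2(PRIOR / (1 - PRIOR))
    for field, p in PARAMS.items():
        if a[field] == b[field]:
            log2_odds += math.log2(p["m"] / p["u"])
        else:
            log2_odds += math.log2((1 - p["m"]) / (1 - p["u"]))
    odds = 2 ** log2_odds
    return odds / (1 + odds)


def count_links(pairs) -> int:
    """Statistical use: a number comes out, no individual is returned."""
    return sum(match_probability(a, b) >= THRESHOLD for a, b in pairs)


def find_person(query: dict, records: list):
    """Operational use: the best-matching named record comes out, or None."""
    if not records:
        return None
    best = max(records, key=lambda r: match_probability(query, r))
    return best if match_probability(query, best) >= THRESHOLD else None


# Made-up records, not real people.
records = [
    {"id": 1, "surname": "SMITH", "date_of_birth": "1980-01-01", "postcode": "AB1 2CD"},
    {"id": 2, "surname": "SMITH", "date_of_birth": "1975-05-05", "postcode": "ZZ9 9ZZ"},
]
query = {"surname": "SMITH", "date_of_birth": "1980-01-01", "postcode": "AB1 2CD"}

linked = count_links([(query, r) for r in records])
person = find_person(query, records)
```

The scoring function is identical in both uses. What changes is what leaves the function: a count, or one person's record.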

Home Office — verdict: OPERATIONAL, but historical

This is the one to state carefully, because the strongest version of it is also the most out of date. A 2017 Memorandum of Understanding between NHS Digital and the Home Office “allowed the Home Office to access patient data – including non-clinical information – for the purpose of tracing immigration offenders”; it was to be used, in the agreement's own words, as “a last resort… where the last known address has proven useless.” The volume figures on record are for 2016, the year before the MoU: 8,127 requests, 5,854 people traced.

Source: UK Parliament Health & Social Care Committee written evidence MOU0009. The MoU was suspended in May 2018 and withdrawn on 9 November 2018. HIGH confidence — and it is no longer live.

So: confirmed that NHS patient data was once used operationally to locate identifiable individuals for immigration enforcement, and confirmed that the formal route for it was shut down in 2018. The live concern is now capability, not a known pipeline — Palantir's Foundry (the FDP base) is, per Palantir's own documentation, interoperable with its enforcement-oriented Gotham platform, with “‘drag and drop’ of data being possible between the two systems.” That is a real technical fact. It is not evidence that any NHS data has actually moved that way, and I'm marking it as exactly that: a capability and an open question, not a finding.

ONS Integrated Data Service — verdict: insufficient evidence

I'll be blunt: this pass confirmed nothing about IDS, in either direction. Not one verified claim landed on it. So I won't tell you it's a clean research-only safe-setting, and I won't tell you it's drifting operational — I don't have the documents yet. It stays open. An empty box is less satisfying than a verdict, but inventing one would be the exact dishonesty this whole piece is supposed to refuse.

Second pass: 5 search angles → 21 primary sources fetched → 91 candidate claims → 25 adversarially verified → 24 confirmed, 1 killed. The killed claim was an overstatement on my side: that nationally-shared FDP data stays identifiable. It doesn't: identifiable processing happens at trust level, and the data is de-identified when shared upward. Correcting my own overreach is part of the method, not an exception to it. Public sources only; no non-public system was accessed.
