The Daily London

London news, every day


London's Duplicate Image Problem: The Numbers That Reveal a Hidden Crisis in the Capital's Digital Records

Councils, NHS trusts and housing authorities across London are sitting on millions of redundant image files — and the cost of doing nothing is mounting fast.


By London News Desk · Published 5 July 2026, 4:58 am


Updated 5 July 2026, 12:50 pm

How we reported this

This article was generated by AI from the linked public sources. The Daily London is independently owned and covers London news free from advertiser or sponsor influence.

Photo: Szymon Shields on Pexels

More than 340 million duplicate image files are estimated to sit across Greater London's public sector digital infrastructure, clogging NHS trust servers, council planning portals and Transport for London's asset management systems. The combined storage cost is approaching £47 million a year, according to figures compiled by the London Office of Technology and Innovation and published in May 2026.

The timing matters. Keir Starmer's government has staked considerable political capital on digitising public services faster than any previous administration, and Sadiq Khan's City Hall is midway through a £280 million digital transformation programme meant to cut planning application backlogs and speed up housing approvals. Bloated, duplicated image archives directly slow those systems down — and the bill is landing on Londoners.

The problem is not abstract. At Newham Council, which processes some of the highest volumes of planning submissions in east London, internal audits carried out in March 2026 found that roughly 61 per cent of images stored in the building regulations portal were exact or near-exact duplicates — the same site photographs submitted multiple times by applicants across different application stages. At Guy's and St Thomas' NHS Foundation Trust on the South Bank, a separate data review identified over 2.4 million redundant medical imaging thumbnails consuming 18 terabytes of storage that could otherwise be used for active patient records.

What the Data Actually Shows

Storage is the obvious cost. One terabyte of enterprise-grade NHS server capacity runs at approximately £2,200 per year once maintenance, security compliance under the Data Security and Protection Toolkit, and backup redundancy requirements are factored in. Spread across the 47 NHS trusts operating within the M25 boundary, the figure grows quickly. The London Office of Technology and Innovation's May report estimated that automated duplicate detection and removal across just the top 12 borough councils and five major trusts could release 890 terabytes of capacity. That would amount to a saving of roughly £1.96 million annually from storage costs alone.
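The storage figure can be checked with a few lines of arithmetic. The sketch below uses only the two numbers quoted above (£2,200 per terabyte per year and 890 terabytes released); the function and constant names are illustrative and do not come from the report.

```python
# Figures quoted in the article.
COST_PER_TB_PER_YEAR_GBP = 2_200
RELEASABLE_CAPACITY_TB = 890


def annual_storage_saving(capacity_tb: float, cost_per_tb: float) -> float:
    """Yearly saving in pounds from releasing `capacity_tb` of storage."""
    return capacity_tb * cost_per_tb


saving = annual_storage_saving(RELEASABLE_CAPACITY_TB, COST_PER_TB_PER_YEAR_GBP)
print(f"£{saving:,.0f} per year")  # £1,958,000 per year
```

The result, £1,958,000, rounds to the £1.96 million cited.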

Staff time is the less visible drain. Data teams at Tower Hamlets Council, which runs its planning image archive out of offices on Mulberry Place in Poplar, told a Local Government Association working group in April that manual deduplication work consumed an average of 14 staff-hours per week. At a mid-grade local government salary band, that translates to around £19,000 a year in labour costs for a single mid-sized borough — before any productivity losses from slower database queries are counted.
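As a rough check on the labour figure, the sketch below works backwards from the two quoted numbers (14 staff-hours a week, around £19,000 a year) to the hourly cost they imply. The 52-week year is an assumption made here for illustration; the article does not state the hourly rate or the number of working weeks used.

```python
WEEKS_PER_YEAR = 52  # assumption: deduplication work continues all year


def annual_hours(hours_per_week: float, weeks: int = WEEKS_PER_YEAR) -> float:
    """Total staff-hours per year spent on manual deduplication."""
    return hours_per_week * weeks


def implied_hourly_cost(annual_cost_gbp: float, hours_per_week: float) -> float:
    """Hourly cost implied by an annual labour figure."""
    return annual_cost_gbp / annual_hours(hours_per_week)


print(annual_hours(14))                             # 728
print(f"£{implied_hourly_cost(19_000, 14):.2f}")    # £26.10
```

Under that assumption, the quoted figures imply about £26 per staff-hour.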

The Metropolitan Police Service is dealing with a variant of the same issue. Its Directorate of Information holds body-worn camera footage alongside still-image evidence files at the Empress State Building in Earls Court. A 2025 internal efficiency review, referenced in the force's 2025-26 annual report, found duplicate image ingestion had increased storage demand by 23 per cent above projections in a single financial year, partly because of outdated triage software that failed to flag near-duplicate frames.

What Happens Next — and What Can Be Done

Automated deduplication software has matured considerably. Techniques such as perceptual hashing, which detects visually similar images even when file names or metadata differ, can now process a million images in under four hours on standard server hardware. The London Councils consortium, which coordinates shared services for the capital's 33 local authorities, is piloting a joint procurement exercise expected to conclude by September 2026. It aims to bring a single deduplication platform to at least 18 boroughs under a shared licensing deal that should cut per-borough costs by approximately 40 per cent compared with individual contracts.
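For readers curious how perceptual hashing works, here is a minimal sketch of one simple variant, the average hash. It is not the method used by any platform named in this article, which the sources do not specify. The sketch assumes each image has already been shrunk to a small grayscale grid (a real pipeline would do that with an image library), and the distance threshold of 5 bits is an illustrative choice.

```python
from typing import Sequence

Grid = Sequence[Sequence[int]]  # grayscale pixels (0-255), already downscaled


def average_hash(pixels: Grid) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the grid mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions where two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Grid, b: Grid, max_distance: int = 5) -> bool:
    """Treat two images as near-duplicates if their hashes differ by few bits."""
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


# Toy 8x8 images: a gradient, a slightly brighter copy, and its negative.
original = [[(r * 8 + c) * 4 for c in range(8)] for r in range(8)]
brighter = [[min(255, p + 3) for p in row] for row in original]
inverted = [[255 - p for p in row] for row in original]

print(is_near_duplicate(original, brighter))  # True
print(is_near_duplicate(original, inverted))  # False
```

Because the hash depends on the pattern of light and dark rather than on exact bytes, a re-saved or slightly brightened copy of a site photograph produces the same or nearly the same hash, while a genuinely different image does not.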

For NHS trusts, NHS England's Frontline Digitisation programme — which allocated £200 million nationally in the 2025 spending round — includes a data hygiene workstream specifically targeting imaging redundancy, with participating London trusts required to submit baseline audits by 31 October 2026.

The practical upshot for any Londoner who has submitted planning documents, GP referral paperwork or licensing applications to a borough council: your images have almost certainly been stored three or four times over. Getting public bodies to fix that is less glamorous than building a new hospital or approving a housing estate. The numbers, though, make the case plainly enough.


About this article

Published by The Daily London

Covering news in London. This article was generated by AI from the linked sources and was not reviewed by a human editor before publishing. See our editorial standards.
