The Daily London

London news, every day

London's Duplicate Image Problem: The Numbers Hiding Inside the Capital's Digital Archives

Councils, NHS trusts and cultural institutions across London are sitting on millions of redundant digital files — and the storage bill is climbing fast.

By London News Desk · Published 5 July 2026, 4:51 am

Updated 5 July 2026, 1:57 pm

How we reported this

This article was generated by AI from the linked public sources. The Daily London is independently owned and covers London news free from advertiser or sponsor influence. Read our editorial standards →

Photo: Jacek Herbut on Pexels

London's public sector organisations are estimated to hold tens of millions of duplicate image files across their digital infrastructure. The problem costs more than most administrators acknowledge, and it is quietly worsening as scanning and digitisation programmes accelerate under pressure from central government. The duplication is rarely audited in full, but its scale is becoming harder to ignore.

The issue matters right now because the Starmer government's push to digitise public records — from NHS patient imaging to planning application archives at local authorities — is dramatically increasing the volume of digital assets held by London institutions. Without systematic deduplication, the redundant data compounds. Storage costs rise. Search tools degrade. And the institutions that most need clean, retrievable records — hospitals, planning departments, cultural bodies — end up working around the problem rather than solving it.

What the numbers actually show

Industry benchmarks from digital asset management research suggest that between 20 and 40 percent of files in large unmanaged image repositories are duplicates or near-duplicates. Apply even the lower end of that range to a major London institution and the figures become stark. The Wellcome Collection on Euston Road, which holds one of the world's largest medical heritage archives and has been actively digitising holdings since 2015, manages hundreds of thousands of image files. The Museum of London Archaeology, known as MOLA and operating out of offices near Aldgate, routinely processes photographic records from active excavation sites across the city — sites that generate dozens of near-identical shots per dig day.
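To make that range concrete, the arithmetic can be sketched in a few lines of Python. The archive size below is a hypothetical round number chosen only for illustration, not a reported figure for any institution named here, and the function name is likewise an assumption.

```python
def duplicate_estimate(total_files, low_pct=20, high_pct=40):
    """Return (low, high) estimated duplicate counts for an archive.

    The default percentages are the 20 to 40 percent benchmark range
    quoted in the article; integer arithmetic avoids rounding surprises.
    """
    return total_files * low_pct // 100, total_files * high_pct // 100


# Hypothetical archive size, for illustration only.
archive_size = 300_000
low, high = duplicate_estimate(archive_size)
print(f"{archive_size:,} files: roughly {low:,} to {high:,} duplicates")
```

On those assumptions, even the cautious end of the range means tens of thousands of redundant files in a single archive.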

Neither institution is unusual in this respect. What makes London distinctive is the sheer density of overlapping digitisation programmes running simultaneously. Transport for London's asset management teams photograph infrastructure continuously. The thirty-two borough councils each maintain their own planning image archives, many of which have never been cross-referenced with the Greater London Authority's own records at City Hall. The NHS's five integrated care boards covering London are in the middle of migrating legacy imaging data (X-rays, MRI scans and clinical photographs) onto shared platforms under NHS England's What Good Looks Like digital maturity framework, a process expected to run through 2027.

Storage is not cheap. Commercial cloud storage for healthcare-grade data in the UK, which must comply with NHS Data Security and Protection Toolkit requirements, costs more per terabyte than standard enterprise storage. Analysts tracking UK public sector IT spending have noted that London NHS trusts collectively spend hundreds of millions of pounds annually on digital infrastructure, a figure that includes storage that has never been audited for redundancy. The Integrated Care Board for North Central London, covering boroughs from Camden to Barnet, began a digital infrastructure review in early 2026, but deduplication of image assets was not listed among its initial work streams in publicly available documentation.

What councils and institutions can do

The practical tools exist. Perceptual hashing, a technique that fingerprints an image by its visual content so that copies can be matched even when their file format, resolution or metadata differ, has been deployed effectively in archive projects at institutions including the Victoria and Albert Museum in South Kensington, where curators have used it to consolidate acquisition photography. The process can reduce storage loads by double-digit percentages within a single archive in a matter of weeks. The barrier is rarely technical. It is organisational: agreeing which version of a duplicated file is the authoritative one requires governance decisions that cut across departmental boundaries.
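For readers curious how perceptual hashing works in principle, here is a minimal sketch of one simple variant, usually called an average hash. It is an illustration only and not a description of the software any named institution uses. The function names and the synthetic test images are assumptions, and a real project would decode actual image files with an imaging library rather than build pixel grids by hand.

```python
def average_hash(pixels, size=8):
    """Fingerprint a grayscale image given as rows of 0-255 values.

    The image is shrunk to size x size cells by averaging blocks of
    pixels, then each cell becomes one bit: 1 if it is brighter than
    the mean of all cells. Assumes the image is at least size x size.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    for r in range(size):
        r0, r1 = r * height // size, (r + 1) * height // size
        for c in range(size):
            c0, c1 = c * width // size, (c + 1) * width // size
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (value > mean)
    return bits


def hamming(a, b):
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


# Synthetic stand-ins for photographs (assumptions for illustration):
# a left-to-right gradient, the same scene at double resolution and
# slightly brighter, and an unrelated top-to-bottom gradient.
original = [[x * 8 for x in range(32)] for _ in range(32)]
resized = [[x * 4 + 5 for x in range(64)] for _ in range(64)]
different = [[y * 8 for _ in range(32)] for y in range(32)]

print(hamming(average_hash(original), average_hash(resized)))
print(hamming(average_hash(original), average_hash(different)))
```

A small distance between two fingerprints suggests the same picture saved twice; a large one suggests different pictures. Where to draw the line between the two is a policy choice, which is part of why the governance question matters as much as the software.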

For London's borough councils, the most immediate opportunity sits inside their planning portals. Tower Hamlets, Southwark and Hackney — all managing high volumes of development applications in 2025 and 2026 — each receive multiple image submissions per application, many of them near-identical site photographs uploaded by different parties. A coordinated deduplication standard, applied at the point of upload rather than retrospectively, would reduce archive bloat before it compounds.
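What checking at the point of upload might look like can be sketched as follows. Everything here is hypothetical: the class name, the threshold of five differing bits and the return labels are assumptions for illustration, not features of any council's planning portal. The sketch assumes each upload arrives with a 64-bit perceptual fingerprint computed elsewhere.

```python
import hashlib


class UploadIndex:
    """Hypothetical in-memory index of images already held for one application."""

    def __init__(self, max_distance=5):
        self.max_distance = max_distance  # assumed near-duplicate threshold, in bits
        self.digests = set()              # SHA-256 digests of stored files
        self.fingerprints = []            # perceptual fingerprints of stored files

    def check_and_add(self, data, fingerprint):
        """Classify an upload and store it only if it looks new."""
        digest = hashlib.sha256(data).hexdigest()
        if digest in self.digests:
            return "exact-duplicate"      # byte-for-byte copy of a stored file
        for known in self.fingerprints:
            if bin(known ^ fingerprint).count("1") <= self.max_distance:
                return "near-duplicate"   # visually close to a stored file
        self.digests.add(digest)
        self.fingerprints.append(fingerprint)
        return "stored"


# Illustrative uploads; the byte strings and fingerprints are made up.
index = UploadIndex()
first = index.check_and_add(b"site-photo-bytes", 0x0F0F0F0F0F0F0F0F)
again = index.check_and_add(b"site-photo-bytes", 0x0F0F0F0F0F0F0F0F)
similar = index.check_and_add(b"recompressed-bytes", 0x0F0F0F0F0F0F0F0E)
other = index.check_and_add(b"different-photo", 0x00000000FFFFFFFF)
print(first, again, similar, other)
```

In practice a portal would more likely flag a near-duplicate for an officer to review than reject it outright, since two similar photographs from different parties may both be relevant to an application.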

The government's Central Digital and Data Office is due to publish updated guidance on public sector data quality standards later in 2026. Whether that guidance addresses image duplication specifically remains unclear from documents released so far. For now, the cost of doing nothing accumulates quietly, one redundant file at a time.

About this article

Published by The Daily London

Covering news in London. This article was generated by AI from the linked sources and was not reviewed by a human editor before publishing. See our editorial standards.
