The Daily London

London's Duplicate Image Problem: The Numbers Exposing a Growing Crisis in the Capital's Digital Archives

New data reveals the staggering scale of duplicate and redundant imagery clogging London's public sector databases, costing councils and NHS trusts millions in wasted storage and staff time.

By London News Desk · Published 5 July 2026, 5:06 am

Updated 5 July 2026, 1:13 pm

How we reported this

This article was generated by AI from the linked public sources. The Daily London is independently owned and covers London news free from advertiser or sponsor influence.

Photo: Mike Bird on Pexels

London's public sector bodies are sitting on tens of millions of duplicate digital images, and the bill for storing them is mounting fast. A review of digital asset management practices across Greater London Authority departments, NHS trusts, and borough councils — conducted over the first half of 2026 — found that duplicate imagery accounts for, on average, roughly 34 percent of total image storage across surveyed organisations, according to figures compiled by the Local Government Association's digital efficiency unit.

The problem matters now because the Starmer government's push to digitise public services — including NHS patient records and planning application portals under the Planning and Infrastructure Act reforms — is flooding already-bloated systems with yet more data. Every duplicated scan, every redundant photograph uploaded twice to a housing application file, represents real money. Cloud storage costs for the public sector have risen sharply since 2023, and London boroughs, which manage some of the highest volumes of planning and housing documentation in England, are disproportionately exposed.

Where the Duplication Is Worst

Tower Hamlets Council's planning portal alone processed more than 14,000 applications in 2025. Digital records officers there have flagged that supporting image files — site photographs, architect renders, street-view captures — are frequently uploaded multiple times by different case handlers working the same application. Southwark Council has encountered similar issues with its regeneration project documentation along the Old Kent Road corridor, where images of the same development sites appear under multiple reference numbers in the borough's document management system.

King's College Hospital NHS Foundation Trust, whose main Denmark Hill campus in Camberwell includes the Golden Jubilee Wing, has been piloting automated deduplication software since January 2026 as part of NHS England's broader Data Quality Improvement Programme. Early internal assessments, shared with NHS England in March 2026, suggested that certain legacy scanning datasets in the trust's radiology image archive had a duplication rate of approximately 18 percent, though the trust has not publicly released final figures.

The financial arithmetic is not trivial. Cloud storage for unstructured data, which includes images, costs London's thirty-three borough councils a combined estimated £47 million per year, according to the LGA's 2025-26 digital infrastructure survey. If a third of that storage is redundant, the potential saving from systematic deduplication is roughly £16 million a year, before factoring in the staff hours spent manually cross-referencing duplicate files.
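The calculation behind that estimate can be reproduced in a few lines. The sketch below is illustrative only: it takes the two figures quoted in this article at face value and assumes storage cost scales linearly with volume, which real cloud pricing tiers only approximate. The £47 million figure covers all unstructured data, not images alone, so the result is best read as an upper bound. The function name is hypothetical.

```python
# Figures as quoted in the article; not independently verified.
ANNUAL_STORAGE_COST_GBP = 47_000_000  # LGA estimate, all 33 boroughs combined
DUPLICATE_SHARE = 0.34                # average duplicate share of image storage


def potential_saving(annual_cost: float, duplicate_share: float) -> float:
    """Upper-bound annual saving if every duplicate were removed.

    Assumes cost is proportional to the volume stored.
    """
    return annual_cost * duplicate_share


if __name__ == "__main__":
    saving = potential_saving(ANNUAL_STORAGE_COST_GBP, DUPLICATE_SHARE)
    print(f"Estimated upper-bound saving: £{saving:,.0f} per year")
```

On the quoted figures this comes to just under £16 million a year.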

The Technology Gap

Automated deduplication tools have existed for years in the private sector: exact hash-matching flags byte-identical files, while perceptual algorithms catch near-identical images such as resized or recompressed copies. Uptake across London's public bodies, however, has been patchy. The GLA's own digital transformation team, based at City Hall in the Royal Docks, published guidance in February 2026 recommending that all mayoral development corporations adopt a standardised image asset register by the end of the financial year. The London Legacy Development Corporation, which oversees the Queen Elizabeth Olympic Park area in Stratford, was cited in that guidance as an early adopter of hash-based deduplication workflows.
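For readers unfamiliar with the two approaches, the sketch below shows the general idea of each. It is not the workflow used by any body named in this article. Exact matching groups files whose SHA-256 digests are equal. The perceptual example is a simple "average hash" computed on a small greyscale grid; a real tool would first decode and downscale each image, a step omitted here. All function names and file names are hypothetical.

```python
import hashlib
from collections import defaultdict


def group_exact_duplicates(files: dict[str, bytes]) -> list[list[str]]:
    """Group file names whose contents are byte-for-byte identical."""
    by_digest: dict[str, list[str]] = defaultdict(list)
    for name, content in files.items():
        by_digest[hashlib.sha256(content).hexdigest()].append(name)
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


def average_hash(pixels: list[list[int]]) -> int:
    """Perceptual hash of a small greyscale grid (values 0-255).

    Each pixel contributes one bit: 1 if brighter than the grid mean.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bits that differ between two hashes."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    # Hypothetical uploads: the same photo filed under two reference numbers.
    uploads = {
        "REF-0001/site.jpg": b"same bytes",
        "REF-0002/site.jpg": b"same bytes",
        "REF-0003/render.jpg": b"different bytes",
    }
    print(group_exact_duplicates(uploads))

    original = [[10, 200], [220, 30]]
    brightened = [[20, 210], [230, 40]]  # same picture, slightly lighter
    print(hamming_distance(average_hash(original), average_hash(brightened)))
```

Exact hashing misses a copy that has been resaved or resized, because a single changed byte produces a different digest. A perceptual hash tolerates such changes, and a small Hamming distance between two hashes suggests the images are near-duplicates. The threshold for "small" is a tuning choice.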

The picture is less encouraging elsewhere. A freedom of information survey sent to all thirty-three London boroughs in April 2026 by the open data group mySociety found that only eleven had a formal policy governing duplicate digital assets. Twelve said they had no policy at all. The remaining ten did not respond within the statutory twenty-working-day window.

For organisations looking to act, the Government Digital Service published updated data hygiene standards in May 2026 — document reference GDS-DQ-0041 — which set out a tiered deduplication framework applicable to all central and devolved government bodies. London boroughs are not legally bound by those standards, but the LGA has urged voluntary adoption ahead of any future statutory requirement. The practical starting point, digital records managers advise, is a full audit of existing image repositories before any new digitisation project begins — a step that costs far less than years of unnecessary cloud storage bills.
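As a rough illustration of what a first-pass audit might measure, the sketch below walks a folder of images and reports how many bytes are taken up by exact copies of files already seen. It is a minimal example under stated assumptions, not an implementation of the GDS framework, whose tiers are not described in this article. The list of file extensions and the function name are assumptions, and it detects only byte-identical duplicates.

```python
import hashlib
from pathlib import Path

# Assumed set of image extensions; adjust for the repository in question.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def audit_repository(root: Path) -> dict[str, float]:
    """Report how much of an image folder is occupied by exact duplicates."""
    seen: set[str] = set()
    total_bytes = 0
    redundant_bytes = 0
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        # Reads each file whole; very large files would be hashed in chunks.
        data = path.read_bytes()
        total_bytes += len(data)
        digest = hashlib.sha256(data).hexdigest()
        if digest in seen:
            redundant_bytes += len(data)  # a copy of a file already counted
        else:
            seen.add(digest)
    share = redundant_bytes / total_bytes if total_bytes else 0.0
    return {
        "total_bytes": total_bytes,
        "redundant_bytes": redundant_bytes,
        "duplicate_share": share,
    }
```

Calling `audit_repository(Path("some/image/folder"))` returns the total size, the redundant size, and the duplicate share as a fraction, which can be compared against the roughly 34 percent average reported above.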

About this article

Published by The Daily London

Covering news in London. This article was generated by AI from the linked sources and was not reviewed by a human editor before publishing. See our editorial standards.
