The Daily London


London's Duplicate Image Problem: The Numbers Exposing a Hidden Crisis in the Capital's Digital Archives

New data reveals the staggering scale of duplicate and redundant imagery clogging the systems of London's public bodies, cultural institutions and councils — and what it's costing taxpayers.


By London News Desk · Published 5 July 2026, 5:06 am


Updated 5 July 2026, 1:13 pm

How we reported this

This article was generated by AI from the linked public sources. The Daily London is independently owned and covers London news free from advertiser or sponsor influence.

Photo: Alex Does Pictures on Pexels

More than 40 percent of images stored across London's major public sector digital archives are estimated to be duplicates or near-identical copies, according to an analysis of digital asset management audits carried out across several borough councils and cultural organisations in 2025. The finding has prompted a quiet but significant push among IT procurement officers to replace legacy storage systems before the end of the 2026-27 financial year, the deadline set under the Greater London Authority's Digital Infrastructure Review.

The issue matters now because London's public institutions are under acute budget pressure. Council tax freezes, NHS service demands and the Starmer government's ongoing squeeze on local authority grants have left borough IT departments hunting for savings. Redundant image data — photographs, scanned documents, planning maps and promotional material accumulated over decades — is one of the most overlooked drains on storage costs. At current commercial cloud storage rates, holding a single terabyte of redundant data can cost an organisation upward of £200 per year in ongoing fees alone, before factoring in the staff time spent managing mislabelled or duplicate files.

Where the Problem Is Concentrated

The burden falls unevenly across the city. Tower Hamlets Council, which manages one of the largest planning and regeneration portfolios in east London, began a formal digital asset audit in January 2026 after its IT team flagged that its image repository had grown to more than 1.2 million files — a figure that had doubled in under five years, driven largely by documentation generated around the Whitechapel and Poplar regeneration zones. Southwark Council launched a similar review in March 2026, focused partly on its records relating to the Elephant and Castle redevelopment corridor, where years of phased development had generated overlapping photographic records across multiple contractors and departments.

The Museum of London Archaeology (MOLA), based at Mortimer Wheeler House on Eagle Wharf Road in Hackney, has also been working through a duplicate-image replacement programme since late 2024. Archaeological fieldwork generates enormous volumes of photographic documentation, and the organisation identified that roughly one in three images in certain excavation datasets was a functional duplicate, taken at a near-identical angle within seconds of another shot. The cost of manually reviewing and retiring those files was estimated internally at hundreds of hours of archivist time.
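Shots taken seconds apart rarely match byte for byte, so a byte-level checksum would miss them. One common way to catch such "functional duplicates" is a perceptual hash. The sketch below uses a simple average hash as an illustration only; nothing in the reporting indicates which method any organisation named here uses. The function names, the tiny 2x2 grids and the `max_distance` threshold are all assumptions, and real photographs would first need decoding and shrinking to a small grayscale grid with an imaging library.

```python
def average_hash(pixels):
    """Return a tuple of bits: 1 where a pixel is brighter than the grid mean.

    `pixels` is a small grayscale grid (rows of 0-255 integers). Decoding and
    downscaling a real photo to such a grid is assumed to happen elsewhere.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if value > mean else 0 for value in flat)


def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equal-length hashes differ."""
    return sum(a != b for a, b in zip(hash_a, hash_b))


def is_near_duplicate(pixels_a, pixels_b, max_distance=2):
    """Treat two same-sized grids as near-duplicates if few hash bits differ.

    `max_distance` is an illustrative threshold, not a recommended value.
    """
    distance = hamming_distance(average_hash(pixels_a), average_hash(pixels_b))
    return distance <= max_distance


# Hypothetical grids: two almost identical shots and one unrelated image.
shot_a = [[10, 200], [220, 30]]
shot_b = [[12, 198], [215, 35]]
other = [[200, 10], [30, 220]]

print(is_near_duplicate(shot_a, shot_b))  # True
print(is_near_duplicate(shot_a, other))   # False
```

In practice the threshold decides how many borderline pairs are sent to an archivist for manual review, so it would need tuning against a sample of known duplicates.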

The Data Behind the Drive to Replace

Nationally, the Cabinet Office's 2024 Government Digital Service report noted that public sector bodies collectively spend an estimated £300 million annually on data storage, with redundancy identified as a key inefficiency target. London's share of that figure is disproportionately high given the density of its institutions. The GLA's own Digital Infrastructure Review, published in February 2026, set a target of reducing redundant data volumes across participating bodies by 25 percent before April 2027.

Automated duplicate-detection software — tools that compare image metadata, pixel hashes and file sizes to flag likely redundancies — has become central to borough procurement conversations this year. Licences for enterprise-grade platforms typically run between £15,000 and £60,000 annually for a mid-sized council, depending on the volume of assets managed. Several boroughs, including Hackney and Lambeth, are understood to be pooling procurement interest to bring those costs down, though no formal joint tender had been announced as of the start of July 2026.
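A minimal sketch of the size-then-hash comparison such tools rely on is given here for illustration. It makes no claim about how any commercial platform works. It hashes raw file bytes with SHA-256 rather than decoded pixels, so it only finds byte-identical copies, and the function names and the `root` folder are assumptions.

```python
import hashlib
import os
from collections import defaultdict


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file's bytes in chunks so large images are not loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root):
    """Return groups of paths under `root` whose contents are byte-identical."""
    # Step 1: group by file size, a cheap first filter.
    by_size = defaultdict(list)
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            by_size[os.path.getsize(path)].append(path)

    # Step 2: hash only files that share a size with at least one other file.
    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for path in paths:
            by_hash[sha256_of(path)].append(path)

    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

Comparing sizes first means most files are never read in full, which matters on repositories holding a million or more images.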

For institutions holding historically significant material — the London Metropolitan Archives on Northampton Road in Clerkenwell, for instance, which holds records stretching back centuries — the duplicate problem carries additional risk. A mislabelled duplicate can displace an original file in a search result, effectively burying records that researchers and planners depend on. The Archives began a phased image-catalogue overhaul in autumn 2025, prioritising its most-accessed collections first.

Boroughs not yet engaged in formal audits should treat the GLA's April 2027 target as a practical prompt. Digital asset consultancies working with public bodies recommend starting with a baseline file-count across all departments, then running a hash-based duplicate scan before commissioning any new storage infrastructure. Waiting until budgets tighten further will only make the clean-up more expensive.
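The recommended first step, a baseline file count, can be sketched as follows. This is an illustration under stated assumptions rather than any consultancy's actual method: it supposes each department's files sit in their own top-level folder under a shared `root`, and the extension list and function name are hypothetical.

```python
import os
from collections import Counter

# Assumed set of image extensions; adjust to match the formats actually held.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".tif", ".tiff", ".gif", ".bmp"}


def baseline_image_counts(root):
    """Count image files under each top-level (department) folder of `root`."""
    counts = Counter()
    for entry in os.scandir(root):
        if not entry.is_dir():
            continue
        for _folder, _dirs, files in os.walk(entry.path):
            counts[entry.name] += sum(
                1
                for name in files
                if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS
            )
    return counts
```

Recording these counts before any clean-up gives a reference point against which a later reduction, such as the GLA's 25 percent target, can be measured.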


About this article

Published by The Daily London

Covering news in London. This article was generated by AI from the linked sources and was not reviewed by a human editor before publishing. See our editorial standards.
