The Daily London

London's Duplicate Image Problem: How the Capital Stacks Up Against New York, Amsterdam and Tokyo

Cities worldwide are grappling with how to audit and remove redundant visual records from public databases — and London's approach is drawing both praise and scrutiny.

By London News Desk · Published 5 July 2026, 4:51 am

Updated 5 July 2026, 1:57 pm

How we reported this

This article was generated by AI from the linked public sources. The Daily London is independently owned and covers London news free from advertiser or sponsor influence.

Photo: Tsvetelina Yankova on Pexels

London holds more than 70 million digitised images across its network of public archives, borough councils and NHS trusts — and a significant portion of them are duplicates. That is the operational reality confronting the Greater London Authority's digital asset teams as they push through a major rationalisation effort in 2026, one that has quietly become a test case for how large civic administrations manage sprawling visual records in the post-digitisation era.

The issue matters now because storage is not cheap, and the government's broader data reform agenda under Keir Starmer has put pressure on public bodies to demonstrate efficiency. Cabinet Office guidance issued earlier this year directed central and local government bodies to reduce unnecessary data redundancy by the end of the 2026-27 financial year. For a city like London — where Transport for London, the NHS North East London Integrated Care Board, the Metropolitan Police, and dozens of borough councils each maintain separate image libraries — the duplication problem is structural, not incidental.

What London Is Actually Doing

The most concrete effort so far has come from the London Metropolitan Archives on Northampton Road in Clerkenwell, which since January has been running a deduplication audit across roughly 4.2 million scanned heritage photographs. Staff there are using open-source perceptual hashing software, the same class of tool used by media libraries, to flag near-identical images before human reviewers make final deletion decisions. Unlike a cryptographic checksum, a perceptual hash is designed to change little when an image is resized or recompressed, so copies that differ byte for byte can still be matched. The archive has so far flagged approximately 340,000 potential duplicates, about 8 percent of the collection under audit, though confirmed removals remain in the tens of thousands pending sign-off.
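To make the flagging step concrete, here is a minimal Python sketch of one common perceptual hashing technique, the average hash. It is an illustration only: the archive has not said which software or algorithm it uses, the 8x8 grayscale grids stand in for images that a real tool would decode and downscale first, and the 5-bit threshold is an assumed value.

```python
from typing import List

# Assumption: each image has already been decoded, converted to grayscale
# and downscaled to an 8x8 grid of brightness values in the range 0-255.
Grid = List[List[int]]


def average_hash(grid: Grid) -> int:
    """Build a 64-bit fingerprint: one bit per pixel, set if the pixel is brighter than the mean."""
    pixels = [p for row in grid for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_candidate_duplicate(a: Grid, b: Grid, max_distance: int = 5) -> bool:
    """Flag a pair for human review; max_distance is a hypothetical threshold."""
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


# Toy data: a gradient, a slightly brightened copy, and an inverted version.
original = [[(r * 8 + c) * 4 for c in range(8)] for r in range(8)]
brighter = [[min(255, p + 3) for p in row] for row in original]
inverted = [[255 - p for p in row] for row in original]

print(is_candidate_duplicate(original, brighter))  # True
print(is_candidate_duplicate(original, inverted))  # False
```

In this toy case the brightened copy produces the same fingerprint as the original, while the inverted image differs in all 64 bits. The function only flags candidates; as in the archive's process, the decision to delete would rest with a reviewer.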

Separately, the Wellcome Collection on Euston Road began a parallel review of its 100,000-image digital collection in March, focusing specifically on medical photography records shared across NHS partner databases. The Wellcome process is more cautious — duplicates are quarantined rather than deleted, and a retention review panel meets monthly. It is a slower model, but one designed to avoid the kind of irreversible loss that has caused problems elsewhere.
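A quarantine-first workflow of this kind can be sketched in a few lines of Python. This is a hypothetical illustration, not the Wellcome Collection's system: the function names, the `manifest.json` file and its fields are all assumptions made for the example.

```python
import json
import shutil
from datetime import date
from pathlib import Path

MANIFEST_NAME = "manifest.json"  # hypothetical manifest file name


def quarantine(flagged: list[Path], quarantine_dir: Path, reason: str) -> Path:
    """Move flagged files into quarantine_dir and record where each one came from."""
    quarantine_dir.mkdir(parents=True, exist_ok=True)
    manifest_path = quarantine_dir / MANIFEST_NAME
    entries = json.loads(manifest_path.read_text()) if manifest_path.exists() else []
    for src in flagged:
        dest = quarantine_dir / src.name
        if dest.exists():
            # Simplification: refuse name clashes instead of renaming.
            raise FileExistsError(f"{dest} is already in quarantine")
        shutil.move(str(src), str(dest))
        entries.append({
            "original_path": str(src),
            "quarantined_as": dest.name,
            "reason": reason,
            "date": date.today().isoformat(),
        })
    manifest_path.write_text(json.dumps(entries, indent=2))
    return manifest_path


def restore(quarantine_dir: Path, name: str) -> Path:
    """Return a quarantined file to its original location, e.g. after a review panel keeps it."""
    manifest_path = quarantine_dir / MANIFEST_NAME
    entries = json.loads(manifest_path.read_text())
    entry = next(e for e in entries if e["quarantined_as"] == name)
    original = Path(entry["original_path"])
    shutil.move(str(quarantine_dir / name), str(original))
    entries.remove(entry)
    manifest_path.write_text(json.dumps(entries, indent=2))
    return original
```

Nothing is deleted at any point in the sketch. A flagged file either stays in quarantine with a record of its origin or is moved back, which is what makes the approach reversible.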

Transport for London's image library — which covers everything from engineering schematics to press photography — has taken a third route, outsourcing the deduplication work to a contracted digital asset management firm under a deal worth £1.8 million over three years, according to procurement records published on the GLA's contracts register in April 2026.

How London Compares to New York, Amsterdam and Tokyo

New York City's Department of Records and Information Services completed a comparable exercise in 2024, working through the Municipal Archives' holdings of around 2.5 million images. The New York process relied heavily on automated deletion with minimal manual review — faster, but it drew criticism from archivists after several historically significant photographs were reportedly lost in error. The city has since revised its policy to require human sign-off on any image predating 1980.

Amsterdam took the most aggressive centralisation approach. The Stadsarchief Amsterdam consolidated image holdings from 14 municipal departments into a single platform in 2023, cutting its overall storage footprint by roughly 28 percent in the first year alone. That figure, cited in the Stadsarchief's own annual report, has been held up as a benchmark by GLA digital officers, though London's fragmented governance structure — with 33 borough councils, each retaining independent data controls — makes a direct Amsterdam-style merger politically and logistically difficult.

Tokyo's approach through the Tokyo Metropolitan Archives has been the most conservative. The city opted for tagging and cross-referencing duplicates rather than removing them, prioritising discoverability over storage savings. The result is a larger but better-mapped dataset — useful for researchers, expensive for server budgets.
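The tag-and-cross-reference idea can be illustrated with a short Python sketch. It assumes each catalogue record already carries some content fingerprint; the `ImageRecord` class, its field names and the sample records are hypothetical and do not describe the Tokyo Metropolitan Archives' actual schema.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ImageRecord:
    record_id: str
    fingerprint: str  # assumed to be computed elsewhere, e.g. by a hashing step
    see_also: list[str] = field(default_factory=list)  # IDs of duplicate records


def cross_reference(records: list[ImageRecord]) -> None:
    """Link records that share a fingerprint without removing any of them."""
    groups: dict[str, list[ImageRecord]] = defaultdict(list)
    for record in records:
        groups[record.fingerprint].append(record)
    for group in groups.values():
        ids = [r.record_id for r in group]
        for record in group:
            record.see_also = [i for i in ids if i != record.record_id]


records = [
    ImageRecord("A-001", "f3a9"),
    ImageRecord("B-417", "f3a9"),  # same fingerprint as A-001
    ImageRecord("C-082", "77c0"),
]
cross_reference(records)
print(records[0].see_also)  # ['B-417']
print(len(records))         # 3
```

Every record survives, which is why this approach helps researchers find related copies but saves no storage.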

London currently sits somewhere between Amsterdam's efficiency drive and Tokyo's preservation instinct. That middle path has costs: the GLA's own digital infrastructure budget has grown by 12 percent year-on-year since 2023, partly because unresolved duplication inflates storage overhead.

Public bodies across London have until March 2027 to submit compliance reports against the Cabinet Office redundancy-reduction targets. Organisations that have not begun formal deduplication audits by October 2026 risk being flagged in a cross-government review. For borough councils still running legacy image systems (several in outer east London have yet to migrate to cloud-based asset management), the October deadline is already uncomfortably close.

About this article

Published by The Daily London

Covering news in London. This article was generated by AI from the linked sources and was not reviewed by a human editor before publishing. See our editorial standards.
