The Daily London

London news, every day

London Leads Europe on Duplicate Image Scrubbing — But New York and Tokyo Are Catching Up Fast

As councils and cultural institutions race to purge redundant visual assets from public databases, London's patchwork approach is drawing both praise and criticism from digital archivists.

By London News Desk · Published 5 July 2026, 4:58 am

Updated 5 July 2026, 1:47 pm

How we reported this

This article was generated by AI from the linked public sources. The Daily London is independently owned and covers London news free from advertiser or sponsor influence.

Photo: Negative Space on Pexels

Transport for London holds more than 2.3 million digital image files across its asset management systems, and an internal audit completed in March 2026 estimated that 18 percent of them, or more than 400,000 files, were duplicates. That finding, which circulated among digital infrastructure teams at City Hall, has accelerated a quiet but significant push across London's public sector to tackle what archivists call the "duplicate image problem": the ballooning cost and confusion caused by redundant visual files stored across disconnected databases.

The issue matters now for a specific reason. The Keir Starmer government's planning reform agenda, particularly the push to digitise local authority planning portals under the data provisions of the Levelling-up and Regeneration Act 2023, has forced councils to confront their legacy file stores. When Southwark Council began migrating its planning image archive to a new cloud system earlier this year, staff discovered thousands of duplicate site photographs dating back to 2009, some filed under three or four separate case reference numbers. The migration was delayed by six weeks.

A City-Wide Problem, a Fragmented Response

London's 32 boroughs are handling this inconsistently. Hackney Council contracted with a specialist data deduplication firm in January 2026 to process its planning and housing image archives, paying what sources familiar with the contract described as a mid-five-figure sum for the initial clearance. Camden, by contrast, is still relying on manual review by planning officers — a method that digital records specialists say is both slow and unreliable for large file volumes.

The British Library on Euston Road, which manages one of Europe's largest digitised image collections, completed a major deduplication exercise across its Flickr Commons holdings in late 2025, removing or consolidating roughly 40,000 redundant image records. The Museum of London Archaeology, now operating from its Hackney headquarters after the Smithfield site transition, began a similar project in April 2026 covering its excavation photography archive, which spans more than 30 years of fieldwork across Greater London.

These are not trivial exercises. Storage costs for public sector bodies have risen sharply since 2023, and duplicated files compound licensing complexity — particularly when images carry different rights metadata despite being identical or near-identical files. For planning authorities trying to build the open digital records that central government now expects, the problem is both operational and legal.

How London Compares to Other Major Cities

London is ahead of most European peers on deduplication, but only just. Amsterdam's municipal archive, the Stadsarchief, completed a full deduplication sweep of its 550,000-item digital image collection in 2024, using open-source perceptual hashing tools developed in partnership with the University of Amsterdam. Perceptual hashing fingerprints an image by its visual content rather than its raw bytes, so resized or recompressed copies can still be matched. The project cut storage overhead by 23 percent and is now cited by European digital heritage bodies as a benchmark.

New York City's Department of Records and Information Services has been running a rolling deduplication programme since 2023, covering borough planning files and Parks Department photography. Tokyo's metropolitan government announced in February 2026 a three-year digitisation and deduplication contract worth ¥4.2 billion covering city infrastructure imagery — a scale that dwarfs anything currently planned in London.

What London has that many peer cities lack is a network of institutions — the Tate, the V&A, the National Portrait Gallery on St Martin's Place — that have built substantial internal expertise in digital image management. The challenge is that this expertise sits in cultural institutions rather than in the councils and transport bodies where the operational backlog is worst.

For Londoners and the organisations dealing with this daily, the practical steps are clear enough: borough councils should be auditing their planning image stores now, before the next wave of digitisation mandates arrives from Whitehall. The Greater London Authority has been encouraging boroughs to adopt shared technical standards since early 2026, but uptake has been uneven. Those that wait risk repeating Southwark's experience — a migration delay that cost staff time and pushed planning decisions back during an already stretched period for the borough's housing pipeline.

About this article

Published by The Daily London

Covering news in London. This article was generated by AI from the linked sources and was not reviewed by a human editor before publishing. See our editorial standards.
