London's planning and heritage bodies are sitting on tens of thousands of duplicate digital images across public-facing databases, a problem that has quietly ballooned since councils accelerated their digitisation drives after 2020. The Greater London Authority's own data management review, circulated internally earlier this year, flagged that duplicated imagery in planning portal submissions alone was consuming significant server capacity and slowing case officer workflow across all 32 boroughs and the City of London.
The issue matters now because the Starmer government's planning reform agenda — centred on the Planning and Infrastructure Bill moving through Parliament this year — is pushing local authorities to modernise their digital infrastructure faster than many are equipped to handle. More applications, more digital attachments, more duplicates. Historic England, which manages the National Heritage List covering thousands of London entries from Bermondsey to Barnet, has separately acknowledged a need to clean up its image asset records, though no public timeline has been set for a dedicated deduplication programme.
What London Is Actually Doing
Two organisations are ahead of the curve inside the capital. The London Metropolitan Archives, based on Northampton Road in Clerkenwell, began a structured deduplication audit of its digitised photograph collections in January 2025, using open-source image-matching software to flag near-identical scans. Staff there are working through an estimated 1.2 million digitised images, a project expected to run until at least mid-2027. Meanwhile, the Museum of London — now relocated to its new West Smithfield site — built deduplication protocols into its digital collections migration from the old London Wall building, treating the move as an opportunity to strip redundant image files before they embedded themselves in the new system.
Southwark Council's planning department ran a pilot in late 2024 to auto-detect duplicate image submissions in planning applications, partly in response to applicants accidentally re-uploading identical elevation drawings across multiple documents. The pilot covered around 400 applications and identified duplicate imagery in roughly one in six cases, which works out to somewhere between 65 and 70 applications. The council has not yet rolled out the system borough-wide.
How London Compares With New York, Amsterdam and Tokyo
New York City is further along. The NYC Department of Records and Information Services completed a deduplication pass of its municipal photograph archive — covering more than 800,000 images — by December 2024, using a combination of perceptual hashing algorithms and manual review. The city dedicated a full-time team of four archivists to the project for 18 months. London has no equivalent centrally funded, cross-authority effort.
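For readers unfamiliar with perceptual hashing, the idea can be shown with a minimal "average hash" sketch. This is an illustration only: the reporting does not describe which algorithm New York actually used, and the function names, the tiny 4x4 pixel grids and the `max_distance` threshold below are all hypothetical. Real systems first shrink each image to a small greyscale grid; here that step is assumed to have already happened.

```python
def average_hash(pixels):
    """Return a bit string with '1' wherever a pixel is brighter than the grid's mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if value > mean else "0" for value in flat)


def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equal-length hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must be the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


def is_near_duplicate(pixels_a, pixels_b, max_distance=2):
    """Flag two images as likely duplicates if their hashes differ in few bits.

    max_distance is an assumed threshold, not a recommended value.
    """
    distance = hamming_distance(average_hash(pixels_a), average_hash(pixels_b))
    return distance <= max_distance


# Hypothetical 4x4 greyscale grids (0 = black, 255 = white).
original = [
    [10, 200, 30, 220],
    [15, 210, 25, 230],
    [12, 205, 28, 225],
    [11, 198, 35, 215],
]
# The same picture scanned slightly brighter.
rescanned = [[value + 5 for value in row] for row in original]
# A different picture with the light and dark columns swapped.
different = [[row[1], row[0], row[3], row[2]] for row in original]

print(is_near_duplicate(original, rescanned))
print(is_near_duplicate(original, different))
```

Because the hash records only which pixels sit above or below the average, a brighter rescan of the same photograph produces the same bit pattern, while a genuinely different image does not. Borderline matches are what the manual review stage would be for.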
Amsterdam is arguably the most aggressive. The Amsterdam City Archives, working with the Netherlands' national digital infrastructure programme, adopted a continuous deduplication model in 2023, under which each new image is checked against existing holdings at the point of upload rather than in retrospective batch audits. The system cost an estimated €340,000 to implement. London bodies have largely stuck with periodic batch reviews, which archivists say leaves the problem compounding between audit cycles.
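The difference between checking at upload and auditing in batches can be sketched in a few lines. The example below is a rough illustration, not a description of Amsterdam's system, whose internals are not public in this reporting. It assumes the simplest possible check, an exact byte-for-byte match using a SHA-256 digest, and the class name, method name and reference numbers are hypothetical.

```python
import hashlib


class UploadIndex:
    """Toy index that checks each new image against existing holdings on upload."""

    def __init__(self):
        # Maps a file's SHA-256 digest to the reference it was first filed under.
        self._first_reference = {}

    def check_and_add(self, reference, image_bytes):
        """Return the earlier reference if these bytes were seen before, else None.

        A file is only added to the index when it is new, so the first
        reference remains the authoritative record for that image.
        """
        digest = hashlib.sha256(image_bytes).hexdigest()
        existing = self._first_reference.get(digest)
        if existing is not None:
            return existing
        self._first_reference[digest] = reference
        return None


index = UploadIndex()
elevation_drawing = b"bytes of a hypothetical elevation drawing"

first_result = index.check_and_add("REF-001", elevation_drawing)
second_result = index.check_and_add("REF-002", elevation_drawing)
print(first_result)   # None: nothing on file yet
print(second_result)  # REF-001: the same file was already uploaded
```

An exact-match check like this catches only identical files. Catching rescans or resized copies would need a near-duplicate technique such as perceptual hashing layered on top.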
Tokyo's approach is more decentralised, mirroring London's fragmented borough structure. Japan's National Archives has a national standard for image metadata that helps prevent duplicates at source, but municipal-level implementation across Tokyo's 23 special wards is patchy. In that respect, London and Tokyo face similar structural headaches: no single authority owns the problem, so no single authority fixes it.
The practical gap for Londoners shows up in planning searches. Anyone pulling historical site images through the Planning Portal or a borough's local list can encounter the same photograph filed under multiple reference numbers, with no flag indicating which is the authoritative version. That creates risk in heritage assessments, where case officers need confidence they are looking at the correct, unduplicated visual record of a listed building or conservation area.
The GLA's digital team is expected to publish its data management recommendations by the end of September 2026. If those recommendations include mandatory deduplication standards for borough planning portals, an option under discussion according to published GLA committee papers, London could move quickly toward the Amsterdam model. For now, progress depends on individual institutions deciding the problem is worth solving on their own budgets, borough by borough, archive by archive.