London's borough councils and the Greater London Authority are sitting on a backlog of duplicated digital images estimated to run into the tens of millions of files, clogging planning portals, slowing NHS document-sharing systems and inflating cloud storage costs at a time when local government budgets are already under serious strain. The scale of the problem has become harder to ignore as the Starmer government pushes major planning reform through Westminster, demanding faster, more transparent decision-making from every tier of local administration.
Duplicate images — identical or near-identical scanned documents stored multiple times across different departmental databases — are not a new phenomenon, but the pace of post-pandemic digitisation has turned a manageable nuisance into a structural headache. Councils that rushed paper planning files online between 2020 and 2023 often did so without deduplication protocols, leaving registries bloated and search results unreliable. For residents trying to pull planning histories on streets like Coldharbour Lane in Brixton or parcels along the Silvertown Tunnel corridor in Newham, the experience can mean wading through dozens of duplicate scans of a single document before finding the original.
What London Is Doing — and Where It Falls Short
Two initiatives are worth watching. The London Digital Planning programme, coordinated through City Hall and backed by funding from the Ministry of Housing, Communities and Local Government (the successor to the Department for Levelling Up, Housing and Communities), is piloting automated deduplication tools across six borough councils, including Southwark and Tower Hamlets. The programme uses perceptual hashing, a technique that identifies visually similar images even when file names or metadata differ, to flag redundant files before archivists delete them. Southwark Council's planning portal alone holds records stretching back to the late 1990s, and early results from the pilot, which launched in January 2026, reportedly identified duplicate rates of more than 30 percent in scanned legacy documents.
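For readers curious how perceptual hashing works in practice, the sketch below shows one of the simplest variants, an "average hash". It is an illustration only: the pilot's actual algorithm has not been published in this article, and the function names, the 8x8 grid and the threshold of 5 differing bits are all assumptions made for the example. Images are represented as plain grids of greyscale values so the code runs without any imaging library.

```python
from typing import List

Grid = List[List[int]]  # greyscale pixels (0-255), one inner list per row


def shrink(pixels: Grid, size: int = 8) -> Grid:
    """Downscale to size x size by averaging equal blocks.

    Assumes the image height and width are multiples of `size`.
    """
    block_h = len(pixels) // size
    block_w = len(pixels[0]) // size
    small = []
    for r in range(size):
        row = []
        for c in range(size):
            block = [
                pixels[y][x]
                for y in range(r * block_h, (r + 1) * block_h)
                for x in range(c * block_w, (c + 1) * block_w)
            ]
            row.append(sum(block) // len(block))
        small.append(row)
    return small


def average_hash(pixels: Grid, size: int = 8) -> int:
    """Return a size*size-bit fingerprint: 1 where a cell is brighter than the mean."""
    flat = [value for row in shrink(pixels, size) for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def likely_duplicates(a: Grid, b: Grid, threshold: int = 5) -> bool:
    """Flag two images as probable duplicates (threshold is an illustrative guess)."""
    return hamming(average_hash(a), average_hash(b)) <= threshold


# A page that is dark on the left and light on the right, a slightly
# brighter rescan of it, and an unrelated page split top/bottom.
scan = [[20 if x < 8 else 220 for x in range(16)] for _ in range(16)]
rescan = [[min(255, v + 10) for v in row] for row in scan]
other = [[20 if y < 8 else 220 for _ in range(16)] for y in range(16)]
```

Here `likely_duplicates(scan, rescan)` is true even though every pixel value differs, while `likely_duplicates(scan, other)` is false. A byte-for-byte checksum would treat all three files as unrelated, which is why this family of techniques suits rescanned paper records.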
The Wellcome Collection on Euston Road, which manages one of the UK's largest digitised medical image archives, began its own deduplication project in 2024 and has been sharing methodology with NHS trusts trying to reduce redundant imaging files across patient record systems. That cross-sector knowledge transfer is one area where London's approach is genuinely ahead of many peers.
New York City, by comparison, has been running a citywide deduplication mandate through its Department of Records and Information Services since 2022, applying it to all borough-level digitisation projects. The mandate requires city agencies (the rough equivalent of councils there) to certify deduplication compliance before new cloud storage contracts are approved. Amsterdam's municipal archive, the Stadsarchief, completed a full deduplication sweep of its digitised canal-district planning records in 2023, cutting storage costs by roughly 22 percent according to the archive's published annual report for that year. London has no equivalent citywide mandate yet.
The Cost Question and What Comes Next
Cloud storage is not free. Government procurement data from the Crown Commercial Service shows that local authorities collectively spend hundreds of millions of pounds annually on cloud infrastructure, and storage of duplicated image files is a recognised inefficiency flagged in multiple National Audit Office reviews of public sector IT. Without a binding deduplication standard, London's 32 boroughs and the City of London are left to address the problem individually, with varying levels of technical capacity and budget.
The GLA has indicated that the Digital Planning programme will expand to all London boroughs by the end of 2027, though that timeline depends on central government funding commitments that have not yet been confirmed for the next spending review period. Campaigners at the Open Data Institute, based in King's Cross, have been calling for a public-facing dashboard that would let residents track deduplication progress across boroughs — a tool Amsterdam already provides through its open-data portal.
For Londoners dealing with planning applications or FOI requests right now, the practical advice is blunt: if you are pulling documents from a borough planning portal, cross-reference file creation dates and document reference numbers manually, because automated deduplication is still patchy. The fix exists. The question is how long it takes City Hall to make it mandatory across the board.
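For anyone doing that cross-referencing on more than a handful of files, the manual check can be sketched in a few lines of Python. This is a rough illustration, not a tool any borough provides: the `PortalFile` record, the reference-number format and the idea of treating the earliest-dated file as the probable original are all assumptions made for the example, and a creation date is a hint rather than proof.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, List, NamedTuple


class PortalFile(NamedTuple):
    """One downloaded file, as noted down by the reader (hypothetical fields)."""
    filename: str
    reference: str  # document reference number as shown on the portal
    created: date   # file creation date as shown on the portal


def group_by_reference(files: List[PortalFile]) -> Dict[str, List[PortalFile]]:
    """Group files that share a reference number, oldest first in each group."""
    groups: Dict[str, List[PortalFile]] = defaultdict(list)
    for f in files:
        # Normalise case and stray spaces so "12/ap/0001 " matches "12/AP/0001".
        groups[f.reference.strip().upper()].append(f)
    return {
        ref: sorted(items, key=lambda f: f.created)
        for ref, items in groups.items()
    }


# Made-up example data; real reference formats vary by borough.
downloads = [
    PortalFile("scan_final.pdf", "12/AP/0001", date(2021, 3, 1)),
    PortalFile("scan_0042.pdf", "12/ap/0001 ", date(2020, 6, 15)),
    PortalFile("site_plan.pdf", "13/AP/0420", date(2022, 1, 10)),
]

for ref, items in group_by_reference(downloads).items():
    if len(items) > 1:
        print(ref, "probable original:", items[0].filename)
```

Any group with more than one entry is a candidate set of duplicates, and the first file in each group is the one to open first.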