Transport for London quietly completed a 14-month audit of its digital signage estate in March 2026, purging roughly 4,200 duplicate wayfinding images from the system that feeds screens across 272 Underground stations. The exercise, run through TfL's Digital Experience directorate, was prompted by a 2024 internal review that found redundant image files were slowing real-time updates on the Elizabeth line — particularly at Paddington and Canary Wharf, where passenger-facing screens had, in some cases, been pulling from three separate copies of the same platform graphic. Nobody publicly flagged the problem until it started causing display lag during the post-Christmas engineering works.
The timing matters. Across local government and public institutions, the pressure to maintain clean, non-duplicated digital asset libraries has intensified sharply since the UK Government's Central Digital and Data Office published its Data Quality Framework in late 2024, setting minimum standards for public-sector image and document repositories. Councils and arm's-length bodies that fail to meet the framework's deduplication benchmarks by April 2027 risk losing access to certain shared infrastructure grants. For a city the size of London, with 33 local authorities, multiple mayoral agencies, and institutions like the British Museum and the Wellcome Collection each running their own digital collections, that deadline is concentrating minds.
What London Is Actually Doing
The Greater London Authority's Smart London Board convened a working group in January 2026 specifically on digital asset governance. The group has been mapping deduplication practices across borough councils, with early findings showing sharp divergence. Hackney Council, which rebuilt its content management infrastructure after its 2020 ransomware attack, now runs weekly automated duplicate detection across all public-facing image libraries. Southwark, by contrast, only began a manual audit of its planning portal image database in February 2026, after a Freedom of Information request revealed the portal contained more than 800 duplicate building-condition photographs submitted by developers.
The British Library's digital preservation team at St Pancras has been operating a deduplication pipeline since 2021, processing roughly 2.3 million digitised items annually through checksum-based matching software. The library does not disclose its full vendor contracts, but its publicly available digital preservation strategy — updated in 2025 — describes a target of less than 0.5 percent duplication across new ingest batches. That is a materially tighter standard than most borough councils currently maintain.
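For readers unfamiliar with the technique, checksum-based matching can be sketched in a few lines. The following is a minimal illustration, not the British Library's pipeline: the choice of SHA-256 and the function names `sha256_of`, `find_exact_duplicates` and `duplication_rate` are assumptions made for this example. It groups files whose bytes are identical and reports the share of a batch that is redundant, the kind of figure a 0.5 percent target would be measured against.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(paths):
    """Group files with identical content; return groups of two or more."""
    groups = defaultdict(list)
    for p in paths:
        groups[sha256_of(p)].append(Path(p))
    return [g for g in groups.values() if len(g) > 1]


def duplication_rate(paths):
    """Fraction of files that are redundant copies of another file."""
    paths = list(paths)
    if not paths:
        return 0.0
    # In each group, one file is the original and the rest are redundant.
    redundant = sum(len(g) - 1 for g in find_exact_duplicates(paths))
    return redundant / len(paths)
```

A batch would fall inside a 0.5 percent threshold when `duplication_rate(batch) < 0.005`. Checksums only catch byte-for-byte copies: a resized or recompressed version of the same photograph produces a different digest and passes unnoticed.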
How London Compares to New York, Berlin and Seoul
New York City's Department of Records and Information Services has been running a centralised deduplication programme since 2022 across its Municipal Archives, covering approximately 900,000 historical photographs. The programme, funded partly through a $3.2 million federal digitisation grant, uses perceptual hashing, a technique that catches near-identical images even when the files differ byte for byte, for example after resizing or recompression. London has no equivalent city-wide programme. Digital asset management remains siloed by institution, with no single mayoral body holding oversight across the full estate.
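Perceptual hashing comes in several variants, and the one New York uses is not specified. As a rough sketch of the idea, the example below implements "average hash", one of the simplest forms: shrink the image to an 8x8 grid of brightness values, mark each cell as brighter or darker than the grid's mean, and compare the resulting 64-bit fingerprints by counting differing bits. The function names and the `max_distance=5` threshold are assumptions for illustration, and decoding an image file into a grid of grayscale values is left out.

```python
def average_hash(pixels, size=8):
    """Compute an average hash from a 2D list of grayscale values.

    `pixels` is a list of rows and must be at least size x size.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    # Downsample to a size x size grid by averaging each block of pixels.
    for r in range(size):
        r0, r1 = r * height // size, (r + 1) * height // size
        for c in range(size):
            c0, c1 = c * width // size, (c + 1) * width // size
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    # One bit per cell: 1 if brighter than the mean, else 0.
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a, b):
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(pixels_a, pixels_b, max_distance=5):
    """Treat two images as near-duplicates if their hashes are close."""
    distance = hamming_distance(average_hash(pixels_a), average_hash(pixels_b))
    return distance <= max_distance
```

Because the fingerprint depends on the broad pattern of light and dark rather than on exact bytes, a rescaled or slightly brightened copy of a photograph lands on the same or a nearby hash, while a genuinely different image lands far away. The threshold trades missed duplicates against false matches and would need tuning against a real collection.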
Berlin's Stadtmuseum consortium, which covers eight municipal museums, mandated shared image-repository standards in 2023, reducing reported storage costs by 18 percent within the first year. Seoul's Smart City Division embedded deduplication requirements directly into procurement contracts for all city-commissioned photography from 2025 onwards, meaning new images are checked against a central register before payment is released. Neither approach has a direct London parallel yet, though the GLA's working group is understood to be examining both models.
The practical consequences of inaction are not trivial. Duplicate images inflate cloud storage bills — relevant to boroughs already operating under tight revenue budgets following multi-year local government funding squeezes — and create legal exposure around image licensing, since duplicate files can circulate with different rights metadata attached to each copy.
The GLA working group is expected to publish recommendations by September 2026. Councils and public bodies that want to get ahead of the April 2027 Data Quality Framework deadline should be auditing their content management systems now, paying particular attention to planning portals and public-health communications libraries, which internal reviews in multiple boroughs have flagged as the highest-duplication categories. The window for orderly remediation is narrowing.