London's public bodies are sitting on a bloated, redundant digital archive. Across councils, NHS trusts, and Transport for London, duplicate images — identical or near-identical photographs and graphics stored multiple times across separate databases — are consuming server space at a scale that IT managers are only beginning to quantify. The problem is not new, but pressure on public spending has made it urgent.
Digital asset management specialists working across the capital have found, in assessments conducted during 2024 and 2025, that duplicate image files can account for anywhere between 30 and 60 percent of total storage on legacy content management systems. For a large NHS trust or a borough council managing planning portals, housing registers and public communications simultaneously, that redundancy translates directly into unnecessary cloud storage costs and slower retrieval times for frontline staff.
What the Numbers Actually Show
The London Borough of Southwark, which overhauled its planning portal in 2024, discovered during its migration audit that roughly 40 percent of the images held in its document management system were duplicates or superseded versions of the same file — often uploaded by different departments with no shared naming convention. Southwark's digital team spent an estimated three months on deduplication work before the new system could go live, according to the council's published digital transformation update from late 2024.
Transport for London publishes an annual technology expenditure figure as part of its budget documents. Cloud and data storage costs have risen year on year since 2019. While TfL does not break out image storage specifically, the organisation manages more than 9,500 CCTV cameras across the network — each generating image and video data continuously — and has acknowledged in published board papers the challenge of managing legacy data efficiently alongside operational systems.
At Great Ormond Street Hospital in Bloomsbury, a 2023 audit of its radiology and clinical imaging archive found duplication rates that were contributing to retrieval delays for clinical staff. The trust subsequently joined NHS England's federated data storage initiative, part of the wider NHS Digital programme, to address the issue. NHS England has said publicly that rationalising duplicate data across trusts is central to its long-term technology strategy.
Why This Matters for London Right Now
The timing is not incidental. Keir Starmer's government has put public sector efficiency at the centre of its spending review, with departments under instruction to identify savings through better use of digital infrastructure before seeking new capital. For London's 32 borough councils and the City of London Corporation, many of which are already managing budgets under severe pressure, the cost of duplicate data storage is a line item that could realistically be cut.
Cloud storage pricing has also shifted. Microsoft Azure and AWS both adjusted enterprise pricing structures in 2024 and 2025, and public sector framework agreements negotiated through Crown Commercial Service mean that councils are now paying more per terabyte than they were three years ago. A borough holding 10 terabytes of redundant image data could be paying in excess of £2,000 a year purely for storage that serves no operational purpose.
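As a rough check on that figure, the arithmetic can be sketched in a few lines of Python. The per-gigabyte price below is a hypothetical placeholder chosen to land near the £2,000 mark, not a rate quoted by Azure, AWS or any Crown Commercial Service framework, and the sketch assumes decimal units (1 TB = 1,000 GB).

```python
GB_PER_TB = 1000        # assumption: decimal terabytes
MONTHS_PER_YEAR = 12


def annual_storage_cost(terabytes, price_per_gb_month):
    """Yearly cost in pounds of holding `terabytes` at a flat monthly rate per GB."""
    return terabytes * GB_PER_TB * price_per_gb_month * MONTHS_PER_YEAR


def implied_price_per_gb_month(annual_cost, terabytes):
    """Monthly price per GB implied by a known yearly bill."""
    return annual_cost / (terabytes * GB_PER_TB * MONTHS_PER_YEAR)


# Hypothetical rate of £0.017 per GB per month (illustrative only).
cost = annual_storage_cost(10, 0.017)

# Working backwards from the article's figure of £2,000 a year for 10 TB.
rate = implied_price_per_gb_month(2000, 10)
```

Working backwards, £2,000 a year for 10 terabytes implies a rate of a little under two pence per gigabyte per month. Actual bills depend on storage tier, retrieval charges and contract terms, none of which this sketch models.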
The Greater London Authority's Smarter London Together roadmap, updated in 2024, identifies data governance and deduplication as priorities for the next phase of the city's digital infrastructure investment. City Hall has not published specific targets for image data reduction, but the framework sets a broad expectation that agencies will conduct regular data quality audits.
For organisations looking to address the problem, the practical steps are relatively straightforward. The standard approach is a phased image audit: catalogue assets by file hash to identify exact duplicates first, then use perceptual hashing tools to find near-duplicates, meaning images that look the same but differ at the byte level after resizing or re-compression. Southwark's experience suggests the work is labour-intensive but not technically complex. The harder challenge is governance: establishing which team owns an image, which version is canonical, and how to prevent re-duplication once a system is cleaned. That requires policy changes, not just software. For London's overstretched digital teams, the capacity to implement those policies, rather than the technology itself, may be the binding constraint.
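The two-pass audit described above can be sketched in Python using only the standard library. This is a minimal illustration, not the tooling Southwark or any other body actually used. The function names, the choice of SHA-256 and the list of file extensions are all assumptions. The second pass uses a simple "average hash" as a stand-in for perceptual hashing. It takes a small grayscale pixel grid as input, on the assumption that each image has already been shrunk and converted by an imaging library, a step not shown here.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Assumed set of image extensions; adjust to the archive being audited.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".tif", ".tiff")


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root, extensions=IMAGE_EXTENSIONS):
    """Pass 1: group image files under `root` by content hash.

    Returns only the groups that contain more than one file.
    """
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in extensions:
            groups[file_sha256(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]


def average_hash(pixels):
    """Pass 2 helper: average hash of a small grayscale grid.

    `pixels` is a list of rows of 0-255 integers (for example 8x8).
    Each bit records whether a pixel is brighter than the grid's mean.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Number of differing bits between two hashes; small means similar."""
    return bin(hash_a ^ hash_b).count("1")
```

Exact duplicates share a digest and can be grouped automatically. Near-duplicates are candidates only: a pair whose hashes differ by a few bits still needs a threshold chosen for the archive in question and, in most cases, a human decision about which version is canonical.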