SharePoint — What We Found & What We Built
CDT’s SharePoint had grown unchecked for nearly a decade: 67,760 files across 22 sites, about 41% of them duplicates (27,650 files, 57.3 GB), and heritage media locked behind Microsoft authentication. Over seven weeks (February–March 2026) we audited everything, extracted intelligence that was previously invisible, and built systems to make it accessible. The cleanup was the enabling step, not the goal.
What We Found
Three weeks of authentication work (device registration, Azure AD app setup, Graph API permissions) were needed before any audit could begin. Once access was established on 3 March, the first full scan revealed the scale of the problem.
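A full scan of this kind can be sketched as a paged walk over a drive's items. The example below is a minimal illustration, not the audit script itself: it assumes the Microsoft Graph `delta` endpoint is used to enumerate a drive, and `fetch_json` is a hypothetical caller-supplied function that performs an authenticated GET and returns the parsed JSON body.

```python
from typing import Callable, Iterable, Iterator, Tuple

GRAPH = "https://graph.microsoft.com/v1.0"


def iter_drive_items(drive_id: str,
                     fetch_json: Callable[[str], dict]) -> Iterator[dict]:
    """Yield every driveItem in a drive, following @odata.nextLink pages.

    fetch_json is an assumed helper: it takes a URL, sends the request with
    a valid bearer token, and returns the decoded JSON response.
    """
    url = f"{GRAPH}/drives/{drive_id}/root/delta"
    while url:
        page = fetch_json(url)
        yield from page.get("value", [])
        # The final page carries @odata.deltaLink instead of @odata.nextLink,
        # so .get() returns None and the loop ends.
        url = page.get("@odata.nextLink")


def summarise(items: Iterable[dict]) -> Tuple[int, int]:
    """Return (file_count, total_bytes), skipping folders and deleted entries."""
    count = 0
    total = 0
    for item in items:
        if "file" in item and "deleted" not in item:
            count += 1
            total += item.get("size", 0)
    return count, total
```

Passing the fetch function in as a parameter keeps token handling separate from the traversal, so the same walk can be re-run against any of the sites once credentials are available.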
Duplication patterns
| Pattern | Wasted | What happened |
|---|---|---|
| Triple finance copies | 31.1 GB | Three officers (SD, SDawe, AH) each had full or partial copies of the same finance records in separate folders |
| Heritage Group mirror | 23.0 GB | Entire VISITOR STREAM heritage collection duplicated wholesale into WORKING GROUPS AREA |
| OneDrive sync artifacts | 1.0 GB | Timestamped copies created automatically during staff account migrations |
| Cross-folder copies | 2.2 GB | Same documents appearing in PROJECTS, OTHER, and VISITOR STREAM folders |
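The "Wasted" figures above follow from grouping files by content hash and counting every copy beyond the first. The sketch below shows one way to do that. The record layout (`path`, `size`, `hash`) is an assumption for illustration; in practice the hash could be, for example, the `quickXorHash` value that Graph reports for SharePoint files. How the audit itself counted duplicates is not specified in this report.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_duplicates(files: Iterable[dict]) -> Dict[str, List[dict]]:
    """Group file records by content hash, keeping only hashes seen 2+ times."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    for record in files:
        groups[record["hash"]].append(record)
    return {h: g for h, g in groups.items() if len(g) > 1}


def wasted_bytes(groups: Dict[str, List[dict]]) -> int:
    """Bytes reclaimable if one copy per hash is kept and the rest removed."""
    return sum(g[0]["size"] * (len(g) - 1) for g in groups.values())


# Hypothetical records, loosely modelled on the triple finance copies.
example = [
    {"path": "FINANCE/SD/ledger.xlsx", "size": 400, "hash": "aaa"},
    {"path": "FINANCE/SDawe/ledger.xlsx", "size": 400, "hash": "aaa"},
    {"path": "FINANCE/AH/ledger.xlsx", "size": 400, "hash": "aaa"},
    {"path": "PROJECTS/plan.docx", "size": 250, "hash": "bbb"},
]
duplicates = group_duplicates(example)
print(wasted_bytes(duplicates))  # two surplus copies of a 400-byte file
```

Grouping by hash rather than by filename catches copies that were renamed or moved, which is what the cross-folder pattern in the table describes.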
What We Unlocked
The real value wasn’t in the cleanup — it was in the intelligence buried across thousands of documents. Five areas of operational data were extracted, structured, and surfaced on the portal for the first time.
Leases & occupancy
PROPERTY/Leases/
65 active leased units, 16 already expired
~£65k/yr unrealised rent identified from vacant or expired units
Policies & governance
ADMINISTRATION/Policies & Procedures/
26 policies (POL-001–026) and 37 procedures found, well-structured numbering
No Safeguarding or Whistleblowing policy — high-risk gaps flagged
Funding history
FINANCE/Funding - Grants/
55 grants across 31 funders, 265 documents analysed
~£4.1M awarded from ~£6.5M applied for (roughly 63% by value); full track record now visible
Heritage assessments
PROJECTS/Heritage Conservation Framework 2025/
89 building surveys + 1,148 evidence photos extracted from 3.8 GB of ZIPs
Condition data, repair priorities, and costs now structured and searchable
Solar project
PROJECTS/SolarPV&BESS/
£254k cost breakdown, heritage constraints, contractor details
Full project dashboard built — previously scattered across 20+ documents
What We Built
New systems were created to make the extracted data accessible and to replace SharePoint as the presentation layer for heritage and operational content.
Cleanup — The Enabling Work
| Action | Detail | Freed |
|---|---|---|
| Duplicate identification | 27,650 duplicate files (57.3 GB) mapped by hash. 392 timestamp artifacts (0.42 GB) removed. Remaining duplicates flagged for manual review | 0.42 GB |
| Site consolidation | 21 satellite sites deleted (10 empty, 11 with minimal content). Energy site archived first | 74 MB |
| Funding folders | 19 COF files merged from 3 locations, 8 empty folders deleted, pre-2023 grants archived | — |
| Blair OneDrive | 1,414 items (photos + videos) migrated from personal OneDrive to R2 gallery | — |
Where We Are Now (March 2026)
SharePoint remains the operational filing system for finance, administration, and HR. Heritage media and project data have moved to purpose-built systems (R2 CDN and this portal) that are faster, public where needed, and structured for search and reporting.
Still To Do
| Item | Detail | Impact |
|---|---|---|
| Delete heritage originals from SharePoint | Photos and videos have been copied to R2 but originals remain on SharePoint | ~33 GB reclaimable |
| Clear recycle bin | Deleted files remain recoverable for 93 days — recycle bin currently holds ~56 GB | Free quota back to tenant |
| Investigate storage growth | CDT site grew from 123.9 GB (Feb audit) to 137 GB (Mar admin centre) despite cleanup — cause unknown | 13 GB unexplained |
| Re-run full audit | Current file count and folder structure not verified since February — need fresh numbers | Accurate baseline for ongoing management |
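Once a fresh audit exists, the unexplained growth can be narrowed down by comparing per-folder totals between two snapshots. The following is a minimal sketch under stated assumptions: each snapshot is a plain mapping of top-level folder name to size in GB, and the folder sizes in the example are hypothetical, not audit results.

```python
from typing import Dict, List, Tuple


def growth_by_folder(before: Dict[str, float],
                     after: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return (folder, change_in_gb) pairs, largest increase first.

    Folders missing from one snapshot are treated as 0 GB in that snapshot,
    so newly created or fully deleted folders still show up in the result.
    """
    folders = set(before) | set(after)
    deltas = [
        (name, round(after.get(name, 0.0) - before.get(name, 0.0), 2))
        for name in folders
    ]
    # Sort by change (descending), then by name so ties are deterministic.
    return sorted(deltas, key=lambda d: (-d[1], d[0]))


# Hypothetical snapshots for illustration only.
february = {"FINANCE": 40.0, "PROJECTS": 30.0, "VISITOR STREAM": 50.0}
march = {"FINANCE": 40.0, "PROJECTS": 36.5, "VISITOR STREAM": 52.0, "OTHER": 1.5}

for folder, change in growth_by_folder(february, march):
    print(f"{folder}: {change:+.2f} GB")
```

If the per-folder changes do not add up to the difference shown in the admin centre, the remainder lies outside the live file tree, which would be the next place to look.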
Data sourced from Graph API audit (February 2026), conversation history analysis (March 2026), and SharePoint admin centre · Full audit logs retained