HEALTHECONOMICS · Data Engineering · Data Science
Integrated Care Analytics
LEVEL RISK STRATIFICATION

Population Health Analytics — Healtheconomics

Healthcare Provider — Integrated Care System · 2023

[THE CHALLENGE]

Patient Records Fragmented Across Trusts.

A Healthcare Provider spanning multiple trusts faced a critical challenge: patient records existed in siloed systems across hospitals, GP practices, community care providers, and mental health services — with no unified view of patient risk at the Integrated Care System level. Clinicians and commissioners making population health decisions were working from partial pictures.

High-risk patients — those with multiple long-term conditions, frequent emergency presentations, or complex social care needs — were invisible until they reached crisis. Early intervention was impossible without a complete, longitudinal patient record that crossed organisational boundaries.

Compounding the challenge, existing data pipelines were manual and SQL-based — slow, brittle, and impossible to scale across a growing provider network. Every new client engagement required bespoke engineering effort, with no standardised framework for rapid onboarding.

The organisation needed a unified patient data platform with automated risk stratification — identifying high-risk cohorts proactively, with explainable clinical reasoning that clinicians could trust and act on.

Build a scalable, Big Data-native patient data platform with automated risk stratification — surfacing high-risk cohorts to clinical teams before crisis presentations, with a templatized onboarding model for rapid provider expansion.

[OUR APPROACH]

Six Phases.
One Platform.

PHASE 01
Data Source Discovery & Governance Framework
Mapped all patient data sources across the Integrated Care System: acute trusts (PAS, EPR), GP systems (EMIS, SystmOne), community care, mental health services, and social care. Established an Information Governance framework covering data sharing agreements, pseudonymisation standards, and access control policies, all designed in alignment with the Healthcare Provider's data governance requirements.
IG Framework · Data Governance · Data Mapping · Access Control
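
As an illustration of the pseudonymisation standards mentioned in this phase, the sketch below derives a stable token from a patient identifier using a keyed hash (HMAC-SHA256). The case study does not say which technique was used, so the function name `pseudonymise`, the normalisation rule, and the key handling are all assumptions made for this example.

```python
import hashlib
import hmac


def pseudonymise(patient_id: str, secret_key: bytes) -> str:
    """Return a stable token for a patient identifier using a keyed hash."""
    # Normalise so "943 476 5919" and "9434765919" map to the same token.
    normalised = "".join(patient_id.split()).upper()
    return hmac.new(secret_key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key for illustration only; a real key would live in a managed secret store.
KEY = b"example-key-not-for-production"

token_a = pseudonymise("943 476 5919", KEY)
token_b = pseudonymise("9434765919", KEY)
print(token_a == token_b)  # the same patient yields the same token in every source
```

Using one key across all source systems is what would let records be linked on the token without exposing the underlying identifier.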
PHASE 02
Data Pipeline Modernisation
Migrated existing SQL-based manual pipelines to a scalable Big Data execution framework built on Databricks and Azure Data Factory — eliminating manual intervention, reducing pipeline latency, and enabling processing at population scale. Developed a templatized onboarding architecture allowing new Healthcare Provider clients to be integrated in a fraction of the time previously required. Standardised data schemas, reusable pipeline components, and automated data quality checks form the foundation of the onboarding template — turning each new client engagement from a bespoke build into a governed, repeatable deployment.
Databricks · Azure Data Factory · Pipeline Modernisation · Templatized Onboarding
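
The idea of standardised schemas paired with automated data quality checks can be sketched in a few lines. This is a minimal, framework-free illustration rather than the platform's actual Databricks implementation; the schema, the column names, and the `check_rows` helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnRule:
    name: str
    dtype: type
    nullable: bool = False


# Hypothetical standard schema that every onboarded provider feed must satisfy.
ENCOUNTER_SCHEMA = [
    ColumnRule("patient_token", str),
    ColumnRule("encounter_date", str),
    ColumnRule("provider_code", str),
    ColumnRule("discharge_date", str, nullable=True),
]


def check_rows(rows, schema):
    """Return a list of (row_index, column, problem) tuples for rule violations."""
    issues = []
    for i, row in enumerate(rows):
        for rule in schema:
            if rule.name not in row:
                issues.append((i, rule.name, "missing column"))
                continue
            value = row[rule.name]
            if value is None:
                if not rule.nullable:
                    issues.append((i, rule.name, "null not allowed"))
            elif not isinstance(value, rule.dtype):
                issues.append((i, rule.name, "wrong type"))
    return issues


rows = [
    {"patient_token": "a1", "encounter_date": "2023-01-04",
     "provider_code": "P01", "discharge_date": None},
    {"patient_token": None, "encounter_date": "2023-01-05",
     "provider_code": 7, "discharge_date": None},
]
issues = check_rows(rows, ENCOUNTER_SCHEMA)
for issue in issues:
    print(issue)
```

Because the rules are data rather than code, onboarding a new provider in this sketch means supplying a feed that passes the same schema, not writing new checks.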
PHASE 03
Unified Patient Record Build
Designed and built a longitudinal patient record on Azure Data Lake, ingesting from 8+ source systems via modernised Data Factory pipelines with the patient identifier as the master key. Patient demographics, diagnosis history, medication, care events, and social determinants were unified into a single Silver-layer patient spine. A full audit trail and lineage are maintained for clinical governance.
Azure Data Lake · ADLS Gen2 · Patient Master Record · Databricks
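
A simplified view of how records from several sources might be combined into one longitudinal record per patient, keyed on a shared identifier and retaining lineage, is sketched below. The feed names, field names, and `build_patient_spine` function are hypothetical; the production build runs on Databricks and is not shown in the case study.

```python
from collections import defaultdict


def build_patient_spine(source_feeds):
    """Merge per-source records into one chronologically ordered record per patient."""
    spine = defaultdict(lambda: {"events": [], "sources": set()})
    for source_name, records in source_feeds.items():
        for rec in records:
            patient = spine[rec["patient_token"]]
            # Keep lineage: every event remembers the source system it came from.
            patient["events"].append({**rec, "source": source_name})
            patient["sources"].add(source_name)
    for patient in spine.values():
        # ISO-8601 date strings sort correctly as plain strings.
        patient["events"].sort(key=lambda e: e["event_date"])
    return dict(spine)


feeds = {
    "acute_pas": [
        {"patient_token": "a1", "event_date": "2023-03-02", "event": "ED attendance"},
    ],
    "gp": [
        {"patient_token": "a1", "event_date": "2023-01-15", "event": "Diabetes review"},
        {"patient_token": "b2", "event_date": "2023-02-01", "event": "Medication change"},
    ],
}
spine = build_patient_spine(feeds)
for event in spine["a1"]["events"]:
    print(event["event_date"], event["source"], event["event"])
```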
PHASE 04
Risk Stratification — ETG & ERG Modelling
Risk stratification was built on Episode Treatment Group (ETG) and Episode Risk Group (ERG) metrics derived from the Symmetry suite of applications, an established methodology for measuring resource utilisation and predicting future healthcare risk. ETG and ERG scores were enriched with population demographic features (age, gender, deprivation index, comorbidity burden) and geographic features (locality, proximity to care settings, regional utilisation patterns) to produce a multi-dimensional risk profile per patient. SHAP values were generated for each prediction, surfacing the top contributing clinical and social factors behind every high-risk flag to support transparent, governance-compliant clinical decision-making.
Symmetry ETG · Symmetry ERG · SHAP Explainability · Demographic Enrichment · Geographic Features
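
To show what "top contributing factors per flag" means in practice, the sketch below ranks per-feature contributions for one patient. It assumes a simple linear risk score, for which the contribution `weight * (value - baseline)` coincides with the SHAP value when features are treated as independent. The case study does not state which model or SHAP explainer was used, and the weights, baselines, and feature names here are invented for illustration.

```python
def top_factors(features, weights, baseline, k=3):
    """Rank features by the absolute size of their contribution to a linear score."""
    contributions = {
        name: weights[name] * (features[name] - baseline[name])
        for name in weights
    }
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:k]


# Hypothetical model weights and population baselines (not from the case study).
weights = {"erg_score": 0.5, "age": 0.01, "deprivation_decile": -0.05, "ltc_count": 0.2}
baseline = {"erg_score": 1.0, "age": 50, "deprivation_decile": 5, "ltc_count": 1}
patient = {"erg_score": 3.0, "age": 80, "deprivation_decile": 1, "ltc_count": 4}

factors = top_factors(patient, weights, baseline)
for name, contribution in factors:
    print(f"{name}: {contribution:+.2f}")
```

For non-linear models, the same ranking step would apply to values produced by a SHAP explainer instead of the closed-form contribution used here.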
PHASE 05
Clinical Intelligence Dashboard
Power BI dashboard built for ICS commissioners and clinical leads, showing population-level risk distribution, cohort drill-down by risk decile, and a patient-level detail view with SHAP explanation cards. Role-based access enforces clinical governance: clinicians see only their registered patients, while commissioners see aggregated population data only.
Power BI · RBAC · Clinical Governance · Risk Decile View
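
The access rule described in this phase can be expressed as a small scoping function. In Power BI this would normally be configured through row-level security roles rather than application code, so the Python below is only a sketch of the logic; the role names, field names, and `view_for` function are assumptions.

```python
from collections import Counter


def view_for(role, user_practice, patients):
    """Return the data a user may see: patient rows for clinicians, counts for commissioners."""
    if role == "clinician":
        # Clinicians see only patients registered with their own practice.
        return [p for p in patients if p["registered_practice"] == user_practice]
    if role == "commissioner":
        # Commissioners see aggregated counts per risk decile, never patient rows.
        counts = Counter(p["risk_decile"] for p in patients)
        return dict(sorted(counts.items()))
    raise PermissionError(f"role {role!r} has no dashboard access")


patients = [
    {"patient_token": "a1", "registered_practice": "GP01", "risk_decile": 10},
    {"patient_token": "b2", "registered_practice": "GP02", "risk_decile": 10},
    {"patient_token": "c3", "registered_practice": "GP01", "risk_decile": 4},
]

clinician_view = view_for("clinician", "GP01", patients)
commissioner_view = view_for("commissioner", None, patients)
print([p["patient_token"] for p in clinician_view])
print(commissioner_view)
```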
PHASE 06
Care Pathway Integration
High-risk flags were integrated with care coordination workflows, generating structured case referrals for community intervention teams. A monthly model refresh pipeline monitors performance by tracking recall against actual emergency admissions over the following 3 months. A clinical validation panel review is built into the governance cycle to ensure model outputs remain clinically credible and actionable.
Care Coordination · Model Governance · Clinical Validation · Automated Refresh
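
The recall check in the monitoring step can be illustrated as follows: of the patients who actually had an emergency admission within 3 months, what share had been flagged as high-risk beforehand? The function name and the example patient tokens are hypothetical, and the real pipeline's cohort definitions are not given in the case study.

```python
def recall_at_flag(flagged, admitted):
    """Share of admitted patients who had previously been flagged high-risk."""
    if not admitted:
        return None  # recall is undefined when there were no admissions
    return len(flagged & admitted) / len(admitted)


# Hypothetical cohorts: tokens flagged at refresh time, and tokens with an
# emergency admission in the 3 months that followed.
flagged = {"a1", "b2", "c3"}
admitted_within_3_months = {"a1", "c3", "d4", "e5"}

recall = recall_at_flag(flagged, admitted_within_3_months)
print(recall)  # 2 of the 4 admitted patients had been flagged
```

Tracking this figure at each monthly refresh gives the validation panel a simple signal for when the model is missing patients it should have caught.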
[KEY OUTCOMES]

The Results.

8+
Source Systems Unified
Acute, primary care, community, mental health, and social care data unified into a single patient spine across the Healthcare Provider network.
ETG/ERG
Clinical Risk Metrics
Episode Treatment Group and Episode Risk Group metrics from the Symmetry suite — enriched with demographic and geographic features for precise risk stratification.
3
SHAP Factors per Patient
Every risk flag explained: the top 3 contributing clinical and social factors are surfaced to clinicians for every high-risk patient identified.
Early
Intervention Enabled
High-risk cohorts identified and referred to community intervention teams before emergency presentation — proactive care at scale.

Commissioners and clinical leads gained a complete, explainable, cross-sector view of population risk — enabling proactive care planning rather than reactive emergency response, on a platform designed to scale across providers.

[TECHNOLOGY USED]

Stack.

DATA PLATFORM
Azure Data Lake (ADLS Gen2) · Azure Data Factory · Databricks · Azure ML
CLINICAL ANALYTICS
Symmetry Suite (ETG / ERG) · SHAP Explainability · Python (Pandas, NumPy, Scikit-learn)
PIPELINE & ONBOARDING
Templatized Onboarding Framework · Azure Data Factory Pipelines · Automated Data Quality Checks
VISUALISATION & GOVERNANCE
Power BI · RBAC / Row-Level Security · Healthcare Data Governance Framework