Healthcare Data Warehouse:
62+ Billion Claims and 2+ Petabyte Size
The Healthcare Data Warehouse, or Integrated Data Repository (IDR), is the centerpiece of the Centers for Medicare & Medicaid Services' (CMS) Enterprise Data Warehouse (EDW) strategy.
The IDR supports the oversight function of Medicare and Medicaid. It provides CMS with an integrated data environment that contains Medicare and Medicaid claims, as well as beneficiary, provider, and plan data.
CMS chose CORMAC as the development prime contractor on the IDR engagement.
Functions Supported by Integrated Data Repository (IDR)
- Medical Trend & Utilization Analysis
- Healthcare Cost Assessment
- Policy Analysis and Development
- Provider Profiling & Management
- Quality and Effectiveness: Pay for Performance
- Rapid Response to Legislative Inquiries
- Program Integrity and Fraud, Waste & Abuse
By building a single unified data repository, CORMAC fulfilled IDR’s vision and provided the following:
- Greater information sharing
- Broader and easier access
- Enhanced data integration
- Increased security
- Improved privacy
- Strengthened query and analytic capability
CORMAC’s implementation of the Agile Scrum method on certain projects resulted in predictable and timely delivery of work products.
Innovations to Bring Business Value
CORMAC used the following innovations to deliver added value:
- Reusable Data Quality Module (DQM): A sophisticated, metadata-driven, highly scalable, reusable module performs data quality checks on the large volume of data processed, leading to higher-quality data.
- Automated testing/validation tool: Automating data validation reduces the time and effort of integrating the data, which increases process efficiency and lowers cost.
- Reusable ETL/ELT framework: Microservices reduce the cost of development and maintenance and promote reuse.
- Data Dictionary User Interface (DDUI): Presenting highly complex IDR metadata in a human-centered user interface provides self-service metadata search, allowing users to better understand the data.
- Row Column Level Security (RCLS): A complex but elegant metadata-driven security solution provides Role-Based Access Control (RBAC) to data at the row/column level.
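As a rough illustration of the row/column-level security idea in the last item, the Python sketch below filters records by role using a small metadata table. It is not CORMAC's implementation. The role names, column names, and the `POLICIES` structure are hypothetical, chosen only to show how access rules kept as metadata can drive both row filtering and column masking.

```python
from dataclasses import dataclass
from typing import Callable

Row = dict[str, object]


@dataclass(frozen=True)
class Policy:
    """Hypothetical access policy for one role, stored as metadata."""
    visible_columns: frozenset[str]    # columns the role may read
    row_filter: Callable[[Row], bool]  # predicate selecting readable rows


# Assumed roles and rules, for illustration only.
POLICIES: dict[str, Policy] = {
    "state_analyst_md": Policy(
        visible_columns=frozenset({"claim_id", "state", "paid_amount"}),
        row_filter=lambda row: row["state"] == "MD",
    ),
    "national_auditor": Policy(
        visible_columns=frozenset(
            {"claim_id", "state", "paid_amount", "beneficiary_id"}
        ),
        row_filter=lambda row: True,
    ),
}


def apply_policy(rows: list[Row], role: str) -> list[Row]:
    """Return only the rows and columns the given role is allowed to see."""
    policy = POLICIES.get(role)
    if policy is None:
        # Deny by default: an unknown role sees nothing.
        return []
    return [
        {col: val for col, val in row.items() if col in policy.visible_columns}
        for row in rows
        if policy.row_filter(row)
    ]


# Made-up sample records.
claims = [
    {"claim_id": 1, "state": "MD", "paid_amount": 120.0, "beneficiary_id": "B001"},
    {"claim_id": 2, "state": "VA", "paid_amount": 75.5, "beneficiary_id": "B002"},
]

md_view = apply_policy(claims, "state_analyst_md")
auditor_view = apply_policy(claims, "national_auditor")
```

Because the rules live in `POLICIES` rather than in the filtering code, adding or changing a role in this sketch means editing metadata only. A production system would typically enforce the same rules inside the database or the access layer rather than in application code.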
Building the Big Data Platform
- Data Architecture: The Enterprise Data Model (EDM), with 1600+ entities, represents major domains in Medicare such as Beneficiary, Provider, Claims and Plans. It encompasses Medicare Parts A, B, C and D, including Encounter and Accountable Care Organization (ACO) claims. The enhanced metadata, which includes information on data sources (i.e., data lineage), helps users better understand the data.
- Data Integration/Data Wrangling: Our work involves large-scale data integration and optimization services, including ETL/ELT, data wrangling, data quality and validation, and automation. We integrated 40+ disparate data sources into a 2+ petabyte IDR data warehouse. These include the Medicare enrollment system; three Medicare shared systems (FISS, MCS, and VMS); Medicare payment systems data from the Common Working File (CWF); many provider data sources such as PECOS, NPICS, QIES, and PV; and Encounter Data.
- Data Analytics: We built the semantic/user access layer, which gives users easy access to the data and a better understanding of it through many analytical tools. The user community is diverse and spread throughout the country. We supported front-end data analytics systems such as the Fraud Prevention System (FPS) and users such as Unified Program Integrity Contractors (UPICs). Data analytics includes predictive data mining, statistical analysis, and reporting/dashboarding using tools such as SAS, MicroStrategy, SAP BusinessObjects (BO), and ArcGIS. CORMAC developed a highly user-friendly Semantic Access Layer following human-centered design (HCD) best practices, which greatly simplified ad hoc querying and data analytics for the IDR end-user community.
- User Training: CORMAC delivered monthly training to end users nationwide. The purpose of the training was to teach participants how to perform data analytics using the IDR.
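The metadata-driven data quality checks mentioned under the Data Quality Module and the data integration work can be pictured with a short Python sketch. This is only an illustration under stated assumptions, not the DQM itself: the rule format, the check names in `CHECKS`, and the sample columns and values are all hypothetical.

```python
from typing import Callable

Row = dict[str, object]

# Hypothetical check functions; each returns True when the value passes.
CHECKS: dict[str, Callable[[object, dict], bool]] = {
    "not_null": lambda value, params: value is not None,
    "in_set": lambda value, params: value in params["allowed"],
    "min_value": lambda value, params: value is not None
    and value >= params["min"],
}

# Rules are plain metadata (assumed format), kept apart from the checking code.
RULES = [
    {"column": "claim_id", "check": "not_null", "params": {}},
    {"column": "claim_type", "check": "in_set",
     "params": {"allowed": {"A", "B", "D"}}},
    {"column": "paid_amount", "check": "min_value", "params": {"min": 0}},
]


def run_checks(rows: list[Row], rules: list[dict]) -> list[tuple[int, str, str]]:
    """Return (row_index, column, check) for every rule a row fails."""
    failures = []
    for index, row in enumerate(rows):
        for rule in rules:
            check = CHECKS[rule["check"]]
            if not check(row.get(rule["column"]), rule["params"]):
                failures.append((index, rule["column"], rule["check"]))
    return failures


# Made-up sample records: the first is clean, the second fails every rule.
rows = [
    {"claim_id": 101, "claim_type": "A", "paid_amount": 250.0},
    {"claim_id": None, "claim_type": "X", "paid_amount": -5.0},
]

failures = run_checks(rows, RULES)
for row_index, column, check in failures:
    print(f"row {row_index}: column '{column}' failed check '{check}'")
```

Keeping the rules as data is what makes a module like this reusable in principle: a new source or column needs new rule entries, not new code.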