Stories that Data Science and Cut Points Tell


Background

It’s that time of year when Star scores are front of mind for everyone in MA. Measure cut points are always a huge source of anxiety, but that’s especially true this year given the uncertainty created by the Tukey change. At Lilac we apply hardcore data science to forecast cut points. It’s intensive work to apply the instructions embedded in CMS’s mathematically and statistically dense technical documentation to years of data. We’re lucky to have a team of data and healthcare nerds who love this kind of thing.

As we dig into the data to get to cut points, we also let our curiosity guide us to a wide range of unexpected learnings and insights about the Star program more broadly. A lot of this comes from conversations with our clients and industry connections, where we hear anecdotal statements along the lines of:

  • “Well that measure is hard to influence”
  • “Those measures are recovering from COVID distortions”
  • “These measures weren’t impacted by COVID at all”
  • “Tukey was a game changer for that measure”
  • And so on

We wanted to see whether the data could validate or disprove these and many other statements. This article details the lessons we learned.

Introduction to our Methodology

For any data science exercise, you need clean data, test cases, really smart subject matter experts (the human kind), and detailed, meticulous definitions of the models to build. For our exercise, all we needed were CMS’s performance data and Lilac’s team of experts. So we started our journey with:

  • Published Star scores for all the contracts and all the measures for all the years that they are available
  • Simulated cut point data, for the years it was available, to account for the Tukey changes
  • Published technical notes for all years
  • A large collection of CMS memos, notices, announcements and final rules
  • A really smart set of mission driven humans with deep expertise in data science, informatics and healthcare

We put all those ingredients together and went to work.

  • We cleaned the data
  • We massaged and normalized the data
  • We read through all the detailed explanations in technical notes
  • We built models to make sense of the data we were seeing
  • We built models to verify that, given a set of inputs, we could reproduce the published outputs

Finally, we arrived at something we felt was telling us stories. A few of them follow.
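The verification step above can be sketched as a simple replay check: given a year’s cut points, recompute each contract’s star rating and compare it to what CMS published. A minimal sketch — the contract IDs, scores, and cut points below are made up for illustration, not real CMS data:

```python
from bisect import bisect_right

def assign_star(score, cut_points):
    """Map a contract's measure score to a 1-5 star rating.

    cut_points is the ascending list [B12, B23, B34, B45]; a score
    below B12 earns 1 star, a score at or above B45 earns 5 stars.
    """
    return 1 + bisect_right(cut_points, score)

# Hypothetical cut points and published ratings for one measure-year.
cut_points = [0.55, 0.66, 0.74, 0.81]  # B12, B23, B34, B45
scores = {"H0001": 0.70, "H0002": 0.83, "H0003": 0.52}
published = {"H0001": 3, "H0002": 5, "H0003": 1}

# Verification: every recomputed star must match the published star.
assert all(assign_star(scores[c], cut_points) == published[c] for c in scores)
```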

 

Lesson 1: Breast Cancer Screening utilization dropped significantly during the COVID years but is recovering.

In keeping with the tradition of the BCS measure coming first in the technical notes, we picked it for our first story.

The above box plot charts every star rating for every contract on the Breast Cancer Screening measure over the past 9 years. The plot lines are the boundaries between star ratings (every contract below the B12 line earned 1 star, every contract between B12 and B23 earned 2 stars, and so on). Each dot on a plot line is that boundary’s cut point for that year.
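Each box in a plot like this is built from the five-number summary of contract rates for a single year. A minimal sketch using only Python’s standard library, with made-up screening rates (real data would cover hundreds of contracts):

```python
from statistics import quantiles

# Hypothetical BCS rates (fraction of eligible members screened)
# for a handful of contracts in one star year.
rates = [0.48, 0.61, 0.66, 0.70, 0.72, 0.75, 0.79, 0.84]

# quantiles(n=4) returns the three quartile boundaries Q1, median, Q3.
q1, med, q3 = quantiles(rates, n=4)
summary = {
    "min": min(rates),
    "q1": q1,
    "median": med,
    "q3": q3,
    "max": max(rates),
}
# Computing this summary for each year, side by side, yields the
# year-over-year box plot described above.
```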

Here are our key takeaways:

  • There was a significant decline in the number of breast cancer screenings during COVID. The trend was pervasive enough that the cut points across all boundaries dropped between 2020 and 2022.
  • The significant uplift in the cut point for the 1-to-2-star boundary (B12) between 2023 and 2024 is the Tukey impact.
  • Removing the lowest-performing outliers raised the cut points, making it harder for low-performing plans to improve their ratings.
  • The cut points have been recovering since 2022 across all boundaries. This directly reflects that utilization is picking up, that needed care is being provided, and that plans are getting back to where they were before COVID.
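The Tukey effect in the second and third bullets is straightforward to illustrate: scores outside the Tukey outer fences (below Q1 − 3×IQR or above Q3 + 3×IQR) are deleted before cut points are set, and because the outliers on this kind of measure sit at the low end, deleting them pushes the bottom of the distribution, and the cut points derived from it, upward. A toy sketch with made-up rates:

```python
from statistics import quantiles

def remove_tukey_outer_outliers(scores):
    """Drop scores outside the Tukey outer fences
    (Q1 - 3*IQR, Q3 + 3*IQR)."""
    q1, _, q3 = quantiles(scores, n=4)
    iqr = q3 - q1
    lo, hi = q1 - 3 * iqr, q3 + 3 * iqr
    return [s for s in scores if lo <= s <= hi]

# Made-up measure rates: a tight main cluster plus one very low outlier.
rates = [0.02, 0.70, 0.71, 0.72, 0.73, 0.74,
         0.75, 0.76, 0.77, 0.78, 0.79, 0.80]
trimmed = remove_tukey_outer_outliers(rates)

# With the low performer deleted, the bottom of the distribution (and
# therefore any cut point derived from it) moves up.
assert min(trimmed) > min(rates)
```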

So, as that picture tells us, we can all back up the statements that COVID had a huge impact on utilization (but not on Star Ratings), that Tukey really made it harder for poorer-performing plans, and that utilization is getting back to the pre-COVID normal.

 

Lesson 2: Timely Decisions about Appeals – When you go 5, you stay there.

Let’s review what is plotted above:

The above box plot charts every star rating for every contract on the Plan Makes Timely Decisions About Appeals measure over the past 9 years. The plot lines are the boundaries between star ratings (every contract below the B12 line earned 1 star, every contract between B12 and B23 earned 2 stars, and so on). Each dot on a plot line is that boundary’s cut point for that year.

Here are our key takeaways:

  1. The shape of each year’s plot looks like an upside-down spinning top. It means that once a plan reaches 5-star territory, it likely stays there. This suggests a fundamentally different process or organizational momentum around ensuring appeal decisions are made on time, and once in place, that process and momentum stay in place.
  2. The lack of volatility in the 5-star cut point boundary indicates that it’s roughly the same-sized cohort of plans in that category.
  3. Just like BCS, this is another measure that saw a significant elevation in expectations because of Tukey. The outliers were, by definition, at the lower band, and removing them lifted all the cut points higher. CMS is sending plans a clear message to get their operations under control and address issues with members. It isn’t just the CAHPS measures; it’s this one too.

 

Lesson 3: Medication Reconciliation Post Discharge is a hard measure to get better at.

Let’s review what is plotted above:

The above box plot charts every star rating for every contract on the Medication Reconciliation Post Discharge measure over the past 9 years. The plot lines are the boundaries between star ratings (every contract below the B12 line earned 1 star, every contract between B12 and B23 earned 2 stars, and so on). Each dot on a plot line is that boundary’s cut point for that year.

Here are our key takeaways:

  1. Notice the size of these orange box plots! They are spread across the spectrum, which means there is no clear, prescribed approach that can be plugged into existing operations to get better at this measure.
  2. The relative stability of the cut points across the years means the situation isn’t changing: plans aren’t finding ways to do better at this measure.
  3. The spreading out of the cut points in recent years indicates that many plans are trying different ways to improve here, but success is elusive.

So we have clear data telling us that plans continue to struggle to get better at this measure.

In addition to staring at the chart very intently and hearing the story it was telling us, we at Lilac wondered why this was the case. Then we came up with a hypothesis. The key was asking ourselves: what do plans really need to do to get better in this area?

This measure requires health plans to have daily visibility into the health of their members and to intervene at the right time for the right group of members. Access to that data in a timely and reliable manner greatly influences attainment of this measure, and that data is not easily or reliably available.

The story here was so clear, we wrote a whole article about it: Flexing Your Star Muscles On Complex Clinical Measures

——————————————–

Lilac Software offers a cloud-hosted, data aggregation and analytics solution for health plans’ most complex problems. Our Star Performance Management Platform gives unprecedented insight into all aspects of the Medicare Advantage Star program, including individual measure performance, best practice remediations for all measures, and forecasting of aggregate scores and revenue.

Lilac’s platform has dedicated reporting and functionality to help manage the toughest Star measures. Reach out to us via the form in this link to learn how we can help your plan perform better in these critical areas.