"Relying upon the death rates of individual surgeons…may lead to 'false complacency'," The Daily Telegraph warns. It reports on an article in The Lancet which argues that recently published NHS data on surgical outcomes is too limited in scope to be useful.
The data, published in June 2013 on the NHS Choices website, currently consists of mortality rates for seven types of surgery.
The Lancet article highlights the fact that most surgeons do not perform enough of the individual procedures each year for patient death rates to be a reliable indication of poor performance. A far greater number of procedures per year would be needed to give enough “statistical power” to show which surgeons were truly performing worse than average.
With only a small number of procedures performed, the number of patient deaths per surgeon in any given year may be the result of chance. As a result, some surgeons may be wrongly identified as underperforming.
The Lancet article also highlights the fact that focusing solely on mortality rates is not particularly helpful for patients. For example, orthopaedic surgeries such as hip replacements have a very low risk of death, but complications such as loosening of the replacement joint, which may require further surgery to correct, are relatively common. These types of post-surgical outcomes should also have been included in the NHS data, the authors argue.
The authors of the Lancet article offer several suggestions for giving a more reliable indication of surgeon performance, including ways to increase the number of procedures analysed.
They suggest:
Overall, this article is useful for both members of the public and professionals in highlighting the possible limitations of analysing patient death rates alone following surgical procedures. This, the authors argue, is a very crude indication of what constitutes a ‘good’ or a ‘bad’ surgeon.
This report was published in the peer-reviewed medical journal The Lancet. It received no specific funding. It was covered fairly by both The Daily Telegraph and BBC News.
The researchers report that, from June 2013 onwards, patient death rates for certain surgical procedures are being reported for individual surgeons as part of the English NHS Commissioning Board’s new policy. Several US states already report similar data, and UK heart (cardiac) surgery mortality data has been reported for a number of years. The aim is to allow patients to be better informed when choosing their surgeon.
However, as the authors of this article highlight, when the overall number of certain procedures performed is low, death rates are not necessarily a good indicator of the surgeon’s overall performance. They say that there is a danger “that low numbers mask poor performance and lead to false complacency”.
The aim of this article was to examine this issue by looking at patient death rates for individual surgeons for adult heart surgery, and also for three specific procedures in three other specialties:
The researchers wanted to answer the following questions:
The researchers then gave suggestions on how surgeon performance could be assessed meaningfully. They used figures on numbers of surgeries and deaths from national sources such as Hospital Episode Statistics and the National Institute for Cardiovascular Outcomes Research. As such, these are likely to represent the best national figures available.
The researchers’ calculations involved some assumptions about what would constitute poor performance. For example, they defined a surgeon whose surgical mortality rates were double the national average as performing poorly. Defining poor performance differently would change the results of the calculations.
The median (average) number of heart procedures each heart surgeon performs per year is 128. For the other specific procedures examined, the median number of procedures performed per surgeon per year is far lower:
Next, the researchers related this to how many procedures per surgeon would be needed to give enough statistical power to identify poorly performing surgeons accurately. Statistical power here means the probability that a surgeon with truly poor performance would be detected as performing significantly worse than average.
The higher the statistical power, the higher the probability of identifying the poorly performing surgeons. A power value of 80% would mean that out of 10 poorly performing surgeons, eight would be identified, while 60% power would mean that out of 10 poorly performing surgeons, six would be identified, and so on.
Of all the patients who undergo heart surgery across the UK, national mortality data shows that 2.7% die following the procedure. While the average number of heart surgeries per surgeon seems high at 128 per year, in fact:
For the other surgeries the figures are as follows:
Overall, the findings show that, given the small number of procedures performed per surgeon per year, using annual deaths as a measure of performance would miss many underperforming surgeons. If each surgeon were able to perform the large number of procedures required to give adequate statistical power, then death rates would be better at identifying the surgeons who are performing worse than average.
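To make the idea of statistical power concrete, the sketch below estimates the chance of flagging a heart surgeon whose true mortality rate is double the national 2.7%, for a given number of procedures. The 2.7% rate, the median of 128 procedures a year and the "double the average" definition come from the article. The choice of a one-sided exact binomial test at a 2.5% significance level is an assumption made for illustration and is not necessarily the method used in the Lancet article, and the function names are hypothetical.

```python
from math import exp, lgamma, log


def binom_pmf(k, n, p):
    """P(X == k) for X ~ Binomial(n, p), computed in log space for stability."""
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log(1 - p))
    return exp(log_pmf)


def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return max(0.0, 1.0 - sum(binom_pmf(i, n, p) for i in range(k)))


def critical_deaths(n, baseline, alpha=0.025):
    """Smallest death count that an average surgeon reaches with
    probability <= alpha (alpha is an assumed significance level)."""
    for k in range(n + 2):
        if upper_tail(k, n, baseline) <= alpha:
            return k


def power(n, baseline, ratio=2.0, alpha=0.025):
    """Probability that a surgeon whose true mortality is `ratio` times
    the baseline reaches the critical death count in n procedures."""
    k = critical_deaths(n, baseline, alpha)
    return upper_tail(k, n, baseline * ratio)


if __name__ == "__main__":
    # 128 heart procedures a year at 2.7% mortality (figures from the article)
    for years in (1, 3, 5):
        n = 128 * years
        print(f"{years} year(s), {n} procedures: power = {power(n, 0.027):.2f}")
```

Under these assumptions the printed power rises as more years of data are pooled, which is the pattern the researchers describe.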
Based on the numbers of surgeries performed over three years, 75% of UK heart surgeons perform sufficient procedures to give 60% power to use death rates to identify the poorly performing surgeons. Just over half (56%) perform enough procedures to give the more reliable 80% power.
For hip surgery the numbers are similar, but for other procedures, the proportion of surgeons achieving high enough numbers of surgeries is much lower. Over a three-year period:
However, the researchers demonstrate that extending the time over which a surgeon’s figures are examined (to measure more procedures) gives better power.
The figures detailed above relate to data collected over three years. Increasing the observation period to five years would increase the proportion of surgeons who perform sufficient procedures to give the same levels of power. However, increasing the observation period would mean it would take longer to identify underperforming surgeons.
Conversely, if the time frame were decreased to one year rather than three, very few surgeons would have performed enough procedures to give adequate power. Only 16% of heart surgeons and 4% of surgeons performing hip surgery carry out enough procedures in a year to achieve 60% power, and no surgeons do so for the other two procedures.
The researchers also highlight that even if a surgeon is identified as a poor performer using death rates, they may not truly have poor performance.
The exact number correctly identified will vary depending on how many procedures each surgeon performs, how common poor performance is and the threshold set for considering a difference in performance to be statistically significant.
The authors estimated that if only one in 20 cardiac surgeons truly had poor performance, 63% of the surgeons flagged as poor performers would be correctly identified on the basis of the average number of procedures in three years. For the other procedures the corresponding figures would be:
The remainder of surgeons identified as having poor performance would only fall into this category due to chance.
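The share of flagged surgeons who are truly poor performers can be worked out from the three factors listed above. The sketch below is a minimal illustration: the one-in-20 prevalence comes from the article, while the 80% power and 2.5% false-positive rate are assumed inputs chosen for illustration, not values stated in the source. The function name is hypothetical.

```python
def share_truly_poor(power, prevalence, alpha):
    """Fraction of flagged surgeons who really are poor performers.

    power      -- chance a truly poor surgeon is flagged (assumed input)
    prevalence -- fraction of surgeons who truly perform poorly
    alpha      -- chance an average surgeon is flagged by chance (assumed input)
    """
    true_flags = power * prevalence
    false_flags = alpha * (1 - prevalence)
    return true_flags / (true_flags + false_flags)


if __name__ == "__main__":
    # One in 20 poor performers (from the article); power and alpha are assumptions
    share = share_truly_poor(power=0.80, prevalence=1 / 20, alpha=0.025)
    print(f"{share:.0%}")  # prints 63% for these assumed inputs
```

Because truly poor performers are rare, even a small false-positive rate applied to the large group of average surgeons produces a sizeable number of surgeons flagged by chance.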
There is also the possibility that experienced surgeons would be identified as having poor performance. A consultant with many years of experience may be more likely to operate in very high-risk cases where patients have multiple complex health problems, and these types of surgery have a much higher risk of mortality through no fault of the surgeon.
As these findings show, when using patient death rates, not all surgeons identified as having higher death rates will necessarily have poorer performance, and vice versa.
The researchers suggest a number of options for improving the power to detect poor performance:
The researchers also make the point that mortality rates for types of surgery with a low risk of death may not be particularly useful when it comes to informed patient choice. Other post-operative outcomes, such as post-operative bleeding, infection or persistent pain, or emergency readmission rates, could provide a better assessment of surgical performance.
The authors conclude by making the following recommendations for better public reporting of surgeon outcomes:
Overall, this article is useful for both members of the public and professionals in highlighting some important limitations of using patient death rates following surgical procedures as the sole indication of ‘good’ or ‘bad’ surgeons.