
Monitoring Surgical Performance

- a method that is accepted and used by cardiac surgeons

by Jocelyn Lovegrove, Chris Sherlaw-Johnson and Steve Gallivan

OR is ultimately dependent on the implementation of improved methods. This paper describes a project which has been successful in this, and illustrates how close collaboration with clinicians was necessary.


In recent years the level of surgical interest and activity in clinical audit has greatly increased. This has been fuelled not only by recent high profile investigations and the prospect of ‘league tables’ of surgical centres, but also by a longer term recognition of clinical audit as a valuable tool for improving professional standards and reducing complication rates.

By its very nature, cardiac surgery involves extensive and delicate procedures often carried out in circumstances where a patient has an immediately life-threatening condition. Some perioperative mortality (death during or within 30 days of an operation) will therefore occur in spite of the skill and efforts of surgeons. In the context of clinical audit, the question which must be asked is: do the perioperative mortality records for an individual surgeon reflect acceptable performance or is there cause for concern?

Most analyses of cardiac surgical outcomes use a retrospective inspection of data from surgical records to compile perioperative mortality rates as an average over a period typically taken as one year. This is a coarse method of analysis with many disadvantages. For example, it may hide runs of good and bad performance and so delay the discovery that a surgeon’s performance has fallen below standard.

Pioneering work by de Leval introduced a concept new to cardiac surgery. An eminent surgeon, de Leval decided to analyse his own mortality figures for a particularly challenging operation known as the arterial switch. The procedure is carried out on new-borns with a particular heart defect which has an attendant high risk of death. He used the ‘cusum’ technique to examine cumulative mortality over a series of his operations. He also used the cusum technique to examine ‘near misses’ - complications that did not lead to death, but which gave cause for concern. The results of this analysis led him to conclude that his performance could be improved and he subsequently spent time retraining to refine his surgical technique. Courageously, he published the results of this experience (de Leval et al, 1994), which has spurred other cardiac surgeons to investigate ways of carrying out similar analyses of their own performance.

One such surgeon is Professor Tom Treasure of St George’s Hospital in London, who has a particular interest in surgical audit. Professor Treasure realised that assessing overall performance for a mixed surgical caseload presented difficulties. Different procedures are known to entail different risks, and a patient’s preoperative condition plays a large part in determining the chances of survival. An apparently poor cusum plot could well result if the case mix concerned was inherently more difficult than average.

Several studies have investigated preoperative risk factors with a view to constructing a scoring system reflecting the probability of perioperative mortality. For cardiac surgery, the accepted method is due to Parsonnet (Parsonnet et al, 1989) which used unconventional statistics (Spiegelhalter, 1992) and is based on data from a number of American centres. There have also been many surgical developments since Parsonnet’s original work. As a consequence the scoring system is recognised to overestimate contemporary British mortality risks, in some cases giving a patient’s probability of perioperative mortality as in excess of 100%.

Professor Treasure wanted to be able to incorporate patient and procedure risks into de Leval’s cusum method to provide surgeons with a more informative and comprehensible summary of their performance. He discussed this with members of the Clinical Operational Research Unit (CORU) at UCL, collaborators on many previous projects.

The resulting research has led to the development of a new graphical technique which can be used to examine surgical performance taking account of heterogeneity of case mix. The method is simple to understand and, most important, is acceptable to surgeons. Although some technical features have been used for estimating perioperative risk probabilities, the details of these remain hidden from the surgeon, making the method easy to use. The method, termed a variable life-adjusted display (VLAD), weights both death and survival according to an objective and explicit prior estimate of risk, and provides a clear overall picture of surgical performance.

In this paper, we describe the development of the VLAD method and the close collaboration between clinicians and OR analysts required to achieve successful results. Technical details of the method have been described elsewhere (Gallivan et al, 1997). Here an overview of the research process is given which it is hoped will provide insight into the nature of clinical operational research.


Professor Treasure allowed the OR researchers access to the cardiothoracic surgery unit and to a large database of case records. This comprised routinely collected data for all patients who underwent cardiac surgery at St George’s Hospital in London during the four-year period to the end of 1995.

Before analysis could begin, a good understanding of what exactly the data represented was required, including an in-depth understanding of the physiological processes behind the numbers. To this end there was an intensive period of familiarisation. This involved a literature review, numerous question and answer sessions with staff at St George’s, visits to operating theatres, intensive care units and wards, and attendance at early morning seminars about all aspects of life in the cardiothoracic unit. Familiarisation provided the necessary context for research and gave an appreciation of the concerns and sensitivities of the staff involved. This immersion into the life of the unit was also essential in order to gain a view of what would be required of any new analytical methods if they were to be useful. Cardiac surgeons are very highly skilled but not, by and large, in mathematical fields. Although classical statistical methods are applicable to the overall analysis of mortality data, in the present context they often obscure rather more than they illuminate. Complicated analysis leads to poor understanding and thus mistrust of the methods being used.

Data analysis

Once familiarisation was underway, the first analytical objective was to devise an alternative method of estimating the risks of perioperative mortality that reflected contemporary British surgical practice and so improved estimation accuracy. This method was constructed and tested using 4,318 case records. The data was fully computerised, had been validated and was maintained to a very high standard by a clinician within the surgery unit.

The vast majority of cases were patients undergoing isolated coronary artery bypass graft (CABG) operations. In surgical terminology, ‘isolated’ is used to indicate that only one procedure is carried out during the operation, to distinguish from cases where, say, a bypass and a valve replacement are performed. The remaining cases were evenly split between isolated valve operations and all other procedures. These ‘other’ procedures include combined CABG and valve procedures, heart transplantation, surgery of the aorta, and other surgical procedures for ischaemic heart disease.

To produce an estimation method that was adequate for the majority of cases, logistic regression models were derived for the perioperative mortality of these three categories of procedure: isolated CABG; isolated valve; and ‘other’. For each of the three models, a 70:30 split was used: the larger random sample being for initial analyses and model construction, and the smaller for model validation. Full technical details are reported elsewhere (Gallivan et al, 1997).
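The construction and validation split, and the way a logistic model turns risk factors into a probability, might be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the random seed and the example coefficients are hypothetical and are not taken from the St George’s models, whose details are in the paper cited above.

```python
import math
import random


def split_records(records, construction_fraction=0.7, seed=0):
    """Randomly partition case records into a larger construction sample
    and a smaller validation sample (70:30 by default)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seed is an arbitrary choice
    cut = round(construction_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def predicted_risk(intercept, coefficients, factors):
    """Logistic model: estimated probability of perioperative mortality
    for one patient, given numeric preoperative risk factors."""
    linear = intercept + sum(c * x for c, x in zip(coefficients, factors))
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical coefficients and risk factors, for illustration only.
construction, validation = split_records(range(1000))
risk = predicted_risk(-3.5, [0.9, 1.4], [1, 0])
```

One property of the logistic form is that the estimated probability always lies strictly between 0 and 1, whatever combination of risk factors a patient has.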

Display of surgical performance

The cusum method introduced to surgeons by de Leval monitors surgical performance by displaying actual cumulative mortality for a series of operations. As a method, the cusum has great strengths because surgeons understand and accept it; however, it takes no account of the heterogeneity of case mix. The challenge was to construct something which is as easy to understand whilst incorporating the rather technical information encapsulated in the preoperative risk-scores.
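As a rough illustration of the cusum as described here (a running count of deaths over a series of operations), the sketch below compares two hypothetical series with the same overall mortality rate. The data and the function name are invented for the example.

```python
from itertools import accumulate


def cusum_deaths(outcomes):
    """Cumulative number of deaths after each operation.
    Each outcome is 1 for a perioperative death and 0 for survival."""
    return list(accumulate(outcomes))


# Two hypothetical series of 12 operations, each with 3 deaths in total.
evenly_spread = [0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1]
late_cluster = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1]

print(cusum_deaths(evenly_spread))  # [0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3]
print(cusum_deaths(late_cluster))   # [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3]
```

Both series give the same average mortality rate, but the second cusum shows a run of three consecutive deaths at the end that an average would hide. Neither plot says anything about how risky the individual cases were.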

A process of repeated prototyping was undertaken. Based on experience of the cusum method, it was clear that surgeons felt comfortable with graphical methods and preferred them to complicated summary tables. However, there are many different ways of displaying graphical information and the process of determining the most appealing and understandable method relied on much trial and error. This process was made difficult by the fact that what may be obvious to a mathematician is usually far from clear to a surgeon. A series of different designs was considered, refined and amended, technically and presentationally, initially within CORU and then together with our collaborators at St George’s. The final design eventually achieved the aims of combining simplicity of concept with sufficient information to take case-mix heterogeneity into account and, most important, acceptability to surgeons.

The method is summarised as follows: using risk scoring methods, we can calculate the probability of perioperative mortality for a given patient and hence expected cumulative mortality for a given series of operations which can be plotted in the same way as the cusum (Figure 1). The difference between the expected cumulative and actual cumulative mortality describes surgical performance over time. Plotting this difference (Figure 2) gives what has become known as a Variable Life Adjusted Display (VLAD).
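The calculation just described can be sketched in a few lines. This is a minimal illustration rather than the software used at St George’s: the function name and the example risks and outcomes are hypothetical, and the risks are assumed to come from some preoperative risk model.

```python
from itertools import accumulate


def vlad(risks, outcomes):
    """Expected cumulative mortality minus actual cumulative mortality
    after each operation in a series.

    risks    -- estimated probability of perioperative death per patient
    outcomes -- 1 if the patient died, 0 if the patient survived
    """
    if len(risks) != len(outcomes):
        raise ValueError("risks and outcomes must have the same length")
    # A survival moves the plot up by the patient's risk p;
    # a death moves it down by 1 - p.
    steps = [p - died for p, died in zip(risks, outcomes)]
    return list(accumulate(steps))


# Hypothetical series of four operations; the third patient died.
risks = [0.1, 0.2, 0.5, 0.1]
outcomes = [0, 0, 1, 0]
plot_points = vlad(risks, outcomes)
```

In this example the plot rises by 0.1 and then 0.2 for the first two survivors, falls by 0.5 for the death of a patient whose estimated risk was 0.5, and rises by 0.1 for the final survivor. The death of a high-risk patient therefore counts against the surgeon less than the death of a low-risk one.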

The key to the VLAD is the direction of the line, rather than the height of a given point in relation to the horizontal axis, as the latter will depend on when monitoring began. In Figure 3, the VLADs of three hypothetical surgeons exhibit extremes of performance. Surgeon A’s performance is initially as good as expected and then becomes better than expected; Surgeon B shows good performance with a bad run which is then recovered; and Surgeon C shows performance that is below what would be expected. In their final sections, plots for Surgeons A and B share the same direction and so they are performing equally well, despite the chart for Surgeon A lying above the x-axis while that for Surgeon B lies below it.
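One simple way to put a number on ‘direction’ is the net change in the plot over the most recent stretch of operations. The sketch below is only an illustration of that idea: the function name, the window length and the plotted values are all hypothetical, and the published method relies on visual inspection of the chart rather than on any such summary.

```python
def net_change(vlad_values, window):
    """Net change in a VLAD plot over its last `window` steps.
    Positive values mean fewer deaths than expected over that stretch."""
    if not 0 < window < len(vlad_values):
        raise ValueError("window must be between 1 and len(vlad_values) - 1")
    return vlad_values[-1] - vlad_values[-1 - window]


# Hypothetical readings taken at equal intervals from two VLAD plots.
surgeon_a = [0.5, 1.0, 1.5, 2.0, 2.5]       # lies above the axis
surgeon_b = [-3.0, -2.5, -2.0, -1.5, -1.0]  # lies below the axis

print(net_change(surgeon_a, 4))  # 2.0
print(net_change(surgeon_b, 4))  # 2.0
```

The two plots sit at different heights, yet the net change over the final section is the same, which is the sense in which the two surgeons are performing equally well.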

Figure 1 : Actual and expected cumulative mortality for a single hypothetical surgeon

Figure 2 : Variable life-adjusted display for a single hypothetical surgeon

Figure 3 : Variable life-adjusted displays for three hypothetical surgeons

Computer software was developed to calculate and display the VLAD and is now used as part of the routine audit process being carried out in the cardiothoracic unit at St George’s Hospital, London.


Once a method had been developed that was accepted and used by surgeons at St George’s, the final stage was to gain acceptance for the method in the wider surgical community. The experience of our clinical collaborators was invaluable in this. Dissemination of new methods takes place at many different levels. Formal avenues included the annual conferences of the relevant societies and Royal College, as well as general and specialist medical journals. Equally important, however, were the informal contacts between surgical colleagues.

As a result, other centres in the UK have now implemented the VLAD method, combining it with their own risk scores. It has also been used as part of two inquiries, including that carried out by the UK’s Royal College of Surgeons into surgical performance at a hospital in Bristol (Treasure et al, 1997).


As a result of the close collaboration with the team at St George’s, coming to understand their culture of clinical audit and the context in which any new methods would be used, the VLAD has developed as a method which appeals to the surgeons who make use of it. It is seen as an easily understood graphical technique that summarises risk and outcome, enabling surgeons to monitor their own operative performance. This in turn allows early awareness of potential problems and appropriate remedial action.

The VLAD method has contributed to the debate about surgical performance and, unusually for a purely analytical method, has been described in a leading medical journal, the Lancet (Lovegrove et al, 1997). This has resulted in increased awareness and use of the method within surgical circles, as well as stimulating new research to investigate the use of similar tools in other surgical specialties. Work is currently underway investigating its use in breast surgery.


We are grateful to Professor Tom Treasure and all the surgeons at St George’s Hospital Medical School for their willing co-operation in the process of entering and validating data; and to Oswaldo Valencia who runs the unit database and is responsible for its completeness and validation.

For the interested reader

  • de Leval MR, Francois K, Bull C, Brawn W, Spiegelhalter D (1994) ‘Analysis of a cluster of surgical failures. Application to a series of neonatal arterial switch operations’, The Journal of Thoracic and Cardiovascular Surgery; 107(3): 914-924.
  • Gallivan S, Lovegrove J, Sherlaw-Johnson C, Valencia O, Treasure T (1997) ‘Risk and performance in cardiac surgery’ Proceedings of the 23rd meeting of the Operational Research Applied to Health Services (ORAHS) Working Group, Trondheim, Norway: 97-108.
  • Lovegrove J, Valencia O, Treasure T, Sherlaw-Johnson C, Gallivan S (1997) ‘Monitoring the results of cardiac surgery by variable life-adjusted display’, Lancet; 350: 1128-30.
  • Parsonnet V, Dean D, Bernstein AD (1989) ‘A method of uniform stratification of risk for evaluating the results of surgery in acquired adult heart disease’, Circulation; 79 (suppl. 1): 3-12.
  • Spiegelhalter DJ (1992) ‘Risk stratification for open heart surgery’, British Medical Journal; 305: 1500.
  • Treasure T, Taylor K, Black N (1997) ‘Independent Review of Adult Cardiac Surgery - United Bristol Healthcare Trust’.

JOCELYN LOVEGROVE has been a Research Fellow at the Clinical Operational Research Unit, UCL, since completing her MSc in Management Science and Operational Research at the University of Warwick with distinction in 1995.

CHRIS SHERLAW-JOHNSON is a Senior Research Fellow at the Clinical Operational Research Unit, UCL. After completing his MSc in Operational Research at Lancaster University in 1985, he worked as a Higher Scientific Officer in the Department of Trade and Industry and a Senior Scientific Officer for HM Customs and Excise before joining CORU in 1990.

STEVE GALLIVAN (Professor) is the Director of the Clinical Operational Research Unit, based at University College London. After a PhD in Pure Mathematics he saw the light and turned to more practical endeavours. He worked first in Traffic Science, at both the Transport and Road Research Laboratory and at UCL. Then research interests changed radically and since 1985, he has been applying OR to problems associated with health care.

First published to members of the Operational Research Society in OR Insight, July-September 1998