  Vol. 134 No. 9, September 1999
  Original Article

Clinical Benefit of a Diagnostic Score for Appendicitis

Results of a Prospective Interventional Study

Christian Ohmann, PhD; Claus Franke, MD; Qin Yang, PhD; and the German Study Group of Acute Abdominal Pain

Arch Surg. 1999;134:993-996.

ABSTRACT

Hypothesis  Clinical use of a diagnostic score improves decision making in acute appendicitis.

Design  A before-and-after trial comparing a group of patients undergoing standard diagnostic workup with no additional diagnostic support (phase 1) with a group of patients undergoing additional diagnostic support with a score (phase 2).

Setting  Eight departments of surgery in Germany and Austria.

Patients  Eight hundred seventy patients with acute abdominal pain in phase 1 (October 1, 1994, to April 30, 1995) and 614 patients in phase 2 (February 1, 1995, to August 15, 1995).

Interventions  Structured and standardized history and clinical investigation in all patients with computer-based documentation; introduction of the diagnostic score after phase 1 and computer-supported use of the score in phase 2.

Results  The 2 groups were comparable with respect to signs, symptoms, and investigations related to acute appendicitis. Diagnostic performance of the final examiner decreased with the score (specificity, 86% without vs 78% with the score; positive predictive value, 67% vs 50%; and accuracy, 88% vs 81%). There were no differences in the rates of perforated appendix, appendectomy with normal findings, and complications; however, the delayed appendectomy rate (2% with vs 8% without the score) and the delayed discharge rate (11% vs 22%) were significantly lower with diagnostic support by the score (P=.02).

Conclusions  Integration of a score into the diagnostic process may have unforeseen clinical effects. The tested score cannot be recommended as a standard tool for diagnostic decision making in acute appendicitis.



INTRODUCTION

The early and accurate diagnosis of acute appendicitis is still a difficult problem.1 Despite introduction of ultrasound and special laboratory investigations (eg, C-reactive protein), high diagnostic error rates are observed.2 As a consequence, perforation rates and rates of appendectomy with normal findings of 15% and more occur.3

In the last few years, several scoring systems have been developed for supporting the diagnosis of acute appendicitis.4-12 Initial evaluation studies have reported excellent results, indicating that scoring systems would be ideal as diagnostic aids because they have good performance and require no special equipment, being user-friendly and comprehensible to the clinician.1, 7, 10-12 However, the clinical benefit of a diagnostic score integrated into the diagnostic process has not been investigated so far in a prospective study with adequate methods. We therefore performed such a study with the use of a diagnostic score developed and evaluated in Germany.


PATIENTS AND METHODS

The investigation was performed as a multicenter prospective interventional study with 8 German or Austrian surgical hospitals, including 3 university hospitals. Included were all patients with acute abdominal pain within 1 week before hospital admission. Excluded were patients with postoperative acute abdominal pain, trauma, or hernia; children less than 6 years old; patients who gave no informed consent; and patients with no definite final diagnosis. Acute appendicitis was diagnosed only on histopathological grounds according to the following criteria: macroscopic signs: intravascular injection of the serosa; fibrinous, purulent film; edematous, hemorrhagic, necrotic changes of the wall; and blood (not sufficient) or pus on opening of the appendix; microscopic signs: focal or expanded erosion, ulceration, abscess, fistula, necrosis, or perforation.

Not sufficient were fibrosis taken as evidence of subsided inflammation, intravascular injection of the serosa as the only finding, and description of few granulocytes.

Perforation had to be proved on histopathological grounds. There was no option for diagnosing "chronic appendicitis" or "subacute appendicitis." In the case of outpatients, a follow-up was performed after 30 days (telephone interview).

In all patients, a structured and standardized history and clinical investigation were performed according to international standards. Data were documented with a user-friendly computer program and form-based data entry.13 In case of computer breakdown, forms were available for data collection.

The study was performed in 2 consecutive phases: phase 1, no additional diagnostic support (4 months); and phase 2, diagnostic support with a score based on history, clinical examination, and basic laboratory data (4 months) (Table 1).


Table 1. Diagnostic Score for Acute Appendicitis*


The diagnostic score was introduced into the hospitals after phase 1 in several ways: distribution of a publication, presentation in training sessions and clinical conferences, and posting in the outpatient ward. The score was integrated into the computer program and automatically presented after data input of the history, clinical examination results, and basic laboratory data. The diagnosis of the final examiner after all investigations, including special laboratory investigations, ultrasound, and x-ray (in the majority of cases, the final examiner was a senior surgeon), the final diagnosis at discharge, and the outcome of disease were documented prospectively with the computer program. Comparability of the study groups was investigated for signs and symptoms related to acute appendicitis, the distribution of the final diagnoses, and the diagnostic investigations performed.

The outcome criteria were the diagnostic accuracy of the final examiner with respect to appendicitis (sensitivity, specificity, positive and negative predictive value, and accuracy), the perforated appendix rate, the rate of appendectomy with normal findings, the rate of laparotomy with normal findings, the delayed appendectomy rate, the complication rate, and the delayed discharge rate. For the outcome criteria, the following definitions were used: perforated appendix rate, proportion of patients with acute appendicitis who had a histologically proved perforation; negative appendectomy rate, proportion of patients with appendectomy in whom no appendicitis was found; negative laparotomy rate, proportion of laparotomies that were unnecessary (no intraoperative or histological diagnosis); delayed appendectomy rate, proportion of patients with appendicitis in whom the appendectomy was performed the second day or later after admission; and delayed discharge rate, proportion of patients with appendicitis who were discharged 10 days or later after admission.
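The accuracy measures listed among the outcome criteria all follow from a 2x2 table of the final examiner's diagnosis against the histopathological result. The sketch below is illustrative only: the class name, the function name, and the counts are hypothetical and are not taken from the study tables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 table: examiner's diagnosis vs. histopathological result."""
    true_pos: int   # appendicitis diagnosed and confirmed
    false_pos: int  # appendicitis diagnosed but not confirmed
    false_neg: int  # appendicitis present but not diagnosed
    true_neg: int   # appendicitis neither diagnosed nor present


def diagnostic_performance(c: ConfusionCounts) -> dict:
    """Return the five accuracy measures as fractions between 0 and 1."""
    total = c.true_pos + c.false_pos + c.false_neg + c.true_neg
    return {
        "sensitivity": c.true_pos / (c.true_pos + c.false_neg),
        "specificity": c.true_neg / (c.true_neg + c.false_pos),
        "positive_predictive_value": c.true_pos / (c.true_pos + c.false_pos),
        "negative_predictive_value": c.true_neg / (c.true_neg + c.false_neg),
        "accuracy": (c.true_pos + c.true_neg) / total,
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = ConfusionCounts(true_pos=90, false_pos=30, false_neg=10, true_neg=170)
for name, value in diagnostic_performance(example).items():
    print(f"{name}: {value:.1%}")
```

Because the positive predictive value divides by all positive diagnoses, it drops whenever appendicitis is suspected more often without a matching rise in confirmed cases.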

Statistical comparisons between the 2 phases were performed with the χ² test excluding missing data.
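A between-phase comparison of this kind can be sketched for a single binary outcome as follows. This is a minimal illustration, assuming a Pearson chi-square statistic on a 2x2 table with 1 degree of freedom and no continuity correction; the text does not say whether a correction was applied. The function name and the data are hypothetical, and missing observations are represented as None.

```python
import math
from typing import Optional, Sequence, Tuple


def chi_square_2x2(phase1: Sequence[Optional[bool]],
                   phase2: Sequence[Optional[bool]]) -> Tuple[float, float]:
    """Pearson chi-square test for one binary outcome in two groups.

    Entries equal to None are treated as missing and excluded.
    Returns (statistic, p_value) for 1 degree of freedom.
    """
    obs1 = [x for x in phase1 if x is not None]
    obs2 = [x for x in phase2 if x is not None]
    a = sum(obs1)            # outcome present, phase 1
    b = len(obs1) - a        # outcome absent, phase 1
    c = sum(obs2)            # outcome present, phase 2
    d = len(obs2) - c        # outcome absent, phase 2
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero; test undefined")
    statistic = n * (a * d - b * c) ** 2 / denom
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical data: 30/100 vs 20/100 events, plus a few missing values.
phase1 = [True] * 30 + [False] * 70 + [None] * 5
phase2 = [True] * 20 + [False] * 80 + [None] * 3
stat, p = chi_square_2x2(phase1, phase2)
print(f"chi-square = {stat:.3f}, P = {p:.3f}")
```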

There are no general guidelines and rules in Germany for the performance of studies with formal decision aids based on routinely assessed clinical variables. We decided to give an information brochure to the patients explaining the study and to give them the option not to take part in the study.


RESULTS

Overall, 1484 patients could be enrolled in the study: 870 patients in phase 1, with no additional diagnostic support, and 614 patients in phase 2, with diagnostic support by the score (Table 2). The starting date of the study varied between centers; phase 1 began between October 1, 1994, and April 30, 1995, and phase 2 between February 1, 1995, and August 15, 1995. The frequency of appendicitis in phase 1 was 23.1% (n=201) compared with 18.6% (n=114) in phase 2. Major diagnoses were no specific abdominal pain (phase 1, 25%; phase 2, 27%), acute dyspepsia (8%, 10%), acute biliary disease (8%, 9%), ileus (4%, 5%), urolithiasis (3%, 5%), urinary tract infection (3%, 4%), and acute diverticulitis (3%, 4%). There were no significant differences between the 2 phases with respect to signs and symptoms related to appendicitis (Table 3). Study groups were comparable with respect to ultrasound of the abdomen (phase 1, 65%; phase 2, 64%) and ultrasound of the appendix (11%, 9%). Leukocyte counts were determined significantly more often in phase 2 as a component of the score (88%, 95%; P<.001).


Table 2. Number of Patients in the Study Groups



Table 3. Comparability of Study Groups for Signs and Symptoms Related to Acute Appendicitis


Clinicians' diagnosis of appendicitis changed after introduction of the score (Table 4). Specificity, positive predictive value, and accuracy were significantly lower with diagnostic support by the score. Before introduction of the score, appendicitis was diagnosed less often by the final examiner (31%) than after introduction of the score (36%) (P=.10), contrary to the frequency of appendicitis (23% vs 19%). There were no significant differences with respect to the perforation, appendectomy with normal findings, and complication rates. The delayed appendectomy and delayed discharge rates were significantly lower with diagnostic support. However, timing of appendectomy was not associated with the complication rate (24% in delayed appendectomy vs 10% in nondelayed appendectomy; P<.09; not differentiated between the study phases because of the small sample size). As expected, a higher complication rate was found in patients with delayed discharge than in those without delayed discharge (36% vs 5% in the total study population; P<.001).


Table 4. Clinical Outcome in the Study Groups


There was a linear relationship between the score values and frequency of appendicitis: less than 4.0 points, 3% (phase 1), 0% (phase 2); 4.0 to 5.5 points, 5%, 3%; 6.0 to 7.5 points, 11%, 10%; 8.0 to 9.5 points, 24%, 15%; 10.0 to 11.5 points, 32%, 24%; 12.0 to 13.5 points, 55%, 38%; and 14.0 points or more, 68%, 74%.
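The relationship between score band and appendicitis frequency can be inspected with a short script. The percentages below are those reported in the preceding paragraph; treating the 7 bands as equally spaced positions (0 to 6) is an assumption made here for illustration, and the helper names are hypothetical.

```python
from math import sqrt

# (score band, % appendicitis in phase 1, % appendicitis in phase 2),
# as reported in the text.
BANDS = [
    ("<4.0", 3, 0),
    ("4.0-5.5", 5, 3),
    ("6.0-7.5", 11, 10),
    ("8.0-9.5", 24, 15),
    ("10.0-11.5", 32, 24),
    ("12.0-13.5", 55, 38),
    (">=14.0", 68, 74),
]


def is_non_decreasing(values):
    """True if each value is at least as large as the one before it."""
    return all(a <= b for a, b in zip(values, values[1:]))


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


positions = list(range(len(BANDS)))  # assumed equal spacing of bands
phase1 = [row[1] for row in BANDS]
phase2 = [row[2] for row in BANDS]

for label, freqs in (("phase 1", phase1), ("phase 2", phase2)):
    print(label,
          "non-decreasing:", is_non_decreasing(freqs),
          "r =", round(pearson_r(positions, freqs), 3))
```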


COMMENT

Despite all improvements (ultrasound, special laboratory values), routine diagnosis in acute appendicitis still poses a challenging problem. Major areas of concern are perforations (rate of up to 20%), negative appendectomies (rate of up to 30%), delayed operations, complications after operation, and late discharge.3, 14 Therefore, several diagnostic scoring systems have been developed, characterized as noninvasive, understandable, user-friendly, and cost-effective.2, 4-8,10-12 Evaluation studies have demonstrated a good performance for some of these scores, indicating their potential for diagnostic decision making.6-8,10, 12 Testing of these scores on a prospective database of German cases revealed disappointing results.15 None of the scores fulfilled any of the given quality criteria. The lack of separate testing in a prospective study, small sample size, differences in the target population, and geographic variation of the incidence and presentation of the diseases were discussed as major factors.16 For that reason, a new score was developed on the basis of German data, which gave promising results in a first evaluation study.9

Unfortunately, the clinical benefit of these scores has not been tested in an adequately controlled study comparing the diagnostic performance of the clinician with and without the score. Some reports indicate improvement concerning the negative appendectomy rate or the perforation rate when compared with historical data. In one study, 2 different surgical units were compared. The unit that used the score had a negative appendectomy rate of 7%, and the unit that did not use the score had a rate of 17%.12 These studies cannot be taken as evidence of the clinical benefit of diagnostic scores in acute appendicitis.17 The optimal approach in clinical research is the randomized controlled clinical trial. In evaluating scores, this design has several pitfalls. Randomization of patients may result in carryover effects, since the physician may be influenced when deciding to treat control patients. A possible solution is to randomize physicians, but previous studies have shown that randomization to the intervention group may motivate physicians more than randomization to the control group.18 An alternative design is to perform a prospective intervention study with a before-and-after design, an approach used in our study. This design may be undermined by secular trends or sudden changes, either in the outcomes to be measured or in characteristics of the study population that influence these outcomes. This type of bias can never be excluded with this design, but it is probably low in our study for the following reasons: uniform data collection according to standard definitions in both phases, no differences between the study populations in the 2 phases (Table 3), and the short duration of each phase (4 months).

Systematic reviews have shown that the effectiveness of clinical guidelines and decision support is critically dependent on 3 factors: development, dissemination, and implementation strategy.19 The probability of being effective is highest if guidelines are developed internally, disseminated by specific educational initiatives, and implemented as patient-specific reminders at the time of consultation. In our study, the majority of participating centers were involved in the development of the score.9 The score was disseminated by specific training sessions or during clinical conferences, and it was applied during the consultation. The score did change clinical practice, although the accuracy of the score as a diagnostic aid was not convincing. Which factors may have biased the results in our study? In a previous multicenter study we showed that standardized and structured data collection did not change clinical performance in 6 German hospitals, so a checklist effect can be discounted. Because of the study design, with 2 consecutive phases and introduction of the score in phase 2, no carryover effects could occur. Systematic feedback was not provided in the study.

From the results of the study, it can be hypothesized that the diagnostic behavior of the clinician was changed in a systematic way. Although acute appendicitis occurred less often in the test phase, it was suspected more often, and the diagnostic decision was false positive in every second patient (positive predictive value, 50%). Although this did not influence the decision to operate (no difference in the negative appendectomy rate), it helped to avoid delayed but necessary operations. In Germany, the average hospital stay for acute appendicitis is rather long, as was demonstrated in our study. Nonperforated appendicitis in Germany is reimbursed per case (Fallpauschale). The calculation of reimbursement is based on an average hospital stay of 7.16 days for an open operation and 6.04 days for a laparoscopic operation. Only if hospital stay exceeds 14 days (open operation) or 13 days (laparoscopic operation) is additional reimbursement of costs possible (Grenzverweildauer). In our study, we defined a hospital stay of 10 days or longer as delayed discharge and could demonstrate that scoring led to an improvement with respect to this outcome criterion. In summary, scoring did not result in an improvement of the classic outcome criteria (negative appendectomy, perforated appendix, and complication rate). Even worse, scoring degraded diagnostic decision making of the final examiner, especially with respect to overprediction of acute appendicitis. However, decreased diagnostic performance did not result in poorer management and outcome; instead, positive effects on the timing of operation and duration of hospital stay were measured.
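The length-of-stay thresholds mentioned above can be written down as simple rules. This is a sketch only: the names are hypothetical, and it encodes just the two thresholds stated in the text (delayed discharge at 10 days or longer; additional reimbursement possible only when the stay exceeds 14 days for an open operation or 13 days for a laparoscopic one). It is not a model of the actual reimbursement calculation.

```python
from enum import Enum


class Approach(Enum):
    OPEN = "open"
    LAPAROSCOPIC = "laparoscopic"


# Study definition: discharge 10 days or later after admission.
DELAYED_DISCHARGE_DAYS = 10

# Stay limits (Grenzverweildauer) as stated in the text, in days.
STAY_LIMIT_DAYS = {
    Approach.OPEN: 14,
    Approach.LAPAROSCOPIC: 13,
}


def is_delayed_discharge(stay_days: int) -> bool:
    """True if the stay meets the study's delayed-discharge definition."""
    return stay_days >= DELAYED_DISCHARGE_DAYS


def exceeds_stay_limit(stay_days: int, approach: Approach) -> bool:
    """True if the stay is strictly longer than the limit for the approach."""
    return stay_days > STAY_LIMIT_DAYS[approach]


for days in (9, 10, 14, 15):
    print(days,
          is_delayed_discharge(days),
          exceeds_stay_limit(days, Approach.OPEN),
          exceeds_stay_limit(days, Approach.LAPAROSCOPIC))
```

Under these rules a stay of 10 to 13 days counts as a delayed discharge in the study without exceeding either stay limit.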

Two general conclusions and 1 specific conclusion can be drawn from this study. Testing of a score in new clinical environments is necessary before widespread application can be recommended. Integration of a score into the diagnostic process may have unforeseen clinical effects. The existing score cannot be recommended as a standard tool for diagnostic decision making in acute appendicitis.


AUTHOR INFORMATION

This work was supported by a grant (project number 01 EI 9606/0) from the German Ministry of Education, Science, Research, and Technology, Bonn, Germany, within the Medizinische Wissensbasen (MEDWIS) program.


The German Study Group of Acute Abdominal Pain

Joachim Walenzyk, MD, Georg Federmann, MD, Clinic of General Surgery, Kreiskrankenhaus Goslar, Goslar, Germany; Jörg Krenzien, MD, Gabiele Hansdorfer, MD, Surgical Clinic, Klinikum Ernst von Bergmann, Potsdam, Germany; Cornelia Berner, MD, Joachim Eibner, MD, Department of General and Trauma Surgery, Robert-Bosch-Krankenhaus Stuttgart, Stuttgart, Germany; Matthias Kraemer, MD, Klaus Kremer, MD, Surgical Clinic and Policlinic, University of Würzburg, Würzburg, Germany; Heinrich Böhner, MD, Surgical Clinic, Elisabeth-Krankenhaus Essen, Essen, Germany; Martin Labus, MD, Surgical Clinic, Bürgerhospital Frankfurt, Frankfurt, Germany; and Anton Klingler, PhD, Theoretical Surgery Unit, Surgical Clinic, University of Innsbruck, Innsbruck, Austria.


Reprints: Christian Ohmann, PhD, Funktionsbereich Theoretische Chirurgie, Klinik für Allgemein und Unfallchirurgie, Heinrich-Heine-Universität, Moorenstr 5, 40225 Düsseldorf, Germany (e-mail: ohmannch@uni-duesseldorf.de).

From the Theoretical Surgery Unit (Drs Ohmann and Yang) and the Department of General and Trauma Surgery (Dr Franke), Heinrich-Heine-University, Düsseldorf, Germany.


REFERENCES

1. Hoffmann J, Rasmussen OO. Aids in the diagnosis of acute appendicitis. Br J Surg. 1989;76:774-779.
2. Izbicki JR, Wilker DK, Mandelkow HK, et al. Retro- and prospective studies on the value of clinical and laboratory chemical data in acute appendicitis [in German]. Chirurg. 1990;61:887-894.
3. Andersson RE, Hugander A, Thulin JG. Diagnostic accuracy and perforation rate in appendicitis: association with age and sex of the patient and with appendicectomy rate. Eur J Surg. 1992;158:37-41.
4. Eskelinen M, Ikonen J, Lipponen P. A computer-based diagnostic score to aid in diagnosis of acute appendicitis: a prospective study of 1333 patients with acute abdominal pain. Theor Surg. 1992;7:86-90.
5. Van Way CW, Murphy JR, Dunn EL, Elerding SC. A feasibility study of computer aided diagnosis in appendicitis. Surg Gynecol Obstet. 1982;155:685-688.
6. Alvarado A. A practical score for the early diagnosis of acute appendicitis. Ann Emerg Med. 1986;15:557-564.
7. Arnbjörnsson E. Scoring system for computer-aided diagnosis of acute appendicitis: the value of prospective versus retrospective studies. Ann Chir Gynaecol. 1985;74:159-166.
8. Fenyö G. Routine use of a scoring system for decision-making in suspected acute appendicitis in adults. Acta Chir Scand. 1987;153:545-551.
9. Ohmann C, Franke C, Yang Q, et al. Diagnostic score for acute appendicitis [in German]. Chirurg. 1995;66:135-141.
10. Lindberg G, Fenyö G. Algorithmic diagnosis of appendicitis using Bayes' theorem and logistic regression. Bayesian Stat. 1988;3:665-668.
11. Teicher I, Landa B, Cohen M, Kabnick LS, Wise L. Scoring system to aid in diagnoses of appendicitis. Ann Surg. 1983;198:753-759.
12. Christian F, Christian GP. A simple scoring system to reduce the negative appendicectomy rate. Ann R Coll Surg Engl. 1992;74:281-285.
13. Ohmann C, Belenky G, Platen C. Integration of a data dictionary and a clinical database in an expert system for acute abdominal pain. Medinfo. 1995;2:943-946.
14. Blind PJ, Dahlgren ST. The continuing challenge of the negative appendix. Acta Chir Scand. 1986;152:623-627.
15. Ohmann C, Yang Q, Franke C. Diagnostic scores for acute appendicitis. Eur J Surg. 1995;161:273-281.
16. deDombal FT, Staniland JR, Clamp SE. Geographical variation in disease presentation: does it constitute a problem and can information science help? Med Decis Making. 1981;1:59-69.
17. Johnston ME, Langton KB, Haynes B, Mathieu A. Effects of computer-based clinical decision support systems on clinician performance and patient outcome: a critical appraisal of research. Ann Intern Med. 1994;120:135-142.
18. North of England Study of Standards and Performance in General Practice. Medical audit in general practice, I: effects on doctors' clinical behaviour for common childhood conditions. BMJ. 1992;304:1480-1484.
19. Grimshaw JM, Russell IT. Effect of clinical guidelines on medical practice: a systematic review of rigorous evaluations. Lancet. 1993;342:1317-1321.






© 1999 American Medical Association. All Rights Reserved.