Report of the Working Group of the Advisory Committee to the Director to review the Office of Medical Applications of Research

Members:
Alan Leshner, Ph.D. (Chair)
Ezra Davidson, M.D. (Member, ACD)
Peggy Eastman
Scott Grundy, M.D.
Barry Kramer, M.D.
Joan McGowan, Ph.D.
Audrey Penn, M.D.
Stephen Straus, M.D.
Steven Woolf, M.D.

June 1999

Summary:

A working group of NIH and non-NIH members was charged with reviewing the activities of the Office of Medical Applications of Research (OMAR), particularly the Consensus Development Conferences (CDC). The group believes that CDCs play an important role and should be continued, but that a major shift in the CDC program is needed in order to improve their rigor and usefulness. In particular, CDCs should incorporate systematic reviews of evidence and be more deliberate, more thorough, and more prolonged in the preparation for and conduct of conferences, including final written products. The group identified several key areas of needed improvement and developed recommendations and a model for consideration by the Advisory Committee to the Director of NIH. These areas include how topics are chosen, how consensus panels are constituted, how the panels review evidence and develop findings, and how the CDC results are disseminated. The principal findings included that:

CDCs can continue to serve an important role in informing health care professionals and the public about improvements in medical practice and public health, and in clarifying inconsistencies between practice and supporting evidence.
An advisory group should be formed to review topics and conference plans and to provide advice to NIH on proposed CDCs. This group would review and approve topics, panel membership, and planned speakers to assure balance and objectivity.
The topic selection process for Consensus Development Conferences (CDCs) should be more formalized and rigorous, and the number of conferences limited.
The scientific rigor of CDCs should be improved by including systematic reviews of evidence.
Planning and preparation for CDCs should include preliminary meetings to permit time for panels to study, review and comment on the quality and meaning of the evidence, and to pose additional questions for consideration.
New, more flexible models of conducting the conferences should be developed to include experimentation with panels with different types or levels of expertise, and to alleviate the pressures of late-night work sessions.
Release of the report and related press and public education activities should be separated in time from the conference itself.

In summary, the working group concluded that the NIH should continue to support Consensus Development Conferences, and other types of meetings, but that significant changes in procedures are needed to improve their scientific rigor and ultimate effectiveness. The recommended changes to the existing consensus development conference process focus on formalizing the topic selection process, garnering advice and review on topics and conference plans, lengthening the process to allow for improved panel preparation and outside input, including a systematic review of the scientific evidence, and separating the conference itself from the conference reporting activities. It is recognized that these changes would significantly lengthen the process and consequently increase the costs of conducting the program, although some of the recommendations could be implemented with little or no additional resources.

1. Background and Introduction

In October 1998, the Director of NIH, Harold Varmus, established a working group of NIH and non-NIH representatives (see Appendix for complete membership) to review the activities of the Office of Medical Applications of Research, particularly the consensus development process. The working group was established under the aegis of the Advisory Committee to the Director (ACD) and was asked to submit a report to the ACD at its meeting in June 1999. The goal was to identify how the process could be improved to better serve both the NIH and the health care system, and to suggest possible alterations or improvements to achieve these goals.

Prior Reviews

NIH held its first consensus development conference in 1977, and the Office of Medical Applications of Research was officially established in 1978. The new office's mandate stated that OMAR: a) advises the NIH Director on medical applications of research; b) promotes systematic identification and evaluation of clinically relevant NIH research information; c) promotes effective transfer of this information to the health care community and other agencies; d) provides a link between the technology assessment activities of the NIH and the Office for Health Technology Assessment (now in the Agency for Health Care Policy and Research [AHCPR]); and e) monitors the effectiveness and progress of the assessment and transfer activities of the NIH.

The consensus development program has been the subject of five previous reviews. The first was in 1980, when then Director of NIH Donald Fredrickson, prompted by the formation of the National Center for Health Care Technology in HHS and its mission to do technology assessments in health care, convened a committee chaired by Dr. Robert Levy, Director of the National Heart, Lung, and Blood Institute, to review the program. The committee recommended the continuation of OMAR and also recommended clarification of the roles of the Institutes, OMAR, and other government agencies concerned with technology assessment and transfer. The Levy committee also recommended that OMAR should have a more senior advisory committee comprising several Institute Directors or Deputies (Levy RI et al., 1980).

Second, the Rand Corporation conducted a comprehensive program evaluation in 1983. It included a content analysis of 24 published consensus statements produced between 1979 and 1983 and a more in-depth analysis of eight consensus conferences and statements that took place between 1979 and 1980. The major findings from this very extensive evaluation concerned the impact of the conferences on change in medical care, which the evaluators concluded was modest at best. They suggested several ways to improve the dissemination process, such as direct mailings of statements and publication in specialty journals, and to track the impact of the conferences, such as monitoring changes in prescriptions and in third party reimbursement practices (Kanouse et al., 1989).

Third, a University of Michigan study conducted in 1987 focused on the consensus program's process by interviewing participants on the planning committees, panel chairs and NIH staff, and by analyzing data from questionnaires to panelists, speakers and discussants. Among other things, they suggested more time for the panelists to write; that questions should include ethical and social aspects of medical science; and that there be more systematic and consistent topic identification (Wortman et al., 1988).

A 1990 study of the consensus program process was carried out by the Institute of Medicine (IOM). The IOM study observed all steps in planning the conference, the conference itself and the writing session. The IOM panel recommended that the scope be broadened to include social, economic and ethical issues; that topic selection should consider the state of clinical practice; that a research program to measure impact should be undertaken; and that evidence should be graded as to its quality when presented to the Consensus Panel. They also suggested an external advisory council for OMAR; that OMAR's reporting relationship to the Director of NIH should be strengthened; and that topic selection should be more formalized (IOM Report, 1990).

In 1994, the Inspector General (IG) carried out a study focused on the impact of the NIH consensus program on continuing medical education (CME) activities in medical schools. Department chairs in medical specialties were found to be more familiar with the program than were chairs in family medicine. The IG concluded that there was limited public and practitioner familiarity with the consensus program and that the use of the consensus statements for CME programs was limited. They recommended that NIH take steps to more effectively communicate with those responsible for CME programs and encourage the incorporation of consensus findings into continuing education activities. These recommendations have been instituted. Additionally, CME quizzes and credits are available for most consensus statements. These are available in the printed consensus statement booklets and on the OMAR website (Miller D. et al., 1994).

Current Review

Many recommendations from the previous reviews led to program changes and improvements. However, there were new concerns about the appropriate timing of some conferences, the adequacy or readiness of the data base, and the need for additional scientific rigor. In addition, changes in NIH over the past several years, the continuous, dramatic changes in the health care system, and the fact that there are now other bodies charged with doing consensus development and practice guidelines and assessments of new treatments, all suggested the need for a new review of consensus development at NIH.

The Working Group received its charge from Dr. Harold Varmus, Director, National Institutes of Health, who asked the group to review the goals, activities, and outcomes of CDCs while considering such issues as the role of the CDC process within NIH in promulgating recommended clinical practices. Specific issues and questions raised by Dr. Varmus and the work group are outlined in the discussion sections below.

The working group met twice, on December 18, 1998, and February 16, 1999. During these meetings the group met with the Director of NIH as well as several others with detailed knowledge of, responsibility for, and experience in the consensus development process at NIH. These included Dr. John Ferguson, Director of OMAR; Dr. William Harlan, Associate Director for Disease Prevention, NIH; and Dr. Leon Gordis of Johns Hopkins University, who has served as a member or chair of several CDCs. In addition, the working group heard presentations from Dr. Douglas Kamerow, Director of the Center for Practice and Technology Assessment, Agency for Health Care Policy and Research; and Dr. Steven Woolf, Medical College of Virginia, who also served as a member of the working group. The project was also discussed with several NIH Institute Directors to obtain their input on concerns and issues about the consensus process that needed to be addressed.

Through these meetings and discussions the work group identified several key areas that should be addressed in improving the consensus process at the NIH. These included consideration of the role of the NIH process among the Federal and non-Federal activities in these areas. Specific aspects of the current process were also considered, including topic selection, the planning and conduct of consensus conferences themselves, and post-conference activities, including the public release of the report as well as dissemination and evaluation activities. These issues and suggested recommendations for addressing them form the remainder of this report.

2. General Issues

Scope and usefulness

Are NIH Consensus Development Conferences still necessary?
Do the goals need to be modified or updated?
Should CDCs include review and recommendations of policy?
How should other Government agencies (e.g., Agency for Health Care Policy and Research, Centers for Disease Control and Prevention) or non-Government groups (e.g., American Medical Association, American College of Physicians) be involved?

Workgroup Discussion and Conclusions:
The NIH consensus conference process, as currently conducted, carries the imprimatur of the NIH and can serve an important role in informing health care professionals and the public about improvements in medical practice and the public health, and in some cases about limitations and remaining questions. In general, the workgroup believes CDCs should be used to coalesce or promulgate evidence-based information on a topic, rather than to develop consensus where the science does not warrant, to make policy recommendations, or to develop practice guidelines. CDCs can be used to target the moment when the science is there but the practice is not.

An important exception to this generalization is that under some circumstances NIH should also address topics for which the meaning of the evidence is not yet clear, or not as clear as popularly perceived. There are many examples in medicine of interventions that have been widely used despite an inadequate data base upon which to justify wide scale use. Given the enormous costs of health care in the US that stem directly from widespread use of technologies and treatments of unproven value (often fueled by the mistaken impression of clinicians and payers that there is supporting evidence), a statement from the NIH that evidence is inconclusive is of great importance. These topics should be considered on a case by case basis and would be an exception to the general rule.

The goal of CDCs should be to create a statement that advances broader understanding of the issue and that will be useful to health professionals and the public. While not the primary goal, development of evidence-based statements and transmittal to another group or body should be useful for further application and development of health policy or improved medical and public health practice. Other agencies, such as the Centers for Disease Control and Prevention and the Agency for Health Care Policy and Research, and professional medical practice organizations should be able to utilize consensus statements to support and develop recommendations and guidelines based upon scientific evidence while continuing to consider other issues, such as service delivery systems, cost-effectiveness, regulatory issues, financing, etc.

NIH should limit its role to the careful selection and consideration of scientific questions that might be used to help inform policy decisions, medical practice, and the public health. In general, NIH should address questions on whether or not a treatment or intervention has an effect, and not on questions about whether or not a particular treatment should be widely used, or how its use should be implemented or financed in detail. Questions such as the latter are more appropriate for other agencies specifically charged with health policy formulation, perhaps in conjunction with the NIH. At the same time, CDCs should include consideration of economic, social, legal, and ethical aspects of issues, but these should not be the primary focus of what is essentially, and should remain, a scientific evaluation. The timing of conferences is critical and should not be so early in the development of a new technology that there is not sufficient data upon which to draw conclusions, nor so late that the conference merely reiterates a consensus already reached by the field.

3. Conference Planning

Oversight, review, approval

Should there be an advisory group to assist in the selection of topics, approval of chairs, panels, speakers, etc?
How would such a group function?
Who should be members?
How and to what extent should an individual Institute's involvement be integrated into the process?

Workgroup Discussion and Conclusions:
While the OMAR has had the benefit of advice from a committee of Institute representatives since 1977, the working group believes that the consensus process would benefit from a different advisory process for external review of topics and conference plans. The workgroup was sensitive to the difficult burden of assuring adequate and fair representation on panels, a strong and appropriate chair, and appropriate balance in the final presentations to reflect the full range of relevant data.

The workgroup felt that some of this burden would be relieved and the overall quality of CDCs would be enhanced by having an Advisory Committee review and approve CDCs at several critical points in their development. One model that the working group believes should be considered strongly includes a process in which an Advisory Committee would review the proposed topics at one or more times per year to make a recommendation as to whether or not the topics meet the criteria for a consensus conference. Formal prospective criteria should be in place to facilitate the decision. Those that are recommended for further development would proceed in the normal fashion, with the selection of a planning committee, nomination of chair, drafting of questions and drafting of a conference agenda. Once at this stage, the Conference Plan would again be reviewed by the Advisory Committee. The Advisory Committee would review the Plan and make a preliminary advance determination of the sufficiency of scientific evidence for reaching central conclusions, the balance of presentations, the appropriateness of the panel members in terms of types of expertise represented and the lack of conflicts of interest, and the timeliness of and need for the conference.

Several ideas were discussed about the makeup of such an Advisory Committee. Most support was generated for a high level, internal group, made up of 5-6 Institute Directors, who would serve on a rotating basis with other Institute Directors. Alternatively, the group could contain non-NIH members or seek the ad hoc advice of outside experts, be made up of a subcommittee of the Advisory Committee to the Director (ACD), or consist of non-NIH representatives (similar to other Advisory Councils at the NIH).

Selection of topics

How can the process for selection of CDC topics be improved?
Should the process be formalized and include a periodic solicitation from OMAR, as in the past?
Should topics be reviewed?
If so, by whom?
Should more topics be suggested by non-NIH entities?
Should new criteria be developed that topics would need to meet before being considered for a CDC?

Workgroup Discussion and Conclusions:
Selection of topics is critical to the success of the CDC program. Topics must be at an appropriate stage of readiness. There must be reasonable confidence that a consensus can be reached and that the result would be ready for wide scale dissemination. Currently topics are usually submitted by an NIH Institute on an ad hoc basis. OMAR and the sponsoring Institute then determine whether or not a topic is appropriate for a CDC. While OMAR has developed criteria for topic selection, including public health importance, existing controversy, adequate data base, cost impact, and several others, there was concern expressed that the criteria may be too difficult to assess objectively under the current system. In particular, it was suggested that an independent assessment of the readiness of a topic for consideration, such as could be performed by an Advisory Committee, would benefit the program by giving an additional level of objective review.

In addition, the workgroup recommends that the selection criteria be revised to make them more explicit. For example, the quality of the data could be assessed by using a ranking system that weights evidence according to whether it comes from a randomized trial, another type of controlled trial, or an observational study. Also, consideration should be given to excluding non-published data except under certain conditions (such as its availability for review by the panel prior to the CDC). Any selection process should include an assessment and prioritization of the potential impact on medical practice. Criteria have been described for prioritizing assessments of clinical conditions and medical technologies (Lara and Goodman, 1990); these include such things as the potential to improve individual patient outcomes, affect a large patient population, reduce costs, and reduce variations in medical practice. Such criteria could be used to extend or modify the existing criteria to aid in the selection process.

Selection of chair, panel members, and speakers

Currently the panel chair is selected by OMAR and the sponsoring Institute, and panel members and speakers are suggested by a planning committee, although OMAR has final approval.

Does this process need to be changed?
Are Institutes too involved in forming panels?
Could the process benefit from outside review?
As conference topics become more complex, should the balance and specificity of expertise of the panel members be changed?

Workgroup Discussion and Conclusions:
In general, the majority of the working group believes that the current procedures for nominating and selecting panel chairs, members, and speakers are working well. However, the group believes that NIH might benefit from experimenting with different types and degrees of expertise represented on panels, particularly as the issues addressed become more complex. For example, "expert panels" may be preferable to "non-expert" panels, especially when, as in the current process, there is a short time available to hear evidence. Alternatively, it may be worth considering a professional panel experienced in analyzing data but without particular subject matter expertise, which could serve as an unbiased "jury".

Number of CDCs

Currently, CDCs are convened on an as-needed basis and have become variable in number and timing for any given year.

Should there be more predictability in the number and timing of CDCs (e.g., a certain number of CDCs per year awarded competitively to Institutes)?

Workgroup Discussion and Conclusions:
The process might be improved by having OMAR solicit ideas from the Institutes to compete for a flexible, but limited, number of CDCs to be conducted each year. This is similar to the process used by OMAR in the past, and it may be especially useful to have the topics for CDCs generated once per year or at certain times in order to facilitate review by an Advisory Committee.

4. Conference Process

Consensus development process

Is the process used to develop consensus adequate?
Should there be alternate methods of consensus development that utilize new techniques for systematically reviewing data applied to particular kinds of issues or questions?
Should standards be applied for data presented at a CDC (e.g., published) or other steps taken to assure an evidence-based process?
Should additional or revised standards or criteria be developed to assure an adequate spectrum of points of view of panel members and speakers?

Intensity of panel work

Is there a need to re-design the conference process, specifically the panel session to develop the consensus statement?
Are there other methods, such as preconference meetings to review evidence and develop preliminary statements, or lengthening the conferences, that would be less stressful and lead to an improved product?

Workgroup Discussion and Conclusions:
While the current model of CDCs has certain advantages (heightened sense of urgency and focus, production of a statement in a short time) and has generally led to excellent results, there is a clear need to revise the process to allow more deliberation and improve its scientific rigor. The current approach dates back to the 1970s. The field of systematic reviews and guideline development has progressed far since that time, but its advances have not been incorporated into the CDC process. Other models, while taking additional time and resources, may be more appropriate. In addition, the current model of two and one half days of presentation, questions, and drafting has been characterized as unduly grueling by many participants, at times resulting in less thoughtfully developed conclusions. It does not allow adequate opportunity for the evidence to be reviewed systematically beforehand. It also limits the opportunity for the experts to present their data to the panel in a balanced fashion, for panel members to thoroughly consider input from members of the public and representatives of interested organizations, for the panel to digest and reinvestigate the evidence, and for a report most suitable for public release to be properly crafted.

The workgroup felt strongly that the consensus development process itself would benefit from the application of new methods to systematically review data prior to a consensus conference, such as those used in the evidence-based approach used to establish practice guidelines in recent years. This technique involves a systematic review of the evidence. Although it would be more time consuming and involve a greater commitment from panel members, it would improve the process and the product substantially.

A model of how such a more systematic approach could be integrated into the current process was discussed and outlined by the workgroup. In this model the OMAR would commission a systematic review of the scientific evidence on a topic as part of the expanded preconference preparation activities. This would allow an in depth study of the issues by the panel members and if provided early enough would allow them to make informed judgments about both whether the questions asked are sufficient and whether there are additional data gaps that should be addressed either by additional literature review or by the conference speakers. A presentation summarizing the results would also be made at the conference itself.

Elements of other models of consensus development should also be considered. One common model that OMAR might consider would draw from the system used by the Institute of Medicine and other organizations. In this model, panels convene several times over a period of a year or more to hear expert testimony as well as testimony from professional and public organizations and the public at large. The report is drafted and, when finalized, released to the public at a press conference. The working group believed that these kinds of more deliberate models, although they extend the process and are more costly, have clear benefits over the current process. They allow considerable time for panelists to become familiar with the subject, to pose new questions, and perhaps to identify and invite speakers to the conference to address specific questions about the data or particular studies that they feel must be answered to interpret the evidence. These more extended approaches also significantly improve the ability of the panel to hear and consider input from the public and professional organizations. They also obviate the need to meet in late night sessions, and disconnect the writing of the draft report from release to the public and the press (see below).

The working group developed a model conference process for consideration that incorporates the features of the various models of developing consensus and recommendations. It is summarized in the following diagram:

[Figure: flow chart of a model conference process]

A model such as this would afford the consensus panel a greater opportunity to achieve an evidence-based consensus. Some benefits include the conduct of a formal evidence-based review to inform the process, opportunity for panelists to study the systematic review and frame additional questions for the speakers, and an extended period of time for deliberations and reporting.

5. Timing and Post-conference activities

Conference Reporting and Products

Are there ways to increase the impact of the current scheduling of reporting of the consensus statements (public report immediately followed by press conference)?
Should the report be immediate?
Given the complexity of issues recently addressed and the likelihood that this trend will continue, should alternate kinds of products be considered, such as "interim statements"?

Dissemination of findings

Is the current set of activities (statement, press conference, journal publication) to disseminate the findings of CDCs adequate?
How could it be improved to accelerate the adoption of recommendations?

Evaluation of impact

Should there be standard follow-up processes to assess the impact of CDCs on health care?
What might they be?
How could NIH better stimulate such change in response to CDC recommendations?

Workgroup Discussion and Conclusions:
The workgroup believes strongly that there are many benefits to separating the conference from the report release and associated press activities. These include more time for thoughtful consideration of data and information presented at the conference, particularly from the public, and more time for preparation of press materials and other aspects of report release. The workgroup recommends that the report and press release be scheduled a reasonable time (6 to 8 weeks) after the conference itself is held. This might mean bringing either all or a subset of the panel back together for the release of the report; however, the workgroup thought that this would not be a significant barrier and that most panelists would be willing to commit to the extra time and travel, especially if the expectation were made clear at the beginning of the process.

The workgroup generally supported the current set of dissemination activities, but encouraged the NIH/OMAR to expand ways to disseminate the findings of the conferences. One possibility would be to draft an expanded press release that would be more of a 'press summary' and would perhaps facilitate additional press coverage. For its audience of medical professionals, NIH should consider producing a more substantial technical report of the conclusions, reflecting the elements of the systematic review that precedes the conference - bibliography, evidence summary, supporting charts, etc. In certain cases it would also be beneficial to include briefings on conference results for other health-related Federal agencies and professional societies.

In addition, the workgroup recognized that the audiences for the CDC statements have expanded greatly to include the general public as well as the medical and public health communities. The increased attention by the general public affects all aspects of CDC planning, including topic selection, and the target audiences should be considered during the planning and conduct of CDCs.

The workgroup also acknowledged the difficulty in assessing the global impact of CDCs on medical and public health practice but encouraged regular and systematic evaluation of the impact of the conferences.

6. Crosscutting Issues

NIH also has a role in convening other types of scientific meetings (e.g., Technology Assessment Conferences, with perhaps a new name). Workshops, seminars, and conferences are and should continue to be conducted on a wide variety of topics, and OMAR should continue to play a role in supporting and conducting these. This is especially true for topics that have cross-Institute relevance.

NIH/OMAR should continue to support a variety of types of scientific meetings. However, there need to be clear criteria to distinguish them from CDCs, and NIH should reserve the title of "Consensus Development Conference" for only those conferences that meet established criteria and for which there is significant need. One type of additional conference might simply delineate new research issues and the state of scientific findings/progress in areas not yet ready for a consensus conference. These conferences should not be structured and conducted like CDCs, and there should not be consensus recommendations based on the findings.

OMAR should be responsible throughout the consensus development process for interaction and collaboration, when possible, with other agencies (for example, AHCPR, the Centers for Disease Control and Prevention, and the Health Care Financing Administration [HCFA]) affected by or having an interest in the topics under review. These agencies could assist in the consensus development process and in disseminating the results of CDCs.

References:

Institute of Medicine. Consensus development at the NIH: Improving the program. Washington DC: National Academy Press, 81, 1990.

Kanouse DE, Brook RH, Winkler JD, et al. Changing medical practice through technology assessment: An evaluation of the NIH Consensus Development Program. Santa Monica, CA: The Rand Corporation, 1989.

Lara ME, Goodman C (Eds.). National priorities for the assessment of clinical conditions and medical technologies: Report of a pilot study. Council on Health Care Technology, Institute of Medicine. Washington DC: National Academy Press, 1990.

Levy RI, Ringler RL, Scott DB, et al. Report of the Director's Oversight Committee on the Office of Medical Applications of Research. Washington, DC: National Institutes of Health, 1980.

Miller D, Corbett T, and Clarke MB. The NIH Consensus Development Program: Dissemination of findings through medical school continuing education activities. OEP 01-01-01760. Washington DC: DHHS, OIG, 1994.

Wortman PM, Vinokur A, and Sechrest L. Do consensus conferences work? A process evaluation of the NIH consensus development program. Journal of Health Politics, Policy, and Law, 13(3):469-497, 1988.

Appendix

Members of the Working Group of the Advisory Committee to the Director (ACD)
to Review the Office of Medical Applications of Research

Alan Leshner, Ph.D. (Chair)
Director
National Institute on Drug Abuse

Ezra Davidson, Jr., M.D. (Member, ACD)
Department of Obstetrics and Gynecology
King/Drew Medical Center

Peggy Eastman
Science Writer

Scott Grundy, M.D.
Director, Center for Human Nutrition
University of Texas Southwestern Medical Center

Barry Kramer, M.D.
Deputy Director, Division of Disease Prevention
National Cancer Institute

Joan McGowan, Ph.D.
Director, Musculoskeletal Diseases Branch
National Institute of Arthritis and Musculoskeletal and Skin Diseases

Audrey Penn, M.D.
Deputy Director
National Institute of Neurological Disorders and Stroke

Stephen Straus, M.D.
Chief, Laboratory of Clinical Investigation
National Institute of Allergy and Infectious Diseases

Steven Woolf, M.D.
Department of Family Practice
Medical College of Virginia