Since the passage of the Food, Drug, and Cosmetic Act in 1938, the safety and effectiveness of medical diagnostics have been overseen by the Food and Drug Administration (FDA, 2006c). More specifically, the FDA has regulatory jurisdiction over any device or in vitro reagent that is “intended for use in the diagnosis of disease or other conditions, or in the cure, mitigation, treatment or prevention of disease, in man or other animals” based on the Medical Device Amendments of 1976 (Hackett and Gutman, 2005). To determine the “intended use” that is so key to its regulation, the FDA considers a device maker’s advertising, product distribution, labeling claims, product websites, and any other promotional material for the product (Heller, 2006).
When the FDA asserts jurisdiction, this typically results in premarket submissions to the FDA under its premarket approval (PMA) or premarket notification (510[k]) requirements (Box 3-1). To help determine which route is most appropriate, the FDA evaluates how much risk the diagnostic poses, how it differs from other currently available diagnostics, and its intended use. Tests that pose the most risk, are the most innovative, or are intended “for a use which is of substantial importance in preventing impairment of human health” are subject to the most regulatory scrutiny (Gutman, 2000; IOM, 2005).
|
BOX 3-1 Premarket Approval and Premarket Notification at the FDA
A PMA application usually requires manufacturers to submit clinical data showing that their device is safe and effective for its intended uses. For some tests, these clinical data can be published clinical studies and/or practice standards that can help determine the clinical performance of the test or retrospective comparisons of the diagnostic’s performance with that of another device that has already been clinically tested. But often the FDA requires prospective clinical studies to assess a new device’s safety and effectiveness. The Safe Medical Devices Act of 1990 authorizes the FDA to request data on clinical sensitivity, specificity, and predictive value for diagnostic tests that undergo a PMA review. These data are costly and time-consuming to procure, and they require clinical research expertise that many small companies lack. Most manufacturers try to avoid the necessity of a PMA review of their diagnostic tests and may even forgo bringing their test to market if a PMA application is required.
Manufacturers can bypass the need for a PMA application if they can show that their device is substantially equivalent to one already on the market. This qualifies their device to enter the market via a 510(k) review process. This review requires manufacturers to submit data showing the accuracy, reproducibility, and precision of their diagnostic. Manufacturers also have to provide documentation supporting their claim that the diagnostic is “substantially equivalent” to a device already on the market. As is true for PMAs, there are no well-defined performance standards for 510(k) reviews, nor does the FDA clearly define the requirements for substantial equivalence. However, the agency has issued guidance documents that indicate the standards by which it will review a variety of types of diagnostics. It has also accepted the laboratory test standards set by other organizations, such as the Clinical and Laboratory Standards Institute. None of these standards, nor the 510(k) or PMA review process itself, considers the clinical safety and effectiveness of the diagnostic.
SOURCES: Gutman, 2000; Hackett and Gutman, 2005; IOM, 2005; FDA, 2006a. |
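Box 3-1 notes that the FDA may request data on clinical sensitivity, specificity, and predictive value for tests that undergo PMA review. As a minimal sketch of how these measures relate to one another, the following Python example assumes the standard textbook definitions computed from a two-by-two table of test results against true disease status. The class and function names and the patient counts are hypothetical illustrations and are not drawn from any FDA guidance or submission.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiagnosticPerformance:
    """Summary measures for a binary diagnostic test (hypothetical helper type)."""
    sensitivity: float  # fraction of diseased patients the test flags
    specificity: float  # fraction of disease-free patients the test clears
    ppv: float          # positive predictive value
    npv: float          # negative predictive value


def diagnostic_performance(tp: int, fp: int, fn: int, tn: int) -> DiagnosticPerformance:
    """Compute standard measures from true/false positive/negative counts."""
    if min(tp, fp, fn, tn) < 0:
        raise ValueError("counts must be non-negative")
    if 0 in (tp + fn, tn + fp, tp + fp, tn + fn):
        raise ValueError("each row and column of the 2x2 table must be non-empty")
    return DiagnosticPerformance(
        sensitivity=tp / (tp + fn),
        specificity=tn / (tn + fp),
        ppv=tp / (tp + fp),
        npv=tn / (tn + fn),
    )


# Invented study: 1,000 patients, 100 of whom truly have the disease.
perf = diagnostic_performance(tp=90, fp=50, fn=10, tn=850)
print(f"sensitivity={perf.sensitivity:.3f} specificity={perf.specificity:.3f}")
print(f"PPV={perf.ppv:.3f} NPV={perf.npv:.3f}")
```

In this invented example the test has high sensitivity and specificity, yet roughly a third of its positive results are false, because predictive value depends on how common the disease is in the tested population as well as on the test itself.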
Since the Medical Device Amendments were enacted in 1976, a number of novel diagnostic tests based on genetics and other innovative molecular biology technologies have emerged. This created a large category of tests that would have had to undergo PMA review because there were no similar devices on the market on which to base a less onerous 510(k) review. The FDA Modernization Act of 1997 created a “de novo classification” for a device that is not equivalent to a legally marketed device. This classification allows manufacturers to bypass a PMA review for novel, low-risk devices. Such devices are reviewed for safety and efficacy by the FDA in a streamlined manner that usually does not require prospective clinical studies, relying instead on existing clinical literature to determine the device’s safety and effectiveness (Hackett and Gutman, 2005).
A PMA or 510(k) review may not be required if a cancer biomarker test is developed by a laboratory for in-house use (a “homebrew” test). The FDA historically has not regulated homebrew tests, and laboratories offering them must label their test results with a qualifier that indicates the tests have not been cleared or approved by the FDA (FDA, 2003b). The homebrew exemption can enable manufacturers to quickly bring their tests to market. For example, there are hundreds of genetic tests currently on the market, but only four have been granted FDA approval (Hudson and Javitt, 2006).
In 1992, the FDA attempted to exert more regulatory control over homebrew tests via its compliance policy guideline, which proposed applying general medical device regulation to homebrew tests. But due to strong objections from the laboratory community, which claimed that the proposed guideline would be an onerous duplication of regulations promulgated under the Clinical Laboratory Improvement Amendments (CLIA, see next section), the FDA stated that “the use of in-house developed tests contributed to enhanced standards of medical care in many circumstances, and that significant regulatory changes in this area could have negative effects on the public health” (DHHS, 2003). However, the FDA asserted that it had the authority to regulate homebrew tests should it wish to do so (Shapiro and Prebula, 2003), and it has been suggested that the agency’s choice not to regulate in-house tests was due to resource constraints (Heller, 2006).
Instead, the agency tried to ensure the safety and effectiveness of homebrew tests by regulating the building blocks, known as analyte-specific reagents (ASRs),1 for these tests (Hackett and Gutman, 2005) (Box 3-2). In response to requests from manufacturers to clarify ASR regulations, the FDA recently issued draft guidance to better explain how it defines ASRs and to delineate more clearly the regulatory rules that apply to these products for ASR manufacturers (FDA, 2006c). The document also provides examples of entities that the FDA does and does not consider to be ASRs.
|
BOX 3-2 FDA Regulation of Analyte-Specific Reagents
In a ruling made effective in November 1998, the FDA subjected both the manufacturers of ASRs and the laboratories using them to regulation to ensure that ASRs would be made consistently over time according to the agency’s quality control requirements. It is the responsibility of the laboratory using the ASRs to develop a recipe for the homebrew test that incorporates the reagents, and it cannot share that recipe with other labs. All the testing using a homebrew diagnostic is done within the laboratory of the company or organization that developed it.
Laboratories that produce ASRs must register with the FDA and satisfy the agency’s Quality System Regulations (good manufacturing practices), as well as report postmarket device failures. They are also required to indicate on the label for the ASR that its analytical and performance characteristics are not established. Makers of homebrew tests are not permitted to market their tests to other laboratories, nor can they sell packages of ASRs, or an ASR linked to a solid surface, with instructions on how to use the reagents in a testing procedure. Such packaging is considered to be a test kit subject to FDA review.
SOURCES: FDA, 2003b; Shapiro and Prebula, 2003; IOM, 2005. |
In addition to guidance documents, the FDA has also used warning letters to assert authority and establish precedent for oversight of new tests that manufacturers thought would be outside FDA’s jurisdiction. For example, the FDA recently prevented Roche Molecular Diagnostics from registering its new microarray genetic test for drug metabolism (AmpliChip CYP450) as an ASR. The denial was based, in part, on an assessment that the intended use of the AmpliChip to identify genetic indicators of drug metabolism capabilities “is of substantial importance in preventing impairment of human health” (FDA, 2003a). Furthermore, the FDA does not regard a microarray, which uses multiple reagents to detect a genetic profile, as falling under its definition of an ASR (Hackett and Gutman, 2005). The FDA suggested that Roche seek de novo classification for the AmpliChip, and Roche’s submission of previously published clinical literature on the genetic variants the AmpliChip detects and their clinically significant effects on drug metabolism led to FDA’s approval of the AmpliChip (Hackett and Gutman, 2005).
The FDA has also required manufacturers of preanalytical systems, which collect, stabilize, and purify RNA, to submit a 510(k) premarket notification for the devices (FDA, 2005c), and it recently asked the makers of a new serum protein test using mass spectrometry for ovarian cancer screening (OvaCheck) to consult with the agency about the appropriate regulatory status of the test. The developers of the OvaCheck test expected it would fall under the homebrew exemption from FDA review. But the FDA indicated that the test may be subject to a 510(k) review because the software used to analyze the results could be considered a device intended for use in the diagnosis of disease and therefore subject to regulation (FDA, 2004b).
In September 2006, the FDA issued draft guidance for such tests that use complex mathematical formulas to interpret large sets of gene or protein data, referred to by the FDA as In Vitro Diagnostic Multivariate Index Assays (IVDMIAs) (FDA News, 2006a). The document notes that “the manufacture of an IVDMIA involves steps that are not synonymous with the use of ASRs and that are not within the ordinary ‘expertise and ability’ of laboratories that FDA referred to when it issued the ASR rule. Therefore, IVDMIAs do not fall within the scope of laboratory developed tests over which FDA has generally exercised enforcement discretion.”
The FDA also recently warned Access Genetics that several of the genetic test packages it manufactures and sells contain all that is needed to perform the tests, including lab assay protocols, and therefore are not homebrew tests, which are conducted only at the site at which they were developed (FDA, 2005e). The agency also notified the Nanogen Corporation that its NanoChip Molecular Biology Workstation, NanoChip Electronic Microarray, and several ASRs had been approved neither as a single system nor as separate components (FDA, 2005d; Heller, 2006). Furthermore, the FDA pointed out that some of the manufacturer’s publicity about its NanoChip system indicated that it could be used for clinical diagnostic applications and therefore could not be considered a research-only diagnostic exempt from FDA review, as the company expected (FDA, 2005d). If a test is used for research only, the FDA does not exert jurisdiction, but if the assay is used for a clinical purpose, such as for diagnosis, it is subject to regulation by the FDA. Neither the FDA nor CLIA offers any clear guidelines, however, for distinguishing a research-only diagnostic from a clinical diagnostic (Hackett and Gutman, 2005; Heller, 2006).
The FDA also appears to be more assertive now in requesting clinical data for its reviews of biomarkers linked to therapeutics. Biomarkers used in clinical trials to identify likely responders to drugs (pharmacogenomic tests) will be regulated as devices in parallel with their corresponding drug candidates, and those for higher risk conditions will require PMAs. The FDA guidance (2005a) recommends submitting pharmacogenomic data when the data will be used to make approval-related decisions and when the data are relied on to define, for example, trial inclusion or exclusion criteria, the assessment for prognosis, dosing, or labeling or used to support the safety and efficacy of a drug. If a test shows promise for enhancing dosing, safety, or effectiveness or will be specifically referenced on the label, the FDA recommends codevelopment of the device and drug and coordinated applications for FDA approval (FDA, 2005a). The experience with attempts to add pharmacogenetic tests for the drug-metabolizing enzyme cytochrome P450 to the labels for drugs such as warfarin indicates just how great the challenge of validation can be (IOM, 2006).
In addition, in its February 2006 draft guidance on pharmacogenetic tests and genetic tests for heritable markers, the FDA stated that “For predictive screening in healthy or asymptomatic individuals, long-term follow-up (i.e., a longitudinal study) may be the only way to prove that the test was indeed predictive and to evaluate issues such as penetrance” (FDA, 2005a, p. 4). But this guidance also noted that for some genetic tests, there may be a sufficient clinical literature base to establish clinical validity of the new test without extensive new clinical studies.
Laboratory performance is overseen by the Centers for Medicare & Medicaid Services (CMS)2 under the Clinical Laboratory Improvement Amendments of 1988 (Hackett and Gutman, 2005). To be operational, a laboratory that conducts testing on human specimens for the purpose of providing information relevant to the diagnosis, prevention, or treatment of disease or physical impairments or for health assessments must be CLIA-certified (FDA, 2006b; Javitt, 2006). CLIA certification, which is renewed contingent upon inspection every two years, is intended to ensure the accuracy, reliability, and timeliness of patient test results from laboratories throughout the United States (FDA, 2006b; Box 3-3). But CLIA does not replace FDA regulatory authority over medical diagnostic tests; it does not address the clinical accuracy or usefulness of tests.
There are some state requirements that are more stringent than CLIA, as well as organizational guidelines and standards that can be voluntarily adopted by laboratories to further the accuracy of their testing (DHHS, 1999; Swanson, 2002). But most laboratories follow the minimum generic standards set by CMS under CLIA. The requirements for CLIA certification vary depending on whether laboratories conduct tests of moderate or high complexity. (Low-complexity tests, such as a urine dipstick, are simple enough to be performed by unskilled laboratory personnel or even by patients. These tests are waived from requiring CLIA certification.) The FDA determines the degree of complexity of in vitro diagnostics based on the amount of expertise, oversight, interpretation, and judgment required to perform the test, as well as the potential risk to public health if the test is inaccurately performed (FDA, 2006b).
|
BOX 3-3 Overview of CLIA Regulation of High- and Moderate-Complexity Tests
[Box contents not recovered; the only surviving heading is “Sanctions.”]
SOURCE: CMS, 2005. |
Both moderate- and high-complexity tests require laboratories to document the accuracy and reproducibility of their testing, the use of quality control procedures, and the proficiency training and testing of key personnel (Swanson, 2002; DHHS, 2003; CMS, 2006). The main difference between high- and moderate-complexity laboratories is that there are more stringent qualifications required for the personnel of high-complexity laboratories (DHHS, 2003). The FDA’s ASR ruling restricts sale of a reagent used for clinical purposes to laboratories designated as high complexity under CLIA, because these labs were thought to have the personnel and systems in place to allow for the reliable development of in-house tests.
Most moderate- to high-complexity tests fall under CLIA-specified specialty areas that require more specific proficiency testing programs. These include tests within the domains of microbiology and immunology.
But there are no specialty areas requiring proficiency testing indicated for molecular or biochemical genetic testing, despite the recognition by Congress that proficiency testing “should be the central element in determining a laboratory’s competence, since it purports to measure actual test outcomes rather than merely gauging the potential for accurate outcomes” (DHHS, 1988; CMS and DHHS, 2004; Javitt, 2006).
In 2000, the Secretary’s Advisory Committee on Genetic Testing generated a report that concluded that the oversight of genetic tests was insufficient to ensure their safety, accuracy, and clinical validity and recommended that CMS develop a specialty area for genetic testing under CLIA. Draft guidelines for genetic testing quality from the international Organisation for Economic Co-operation and Development also identified proficiency testing and lab quality as critical to ensuring health (OECD, 2006). In March 2006, the Genetics and Public Policy Center of Johns Hopkins University conducted a survey of laboratory directors of genetic testing laboratories and found widespread support for creation of a genetic testing specialty under CLIA (Hudson et al., 2006). Furthermore, the survey found that proficiency testing was linked to greater accuracy of genetic testing, although at least a third of genetic testing laboratories fail to perform proficiency assessments for some or all of their tests. In April 2006, CMS proposed rule making that would create such a specialty area, but three months later CMS stated that existing CLIA regulations are adequate to protect public health, asserting that there is insufficient “criticality” to warrant rule making for genetic testing (Genetics & Public Policy Center, 2006; Hudson, 2006). The committee agrees with the need for oversight and recommends developing a specialty area for molecular diagnostics.
Clearly there is significant variability in the scrutiny of biomarker tests before and after entry into the market. This lack of consistency and transparency in the biomarker development process is problematic for two important reasons. First, the variability and uncertainty associated with oversight and assessment of biomarker tests are disincentives to innovation by developers. As noted above, the FDA has previously claimed legal authority over diagnostic tests, but it has usually refrained from exercising that authority. Recently, the FDA has acted on a case-by-case basis, through letters and guidance documents, to provide clarification and establish precedent regarding molecular diagnostics. But when oversight is variable, evolving, and thus hard to predict, it can substantially increase the risk of development. Unanticipated action by the FDA can result in delays and greatly increase the cost of development. As noted in Chapter 4, the variability and unpredictability of health care coverage add another layer of risk and uncertainty for developers. Once a test enters the market, coverage decisions often depend on convincing evidence of clinical usefulness, but those decisions are made on an ad hoc basis and vary by payor, as there are no widely accepted guidelines for evidence standards.
Second, a lack of regulation and consistent assessment prior to market entry can lead to inappropriate adoption and use of biomarkers, unnecessarily increasing health care costs and potentially harming patients. Many diagnostic tests in use have not been validated or formally evaluated. Companies develop their own assessment criteria and standards for developing and marketing diagnostic tests on a case-by-case basis and generally choose the path of least resistance to market. Competition tends to erode standards of evaluation, since the more rigorous the standard, the longer and more costly the development process and the less likely the test is to be first to market. Most diagnostic tests enter the clinic as homebrew tests, which are exempt from FDA approval or clearance. Furthermore, even if a company seeks and obtains FDA approval, laboratories can develop their own in-house homebrew test and use that in place of the FDA-approved test. No federal agency currently enforces the accuracy of marketing claims made for homebrew tests.
Thus, there is a great need for a coherent strategy to make the biomarker development and adoption process more transparent, to remove inconsistency and uncertainty, and to elevate the standards and oversight applied to biomarker tests. No federal agency currently takes responsibility for ensuring the clinical validity of biomarkers, but oversight and ownership of the process are key to developing strategies and making effective and efficient progress in the field. The committee strongly urges the designation of an appropriate federal agency to provide leadership in the process and to coordinate and oversee interagency activities. The National Institute of Standards and Technology (NIST) is an appealing candidate. Although it has had a limited role in biomarker development to date, due in part to financial constraints, it has the appropriate experience to play a broader role in the establishment of standards for biomarkers if given appropriate funding for that purpose. NIST’s standards work in health care and clinical chemistry is well established, and, more recently, NIST has begun some work related to cancer molecular genetics technology and standards, as well as work with the Early Detection Research Network of the National Cancer Institute (NCI) on cancer biomarker validation (Barker, 2003).
An important first step would be to convene all relevant government agencies (e.g., the National Institutes of Health [NIH], the FDA, CMS, the Agency for Healthcare Research and Quality, NIST) and non-government stakeholders (e.g., academia, the pharmaceutical and diagnostics industries, and health care payors) to work together in developing a transparent process to create well-defined consensus standards and guidelines for biomarker development, validation, qualification, and use to reduce the variability and uncertainty in the process of development and adoption. For example, FDA, CMS, and industry should work together to develop guidelines for clinical study design that will enable sponsors to run a single study (or a minimal number of studies) to generate adequate clinical data for review by both agencies. Optimizing clinical study design in this way could shorten time to market, reduce cost and risk, and strengthen the evidence base for evaluation. The FDA noted the importance of standards when formulating its Critical Path Opportunities List in 2006 by including several projects aimed at devising standards for microarray and proteomics-based identification of biomarkers, mapping the process and criteria for qualifying biomarkers for use in product development, and developing clinical trial data standards (DHHS and FDA, 2006).
Developing a complete set of guidelines and standards is an ambitious goal, as different guidelines will probably need to be developed for different stages of the development pathway, for different applications, such as for the different stages of drug development and clinical application (e.g., screening, diagnosis, treatment planning, response monitoring, and surrogate endpoints), for different technologies, and for single biomarkers versus panels or patterns. A flexible and adaptable process for monitoring the guidelines will also be needed so that they can be revised as technologies or evidence change, and they will probably require regular review and updates.
There are many informative examples that could serve as precedents to guide this process, as many professional organizations and collaborative groups have already proposed guidelines for various steps in the process (Box 3-4). These initiatives provide an excellent starting point, but there are numerous gaps in the continuum of biomarker development, adoption, and use that need to be filled. In addition, with so many sources of standards development, there is potential for overlap, with competing or conflicting standards that could lead to confusion. The situation could be greatly improved if a single entity took responsibility for providing overarching leadership in the area of biomarker development and use. Furthermore, adherence to most of these guidelines is voluntary, so it is important to devise strategies to ensure compliance, including both incentives and penalties. For example, the reporting standards of the Standards for Reporting of Diagnostic Accuracy (STARD) initiative (NCI Division of Cancer Prevention, 2006) have been adopted by a range of journals in reporting results of clinical studies of diagnostic tests. Similarly, the Minimum Information About a Microarray Experiment (MIAME) guide (MGED Society, 2005), developed by the Microarray Gene Expression Data Society, defines requirements for effective reporting on the entire process of collecting, managing, and analyzing microarray data so that the data can be reused and interpreted by others. The MIAME guide has been adopted by several scientific journals as a requirement for publication, although the stringency of compliance enforcement by the journals likely varies. Broad adherence to these guidelines could be ensured if adherence were required for receipt of funding from federal agencies, including NIH.
It is also critically important for the FDA to clarify its authority over biomarker tests linked to clinical decision making and then establish and consistently apply clear guidelines for compliance with the requirements of biomarker test oversight. The committee recognizes that the FDA has limited resources and that additional funding will be necessary to make meaningful changes in this arena. That said, it should be noted that expanding FDA regulation to all homebrew tests would not be a wise use of limited FDA resources, and it is not desirable. However, it is desirable for the FDA to define a set of criteria for molecular diagnostic tests that would trigger additional oversight for those tests that are complex and are most likely to have an impact on the public’s health. The process for establishing these criteria and the associated regulations will need to be dynamic in order to adapt to rapid changes in technology.
The committee also strongly recommends that the appropriate federal agency (the Federal Trade Commission [FTC] or the FDA) effectively monitor and enforce marketing claims made for molecular diagnostics. The FDA’s ASR ruling limits ordering of homebrew tests using ASRs to a health professional or “other persons authorized by state law.” But the regulation does not preclude a health professional who is an employee of the company that offers a homebrew test from ordering the test for patients (Hudson and Javitt, 2006). CLIA oversight also does not restrict when and for whom a test may be performed, unlike the labels for FDA-approved drugs, which must specify indications for use in order to enter the market (Hudson and Javitt, 2006). This lack of regulatory definition of who may authorize the use of a homebrew test and for whom has opened the door for direct-to-consumer advertising of homebrew tests and their use by consumers without consulting their physicians.
At least eight companies currently promote genetic testing for health-related conditions directly to consumers through their websites, and more are expected to do so in the future (Hudson and Javitt, 2006). The accuracy of this direct-to-consumer advertising of homebrew tests is not being regulated by the FDA, which has regulatory authority over advertising claims only for the products it reviews and approves or clears. The FTC prohibits false or misleading advertising and has claimed it will take action against such advertising of genetic tests. But the agency’s limited resources appear to be
|
BOX 3-4 Examples of Standards and Guidelines for the Development and Use of Biomarkers
Microarray Gene Expression Data Society
The initial goal of the Microarray Gene Expression Data (MGED) Society was to create standards for presenting and exchanging microarray data in order to improve the quality and reproducibility of microarray studies. MGED was founded as a grassroots movement in November 1999 and transitioned into an international nonprofit organization in 2002. MGED seeks to establish standards for microarray data annotation and exchange, facilitate the creation of microarray databases and related software, and promote sharing of microarray data. MGED plans to expand these goals to other functional genomics and proteomics high-throughput technologies in the future.
Member organizations meet to exchange information and discuss goals in annual international conferences. MGED also supports seminars and tutorials for programmers and others interested in microarray design, quality control, and more. Seminars and conferences encourage attendees to contribute suggestions and improvements for MGED projects. MGED is currently pursuing six major projects in the form of working groups that confer via closed e-mail discussion boards.
One major product of MGED is the creation of guidelines for the Minimum Information About a Microarray Experiment (MIAME) that should be reported so that others may unambiguously repeat and interpret microarray experiments. MIAME’s focus is on the content and structure of microarray information rather than on the format for capturing that information. While it serves as a guide to the development of microarray databases and data management software, it does not address different types of experiments. MIAME recommends that all reported microarray experiments provide annotation of samples, reliability estimates for particular data points, and standardized vocabularies and ontologies.
MIAME is now a requirement for publication in several scientific journals, including the Nature Group, The Lancet, Cell, and EMBO Journal. The guidelines have also inspired the development of standards in other fields, including metabolomics and proteomics.
SOURCES: Brazma et al., 2001; IOM, 2006; MGED, 2006.
Microarray Quality Control Project
Sparked by a 2003 paper by Margaret Cam that showed significant inconsistencies in microarray data across different platforms, the FDA initiated the Microarray Quality Control (MAQC) Project in 2005. The MAQC is part of the FDA’s Critical Path Initiative to modernize the scientific process by which potential drugs, medical devices, and biological products are developed into medical products. The MAQC Project seeks to validate microarray technology and publish standards for data from microarrays and other technologies, such as QRT-PCR, that will be made available to the microarray community. These standards will define thresholds and quality measures that can be used to assess the precision and comparability of data across different technology platforms. Such comparisons should help to identify and eliminate any systematic biases that may exist between microarrays and QRT-PCR.
The MAQC Project is unique in that it is larger and more comprehensive than other comparisons and datasets generated thus far. The project involved six FDA centers, major producers of microarray platforms and RNA samples, the Environmental Protection Agency, the National Institute of Standards and Technology, and academic research centers. A total of 20 microarray products and 3 alternative technologies were used to perform over 1,300 tests at different labs. A total of 1,329 microarrays were used in the project (for a complete list of all microarrays used in the project, go to http://www.fda.gov/nctr/science/centers/toxicoinformatics/maqc/docs/MAQC_Summary_1stPhase.pdf).
The results of the MAQC Project show that, overall, levels of variation between microarray runs at different sites and with different platforms were relatively low (total coefficient of variation ranged from 10 to 20 percent) and reproducibility was high (expression results overlapped 70–90 percent of the time). Although these results suggest that microarray technology may be more reliable for clinical applications than previously thought, some caution that the MAQC studies were conducted in “optimized” settings that may be difficult to recreate due to time and sample processing restrictions.
Moreover, some argue that the vast majority of variability in the data is biological rather than technical. However, researchers can now compare their microarray technologies against the MAQC data to assess their genomic data quality. MAQC Project members met in September 2006 to discuss how to make microarray data useful in clinical settings. The MAQC results were released to the public on September 8, 2006, and published in the September 2006 issue of Nature Biotechnology. Those articles, which summarized MAQC findings and data sets, can be viewed free of charge at http://www.nature.com/nbt/focus/maqc/index.html. The FDA plans to publish guidance on microarray quality control and data analysis in December 2007. SOURCES: Tan et al., 2003; Couzin, 2006; DHHS, 2006; FDA, 2006d; Frueh, 2006; Perkel, 2006a,b. |
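The two summary measures quoted for the MAQC results, the coefficient of variation and the percentage overlap between expression results, can be illustrated with a minimal Python sketch. The function names, the replicate intensities, and the gene lists below are hypothetical, and the overlap definition (the share of one list also found in the other) is an assumption; the MAQC publications may define these measures differently.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the CV (sample standard deviation / mean) as a percentage."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * stdev(values) / m


def percent_overlap(list_a, list_b):
    """Return the percentage of items in list_a that also appear in list_b."""
    set_a, set_b = set(list_a), set(list_b)
    if not set_a:
        raise ValueError("list_a must not be empty")
    return 100.0 * len(set_a & set_b) / len(set_a)


# Hypothetical signal intensities for one probe measured at five sites.
site_signals = [100.0, 110.0, 90.0, 105.0, 95.0]
cv = coefficient_of_variation(site_signals)

# Hypothetical differentially expressed gene lists from two platforms.
platform_a = ["G1", "G2", "G3", "G4", "G5"]
platform_b = ["G2", "G3", "G4", "G5", "G6"]
overlap = percent_overlap(platform_a, platform_b)

print(f"CV = {cv:.1f}%, overlap = {overlap:.0f}%")
```

A laboratory comparing its own platform against a reference data set could apply the same two functions to its replicate runs, although the thresholds that count as acceptable would have to come from the published standards rather than from this sketch.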
|
Human Proteome Organization Proteomics Standards Initiative The Human Proteome Organization (HUPO) Proteomics Standards Initiative (PSI), founded in 2002, aims to define community standards for data representation in proteomics to facilitate data comparison, exchange, and verification. It has numerous working groups that consist of academic, government, and industry researchers, software developers, publishers, and instrument manufacturers. The groups are developing a set of Minimum Information About a Proteomics Experiment (MIAPE) documents to provide guidelines on how to adequately report on various types of proteomics experiments. HUPO plans to eventually publish these documents, with the expectation that the requirements within will be enforced by journals, compliant repositories, and funders. The group is also working to develop formats for data exchange, as well as standardized vocabularies and ontologies. SOURCE: HUPO, 2006. American Society for Biochemistry and Molecular Biology Criteria for Publication of Proteomics Data The American Society for Biochemistry and Molecular Biology held a workshop in May 2005 to create a standardized set of criteria for the publication of proteomic data so that the entire proteomics community, including both specialists and nonspecialists, could confidently understand and use the standards for acceptable proteomics data. Attendees included members of the editorial advisory boards of major publishing groups and journals focusing on proteomics. Participants were divided into subgroups that were assigned different aspects of the guidelines. The guidelines created by editors of the Molecular and Cellular Proteomics journal, published in 2004, were used as a framework on which to build. During the workshop, each subgroup drafted and presented a preliminary set of criteria. The final criteria were published in March 2006. 
Proteomics guidelines for publication are now posted on the websites of Proteomics, Molecular and Cellular Proteomics, and the Journal of Proteome Research. The enforcement policies for adherence to the criteria are at the discretion of the journal’s editorial staff. SOURCES: Beavis, 2005; Cottingham, 2005. |
|
NIH Workshop: Standards in Proteomics Bethesda MD, January 4–5, 2005 The Standards in Proteomics workshop was a component of the Building Blocks, Biological Pathways, and Networks subdivision of NIH’s Roadmap for Medical Research initiative. The general goal of the meeting was to develop a community-based plan for consistent analysis, representation, dissemination, and publication of proteomic data. Participants also discussed a variety of logistical and technical issues related to implementing standardized data repositories for proteomics experiments. No definitive guidelines were derived from the meeting, but specific goals included the following:
SOURCE: NIH, 2006. Standards for Reporting of Diagnostic Accuracy The Standards for Reporting of Diagnostic Accuracy (STARD) evolved out of a 1999 Cochrane Colloquium meeting at which the Cochrane Diagnostic and Screening Test Methods Working Group noted the substandard reporting of diagnostic test evaluations. To improve the accuracy and completeness of reporting of studies on diagnostic accuracy, the working group created a steering committee of international experts to assess what should be included in a study report on diagnostic accuracy that would allow the reader to detect potential bias in the study, as well as to assess the applicability of the results. The steering committee did an extensive search of publications on the conduct and reporting of diagnostic studies and then convened a consensus meeting that included researchers, editors, methodologists, and professional organizations. |
|
The result was a 25-item general checklist and a flow chart that together help authors describe the essential elements of their design and conduct of the study, the execution of tests, and the results. The checklist specifies exactly what is needed for the title, abstract, introduction, methods, results, and discussion sections of a journal article written about a diagnostic study. The flow diagram indicates visually the process of sampling and selecting participants, the portion of participants that received the test or reference standard, and the portion of patients at each stage of the study. The STARD group thought that their general checklist for reporting studies of diagnostic accuracy applicable to research in any field would be more likely to be adopted by authors, peer reviewers, and journal editors than different checklists for each field. STARD was published in 2003 and offers voluntary guidelines for researchers and reviewers of research articles related to diagnostics tested clinically, although some journals have already adopted them as requirements for publication. SOURCE: Bossuyt et al., 2003. NIST–EDRN Workshop on Standards and Metrology for Cancer Diagnostics The National Institute of Standards and Technology and the Early Detection Research Network (EDRN) jointly sponsored a workshop in August 2005, with the aim of comparing the performance characteristics of different analytical platforms; to assess the needs for standard methods, assays, and reagents for cancer biomarker development and validation; and to make recommendations for the development of standard reference materials and standard operating procedures. The workshop focused on the following areas:
This workshop was one component of a broader effort on the part of EDRN to develop and test standards and paradigms for early cancer detection biomarkers. SOURCES: NCI Division of Cancer Prevention, 2005, 2006; Barker et al., 2006. |
|
Receptor and Biomarker Group of the European Organization for Research and Treatment of Cancer The Receptor and Biomarker Group (RBG) of the European Organization for Research and Treatment of Cancer (EORTC) comprises cancer researchers and clinicians from 18 European countries and primarily serves the European Community. The RBG-EORTC tries to establish and maintain the analytic and clinical validity of tumor biomarker tests by establishing quality assurance schemes that are obligatory for all markers used in the EORTC clinical trials, as well as by having its experts advise clinical cancer researchers on appropriate methodology and interpretation of results for tumor assays. The RBG-EORTC provides guidelines for the assay performance and handling of test materials collected in retrospective or prospective clinical trials and also provides procedures for preclinical laboratory testing. The group also provides biomarker laboratory reference materials to aid standardization of biomarker assays, as well as its own sensitive and specific assays for a number of biomarkers, including urokinase-type plasminogen activator and vascular endothelial growth factor. On a regular basis, the RBG-EORTC evaluates new tumor biomarkers and makes recommendations to international certifying boards and also to the European Commission for Registration of Biological Reagents. In addition, the RBG-EORTC tests new commercial kits for existing biomarkers and evaluates them, when appropriate, against currently accepted assays. SOURCE: Schmitt et al., 2004. NCI-EORTC REMARK (Reporting of Tumor MARKer Studies) GUIDELINES Recognizing that the number of biomarkers that become clinically useful is “pitifully small” compared with the number of reports on tumor markers, these 2005 guidelines were developed by the Statistics Subcommittee of the National Cancer Institute-European Organization for Research and Treatment of Cancer (NCI-EORTC) working group.
The goal of the guidelines is to improve the reporting standards for published clinical tumor marker studies to allow adequate assessment of the quality of the study and the generalizability of study results and to improve the ability to compare results across studies. These are voluntary guidelines for researchers and reviewers of journal articles, although some journals may require adherence to the guidelines for publication. The guidelines focus on what should be reported for studies on clinical prognostic markers (those that predict clinical outcomes irrespective of treatment), although some of their requirements are also relevant to studies on predictive markers (those that predict response to specific treatments) and to tumor markers that are early in their development and have yet to be applied in a clinical setting. The REMARK guidelines were developed mainly for studies that evaluate single tumor markers and are not applicable to genomic or proteomic studies that simultaneously evaluate large numbers of markers. To develop the guidelines, the NCI-EORTC working group, composed of statisticians, clinicians, and laboratory scientists, considered literature citing inadequate reporting or problematic analysis methods in published studies of tumor markers, as well as similar reporting guidelines developed for other types of medical research studies. The guidelines do not have specifications unique to tumor markers or the technologies used in their assays, but rather they list the relevant information that researchers should provide about their study objectives, materials and methods, study designs, statistical analyses, and results. The guidelines also suggest helpful presentations and analyses of data and require that a discussion include the limitations of the study and the clinical value of its results. As the working group noted, “high-quality reporting of a study cannot transform a poorly designed or analyzed study into a good one, but it can help to identify the poor studies, and we believe it is an important first step in improving the overall quality of tumor marker prognostic studies” (p. 9068). SOURCE: McShane et al., 2005.
American Society of Clinical Oncology Clinical Practice Guidelines for the Use of Tumor Markers in Breast and Colorectal Cancer (2000 updated version) The American Society of Clinical Oncology (ASCO) developed conservative voluntary clinical guidelines for physicians that recognize their ultimate use/application depends on the physician’s judgment, taking into consideration each patient’s individual circumstances. To determine the clinical suitability of a cancer biomarker, the guidelines committee used the medical literature to evaluate how six tumor markers for colorectal cancer and eight for breast cancer affected such clinical outcomes as overall survival, disease-free survival, quality of life, toxicity, and cost-effectiveness of treatment. The guidelines committee considered the strength of the evidence from each study based on the quality of the study, with the most weight placed on evidence gathered from large, prospective, randomized controlled clinical trials. All clinical uses of the biomarker were considered, including screening, diagnosis, staging, surveillance, and monitoring response to treatment. |
|
The guidelines committee disregarded strong correlations between disease progression or disease response and a specific result on a biomarker test if physicians could not reliably use the result to alter a clinical course. For example, blood levels of the biomarker CA 27.29 tend to increase as breast cancer progresses, so that one well-designed study found it could detect recurrence about 5 months, on average, before other symptoms or tests. But that ability did not change therapy options or show a documented effect on disease-free or overall survival. Consequently, the guidelines do not recommend using CA 27.29 as a monitoring tool for breast cancer recurrence. Of the biomarkers, the committee recommended only the clinical use of hormone receptor (ER, PR) or HER-2 status of breast tumors to determine treatment. SOURCE: Bast et al., 2001. Tumor Marker Utility Grading System (TMUGS) and TMUGS-Plus The large number of preliminary studies, but few definitive studies, on tumor markers prompted some members of the ASCO committee that developed its practice guidelines for the use of tumor markers in breast or colorectal cancer to draft a clinical tumor marker utility grading system (TMUGS), which was published in 1996. The purpose of the TMUGS is to help research scientists design studies that will provide clinically useful data on tumor markers and to help expert reviewers evaluate published studies on tumor markers. The TMUGS requires researchers to specify how the tumor marker is evaluated, provide materials and methods details, and detail how the marker can be used clinically. From this it gives a 0 to 3 rating of the relative utility of a tumor marker for a specific use and outcome. Only those that received 2+ or 3+ were included in the ASCO guidelines. To get these grades, tumor markers had to be reliable and provide information that clinicians could use in their decision making.
The reliability of the tumor marker was based on the quality of the studies assessing it, with more reliability attached to tumor markers evaluated in large, prospective, randomized studies or in meta-analyses of studies that provide lower levels of evidence. The TMUGS-Plus system was developed by British and American researchers and published in 1998. This system builds on TMUGS with the addition of a decision matrix in which weak, moderate, and strong prognostic categories are intertwined with weak, moderate, and strong predictive categories to enable reviewers to consider both when determining the clinical utility of a given tumor marker. The relative strength category in which a prognostic or predictive marker is placed is determined by how much it moves patients between prognostic stages or treatment response outcomes, respectively. TMUGS was designed not only to aid expert assessments of published data regarding the clinical utility of tumor markers, but also to help clinical investigators design tumor marker studies that will reveal the clinical utility of the marker. The authors state: “We do not suggest that this system is useful for application of a factor to an individual patient’s situation. Rather, we propose that the TMUGS-Plus system can be used to determine whether available data support the introduction of a tumor marker into routine clinical use. The individual physician and patient will then need to decide if the marker data are relevant to her particular situation” (p. 408). Both TMUGS and TMUGS-Plus are voluntary systems for evaluating clinical studies of tumor markers. SOURCE: Hayes et al., 1996, 1998. College of American Pathologists Conference on Solid Tumor Prognostic Factors The College of American Pathologists (CAP) convened a conference in 1999 to examine prognostic and predictive factors in breast, colon, and prostate cancers, with the aim of stratifying these factors into categories based on the strength of published evidence. Conference goals included reducing variation in methods, interpretation, and reporting, as well as developing strategies for implementing changes in how prognostic and predictive factors are evaluated and used. Working groups focused on cancer type–specific issues as well as issues common to all solid tumors. SOURCE: Hammond et al., 2000. Evaluation of Genomic Applications in Practice and Prevention More than 1,200 genetic tests for diseases have been developed, and 950 are available for clinical use.
Concerns over the safety and utility of these tests prompted the initiation of the Evaluation of Genomic Applications in Practice and Prevention (EGAPP) in fall 2004 by the Office of Genomics and Disease Prevention at the Centers for Disease Control and Prevention (CDC). EGAPP draws from prior work conducted at the CDC by the ACCE projects (the name is derived from the four components of evaluation: analytic validity, clinical validity, clinical utility, and associated ethical, legal, and social implications), which proposed and tested a system for collecting, analyzing, and disseminating existing data on the safety and efficacy of DNA-based genetic tests. The overarching goal of the project is to develop a coordinated process for evaluating genetic tests and other genomic applications that are in transition from research to clinical and public health practice. In April 2005, an interagency steering committee of the Department of Health and Human Services established a nonfederal, independent working group of 13 multidisciplinary experts. The EGAPP working group is charged with providing clear linkages between scientific evidence and subsequent recommendations for genetic tests by serving on technical review panels, providing guidance on projects, establishing methods and protocols, and selecting topics for review. Current topics under review include:
The first three evidence reviews are being conducted by the Agency for Healthcare Research and Quality (AHRQ) Evidence-based Practice Centers, and the fourth is a more targeted review by a technical contractor. At a recent meeting (June 2006), the working group presented subcommittee draft reports and decided on final topics for the third year. The final expected products from the working group are three to five major reviews, two to three fast-track reviews, and a document on methods and evaluation. A long-term goal of the EGAPP is to create a sustainable process for pre- and postmarket assessment of genetic tests and other genomic applications in the United States. A critical component of EGAPP’s work is making these reviews and subsequent recommendations accessible to the public. One of EGAPP’s project activities is to develop informational messages that are targeted to specific audiences that would find the recommendations most relevant and useful. This information is intended to aid health care providers, payers, consumers, and policy makers to make informed decisions about the safety and efficacy of genetic tests and to safeguard against tests that may be released prematurely. SOURCE: CDC, 2006. |
preventing it from following through on its commitment. The FTC has yet to interfere with the direct-to-consumer claims made by the manufacturers of genetic tests, some of which appear to be false and misleading, according to some observers (Hudson and Javitt, 2006). Although the FTC, the FDA, and CDC recently issued a public alert to consumers about direct-to-consumer marketing of genetic tests, the message did not indicate that any actions were planned by any of those agencies (FTC, 2006).
Effective postmarket surveillance will also be needed to ensure the quality and accuracy of diagnostics, regardless of whether a biomarker test enters the market as a homebrew or with FDA approval or clearance. Although CLIA was intended to ensure the quality and accuracy of clinical laboratory tests, CLIA oversight appears insufficient to guarantee the accurate measurement and reporting of biomarker test results. Experience with two well-established, prototypical cancer biomarkers that guide therapy
|
BOX 3-5 Estrogen Receptor—The Classic Cancer Biomarker For many decades the estrogen receptor (ER) has been used as a prognostic factor and as a predictor of response to endocrine therapies for breast cancer. It is considered a category I breast cancer prognostic factor by the College of American Pathologists, meaning that it is of proven prognostic importance and is useful in patient management. As such, accurate and reliable assessment of ER status is paramount for optimal breast cancer care. The usefulness of ER as a predictor of therapeutic response was first noted by retrospective review of patient data from many clinical trials conducted around the world. From those trials, 436 patients were identified in which the treatment response was recorded and ER was measured using one of several techniques, generally entailing some form of quantitative biochemical binding assay on fresh tumor samples. The results indicated that 55–60 percent of patients positive for ER responded to endocrine therapy, while those who were negative for ER had virtually no chance of responding. Since the late 1990s, a semi-quantitative test based on immunohistochemistry (IHC) has become the method of choice, primarily because of its ease of use, reduced cost, and the ability to perform the assay on small samples of fixed tissue. Although studies have shown that IHC is equivalent or superior to binding assays (which are also known to produce false results), there is widespread concern that the variability |
decisions for breast cancer patients—the estrogen receptor and the HER2 receptor—demonstrate the enormous challenges associated with the standardization and quality assurance of such tests (IOM, 2006; Boxes 3-5 and 3-6). There is a great deal of variation in the way these markers are measured and reported, and studies indicate a high rate of inaccurate results.
Surveillance and quality assurance activities, perhaps including proficiency testing, data collection and review, and/or inspections, could potentially be overseen by either the FDA or CMS and might be financed via user fees. A quality assurance testing program in place in the United Kingdom is an example of a possible model. This best practice program allows laboratories to compare their performance against reference materials and other laboratories and hence identify whether they have a testing problem (Ellis et al., 2004). The accompanying educational material and instructional assistance allow most laboratories to identify and rectify their
|
and inaccuracy of the test and interpretation of the results may lead to an unacceptably high error rate in determining ER status. The positive predictive value of ER tests is estimated to be in the range of 60 to 80 percent. The ER IHC test is not standardized, and many laboratories use FDA-approved reagents in different ways. In addition, there is no universal consensus on a scoring system for interpreting the results. A number of suggestions have been made to improve the reliability of the test results, including further improving and standardizing test kit reagents and controls, staining procedures, and scoring methods. Automated image analysis could perhaps also lead to more consistent and accurate results, but software programs must first be standardized and validated as well. The situation is likely to become even more complex as newer methods have been developed to measure ER mRNA rather than protein. However, these methods are not yet widely used, in part because most labs are not equipped to conduct the tests, and they also are not fully standardized. Furthermore, new endocrine therapies, including aromatase inhibitors, are now available to treat breast cancer, and emerging evidence suggests that optimal response to a particular endocrine therapy depends on the level of ER expression, not just whether it is positive or negative. SOURCES: McGuire, 1975; Fitzgibbons et al., 2000; Diaz and Sneige, 2005; Ross, 2005. |
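The accuracy terms used in this box and in Box 3-1 (sensitivity, specificity, and predictive value) all derive from a two-by-two table that cross-classifies test results against the true outcome. The Python sketch below shows the arithmetic. The function name and the patient counts are hypothetical and are not taken from any study cited here; "outcome" could stand for true receptor status or for response to endocrine therapy, depending on what the test is being judged against.

```python
def accuracy_measures(tp, fp, fn, tn):
    """Compute standard accuracy measures from 2x2 table counts.

    tp: test positive, outcome present    fp: test positive, outcome absent
    fn: test negative, outcome present    tn: test negative, outcome absent
    """
    if min(tp, fp, fn, tn) < 0:
        raise ValueError("counts must be non-negative")
    if 0 in (tp + fn, tn + fp, tp + fp, tn + fn):
        raise ValueError("every row and column of the table needs a patient")
    return {
        "sensitivity": tp / (tp + fn),  # share of true positives detected
        "specificity": tn / (tn + fp),  # share of true negatives detected
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts: 100 test-positive and 100 test-negative patients.
measures = accuracy_measures(tp=60, fp=40, fn=2, tn=98)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

With these illustrative counts, a positive result is followed by the outcome 60 percent of the time, while a negative result almost always rules it out, which mirrors the qualitative pattern described for ER and endocrine therapy.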
|
BOX 3-6 Herceptin/HercepTest Development and Approval The drug Herceptin (trastuzumab) targets the HER2 protein (human epidermal growth factor receptor 2), which is overexpressed in about 25 percent of breast cancer cases due to amplification of the gene. It has been widely noted that the efficacy of trastuzumab could not have been demonstrated in clinical trials in the absence of a selective biomarker to identify the patients most likely to respond (those with elevated expression of HER2). Mathematical models indicate that the clinical efficacy of the drug would have been difficult, if not impossible, to demonstrate with the number of patients typically recruited for a clinical trial if the study population had not been enriched with responders via a biomarker test for HER2. However, the assay used by Genentech during the clinical trials of trastuzumab was deemed inadequate for commercialization, so the company approached Dako Corporation to co-develop a commercial immunohistochemistry kit (HercepTest). The kit was then validated by demonstrating equivalence to the clinical trial assay. The FDA approved both Herceptin and HercepTest in September 1998. Dako then launched a comprehensive education program for pathologists. Because evidence shows that women with high levels of HER2 overexpression are more likely to respond to trastuzumab, accurate reporting of patient test results is crucial for therapeutic decision making. However, studies have reported considerable variability in the accuracy of the test across different labs. In the general clinical population there are high false-positive and false-negative rates for the HercepTest as well as the fluorescence in situ hybridization (FISH) test, which measures HER2 gene amplification.
Although large central reference laboratories generally perform both these tests well with low false-positive and false-negative rates, small-volume laboratories, particularly those that use homebrew tests, have very high false-positive and false-negative rates. Discussions regarding the interpretation of the IHC test, as well as whether it is the most accurate test to use (i.e., compared with FISH testing for HER2 gene amplification) have also generated significant controversy. SOURCES: Jacobs et al., 1999; Pauletti et al., 2000; Paik et al., 2002; Ellis et al., 2005; IOM, 2006; Perez et al., 2006; Reddy et al., 2006. |
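The point about enrichment can be made concrete with a rough sample-size calculation. The Python sketch below is not the mathematical model referred to in this box; it uses the standard normal-approximation formula for comparing two response rates, and every number in it (a 30 percent control response rate, a 20-point benefit confined to marker-positive patients, and a 25 percent marker-positive fraction) is a hypothetical assumption chosen only for illustration.

```python
from math import ceil
from statistics import NormalDist


def patients_per_arm(p_control, p_treated, alpha=0.05, power=0.80):
    """Approximate patients per arm needed to detect p_treated vs p_control
    with a two-sided test of two proportions (normal approximation)."""
    if p_control == p_treated:
        raise ValueError("the two response rates must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_treated - p_control) ** 2)


# Hypothetical assumptions (not taken from the trastuzumab trials).
control_rate = 0.30       # response rate without the drug
benefit_in_marker = 0.20  # added response rate in marker-positive patients
marker_fraction = 0.25    # share of patients who are marker positive

# Enriched trial: only marker-positive patients are enrolled.
n_enriched = patients_per_arm(control_rate, control_rate + benefit_in_marker)

# Unselected trial: the benefit is diluted across all patients.
diluted_rate = control_rate + marker_fraction * benefit_in_marker
n_unselected = patients_per_arm(control_rate, diluted_rate)

print(f"enriched: {n_enriched} per arm, unselected: {n_unselected} per arm")
```

Under these assumptions the unselected trial needs more than ten times as many patients per arm as the enriched trial, because the required sample size grows roughly with the inverse square of the diluted treatment effect.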
problems. In the UK HER2 Quality Assurance Program, which publishes its collective results (Rhodes et al., 2004), retesting of over 100 European laboratories on 6 successive occasions resulted, over a 2-year period, in a significant improvement in the number of laboratories achieving acceptable HER2 test results. The U.S. National Comprehensive Cancer Network HER2 Task Force recently recommended that “HER2 testing should be done only in laboratories accredited to perform such testing,” noting that “such proficiency testing will probably become mandatory for laboratory accreditation in the future” (IOM, 2006; Carlson et al., 2006).
Because aberrant cell growth is a hallmark of cancers, oncology drug development has traditionally focused on agents that inhibit the basic machinery of cell division. As a result, these drugs often have significant side effects due to activity against normal proliferating tissues in the body. Many new cancer therapies are being developed to target the specific molecular changes in cancer cells that allow them to bypass normal regulation of signaling pathways that control cell growth and survival, with the goal of greater efficacy and fewer side effects. However, because of the heterogeneity among tumors, it is important to develop accompanying diagnostic tests that can identify those patients with the specific molecular changes targeted by a drug, who are most likely to benefit from that particular drug. Yet development of biomarker-based tests has lagged and is often undertaken outside the company developing the drug.
A prime example of this phenomenon is the development process for the targeted drug trastuzumab (Herceptin) and the accompanying diagnostic, the HercepTest (Box 3-6). In that case, although a biomarker test used by Genentech to select patients for inclusion in the clinical trials of trastuzumab was invaluable for successfully demonstrating efficacy of the drug, the test was deemed inadequate for commercial clinical use. As a result, another company (Dako Corporation) was asked to develop a commercial test at a very late stage of the drug development process. In the case of the epidermal growth factor receptor (EGFR) inhibitor cetuximab, a biomarker test to measure EGFR expression was used to select patients for clinical trials and a commercial test was again developed by Dako, but the evaluation of EGFR expression by immunohistochemistry has since
been shown to be invalid for selecting responders, and a replacement test has yet to be proposed or developed (Box 3-7). It has been suggested that a diagnostic marker for EGFR inhibitors will need to incorporate various elements of the many signaling pathways that lie downstream from EGFR (Grunwald and Hidalgo, 2003). Another example is the development of antiestrogen therapies and the estrogen receptor biomarkers tests, which were completely separate in time and place (Box 3-5).
As noted in Chapter 2, the expectations of a biomarker test with respect to accuracy and performance vary depending on how it is used. A pharmaceutical company may have sufficient confidence in a biomarker they apply in phase I to establish a dose for a phase II trial even if the biomarker has not undergone stringent clinical qualification; the risk or benefit is theirs. But in the clinic, a responder/nonresponder stratification biomarker test that is going to be used to determine the appropriate treatment plan for individual patients must be highly accurate. Otherwise, a large number of patients could miss an opportunity for beneficial, life-saving therapy, while others could undergo expensive treatments and endure side effects with no chance of benefit.
Although patient stratification biomarkers are at the heart of personalized medicine, they can also create a conundrum for industry that does not arise with other biomarker applications. For example, few would argue that biomarkers that streamline the drug development process by facilitating earlier elimination of drug leads that are destined to fail or that improve dose selection should not be used. Similarly, even pharmaceutical marketing groups would support use of patient stratification biomarkers if the responder population was so small as to make stratification markers essential to demonstrate efficacy and FDA approval, as was the case for trastuzumab. But if a pharmaceutical company gets FDA approval for a drug without the use of stratification markers, it may debate whether to develop a biomarker test for patient stratification even if only one-fourth of patients respond well, as this is likely to limit the number of patients who take the drug (IOM, 2006).
Could progress in understanding cancers that enables cancer classification and thus patient stratification based on biomarker tests even lead to disincentives to drug development? For example, if the approximately 150,000 people diagnosed with colon cancer each year in the United States could be divided into multiple subsets, each with a different targeted therapy, then the market for any single drug is significantly reduced and companies will have less opportunity to recoup their developmental costs. In other words,
biomarker tumor classification could essentially convert common cancers into orphan diseases, because the number of patients with any single particular subtype would be too small for companies to justify the enormous investment needed to develop novel drugs (Rawson, 2006).
Drugs that target the EGFR again provide a case in point. Four EGFR inhibitors have been approved by the FDA to treat patients with specific types of cancer, and several more are in development (Grunwald and Hidalgo, 2003). However, to date, there are no valid biomarker tests to accurately predict which patients are most likely to respond to each drug (Box 3-7). Such biomarker tests would be enormously helpful to clinicians and patients who must make treatment choices, but the individual sponsoring drug companies historically have not had sufficient incentives to discover stratification markers, nor the expertise to do so and develop the markers as diagnostic tests.
Industry perspectives are slowly changing regarding the strategic value of stratification biomarkers, and some companies are now devoting considerable resources to discovering stratification markers and working with diagnostics companies to convert these into molecular diagnostic tests. They are coming to appreciate that patient stratification is both better medicine and better business, even when stratification is not essential for approval of the drug. First, health care resources are limited, and companies risk unfavorable reimbursement decisions if expensive cancer drugs are effective for only a small fraction of patients. In Great Britain and other countries with government-funded medicine, cost-effectiveness is considered in making coverage decisions (IOM, 2006; see Chapter 4). A drug company can either lower its price or stratify patients to increase cost-effectiveness. In essence, a company may achieve a higher price if it can direct its therapy to those patients who will benefit. Second, companies also realize that if they do not stratify their patients, a competitor may do so and rapidly take over market share. Finally, the corollary to identifying the responder population is identification of a nonresponder group for which appropriate therapy can then be developed. Genentech and other companies are now devoting considerable efforts to biomarker development aimed at patient stratification (Waring, 2006). But new strategies, methods, and infrastructure are needed to leverage and integrate the available data to better inform the biology, as noted in Chapter 2.
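The trade-off between lowering the price and stratifying patients can be sketched as a cost-per-responder calculation. This is a simplified illustration, not a formal cost-effectiveness analysis of the kind payers perform. The one-in-four response rate echoes the example given earlier in this section; the drug price, test price, fraction of test-positive patients, and response rate among them are all hypothetical.

```python
def cost_per_responder(drug_cost, response_rate, test_cost=0.0, fraction_treated=1.0):
    """Average spending per patient who responds to the drug.

    Every patient is assumed to pay test_cost (zero if no test is used);
    only fraction_treated of patients go on to receive the drug, and
    response_rate is the response rate among those treated patients.
    """
    if not 0 < response_rate <= 1 or not 0 < fraction_treated <= 1:
        raise ValueError("rates must be in the interval (0, 1]")
    spend_per_patient = test_cost + fraction_treated * drug_cost
    responders_per_patient = fraction_treated * response_rate
    return spend_per_patient / responders_per_patient


HYPOTHETICAL_DRUG_COST = 50_000  # dollars per course of treatment (assumption)
HYPOTHETICAL_TEST_COST = 1_000   # dollars per biomarker test (assumption)

# No stratification: everyone is treated and one in four responds.
unstratified = cost_per_responder(HYPOTHETICAL_DRUG_COST, response_rate=0.25)

# Stratification: a hypothetical test selects 30 percent of patients,
# of whom 75 percent respond (both values are assumptions).
stratified = cost_per_responder(
    HYPOTHETICAL_DRUG_COST,
    response_rate=0.75,
    test_cost=HYPOTHETICAL_TEST_COST,
    fraction_treated=0.30,
)

print(f"Cost per responder without stratification: ${unstratified:,.0f}")
print(f"Cost per responder with stratification:    ${stratified:,.0f}")
```

With these assumed inputs, the payer's spending per responding patient falls substantially even after paying for the test, which leaves room for the drug to hold a higher price per treated patient.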
Public funding is also being directed toward filling the stratification biomarker gap. For example, the new NCI Lung Cancer Program has announced that it will undertake a clinical trial that will attempt to define a panel of genomic and proteomic pharmacodynamic markers to predict response to EGFR inhibitors in patients with nonsmall cell lung cancer. The trial will be supported by funds from the NCI director’s discretionary budget reserve, and it will be conducted in conjunction with the FDA and CMS (Goldberg and Goldberg, 2006; Niederhuber, 2006).

BOX 3-7 EGFR Inhibitors: The Quest for Targeting Biomarkers

Most clinical trials conducted with EGFR inhibitors have not selected patients on the basis of a specific molecular marker. This is perhaps not surprising, given that EGFR overexpression is not tightly correlated with cancer progression, in contrast to HER2 expression in breast cancer. Available data are inadequate to determine what biomarkers might reliably indicate EGFR dependence and thereby select specific subsets of patients for treatment.

Gefitinib and Erlotinib

Gefitinib, owned by AstraZeneca, and erlotinib, marketed by OSI Pharmaceuticals in partnership with Genentech and Roche, are two examples of small molecule drugs that entered clinical trials without the use of a biomarker. A small clinical trial of gefitinib demonstrated a 10 percent response rate in patients with lung cancer, and the FDA granted accelerated approval in 2003. However, in December 2004, the FDA released a statement announcing that a large clinical trial of gefitinib had failed to show an overall survival advantage compared with placebo in treating patients with lung cancer. In June 2005, the FDA issued a new label for gefitinib “that limits use to patients with cancer who in the opinion of their treating physician, are currently benefiting, or have previously benefited, from gefitinib treatment.” Nonetheless, researchers express optimism for the drug if appropriate biomarkers can be identified and validated for selecting a responsive patient population. Research shows that some patients who respond to gefitinib have amplifications and/or mutations in the EGFR gene, although response to treatment was quite variable, even for the same EGFR mutation type.

Erlotinib, approved by the FDA in 2004, showed an average two-month survival benefit for patients with nonsmall cell lung cancer compared with placebo. Further analysis showed that survival benefit correlated with EGFR status. In about one-third of the patients, tumor cells were examined to see whether they had high or low levels of EGFR. Among the approximately 55 percent who had high EGFR expression, the effect on survival was much greater than it was in people whose EGFR levels were low.

In both of these cases, the targets of the aforementioned EGFR inhibitors were not validated and biomarkers were not used to assess whether the drugs were actually working for patients. The result has been a lack of empirical, clinically derived data and inefficient adoption into clinical practice.

Cetuximab

Cetuximab, a monoclonal antibody made by ImClone, was approved by the FDA in 2004. Approval was based on a clinical trial that used an immunohistochemistry test for EGFR expression (EGFR pharmDx, made by Dako and approved simultaneously by the FDA in 2004) to select colorectal cancer patients likely to respond to cetuximab. Patients were not entered into the clinical trials of cetuximab unless they had a positive result in the EGFR test (i.e., 1 percent or greater tumor cells showing positivity). The tumor response rate was 22.9 percent in patients who received cetuximab in combination with irinotecan, and 10.8 percent in patients who received cetuximab alone. However, no trials were performed with EGFR-negative patients, and further evaluation has shown that therapeutic response does not correlate with EGFR positivity, either by the number of positive cells or by staining intensity, perhaps because the staining pattern for EGFR is often quite heterogeneous. In March 2005, Chung et al. reported that EGFR-negative colorectal cancer patients treated with cetuximab in a nonstudy setting had a 25 percent response rate, suggesting that exclusion of patients from cetuximab treatment based on EGFR status is unwarranted. Thus, the EGFR test may have increased the probability that cetuximab would be approved, but it is not a valid test for making treatment decisions in the clinic. However, the FDA-approved drug label still specifies that it be used for the treatment of EGFR-expressing colorectal cancer.

Panitumumab

Panitumumab, a fully human monoclonal antibody against EGFR made by Amgen Inc., received accelerated FDA approval in September 2006 for treatment of EGFR-expressing metastatic colorectal cancer in patients with progression following chemotherapy. A randomized controlled trial of 463 patients demonstrated a significant improvement in progression-free survival in patients receiving panitumumab (mean of 96 days versus 60 days for patients receiving best supportive care). There was no difference in overall survival, however, and the approval stipulates that the manufacturer must conduct a postmarketing trial to determine whether the drug improves survival in patients with fewer prior chemotherapies. Enrollment in the phase III trial was limited to patients whose tumors were positive for EGFR expression, defined as at least 1+ membrane staining in > 1 percent of tumor cells by the Dako EGFR pharmDx test kit (approved by the FDA in September 2006 to assess patient eligibility for panitumumab as well as cetuximab). The majority of patients’ tumors exhibited EGFR expression in 10 percent or more of tumor cells, with no evidence of a correlation between either the proportion of cells expressing EGFR or the intensity of EGFR expression and response to treatment.

SOURCES: FDA News, 2003; FDA, 2004a,c; FDA News, 2004a,b; Miller, 2004; Chung et al., 2005; Hirsch and Witta, 2005; Takano et al., 2005; Amgen, 2006; FDA News, 2006b; Hsieh et al., 2006.
Progress in this field could be accelerated by better coordinating the development of biomarker diagnostics and new drugs. Such coordinated development could help companies choose the most promising drug leads, optimize clinical trial designs, and facilitate rapid and effective adoption into clinical practice (FDA, 2004b). However, there are many challenges to be addressed before this ideal approach becomes reality (IOM, 2006). For example, the cost and risks of diagnostic development are significant when clinical validity and utility must be established, and they add substantially to the already high cost of drug development (estimated at $400–800 million, on average; Frank, 2003). Companies may be unwilling to take the risk of investing in diagnostic development in the earlier phases of drug development, when approval of the drug is so uncertain (on average, only 1 out of 5 Investigational New Drugs achieves FDA approval; Dimasi, 2001). But timing is key for the coapproval and marketing of drug-diagnostic combinations. Companies need to find better ways to integrate basic and clinical research efforts and to emphasize the search for subpopulations, based on theoretical and empirical evidence, before phase III. Doing so would avoid the rush near the end of drug development (i.e., immediately prior to drug approval) to develop and validate the accompanying diagnostic.
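The timing dilemma can be illustrated with a simple expected-value calculation. If a company funds diagnostic development for every drug candidate at a stage where only some candidates will be approved, the diagnostic spending attributable to each approved drug is the per-candidate cost divided by the probability of approval. The 1-in-5 approval rate is the figure cited above; the diagnostic development cost and the later-stage approval probability are hypothetical placeholders.

```python
def diagnostic_cost_per_approved_drug(diagnostic_cost, approval_probability):
    """Expected diagnostic spending per drug that ultimately wins approval.

    Assumes a companion diagnostic is developed for every candidate at a
    stage where each candidate has the given probability of approval.
    """
    if not 0 < approval_probability <= 1:
        raise ValueError("approval_probability must be in the interval (0, 1]")
    return diagnostic_cost / approval_probability


HYPOTHETICAL_DIAGNOSTIC_COST = 20_000_000  # dollars per candidate (assumption)

# Investing at the Investigational New Drug stage: 1 in 5 candidates
# is approved (figure cited in the text).
early = diagnostic_cost_per_approved_drug(HYPOTHETICAL_DIAGNOSTIC_COST, 1 / 5)

# Waiting until a later stage with a hypothetical 60 percent chance of approval.
late = diagnostic_cost_per_approved_drug(HYPOTHETICAL_DIAGNOSTIC_COST, 0.60)

print(f"Early investment: ${early:,.0f} per approved drug")
print(f"Late investment:  ${late:,.0f} per approved drug")
```

The calculation captures only the cost side. It omits what early investment may buy, such as better trial designs and a diagnostic that is ready at the time of drug approval, which is the benefit the text argues for.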
Strategies to minimize the costs of diagnostics development and to facilitate risk sharing between pharmaceutical and diagnostics companies would also encourage development efforts. One possibility would be to link FDA approvals of therapeutics and the associated response-predicting diagnostics, such that one is contingent on the other. For example, the FDA might grant contingent approval of a drug by requiring postapproval reporting on diagnostic performance and subsequent submission of a PMA or 510(k) application for the diagnostic (IOM, 2006; Lipshutz, 2006). However, it is not clear that the FDA could compel a diagnostics company to sponsor a submission when the drug is sponsored by an unrelated pharmaceutical company. Furthermore, it seems unlikely that the FDA would rescind approval for a drug if the biomarker is subsequently shown to be invalid, as in the case of cetuximab.
The FDA should more clearly delineate the expectations and requirements for approval of diagnostic-therapeutic combinations. The FDA’s “Critical Path” white paper placed high importance on personalized medicine and the codevelopment of diagnostics and therapeutics, noting that new trial designs and methods are needed, but it did not lay out specific plans for how to facilitate codevelopment (FDA, 2004b). In its April 2005 concept paper on codevelopment, the FDA noted that codevelopment applies when the use of an in vitro diagnostic is mandatory for drug selection for patients, or when optional use during drug development may assist in understanding disease mechanisms and in selecting clinical trial populations. Furthermore, codevelopment applies to a device-drug combination product, as well as to in vitro devices and drugs sold separately. The concept paper explicitly stated that drug selection biomarkers, particularly for high-risk conditions, were expected to be subject to PMA reviews (FDA, 2005b). In response, industry representatives expressed concern that the paper proposed higher hurdles for diagnostic approval than current requirements and that clinical utility is not explicitly defined in the act (Hinman et al., 2006). A new guidance document specifically focused on diagnostic-therapeutic combinations is being drafted by the FDA, taking into account feedback on the concept paper (Woodcock, 2006), but its content, impact, and enforceability are unknown at this time.
Because more than one FDA center will often be involved in the approval or clearance decisions in the case of diagnostic-therapeutic combinations, the agency should also clarify the roles of each center and focus on ensuring coordination between the centers to facilitate clearance or approval of molecular diagnostics. In addition, the FDA needs more dynamic ways of changing a drug’s label when new data for selecting appropriate target populations emerge. When a biomarker test linked to a drug is found to be invalid (as in the case of cetuximab), the FDA should move quickly to make the necessary label changes. Conversely, when new biomarkers are found to aid therapeutic decisions for existing drugs, a formal mechanism is needed to evaluate the evidence and consider appropriate label changes.
The standards used to demonstrate the validity of biomarkers vary considerably, in part because there is no overarching leadership in the field to set uniform consensus standards for biomarker development. The FDA and CMS have some authority over diagnostic tests, but oversight has been variable and unpredictable and, in many cases, inadequate to ensure the safety, effectiveness, and value of tests on the market. Oversight by federal agencies has been evolving recently, and FDA in particular has taken some positive initial steps, but there is still a need for clarification, uniformity, and leadership in this area. The process of biomarker development and evaluation could be improved by making it more transparent, consistent, and effective.
First, government agencies, including NIH, the FDA, CMS, and NIST, and non-government stakeholders, including academia, the pharmaceutical and diagnostics industry, and health care payors, should work together to develop a transparent process for creating well-defined consensus standards and guidelines for biomarker development, validation, qualification, and use, in order to reduce the uncertainty in the process of development and adoption. NIST or another appropriate federal agency should take a leadership role in the process, coordinating and overseeing interagency activities.
Second, the FDA should clarify its authority over biomarker tests linked to clinical decision making and then establish and consistently apply clear guidelines for the oversight of those tests. In addition, the appropriate federal agency (e.g., the FDA or the FTC) should monitor and enforce marketing claims made about molecular diagnostics. Variability and unpredictability in oversight can reduce interest and investment in developing innovative diagnostics, while inadequate evaluation and oversight could lead to harm for patients and unnecessary costs for society.
Third, the FDA and industry should work together to facilitate the codevelopment of diagnostic-therapeutic combinations. The FDA should more clearly delineate the expectations and requirements for diagnostic-therapeutic combination approval, and companies need to better integrate basic and clinical research rather than waiting to contract biomarker development in the late stages of phase III testing. Coordinated development of diagnostics and therapeutics is an important component in the quest for personalized medicine; it could help companies choose the most promising drug leads, optimize clinical trial designs, and facilitate rapid and effective adoption into clinical practice.
Finally, CMS should develop a specialty area for molecular diagnostics under CLIA. In contrast to its approach to other high-complexity tests, CMS has not created a specialty area for molecular diagnostics that could mandate, among other requirements, participation in specified proficiency testing programs. The minimum generic standards set by CMS under CLIA are inadequate to ensure high-quality, accurate test results.
Amgen. 2006. Vectibix (panitumumab). [Online]. Available: http://wwwext.amgen.com/pdfs/products/vectibix_pi.pdf [accessed October 2006].
Barker PE. 2003. Cancer biomarker validation: Standards and process: Roles for the National Institute of Standards and Technology (NIST). Annals of the New York Academy of Sciences 983:142-150.
Barker PE, Wagner PD, Stein SE, Bunk DM, Srivastava S, Omenn GS. 2006. Standards for plasma and serum proteomics in early cancer detection: a needs assessment report from the National Institute of Standards and Technology–National Cancer Institute Standards, Methods, Assays, Reagents and Technologies Workshop, August 18–19, 2005. Clinical Chemistry 52(9):1669-1674.
Bast RC Jr, Ravdin P, Hayes DF, Bates S, Fritsche H Jr, Jessup JM, Kemeny N, Locker GY, Mennel RG, Somerfield MR. 2001. 2000 update of recommendations for the use of tumor markers in breast and colorectal cancer: Clinical practice guidelines of the American Society of Clinical Oncology. Journal of Clinical Oncology 19(6):1865-1878.
Beavis R. 2005. The Paris consensus. Journal of Proteome Research 4(5):1475.
Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig LM, Lijmer JG, Moher D, Rennie D, de Vet HC. 2003. Towards complete and accurate reporting of studies of diagnostic accuracy: The STARD Initiative. Annals of Internal Medicine 138(1):40-44.
Brazma A, Hingamp P, Quackenbush J, Sherlock G, Spellman P, Stoeckert C, Aach J, Ansorge W, Ball CA, Causton HC, Gaasterland T, Glenisson P, Holstege FC, Kim IF, Markowitz V, Matese JC, Parkinson H, Robinson A, Sarkans U, Schulze-Kremer S, Stewart J, Taylor R, Vilo J, Vingron M. 2001. Minimum information about a microarray experiment (MIAME)—Toward standards for microarray data. Nature Genetics 29(4):365-371.
Carlson RW, Moench SJ, Hammond ME, Perez EA, Burstein HJ, Allred DC, Vogel CL, Goldstein LJ, Somlo G, Gradishar WJ, Hudis CA, Jahanzeb M, Stark A, Wolff AC, Press MF, Winer EP, Paik S, Ljung BM. 2006. HER2 testing in breast cancer: NCCN Task Force report and recommendations. Journal of the National Comprehensive Cancer Network 4 (Suppl 3):S1-S22; quiz S23-S4.
CDC (Centers for Disease Control and Prevention). 2006. Evaluation of Genomic Applications in Practice and Prevention (EGAPP): Implementation and Evaluation of a Model Approach. [Online]. Available: http://www.cdc.gov/genomics/gtesting/ACCE/fbr.htm [accessed June 2006].
Chung KY, Shia J, Kemeny NE, Shah M, Schwartz GK, Tse A, Hamilton A, Pan D, Schrag D, Schwartz L, Klimstra DS, Fridman D, Kelsen DP, Saltz LB. 2005. Cetuximab shows activity in colorectal cancer patients with tumors that do not express the epidermal growth factor receptor by immunohistochemistry. Journal of Clinical Oncology 23(9):1803-1810.
CMS (Centers for Medicare & Medicaid Services). 2005. Interpretive Guidelines for Laboratories. [Online]. Available: http://www.cms.hhs.gov/CLIA/03_Interpretive_Guidelines_for_Laboratories.asp [accessed July 2006].
——. 2006. CLIA Interpretive Guidelines for Laboratories, Subpart K—Quality System for Nonwaived Testing. [Online]. Available: http://www.cms.hhs.gov/CLIA/downloads/apcsubk1.pdf [accessed May 2006].
CMS, DHHS (Department of Health and Human Services). 2004. CLIA Laboratory Requirements. 42 CFR Part 493.
Cottingham K. 2005. Universal proteomics guidelines debated. Journal of Proteome Research 4(4):1051.
Couzin J. 2006. Genomics. Microarray data reproduced, but some concerns remain. Science 313(5793):1559.
DHHS (Department of Health and Human Services). 1988. H.R. Rep. No. 100-899, at 28.
——. 1999. Public Health Service Act. 42 USC 263a:353.
——. 2003. Medicare, Medicaid, and CLIA programs; Laboratory requirements relating to quality systems and certain personnel qualifications; Final rule. Federal Register 68(16):3698.
——. 2006. Federal Register 71(77):20707-20708.
DHHS, FDA (Food and Drug Administration). 2006. Critical Path Opportunities List. [Online]. Available: http://www.fda.gov/oc/initiatives/criticalpath/reports/opp_list.pdf [accessed August 2006].
Diaz LK, Sneige N. 2005. Estrogen receptor analysis for breast cancer: current issues and keys to increasing testing accuracy. Advances in Anatomic Pathology 12(1):10-19.
Dimasi JA. 2001. Risks in new drug development: approval success rates for investigational drugs. Clinical Pharmacology and Therapeutics 69(5):297-307.
Ellis CM, Dyson MJ, Stephenson TJ, Maltby EL. 2005. HER2 amplification status in breast cancer: A comparison between immunohistochemical staining and fluorescence in situ hybridisation using manual and automated quantitative image analysis scoring techniques. Journal of Clinical Pathology 58(7):710-714.
Ellis IO, Bartlett J, Dowsett M, Humphreys S, Jasani B, Miller K, Pinder SE, Rhodes A, Walker R. 2004. Best practice no 176: Updated recommendations for HER2 testing in the UK. Journal of Clinical Pathology 57(3):233-237.
FDA (Food and Drug Administration). 2003a. Letter from OIVD to Roche Molecular Diagnostics Re: AmpliChip. [Online]. Available: http://www.fda.gov/cdrh/oivd/amplichip.html [accessed July 2006].
——. February 26, 2003b. CDRH, OIVD, Analyte Specific Reagents, Small Entity Compliance Guidance, Guidance for Industry. [Online]. Available: http://www.fda.gov/cdrh/oivd/guidance/1205.html [accessed July 2006].
——. February 2004a. New Device Approval: DakoCytomation EGFR pharmDx-P030044. [Online]. Available: http://www.fda.gov/cdrh/mda/docs/p030044.html [accessed August 2006].
——. July 2004b. Letter to Correlogic Systems, Inc. [Online]. Available: http://www.fda.gov/cdrh/oivd/letters/071204-correlogic.html [accessed July 2006].
——. December 2004c. FDA Statement on Iressa. [Online]. Available: http://www.fda.gov/bbs/topics/news/2004/new01145.html [accessed July 2006].
——. March 2005a. Guidance for Industry: Pharmacogenomic Data Submissions. [Online]. Available: http://www.fda.gov/CbER/gdlns/pharmdtasub.htm [accessed July 2006].
——. April 2005b. Drug-Diagnostic Co-Development Concept Paper. [Online]. Available: http://www.fda.gov/cder/genomics/pharmacoconceptfn.pdf [accessed September 2006].
——. August 2005c. Class II Special Controls Guidance Document: RNA Preanalytical Systems (RNA collection, stabilization and purification systems for RT-PCR used in molecular diagnostic testing). [Online]. Available: http://www.fda.gov/cdrh/oivd/guidance/1563.html [accessed July 2006].
——. August 2005d. Letter to Nanogen Corporation. [Online]. Available: http://www.fda.gov/cdrh/oivd/letters/081105-nanogen.html [accessed July 2006].
——. August 2005e. Warning Letter to Access Genetics. [Online]. Available: http://www.fda.gov/cdrh/oivd/letters/080105-access.html [accessed July 2006].
——. 2006a. CDRH (Center for Devices and Radiological Health) Search Guidance Database. [Online]. Available: http://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfggp/search.cfm [accessed July 2006].
——. 2006b. CLIA—Clinical Laboratory Improvement Amendments. [Online]. Available: www.fda.gov/cdrh/clia/ [accessed July 2006].
——. 2006c. FDA history. [Online]. Available: http://www.fda.gov/oc/history/ [accessed July 2006].
——. 2006d. MicroArray Quality Control (MAQC) Project. [Online]. Available: http://www.fda.gov/nctr/science/centers/toxicoinformatics/maqc/ [accessed August 2006].
FDA News. May 5, 2003. FDA Approves New Type of Drug for Lung Cancer. [Online]. Available: http://www.fda.gov/bbs/topics/NEWS/2003/NEW00901.html [accessed August 2006].
——. February 12, 2004a. FDA Approves Erbitux for Colorectal Cancer. [Online]. Available: http://www.fda.gov/bbs/topics/NEWS/2004/NEW01024.html [accessed August 2006].
——. November 19, 2004b. FDA Approves New Drug for the Most Common Type of Lung Cancer. [Online]. Available: http://www.fda.gov/bbs/topics/news/2004/NEW01139.html [accessed August 2006].
——. 2006a. FDA Drafts Regulatory Direction to Industry for Active Ingredients Used in Medical Tests. [Online]. Available: http://www.fda.gov/bbs/topics/NEWS/2006/NEW01444.html [accessed September 2006].
——. September 27, 2006b. FDA Approves New Drug for Colorectal Cancer, Vectibix. [Online]. Available: http://www.fda.gov/bbs/topics/NEWS/2006/NEW01468.html [accessed October 19, 2006].
Fitzgibbons PL, Page DL, Weaver D, Thor AD, Allred DC, Clark GM, Ruby SG, O’Malley F, Simpson JF, Connolly JL, Hayes DF, Edge SB, Lichter A, Schnitt SJ. 2000. Prognostic factors in breast cancer. College of American Pathologists Consensus Statement 1999. Archives of Pathology and Laboratory Medicine 124(7):966-978.
Frank RG. 2003. New estimates of drug development costs. Journal of Health Economics 22(2):325-330.
Frueh FW. 2006. Impact of microarray data quality on genomic data submissions to the FDA. Nature Biotechnology 24(9):1105-1107.
FTC (Federal Trade Commission). 2006. At-Home Genetic Tests: A Healthy Dose of Skepticism May Be The Best Prescription. [Online]. Available: http://www.ftc.gov/bcp/edu/pubs/consumer/health/hea02.htm [accessed July 2006].
Genetics & Public Policy Center. 2006. News Releases: Lax Oversight of Genetic Tests “A Risk to Public Health”—Public Policy Groups File Petition for Rulemaking with CMS. [Online]. Available: http://www.dnapolicy.org/news.release.php?action=detail&pressrelease_id=61 [accessed October 2006].
Goldberg K, Goldberg P, eds. 2006. The Cancer Letter 32(24).
Grunwald V, Hidalgo M. 2003. Developing inhibitors of the epidermal growth factor receptor for cancer treatment. Journal of the National Cancer Institute 95(12):851-867.
Gutman S. June 19, 2000. Presentation at the meeting of the IOM Committee on Technologies for the Early Detection of Breast Cancer. Washington, DC.
Hackett JL, Gutman SI. 2005. Introduction to the Food and Drug Administration (FDA) regulatory process. Journal of Proteome Research 4(4):1110-1113.
Hammond ME, Fitzgibbons PL, Compton CC, Grignon DJ, Page DL, Fielding LP, Bostwick D, Pajak TF. 2000. College of American Pathologists Conference XXXV: Solid tumor prognostic factors—Which, how and so what? Summary document and recommendations for implementation. Cancer Committee and Conference Participants. Archives of Pathology and Laboratory Medicine 124(7):958-965.
Hayes DF, Bast RC, Desch CE, Fritsche H Jr, Kemeny NE, Jessup JM, Locker GY, Macdonald JS, Mennel RG, Norton L, Ravdin P, Taube S, Winn RJ. 1996. Tumor marker utility grading system: A framework to evaluate clinical utility of tumor markers. Journal of the National Cancer Institute 88(20):1456-1466.
Hayes DF, Trock B, Harris AL. 1998. Assessing the clinical impact of prognostic factors: When is “statistically significant” clinically useful? Breast Cancer Research and Treatment 52(1-3):305-319.
Heller M. March 21, 2006. The basics and direction of the regulation of molecular diagnostic and prognostic devices. Presentation at the IOM workshop on Developing Biomarker-based Tools for Cancer Screening, Diagnosis, and Treatment. Washington, DC.
Hinman LM, Huang SM, Hackett J, Koch WH, Love PY, Pennello G, Torres-Cabassa A, Webster C. 2006. The drug diagnostic co-development concept paper: Commentary from the 3rd FDA-DIA-PWG-PhRMA-BIO Pharmacogenomics Workshop. The Pharmacogenomics Journal 6(6):375-380.
Hirsch FR, Witta S. 2005. Biomarkers for prediction of sensitivity to EGFR inhibitors in non-small cell lung cancer. Current Opinion in Oncology 17(2):118-122.
Hsieh MH, Fang YF, Chang WC, Kuo HP, Lin SY, Liu HP, Liu CL, Chen HC, Ku YC, Chen YT, Chang YH, Chen YT, Hsi BL, Tsai SF, Huang SF. 2006. Complex mutation patterns of epidermal growth factor receptor gene associated with variable responses to gefitinib treatment in patients with non-small cell lung cancer. Lung Cancer 53(5):311-322.
Hudson K, Javitt GH. 2006. Federal neglect: Regulation of genetic testing. Issues in Science and Technology 22(3):59-66.
Hudson KL. 2006. Genetic testing oversight. Science 313(5795):1853.
Hudson KL, Murphy JA, Kaufman DJ, Javitt GH, Katsanis SH, Scott J. 2006. Oversight of U.S. genetic testing laboratories. Nature Biotechnology 24(9):1083-1090.
HUPO (Human Proteome Organization). 2006. HUPO Proteomics Standards Initiative. [Online]. Available: http://psidev.sourceforge.net/ [accessed September 2006].
IOM (Institute of Medicine). 2005. Saving Women’s Lives. Joy JE, Penhoet EE, Petitti DB, eds. Washington, DC: The National Academies Press.
——. 2006. Developing Biomarker-based Tools for Cancer Screening, Diagnosis, and Treatment: The State of the Science, Evaluation, Implementation, and Economics. A Workshop. Patlak M, Nass S, rapporteurs. Washington, DC: The National Academies Press.
Jacobs TW, Gown AM, Yaziji H, Barnes MJ, Schnitt SJ. 1999. Specificity of HercepTest in determining HER-2/neu status of breast cancers using the United States Food and Drug Administration-approved scoring system. Journal of Clinical Oncology 17(7):1983-1987.
Javitt G. 2006. Institute of Medicine Committee on Developing Cancer Biomarkers. Presentation at the meeting of the Committee on Developing Cancer Biomarkers, Meeting 2. Washington, DC.
Lipshutz R. 2006. Coordinating the development of biomarkers and targeted therapies: A diagnostics industry perspective. Presentation at the IOM workshop on Developing Biomarker-based Tools for Cancer Screening, Diagnosis, and Treatment. Washington, DC.
McGuire WL. 1975. Current status of estrogen receptors in human breast cancer. Cancer 36(2):638-644.
McShane LM, Altman DG, Sauerbrei W, Taube SE, Gion M, Clark GM. 2005. Reporting recommendations for tumor marker prognostic studies. Journal of Clinical Oncology 23(36):9067-9072.
MGED (Microarray Gene Expression Data). 2006. MGED Workgroups. [Online]. Available: http://www.mged.org/Workgroups/index.html [accessed July 2006].
MGED Society. September 2005. MIAME Checklist. [Online]. Available: http://www.mged.org/Workgroups/MIAME/miame_checklist.html [accessed June 2006].
Miller RT. 2004. Epidermal growth factor receptor and Erbitux. Pro Path Immunohistochemistry.
NCI Division of Cancer Prevention. 2005. The Early Detection Research Network: Translational Research to Identify Early Cancer and Cancer Risk. 3rd edition. DHHS. [Online]. Available: http://edrn.nci.nih.gov/docs/progress-reports/edbi6.pdf/download [accessed July 2006].
——. 2006. NIST-EDRN Workshop on Standards and Metrology for Cancer Diagnostics. [Online]. Available: http://www.cancer.gov/prevention/cbrg/edrn/workshop/index.html [accessed August 2006].
Niederhuber JE. 2006. Director’s update: New focus on lung cancer research. NCI Cancer Bulletin 3(21).
NIH. 2006. NIH Roadmap for Medical Research: Standards in Proteomics. [Online]. Available: http://nihroadmap.nih.gov/buildingblocks/proteomics/ [accessed July 2006].
OECD (Organisation for Economic Co-operation and Development). July 2006. Draft Guidelines for Quality Assurance in Molecular Genetic Testing. [Online]. Available: http://www.oecd.org/dataoecd/43/26/37103271.pdf#search=%22quality%20genetic%20site%3Aoecd.org%22 [accessed October 2006].
Paik S, Bryant J, Tan-Chiu E, Romond E, Hiller W, Park K, Brown A, Yothers G, Anderson S, Smith R, Wickerham DL, Wolmark N. 2002. Real-world performance of HER2 testing—National Surgical Adjuvant Breast and Bowel Project experience. Journal of the National Cancer Institute 94(11):852-854.
Pauletti G, Dandekar S, Rong H, Ramos L, Peng H, Seshadri R, Slamon DJ. 2000. Assessment of methods for tissue-based detection of the HER-2/neu alteration in human breast cancer: A direct comparison of fluorescence in situ hybridization and immunohistochemistry. Journal of Clinical Oncology 18(21):3651-3664.
Perez EA, Suman VJ, Davidson NE, Martino S, Kaufman PA, Lingle WL, Flynn PJ, Ingle JN, Visscher D, Jenkins RB. 2006. HER2 testing by local, central, and reference laboratories in specimens from the North Central Cancer Treatment Group N9831 intergroup adjuvant trial. Journal of Clinical Oncology 24(19):3032-3038.
Perkel JM. 2006a. In search of microarray standards. The Scientist 20(4):78.
Perkel JM. 2006b. Six things you won’t find in the MAQC. The Scientist 20(11):68.
Rawson K. 2006. Getting personal: FDA’s plan to save the drug industry. The RPM Report 1(9).
Reddy JC, Reimann JD, Anderson SM, Klein PM. 2006. Concordance between central and local laboratory HER2 testing from a community-based clinical study. Clinical Breast Cancer 7(2):153-157.
Rhodes A, Borthwick D, Sykes R, Al-Sam S, Paradiso A. 2004. The use of cell line standards to reduce HER-2/neu assay variation in multiple European cancer centers and the potential of automated image analysis to provide for more accurate cut points for predicting clinical response to trastuzumab. American Journal of Clinical Pathology 122(1):51-60.
Ross JS. 2005. Improving the accuracy of hormone receptor assays in breast cancer: An unmet medical need. Future Oncology 1(4):439-441.
Schmitt M, Harbeck N, Daidone MG, Brynner N, Duffy MJ, Foekens JA, Sweep FC. 2004. Identification, validation, and clinical implementation of tumor-associated biomarkers to improve therapy concepts, survival, and quality of life of cancer patients: Tasks of the Receptor and Biomarker Group of the European Organization for Research and Treatment of Cancer. International Journal of Oncology 25(5):1397-1406.
Shapiro JK, Prebula RJ. 2003. FDA’s regulation of analyte-specific reagents. Medical Device & Diagnostic Industry. [Online]. Available: http://www.devicelink.com/mddi/archive/03/02/018.html [accessed August 2006].
Swanson BN. 2002. Delivery of high-quality biomarker assays. Disease Markers 18(2):47-56.
Takano T, Ohe Y, Sakamoto H, Tsuta K, Matsuno Y, Tateishi U, Yamamoto S, Nokihara H, Yamamoto N, Sekine I, Kunitoh H, Shibata T, Sakiyama T, Yoshida T, Tamura T. 2005. Epidermal growth factor receptor gene mutations and increased copy numbers predict gefitinib sensitivity in patients with recurrent non-small-cell lung cancer. Journal of Clinical Oncology 23(28):6829-6837.
Tan PK, Downey TJ, Spitznagel EL Jr, Xu P, Fu D, Dimitrov DS, Lempicki RA, Raaka BM, Cam MC. 2003. Evaluation of gene expression measurements from commercial microarray platforms. Nucleic Acids Research 31(19):5676-5684.
Waring P. March 20, 2006. Therapeutics industry perspectives/realities (examples of successes and difficulties/failures of targeted therapy). Presentation at the IOM workshop on Developing Biomarker-based Tools for Cancer Screening, Diagnosis, and Treatment. Washington, DC.
Woodcock J, Deputy Commissioner for Operations, FDA. 2006. Personal communication.