Joint Hearing on the
"Database and Collections of Information Misappropriation Act of 2003"
Wm. A. Wulf
National Academy of Engineering
on behalf of the
National Academy of Sciences
National Academy of Engineering
Institute of Medicine
Subcommittee on Commerce, Trade, and Consumer Protection
Committee on Energy and Commerce
Subcommittee on Courts, the Internet, and Intellectual Property
Committee on the Judiciary
U.S. House of Representatives
23 September 2003
My name is Bill Wulf. I have been asked to testify on behalf of the U.S. National Academy of Sciences, the National Academy of Engineering, and the Institute of Medicine (the "Academies"). As you know, the three Academies were chartered by Congress to provide advice to the federal government and to the nation on scientific, technical, and medical issues. My testimony is also being given on behalf of the Association of American Universities, the American Library Association, and the Association of Research Libraries.
I am grateful to have the opportunity to testify to you today about the draft legislation called the "Database and Collections of Information Misappropriation Act of 2003." This proposed legislation concerns a topic about which the scientific, research, education, and library communities have had an abiding interest and continuing concerns. Indeed, this is the third time that the Academies have testified on congressional legislation in this area since 1997, and both the Academies and their operating arm, the National Research Council (NRC), have published extensively on these issues over the past seven years. A list of recent relevant NRC reports and my biographical summary are provided at the end of this statement. Copies of the referenced NRC reports, as well as the Academies’ previous testimony, letters to Congress, and background analyses that we have written on previous versions of this legislation, are available on request.
Although I am authorized to speak only on behalf of the organizations that I represent here today, the issues I wish to raise with you pertain broadly to our nation’s scientific, research, education, and library concerns. And although I do not address directly the important issues raised by this legislation for the commercial sector, which are the focus of other testimony before you, my remarks are cognizant of the broader implications for our nation’s economic and social progress.
My testimony makes the following points, which build on our previous analyses:
• As a matter of public policy, there are several key principles that must inform the process of crafting any new legislation in this area, including the following:
1.) The public-domain status of factual, non-copyrightable information must be preserved, and any new protection regime should leave a wide buffer zone to ensure that factual information will not be subjected to proprietary claims.
2.) Only significant problems of unfair competition and market failure that have been proven should be addressed, and negative unintended consequences must be avoided.
3.) A reasonable balance of interests among all stakeholders in the information economy should be maintained. Congress should proceed cautiously in creating new protection regimes, because once created, a new protection regime is virtually impossible to dismantle.
4.) Healthy competition in the information industry needs to be promoted, while the further strengthening of unwarranted monopolies should be avoided.
5.) Exclusive control, either de jure or de facto, by private parties over information and databases produced by the government must be prevented.
6.) New protection regimes should not create any doubt or controversy about the lawfulness of traditional and customary access to and use of factual information for not-for-profit science, research, and education. Effective exceptions must be adopted.
7.) The important role and functions of our nation’s libraries must not be undermined.
• The draft legislation includes a number of improvements over previous versions of this legislation that have been introduced by the House Committee on the Judiciary since 1996.
• There are still major problems and ambiguities in the current draft bill that can and should be addressed, assuming that the creation of a new statutory remedy is still deemed necessary.
• The Academies and the other organizations represented in this testimony remain committed to playing a constructive role in helping Congress to consider the issues of database protection in a way that is consistent with the principles identified in this testimony and that avoids negative unintended consequences.
* * *
A. Key Principles
1) The public-domain status of factual, non-copyrightable information must be preserved, and any new protection regime should leave a wide buffer zone to ensure that factual information will not be subjected to proprietary claims.
As we have noted in previous testimony on this issue, access to and use of factual data in the public domain is essential to furthering our understanding of nature, to the validation of scientific claims, and to the progress of science and our nation’s system of innovation. The advent of digital technologies for collecting, processing, storing, and transmitting data has led to an exponential increase in the size and number of databases created and used. A hallmark trait of modern research is to obtain and use dozens or even hundreds of databases, extracting and merging portions of each to create new databases and new sources for knowledge and innovation.
Not only researchers and educators, but all citizens with access to computers and networks, constantly create new databases and information products for both commercial and noncommercial applications by extracting and recombining public-domain data and information from multiple sources. The rapid and continuous synthesis of disparate data by all segments of our society is one of the defining characteristics of the information age. Moreover, the serendipitous nature of research and the need of scientists and others to make transformative uses of non-copyrightable facts are such that one cannot predict when or how a database will be used. The ability of individuals and organizations to use information in a wide variety of innovative ways is also a measure of success of the original data-collection efforts.
Society uses the fruits of such research and innovation to expand the world’s base of knowledge and applies that knowledge in myriad downstream applications to create new wealth and to enhance the public welfare. Indeed, the policy of the United States has been to support a vibrant research enterprise and to assure that its productivity is exploited for national gain. Thus, freedom of inquiry, the availability of scientific and other factual data in the public domain, and the open publication of results are cornerstones of our research system that U.S. law and tradition have long upheld.
The results of these wise policies have been spectacular. For many decades, the United States has been the leader in the collection and dissemination of scientific and technical data and in the discovery and creation of new knowledge. Our nation has used that knowledge more effectively than any other nation to support new industries and applications, such as the biotechnology industry and the discovery of new diagnostics and cures for hereditary and other diseases.
Keeping factual information in the public domain is critically important to our progress in science and innovation. It is also essential to many other compelling American values and needs, including First Amendment rights of freedom of expression, the promotion of the information economy, democracy and good governance, and other public-interest uses by consumers and society generally.
Because of the overriding importance of non-copyrightable factual information remaining in the public domain, any new legislation in this area must be limited to remedying unfair conduct in commerce rather than extending any exclusive property rights in the factual information itself.
Where there is uncertainty or doubt about the effect of potential new legislation, Congress should be careful to err on the side of caution. When the subject matter consists of the fundamental building blocks of knowledge, science and expression, the cost of over-protection far exceeds the cost of under-protection.
2) Only significant problems of unfair competition and market failure that have been proven should be addressed, and negative unintended consequences must be avoided.
Proponents of new database protection legislation have long argued that the misappropriation of databases is a major problem in the U.S. information industry and that existing methods of protection and remedies are inadequate. We find both assertions to be of increasingly dubious validity.
There is little evidence since the last time we testified on this issue before Congress that databases or other collections of information are routinely stolen or that there is massive market failure in the information industry. Indeed, database producers already enjoy a broad range of legal, technological, and self-help methods--many of which have been further strengthened in recent years--that protect the fruits of their investments. Available legal remedies at the federal level include traditional copyright law, new rights to prevent the circumvention of technological protection measures granted under the Digital Millennium Copyright Act, and the new Computer Fraud and Abuse Act. Under state law, many jurisdictions have a common law prohibition against misappropriation of "hot news," and a claim for trespass to chattels to protect databases.
Contracts and licenses are now used universally by database owners to make their products available under a range of custom-tailored, restrictive conditions. Technologies that protect digital databases and help enforce the existing statutory and contractual rights of owners are constantly being refined and strengthened, including such methods as encryption, online database access controls, software and hardware based trusted systems, and digital object identifiers and electronic watermarks. Indeed, these contracts and technologies are increasingly employed to limit uses of data and information that would otherwise be permitted by law. Congress should carefully monitor their use and consider whether limits on their use are needed to preserve the balance between access to and use of factual information and the incentives to invest in the collection of such information, both of which are essential to the vigorous growth of science and knowledge.
Finally, market based protections of databases through self-help business practices such as frequent updating and customizing can help make misappropriation less effective. Taken together, these database protection methods have helped make the commercial database market expand successfully in the United States.
The Academies, the Association of American Universities, the American Library Association, and the Association of Research Libraries nonetheless are committed to playing a constructive role in helping Congress to consider the issues of database protection in a way that is consistent with the principles identified in this testimony and avoids unintended negative consequences. The National Research Council reports referenced at the end of this testimony analyze the far-reaching negative implications for research and innovation that could result from legislation that is overly protective of data and non-copyrightable factual information.
3) A reasonable balance of interests among all stakeholders in the information economy should be maintained. Congress should proceed cautiously in creating new protection regimes, because once created, a new protection regime is virtually impossible to dismantle.
It is essential to consider fully and to promote a healthy balance of the interests of all the stakeholders in the information economy and society, including the general public. The trend in recent years has been to increase the breadth, depth, and length of all types of intellectual property protection. The creation of any new statutory rights, particularly for subject matter as sensitive as non-copyrightable factual information, must be done in full cognizance of the interaction of these rights with other parallel rights conferred by other statutes to avoid negative synergistic effects. In this regard, a major concern for the research community, as discussed further below, is the potential negative effects on access to and use of databases from unbridled, highly restrictive licensing practices, especially through increasingly legitimized adhesion contracts (e.g., shrink-wrap and click-on licenses), in concert with any additional new statutory rights in databases.
Further, history has demonstrated that once granted, intellectual property rights are rarely, if ever, reduced or limited. Thus, if there is uncertainty about the effect of any proposed new protection, it is important to err on the side of caution and the preservation of the status quo.
4) Healthy competition in the information industry needs to be promoted, while the further strengthening of unwarranted monopolies should be avoided.
The promotion of competition is primarily an economic issue of direct interest to our colleagues in industry, but the benefits of competitive prices and increased quality accrue to the public. It is important, nonetheless, to emphasize that a preponderance of scientific databases are produced by sole sources, whether in the public or the private sector. For example, the vast majority of observational data sets of phenomena in the natural world, as well as all unique historical factual compilations, can never be recreated independently and are therefore frequently available only from a single, original source. In other cases, scientific databases are de facto unique natural monopolies because the cost of producing the data and the potential market are such that the economics will not support multiple sources. Even when data that are similar, but not identical, to original research results or observations are available for use in non-technical applications, researchers and educators are unlikely to consider an inexact replica of a database to be a suitable substitute if it does not meet fully the original specifications. For this reason, scientific databases are particularly prone to monopoly control. Any new legislation therefore must not enhance the market power of sole-source providers in any segment of the information industry without adequate public-interest safeguards.
5) Exclusive control, either de jure or de facto, by private parties over information and databases produced by the government must be prevented.
Consistent with principle #1 above, the public domain status of governmental databases and other information products is a key factor for the success of our nation’s research enterprise, as well as for other compelling national values and interests. Legislation that confers new rights on the private sector must fully exempt government databases from the scope of protection and avoid the possibility of exclusive capture by private-sector entities.
6) New protection regimes should not create any doubt or controversy about the lawfulness of traditional and customary access to and use of factual information for not-for-profit science, research, and education. Effective exceptions must be adopted.
Also in keeping with principle #1 above, it is important to provide clear immunity for customary non-commercial scientific, research, and educational uses from the scope of a database protection statute. Non-profit institutions should not be required to have expert intellectual property counsel looking over the shoulder of every scientist and scholar. Customary activities should not be chilled. Because facts themselves are at issue in the case of databases, the legislation should include an express presumption that such customary uses are exempt from liability, and the plaintiff's burden of proving a violation should be heightened.
7) The important role and functions of our nation’s libraries must not be undermined.
Libraries traditionally have served the important public-interest function of providing access to information to our nation’s citizens, and have performed essential preservation and archiving activities. Any new rights conferred by new legislation on database owners must not undermine the libraries’ ability to continue their role as public-interest intermediaries for the access to and preservation of factual information resources.
* * *
B. Preliminary Comments on the Draft Legislation
We have not had sufficient opportunity to analyze comprehensively the draft "Database and Collections of Information Misappropriation Act of 2003." The issues and competing interests in this legislation are complex and difficult to reconcile. Although the process has been long and difficult, we believe that it has led to a deeper understanding of the issues, which was so palpably lacking when the first legislative proposal, based on the European Union’s database directive, was introduced in 1996. It also has demonstrated the inherent problems with introducing any new rights in this Constitutionally sensitive area and the importance of addressing adequately the competing legitimate interests of the many stakeholders in the information economy, not only the economic interests of the originators of commercial databases.
Our preliminary analysis of this new version of the legislation is consistent with the views expressed by the major university organizations in the September 9, 2003 letter from Nils Hasselmo, President of the Association of American Universities, to the two cognizant Committee Chairmen. We conclude that although improvements have been made over the previous legislative proposals introduced by the Committee on the Judiciary, very significant problems still remain to be resolved. Moreover, the current draft contains a number of new provisions whose intent and impact are ambiguous and which could have serious unintended consequences for the research and education enterprise.
We appreciate, in particular, several improvements that have been made in response to the concerns expressed earlier by the Academies and other parties to this process. The move toward a standard of liability grounded more in unfair competition law and the elimination of some of the most unacceptable aspects of previous versions of the Committee on the Judiciary’s proposed statutes are certainly welcome. Among the specific improvements that we see are the elimination of qualitative substantiality, the effort to tie liability to direct competition in the same market as the existing database, the adoption of a knowledge requirement as a condition of liability, and a limitation to databases that require substantial effort to develop. The elimination of criminal penalties and the explicit recognition of the doctrine of misuse as a limiting factor on lawsuits are also positive developments.
Although the discussion draft addresses some of the concerns we identified previously, many serious problems remain nonetheless, while new ambiguities have been introduced by the recent changes. We note here only the issues of greatest concern to the scientific, research, education, and library communities, consistent with the principles articulated above, and also incorporate by reference the additional concerns expressed in the September 9 letter from Nils Hasselmo. In particular:
• With regard to the liability standard, the discussion draft could confer perpetual ownership rights in a wide variety of data by virtue of protecting investment based on open-ended maintenance of a database. In addition, the concept of "making available to others" appears to be overly broad, posing a threat to customary collaborative work within or among universities and research institutions. Moreover, a minimal amount of harm--even one lost sale or a single lost source of data--could lead to a finding of liability and to a chilling of the use of public-domain factual information, contrary to the values articulated under principle #1 above.
• The exception for educational, scientific, and research institutions applies only if the institutions are nonprofit and their "making available" is for nonprofit purposes. This would discourage joint research and development activities between nonprofit institutions and corporations. Especially troubling is that the exception can be overridden by a shrink-wrap or click-on license, which would render the exception meaningless--a major concern noted under principles #3 and #6. Any new legislation must preclude such a possibility. Finally, we continue to urge that the burden of proof of demonstrating that customary not-for-profit scientific, research, and educational uses of factual information are unreasonable should be a heavy one and should be borne by the plaintiff.
• The scope of the exclusion for government information in the discussion draft is uncertain as well. It appears that a publisher that incorporates government information in its database could prevent others from making available that government information -- even if it is not available from any other source, contrary to principle #5.
• By failing to address the problem of sole-source databases, the discussion draft increases monopolists’ control over competitive uses of information. This is of particular concern in the market for databases used in scientific research and education, as noted under principle #4. The provision on misuse, which could help mitigate harmful conduct of database monopolists, lacks any guidance for courts to determine whether misuse occurred. The misuse provision should specifically address the issue of sole-source databases. H.R. 1858 contained appropriate language in this regard.
While we believe that the Committees have made progress on this legislation, it is clear that the current discussion draft is still not ready to be adopted and would introduce serious problems in its present form for many stakeholders in the information economy, including the scientific, research, educational, and library sectors.
In closing, I would like to reiterate that the Academies, and all of the organizations I represent in my testimony today, have sought to play a constructive role in the congressional efforts to craft appropriate legislation in this complex and sensitive area. We look forward to working with Congress on this issue to develop a consensus on how best to move forward from here.
Thank you again for providing us with the opportunity to testify at this hearing.
* * *
Recent relevant National Research Council reports, published by the National Academies Press and all freely available at www.nap.edu:
The Role of Scientific and Technical Data and Information in the Public Domain (2003)
The Digital Dilemma: Intellectual Property in the Information Age (2000)
A Question of Balance: Private Rights and the Public Interest in Scientific and Technical Databases (1999)
Bits of Power: Issues in Global Access to Scientific Data (1997)
* * *
BIOGRAPHICAL SUMMARY OF WM. A. WULF
Dr. Wulf was elected President of the National Academy of Engineering (NAE) in 1997. The NAE and National Academy of Sciences operate under a congressional charter to provide advice to government on issues of science and engineering.
Dr. Wulf is on leave from the University of Virginia, where he is a University Professor. His research spans computer architecture, computer security, programming languages, and optimizing compilers. In 1988-90 Dr. Wulf was also on leave to be Assistant Director of the National Science Foundation. Prior to joining Virginia, Dr. Wulf founded a software company, Tartan Laboratories, based on research he did while on the faculty at Carnegie-Mellon University.
Dr. Wulf is a member of the National Academy of Engineering, a Fellow of the American Academy of Arts and Sciences, a Corresponding Member of the Academia Espanola De Ingeniera, and a Foreign Member of the Russian Academy of Sciences. He is also a Fellow of four professional societies: the ACM, the IEEE, the AAAS, and AWIS. He is the author of over 100 papers and technical reports, has written three books, holds two US Patents, and has supervised over 25 Ph.D.'s in Computer Science.