IDs—Not That Easy:
Questions About Nationwide Identity Systems
Stephen T. Kent
Vice President and Chief Scientist, Information Security
Committee on Authentication Technologies and Their Privacy Implications
National Research Council
The National Academies
Subcommittee on Social Security
Committee on Ways and Means
U.S. House of Representatives
March 16, 2006
Good morning, Mr. Chairman and members of the Committee. My name is Stephen Kent. I am Vice President and Chief Scientist for Information Security at BBN Technologies and served as the chair of the Committee on Authentication Technologies and Their Privacy Implications of the National Research Council. This study committee authored the two reports, IDs—Not That Easy: Questions About Nationwide Identity Systems and Who Goes There? Authentication Through the Lens of Privacy, on which you have asked me to testify. The National Research Council is the operating arm of the National Academy of Sciences, National Academy of Engineering, and the Institute of Medicine of the National Academies, chartered by Congress in 1863 to advise the government on matters of science and technology.
It is a pleasure to be here to discuss these reports on large-scale identity systems. By way of background: the study committee originally planned to do only the Who Goes There? report. We decided on the IDs report about half-way through our study process after the September 11, 2001 terrorist attacks. In the wake of those attacks, numerous proposals for what identity systems could or should accomplish with respect to counterterrorism began circulating in the policy community and the media. The study committee believed that the persistence of public discussion about possible new ID systems and the expectation that other proposals would continue to be offered argued for an informed analysis and critique of the concept of a nationwide or large-scale identity system. The brief report on IDs was the result. It was intended to catalyze a broader discussion, and I am happy to be here today to continue that discussion.
I will start with a brief overview of the highlights of the IDs report and then address some of the specific issues that you asked me to consider in my testimony today.
Perhaps the most important message of our work on ID systems is that designing and building systems to ascertain identity is much more complex than it might appear, which is indeed why we titled our IDs report “Not That Easy.”
A primary consideration is to understand the goals of a large-scale identity system. Before any decisions can be made about whether to attempt some kind of system, the question of precisely what is being discussed and what purpose it will serve must be answered. What problem or problems is the proposed system meant to solve? The high-level policy questions that the IDs report outlines include the following:
• What is the purpose of the system? What problem or problems is it attempting to address?
• What is the scope of the population that would be issued an ID? Related to this, how would the identities of these individuals be authenticated?
• What is the scope of the data that would be gathered about individuals in support of issuing an ID and how would it be correlated to data about them in any databases associated with the system?
• Who would be the users of the system? By this we mean not only those who would be issued an ID, but the government agencies, perhaps state and local governments, or even the private sector organizations that might rely on the IDs. What entities would be allowed to use the system? Who could contribute, view, and/or edit the data in the system?
• What types of use would be allowed? Who could demand an ID? Under what circumstances? What types of database queries about individuals would be permitted? Would data mining or analysis of the information collected be permitted? Who would be allowed to do such analysis? For what purposes?
• Would enrollment in and/or identification by the system (even if the individual had not formally been enrolled) be mandatory or voluntary?
• What legal structures protect the system’s integrity as well as the ID holder’s privacy and due process rights? What structures determine the government and relying parties’ liability for system misuse or failure?
Answers to all of these questions (and more) will have ramifications for the technological underpinnings of the system, including what levels and kinds of system security will be required.
Implicit in all of these questions is the notion of a “system” and not merely an “ID card.” The fact that any identity management proposal necessarily implies a “system” may be one of the most important (and less discussed) aspects of many of the identity system proposals that we have seen. These systems, at the scale that they are proposed, necessarily imply the linking together of many social, legal, and technological components in complex and interdependent ways. The success or failure of such a system is dependent not just on the individual components (for example, the ID cards that are used, or the biometric readers put in place) but on the ways they work, or do not work, together. For example, are card readers located where they need to be? How well do the readers operate under various environmental and load scenarios? Who will operate the systems and how will they be trained and vetted? Do enrollment policies align with the security needs envisioned for the system? And so on. How well these interdependencies are controlled, along with how well security vulnerabilities and the unintended consequences of deployment are mitigated, will be critical factors in the system's overall effectiveness.
In addition to the questions above, the committee outlined several cautions to bear in mind when considering the deployment of a large-scale identity system:
• Given the costs, design challenges, and risks to security and privacy, there should be broad agreement in advance on what problem or problems the system would address.
• The goals of the system should be clearly and publicly identified and agreed upon, with input sought from all stakeholders.
• Care must be taken to explore completely the potential ramifications of deploying a large-scale identity system, because the costs of fixing, redesigning, or even abandoning a system after broad deployment would likely be extremely high.
That is a brief overview of some of the highlights from the IDs report. The study committee urged that proponents of large-scale identity systems present a compelling case addressing the issues raised in these reports and solicit input from a broad range of stakeholder communities. The IDs report elaborates on these issues and also considers some of the technological and security challenges inherent in large-scale identity systems. Some of the issues you asked me to address in my testimony today are more specific than what I have presented here so far, and to the extent that our reports address them, I will briefly discuss them.
Tamper-Proof ID Cards
Cards are often suggested as a means of binding an “identity” within a system to an individual. The question is: if someone presents a card, how do you know, first, that the card is valid and, second, that the card belongs to the person presenting it? On the first question, the goal of a counterfeit-resistant, long-lasting, easily replaceable ID card presents difficult technical challenges. Magnetic stripe cards are trivially easy to counterfeit. Memory cards or smart cards are more difficult, but not impossible, to duplicate or forge. Use of cryptographic technologies and digital signatures can help, but for any technology, some degree of imperfection will exist. I have already mentioned that a key notion to keep in mind is that these systems are in fact systems: they would likely encompass databases, processes and procedures, cards, card readers, architectural requirements, security needs, and much more, not to mention the people who are a part of any technical system. Any ID card that is issued is only a component of the system. One question that must always be asked is: what is the perceived threat? By threat I mean what set of adversaries do we believe we need to thwart, what are their capabilities, and what are their goals? If we cannot answer that question, we have no rational basis for deciding if any proposed system will likely be adequate, or whether it will be overkill.
On the question of ensuring that the person presenting the card is the same person identified with the card, a picture on the front of the card might provide some assurance, but people sometimes have a hard time matching faces to pictures. “Two-factor authentication,” in which an individual presents a card along with additional information (such as a PIN or thumbprint, either of which could be compared to data on the card), is another possibility. Another scenario might be to have the person interact with a biometric scanner and present a card that contains reference information for the biometric in question, with both pieces of information validated in combination against a backend server. This, however, creates a requirement for high availability and a dependence on a secure, reliable network and communications infrastructure. Also, unless the scanner is itself a secure device (and known to be so through some kind of formal evaluation process) or the scanner is closely monitored, the system may be compromised. Even then, the system will not be fool-proof. (I am informed, by the way, that the NRC is conducting a large study on biometric systems that should be released later this year.)
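As a rough illustration of the two questions above (is the card valid, and does it belong to the presenter?), the following Python sketch checks an issuer tag over the card contents and then compares a presented PIN to reference data stored on the card. Everything in it is an assumption made for illustration: the field names, the `ISSUER_KEY` secret, and the use of an HMAC as a stand-in for the digital signatures mentioned earlier. A deployed system would more plausibly use asymmetric signatures so that readers never hold the issuing key.

```python
import hashlib
import hmac
import os

# Hypothetical issuer secret (illustration only). An HMAC stands in here for a
# real digital signature scheme.
ISSUER_KEY = os.urandom(32)


def pin_digest(pin: str, salt: bytes) -> bytes:
    """Derive a salted digest so the PIN itself is not stored on the card."""
    return hashlib.pbkdf2_hmac("sha256", pin.encode(), salt, 100_000)


def issue_card(holder_id: str, pin: str) -> dict:
    """Build hypothetical card contents: holder ID, PIN reference data, issuer tag."""
    salt = os.urandom(16)
    digest = pin_digest(pin, salt)
    payload = holder_id.encode() + salt + digest
    tag = hmac.new(ISSUER_KEY, payload, hashlib.sha256).digest()
    return {"holder_id": holder_id, "salt": salt, "pin_digest": digest, "tag": tag}


def verify(card: dict, presented_pin: str) -> bool:
    """Accept only if the card contents are untampered AND the PIN matches."""
    payload = card["holder_id"].encode() + card["salt"] + card["pin_digest"]
    expected_tag = hmac.new(ISSUER_KEY, payload, hashlib.sha256).digest()
    card_is_genuine = hmac.compare_digest(expected_tag, card["tag"])
    pin_matches = hmac.compare_digest(
        pin_digest(presented_pin, card["salt"]), card["pin_digest"]
    )
    return card_is_genuine and pin_matches
```

Even in this toy form, the check is only as trustworthy as the issuing step and the device that runs `verify`, which is the system-level point made above.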
A decision on thresholds for false rejection and false acceptance rates (which is, first, a policy decision) will need to be made—and those thresholds cannot really be zero for any technology. Moreover, even the best-designed systems are subject to social engineering (there are numerous examples of personnel being tricked into issuing credentials without adequate proof of identity or authorization) and insider threat attacks—and thus one cannot rely on technological solutions alone. The entire system and implications of policy decisions at all levels must be thought through carefully.
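To make the threshold trade-off concrete, here is a minimal Python sketch that computes false acceptance and false rejection rates at several thresholds. The match scores are invented for illustration and do not come from any real biometric system; the function name and the convention that a score at or above the threshold is accepted are likewise assumptions.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (false_acceptance_rate, false_rejection_rate) at a threshold.

    A comparison is accepted when its match score is >= threshold.
    """
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    return (
        false_accepts / len(impostor_scores),
        false_rejects / len(genuine_scores),
    )


# Made-up scores: comparisons of a person against their own reference data
# (genuine) and against someone else's (impostor).
genuine = [0.91, 0.84, 0.77, 0.66, 0.58]
impostor = [0.62, 0.49, 0.35, 0.22, 0.10]

for threshold in (0.5, 0.6, 0.7):
    far, frr = error_rates(genuine, impostor, threshold)
    print(f"threshold={threshold:.1f}  FAR={far:.2f}  FRR={frr:.2f}")
```

Because the two sets of invented scores overlap, no single threshold drives both rates to zero: raising the threshold admits fewer impostors but turns away more legitimate holders. Which balance is acceptable is the policy decision described above.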
One of the challenges that arises repeatedly with a large-scale identity system designed for a specific purpose (or set of purposes) is that there are almost always forces in play that push the systems to be used for things that they were not originally designed for. A familiar example of this is the state driver’s license, which does not merely enable one to legally drive on public roads, but is also relied on to provide “proof of age” for alcohol purchases and “proof of identity” to board an aircraft for domestic travel in the U.S.
Most systems do not explicitly guard against secondary uses, although occasionally there are legal requirements or contractual relationships that limit secondary use (such as credit card agreements). There are at least two ways in which secondary use might happen. In some cases, the card presented may be used for additional verification purposes in contexts unrelated to the original purposes. In other instances, the data collected in support of card issuance may be used in ways that have little to do with the original purpose. Unintended uses of an identity system and its associated technologies can always have inadvertent side effects. There are numerous examples of this in the literature, and the expansion over time in use of the Social Security Number (SSN) is a well-known instance. For example, the proposed ID might become the new, de facto photo ID for individuals, potentially exposing SSNs to a very wide range of organizations at a time when states are eliminating the SSN from driver’s licenses.
If any new ID system is deployed, chances are that there will be uses found for it that were not originally intended. While this might seem efficient on the surface, such unplanned-for multiple uses may in fact cause problems:
• A particular challenge arises when a technology or an ID system designed for a specific security context, user population, and so on is used (intentionally or unintentionally) in a new context without a determination as to whether the original security, privacy, and usage assumptions still hold there. Secondary uses are implicitly relying on whatever assurances, security models, and privacy protections the original designers and implementers were working with. These may not align with the needs of the secondary user. For example, access to a health club may require a different usability or privacy model than access to secured facilities at an airport. One size cannot fit all.
• A significant context consideration is the security of the system. The original system was designed with a particular threat model in mind; this threat model may not apply to secondary uses of the system.
• Another problem is that the data collected for the original purposes may not be what is needed, or at the appropriate quality or reliability levels, for the new secondary uses.
• Depending on inappropriate assumptions is not a challenge just for the secondary user, but also for the primary users of the system. An ID system that is used for multiple purposes with multiple types of threats, not all of which were designed or planned for, can make it difficult to respond to a known attack on the system. This is because with secondary uses, the universe of possible motivations behind the attack is much larger, making it difficult to ascertain what is an appropriate response to an attack. If your database is hacked, was it individuals desiring a fake ID to purchase alcohol, for example, or individuals with more nefarious purposes in mind?
The privacy implications of large-scale identity systems can be significant. While casual discussions of IDs or ID cards may assume simple, unique pairings of information and individuals, the reality is often more complicated. A major privacy challenge, even when a given system has been designed and is operating in a secure and privacy-sensitive fashion, is the ability to cross-reference and link information across databases in different systems. In many cases, an identity in a given system will include a common cross-reference, such as a Social Security Number, that makes it trivially easy to link it to other identities associated with other systems (presumably designed for other purposes). In addition, questions arise as to how reliable the linking would be: some institutions may not mind if suggested linkages are only approximate (for example, a vendor attempting to do targeted marketing), whereas others demand high levels of accuracy.
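The ease of such linking can be seen in a short Python sketch. The three "databases," their field names, and the identifier values below are all hypothetical, and the identifiers are deliberately not valid SSNs. The point is only that once separate systems store the same identifier, assembling a combined profile takes a few lines of code.

```python
from collections import defaultdict


def link_by_key(key, **databases):
    """Group records from several databases by a shared identifier field."""
    profiles = defaultdict(dict)
    for db_name, records in databases.items():
        for record in records:
            # Keep every field except the shared identifier itself.
            fields = {k: v for k, v in record.items() if k != key}
            profiles[record[key]][db_name] = fields
    return dict(profiles)


# Hypothetical records held by three unrelated organizations.
health_club = [{"id_number": "000-00-0001", "visits_per_week": 3}]
retailer = [{"id_number": "000-00-0001", "purchases": ["wine", "running shoes"]}]
employer = [
    {"id_number": "000-00-0001", "salary_band": "C"},
    {"id_number": "000-00-0002", "salary_band": "A"},
]

profiles = link_by_key(
    "id_number", health_club=health_club, retailer=retailer, employer=employer
)
print(profiles["000-00-0001"])
```

This exact-match join assumes the identifier is recorded correctly everywhere. Data-entry errors or shared and reused numbers would produce missed or false linkages, which is the reliability question raised above.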
Identity theft is also a major concern, especially in the case of centralized databases or systems used for multiple purposes: the more useful or “powerful” an ID is, the more tempting it is as a target. Identity theft is an individual’s fraudulent claim that he or she is the person to whom the information in the system refers, allowing him or her to derive some benefit from another party who is relying on that claim. One reason for the problem is the expanded use of SSNs for purposes that were not originally intended, coupled with the assumption that they are ‘secret’ or should act as a ‘key.’
For the design of systems that lessen impacts on personal privacy, the study committee made a number of recommendations, including:
• Be clear about the purposes of the system.
• Minimize the scope of the data collected to that which is essential for the purpose of the ID system.
• Minimize the retention interval for data collected in association with use of the card.
• Clarify who will have access to the collected data.
• Clarify what kinds of access to and use of the data are allowed.
• Ensure that use of the system is audited to protect against illegitimate uses as well as to monitor for security threats.
• Provide means for individuals to check on and correct the information stored about them.
All of that said, many times there are important uses of data that are unanticipated when the data are collected. For these as for other important uses, it is a question of balancing the risks to privacy and confidentiality against the benefits of the uses, especially when the uses are for research to inform public policies or for national security. The Academies have long studied the issues here for important research uses of data. A recent study is Expanding Access to Research Data: Reconciling Risks and Opportunities from the Academies’ Committee on National Statistics. For the case of national security purposes, the Computer Science and Telecommunications Board has joined with the Committee on Law and Justice and the Committee on National Statistics to launch a major study to balance the risks and benefits. The Academies would be pleased to offer more information on these and other studies that may be relevant to your inquiry or to help with further investigations of interest to you.
The establishment of an identity in an identity system is another challenging but critical part of the process. There is a tangled web of government-issued identity documents used as foundational documents that allow the government and other organizations to issue other identity documents. Many of these foundational documents, used to acquire an SSN or passport, for example, are subject to fraud and forgery themselves. Birth certificates are particularly problematic in that they are issued by thousands of different jurisdictions across the country, which makes them both easy to forge and difficult to verify, and thus very poor identification documents from a security perspective. Moreover, no aspect of a birth certificate binds it to an individual in any strong security sense. The types of possible attacks on identity documents vary and include the following:
• An individual acting as an imposter.
• Forged or fraudulent documents.
• Tampering with existing documents.
• Compromise of confidential information (for example, in an identity system database) that is then used to create a false identity.
• Modification of computerized records to support a false identity.
Moving to, for example, digital credentials or biometrics will not change these basic avenues of attack and fraud. As technology and perhaps ID cards become ever more sophisticated, the issuing process will remain extremely important. All the security in the world cannot overcome deficiencies in this step—the system will only be as good as the data that goes into it. The best that any system can provide is a compelling connection with some previous verification of identity. Essentially, trust in the integrity of the system is based not so much on any single verification when an individual presents a claim of identity as it is on increasing confidence when multiple transactions happen over time and all previous transactions with that particular individual have worked out.
You asked me to comment in particular on the issue of modifying the SSN card so that it is tamper- and counterfeit-resistant as part of efforts to prevent unauthorized immigrants from gaining lawful employment in the United States. While the National Research Council’s reports did not address this specific question, such an approach clearly falls within the realm of large-scale identity systems that the study committee was considering. The framework that we presented can be applied to this question.
For example, once the purpose of a system is clearly articulated—in this case the prevention of unauthorized people from gaining lawful employment in the United States—then a next question to ask is what information would accomplish the goal of ascertaining whether an individual is qualified to work in the United States? Who has that data? Who collects it? Who can access it? If a system with that sort of data were deployed, how would it be regulated? What penalties or liabilities would be associated with misuse? How could individuals correct their own data within the system? What kinds of security would be needed? What are the likely threat models for such a system? How could potential threats of identity theft (in this case “worker-identity”) be mitigated? Who would be authorized to ask to see the ID card associated with this system? Are there other likely abuses and how could the possibility of those be mitigated? If the system is to be built on top of another existing identity system (such as the SSN)—which poses its own very serious challenges since this basically would be an unintended, unplanned-for, not designed-for use of the SSN—then what can be assumed about the underlying data in the current system? Layering even the best current security on top of old data only gives the old data an appearance of being more trustworthy—the data has the same quality and reliability that it had prior to the security being added.
Mr. Chairman and members of the committee, our study committee wrestled with questions of identity, authentication, identification, and large identity systems for many months—not new issues, but ones that were brought into sharp focus after September 11, 2001. In the study I have described, we have attempted to lay out our thinking and analysis of these issues. As the report title, IDs—Not That Easy, suggests, none of these issues is simple, and any large-scale identity system poses numerous questions that should be carefully thought through—not only from a privacy perspective, but also from security, usability, and effectiveness perspectives. Our reports attempt to lay out some of these questions that must be addressed and to illustrate the complexities that can arise.
You can find more information about these and related studies on the Web site of the Computer Science and Telecommunications Board of the National Research Council at http://www.cstb.org.
Thank you. That concludes my comments. I would be happy to take any questions you may have.