So it was especially odd to find myself, on June 26, 2000, being treated as a minor celebrity. Even before I’d given my talk the photographers were crawling up and saying, ‘Psst, John! This way!’ and I had a little taste of what it’s like being in the public eye. After the talk I was just surrounded by lenses all pointing at me, and I had to be dragged away because interviews were lined up. I thought, ‘How strange—they’ve cottoned on to this, they do believe it’s important.’
It was the day on which joint press announcements were made in London and Washington that the draft human genome sequence had been completed by both the publicly funded Human Genome Project and by Celera Genomics. I had gone with my Sanger Centre colleagues, dressed in unaccustomed suits, to face the assembled press in the lecture theatre of the Wellcome Trust headquarters in Euston Road. It was packed. The director, Mike Dexter, said a few words, followed by the science minister Lord Sainsbury, Michael Morgan, and members of the lab. In the afternoon some of us went off to 10 Downing Street for a Bill and Tony show over a video link with the press conference that was going on in Washington. We were a bit outraged by Tony Blair’s speech, which we had no hand in, because amazingly it mentioned Celera but not the Sanger Centre. But the deal was that we were going to be cheerful and nice—that was the official Wellcome Trust line.
My main aim was to say to the journalists that I thought the people at the Sanger Centre had done a wonderful thing for humanity, and I thought that they ought to be supported. To be fair, we did get some halfway reasonable coverage afterwards. I had been incensed by the lack of appreciation that had been shown until then. Everyone had focused on the ‘race’—were our methods better than Celera’s, who’s going to get there first?—and hardly anyone was saying, ‘Hang on a minute, there’s one bunch of people who are actually doing this for the benefit of humankind and another bunch who are trying to do it for their personal gain.’ And since then there have been more articles drawing that distinction.
Moon landings excepted, it is almost unprecedented for any head of state, let alone two at once, to identify themselves so closely with a scientific advance. The Human Genome Project is politically sensitive on at least three counts: it is perceived as expensive (though not by comparison with moon landings); access to the information is of immense commercial value; and there is widespread public concern about the way that information might ultimately be used. For these reasons all of us in the field are aware that the direction of our science to some extent depends on attitudes in Downing Street and the White House.
As the champion of private enterprise, Craig Venter had lost no time in attempting to get the United States Congress on his side. All through that period between 1998 and 2000 it was very important that the British-based Wellcome Trust was so influential. If the HGP had been just an internal United States matter, then, with enough support for Celera in Congress, I feared that the role and views of the National Institutes of Health might have been suppressed.
It was only gradually that the political pressures on the G5 became apparent. To begin with, we were much more concerned with setting our scientific goals and getting into place everything that was needed to reach them. In February 1999 the G5 met at the Baylor College of Medicine in Houston to plan its strategy for producing a draft sequence. I went with Michael Morgan to represent the Sanger Centre. On the agenda was a radical plan to move forward even faster than had already been announced in September: Francis was now talking about producing a ‘working draft’ sequence within little more than a year.
That was a turning point meeting. I had come to the conclusion that this was what we had to do about a week before, and hadn’t even told my own staff what I was going to propose. I made the case that we should focus on getting draft coverage, and based on the capacity that I thought this group of five had, that we might be able to do that in a year.
Francis wasn’t sure how his proposal would be received. He was pretty sure that Eric Lander would be in favor of it, as he had previously argued along the same lines, and Richard Gibbs at Baylor would probably go for it too. But he was much less sure about Bob Waterston or the Department of Energy genome center, and he didn’t know where I would come down.
It was pretty clear that John’s opinion was going to be definitive. If he was against it, especially as he is somewhat uncompromising, it was going to be difficult to carry the day.
The discussion went back and forth for most of the day. Bob, who wasn’t at the meeting but was in touch by conference call, was really not convinced that what Francis proposed was feasible. He was afraid that we might just end up by shooting ourselves in the foot by deviating from the pathway that had already produced about 15 percent of the sequence in finished form, and which could get us about half of it by the end of 2000 if we stuck to it. Much of the effort to accelerate the mapping part was going to fall on his head, and producing all the mapped clones needed to go into the sequencing pipeline was going to be a huge job for him to take on.
I knew that many of my Sanger Centre colleagues would agree with Bob. But I said I thought we should go for it. I was no less committed than Bob to getting fully finished sequence eventually, but on the other hand I’d always been in favor of pushing out unfinished sequence as fast as possible. It is useful to the biological community—the worm project had proved that—and, more importantly in the light of the Celera threat, it put the sequence itself, though not possible future uses of it, beyond the reach of patents.
In Francis’s words, that ‘turned the tide of the conversation.’
Bob came around, and by the end of the day John was up at the blackboard dividing up the chromosomes, and we had a strategy—it was pretty much clear from that point on what the pathway was going to be.
The accelerated timetable was announced in mid-March, and succeeded in taking almost everyone by surprise. For once the HGP caught Celera on the back foot; the company’s target completion date at this stage was still 2001. Reporting the development in the New York Times, Nicholas Wade wrote, ‘If met, the new date set by the consortium could allow the public venture to claim some measure of victory over its commercial rival, the Celera Corporation of Rockville, Md.’ Craig, uncharacteristically lost for a pithy rejoinder, said that the new timetable was ‘nothing to do with reality’, and that it was just ‘projected cost, projected timetables.’ The irony of that remark, coming as it did from someone who had almost managed to scupper the entire publicly funded project with a press announcement based entirely on the projected performance of machines he had not yet even bought, was evidently lost on the writer.
Just as surprised as Craig were five of the United States genome centers that had participated in the pilot programs, but that now seemed to be left out. The NIH’s announcement included the award of grants worth $81.6 million in total over ten months to just three centers: Bob’s lab at Washington University in St. Louis, Eric Lander’s at the Whitehead Institute at MIT and Richard Gibbs’s at the Baylor College of Medicine in Houston. Glen Evans of the University of Texas Southwestern Medical Center probably made the understatement of the year when he told Science magazine, ‘It’s kind of upsetting for all of us.’ Francis assured them that there would be another funding round soon that would include the smaller centers, but there was no getting round the fact that, with the funding plan he had presented the previous autumn, he had deliberately created a two-tier structure within the NIH program that greatly favored some at the expense of others.
The internal dynamics of the consortium were now completely altered. Until the launch of Celera, St. Louis and the Sanger Centre had been the biggest sequencers in the world. We each thrived on friendly rivalry with the other, but in all important matters—the regional approach to selecting clones for sequencing, free release of data—we saw absolutely eye to eye. We had no need to set each other targets, because every individual in each of the two labs was eager to stay just a little bit ahead of the other. But there were no secrets between us, and if we found a better way of doing something, we shared it. We also made a point of encouraging other international centers, for example in France, Germany and Japan, that were having a much tougher time than we had in persuading their governments to fund sequencing projects that adhered to the Bermuda Principles.
Now things were very different. Bob no longer held such a pre-eminent position in the United States sequencing community. Ever since we had launched our unsuccessful bid to go for a draft sequence in 1994–5, the two of us had anticipated that the St. Louis—Sanger axis would lead the charge when the moment finally came. We had thought in terms of a third of the genome for the Sanger Centre, a third for the Genome Sequencing Center in St. Louis and a third for everybody else. But it gradually emerged through 1998 that Bob was not going to get enough of the genome project funding to sequence a third of the genome. When Francis announced the share-out of the funds he had won from Congress to support the accelerated draft strategy until 2000, it was Eric Lander who came out on top. He was awarded $34.9 million to Bob’s $33.3 million, with Richard Gibbs at Baylor College of Medicine taking $13.4 million. (At the same time the Department of Energy put $40 million into sequencing at its Joint Genome Institute.) But Bob was not surprised at the outcome.
I got all that I had requested. I had basically failed to persuade our group that we should aim higher, and there was a sense of wanting to ‘play fair.’ Also our building had begun to run out of space by that time. We had missed our moment.
Eric was proposing to spend almost all of his money on shotgun sequencing, while the Sanger Centre and St. Louis retained their commitments to map, sequence and maintain a certain level of finishing—in addition, in Bob’s case, to providing mapped clones for Eric and everyone else. Everything was now in place for the Whitehead genome center to become the biggest, measured strictly in terms of raw sequence output.
Now, too, Bob was seriously ill. The year had begun with the shocking news that he had cancer of the bowel. Although his doctors thought his chances of survival after chemotherapy and surgery would be at least 80 percent, I had to face the prospect that I might lose my closest colleague and a very dear friend. Bob was incredibly courageous, and never stopped working; he couldn’t travel, but he joined in the Houston meeting by conference call and continued to participate in almost all of the regular Friday conference calls that had begun to take place among the G5 to coordinate activities. He went in for surgery in April, and even while convalescing surrounded himself with a mini-office so that he could keep in touch. As so often before, Bob dwarfed everyone with his sheer capacity to cope. ‘He’s a soldier,’ Richard Gibbs commented to me. To our enormous relief his treatment was a complete success and by the middle of the year we knew he was out of immediate danger. Bob was and still is a very fit man, regularly cycling from his home to his lab—which may be normal in Cambridge but is regarded as decidedly eccentric in St. Louis.
Like the smaller United States labs, the other international centers were completely taken aback by the announcement that the target date for a working draft was now spring 2000. The very existence of the G5 was a slap in the face to colleagues who had participated in the Bermuda meetings since 1996 and regarded themselves as partners in the consortium. André Rosenthal, who headed the Institute of Molecular Biology in Jena, Germany, was particularly bitter about it. ‘The policy was not agreed upon in the same international spirit as had [been cultivated] in the past,’ he told Science magazine. ‘This announcement gives the impression that [we’re] not needed.’ André’s bitterness was understandable. Under pressure from the British and American scientists, he had put his career on the line in fighting the German government to agree to the policy of free data release, and was expecting about 7 percent of the sequence—including a large part of chromosome 8—to be produced at his institute. Now it seemed that his defense of international co-operation was being rewarded with an almost complete takeover of human sequencing by the large and well-funded United States and British centers. Japan was in much the same situation, with Yoshiyuki Sakaki of the Human Genome Center at the University of Tokyo close to completing part of chromosome 21 and already stocking up with capillary sequencers ready to make a substantial contribution to the draft. Jean Weissenbach in Paris had so far done little human sequencing, but had done important work on other genomes and on human mapping.
With the Sanger Centre as the only non-United States member of the G5, Michael Morgan and I felt a particular responsibility to represent the interests of the rest of the international genome sequencing community. Looking back through my correspondence, ‘What about the rest of the international partners?’ seems to be a constant refrain. I thought it was important that they remained part of the project; but at the same time I knew they would have to keep up. They all finally got together at the 1999 international strategy meeting, which again took place at Cold Spring Harbor in May. Time was tight, and it seemed a good idea to tack the strategy meeting on to the annual genome sequencing symposium as most of the participants would be there anyway. I didn’t go, but Jane, Richard and David reported that the meeting began with everyone in a state of high anxiety. Francis offered apologies all round—but he then said bluntly that 1999 was going to be a make-or-break year. There was no doubt that Celera would announce that the genome was ‘complete’ within little more than a year, and that Congress was under real pressure from some quarters to shut down the publicly funded genome effort. The only hope was to move fast and in a tightly coordinated way, putting most of the resources available into a small number of big centers.
What the French, German and Japanese groups needed more than anything was a statement from the project’s leaders that it remained an international enterprise in which they had a role. By the end of the meeting they had won recognition of their claims on certain parts of certain chromosomes, but only as long as they could keep up with the rest. There would be no question of leaving whole chromosomes to groups that did not have the funding and resources to contribute working draft sequence by the spring 2000 deadline. They were on board, but there was no doubt at all who was in the driving seat.
The allies in the publicly funded project were tightening their organization to turn the tide against the threat from the invader. It would be easy to see this as an overreaction to the entry into the field of one competitor, as against the combined forces of the international consortium. And indeed, Celera’s PR operation was fond of presenting the company as a David up against the federal Goliath, Craig Venter the maverick entrepreneur against the mighty National Institutes of Health establishment. The political riff plays well to this day, and is much beloved of some British reporters, as well as a majority of the United States ones. The truth, of course, is very different: Celera was much more powerful than it appeared. Representing private enterprise, the company could count on the backing of many in Congress who were philosophically opposed to state-funded projects. By repeatedly hinting that the government was wasting its money, Craig clearly aimed to influence congressional policy on the funding of the Human Genome Project, if possible to the extent of shutting it down. We could not allow that to happen.
The Sanger Centre had managed to avoid being left out in the cold with the rest of the non-United States groups simply by being big—at that time the biggest, in terms of sequence output. In June 1998 we had held a party for everyone at the center to celebrate passing 100 megabases of finished sequence (including all species), a landmark which we were the first to achieve. (The event had bouncy castles and loads of drinks and nibbles, and the tradition continues: in 2002 we marked the passing of the first gigabase—1000 megabases.) But the new draft deadline was still going to be a huge challenge. We had to treble our output of sequence in less than a year, and unlike Eric we intended to keep our mapping and finishing activities going at the same level as before. I had really stuck my neck out to defend our claim to one-third of the genome. Now we would have to deliver under the close scrutiny of all our partners in the international consortium. It was all terrifyingly public. With assembled sequences of anything more than 1 kilobase being added to the public database every day, anyone could see what we had—or had not—done. In addition, the regular Friday G5 conference calls required each center to report on its production for the past week and make a prediction for what it would produce in the weeks to come. This was crucial to keeping everything on track, and the office of the National Human Genome Research Institute did a great job of book-keeping.
It was a very different way of doing science. Most projects take as long as they take, although scientists are usually in so much of a hurry to get results that there’s no need to crack the whip. But we had voluntarily put ourselves under tremendous time pressure. Not every member of the G5 cared about beating Celera, but as long as some of them did, the rest had to run at the same pace or risk losing sequence to faster-moving centers. And ramping up production could not be done overnight.
We needed not only new machines, but a new type of machine. Of course, all the sequencing centers now wanted to buy capillary sequencing machines of the type that Celera was installing. Perkin-Elmer had been very clever in launching Celera. At the time many argued that to go into competition with your customers was a questionable business move. But in fact Perkin-Elmer would win out whatever happened. If Celera put the HGP out of business, it would earn the financial rewards of monopoly ownership of the genome sequence. If, as it turned out, the HGP decided it wanted to match or exceed Celera’s effort, then it would have to go to ABI for the new capillary sequencers—at $300,000 a time—in order to do it. In other words, by launching Celera, Perkin-Elmer had hugely increased its market for the 3700 machine, and for the expensive reagents with which it had to be fed. Tony White, chief executive of Perkin-Elmer, knew exactly what he was doing. He was later quoted in Forbes magazine as saying, ‘The day after we announced Celera, we set off an arms race…Everyone, including the government, had to retool, and that meant buying our equipment.’ ABI—which formally changed its name to PE Biosystems in the spring of 1999—could scarcely keep up with the demand, and Mike Hunkapiller found himself the target of complaints from each side that he was favoring the other. But I’m sure he found this a minor irritant set against the sales of over $1 billion that PE Biosystems reported in the year following the Celera launch.
As long-standing customers of ABI/PE Biosystems, we had known about the new capillary technology for some time. Indeed, it was in part the demand from the publicly funded centers for an alternative to slab gels that had driven this development. Nor was PE the only company making capillary sequencers, although after evaluating the others we (and most of the other genome centers) decided to go for the 3700. The change would have knock-on effects throughout the whole enterprise: the new machines used different chemistries to carry out the sequencing reactions, and needed different organizational procedures for handling the samples. We also needed more space to put them in, as well as building new, automated systems for picking clones and preparing the templates for sequencing. In theory, space was not a problem at the Sanger Centre. We had not yet occupied a wing of the building, called the West Pavilion, which had been built as a shell against the need for future expansion. But it needed to be fitted out with labs, services and equipment, all of which would take time.
There were challenges all through the lab. The mappers had to increase the clone supply; more subclones had to be made, more samples prepared, more data handled. The burden of implementing the sequencing scale-up fell on Jane Rogers and on Stephan Beck, head of human sequencing. The clattering of the clogs that Stephan wears around the lab became more noticeable than usual. Jane hoped to limit the extent of our change-over, to limit the costs, and she initially ordered 30 of the new capillary machines. Eric ordered 125, further confirming his ambition to become the biggest center. We still had something like 140 of the old 377s, many of them now adapted to run 96 lanes per gel. We were running them three times a day, so that was almost 450 slab gels a day to prepare. That was the part that wasn’t easily scalable, and the reason why the new technology was attractive. At least we were able to keep a solid base of production going while the new capillary machines were being brought in—although with hindsight we could probably have gone faster in the end if we had replaced them. For much of 1999 the attempt to keep up the pace was a nightmare. Even the computer system crashed repeatedly for nearly a month in July and August. Phil Butcher, the Sanger Centre systems manager, and his group had a terrible time sitting up at nights waiting for the elusive hardware fault to recur. He eventually tracked down and replaced the offending hardware and after that everything was fine. One way or another we fell behind on our production targets, and our United States colleagues let us know that they were pretty unimpressed. Once things were working again, however, we rapidly increased our output.
As well as the need to meet our targets for the working draft, we had taken on another major responsibility. In April a consortium consisting of the Wellcome Trust and ten of the big pharmaceutical companies had launched a project to mine the sequence for variability between individuals. Each person has two copies of the genome: one from each parent. Between any two copies 99.9 percent of the sequence is identical—which is why we belong, recognizably, to the same species and can have children together. But the final 0.1 percent, roughly one nucleotide base in every thousand of the 3 billion total, differs from one copy to another. At a particular point two-thirds of the copies might have an A, for example, while the remaining third have a T. These differences are called single nucleotide polymorphisms or SNPs (pronounced snips). They are the exceptions in the human genome sequence that make us individuals rather than identical clones. And although we are so similar to every other human being on the planet, that still leaves us with millions of possible differences between ourselves and someone else. There are other sorts of differences, such as deletions and longer replacements, but SNPs are the most common.
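The arithmetic behind that one-in-a-thousand figure can be checked in a few lines. What follows is only a rough sketch in Python, assuming the rounded numbers quoted above (3 billion bases and a difference rate of 0.1 percent); the function and variable names are illustrative and do not come from any real sequencing tool.

```python
# Back-of-envelope check using the rounded figures quoted in the text.
# All names here are hypothetical, chosen only for this illustration.
GENOME_LENGTH_BASES = 3_000_000_000  # "the 3 billion total"
DIFFERENCE_RATE = 1 / 1000           # "one nucleotide base in every thousand"


def expected_differences(genome_length: int, rate: float) -> int:
    """Approximate number of bases that differ between two genome copies."""
    return round(genome_length * rate)


differences = expected_differences(GENOME_LENGTH_BASES, DIFFERENCE_RATE)
identical_fraction = 1 - DIFFERENCE_RATE

print(f"Differing bases between two copies: about {differences:,}")
print(f"Identical fraction: {identical_fraction:.1%}")
```

On these assumptions the result is on the order of three million differing positions between any two copies, which is where the "millions of possible differences" comes from.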
In practice most SNPs probably have no effect at all, because they do not fall in protein-coding or regulatory regions of the genome. But others give us brown eyes rather than blue, affect our height, or influence the degree to which we are creative or impulsive. And some have significant effects on our health. They do not necessarily make genes defective in the same way as the rare mutations that cause muscular dystrophy or hemophilia, but they could cause subtle differences that influence our susceptibility to heart disease, for example, or how well we respond to a particular drug. And mostly it will prove to be combinations of SNPs, rather than individual SNPs in isolation, that cause these subtle effects.
SNPs are of enormous interest to drug companies, who want to use SNP maps for a variety of purposes, including the development of drugs tailor-made for a subset of patients according to their genetic profiles, and in the longer term the identification of new drug targets. Just as with ESTs and then with the complete human sequence, there was a fear that the gold-rush mentality would lead to large numbers of SNPs being tied up in patents and pulled beyond the reach of further research. That fear seemed to be justified in May 1998, when Celera announced that a ‘catalogue of human variation’ (that is, a SNP database) would be one of its flagship products. Other commercial operations, such as Incyte and Curagen, almost immediately countered by launching their own SNP initiatives, focusing only on the protein-coding regions of the genome.
Realizing that for each company to build up its own SNP database would be enormously wasteful of time and resources, Glaxo Wellcome plc began to talk to a number of other big pharmaceutical companies about doing it jointly. Michael Morgan got wind of this, and offered the resources of the Wellcome Trust—money and the Sanger Centre’s sequencers—to help get the project off the ground. Despite the difficulties of getting ten competing companies to work together, and the need to avoid falling foul of the stringent anti-trust laws in the United States, the non-profit-making SNP Consortium was formally launched in April 1999, with a budget of $14 million from the Trust and $3 million from each of the ten companies. Alan Williamson, now retired from Merck, played a key role in the negotiations, in a reprise of his earlier success in brokering the funding of Washington University to carry out EST sequencing in 1994. The SNP Consortium commissioned the Sanger Centre, the Whitehead Institute and the Genome Sequencing Center in St. Louis to find 300,000 SNPs by 2001.
The SNP Consortium database was to be free and publicly accessible; in the jargon of the commercial world, it was to be seen as a ‘pre-competitive’ development and therefore not bound by the laws against collusion by companies in the same industry. Its announcement struck a timely blow for the common ownership of the genome and caused a blip in the share prices of the genomics companies. This public ethos meant that for the Sanger Centre to work for the consortium would not conflict with the principles we had established for sequencing, and the contract brought in valuable funds. In practice we had already started the work, as polymorphisms frequently turn up in the overlaps between one length of DNA and another as the sequence is assembled. Ian Dunham and his colleagues had also begun to look for SNPs more systematically, examining the overlaps in stretches of sequence on chromosomes 22, 13 and 6. But the consortium contract now obliged us to work more comprehensively. The pilot work went well, but our problems with getting the new sequencing technology working meant that by the autumn we were seriously falling down on our targets and under threat of having our contract terminated. David Bentley averted the threat by honestly setting out our problems at the Sequencing Consortium meeting in October, and confidently asserting that we would catch up by the end of the year.
If we seemed to be having problems achieving our goals, it was only because we were trying to do so much. Throughout all the setbacks we were making steady progress with our human sequencing projects, and knew that we were building up to a major triumph. In August 1999 Ian Dunham, who coordinated the chromosome 22 sequencing project, sent round an e-mail telling everyone involved that the team now had a contiguous sequence of DNA 9 megabases long—5 megabases longer than any other human sequence in the world. They were well on target to publish the complete, finished sequence by the end of the year.
Chromosome 22 is one of the smallest chromosomes, containing less than 2 percent of total human DNA. The chromosome 22 sequencing project was an excellent example of the Human Genome Project in miniature. The Sanger Centre carried out over two-thirds of the sequencing, in collaboration with Bob Waterston’s lab in St. Louis, Nobuyoshi Shimizu at Keio University in Tokyo and Bruce Roe at the University of Oklahoma. Five other institutions in the United States, Canada and Sweden had worked on the mapping phase. During 1999, as we struggled to produce shotgun sequence for the ‘working draft’ of the whole genome, Ian and his colleagues were patiently working through the time-consuming finishing stage on chromosome 22. They had to link up all the sequenced clones, correct errors and look for gaps. Filling the gaps meant trying to find new clones that were clearly linked to landmarks on either side of the gap and sequencing those. With our map-based methods it was possible to do this in a systematic way, and Ian’s group and the sequencing teams closed gaps remorselessly for month after month.
The chromosome 22 sequence was published in Nature on 2 December. It would always have been a major milestone, but with the publicly funded project under such pressure from the Celera PR machine, we had to make as much of it as we could. Just as we had with the worm, we held simultaneous press conferences in Washington and London, and this time there was also one in Tokyo to mark the important Japanese contribution. We’ve since been mocked for some of our hyperbole on that occasion. In newspaper interviews I made comparisons with Copernicus’s discovery that the earth goes round the sun, or Darwin’s theory that humans are relatives of apes, and Mike Dexter said something similar about the invention of the wheel. But of course I wasn’t just talking about chromosome 22—I was thinking about the whole enterprise of molecular biology, and how it is changing our view of ourselves. I’ve used the same comparison frequently, and I don’t think it’s overstating the case.
What was immediately important to us about the finishing of chromosome 22 was that it proved that the strategy we had adopted for the whole genome worked. Ian Dunham had his feet more firmly on the ground when he told Nature that the main significance of the publication was that ‘it shows that you can get very good finishing using the clone-by-clone approach.’ If we could do it for chromosome 22, we could do it for the whole genome, and suddenly that long-cherished goal seemed a lot closer.
Despite its small size, chromosome 22 had plenty of significance in its own right. Geneticists had already implicated it in at least thirty-five diseases, including schizophrenia, chronic myeloid leukemia and some forms of heart disease. Using the unfinished sequence that had been released over the previous few years, they had begun to pinpoint some of the genes involved. James Scott, professor of molecular medicine at the Hammersmith Hospital in London, identified seven new genes relevant to cardiovascular disease in this way. ‘We could not have done this work without the chromosome 22 data,’ he told Nature. The publication of the complete sequence not only announced that almost 33.5 megabases of finished sequence was in the databases, but included the identification of 545 genes (based on comparisons with other gene sequences such as ESTs), more than half of which were previously unknown in humans.
The other important point about the chromosome 22 publication is that it brought home to us and to the scientific community at large what a very difficult business it was going to be to sequence the complete human genome. The published ‘complete’ sequence in fact omitted the short arm of the chromosome entirely, and included eleven gaps ranging from 50 to 150,000 base pairs on the long arm. The short arm consists almost entirely of repeat sequences, which make it all but impossible to reassemble in the right order using any current technology. But its composition makes it very unlikely that it contains many protein-coding genes, if any. The eleven gaps in the sequence of the long arm were left because, for reasons that we don’t fully understand, those regions refused to establish stable BAC clones. Some of these gaps have since been closed, and more will be closable in time, but it would have been unnecessarily punctilious to delay the publication until the most intractable problems had been solved. The sequence as published was still a major achievement.
It was very satisfying to be able to make a statement about the progress of the publicly funded genome project that really meant something. I had been very conscious throughout 1999 of the political high wire we had to walk in relation to Celera. Craig Venter’s agreement with Gerry Rubin to sequence the fly genome was announced in January 1999, including a commitment to make all the data publicly available. On the back of this, Craig immediately began to negotiate with Francis Collins about a
similar agreement on the human genome. From the moment I saw the draft agreement I could see that the advantages would all be in Celera’s favor. For example, while it included a general comment about commitment to making the sequence available to the international scientific community, it crucially left vague the question of free and unrestricted access. It also included avoiding ‘inappropriately adversarial comments’ about each other’s work, just as on our side we were beginning to talk about a more vigorous press stance.
I was very averse to the whole notion of entering into an agreement with Celera. I recommended to Michael Morgan that whatever Francis might do, the Wellcome Trust should not be party to it. Together with Bob Waterston’s lab in St. Louis, we had already had a direct approach from Gene Myers, one of the original proponents of a whole-genome shotgun approach to the human genome who now worked at Celera. He asked us to hand over all our trace data on C. elegans so that they could use them (minus any information about map locations) to test their whole-genome shotgun assembly program. It was not a trivial request—it would mean tying up someone’s time extracting all the trace data and saving it in a form that could be transmitted to Celera. Craig became highly indignant when we seemed reluctant to agree, arguing that as the work was publicly funded the data should be available to anyone. Of course, we had always released our assembled sequence freely, but now it seemed that wasn’t enough; he wanted the raw data, too. We continued to negotiate in a desultory way—for example, we proposed that Celera should make its shotgun assembly program available to us in exchange; they offered to pay for the extra work involved—but once they were up and running with the fly sequence they had no need of the worm data any more, and the subject was dropped.
In the United States, however, there were repeated attempts by the public funding agencies to come to some kind of compromise with Celera. There was discussion of a plan, for example, to enable the company to deposit its data in a special section of GenBank, where anyone could look at it on the web, although researchers would not be able to download it as they could with the public data. This too came to nothing. The ‘public release’ that Craig had promised when Celera was launched seemed less and less likely to happen, at least in any form that bore meaningful relation to our definition of public release.
Meanwhile a relentless barrage of Celera press releases made it look as though they were simply blowing the public project out of the water. They started sequencing the fruit fly Drosophila in May 1999, and in September announced that the sequencing was ‘completed.’ This did not mean that they had fully finished, or even assembled, the 180 megabase sequence: it just meant they had run enough samples through the machines to cover the whole genome. But of course the message that came over was that the fly genome had been finished in four months, and needless to say Celera lost no opportunity to make unfavorable comparisons with ‘other early genomes’ (presumably including the worm) that had, in the words of its press statement, taken ‘over a decade’ to complete. (At the time of writing, two years later, the fly genome is still being finished, as one would expect.) More seriously, the apparently happy collaboration between Gerry Rubin and Craig Venter which had produced the fly sequence in such record time was held up as an example of what the human project ought to aim for. ‘It has been a win-win affair,’ said Nature in an editorial in December.
Six weeks later Celera announced that it had sequenced 1 billion base pairs of human sequence, and the pressure was on Francis Collins again to find some way of collaborating. There is no doubt that the opportunity to add Celera’s shotgun data to our mapped clones could be immensely valuable, and I had publicly supported this prospect on more than one occasion since the first Celera announcement. But not if it meant restrictions on who could use the
data and on what terms. In the case of the fly, Celera had eventually agreed to deposit the data in the public databases at the time of publication, with no restrictions on its use. Gerry had been firm in keeping Celera to the agreement, telling Craig that on no account could he copyright the database so that other commercial companies couldn’t use it. There was a brief panic when some of the fly researchers discovered that a notice had appeared on the website of the National Center for Biotechnology Information forbidding the redistribution of the fly data, but Mike Ashburner blew the whistle and once again Gerry stepped in to ensure that Celera stuck to the letter of its original agreement. The notice came down in less than two days. In November Celera invited forty fly biologists to an ‘annotation jamboree.’ Annotation is the process by which the raw sequence is analyzed for gene content and embellished with any extra information relevant to understanding its biological role. At the Celera jamboree geneticists, sequencers and bioinformatics people all got together, sitting at computer screens to add all the details they could of likely membership of gene families, comparable genes in other species and so on. The whole exercise was hugely valuable to the fly community and to biology as a whole, as many of the genes described would be relevant to other species including humans.
But the human data was a very different proposition to a commercial organization such as Celera. By late 1999 it seemed clear to me that the company was never going to agree to a joint database with completely unrestricted access. They might let people look at their data, but they weren’t going to let them add to it and pass it on. In other words, as far as the human was concerned, they wanted to keep control of the annotation stage. This was simply unacceptable. The raw, unannotated genome is not a useable tool in the hands of the average biologist. What the public project aimed to provide was much more than this. Tim Hubbard, head of sequence analysis at the Sanger Centre, and Ewan
Birney, who had moved from Richard’s group to the European Bioinformatics Institute next door, had been working throughout 1999 on a software tool called Ensembl that would automatically annotate the genome and display it in a user-friendly way. (It went on line for the first time in October that year, and has been regularly updated ever since, its development funded by the Wellcome Trust.) Providing an analysis of the genome was an essential part of putting it in the public domain, both to give users the best possible view of the data and to preempt trivial patenting based simply on sequence comparisons; handing responsibility for this step to Celera was out of the question.
Frankly, I thought that further negotiation would be pointless. But Eric Lander seemed to think that collaborating with Craig was the only way to avoid his being declared the clear winner in what was increasingly presented as a race to complete the genome, and we gradually realized that he had been discussing the idea with Celera’s representatives for some months. Eric saw some kind of collaboration as the only way to gain control of the situation.
I did think it would be a useful thing to try to defuse this and force the data out into the public domain. Also I thought that the wars going on during the project were very damaging. I wanted to see this as peaceful as possible, and Craig and I exchanged e-mails and conversations on all this.
We at the Sanger Centre knew nothing of this at first. Then I heard from Nicholas Wade on the New York Times that a proposed collaboration was being discussed, and wondered what was going on. It was not until Bob phoned me in mid-November that we realized how serious it was. He said that a conference call had been set up with Celera, that Eric had prepared a background paper, and that he thought I should be in on the call. He sent the paper to me for comment. A day or two later I got a call from Francis Collins
asking me to join the conference call, which was scheduled for the next day, a Saturday. But at the last minute it was called off. Everyone was rather evasive about why—Francis just said he had decided it would be ‘premature’—but it gradually emerged that Craig had refused to join in if I was on the call, and Bob had continued to insist that my presence was essential.
The following weekend was assigned for our annual board of management retreat. That year we went to Stamford in Lincolnshire. Lovely as Blakeney is, it’s hard for people from London to get there, and we wanted Mike Dexter and Michael Morgan to join us for part of the time. We assembled in the George, the splendid old inn that rambles through the center of the town. I brought Eric’s document and we went through it to see what we could commit to. The retreat was really for planning the future of the lab, but this issue was too important to wait. The proposal was for a merging of the data from the two sides, beginning the following spring or summer, with joint publication by the end of the year. But the document preserved the principle that the complete sequence should be freely available in the public domain, with no restrictions on its use.
Reading it, I doubted that Celera would actually sign up to such liberal conditions when it came to the point. It seemed absurd to suppose that progress could be made, but we had to go through the proposal legalistically just in case it became reality. We talked both about the minimum that we could agree to and about the safeguards that would be required to enforce the agreement if made. (I hadn’t forgotten the last-minute attempt to backtrack over the fly release—and these were human data we were contemplating.) I spent the day running between the meeting room and the phone in my hotel room to confer with Bob.
There were few differences of view among us all. The Trust and the directors of Genome Research Ltd were just as concerned as the BoM about data release and freedom of use; the Trust had not
invested its money to see the benefits going to a United States entrepreneur. No wonder Celera targeted the Sanger Centre and the Trust in press statements, accusing us of wasting money that should go to other kinds of research. The Trust was beyond the reach of political lobbying and so had to be attacked in other ways. Many of Craig’s jibes were ludicrously wide of the mark. For example, he told the New Yorker that ‘the Wellcome Trust is now trying to justify how, as a private charity, it gave what I think was well over a billion dollars to the Sanger Centre to do just a third of the human genome.’ In fact the total Trust grant for human sequencing up to the end of 2001 was £120 million, or $180 million. It used to bother me when Craig came out with this stuff, because I knew that British scientists were hearing it and some were disliking us for the stance we took. But the shriller the accusation, the more obvious it was that we were doing something necessary and had to stand firm. And the role of the Trust in defending the public position is just as important today.
Now that we were officially in the loop on the progress of discussions with Celera, we spent the next month scrutinizing successive versions of the ‘statement of principles’ originally drafted by Eric and refined by Bob. A meeting with Celera representatives was finally set up for 29 December. I felt very strongly that the Wellcome Trust should be represented, to ensure that all its investment in the principle and practice of free release—not to mention their investment in the Sanger Centre—should not be thrown away by an ill-thought-out agreement. For this role Mike Dexter nominated Martin Bobrow, a member of the Trust’s board of governors as well as of Genome Research Ltd, and someone on whom I’d come to rely for wise advice. The rest of the public side’s negotiating team consisted of Francis Collins, Harold Varmus and Bob Waterston. Celera fielded Craig Venter, Tony White, another Celera executive, Paul Gilman, and Arnold Levine of Celera’s scientific advisory board.
Eric and Francis hoped that the meeting would establish common ground, and sent Celera a copy of the ‘shared principles’ in advance. But it was obvious to Bob from the word go that sharing was not on the agenda.
We had been led to believe that they were seriously seeking some co-operation, and that they understood that if we were to co-operate we were going to have to continue to release data. But boy, when we got in there Tony White had a different view of what was possible. He just took a hard line—is this going to make us any money?
Led by White, the Celera team demanded that the public project stop producing and releasing its own data as soon as the combined effort had sequenced the genome to sufficient depth to assemble a complete draft. A joint database would be the only way people could get access to the sequence, but Celera wanted to control it. The company could not accept the condition that the pooled data would be available to everyone, even their commercial competitors, to repackage and sell if they wanted to. Tony White wanted the merged data set to be protected from commercial use by others for three to five years; Francis Collins was prepared to consider six months to a year at the outside. ‘It was so different from what we had been led to believe was the basis for us going there,’ says Bob. He wondered whether the Celera team had planned some kind of nice guy/heavy routine for Craig Venter and Tony White, but ‘in the end we only got the heavy.’
In the case of the fly, Celera had been prepared (at least under pressure) to put the data in the public databases. It gained a lot of credibility for accelerating Gerry Rubin’s project and operating in a genuinely collaborative fashion, while giving paying subscribers to Celera a few months’ advantage over the public at large in access to the data. But in the case of the human sequence, the stakes were much higher. Craig was already quite open about the fact that
Celera was going to combine the publicly available data with its own in the commercial product it produced, and the company needed no agreement from us to do this. On the other hand, Celera’s representatives were very negative about the possibility that the public project might use Celera data, which at this point they were talking about releasing on DVD, to help with the finishing of the sequence. (The DVD idea was dropped altogether soon afterwards.) And they did not accept the last principle on our list: that if there were to be a scientific publication containing data from both projects, then it should have authors from both sides. The fly project had been trumpeted as a model for what could be done in the human—but Celera was clearly unwilling to conform to its own model. There was no meeting of minds; it seemed that the only reason the Celera team had agreed to the meeting at all was that they did not want to be seen to be the ones who had cut off negotiations.
There was deep frustration among the public project’s scientists that with or without their agreement, Celera was going to profit from our work while simultaneously claiming to have ‘beaten’ us in the ‘race’ to the genome. As it began to look increasingly unlikely that any agreement would be signed, some of those on the public side began to wonder if there was any way we could give our data some measure of protection. Patenting had been dismissed early on in the discussions at the first Bermuda meetings, as had the idea of holding the intellectual property in some sort of trust. But our head of sequence analysis Tim Hubbard proposed a different model. I had compared Celera with Microsoft in its desire to corner the genome market, and it was in the software world that Tim found another analogy.
He was very struck by the ‘free software’ movement that had come up with a way of encouraging collaborative software development by ensuring that the results of the collaboration, the computer source code, would remain available to anyone in perpetuity and could not be turned into commercial property. The movement had
grown from its roots in 1984 to a point where its collective software could be put together to create a complete operating system popularly known as Linux. The movement is the antithesis of Microsoft, which jealously protects its source code as a commercial secret, and it has come to be seen as a counter to the hegemony of Bill Gates’s company. Anyone is free to download the software from the internet. The source code is ‘open’, rather than a commercial secret, and users are free to modify it and pass it on to others, either free of charge or for a fee. The only constraint is that users must agree, by signing a license, that the same conditions will apply to any modified version they pass on. (This kind of agreement is sometimes known as ‘copyleft.’) The result has been a spreading community of users and developers of free software, in which no-one can impose secrecy on their version or deny others the opportunity to develop it further.
As talks with Celera proved less and less likely to get anywhere, Tim and others began to work on the idea that we should use the open source model to protect our data. The idea was to put a note on the human genome data deposited in the public databases by the G5 genome centers, saying that anyone would be free to use the data in their own research or to develop products, and to redistribute it in any form. However, anyone who did this would not be allowed to put in place new restrictions on its further development or redistribution. Michael Morgan was rather taken with this idea, and the Wellcome Trust’s head of legal matters, John Stewart, spent a lot of time looking at the arguments and drawing up a draft license agreement.
But the idea met with a chorus of disapproval from those at the public databases. They argued that it went entirely against the principle, hard won over the previous decades, that data deposited in the databases were completely free for anyone to use without restrictions. They pointed out that other commercial companies, such as Incyte, had for years been selling commercially protected proprietary databases that included public data and no-one had ever
protested. They were vehemently opposed to encouraging the idea that anyone in future who wanted to deposit data in the public databases could impose their own set of conditions. They reminded the G5 that international collaborators who had won their countries’ acceptance of the Bermuda Principles only at some cost to themselves would justifiably feel betrayed if the G5 were seen to be retreating from those principles. And finally, far from being the PR coup that Tim had envisaged, they foresaw it as a PR disaster, easily interpreted by Celera as an ill-intentioned spoiling tactic.
My own feelings were confused as the discussion swayed to and fro. Looking back, I see that Tim and I were interested in the open data possibility in the same sort of way as Francis Collins and his colleagues had been in the memorandum of understanding with Celera, as a means to escape from an absurd situation. But our scheme wanted to change the world, whereas the memorandum would have recognized the world as it then was and changed the project to fit. Still, the critics were right, of course: in the end our whole-hearted commitment to public access and free use of the data by both industry and academic scientists was our biggest selling point, and to compromise it would have been disastrous. We dropped all discussion of open source licensing or any other form of restriction on the use of data from the public project. Meanwhile, in October 1999 Celera had announced that it had applied for patents on 6,500 new genes. Although these were provisional applications, and Celera claimed it would ultimately pursue only 200–300 of them, the fact remained that they could potentially give Celera title to a great deal more biological information than it had suggested when the company was launched.
Francis Collins was more concerned than ever to dispel the persistent media image that Celera and the HGP were engaged in a race, and an acrimonious one at that. The success of Celera’s clever press campaign reinforced the view that something needed to be done about it. In January 2000 the company announced that it had sequenced 81 percent of the genome, and had combined this with the public data to produce 90 percent coverage. The instant impression, to the uninitiated, was that Celera had done nine times as much as the HGP.
It’s worth looking in some detail at this astute announcement. Remember that we need to sequence enough reads to cover the genome several times over in order to close most of the gaps. The Celera release was based on the fact that they had only 1.75-fold coverage in raw sequence reads at the time, which if distributed randomly would mean that 81 percent of the genome would be represented. This was purely a paper calculation, because there was no way of assembling such a thin coverage of reads to validate their distribution, but was probably about right. The extra 9 percent arose because the public project at this point had about half the genome represented in draft sequence from clones, and so on a random basis would be expected to make up half the remainder. But the two sorts of data were not comparable at all. Our side was working systematically, and the individual clones had been sequenced to a fourfold depth of coverage, which resulted in a useful level of assembly. (This could be further enhanced if it were to be combined with Celera’s reads.) So an objective statement would have said that half the genome was well represented and mapped by the HGP, and a further 40 percent could be found in sparse random reads from Celera.
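The arithmetic behind these figures can be checked with a short sketch. It assumes the standard idealized model in which reads land uniformly at random, so that the expected fraction of the genome touched by at least one read at c-fold coverage is 1 - e^(-c). That model is an assumption brought in here for illustration, not something taken from Celera's release, and the function names are hypothetical. Under it, 1.75-fold coverage gives roughly 83 percent, close to but not exactly the 81 percent quoted; the second function reproduces the step from 81 to about 90 percent when half the genome is independently represented in the public draft.

```python
import math


def fraction_covered(fold_coverage: float) -> float:
    """Expected fraction of a genome touched by at least one read.

    Assumes reads fall uniformly at random (idealized Poisson model),
    ignoring read length effects, repeats and cloning bias.
    """
    return 1.0 - math.exp(-fold_coverage)


def combined_fraction(random_fraction: float, other_fraction: float) -> float:
    """Fraction covered by either of two independent data sets.

    The second data set is expected to fill `other_fraction` of whatever
    the first one missed.
    """
    return random_fraction + other_fraction * (1.0 - random_fraction)


if __name__ == "__main__":
    # Idealized estimate for 1.75-fold random coverage.
    print(f"1.75-fold, random reads: {fraction_covered(1.75):.1%}")

    # Using the 81 percent figure from the announcement and a public
    # draft covering half the genome, independently of the random reads.
    print(f"81% plus half the remainder: {combined_fraction(0.81, 0.5):.1%}")

    # Fourfold coverage under the same idealized model.
    print(f"4-fold, random reads: {fraction_covered(4.0):.1%}")
```

The same model suggests why depth matters: at fourfold coverage the expected fraction rises to about 98 percent, although a calculation of this kind says nothing about whether the reads can actually be assembled.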
I tried hard to explain this to journalists, but not many got it and most thought it was sour grapes on my part, though a few more careful writers noted that half of the data in the Celera database in fact came from the public databases. The result was that the media myth of Celera being ahead of the game became firmly established, even though (or perhaps because) all our data were there for all to see and use, while nobody had seen Celera’s data at all. To paraphrase Gilbert and Sullivan, ‘a press release, a press release, a most ingenious press release’!
It was the first time in my life that I’d been faced directly with such ruthless manipulation. I vaguely knew that such things went on, of course, and was sadly aware that political life entailed a measure of ‘economy with the truth.’ But that had all gone on somewhere distant from me. Now the information was coming from a company run by a highly intelligent person, a fellow scientist, someone who claimed to be working for the interests of humanity. Was it possible that he didn’t understand what he was saying? It was shocking: the methods of journalism were being used to report science. And of course the journalists loved it; these were good, clean, uncomplicated stories with none of the ifs and buts that mar real scientific reports for the purposes of the media.
At first I expected Francis Collins and Michael Morgan to deal with it, as we had agreed (I thought) beside the pool at Airlie House. If they needed professional PR support they could hire a firm to help. But they seemed unable to respond effectively. We were all thrown back on our own resources to present the case as well as we could. We suffered from lack of coordination and lack of time.
For a while it didn’t bother me greatly, because I expected that once people outside knew what was going on they would rise up in protest. As the months went by, and Celera was lauded by commentators, I continued to think that it was only because nobody knew the true situation. Little by little, through answering questions, I found myself edging into the spotlight, trying to explain that Celera’s press releases weren’t painting the true picture. And insidiously I found myself going along with the media’s desire for an easily identifiable figurehead. I hadn’t wanted to take a lead on the PR front for the simple reason that, judging by the usual standards in the scientific community, the human sequence was not my work. But once I started giving interviews, making my points in what I hoped were clear and forceful terms, the whole thing snowballed.
A few people paid attention, but seemingly not many. For instance, I was sadly disappointed, and still am, by the BBC’s news coverage: after all, as a publicly funded body, serving the U.K., surely it should have seen what we were doing and how important it was to keep genome data in the open? I began to realize that presentation matters enormously, that nobody has time or patience to examine the facts for themselves but rather takes up what is proffered most conveniently. So I began to adapt, to be more vehement. I still clung to the thought that if only people knew the truth they would come round to our point of view.
A significant part of Celera’s press announcement was the statement that it planned to stop sequencing once it had achieved fourfold coverage—each DNA base in the genome covered by an average of four sequence reads—rather than going for the full ten-fold coverage it had originally planned. The implication, clearly spelt out by Nature’s news team although not widely discussed, was that Celera was no longer going to rely on its whole-genome shotgun assembly program to put the sequence together. ‘Celera will need to hang its sequence data on the framework produced by the public project,’ said Nature. In other words, having initially declared that mapping was unnecessary, Celera was now preparing to use the public project’s map to help assemble its own sequence.
Celera described its use of our data as a ‘de facto collaboration’ (I had always thought that collaborations were two-way affairs, but time moves on), but Francis was still after something more formal. After the disastrous 29 December meeting he made repeated attempts to contact Craig, to keep negotiations going at some level. If he could not achieve the merging of the data from the two efforts, at least, he thought, he could try for simultaneous publication. But for two whole months Craig became mysteriously unable to return phone calls or e-mails. The most Francis achieved was a telephone conversation with Tony White, who made it clear that he was not prepared to move from the position he presented at the meeting. Knowing that Celera was blaming the public project for the breakdown of negotiations, Francis felt it was important to tell the other side of the story. He drafted a letter setting out the main points of difference with the company and recounting his unsuccessful attempts to restart negotiations. The letter, marked ‘Confidential’, was signed by the four negotiators: Francis, Harold Varmus (who had resigned as head of the NIH at the end of 1999 to become president of the Memorial Sloan-Kettering Cancer Center in New York, and had been replaced as acting director by Ruth Kirschstein), Bob Waterston and Martin Bobrow. It went off to Craig on 28 February 2000. It set a deadline of 6 March for his response, adding that unless they heard from him by that date, the authors would assume he was no longer interested in collaboration.
Everyone involved understood that the contents of the letter were likely to be made public at some point, although exactly how and when this should happen was left vague at first. I genuinely believed that once the world saw how intransigent Celera had been in its negotiating stance, and how determined it was to keep control of the data, everyone would immediately be on our side.
In the intervening week Celera launched its first, highly successful share issue, which netted the company almost $1 billion. We felt we could not release the letter until the issue closed, as it might have been seen as an act of deliberate and illegal sabotage. But we were anxious not to wait too long. The letter was released to the press at the weekend before the Monday deadline for Celera’s response, and it was made known that the Wellcome Trust was the source. (It would have been politically disastrous for the National Institutes of Health to be involved in such a leak.) It had a huge impact, but not in the way we had anticipated. By jumping the gun by one day we inadvertently put ourselves in a bad light, which Celera was immediately able to exploit. Craig Venter and Tony White came out blazing with righteous indignation, calling the Trust’s action ‘slimy’ and ‘a low-life thing to do.’ Craig even taunted us for not having released the letter earlier and dented the share issue. They told the
Washington Post that they believed the public genome project was deliberately trying to sabotage any chance of collaboration with Celera because it wanted to make a deal with another privately backed consortium in order to get the genome done first. Francis Collins had indeed been negotiating with Incyte about contract sequencing, but this had nothing to do with the breakdown of the Celera negotiations.
Amazingly, it now seems with hindsight, we had not thought at all about how we would handle the follow-up interviews, or what Celera’s response might be. The United States press tracked Francis down at his family home and immediately put him on the defensive: no, NIH had not been involved in the leak, and the idea that it had been done for underhand motives was ‘fanciful.’ It sounded weak. Craig, for his part, issued a response couched in terms of pained dignity, insisting that he continued to be interested in pursuing ‘good faith discussions towards collaboration.’ He reiterated, however, his company’s need for assurance that other companies would not be able to repackage and resell its data. He emerged with his credibility intact, although not his bank balance. The news that a formal collaboration was almost certainly off worried the markets. The value of stocks in all genomics companies had been climbing at a ridiculous rate since the turn of the year, along with high-tech stocks generally. Celera’s own stocks increased in value from $186 to $258 on the strength of the January press release alone. Back in the autumn they had hovered around $40. In the share issue the week before the leak, PE Corporation had sold its 3.8 million shares in Celera for $225 apiece. But as soon as news of the impasse between Celera and the HGP appeared, biotech shares began to fall. (Why they should do so was a mystery to me, as with or without an agreement, Celera would still have full access to the public project’s data.)
The gloves really came off in the next few days (the Washington Post described the genome project as ‘a mud-wrestling match’). I was interviewed on the Today program on BBC Radio the following Monday morning. I pointed out that our problem was that Celera not only collected their own data but would ‘hoover up all of ours’—which of course was publicly available—call it their own, and charge others for using it. ‘It’s a sort of con-job, if that’s not too rude a word,’ I added. From its place deep inside the interview BBC Online pulled out the word ‘con-job’ and flashed it around the world, to be seized on by journalists. And what did people say? Some approved. But many accused me of mud-slinging, jealousy, protecting my turf. I had been heard, but the world by and large divided along party lines.
I had entered the world of politics.
A week later Bill Clinton and Tony Blair made a joint statement saying that the human genome sequence should be freely available to all researchers. The statement was the result of careful lobbying for over a year, initiated by Mike Dexter at the Wellcome Trust—it was just coincidence that it finally came out a week after the leak debacle. We were delighted with the statement, but all it really amounted to was a government-level confirmation of the Bermuda Principle that primary genomic sequence should be freely released. It had no standing in law, and it did not threaten companies’ right to patent genetic sequences of proven utility—in fact it explicitly supported the protection of intellectual property. But on the day of the statement CBS Radio News reported that Clinton and Blair had agreed to ‘ban patents on individual genes’, following an early morning White House press briefing.
That proved to be the last straw for an already jumpy and still overvalued stock market. The Nasdaq, the index of biotechnology and other high technology stocks, suffered the second biggest fall in its history, more than 200 points. Thirty billion dollars was wiped off the value of just ten biotech companies in a day. It was left to Neal Lane, the President’s scientific adviser, and Francis Collins, to put things straight to a clamoring Washington press corps at a lunchtime press briefing. The Nasdaq bounced back almost at once, but genomics companies such as Celera and Incyte stayed for some time at a more realistic level than the dizzying heights they had reached only two weeks before.
This time I was asked at the very last minute to appear on BBC TV’s late evening current affairs program Newsnight—Don Powell, the Sanger Centre’s press officer, had to come and drag me out of the pub. I was sitting in the remote studio in Cambridge, all prepared to talk about the impact of the announcement on patents, when the science editor Susan Watts introduced her package by saying, ‘No one disputes that Venter has made the crucial breakthroughs that mean he is now leagues ahead in the decoding game.’ I was amazed, because this was so far from the truth and yet it was being accepted as a starting point for the discussion. As soon as the presenter, Jeremy Paxman, brought me on I said, ‘The public program is actually ahead. We have already released two-thirds of the human genome.’ It was now Paxman’s turn to be amazed, and he said, ‘If that’s the case and it’s all in the public domain anyway, what are we worrying about?’ I answered that the Clinton-Blair statement was unlikely to make any immediate difference to the patenting of DNA because so much was already publicly available, but that it was a very valuable lead for future debates. I said that I thought most people agreed with the statement, and that it was a superb endorsement of the HGP effort—putting the genes in the public domain where they should be. Said Paxman: ‘I couldn’t agree with you more!’
A lot of people thought we came out of it rather well, but Susan Watts phoned up the next day and said it had been ‘disappointing.’ Jeremy Paxman is not supposed to agree with his interviewees; what the producers are after is passionate debate. I told her I was sorry, but she had altered the substance of the discussion by saying Celera was winning hands down. It wasn’t especially her fault: she was simply repeating the picture that had been so carefully fostered. On the telephone with her beforehand I had been more passionate about the iniquity of patenting DNA sequences, but on camera I was forced to deal with her introduction. I really buttoned myself down, and exuded quiet confidence rather than the passion she was hoping for. It was altogether a fascinating insight into the workings of the media and the power of unchallenged press releases.
On 6 April, the day that Craig was once again to testify before a congressional subcommittee investigating the progress of the public and private efforts, Celera announced that it had completed the sequencing of the first human genome. It would assemble the sequence over the next few weeks, and was soon going on to sequence the laboratory mouse. Once again there was a most ingenious press release that talked about elevenfold coverage of the genome. That sounded very impressive, and appeared to be vastly more than the public domain had achieved. The correspondents, as intended, were immediately convinced that Celera had won. But we knew that such sequence coverage was impossible with the capacity that they had: we were all running neck and neck, and knew exactly what could be done in a given time. Rapid digging revealed that they were talking about clone coverage, not sequence. In other words, they had sequenced a few hundred bases at each end of enough clones to cover the genome eleven times, a step they used to build ‘scaffolds’ to help with assembly. Nevertheless many press reports gave credence to a confident Craig asserting that everything would be put together in six weeks, and as usual paid little attention to our comments on the changed definition of sequence!
A day or two later a HUGO meeting was held in Vancouver. Although it had in the end played little part in large-scale human genomics, HUGO had come into its own with its annual meetings, which are well attended and have become a major event in the calendar of professional activity. Francis was asked to speak on the HGP effort that year. During the questions at the end of his talk, a member of the audience asked why, if Celera had finished the genome, the public effort was continuing. Francis explained that the measure being used by Celera did not amount to completed sequence, and pointed out that the assembly would be difficult and would require finishing. ‘You should not take at face value any claim by any group for at least two years that says we have finished the human genome sequence,’ he said. ‘It will not be true.’ That was a truthful and sober statement, and I was delighted to see it in circulation. Francis was not denying what Celera had achieved, just explaining the reality of producing sequence.
Celera shares, which had shot up in response to its press release, plunged once again. Soon after Francis returned to Washington his department issued a partial retraction, declaring that he ‘didn’t say anything critical of Celera in his speech.’ Following this episode, Francis pointedly refrained from making public statements of any kind about Celera.
I was appalled by what I interpreted as deliberate muzzling of the head of the Human Genome Project. It brought home to me forcefully that the strength of the industrial lobby in Washington means that no public servant can make statements that imply criticism of a commercial company (and of course, things are not greatly different in the U.K. and other industrialized countries). Francis agrees that the political reality of his situation was very different from my own.
The Sanger Centre has the support of Wellcome Trust, with whom they are philosophically very tightly aligned. That’s a less complicated position than I find myself in here, where there is a great interest in seeing private-sector efforts in biotechnology flourish, and anything that appears to be in any way critical of that can potentially be problematic. So it has been possible for John to speak his mind at times very bluntly, when some of the rest of us had to watch our language extremely carefully in order not to set off alarm bells.
Celera’s success in silencing Francis was very valuable to the company. To give one example, Columbia University in New York had been due to hear from him at a seminar in June, which he had to cancel (presumably he could have gone along and given an expurgated account that accepted Celera’s claims at face value, but of course that would have been counterproductive for both science and the truth). Meanwhile, Gene Myers of Celera did give a seminar at the university, indicating that the human assembly was going immensely well, without giving much detail. The result was that most of the audience became convinced that Celera had done it. The same scenario, repeated across the country, established an aura of success that greatly helped the company in the build-up to selling its database.
Meanwhile Clinton was getting impatient with the continued brawling between Celera and the publicly funded scientists. The White House wanted something nice to happen about the human genome, which was now getting such a lot of press attention. It was a party political issue, with many of the Republicans supporting Celera, hoping to garner support from the biotech industry which had taken a beating in the markets, and many of the Democrats supporting the public side. Continuing conflict could be extremely bad for Vice-President Al Gore in his bid to succeed Clinton that presidential election year. Clinton sent a note to Neal Lane saying ‘Fix it…make these guys work together.’ Ari Patrinos of the Department of Energy played the part of honest broker and got Craig and Francis round to his house for beer and pizza at the beginning of May. After a few more rounds of this ‘pizza diplomacy’, as Time magazine called it, they agreed to a joint announcement of the completion of the sequence, simultaneous publication later in the year and a truce in the war of words about who had done what. The negotiations went on in complete secrecy, to Francis’s discomfort.
I felt pretty uneasy about doing that. It was clear that it had to be done under conditions of confidentiality or Craig wouldn’t be willing to play ball. And yet as someone who is used to communicating with my colleagues at every little step, for a couple of weeks I wasn’t able to do that, and that put me in a very awkward position.
All we knew was that Francis was lying low, refusing to do any press interviews. Everyone seemed convinced that Celera would announce the completion of the draft genome in June. Our official line was that although we also hoped to reach our target of 90 percent in June, we would keep any announcement low-key and go for a big song and dance around the time of publication, which we expected to be in September. It was only in early June that we found out what had been going on. In addition to the beer and pizza sessions between Francis and Craig, the British science minister, David Sainsbury, had visited the United States and talked to the White House science advisers about making a simultaneous United States-U.K., public-private announcement of the completion of the draft human genome. The date of the announcement, 26 June, was picked because it was a day that happened to be free in both Bill Clinton’s and Tony Blair’s diaries.
It was not clear that the Human Genome Project had quite got to its magic 90 percent mark by then, and Celera’s data were invisible but known to be thin, so nobody was really ready to announce; but it became politically inescapable to do so. We just put together what we did have and wrapped it up in a nice way, and said it was done. We were sucked into doing exactly what Celera has always done, which is to talk up the result and watch the reports come out saying that it’s all done. Yes, we were just a bunch of phonies! But we were trapped by Washington politics.
Later that day I went round to the Channel Four news studio to be interviewed by Jon Snow for the seven o’clock news. As usual I talked about free data release and its importance. Then they brought in Craig Venter over a transatlantic link from Washington, and asked him about patents. ‘We never said we would patent thousands of genes,’ he said, asserting that at most the number of patents they might be licensing to their pharmaceutical partners was in ‘small double digit figures.’ ‘The fact that the bar has been raised [by a change in the United States patent office guidelines] is good news for everybody,’ he added. It was tremendously different from the October 1999 press release with its 6,500 patent applications. It seemed to me that our actions in putting so much sequence into the public domain had indeed changed the business plan of the company, or at least the spin it wanted to put on its activities.
Of course, the 26 June announcement was a political gesture, but it genuinely didn’t feel like that on the day. I remember thinking that it really had worked. It didn’t matter that it was founded on the White House desire to get Al Gore elected. What mattered was that people were not talking about ‘the race’ anything like so much. They were actually talking about the implications of the work. And the Human Genome Project’s sequence, incomplete as it was, really was available to anyone to use.