The CDC should release their flu sequences to the world’s scientists
Having up-to-date sequence information on different strains of flu virus is essential for producing effective vaccines. There are many scientists in the United States and other countries who are expert at analysing these sequences. However, all but a few are barred from seeing them. Why? Because the CDC has a monopoly on the flu sequences it possesses. Samples sent to the CDC by physicians from all over the country, and the world, are sequenced, but then locked up in a secret CDC database. Only the small number of scientists at the CDC are allowed to look at these sequences.
Two excuses for withholding this information have been given:
- National security
- Too time consuming
Let’s take each of these in turn.
National security is not a legitimate excuse. The Influenza Genome Sequencing Project, a project sponsored by the National Institutes of Health (NIH), has deposited the sequences of over 1056 influenza viruses isolated from humans into a publicly accessible database, GenBank. The Naval Medical Research Unit No. 3 (NAMRU3) has deposited an H5N1 sequence from a patient in Iraq into GenBank. While refusing to deposit sequences from the ordinary flu that infects many people each year, the CDC has participated in releasing the sequence of the 1918 strain of influenza (ref), which killed over 50 million people worldwide and over 500,000 Americans.
Arguing that depositing flu sequences in GenBank is too time consuming is also not credible. As noted above, the NIH has deposited thousands of influenza sequences in GenBank. GenBank has software available for batch submission of large numbers of sequences. Further, in order to use an internal database, which presumably the CDC has, the sequences must be in a standard format, likely FASTA. An FTP site that would allow researchers to download the sequences would be easy to set up. Finally, some of the CDC sequences are deposited in the Influenza Sequence Database maintained by the Los Alamos National Laboratory (LANL). This database has both a secure, password-protected side and a publicly available side. Most of the CDC sequences on this site are on the password-protected side. To make these sequences publicly available, the CDC merely needs to tell LANL that it wants the restrictions removed.
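To show how little processing a FASTA file needs once it exists, here is a minimal Python sketch that splits FASTA-formatted text into records. It is an illustration only: the function name `parse_fasta` is hypothetical, the sample headers and sequences are made up rather than real viral data, and nothing here reflects how the CDC or GenBank actually stores or processes sequences.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence data may be wrapped across several lines.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Made-up example records, not real viral sequences.
SAMPLE = """>example_strain_1 segment 4 (hypothetical)
ATGAAGACTATC
ATTGCTTTGAGC
>example_strain_2 segment 4 (hypothetical)
ATGAAGACCATC
"""

records = parse_fasta(SAMPLE)
for header, seq in records:
    print(header, len(seq))
```

A file in this form can be shared as-is; any researcher with a few lines of code, or an existing bioinformatics library, can read it.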
The consequences? Because the CDC hid their sequence data from other scientists, including researchers at the National Institutes of Health (NIH), the flu vaccine for the 2003–04 season was badly matched to the strain of flu making people sick.
From the Nature article Flu researchers slam US agency for hoarding data
One such swap made the virulent Fujian strain, which hit in 2003–04, and to which the annual vaccine was poorly adapted. “The minute we got our hands on some open data, it jumped out that here was something people were not aware of,” says one NIH scientist. “The CDC didn’t know what was going on with the Fujian thing, and by the time they realized, it was too late to use it for a vaccine.”
The result? The vaccine didn’t work as well as it should have. It is likely that many suffered greatly, unnecessarily. Some of them died. Some of them were children.
The severe 2003–04 season started in terrible fashion, with flu-related deaths of 93 children between that October and early January. Overall, deaths from pneumonia and flu were considered epidemic for nine consecutive weeks that winter.
MSNBC April 17 2006
This flu season, there was a better match between the vaccine and circulating flu strain. As a result,
This season, the number of deaths are lower than normal.
MSNBC April 17 2006
How many died due to the CDC policy of hiding flu data from other scientists? We may never know the total number, but given that the number of deaths due to influenza is estimated to be 20,000 to 40,000 per year in the US alone, it could easily be in the thousands. One number we can estimate is the number of children who died in 2003–04 due to the mismatch between the vaccine and the flu strain. Ninety-three children died that flu season; fewer than 24 died this flu season. Thus, we can estimate that approximately 70 children died in 2003–04 because of the mismatch between the vaccine and the flu strain that was infecting people that year.
Those deaths are directly attributable to Dr. Julie Gerberding’s decision, as Director of the CDC, to hide the available flu sequences from other scientists.
For a reminder of what happened in the 2003–2004 flu season, read this.
You might think this was enough to chasten Dr. Gerberding and that she’d release all the CDC sequences after this. You’d be wrong. The CDC persisted in its policy of hoarding data.
H5N1 influenza has the potential to cause a pandemic that will kill millions of Americans and hundreds of millions around the world. Surely, this is enough to get Dr. Gerberding to release the CDC sequences. Wrong again. The CDC sequences are still being hidden. This includes the H5N1 sequences they have.
What do respected scientists think about this behavior? Some examples:
“Many in the influenza field are displeased with the CDC’s practice of refusing to deposit sequences of most of the strains that they sequence.” Michael Deem, a physicist at Rice University in Houston
Nature, September 22, 2005
“I think what ought to happen is that the U.S., starting with people funded by NIH and the CDC itself ought to start releasing all of their data and all of their samples — and lead by example,” says Salzberg, director of the Center for Bioinformatics and Computational Biology at the University of Maryland.
“Because one complaint I’ve heard from other scientists in other countries is: ‘Hey, the CDC in the U.S. doesn’t release all their data. So why should we?’ And that’s a very legitimate complaint.”
Toronto Star, March 12 2006
“It is hard to get co-operation between CDC and modellers not at CDC,” said Simon Levin, director of the Center for BioComplexity at Princeton. “CDC researchers want to publish their results, and therefore [they] don’t share their data.”
Globe and Mail, March 25, 2006
- Blocking sequence deposits prevents us from monitoring the spread of H5N1. It is impossible to design reliable primers without accurate sequence information. Since PCR is one of the key techniques used to determine whether an individual is infected with H5N1 or not, blocking sequence data means that H5N1 could be spreading and infecting individuals without detection.
- Blocking sequence deposits prevents us from knowing whether existing vaccines are likely to work.
- Blocking sequence deposits prevents us from picking the most appropriate strain for vaccine development.
- Blocking sequence deposits delays the use of new, rapid vaccine development technologies.
- Blocking sequence deposits blinds us to the evolution of new strains of H5N1 which would cause us to raise our alert level if we knew about them.
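The primer point in the first item above can be made concrete with a small Python sketch. It counts the fewest mismatches between a primer and any same-length stretch of a target sequence, so a primer designed against an older strain can be checked against a newer one. Everything here is an assumption for illustration: the function name `min_mismatches` is hypothetical, the primer and strain sequences are made up, and real primer design also weighs factors this ignores, such as strand complementarity, melting temperature, and where in the primer a mismatch falls.

```python
def min_mismatches(primer, target):
    """Return the fewest mismatches between primer and any same-length window of target."""
    if len(primer) > len(target):
        raise ValueError("primer is longer than target")
    best = len(primer)
    # Slide the primer along the target and compare base by base.
    for start in range(len(target) - len(primer) + 1):
        window = target[start:start + len(primer)]
        mismatches = sum(1 for a, b in zip(primer, window) if a != b)
        best = min(best, mismatches)
    return best


# Made-up sequences, not real viral data.
PRIMER = "GACTATCATT"                      # designed against the older strain
OLD_STRAIN = "ATGAAGACTATCATTGCTTTGAGC"    # contains the primer site exactly
NEW_STRAIN = "ATGAAGACCATTATTGCTTTGAGC"    # same region after two substitutions

print("old strain mismatches:", min_mismatches(PRIMER, OLD_STRAIN))
print("new strain mismatches:", min_mismatches(PRIMER, NEW_STRAIN))
```

Without access to the newer sequence, nobody outside the lab holding it can run even a check this simple, which is why withheld sequences can leave a PCR test unable to detect a changed virus.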
These are not merely academic arguments. Blocking sequence data may result in the deaths of millions of people.
What should be done?
In 1996, the world’s experts in sequencing the human genome came together to discuss what to do about publishing this data. Issues of credit were discussed, but in the end, the rapid dissemination of these sequences in public databases, so that scientists all over the world could begin using this information to understand and cure disease, was considered too important to delay for any reason. It was decided that sequences would be made publicly available within 24 hours of their completion. Because the meeting was held in Bermuda, this agreement is known as the Bermuda Rules.
Ten years later, it is amazing that petty scientists concerned about credit are putting millions of lives at risk. Standards of acceptable behaviour for sharing sequence data when human life is at stake have already been established. The Bermuda Rules should be applied to H5N1 sequences. Anyone who blocks H5N1 sequences from the public databases is guilty of crimes against humanity.
What can you do? Write to Dr. Gerberding. Let her know you want her to stop blocking flu sequences from the world’s scientists. And don’t forget to cc her boss, Secretary Leavitt of Health and Human Services, and his boss, President Bush.
Here’s a letter you may use as a template if you wish:
Dear Dr. Gerberding,
I am writing this [letter or email] to express my grave concern regarding the failure of scientists at the Centers for Disease Control and Prevention (CDC) to deposit influenza sequences in GenBank. As you know, many of the world’s scientists are very worried that a flu pandemic may start soon. It has been estimated that such a pandemic may kill up to 2 million Americans, mostly young people and children. Many of the measures that we would use to reduce the impact of such a tragic event require the sharing of sequences obtained from flu viruses amongst the world’s scientists. Examples of activities that require these sequences include surveillance and vaccine development. Yet, most of the scientists of the United States and other countries are unable to examine influenza sequences acquired by the CDC, which has impaired their work.
The National Institutes of Health (NIH), in particular the National Institute of Allergy and Infectious Diseases (NIAID), deposits all of the influenza sequences it obtains in GenBank very quickly. The United States Naval Medical Research Units have deposited in GenBank the H5N1 sequences that they acquired. Since the CDC has published the sequence of the 1918 virus, national security does not seem to be the issue. Large numbers of nucleotide sequences can easily be deposited in GenBank through batch processing, so time and effort are not the issue.
There are concerns that these sequences are being kept secret so that individuals at the CDC can have exclusive access to them while they write papers, a process that may take years. Thus, the lives of millions of Americans may be put at risk to boost the careers of scientists at the CDC.
With respect, I ask that you require the scientists at the CDC to deposit all of their flu sequences in a database (GenBank) that would be accessible to all of the scientists in the United States and elsewhere, so that the spread of dangerous viruses can be monitored and the most effective vaccines can be developed. Further, I ask that you require CDC scientists to deposit their flu virus sequences within 24 hours of completion (Bermuda Rules).
Your Name Here