Open Access Flu Prevention & Treatment Database Project: Proposal

Disclaimer: As the originating author of this proposal, I should make it clear that I do not consider myself in any sense the owner of the ideas herein. I can have no claim to copyright, nor do I seek unique credit for it. I have not claimed it on my curriculum vitae, nor do I intend to. Indeed, the reason I posted it on FluWikie was so that it would get the widest possible exposure in the shortest amount of time, so that some other interested party, someone or some group with the technological expertise, personnel, and monetary resources, might be able to take on this project as their own. I have none of those things. A group that might be interested in such a project might reasonably ask, “Why would we undertake such a project if we cannot claim unique credit for its result?” The answer is best phrased as another question: If someone shows you the way to do a supremely good thing, do you refuse to do it just because it wasn’t your idea? If a pandemic happens, this is an idea which will result in great discoveries. It can be used by any party who will abide by its original intent: to gather the widest array of data about outcomes during a flu pandemic on a prospective basis, and to make those data available in raw form on a continual basis as a pandemic unfolds. My only purpose here is to show people how it can be done. If it works, there will be plenty of credit to go around. -MD

The text of this proposal is the property of FluWikie, and as such, revisions and useful additions by all parties are welcome. Wherever you read the word “link”, we have intended to include supporting sources, and will be gradually filling those in going forward. Any reader who knows of a link to any one of the sources we are referring to should feel free to edit the proposal so as to provide that source information. Source links should go to sources that are relatively static, such as journal citations in PubMed and web pages maintained in a way so as to make them “permanent.” Look to the editing instructions you can find HERE to show you how to do that; it’s easy. Furthermore, readers who find factual inaccuracies in this text should feel free to correct our errors. Others will surely check the checkers, but in the end this proposal and allied documents will be better for the attention they receive. Hereafter, first person references in this document will use the term “we” to reflect the fact that this document, like all wiki docs, is the product of a community.

Introduction

Much attention is being given to the possibility of a new influenza pandemic. The deadliest influenza pandemic on record was in 1918–1919. Millions died worldwide. Now scientists and public policy makers are using that outbreak and others as their motivation to design safeguards to counter an anticipated pandemic from a different strain of Type A influenza, with many betting that it will be caused by a new strain of the avian influenza known as H5N1.

Highly pathogenic H5N1 is spreading quickly from its source in southeast Asia, literally carried on the wings of birds. It already has the power to infect humans, though not efficiently, and kills about half of the people who show symptoms. There are very few serological data showing how many people are infected without showing symptoms, but what data there are show that to date, the virus has not caused widespread infection of people in areas where it has been found. Virologists and epidemiologists have identified several emerging strains, and worry that at some point in the near future a new strain will emerge that is capable of efficient human-to-human (H2H) transmission. They hope that by early identification of an outbreak of a transmissible strain, resources might be directed to those locations, so that local authorities might be able to smother it by preventing further transmission using antiviral prophylaxis and quarantine. However, many believe that once a strain appears that is capable of efficient H2H transmission, the chance to stop a pandemic may already be past. If these events come to pass, and a pandemic results, what can be done to limit the death toll?

The question has many answers, each taking the form of preventive strategies and treatment techniques. Many experts have studied and described these strategies far better than we can here, and some readers already know what they are. The purpose of this document is twofold: 1) we will review the limited data that allow us to make reasonable predictions about the effectiveness of the various preventive strategies and treatments, and 2) we will argue that, given the relative lack of good data, a flu pandemic would create an overriding need for a prospective, open access study that gathers the best information possible about individual health outcomes during the pandemic. Such data would let us make inferences about which strategies are most effective at preventing the worst direct and indirect effects of the panflu and, should an individual become infected, which treatments work best.

At its core, this proposal is intended to outline how to do a research project that can be done on short notice, under conditions of high risk, without an overarching theoretical question to be decided, done merely because we need to find something, anything out, so that we might be able to make decisions on the basis of the best data possible. The result is an audacious data gathering effort, to be executed while the participants are in mortal danger, all participating to further the general good. We have no doubt that there would be tremendous value in some part of these data, and that they may, if effectively disseminated, save thousands or perhaps millions of lives. In this era of the internet and near instantaneous global communication, we have no doubt that this goal is achievable.

About our terms: Before we continue, a distinction needs to be made between avian influenza and pandemic influenza, aka panflu (link). Avian influenza is an umbrella term which describes a family of type A influenza viruses, all of which have in the past infected, or continue to infect, birds around the world. In the flu pandemic of 1918, it was an avian influenza, the H1N1 strain, that scientists currently believe crossed the “species barrier” so that it achieved the ability to efficiently infect (and kill) human beings, thus becoming a pandemic influenza (link). Current forecasts of a future flu pandemic are based on the idea that the current avian influenza virus of concern, H5N1, may make a similar transformation at some point in the near future (link), following in the footsteps of its cousin and becoming a new panflu. When we use the term “avian flu” we are referring to H5N1, which is not well adapted to humans. There now seems little doubt that avian flu will become endemic worldwide in the next few years (link). For the purposes of this proposal, we will use the term panflu to refer to avian flu that has become adapted to humans, because this seems to be emerging as the term of choice among the communities of people who are following this story. The reader should always keep in mind that when we use the term panflu, we are referring to some future version of H5N1, or some other avian influenza. There is in fact no generally agreed upon estimate of the likelihood that H5N1 will become panflu (link), although Dr. Robert Webster is on the record recently estimating that there is a 50% chance that it will (link). Until the moment it occurs, panflu remains a hypothetical possibility, and so the utility of this proposal also remains a hypothetical possibility.

A wide array of suggested strategies are floating around as to how best to prevent infection with panflu should it arise, and once infected, how to treat it (links). Some of these strategies are published by public health officials and academics of various countries and some global agencies such as WHO (links), and so might properly be called standard measures, because they are the result of years of work by groups of people with considerable training and expertise in the area of infectious disease control, and are frequently those which are most widely agreed upon by that community. Often, these strategies are supported by empirical data gathered during outbreaks of similar diseases, such as type B influenzas. Sometimes, when relevant data do not exist, the suggested strategies are informed by accepted theories of the likely mechanisms by which a panflu would be transmitted and infect its hosts. One such notable theory is the so-called “cytokine storm hypothesis” account of the lethality of the 1918 panflu (link).

However, the panflu we all fear does not exist yet, and so it would almost certainly be a horse of a different color, with some features that have not been anticipated by the data or the theories that are in existence. The fact that so little is known about the panflu constitutes a serious problem. Human nature abhors a vacuum of knowledge, because with that vacuum comes uncertainty, and fear. Where no knowledge exists, people will create information that passes as knowledge, merely as a shield against uncertainty and fear, and even now people all over the globe are filling that vacuum with their own personal theories and experiences, all in the effort to find viable strategies for the prevention and treatment of a panflu. Thus, a much larger group of people, the public, has been talking about strategies for prevention and treatment that are of unknown value. These will be called non-standard measures, and they tend to be held closely by small groups of like-minded individuals, often with little more than indirect anecdotal evidence, or support by non-scientific personal theories. Still, they are widely circulated, particularly on the internet, and should panflu occur, many millions of people can be expected to utilize some combination of standard and non-standard measures, all in an effort to survive in what will surely be a very uncertain situation for everyone involved.

Whatever their individual merits, each and every one of the strategies for prevention of infection, and for treatment of infection, shares a common shortcoming. Since the disease they are all intended to address does not yet exist, not a single one of them is backed up by direct empirical data, gathered from areas in which the panflu is circulating in the general population, or from infected individuals who are being treated for panflu. This is not to say that none of them will work. Indeed, this proposal is predicated on the possibility that one or more of them will work. This proposal is for the creation of a mechanism for evaluating the efficacy of various preventative strategies and treatment regimes for panflu in real time, while a pandemic is in progress. While we may begin the pandemic in a situation of uncertainty, with the right focus on gathering data, as a pandemic proceeds we can discover useful knowledge as it becomes available. Without such a mechanism in place, it may take months or years to make those discoveries retrospectively, long after the worst of the pandemic is over.

The problem that faces us in the spring of 2006 is this: there is only one empirically validated treatment for a person infected with the strains of H5N1 that are already in circulation: oseltamivir (Tamiflu; link). That substance is in extremely limited supply, and production is increasing only slowly worldwide (link). Adequate doses will not be available in the amounts needed were a pandemic outbreak to occur, probably not for the next year at least (link). The concern is that a pandemic may break out before enough Tamiflu exists to treat the large numbers of infected persons that would present themselves. On top of that, there is a very real possibility that the widespread use of Tamiflu may render it ineffective as resistant strains rapidly emerge (link). Indeed, there are some indications that it may be happening already (link).

The core of this proposal has five assumptions:

1. There are a number of factors (behavioral strategies, devices, or substances) that people believe will have some effect on panflu transmission, infection, or disease. These beliefs are already accessible, and they are circulated widely.
2. Consumers will create individualized strategies based on some subset of these beliefs, in order to ward off infection and treat disease.
3. Some subset of these factors may confer significant protection from panflu, either from infection, or once infected, from the worst symptoms.
4. Untrained participants with some minimal abilities and education (ability to make simple observations; ability to read and understand instructions) will be able to report their individual experiences with the panflu and their strategies if we ask them to do so on a continuing basis.
5. Circumstances surrounding a pandemic will provide intense motivation for individuals to report their experiences carefully, regularly, and honestly.

If we accept these assumptions, then it is safe to say that some subset of consumers may in fact use effective strategies, while others will not. Thus, we can use the situation created by a pandemic to gather data from many individuals simultaneously, in an effort to determine which of these strategies are in fact effective for lowering transmission risk, for preventing infection, or for treating panflu, while a pandemic is in progress. The proposed method will treat this situation as a sort of unblinded, uncontrolled, simultaneous therapeutic trial. It is hoped that, with a large enough sample, those strategies that are in fact effective will stand out from the rest. If this information is discovered and disseminated in rapid fashion, a number of individuals will be able to use this information to protect themselves from the worst effects of infection.
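To make the idea of strategies "standing out from the rest" concrete, here is a minimal sketch of how pooled participant reports might be tallied. It is an illustration only, not the proposal's analysis plan: the report format (a set of strategy names plus an infected/not-infected flag), the strategy names, the function names, and the sample reports are all hypothetical, and a crude risk ratio like this says nothing about confounding, which an uncontrolled trial cannot rule out.

```python
from collections import defaultdict

def infection_rates(reports):
    """Return {strategy: (infected, total, rate)} from (strategies, infected) reports."""
    counts = defaultdict(lambda: [0, 0])  # strategy -> [infected count, total users]
    for strategies, infected in reports:
        for s in strategies:
            counts[s][1] += 1
            if infected:
                counts[s][0] += 1
    return {s: (inf, tot, inf / tot) for s, (inf, tot) in counts.items()}

def risk_ratio(reports, strategy):
    """Infection risk among users of `strategy` divided by risk among non-users.

    Returns None when the ratio is undefined (no users, no non-users,
    or no infections among non-users).
    """
    used = [inf for strats, inf in reports if strategy in strats]
    not_used = [inf for strats, inf in reports if strategy not in strats]
    if not used or not not_used or sum(not_used) == 0:
        return None
    return (sum(used) / len(used)) / (sum(not_used) / len(not_used))

# Hypothetical, made-up reports purely to exercise the functions above.
reports = [
    ({"mask", "hand_washing"}, False),
    ({"mask"}, False),
    ({"mask"}, True),
    ({"hand_washing"}, True),
    (set(), True),
    (set(), True),
]

if __name__ == "__main__":
    print(infection_rates(reports))
    print(risk_ratio(reports, "mask"))
```

A ratio well below 1 for a strategy would flag it for closer study; it would not by itself show that the strategy works.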

Rationale

Returning now to our original question, what are the various strategies to prevent the worst effects of a pandemic? Let us briefly review some of those strategies here, with the organization provided by the answer to the question “How does H5N1 cause the death of any given individual?” Logically, there are 4 events that need to occur to cause the infection and eventual death of a given individual. Each of these events will be expanded into a discussion of the strategies that might address the event in question, with a brief discussion of what is known about their likelihood of success. To forecast the conclusion, the amazing thing about what is known is in fact how little really is known, due to a relative lack of good, direct empirical evidence. Thus, this proposal can be seen as an attempt to address the lack of empirical evidence.

Here are the four events:

1. Viral Contact: Viable virus must be physically present in the location with the individual, and make physical contact with that individual, particularly with susceptible tissues of the individual, such as nasal mucosa, epithelium in bronchial passageways, etc.
2. Viral Invasion: The virus gets past the individual’s immune defences. This occurs because the body has not developed an antigenic response to the presence of viral particles, though given enough time it will.
3. Viral Infection and Replication: The virus attaches to host cells, enters them, replicates itself, and is released from the infected cells. At this point the individual is thought to be shedding virus, and so is himself infectious to other uninfected individuals.
4. Continued infection and secondary effects: At some point the effects of viral infection reach sufficient intensity so as to seriously impair or even to kill the individual. These are the conditions that ensue after severe viremia, such as bacterial pneumonia that sets up as a secondary infection in the lungs, or ARDS, a condition in which the lungs lose their ability to support the oxygen demands of the individual’s body. It is this situation that is implicated in the cytokine storm hypothesis, and is in fact the focus of many people’s theorizing about putative effective treatments for pandemic influenza.

It is not known whether reaching stage three (aka viremia) is by itself enough to cause death, but the high mortality rate of panflu will most likely be due to people reaching stage four (link). Whatever the case, each one of these conditions suggests possible strategies that might be attempted. We will outline those strategies here, plus some of what is known about how likely those strategies are to be effective at preventing or treating infection with panflu.

Figure 1 below is a conceptual diagram that depicts the causal system that this analysis envisions. In the center column is that sequence of events that leads ultimately to an individual’s death. Each plus symbol “+” and minus symbol “-” represents an increased or decreased probability of the event named in the process box happening. It is difficult if not impossible to know the precise nature of the probability function that maps an upstream factor to a downstream event, so only the general nature of the function as positive or negative is suggested and no more.

Under this model, the only real difference between the current avian flu and panflu is that the functions that map earlier stages of viral infection onto later stages (center column) are “low gain”, an engineering term which refers to the fact that considerably greater amounts of the earlier process need to be completed before the later stages kick in to any significant degree. In this analysis, the transformation of H5N1 into panflu would be reflected by the low gain functions being transformed into high gain functions, so that relatively little viral contact is required for infection to occur, for example.
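The difference between a low gain and a high gain function can be sketched numerically. The functional form used here, 1 - exp(-gain * amount), is purely an assumption chosen for illustration; as noted above, the true shape of these probability functions is unknown. The function names and gain values are hypothetical.

```python
import math

def stage_probability(upstream_amount, gain):
    """Probability that the next stage occurs, given the amount of the upstream process.

    Assumed form: 1 - exp(-gain * amount). Higher gain means less upstream
    input is needed before the downstream event becomes likely.
    """
    if upstream_amount < 0 or gain < 0:
        raise ValueError("upstream_amount and gain must be non-negative")
    return 1.0 - math.exp(-gain * upstream_amount)

def amount_needed(target_probability, gain):
    """Upstream amount required to reach `target_probability` at the given gain."""
    if not 0 < target_probability < 1 or gain <= 0:
        raise ValueError("need 0 < target_probability < 1 and gain > 0")
    return -math.log(1.0 - target_probability) / gain

if __name__ == "__main__":
    low_gain, high_gain = 0.01, 1.0  # illustrative values only
    for gain in (low_gain, high_gain):
        print(gain, amount_needed(0.5, gain))
```

Under these assumed numbers, the low gain function needs one hundred times as much viral contact as the high gain function to reach the same probability of infection, which is the sense in which adaptation to humans would change the system.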

In the left hand column are some of the factors that are thought to increase the probabilities of the events in the viral infection process. They include factors which increase physical contact (“exposures”), that increase the likelihood that the virus successfully infects the individual, and so forth. The suggested examples are some of the factors that are currently suspected to increase the risks and severity of infection. One of the boxes (“replication enhancers?”) is included merely for the sake of symmetry. There are currently no known factors separate from the virus itself that would enhance replication, though they remain a theoretical possibility.

It is probable that there are some genetic factors which enhance viral replication, and so would provide a selective disadvantage in a pandemic situation, while there may be other genetic variants that inhibit replication of the virus. Indeed, there may be genetic factors which would occupy each of the boxes on the left and on the right hand columns, but very few specific genetic factors have in fact been suggested at this time, so they will remain a purely speculative logical exercise for now.

In the right hand column are some of the factors that are thought to decrease the probability that the virus will successfully complete the stage they point to. These act as countervailing forces to the progression of the virus through the various stages.

One additional feature of this system is worth noting. The system is presumed to contain a feedback loop (marked “FB”), so that in an infected person, newly created viral particles are shed and cause new contact and reinfection. Thus, viral infection is self-reinforcing. For a virus that replicates quickly, this can lead to an exponential explosion in the amount of virus present in the environment or in the person’s body. This gives us some idea of the importance of having a good vaccine or a good antiviral like Tamiflu. These substances interfere with the virus before it can reinforce itself, thus heading off the worst effects of infection. Instead, a person who is taking antivirals or has been given an effective vaccine is likely to experience a low grade infection, or might be entirely asymptomatic.
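The self-reinforcing character of the feedback loop can be shown with a toy calculation. This is a sketch under strong assumptions, not a model of real influenza kinetics: it assumes a fixed multiplication factor per replication cycle, no immune clearance, and no limit on available host cells. The function name, parameter names, and numbers are hypothetical.

```python
def viral_load_over_time(initial_load, replication_factor, cycles, fraction_blocked=0.0):
    """Return the viral load after each replication cycle.

    Each cycle multiplies the load by `replication_factor`, reduced by the
    fraction of new virus that an intervention (for example an antiviral)
    is assumed to block from feeding back into new infection.
    """
    if not 0.0 <= fraction_blocked <= 1.0:
        raise ValueError("fraction_blocked must be between 0 and 1")
    loads = [initial_load]
    for _ in range(cycles):
        loads.append(loads[-1] * replication_factor * (1.0 - fraction_blocked))
    return loads

if __name__ == "__main__":
    untreated = viral_load_over_time(1.0, 2.0, 10)
    treated = viral_load_over_time(1.0, 2.0, 10, fraction_blocked=0.5)
    print(untreated[-1], treated[-1])
```

With these illustrative numbers, a doubling per cycle grows a thousandfold in ten cycles, while blocking half of the newly made virus holds the load flat. The point is only that a modest reduction applied inside the loop compounds over cycles.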


Figure 1: Conceptual diagram of viral infection leading to death. A description of this system can be found in the text.


Viral Contact. This event can be prevented by using some of the oldest infectious control strategies known, generally characterized as isolation of infected individuals from non-infected individuals (aka quarantine) and strategies designed to minimize the amount of the viable virus that is present elsewhere, particularly in those things which make contact with the individual’s body. In quarantine, infected individuals are kept separate from the uninfected. When infection rates have exploded, then uninfected individuals can be sequestered, a situation sometimes called reverse quarantine or self-quarantine (link).

Individuals can also use various strategies to prevent the virus from coming in contact with particularly sensitive tissues. These include the wearing of high efficiency masks, which are thought to help prevent the virus from coming in contact with the sensitive tissues in the nasal mucosa, mouth, throat, and lungs (link). Frequent hand washing, use of disposable gloves or clothing, and wearing of eye protection are all much discussed and will surely be used to varying degrees by individuals in regions where panflu is circulating (link). While there are many good reasons to expect that some of these strategies will be effective (links), reliance on these strategies is as much an article of faith as it is supported by hard data (link). We need to do what we can to get the best data possible to see how well each works.

During the 1918 pandemic, estimates of the proportion of the world population that came into contact with the virus reach 98% (link). This panflu will move around the globe quickly, following the movements of global travelers such as birds and people (link). Ultimately, panflu will probably be carried by multiple vertebrate vectors, including birds, cats, and dogs (links). There are even suggestions that some invertebrates such as house flies might carry the virus for short distances (link). Infectious individuals shed virus through a number of channels, including sweat, sputum, blood, urine and feces (link). As each of these substances is spread around the environment, it creates a new potential source of infection.

Virus is excreted by infected animals and individuals in their feces and other bodily fluids, and can remain viable under good conditions for many weeks (links). Furthermore, if panflu is anything like the H5N1 that is currently infecting people, there will be an incubation period of approximately 5 days (link), which means that there is a considerable amount of time where an asymptomatic infected individual is shedding virus while appearing healthy (link). This means that the virus might in effect make infected individuals into trojan horses, gaining access to quarantined persons as infected but asymptomatic individuals come and go. There is even the possibility that viable virus might be borne aloft in very small aerosols or even as dust particles, and so travel for considerable distances (link). If a pandemic breaks out, we have to accept the fact that panflu is likely to become ubiquitous.

While all this may make contact with the virus seem inevitable, some of these routes of transmission are clearly implicated as more important. Some experts have reasoned that the elimination of avian vectors in locations where people are present, or the isolation and vaccination of commercial poultry, and culling of infected poultry, might help prevent the spread of the disease among humans (link). Indeed, this is often the most visible strategy already employed in Hong Kong, China, Viet Nam, Turkey, and India (link). Interestingly, large scale culling of poultry does not seem to be occurring in Indonesia (link). Culling poultry involves killing millions of chickens, ducks, geese, and other birds bred and raised for various uses by people. It is hoped in this way to reduce the likelihood of contact between people and the virus.

Even if avian vectors were to be controlled, maintaining a quarantine condition might also mean making a choice between quarantining or culling pets and other livestock. Experts remain skeptical as to whether this is a realistic goal (link). People are very attached to their animals, for emotional and for economic reasons, and for this reason we should expect them to resist efforts to cull their animals. Still, in the event of a pandemic, we might at least gather data about people’s contacts with animals in an attempt to determine to what degree such contacts affect transmission.

The effectiveness of a quarantine presumably depends in large part on being able to prevent contact between infected individuals and the uninfected (link). However, it is possible that as panflu saturates the local environment, the only option left will be to assume that anyone might be infected, and thus people will attempt to limit contact between individuals in general, i.e. social distancing (link). The behavioral changes that people use to achieve this goal will vary across a wide range, from merely avoiding physical contact with people, through rigidly maintaining a minimum safe distance (link), all the way to complete self-isolation in homes, with little or no contact made with the outside world (link). Some reports from the survivors of the 1918 pandemic indicate that social distancing was a widely used strategy back then, although it is not known how successful it was (link). Which social distancing strategy is inadequate, which is adequate, and which is overkill? We do not have answers to these questions based on good empirical data. A pandemic would, ironically, be a golden opportunity to get those data.

Furthermore, there are unanswered questions about unanticipated consequences of social distancing. Social distancing will surely have the effect of disrupting social support networks, a situation that psychologists have shown to be a factor in causing mental and behavioral dysfunction during times of great stress (link). These in turn might cause poor outcomes for people, such as excessive alcohol use, family dysfunction, marital strife, depression, anxiety, suicide, etc. So it would be desirable to make an attempt to track these changes as well if possible. This requires that we try to measure outcomes generally, not just tracking those that are specific to panflu infection. So we will propose that we include a number of simple measures of psychological, emotional and behavioral functioning. While this may make reporting more unwieldy, we might be able to use relatively simple measures with some minimal level of validity and reliability in order to get us part of the way to this goal.

Finally, sanitation is clearly one of the most successful public health measures in the last two hundred years (link). The study of sanitary conditions and their relation to disease has produced some of the great heroes of medical science. Procedures designed to maintain sanitary conditions might presumably have some effect on the probability of infection in individuals. Procedures that are designed to reduce the presence of the virus include copious use of chlorine bleach and other disinfectants, the use of masks for infected individuals to prevent the spreading of their sputum, sequestering the infected in isolation wards, and so forth (links). Furthermore, attention must be given to making sure that people have potable water that is free of virus, and that waste products continue to be processed and released in a fashion that removes live virus (link). It goes without saying that sanitation will figure prominently in our response to the presence of the virus, though it is hard to know exactly how strict our efforts to maintain sanitary conditions must be to prevent contact with the virus. Is it adequate to clean frequently touched surfaces such as door handles with bleach-soaked disposable wipes, or does a person need to scrupulously wipe and spray all surfaces as frequently as possible (link)? How important is it for a person to treat their drinking water so as to remove live virus? Again, good empirical data for the relative effectiveness of sanitation strategies used by the general public are lacking.

Viral Invasion. The prevention of this event is of course the major goal of vaccination. By presenting the immune system with “examples” of the infectious agent, in the form of viral particles, the immune system develops smart bombs called antibodies through an immune response that specifically targets the virus particles when they get into the body (link). The power of this strategy generally is well established, as the elimination of smallpox from the globe makes clear. As with sanitation science, the science of vaccines has produced another set of the great heroes of modern medicine (link). While there are many who reject vaccines as bad medicine (link), the vast majority of the public and of public health officials disagree (link). So it is natural that in anticipating the panflu, there are a number of efforts to develop a vaccine (link).

However, vaccines require a long design and trial process. There are already several candidate vaccines in trials (link), but the problem right now is that the virus everybody fears does not exist yet. Without it, we cannot know if a vaccine will be effective; we can only design a vaccine against a virus that we hope will be similar to the one that eventually breaks out, and hope that immunity to the virus of the vaccine provides cross immunity to the hypothesized panflu virus (link). This is in fact the basic strategy being used by several vaccine design efforts, where older versions of H5N1 have been used to engineer vaccines that are currently in trials. So scientists are faced with a situation in which the optimal conditions for designing a vaccine (i.e. they have a known target virus) are not present, but once the target virus does exist, it may already be too late. At any rate, even if the vaccines in the design stages right now were effective, estimates of our ability to deliver them in numbers large enough to vaccinate a significant portion of the public are in the range of 1–3 years, even in the rosiest scenarios (link). At the moment, we cannot assume that an effective vaccine will be available except to a select few essential personnel (link).

All the same, there are other factors which are known to modulate the immune response. Some individuals such as those undergoing chemotherapy and individuals with HIV/AIDS have suppressed immunity (link). There are suggestions that repeated seasonal flu vaccination in individuals might result in cumulative immunity to a wider variety of influenzas (link). Finally, there are naturally occurring individual differences in immune function due to sex (link), age (link), or even country of origin (link). Any of these factors might conceivably improve or hurt an individual’s chances of having a successful immune response to the panflu virus. We need to try to gather these data to see which of these falls out as a significant determinant of panflu infection. If one or more do, it might help us to understand the mechanism by which the flu infects individuals, and thus might serve to guide the design of interventions.

Viral Infection and Replication. If the panflu virus successfully makes it through the host body’s internal milieu without being scarfed up by antibodies and white blood cells, it makes its way to the surface membranes of various cell types, and finds sialic acid receptors (links) which it locks onto. Then, like a Trojan horse, it convinces the cell to incorporate it into the cellular cytoplasm, then uses other mechanisms to release its RNA into the cytoplasm. This RNA hijacks the reproductive machinery of the cell, convincing it to make hundreds or even thousands of copies of the original virus. These viral copies are then released from the cell, often destroying it in the process (link). Destruction of many cells in this manner causes large scale tissue damage, which in turn can cause toxic effects (link).

There are relatively new drugs on the scene called antivirals, many of which target one of the mechanisms by which viruses latch onto host cells (link), enter those cells (link), insert their DNA or RNA into the cell’s own genetic material (link), or cause the cells to create and release new virus (link). Antivirals can be very effective, as the triple cocktail AIDS therapy illustrates (link). The problem is that different viruses use different strategies to replicate, and so an effective antiviral for a particular viral disease needs to target the specific mechanisms that virus uses.

Right now there is a lot of attention being given to oseltamivir (Tamiflu), a member of a class of antivirals called neuraminidase inhibitors, which is generally predicted to be effective against H5N1, and thus probably also against panflu (link). It apparently works by blocking the release of new viral particles after they have been created in host cells, thus preventing the exponential explosion of infected cells in an individual. There is some evidence that Tamiflu will be effective in treating infection with H5N1, though there is considerable debate on this point. On the assumption that Tamiflu is and will continue to be our most successful treatment for the avian flu, or at the very least will be an effective prophylactic, preventing individuals who are in contact with the virus from developing infection, governments (and individuals) worldwide are stockpiling (some would say hoarding) this drug (link). Another, less widely used member of this class is zanamivir (Relenza), and both will be used widely as prophylaxis and as treatment for infected individuals. Effectiveness data are only now emerging, and there is considerable debate as to the most effective use of these substances (link). More data, gathered in conjunction with data on a wide variety of other variables, would almost certainly help reveal their optimal use.

Furthermore, there is a rapidly evolving discussion of how best to administer Tamiflu under the suboptimal circumstances that a pandemic is likely to present. This includes the possible use of probenecid to inhibit Tamiflu’s excretion by the kidneys, so as to maintain higher effective concentrations of Tamiflu in the blood (link). This would be a risky strategy, as probenecid can have toxic effects (link). Also, some people are proposing that since Tamiflu is excreted unmetabolized, individuals might extend their limited supplies by saving their urine and reingesting it (link). Some older antivirals such as amantadine and rimantadine are in wide supply, but are thought to be largely ineffective against H5N1, and so are unlikely to be effective against the panflu (link). As such, use of amantadine and rimantadine is not likely to be part of any standard measures. Still, many individuals in crisis situations are likely to take these substances anyway, making their use a non-standard measure. Finally, there are a number of herbal remedies and nutritional supplements which are purported to have antiviral action, such as Vitamins A, C, and D, Sambucol, and Echinacea, and even homeopathic remedies such as Oscillococcinum (links). All of these non-standard measures have their supporters, many of whom point to various types of evidence of their effectiveness. Usually, that evidence is mostly anecdotal, with all of the bad things implied by that term: it is retrospective, qualitative, unsourced, and unsystematic.

While all of the approaches in the preceding paragraph would be considered non-standard, they all have seemingly large numbers of believers, and so it is almost certain that some groups of individuals will be using them in an attempt to stave off infection with panflu. If we think of these individuals as self-directed experimenters, without specifically buying into or rejecting the basis for their beliefs, it is important that we help these people to systematically document their successes and their failures, using a standardized reporting method, so that their relative effectiveness can be empirically determined.

Secondary effects of infection cause the individual’s death. This last step is believed by many to be the immediate cause of death in individuals infected with panflu (links). This belief is based on emerging knowledge of the 1918 virus (link), testing of the H5N1 virus in animal models (link), and new data on the effect of infection by H5N1 in humans (link). There is good evidence that a class of hormone-like compounds called cytokines, produced by the immune system, are important drivers of the inflammatory and immune responses, and that they can become “dysregulated” so that they cause various effects having to do with fluid imbalance: hypotension, pleural effusions, and ARDS (acute respiratory distress syndrome; link). Since the explanation usually includes the idea of a positive feedback loop resulting in an exploding cascade of cytokine release, this is often evocatively referred to as the cytokine storm (link). The cytokine system is in fact at least as complicated as the immune system, and there are dozens of identified cytokines, including the interleukins (IL), interferons (IFN), tumor necrosis factors (TNF), and transforming growth factors (TGF). Their exact relationship is unclear at this time, so there doesn’t appear to be a coherent explanation of the cytokine storm in terms of a precise mechanism (link), although some attempts have been made (link).

Still, a promising approach, under this theory, targets some of the cytokines, attempting to block their action (link), so as to cut off or at least dampen the cascade. Considerable debate exists as to when the best time to attempt this type of intervention would be (link), which substances and which cytokines to target (link), and what contraindications exist (link). The question on many people’s minds is this: Are there substances in existence which block cytokines, and are these substances in widespread distribution?

Cytokine blockers have been a very hot topic in the pharmaceutical industry for about a decade now. Some of the newer treatments for the autoimmune diseases are designed to block cytokines in some fashion (link). An example of this is Enbrel, a drug that is given to people with rheumatoid arthritis (link). Also in trials is a drug called OX40 which apparently shows promise as a strong blocker of TNF-alpha (link). There are many others “in the pipeline” (link). The cytokine storm hypothesis has created the most heated discussions on internet discussion groups, because a number of substances already in existence appear to interact with the cytokine system. Take, for example, curcumin, derived from turmeric, which appears to suppress TNF-alpha (link). Other substances such as DHA in fish oil may also do this (link). Then again, certain already approved pharmaceuticals such as haloperidol (link) and the statins (link) also appear to interact with the cytokine system in a way that suggests that they might be effective at preventing cytokine storms. The list of substances that are now suspected to interact with cytokines is a long one, growing longer all the time.

There are other secondary effects that are clearly implicated as complications of influenza infection. People who are suffering the worst effects of influenza often lose valuable fluids and basic nutrients such as electrolytes, becoming weak due to dehydration, salt imbalance, and lack of carbohydrates (link). Therapy for these individuals includes careful maintenance of their intake of these substances, and special purpose solutions called oral rehydration solutions (ORS) have been formulated for this very purpose (link). We can expect that during a pandemic many individuals will receive this sort of supportive care. What will its effects be for the people who receive it? This project is a chance to answer this question with solid empirical data.

Finally, some people who are in respiratory distress will surely be put on mechanical ventilators, or will be provided other respiratory support, in an attempt to maintain their blood O2 levels. Considerable debate exists as to how many ventilators will be needed relative to the number that are available, leading some public health officials to declare that during a pandemic, ventilator triage will almost certainly be enacted, with the few ventilators that exist in a given locale likely given to those who will most benefit (link). How will these decisions affect patient outcomes? This database is our chance to determine the answer to this question too.

Regardless of the theory we cleave to as to exactly how the virus kills a person (through cytokine dysregulation, acute viremia, electrolyte imbalance, respiratory failure, or what have you), closer examination of this question gives one some hope that an effective treatment for panflu infections might be nearer at hand than most people think. There are a number of techniques or devices that might reasonably be expected to prevent transmission of the disease. There are also hundreds of other substances with potentially important pharmacological properties as yet untested. Interestingly, many of these compounds are already available without a prescription, sold as herbal remedies, or included in foods, or growing in people’s back yards. Given that during a pandemic people all over the place will be using some combination of them, can we use these “personal experiments” to gather outcome data in an attempt to establish their relative effectiveness? Yes we can, and should a pandemic occur, we must.

Method

In a worst case scenario, a transmissible strain of the virus breaks out soon, and health care providers in various locations quickly become engaged in a desperate crisis, treating thousands of patients in a given location. Good information will be hard to come by, so fear and uncertainty will be the order of the day. Rumors will circulate like wildfire, some of them planted by unscrupulous individuals hoping to profit from the fear and uncertainty. How can we fill the gap, making the best possible information available in the shortest amount of time? The internet is perhaps the most efficient mechanism for the transmission of information (good and bad) ever made. It is ideal for the gathering of systematic information about the successful and unsuccessful treatment of patients while the pandemic unfolds.

Method Overview: Topics covered in this section include the rationale, privacy policy, reporting method, and reporting constraints. The details of the system to be used to enter and store these data (hardware platform, database host, software framework, etc) are unspecified at this time, as are the host site(s) and the system that will be used to disseminate these data. That said, the goal should be to obtain reports on an ongoing, continuous basis from individual participants, to clean the data and add them to the dataset, to perform preliminary analyses on an ongoing basis, and to make the raw dataset and preliminary analyses available for free electronically. By making the raw data publicly available (everything but information about individual participant identities), we allow third parties to conduct analyses of their own design. We hope that this will lead to rapid development of novel insights as a community of analysts develops to discuss and publicize discoveries made from the dataset. The ultimate goal of this project is to save lives, by revealing to members of the public the differences between effective, ineffective, and counter-productive strategies for coping with the pandemic and its effects on individuals.

Method Rationale: Researchers currently have poor empirical evidence for the relative importance of many factors that might influence health outcomes during a flu pandemic. We would like to know answers to the following basic questions:

To what degree do different exposure experiences such as public assemblage, contact with animals, and so forth, put people at risk for contracting panflu?
What sorts of preventative measures (mask wearing, hand washing, use of hand cleanser, disinfection, social distancing, etc) lower individuals’ risks for contracting panflu?
Which naturally existing variables (preexisting health conditions, age, sex, family status, local weather conditions, and so forth) increase individual risk, and which ones lower risk?
Are there variables, such as diet or the use of various medicines and complementary and alternative medicine (CAM) treatments, that are associated with better health outcomes once panflu has been contracted?

During a pandemic people will be having a wide variety of experiences, some of which they planned, others unplanned, but many are potentially related to health outcomes during a flu pandemic. Getting these data and making them publicly available on an ongoing basis during a pandemic may serve two purposes: a) it may help us discover and disseminate effective prevention and treatment strategies while the pandemic is in progress, b) it may help us discover more about the nature of disease transmission generally, and thus may benefit humans as they plan strategies to counter future as yet unknown diseases. Finally, these data may yield as yet unanticipated inferences about human health during times of crisis generally, that might result in better planning for a wide variety of possible large scale crises.

Privacy Policy: All information reported by participants will be used only for the purpose of answering questions about associations between behavior and health during a pandemic. At no time will identifiable information about individual subjects be shared with outside parties. In order to protect subjects, their permanent information and their dynamic information (defined below) will be stored in separate files, connected only by a unique identifying code number. Dynamic information will be publicly available on a continuously updated basis. Identifying information must remain private. Certain permanent information (age, sex, racial/ethnic identity) may be made available at a later date, but in such a way that unique identifiers (name, phone numbers etc) have been stripped out so as to protect the identities of subjects. At all times, participants will have the choice to report or not to report information as they wish, so the degree of personal revelation is a matter of individual consent. Individually identifying information will be permanently destroyed at some predefined date so as to permanently prevent these data being used by third parties to invade individual privacy.
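
The separation described in this policy can be illustrated with a small sketch. It is not a design for the real system, which is unspecified at this time. The class names, field names, and the use of a random UUID as the linking code are all assumptions made for illustration.

```python
import uuid
from dataclasses import dataclass


@dataclass
class PermanentRecord:
    """Identifying details. Lives only in the private store."""
    code: str
    name: str
    phone: str
    age: int
    sex: str


@dataclass
class DynamicReport:
    """One journal entry. Carries the code but no identifying fields."""
    code: str
    reported_at: str      # ISO 8601 timestamp
    symptom_rating: int   # 0 to 5


def enroll(private_store, name, phone, age, sex):
    """Create a permanent record and return the random linking code."""
    code = uuid.uuid4().hex  # hypothetical choice of code format
    private_store[code] = PermanentRecord(code, name, phone, age, sex)
    return code


def releasable_demographics(record):
    """Permanent data with unique identifiers (name, phone) stripped out."""
    return {"code": record.code, "age": record.age, "sex": record.sex}


def destroy_identifiers(private_store):
    """Discard identifying information at the predefined date."""
    private_store.clear()


private_store = {}
code = enroll(private_store, "Jane Doe", "555-0100", 42, "F")
report = DynamicReport(code, "2006-03-01T08:00:00", 1)
public = releasable_demographics(private_store[code])
```

Only `report` and `public` would ever leave the private side. The linking code is the sole connection between the two files, so destroying the private store leaves the public dataset intact but no longer traceable to a person.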

Participant Recruitment, Protection of Participants, and Informed Consent: No effort will be made to get a random sample of the population. We will, however, seek to get a large number of participants across a few variables we expect to be critically important in interpreting the data and allowing for statistical control: sex, age, and location. Participants will be recruited through online sources, using a snowball sampling method, in which current participants are encouraged to spread the word online and offline about recruitment of participants for the study. Recruitment will proceed on an ongoing basis.

Since we will be gathering sensitive information about individuals with their identities attached, we will need to take extra care to make sure that they give their fully informed and explicit, active consent. For this purpose a consent form will be created, and participants will need to sign and return the consent form at the start of their participation. The consent form will describe the sensitive nature of the information sought, the procedures that will be followed to protect this information, and their rights as participants, in accordance with the Declaration of Helsinki (for non-U.S. studies; link) or with US CFR Title 45 part 46 (for U.S. researchers; link). Finally, it is desirable to get information about the experiences of persons across the widest age spectrum, and from all walks of life. That means that we will seek to include young children, pregnant women, prisoners, and cognitively impaired individuals such as those with intellectual disability or dementia. Human subjects requirements for these persons are the most stringent possible, and so require that we use a variety of adapted consent procedures to protect individuals from these populations.

Data Reporting Method: Reporting of experiences by individual participants will take place periodically, at a rate of no more than once every hour and no less than once every seven days. The basic reporting format is a structured personal journal, in which each participant will record their own personal experiences on a wide variety of variables. Many of these are predefined, but some items are open-ended so as to allow reporting of unanticipated details. Subjects will report their data using a standard form that seeks detailed information in five variable categories: 1) incidental variables, 2) exposure variables, 3) preventative variables, 4) treatment variables, 5) direct effects of panflu.
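
As a rough illustration of the reporting rate limits above, the following sketch checks a new report against the previous one. The function and constant names are hypothetical, and how the eventual system would enforce these limits has not been decided.

```python
from datetime import datetime, timedelta

MIN_GAP = timedelta(hours=1)  # no more than one report per hour
MAX_GAP = timedelta(days=7)   # no less than one report every seven days


def may_submit(last_report, now):
    """A first report is always accepted; later ones must be an hour apart."""
    return last_report is None or now - last_report >= MIN_GAP


def is_overdue(last_report, now):
    """True once more than seven days have passed without a report."""
    return now - last_report > MAX_GAP


last = datetime(2006, 3, 1, 8, 0)
too_soon = may_submit(last, datetime(2006, 3, 1, 8, 30))
on_time = may_submit(last, datetime(2006, 3, 1, 9, 0))
overdue = is_overdue(last, datetime(2006, 3, 9, 8, 0))
```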

Participants will input their data either by sending a standardized form by email, or by using an interactive web form. The idea is to give participants multiple ways to report their data, creating redundancy so as to minimize data loss due to system malfunctions. Also, participants will be instructed to make plans for other individuals to report their data should they become incapacitated by illness or other circumstances. Finally, we might consider the possibility that participants will print out multiple blank copies of the forms so as to make it possible to record their experiences even in the event of power failures.

A participant who fails to report their information will be contacted each day by email to remind them of the need to continue reporting. After three days, the participant’s buddies will be contacted to tell them of that person’s failure to report, and in an attempt to get information on that person’s status. The participant may also be contacted by phone in an attempt to follow up. Finally, if after seven days there is no update by the participant and no information suggesting otherwise, the participant will be declared MIA and their data will be updated to reflect this fact. Analysts at a later time may decide to make the inference that persons who are MIA are in fact deceased, and proceed with their analyses from this inference. If after this time the participant reappears with new reports, their status will be changed, the new reports will be accepted, and they will be appended to the participant’s dataset.
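
The follow-up schedule above can be sketched as a simple escalation rule. This is a minimal sketch under stated assumptions: silence is counted in whole days since the last report, the action and status names are invented for illustration, the optional phone call is left out because its timing is not specified, and information from buddies that might prevent an MIA declaration is not modeled.

```python
def follow_up_actions(days_silent):
    """Actions due for a participant who has been silent this many days."""
    actions = []
    if days_silent >= 1:
        actions.append("email_reminder")   # sent each day of silence
    if days_silent >= 3:
        actions.append("contact_buddies")
    if days_silent >= 7:
        actions.append("declare_mia")
    return actions


def next_status(has_new_report, days_silent):
    """A new report always restores active status, even after MIA."""
    if has_new_report:
        return "active"
    return "MIA" if days_silent >= 7 else "active"


day_three = follow_up_actions(3)
day_seven = follow_up_actions(7)
returned = next_status(True, 12)
```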

Reporting constraints: Participants will be allowed to report the information for up to ten other individuals, but only when those other individuals are not capable of doing so for themselves, such as a parent reporting information for a child. However, because certain of the data require a degree of awareness of that person’s experiences that is not available to second parties, this will not be allowed for a person who is not in close daily contact with the individual in question. Furthermore, the ability to report data for multiple individuals may be exploited for the purpose of spoofing the system so as to commit fraud, making certain products appear more effective than they really are. Registration of subjects will use a “buddy system” in order to make it possible to obtain an individual’s data even when that individual becomes incapacitated. Buddies will be drawn from among other participants, either chosen by the participants themselves (so groups of friends or family members might sign up and participate together) or by the researchers, choosing other enrolled individuals from the same approximate location (so as to make person-to-person communication possible, which allows people to keep tabs on each other even in the case of large scale infrastructure failures).
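
The proxy reporting rules above might be checked along the following lines. The function name and its arguments are hypothetical, and how the system would actually establish that someone is incapable of self-reporting or is in close daily contact is left open here.

```python
MAX_PROXIED = 10  # a participant may report for up to ten other individuals


def may_report_for(current_proxies, subject_id,
                   subject_can_self_report, in_close_daily_contact):
    """Decide whether a participant may file a report for someone else."""
    if subject_can_self_report or not in_close_daily_contact:
        return False
    if subject_id in current_proxies:
        return True  # already one of the people this participant reports for
    return len(current_proxies) < MAX_PROXIED


proxies = {"child-1", "child-2"}
allowed = may_report_for(proxies, "child-3", False, True)
refused = may_report_for(proxies, "neighbor", False, False)
```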

Variables Overview: In general, we are seeking data that will allow us to make cause-effect connections between any of a large number of variables. While the only perfect way to do this is with a true experiment, we can treat this situation as a reasonably well-defined quasi-experimental situation. If we gather the data on an ongoing basis, we should be able to find data that meet at least two of the three criteria for causal inferences (covariation and temporal priority), while the third criterion (control of extraneous variables) might be achieved post hoc, through statistical controls, or might be achieved at least circumstantially. Putative causal variables are divided into three large categories: 1) those that are thought to be related to exposure and infection, some of which might increase exposure and infection risk, 2) strategies that are thought to decrease exposure risk, such as mask wearing or social distancing, and 3) those that are thought to be related to the course of the disease in infected individuals, some of which might make symptoms worse, while others might provide relief from symptoms. Effect variables are divided into two groups: 1) those which are directly related to infection by panflu, i.e. physical symptoms, and 2) those which are indirectly related to the pandemic, i.e. psychological and behavioral adjustment variables. The next few sections are devoted to these variables, with descriptions of the criteria that will be used and an eye toward how they will be reported by subjects.

Causal Variables.

Likely Exposure Risks. It is not currently known exactly what variables put people at higher risk of exposure. On the assumption that to be exposed to viable virus you need to be exposed to an individual who is currently infected, or at least to their bodily fluids such as feces or sputum, we will create a watchlist of exposure experiences that includes at a minimum the following variables:

a. Number of people you have encountered in the previous 24 hours
b. Number of people those listed in (a) have encountered in the previous 24 hours.
c. Exposure to children under 10 years of age.
d. Exposure to domesticated animals: How many? What types (cats, dogs, birds, livestock)? How many unsequestered?
e. Exposure to wild animals, particularly birds, and their physiological products (guano etc)
f. Exposure to Unfiltered Air Supply vs Filtered Air
g. Contact with unsanitized surfaces
h. Contact of hands and fingers with mouth/nose/eyes
i. Known exposure to clearly infected individuals (caring for or being in the proximity of someone who is ill)
j. Known exposure to an individual who is caring for someone who is ill
k. Exposure to water or food that comes from uncontrolled storage
l. Being immunocompromised (due to chemotherapy or HIV infection)
m. Pregnancy & Sex.
n. Other exposures as yet unspecified, that may pose some exposure risks.
o. Weather conditions (average temp, wind, humidity)
p. Tobacco smoking
q. Other variables as yet unknown

The list of variables included on this watchlist will be summarized at the FluWikie page on disease transmission.

Preventative Strategies. The exposures and preventative strategies watchlist will be maintained on FluWikie in the section on prevention. Each of the strategies listed here will be included in the watchlist as a separate entry. The manner in which it will appear on the reporting form is shown in the sample reporting form that appears at the end of this document. The watchlist of preventative strategies will include at a minimum:

a. social distancing (keeping other individuals at a minimum distance)
b. social isolation (keeping uncontrolled individuals at a minimum distance)
c. mask wearing (ad hoc masks, surgical masks, N95 Masks, N100 masks)
d. glove wearing (disposable gloves)
e. eye protection wearing
f. disinfection of surfaces which individual is in physical contact with (door handles, etc)
g. frequent hand washing
h. frequent use of hand cleanser gels
i. use of disposable clothing/changing clothing after being in public
j. Avoidance of uncontrolled/public places
k. Refraining from close physical contact (kissing, hugging, sex)

Treatment Strategies. The treatment strategies watchlist will be maintained on FluWikie in the section on treatments and in the section on CAM. Each treatment strategy will be included in the watchlist as a separate entry. The manner in which it will appear on the reporting form is shown in the sample reporting form that appears at the end of this document. A comprehensive list of these will not be attempted at this time, though we will be forming one in the near future.

Standard versus Non-standard strategies: The problem with non-standard strategies is just that: they are not standardized. Thus, for example, persons who believe that megadoses of Vitamin C will prevent serious illness from panflu constitute a major problem for our effort. What is the daily dose of Vitamin C that constitutes this practice? 500 mg? 1000 mg? 2000 mg? Should it be taken once daily, or q.i.d.? This is a problem that cannot be solved merely by having people report dosages, because some of the other remedies do not use standardized preparations such as those you can get in Vitamin C pills. In order to allow reporting of a wide array of standard and non-standard preventatives and treatments, we will create a list of definitions that sets a generally agreed upon standard for each strategy. Subjects will be asked first if they followed or executed that strategy during a given reporting period, Yes or No (subjects may decline to report if they wish). Then they will be given the opportunity to report more detailed qualitative information that might at some later date become the basis for further analysis.

So, to give you a concrete example: there are some people who think that large doses of EPA found in fish oil might help regulate the cytokine system, and so might be helpful in preventing the cytokine storm (link). The problem is, even if this is true, how much fish oil will be necessary, and in what form should it be consumed: as regular capsules, enteric coated capsules, or as whole fish? This author (MD) takes a small dose enteric coated fish oil capsule with 220 mg of DHA and 240 mg of EPA in it. Supposing the minimum standard dose were set at 1000 mg daily, this would be under the standard, so I would report that NO, I do not use this strategy. Still, I would report my consumption of a small evening dose, for later possible use in analysis. So how can we decide what to do with the problem of standardizing preventatives and treatments?

We propose that the system will work as follows:

1. We will create “watchlists” which are lists of treatments or preventatives that have some degree of support from the scientific community or in the public’s mind. Degree of support is the only criterion for inclusion, because in order for valid conclusions to be drawn about efficacy, sufficient numbers of people need to be using the strategy so as to give adequate statistical power to the test. So my personal superstition that dancing a jig by the light of the moon provides protection from the panflu will not make the list because I am perhaps the only person in the world who will be doing so. Participants will report merely if they are following the minimum requirements of each of these strategies (YES or NO). They will also be given the opportunity to report additional information about the details of their own use of the strategy, if they so desire.
2. Each treatment or preventative will be defined in some manner so as to provide a reasonable standard that must be met by persons wishing to report that they are using the strategy. So, for items that are ingested, such as an herbal preparation, it will require ingesting some minimum amount in some standardized preparation such as that provided by a commercially available product. This will take some time to work out, but any interest group who wishes to get a strategy included on the watchlist will have to present a standardized formulation for inclusion. Each item on the watchlist will be created so as to be relatively independent of the other items on the list. So, for example, there will only be one item on the watchlist for echinacea, covering all uses of this strategy. Having one entry for standardized echinacea extracts and another for use of unprocessed echinacea root will not be allowed. Participants will have to familiarize themselves with the standards as set out before determining how to report their use of a strategy.
3. We will provide an online resource which has the best information about each of the strategies, both standard and non-standard, so that participants can become informed and decide on their own which they will and will not be using during the pandemic. This resource will be open and honest about any controversies that exist with regard to each product or strategy, highlighting particularly possible side effects and drug interactions. So, the entry on echinacea will note that the best controlled study to date has found that echinacea had no effect on the progression of rhinovirus infection (link), and furthermore that echinacea might interact with immune suppressors (link), and might potentiate the cytokine storm (link).
4. We will allow participants to report any additional information about things they may be doing, that are not included on the watch list. In this manner, we hope to allow reporting of the widest possible assortment of data.
5. We will reserve the right to refuse inclusion on the watchlist of any strategy which we deem unworthy, for whatever reason we see fit. As we have said before, this study should be theory neutral to as great a degree as is possible, so we don’t imagine using this “veto power” very often. One can imagine, however, pranksters proposing inclusion of coprophagy on the watchlist, and they might even gain the support of a large enough group to meet the degree of support criterion. No reasonable person would support this strategy, and the veto exists for cases like this one.
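
The watchlist rules above can be sketched in a few lines: one entry per strategy, a minimum standard per entry, a YES or NO answer, and room for free-text detail. The class and field names are hypothetical. The 1000 mg daily threshold for fish oil is only the illustrative figure from the fish oil example earlier in this section, and treating it as combined DHA and EPA is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WatchlistEntry:
    name: str
    unit: str
    minimum_daily_amount: float  # the agreed standard for reporting YES


class Watchlist:
    def __init__(self):
        self._entries = {}

    def add(self, entry):
        """Only one entry per strategy is allowed on the watchlist."""
        key = entry.name.lower()
        if key in self._entries:
            raise ValueError(f"{entry.name} is already on the watchlist")
        self._entries[key] = entry

    def report(self, name, daily_amount, notes=""):
        """Answer YES or NO against the standard, keeping the details."""
        entry = self._entries[name.lower()]
        follows = daily_amount >= entry.minimum_daily_amount
        return {
            "strategy": entry.name,
            "answer": "YES" if follows else "NO",
            "daily_amount": daily_amount,
            "unit": entry.unit,
            "notes": notes,
        }


watchlist = Watchlist()
watchlist.add(WatchlistEntry("fish oil", "mg", 1000))  # illustrative threshold
result = watchlist.report("fish oil", 220 + 240,
                          notes="enteric coated capsule, evening")
```

Here the answer is NO because 460 mg is under the illustrative standard, but the amount and the note are still stored for possible later analysis.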

Using this approach will allow us to achieve a reasonable compromise between two opposing forces of empiricism—the utility and efficiency of constrained answer items, while allowing for the comprehensiveness of open-ended qualitative data.

Effect Variables: Direct Effects of Panflu Exposure and Infection. There are a variety of outcomes which we might ask participants to report, but clearly the most critical one has to do with the panflu itself. Asking people to simply report in words how they are feeling is not going to go very far in the direction of helping us make good inferences, because such data are very subjective and hard to compare. For this reason we have created a standard symptoms rating scale, in which participants will report the severity of their symptoms on a numbered scale ranging from 0 (zero) to 5. Criteria for each rating value are as follows:

0 well. Symptoms not outside normal, everyday range.
1 symptoms compatible with seasonal flu - fever, cough, sore throat, mild nausea or loose bowels
2 symptoms compatible with severe seasonal flu - tight cough with slight shortness of breath on coughing only, chest pain on coughing, vomiting more than 2x, diarrhea more than 4x in 24 hours, minimal (specks) of blood in sputum, vomitus, or stool, urine output reduced or darkened but still passing urine more than 3x in 24 hours.
3 symptoms suggestive of onset of ARDS - more or less constant difficulty breathing, inability to lie down, talks only with difficulty, use of accessory respiratory muscles (visible at neck) for breathing, frankly blood-stained sputum
  symptoms suggestive of multiple organ involvement - explosive uncontrolled diarrhea, fresh blood in stool or vomit, severe abdominal pain with tenderness or bloating
  symptoms suggestive of early CNS involvement - drowsiness, delirium, restlessness, (in absence of serious breathing difficulty) severe headache unrelieved by non-narcotic analgesics, photophobia, neck stiffness
4 frank ARDS - severe breathing difficulty with cyanosis, cold clammy skin, unable to talk without severe distress or worsening of cyanosis, respiratory muscle fatigue with visible reduction in ability to expand chest, coughing up massive amounts of fresh blood mixed with minimal or no sputum
  signs of systemic failure - drop in blood pressure, weak thready pulse, signs of liver failure eg jaundice, signs of renal failure eg no or almost no urine output, edema, severe abdominal distension
  frank CNS involvement - convulsion (in absence of previous history), unconscious, arched rigid back
5 terminal state - respiratory failure, comatose, persistent low or unreadable blood pressure, signs of disseminated intravascular coagulation (DIC) with bleeding into skin and from multiple sites

See the discussion thread on FluWikie for more information about this scale and its design. The scale will be modified going forward so as to translate its language for the medically untrained. Note that occurrences of “5” are going to pose a particular problem for this project. A rating of 5 means that the person is so incapacitated that they are either dead or nearly so. While it may make the designers of this project seem more than a little heartless, information about people who reach this state is very important, and so loss of this information presents a potentially serious barrier to the likelihood of success. As noted in the section above, we will create mechanisms for ensuring that, should participants die, that fact will get reported to the database. It is that information, together with the information about all the other variables surrounding their deaths, that presents an important chance to make strong inferences about the connections between certain variables and the worst outcomes.

Of course, people will die for lots of reasons not directly attributable to the panflu, so we should make sure to try to record qualitative information about the cause of death wherever that information can be found: car accident, gunshot wound, heart attack/MI, sepsis from untreated wounds, drowning, lightning strike…whatever. It is entirely possible that a significant number of deaths during a panflu will result from indirect effects of the pandemic (link). Take, for example, a person whose life relies on electricity to power some form of lifesaving device. If power fails, so does that device, and the person dies. The cause of death should then probably be attributed to the power failure, which during a pandemic might be considered an indirect consequence of the pandemic, and not to panflu infection per se.

Defining Outcomes: Indirect effects of the pandemic. Other outcomes could be monitored as well. One type of outcome that we would like to track is psychological and behavioral in nature. Psychologists are well aware of how natural and technological disasters affect people psychologically. Many psychological and behavioral symptoms are thought to be caused by the stress of the event itself, coupled with individual risk factors such as pre-existing genetic factors. A pandemic will surely be very stressful, so it would be handy to get a quick reading on “how people are doing”. While it would make sense to try to track these variables as well, doing so may make participation unwieldy, and so we should put this goal at the end of the line, as something that might be attempted if all the other goals are achieved, and if doing so doesn’t jeopardize the integrity of the rest of the project.

Analysis Strategy

Analysis of the data could be done in various ways, but in its simplest form it could be a simple crosstabulation: take people who have definitely been exposed, and compare the distribution of symptom levels among those using a particular strategy (Factor X) with the distribution among everybody else. For a slightly effective strategy, the results might look something like this:

Symptoms    Factor X    Everyone Else
0–1         200         500
2–3         100         350
4–5         40          150

Analysis of the data in the table will show that there is a significant association between the “treatment condition” and the symptoms, with chi-squared (df = 2, N = 1340) = 8.01, p < .02, but that this is only a weak association, with Cramer’s V = 0.077. This means that whatever substance X is, the people who are taking it are doing slightly better overall than those who are not. Another way of thinking about the data is that for people taking substance X, 12% progress to the worst symptoms, while 15% of those who are not taking substance X do so. This is indeed a small effect, and it illustrates the dilemma that scientists and statisticians face all the time: when is a statistically significant effect in fact significant? Saving an extra 3 out of 100 people doesn’t seem like a lot, but it is information that we did not have before. We might use this information together with info about other small effects (say for substances Y and Z) to create a novel combination therapy. Such combinations would be unpredictable, yet data such as these might lead us to discover what they are even as a pandemic is in progress.
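For readers who want to check the arithmetic, here is a minimal sketch of the crosstabulation analysis in Python, using only the counts from the example table. The function names are ours and purely illustrative. The p value uses the closed form for the chi-squared tail that holds only when df = 2; a real analysis would more likely lean on a statistics package.

```python
import math

# Counts from the example table. Rows are symptom bands 0-1, 2-3, 4-5;
# columns are (Factor X, everyone else).
observed = [
    [200, 500],
    [100, 350],
    [40, 150],
]


def chi_squared(table):
    """Return (Pearson chi-squared statistic, total N) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    return stat, n


def cramers_v(stat, n, table):
    """Cramer's V: chi-squared scaled by N and the smaller table dimension."""
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(stat / (n * k))


chi2, n = chi_squared(observed)
# Assumption: df = 2, where the upper tail probability is exactly exp(-x / 2).
p_value = math.exp(-chi2 / 2)
v = cramers_v(chi2, n, observed)

# Share of each group reaching the worst symptom band (4-5).
worst_x = observed[2][0] / sum(row[0] for row in observed)
worst_else = observed[2][1] / sum(row[1] for row in observed)

print(f"chi-squared = {chi2:.2f}, N = {n}, p = {p_value:.3f}, V = {v:.3f}")
print(f"worst band: {worst_x:.0%} with Factor X, {worst_else:.0%} without")
```

Run against the table as given, this should land on the same chi-squared, p, and Cramer's V values quoted in the text, to within rounding.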

Statisticians among you will recognize that what we have demonstrated is the most simple-minded analysis possible. It treats symptoms as a nominal variable, when the symptoms variable, as we have defined it, is at least potentially better treated as an ordinal variable, allowing for more sensitive (though also more complicated) analysis. Furthermore, by having individuals log in over many days, one could also potentially perform time series analyses of one form or another, creating all sorts of interesting possibilities. But all the same, this is in fact a fair approximation of the way in which data from drug trials are analysed.

Another issue that the statistically minded among you will recognize is the problem of false positives, aka Type 1 errors. The p value at the end of the statement of statistical significance gives the probability of seeing an association at least this strong when a treatment like our substance X is in fact NOT effective. In our example, we are basically led to conclude that there is about a one in 50 chance of being fooled in this way. That doesn’t sound like a high probability, but the problem here is that we plan to submit perhaps hundreds of substances to the test. In these circumstances, if we set p < .02 as our threshold for significance, and if we treat the tests as independent, then for every 34 items in the test there is a 50% chance that at least one of them will be a false positive. This is what is meant by the term “family-wise error rate.” The way around this problem in analogous situations is usually to set your significance threshold lower, say p < .01. If we did that, for every 34 items in the test there would be a 29% chance of at least one false positive, much lower than before, but still perhaps unacceptable. The solution is to adjust our criterion as far as we need to, and in effect look for only those substances that have the biggest effects. Statisticians are fond of procedures like the Bonferroni correction, which is a very conservative correction for family-wise error rates, but we can dicker about this once the data are in.
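The family-wise figures above follow from a one-line formula, sketched here in Python. The sketch assumes the tests are independent of one another, which real watchlist items may well not be, and the function names are ours. The Bonferroni function simply divides the overall error rate we are willing to tolerate by the number of tests.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across n independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(target_rate, n_tests):
    """Per-test threshold that keeps the family-wise rate at or below target."""
    return target_rate / n_tests


for alpha in (0.02, 0.01):
    rate = familywise_error_rate(alpha, 34)
    print(f"threshold p < {alpha}: family-wise rate over 34 tests = {rate:.0%}")

# Hypothetical watchlist of 200 items, aiming for a 5% family-wise rate.
print(f"Bonferroni per-test threshold: {bonferroni_threshold(0.05, 200):.5f}")
```

The watchlist size of 200 and the 5% target are illustrative choices only. The point is how quickly the per-test threshold shrinks as the watchlist grows.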

Part of the solution to this issue depends on whether or not we expect a particular strategy to be effective. When there is a positive prediction based on some preexisting theory or evidence, statisticians sometimes counsel that you use an uncorrected significance threshold, because the test of the data bearing on this prediction is in effect a “planned comparison.” The problem is that in a sense, every one of the things on our watchlist is “expected” by some parties to be effective. However, the indiscriminate use of relatively liberal significance thresholds for so many possible variables will only end in confusion. In the end, we suspect that what we’ll end up doing is adjusting our significance threshold and rank ordering effective treatments according to their relative effect sizes, with those conferring the largest effects ranked highest, as we will be most confident in our conclusions about these variables. Other variables with marginal effects might remain in the mix, but we think this is what people will naturally do anyway, so we think this is a viable approach.

A final remaining issue is the matter of placebo effects. In a well controlled clinical trial, some subjects are given placebo, so that they are led to expect an effect of some kind. Results from placebo controlled drug trials show time and again that people receiving placebo often show large improvements in their well being. Indeed, the magnitude of placebo effects can sometimes rival those of effective drugs. Why should we expect these data to be any different? Can’t we expect placebo effects to muddy the waters?

Short answer: Yes, of course. But here’s the beauty part: in a placebo controlled drug trial, a drug’s effectiveness is judged in comparison with the placebo. Even if the placebo subjects show improvements (which they often do), a drug is judged effective if the improvements of subjects receiving the drug are greater than those of subjects receiving placebo. The improvements of the experimental group are taken to be a sum of the placebo effect + drug effect. In this effort, we would treat each candidate substance X in turn as the experimental drug, and compare subjects’ outcomes for substance X versus “everyone else”, i.e. those people taking anything other than substance X. In effect, we would treat everyone else as the placebo control group. Since everyone could be assumed to be equally susceptible to placebo effects, all that remains is for us to look for incremental successes in each drug in turn. The only way this might fail is if too many substances are effective, but if that happens, then we won’t have much to worry about anyway, will we?
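The "each substance in turn versus everyone else" comparison can be sketched as follows. This is only an illustration: the report layout (a set of strategies plus a 0 to 5 symptom rating), the strategy names, and the function names are all assumptions of ours, since the actual reporting form has not yet been published. Each resulting table has the same shape as the Factor X example above and could be fed to the same chi-squared analysis.

```python
def symptom_band(rating):
    """Collapse a 0-5 rating into three bands: 0-1 -> 0, 2-3 -> 1, 4-5 -> 2."""
    return rating // 2


def one_vs_rest_table(reports, strategy):
    """Build a 3x2 table: rows are symptom bands, columns are
    (people using `strategy`, everyone else)."""
    table = [[0, 0] for _ in range(3)]
    for strategies, rating in reports:
        column = 0 if strategy in strategies else 1
        table[symptom_band(rating)][column] += 1
    return table


# Made-up reports for illustration: (strategies in use, symptom rating).
reports = [
    ({"statin"}, 1),
    ({"statin", "fish oil"}, 4),
    ({"fish oil"}, 2),
    (set(), 5),
    ({"elderberry"}, 0),
]

# Treat every strategy that anyone reports as a candidate in turn.
watchlist = sorted({s for strategies, _ in reports for s in strategies})
tables = {s: one_vs_rest_table(reports, s) for s in watchlist}

for strategy, table in tables.items():
    print(strategy, table)
```

Note that a person using two strategies lands in the "user" column for both, and in the "everyone else" column for all the others.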

Now that you have a clear idea what information would be gathered, and how it would be used, all that remains is to critique the proposal. Before we do so, we’d like to make it clear that this database should be “open access”, following the well understood wiki model that already exists. This mechanism has the advantage of encouraging participation by interested parties, and creates a real-time data acquisition and analysis process that would be ideal in a pandemic situation. Users could in fact determine the strategies (behaviors, substances, etc) that are included as candidates, and the host would make the data available in raw form to anyone who wanted it. That way, those with the skills and inclination to analyze the data could do so, posting their results for the purpose of forming conclusions about the effectiveness or ineffectiveness of individual candidate strategies.

Threats to Validity

There are several particular weaknesses in the design as we see it, but none we think that are so severe as to doom it to be invalid. The ones that we can foresee are these:

Spoofing & Fraud. Certain individuals with snake oil to sell might create multiple accounts, and spoof the system into thinking that their product is successful by fabricating data. Spoofing attacks are commonplace, and they are the sort of behavior that internet community moderators deal with all the time. We are confident that we could create strategies to counter spoofing. See the method section for our treatment of the issue of data fraud.

Certain procedures should be followed to detect fraud after the fact. Among them will be examination of the data reports for variability signatures suggestive of data manipulation. Among these, for example, will be testing for zero variance. If a person wishes to commit fraud by suggesting that some product they sell is effective for holding off the flu, one easy way to do it is to create multiple fictional participants, and a template report for each of them for a “typical day”. Then, as they report their daily data, all they do is change the date of the report, while all other variables remain constant. The likelihood that a given participant will experience no changes in their exposure status, their symptoms, their diet, or their mental status is so remote as to be impossible. Any person reporting no change in these variables over three consecutive days will be disenrolled from the study and their data will be purged from the data set on the assumption that they are spoofing the dataset. Another way to commit this kind of fraud would be to create ten or so reports with different fake values for each variable, and then rotate through these ten reports over time, creating the illusion of real variance. This strategy could be detected by comparing the sets of values reported across subjects, and across days. Any time different reports match exactly, they can be flagged as possibly fraudulent, and further scrutinized if need be. Any time fraud is detected, the IP address from which the fraudulent reports were issued can be banned from the study.
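The two checks described above (no change over three consecutive days, and exact matches across subjects and days) might be sketched like this. The report layout is hypothetical: we assume each daily report has been reduced to a tuple of its values with the date left out, and the function names are ours.

```python
from collections import defaultdict


def has_frozen_run(daily_values, run_length=3):
    """True if identical values were reported on `run_length` consecutive days."""
    run = 1
    for previous, current in zip(daily_values, daily_values[1:]):
        run = run + 1 if current == previous else 1
        if run >= run_length:
            return True
    return False


def exact_matches(reports):
    """Group (participant, day) pairs whose reported values match exactly.

    `reports` is an iterable of (participant, day, values) with `values`
    given as a tuple. Only value sets reported more than once are returned.
    """
    seen = defaultdict(list)
    for participant, day, values in reports:
        seen[values].append((participant, day))
    return {values: keys for values, keys in seen.items() if len(keys) > 1}


# Illustrative data: (exposure status, symptom rating, strategies used).
template = ("exposed", 1, "product Q")
print(has_frozen_run([template, template, template]))

rotating = [
    ("p1", 1, ("exposed", 0, "product Q")),
    ("p1", 2, ("exposed", 1, "product Q")),
    ("p2", 1, ("exposed", 1, "product Q")),
]
print(exact_matches(rotating))
```

Exact matches between honest participants will happen by chance when the form is simple, which is why the text treats a match as grounds for scrutiny and not as proof of fraud.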

Subject Self-Selection. Subjects will volunteer for the study on the basis of public solicitations. This means that we should be prepared for biases that arise due to participants selecting themselves to be in the study. How that might skew the results will be hard to anticipate. However, we have to have faith that the effects of panflu will not discriminate between the types of people who volunteer for the study, and those who do not. That said, we can expect that the subject population will mirror the population which currently has internet access. Fortunately, the proportion of the American public at least that has internet access is very high, although not as high as the proportion with televisions or telephones.

False Panflu Victims. In this scenario, the pandemic flu breaks out while large numbers of people have the regular flu, colds, and “stomach flu”, all of which resemble panflu in some manner. So these people all dutifully report their symptoms, take their remedies, and when they get better (as so often happens with these ordinary diseases) their data are registered as successes for the substances they are taking. This is in fact the thorniest issue for the open access trial idea. Here is our response: if panflu breaks out, there’ll be a sudden surge in people reporting symptoms, and it’ll be a safe inference that the vast majority of those folks have symptoms because they’ve caught the panflu. Even if it broke out during cold/flu season, it’d be reasonable to expect that the prevalence for a highly transmissible panflu will be much higher than for regular colds or flus. But here’s the kicker: as with the placebo effects, a large number of people with colds or regular flu would not be likely to bias the results in favor of any one preventative strategy or remedy. Instead, they would merely add statistical noise to the dataset, making it harder to spot significant effects of any given substance.

Mortality. How does an individual’s death get reported? If that individual dies, then by definition he or she cannot log back into the system to report that fact. We might circumvent this problem by automatically updating an individual’s status if he or she fails to check in for three straight days. See the Method section for our treatment of this issue.
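As a rough illustration of the automatic status update, the rule could be as simple as the sketch below. The status labels and the function name are placeholders of ours. We read "three straight days" as three full calendar days with no report since the last check-in, so the flag is raised on the fourth day.

```python
from datetime import date


def status_after_silence(last_check_in, today, max_missed_days=3):
    """Return "MIA" once a participant has missed `max_missed_days`
    straight days of check-ins, otherwise "active"."""
    days_since = (today - last_check_in).days
    # The days strictly between the last check-in and today were missed.
    missed = max(days_since - 1, 0)
    return "MIA" if missed >= max_missed_days else "active"


last_seen = date(2006, 3, 1)
print(status_after_silence(last_seen, date(2006, 3, 4)))
print(status_after_silence(last_seen, date(2006, 3, 5)))
```

A participant flagged this way is not known to have died. The flag only marks the point at which some follow-up is needed to find out what happened.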

Minimum Sample Size. With the crude nature of these data, we will need lots of subjects to take part. For a substance that has a small effect on the severity of symptoms, we might need several thousand individuals taking part to have sufficient power to detect such an effect. This means that whoever takes this project on needs to be able to get the word out about its existence, and attract a large (100,000+) user base very quickly. Who is that person or organization? One with high public recognition, one with high public confidence, and one with technical savvy and access to high bandwidth internet connections. A major research university might be a good host. Microsoft. Consumers Union. We might also seek to create a distributed organization, hosted at multiple sites across the globe. The advantage of such an arrangement would be redundancy, which is a hedge against failures of infrastructure. If such a system is to be created, each site will need to be using a common data reporting form and creating the same data structures, so as to make their eventual integration into a single overarching dataset possible. This proposal makes that system possible.
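To give the "several thousand" figure some footing, here is a back-of-envelope power calculation in Python. It uses the standard normal-approximation formula for comparing two proportions. The inputs are assumptions: we borrow the 12% versus 15% worst-outcome rates from the Factor X example, assume equal-sized groups, and pick the conventional 5% two-sided significance level and 80% power.

```python
import math
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate subjects needed in each of two equal groups to detect
    a difference between proportions p1 and p2 (two-sided test)."""
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


needed = n_per_group(0.12, 0.15)
print(f"about {needed} people per group, {2 * needed} in total")
```

With these inputs the answer comes out at roughly two thousand people per group. In practice the people using any one substance will be a small fraction of all participants, and a corrected significance threshold would push the number higher, which is part of why the overall user base needs to be so much larger.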

Access Interruptions due to Infrastructure Failures. Use of the database requires internet access. This is potentially the most severe problem, and if there are widespread disruptions of the internet, this might deep six the entire project, as users would not be able to check back in, and they would be treated as MIA. However, we might create a backup system, whereby active users are polled by phone, using volunteers who do have internet access to track down those who are MIA.

Language Barriers. The database will be in Standard American English. We are in the fortunate situation at this point in history that the vast majority of internet traffic is in English, to the extent that it can in fact be considered the internet lingua franca. This means that even if participation is open only to English speakers, there will be people worldwide who are capable of taking part in the study. That said, it would be desirable to translate the reporting form into other major languages, particularly those that might be near the epicenter of a panflu outbreak. On the other end, dissemination of the results will need to be made more widely available, but if we get to the point where there are results to be reported, we will have no trouble finding plenty of willing translators.

Quality of data. Finally, how well can we rely on people to accurately report their own symptoms, and the other data they enter? In answer to this criticism, we will merely say that we have faith in everybody’s ability to do this adequately, because, after all, isn’t this a major way that doctors find out about their patients’ symptoms: by asking them? By keeping the data sought relatively simple (not asking people to report dosages, etc) and with clear-cut criteria for symptoms, we can keep our data more accurate, even if they are less “fine grained” than we might like under optimal circumstances.

So that is our proposal. What is its potential for saving lives? Let’s spin some scenarios. One class of substances that people are talking about as potential cytokine modulators are the statins, widely used for the control of blood lipids. If you were to walk to an average pharmacy in the U.S., Canada or Europe, you would find mountains of these pills on the shelves. Furthermore, temporarily denying these drugs to the individuals who take them is not likely to severely hurt the health of those individuals, as their cholesterol climbs back to what it was before. In an emergency situation, if the data showed an early large effect for the statins, one can imagine a fairly rapid response on the part of public health officials as they rush doses to critical care wards and clinics.

On the other side of things, there is considerable controversy about the likely effectiveness of corticosteroids such as prednisone, used sometimes to control excessive inflammation, such as that seen in the late stages of an influenza infection. In the minds of some clinicians, prednisone may make things worse. If prednisone in fact increases mortality, the database will make that clear as well.

Finally, who knows what weird thing will pop out as effective? We have heard all manner of things for which a good case can be constructed: curcumin, fish oil, skullcap, elderberry extract, resveratrol, star anise, etc etc ad nauseam. Without specifically buying into the claims made by their supporters, the fact remains that many have undocumented chemical properties, some of which might turn out to be just what the doctor ordered. At the very least, we know that there will be thousands of people trying these things out on their own anyway. Why not just recruit them for the purpose of gathering data from the experiments they will already be performing on themselves? As a matter of ethics, we are obligated to try to gather these data in the best manner possible.

Appendix A: Data Reporting Form [to be published later]

[this revision completed March 18, 2006] Originated by Michael Donnelly, Ph.D.


Page last modified on October 15, 2009, at 04:40 PM by pogge