Molecular epidemiology is not new, but it is progressing rapidly.
Last month saw the publication in PLoS Pathogens of a new study of an equine influenza outbreak in Newmarket, UK, in 2003. This town is known for its high density of Thoroughbred race horses; around 3000 horses are divided among yards of 20-200 horses.
Despite vaccination against equine influenza (EI), a large outbreak occurred, resulting in the infection of horses on many yards. Nasal samples were taken from 19 of the 21 yards during the outbreak and the possible order of infection of the yards was determined based upon ELISA and real-time RT-PCR data. Both of these assays detect the virus itself (and so depend on a swab, taken only once clinical signs are detected), rather than antibodies; they nevertheless give an indication as to which yard was infected and when - a valuable source of information in the absence of further biological data.
Figure: Outbreak progression, showing the order in which yards were infected. Virus presence was detected using real-time RT-PCR, with the copy numbers obtained shown on the y axis.
The authors sequenced a huge number of clones - 2361 - of a 903bp PCR product representing a fragment of the haemagglutinin 1 (HA1) gene, with multiple clones per horse. A particularly interesting aspect is that the overall dN/dS ratio (the ratio of non-synonymous to synonymous substitution rates) is 0.89, which implies that, in general, the outbreak progressed without a massive restriction on the maintenance of mutations. When the sequences were combined to form a consensus for each horse, only two unique consensus sequences were found for the entire Newmarket outbreak. This lack of inter-horse genetic variability highlights the limitations of analysing consensus data at this level.
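To make the consensus point concrete, here is a minimal sketch of how a per-horse majority-rule consensus can hide minority variants. The horse names and 8bp sequences are invented toy data, not taken from the paper, and the `consensus` helper is my own illustration rather than the authors' pipeline.

```python
from collections import Counter


def consensus(clones):
    """Majority-rule consensus of aligned, equal-length sequences."""
    if not clones:
        raise ValueError("need at least one clone")
    length = len(clones[0])
    if any(len(clone) != length for clone in clones):
        raise ValueError("clones must be aligned to the same length")
    # For each alignment column, keep the most frequent base.
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*clones)
    )


# Hypothetical toy data: five clones per horse, 8bp instead of 903bp.
clones_by_horse = {
    "horse_A": ["ACGTACGT", "ACGTACGT", "ACGTACGT", "ACGTACGT", "ACGAACGT"],
    "horse_B": ["ACGTACGT", "ACGTACGT", "ACGTACGT", "ACGTACTT", "ACGTGCGT"],
}

consensus_by_horse = {h: consensus(c) for h, c in clones_by_horse.items()}
unique_consensus = set(consensus_by_horse.values())
variants_by_horse = {h: set(c) for h, c in clones_by_horse.items()}

print(len(unique_consensus), "unique consensus sequence(s)")
for horse, variants in variants_by_horse.items():
    print(horse, "carries", len(variants), "distinct clone sequences")
```

In this toy case both horses collapse to the same consensus even though each carries its own minority variants, which is exactly the information a consensus-only analysis throws away.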
When they looked further at the diversity of sequences found within a horse (the term 'quasispecies' conspicuous by its absence) they found limited diversity associated with dominant sequences, with the dominant sequence sometimes changing in those horses sampled more than once, either due to a different sequence becoming dominant, or due to co-infection.
The classic view of transmission involves bottlenecks - something I've written about previously with regard to arboviruses. It turns out that the bottlenecks for EIV are loose, even allowing sequences with stop codons, i.e. lethal sequences, to pass between horses. This goes along with the observation of low levels of purifying selection and may tie in with a recently accepted manuscript in J Virol (not yet in press) suggesting that individual virus particles fail to express all of the proteins correctly.
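As an illustration of how sequences with stop codons might be flagged, here is a minimal sketch that scans a nucleotide sequence for in-frame stops. It assumes the reading frame of the fragment is already known; the function name `in_frame_stops` and the example sequences are hypothetical and not from the paper.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_frame_stops(seq, frame=0):
    """Return 0-based codon indices of stop codons in the given frame.

    `frame` is the offset (0, 1 or 2) of the first complete codon.
    Trailing bases that do not fill a whole codon are ignored.
    """
    seq = seq.upper()
    return [
        (i - frame) // 3
        for i in range(frame, len(seq) - 2, 3)
        if seq[i:i + 3] in STOP_CODONS
    ]


# Toy sequences: the second differs by one base, turning TGG (Trp) into TGA (stop).
intact = "ATGGCATGGAAA"
mutant = "ATGGCATGAAAA"

print(in_frame_stops(intact))
print(in_frame_stops(mutant))
```

Because the sequenced region is a fragment from within HA1, any in-frame stop it contains would be premature, which is why no special handling of a terminal codon is attempted here.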
Sequence data is increasingly being linked with geographical data with the aim of tracing outbreaks at a finer scale. Although 903bp might be thought of as being a bit too short for this purpose, by taking into account the within-host diversity there was sufficient information within the dataset to allow such an approach here.
One particularly interesting finding was that there didn't appear to be a straightforward yard-to-yard pathway. Based upon this analysis, it is possible to hypothesise that a yard may be infected by multiple yards and subsequently itself infect multiple different yards. This raises the speculation that horses within a yard were not necessarily being infected by their yard-mates, leading the authors to suggest that social networks may explain the observed transmission dynamics more fully than a model based upon proximity. Such freedom of spread is emphasised further with respect to inter-horse spread. Figure 3C shows how numerous horses may be infected by a donor horse, including one which contributed infection to 10 other horses.
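The "infected by multiple yards, infects multiple yards" pattern can be sketched as a small directed graph. The donor-recipient pairs below are invented for illustration and are not the paper's inferred network; the point is only to show how such a pattern could be counted once pairs have been inferred.

```python
from collections import defaultdict

# Hypothetical inferred transmission pairs: (donor, recipient).
edges = [
    ("yard_1", "yard_2"),
    ("yard_1", "yard_3"),
    ("yard_2", "yard_3"),
    ("yard_2", "yard_5"),
    ("yard_3", "yard_4"),
    ("yard_3", "yard_5"),
]

donors_of = defaultdict(set)      # recipient -> yards that infected it
recipients_of = defaultdict(set)  # donor -> yards it infected
for donor, recipient in edges:
    donors_of[recipient].add(donor)
    recipients_of[donor].add(recipient)

# Yards with more than one inferred source of infection.
multiply_infected = sorted(y for y, d in donors_of.items() if len(d) > 1)
# Yards that passed infection on to more than one other yard.
multiple_spreaders = sorted(y for y, r in recipients_of.items() if len(r) > 1)
# Yards that did both, i.e. the opposite of a simple chain.
hubs = sorted(set(multiply_infected) & set(multiple_spreaders))

print("infected by several yards:", multiply_infected)
print("infected several yards:", multiple_spreaders)
print("both:", hubs)
```

A strictly linear yard-to-yard pathway would leave all three lists empty; any yard appearing in `hubs` is the sort of node that a proximity-only model struggles to explain. The same bookkeeping applies unchanged to horse-to-horse pairs.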
The biggest frustration is the same as that for most studies involving segmented viruses. Reassortment is a key process in the evolution of segmented viruses such as influenza, and the authors acknowledge that this aspect is missing from their analysis. As it is, there's no way of saying whether two viruses which are identical in the 903bp fragment of HA are actually reassortants, with their other segments derived from distinctly separate parts of the outbreak/network. When next-generation sequencing methods predominate and generate full genome sequence data, this should become less and less of an issue. My suspicion is that, in the not too distant future, studies like this will almost certainly be obliged to use full genome sequencing.
Equine flu might not seem the most important disease ever (although anything to do with horses = money), but this study is massively extensive. The authors have shown how, rather than simply drawing a few phylogenetic trees, much information can be derived from what, ultimately, is a straightforward collection of sequences with some associated epidemiological data.
Hughes, J., Allen, R., Baguelin, M., Hampson, K., Baillie, G., Elton, D., Newton, J., Kellam, P., Wood, J., Holmes, E., & Murcia, P. (2012). Transmission of Equine Influenza Virus during an Outbreak Is Characterized by Frequent Mixed Infections and Loose Transmission Bottlenecks. PLoS Pathogens, 8 (12). DOI: 10.1371/journal.ppat.1003081