Figure 1. Phylogenetic relationships among insect endosymbionts (boldface) and related γ-Proteobacteria, estimated from the 16S rDNA gene. Both maximum likelihood (ML) and Bayesian analyses give the tree topology presented. Values above nodes (in boldface) are bootstrap values from the ML analysis, and values below nodes are posterior probabilities from the Bayesian analysis. Branch lengths reflect genetic distance under the ML model used. This phylogeny strongly supports the following hypotheses: (i) a single origin of endosymbionts in the ancestor of the ant genera Camponotus, Colobopsis, and Polyrhachis; (ii) independent origins of symbiosis in the ants Formica and Plagiolepis; and (iii) that Buchnera is a lineage phylogenetically distinct from Wigglesworthia and Blochmannia, which are closely related to each other.
16S rDNA sequence data: Most 16S rDNA sequences were obtained from GenBank (with the exception of the new Blochmannia sequences, obtained as described below). Nucleotide sequence accession numbers for the other 16S rDNA sequences used in phylogenetic analysis are as follows: Antonina crawii S-endosymbiont AB030020; Buchnera-Acyrthosiphon pisum (P-endosymbiont) NC002528; Arsenophonus-Triatoma infestans (S-endosymbiont) U91786; Colobopsis nipponicus endosymbiont AB018676; Dysmicoccus neobrevipes S-endosymbiont AF476104; Enterobacter asburiae AB004744; Escherichia coli NC000913; Erwinia herbicola AB004757; Formica fusca endosymbiont AB018684; Wigglesworthia-Glossina austeni (P-endosymbiont) AF022879; Wigglesworthia-Glossina brevipalpis (P-endosymbiont) L37341; Glossina pallidipes S-endosymbiont M99060; Haemophilus influenzae NC000907; Klebsiella pneumoniae AB004753; Arsenophonus-Nasonia vitripennis (S-endosymbiont) M90801; Plagiolepis pigmaea endosymbiont AB018683; Pseudomonas aeruginosa NC002516; Polyrhachis lamellidens P-endosymbiont AB018680; Proteus vulgaris J01874; Buchnera-Schizaphis graminum (P-endosymbiont) L18927; Sitophilus oryzae P-endosymbiont AF005235; Salmonella typhimurium NC003197; Uroleucon astronomus S-endosymbiont (U type) AF293623; Yersinia pestis NC003143; Yamatocallis tokyoensis S-endosymbiont AB064515.
Obtaining Blochmannia 16S rDNA data: Genomic DNA was extracted from individual Camponotus festinatus and C. pennsylvanicus workers using the DNeasy tissue kit (Qiagen) according to the manufacturer's instructions. This DNA was used as the template for PCR with the SL and SR universal eubacterial 16S rDNA primers (Schroder et al., 1996). The single 1.6-kb PCR product was purified using a column purification kit (Qiagen) and sequenced on an ABI 3700 automated sequencer. SL, SR, and two internal primers were used to sequence the PCR product. The resulting sequences were assembled and edited using PHRED, PHRAP, and CONSED. The 16S rDNA genes of Blochmannia-C. pennsylvanicus and Blochmannia-C. festinatus have been assigned GenBank accession numbers AY196850 and AY196851, respectively.
Phylogenetic analysis methods: Alignments were created using the Ribosomal Database Project II sequence aligner (Maidak et al., 2001), then manually edited in MacClade v. 4.05 (Maddison and Maddison, 2002). Maximum likelihood parameters were identified according to the Akaike information criterion (AIC) of Modeltest v. 3.06 (Posada and Crandall, 1998). The best-fitting model was a general time reversible (GTR) model in which the proportion of invariant sites and the shape of the gamma distribution were estimated from the data. The optimized parameters (Rmat = {0.8676 4.6744 2.0447 1.0516 7.4521}, shape of gamma distribution = 0.5500, and proportion of invariant sites = 0.5115) were used for all ML searches. The tree topology presented is the consensus of 100 separate heuristic ML searches, each starting from random trees, using PAUP v. 4.0b10 (Swofford, 2002). ML bootstrap values were determined from 100 bootstrap replicates, with each replicate starting from 10 random trees. Replicates were performed in parallel on a Beowulf cluster using the clusterpaup program (A.G. McArthur, http://jbpc.mbl.edu/mcarthur). Bayesian analysis was performed on the same data matrix (MrBayes ver. 2.01; Huelsenbeck and Ronquist, 2001) by running four simultaneous chains for 300,000 generations, sampling every 100 generations. Stationarity in likelihood scores was determined by plotting -lnL against generation. All trees sampled before the likelihood scores reached stationarity were discarded, resulting in a "burnin" of 5000 generations. The 50% majority-rule consensus of the remaining trees was used to calculate the posterior probability of each node. The selected model for Bayesian analysis was the GTR, using empirical base frequencies, and estimating the shape of the gamma distribution and the proportion of invariant sites from the data. The Bayesian tree with the best likelihood score was identical to the ML tree presented, and the parameter values for this tree were virtually identical to those obtained in the ML analysis.
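The bookkeeping behind the Bayesian burn-in and the majority-rule posterior probabilities can be illustrated with a minimal Python sketch. It is not the MrBayes implementation and does not read MrBayes output. The function names are hypothetical, each sampled tree is assumed to be represented simply as a set of clades (each clade a frozenset of taxon names), and the sample count ignores the starting state at generation 0.

```python
from collections import Counter


def samples_after_burnin(n_generations, sample_every, burnin_generations):
    """Return (total, discarded, retained) counts of sampled trees.

    Assumption: one tree is sampled every `sample_every` generations and
    the starting state at generation 0 is not counted.
    """
    total = n_generations // sample_every
    discarded = burnin_generations // sample_every
    return total, discarded, total - discarded


def clade_posteriors(sampled_trees):
    """Estimate each clade's posterior probability as its frequency
    among the retained (post-burn-in) sampled trees.

    Assumption: every tree is an iterable of clades, and every clade is
    a frozenset of taxon names.
    """
    counts = Counter()
    for tree in sampled_trees:
        counts.update(set(tree))  # count each clade at most once per tree
    n_trees = len(sampled_trees)
    return {clade: count / n_trees for clade, count in counts.items()}


def majority_rule(posteriors, threshold=0.5):
    """Keep only clades found in more than `threshold` of the trees."""
    return {clade: p for clade, p in posteriors.items() if p > threshold}


# Run settings taken from the text: 300,000 generations, sampled every
# 100 generations, with a burn-in of 5000 generations.
total, discarded, retained = samples_after_burnin(300_000, 100, 5_000)
print(total, discarded, retained)
```

Under these assumptions, the settings described above correspond to 3000 sampled trees, of which the first 50 are discarded and 2950 contribute to the consensus. A clade's posterior probability is then the fraction of those retained trees that contain it.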
Limited sequence data (<570 bp of 16S rDNA gene) were available for four taxa included in the phylogeny (Formica fusca, Plagiolepis pigmaea, Polyrhachis lamellidens and Colobopsis nipponicus), compared to >1202 bp available for the rest of the taxa. Removal of these four taxa from ML or Bayesian analysis did not affect the overall tree topology; however, their removal greatly increased the statistical confidence in the node marked with the open circle. Comparisons of these analyses indicate that including Plagiolepis pigmaea drives down the confidence in that node, and suggest ambiguity in its position on the tree. However, given the incomplete sequence for that endosymbiont, this topology is the best estimate of its phylogenetic placement.