1 Department of Medical Biophysics, University of Toronto, Sunnybrook Health Sciences Centre, 2075 Bayview Avenue, Rm. S126B, Toronto, Ontario M4N 3M5, Canada
2 Department of Immunology, University of Toronto, Sunnybrook Health Sciences Centre, 2075 Bayview Avenue, Rm. S126B, Toronto, Ontario M4N 3M5, Canada
* To whom correspondence should be addressed. E-mail: jrast{at}sri.utoronto.ca
| Abstract |
|---|
Abbreviations: Ig, immunoglobulin; LRR, leucine-rich repeat; MACPF, membrane attack/perforin domain; NLR, NACHT domain–leucine-rich-repeat protein; PAMP, pathogen-associated molecular pattern; SRCR, scavenger receptor cysteine-rich; TCR, T-cell receptor; TIR, Toll/interleukin-1 receptor domain; TLR, Toll-like receptor; VLR, variable lymphocyte receptor.
| Introduction |
|---|
The situation has begun to change with the realization that core immune mechanisms are shared among Eumetazoa and with the steady accumulation of molecular evidence for basic shared characters among bilaterian immune cells. Immune investigations in Drosophila melanogaster (reviewed in Hoffmann, 2003) have paved the way for the important realization of the critical role played by Toll-like receptors (TLRs) in mammalian immunity (Medzhitov et al., 1997; Poltorak et al., 1998). These similarities notwithstanding, other frontline mechanisms of immunity are turning out to be very different at the phylum level. While this impedes rapid progress in comparative immunology, it simultaneously promises an enormous wealth of immunobiology that will have important evolutionary and practical implications.
Animal immune systems and genome sequence analysis
Animal immune cells participate in a variety of distinct processes including (1) maintaining self/non-self barriers and eliminating invading microbes when these barriers are penetrated; (2) wound healing; (3) maintaining balance among symbiotic microbes, especially as this pertains to gut immunity; and (4) controlling and eliminating aberrant self. Aspects of these systems (particularly antiviral and barrier functions) are spread throughout virtually all somatic cells, but in most bilaterians specialized mesoderm-derived cells are more-or-less dedicated to immunity. The primary focus of this review is the role of genomic sequences in understanding these dedicated immune cells in the more than 25 animal phyla that have not been the major target of immune studies.
Assembling a catalog of immune genes from a genome sequence first requires a classification scheme. Since this is a function-based approach and genes often carry out multiple tasks, any scheme will have its flaws; however, classification of immune genes as encoding (1) recognition factors, (2) regulatory factors, and (3) effector proteins has proven a useful start (for a more thorough discussion of immune gene categorization, see Hibino et al., 2006). Recognition factors such as TLRs, NACHT domain–leucine-rich-repeat proteins (NLRs), and lectins are proteins that make direct contact with target microbes or are the primary point of indirect detection by the immune system. Effector factors such as perforins and antibacterial peptides are the proteins that carry out pathogen killing, opsonization, and other elimination functions. These categories overlap when the same molecule provides recognition and effector functions (e.g., some functions of immunoglobulin in vertebrates), but these functions are often separate and specialized. Recognition and effector functions can thus be predictive as to the types of immune processes that are present in a genome. These categories also tend to be more quickly evolving than the regulatory factors and more difficult to identify by homology. Effectors are often completely novel in new taxa.
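The three-way scheme can be pictured as a small lookup structure. The Python sketch below is illustrative only: the category names follow the text, whereas the `ImmuneCategory` enum, the `EXAMPLES` table, and the `categories_for` and `has_overlap` helpers are hypothetical names introduced here. The table holds only gene classes named in this section, not a real annotation set.

```python
from enum import Enum


class ImmuneCategory(Enum):
    RECOGNITION = "recognition factor"
    REGULATORY = "regulatory factor"
    EFFECTOR = "effector protein"


# Hypothetical lookup table built only from examples named in the text.
# A gene class may map to more than one category because the scheme is
# function-based and functions can overlap.
EXAMPLES = {
    "TLR": {ImmuneCategory.RECOGNITION},
    "NLR": {ImmuneCategory.RECOGNITION},
    "lectin": {ImmuneCategory.RECOGNITION},
    "transcription factor": {ImmuneCategory.REGULATORY},
    "perforin": {ImmuneCategory.EFFECTOR},
    "antibacterial peptide": {ImmuneCategory.EFFECTOR},
    "immunoglobulin": {ImmuneCategory.RECOGNITION, ImmuneCategory.EFFECTOR},
}


def categories_for(gene_class):
    """Return the set of categories for a gene class (empty set if unlisted)."""
    return EXAMPLES.get(gene_class, set())


def has_overlap(gene_class):
    """True when a single gene class carries more than one category."""
    return len(categories_for(gene_class)) > 1
```

Storing a set of categories per gene class, rather than a single label, reflects the point made above that the same molecule can provide both recognition and effector functions.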
Regulatory factors include transcription factors, signaling molecules, and signal transduction mediators and modulators; they can be roughly divided between those that control immunocyte development and those that control differentiated immune function. From the developmental and cellular perspective there is growing evidence that homology exists among control systems in animal immunocytes. Immunocytes are notorious for their morphological variability and thus are difficult to categorize even within species. Nonetheless, certain basic types of immunocytes appear to be shared across phyla (Hartenstein, 2006). Regulatory gene usage in the development of Drosophila hemocytes is well studied, and some interesting parallels are evident with vertebrate hematopoietic systems (Evans et al., 2003; Hartenstein and Mandal, 2006). This is also true for sea urchin coelomocytes (Pancer et al., 1999; Rast et al., 2000; Hibino et al., 2006), and in some cases orthology is clearly evident within the deuterostomes. Although more detailed knowledge of regulatory interactions is needed to understand the level at which homology exists in animal immunocytes (e.g., how much of it lies in general mesenchymal cell functions and how much is specific to immunity), the implication of these findings is that the "blood" cells across bilaterians may share a basic kernel of developmental circuitry (see Davidson and Erwin, 2006). Investigations centered on transcription factors with developmental and functional relevance to immunity will help reveal the nature of immunity in the common bilaterian ancestor. Additionally, since homologs of these transcription factors and developmental signaling systems can be unambiguously identified from genomic sequence, they serve as an anchor point to characterize more divergent aspects of immunity.
Given that immune responses are potentially destructive to the host and must be tightly controlled, specialized forms of cell communication that serve to coordinate protective reactions among cells are likely of great importance in all animals. Immune signaling can act to recruit immune cells to a site of infection, activate or suppress a response, or promote proliferation and differentiation of immunocytes. These signals can mediate communication among immunocytes, or between immunocytes and nonspecialized somatic cells. In part because these systems can be manipulated by pathogens for immune evasion (Murphy, 1993; Alcami, 2003), vertebrate immune signaling systems evolve quickly, often making the interleukins and other cytokines that mediate these functions difficult to identify even across Class divides. Homologs of most of the major interleukins have not been identified outside of vertebrates, even when genome sequences are available. Their apparent absence in invertebrates may mean that they are present but highly divergent or that analogous functions are carried out by other signaling systems.
A final distinction is that of innate and acquired (or adaptive) immunity, referring respectively to non-self receptors that are encoded in the germline like most genes or generated in a process of somatic diversification. This division refers primarily to recognition mechanisms, although a highly complex and dedicated cellular machinery accompanies acquired immunity in species where it is well described. In jawed vertebrates the diversified immune receptors are immunoglobulins (Ig) and T-cell receptors (TCR) expressed on B- and T-lymphocytes, respectively. Ig and TCR function is mediated in a stepwise process of somatic DNA-level diversification, followed by cellular selection of clonally expressed specificities that are useful and deletion of specificities that are potentially harmful. This process enables immune recognition to adapt within the individual and allows for the establishment of immune memory that can protect against previously encountered pathogens during the lifespan of the individual. The diversification process, known as V(D)J recombination, is restricted to the jawed vertebrates and is associated with a specialized type of Ig domain proteins and recombinases.
Recently a second type of vertebrate receptor diversification system has been identified in the lamprey and hagfish (Pancer et al., 2004). This system uses a process of somatic gene conversion to diversify leucine-rich-repeat (LRR)–containing proteins called variable lymphocyte receptors (VLRs) that are similar to the extracellular region of TLRs (Alder et al., 2005, 2008; Nagawa et al., 2007; Rogozin et al., 2007; Herrin et al., 2008). This agnathan system is in many ways analogous in function to jawed vertebrate Ig systems. Invertebrate diversifying systems include extensive splice variation in Drosophila and Anopheles DSCAM (Watson et al., 2005; Dong et al., 2006), somatic mutation in gastropod FREP genes (Zhang et al., 2004), and highly diversified 185/333 transcripts expressed by sea urchin phagocytic cells (Buckley and Smith, 2007; Terwilliger et al., 2007). These recent findings suggest that unusual molecular systems for immune receptor diversification and possibly adaptive immunity are more widespread than previously considered. As alternative systems are better understood, general principles may emerge that will guide searches for further unusual systems.
The power of genome sequences to characterize divergent immune systems
Genome sequence analysis offers advantages that promise to overcome longstanding barriers to progress in comparative immunology. These barriers relate to the accelerated rate at which immune mechanisms evolve. First, when homologs of recognition and effector genes exist among phyla they are usually highly divergent and cannot be isolated by standard molecular screening or PCR procedures. Second, as genomes emerge from key animal groups it becomes apparent that orthologs of important jawed vertebrate immune mediators are often completely lacking in other groups. This is hard to prove in the absence of comprehensive data. A third complication is that immune recognition and effector genes are commonly encoded by complex and polymorphic multigene families. Genome sequences offer solutions to each of these problems and a framework on which data can be hung even when they do not coincide with anything that we know of immunity from past studies of other animals. Analysis of the sea urchin genome (Sodergren et al., 2006) will serve here as an example of a genome sequence that held many immune surprises that for the most part would have been impossible to predict by any other strategy.
| Immunity in the Sea Urchin |
|---|
The purple sea urchin genome encodes more than 200 TLR genes distributed in a variety of subfamilies. By comparison, mammals and insects, where these genes are most intensely studied, generally have about 10. The sea urchin TLR genes fall into three main structural categories (Hibino et al., 2006). The first is a relatively small group with structural similarity and TIR domain sequence that ally them with most of the protostome TLRs (e.g., Drosophila Toll). The LRR solenoid-like structure of these receptors contains internal cysteine-rich capping LRRs (Pfam: LRRNT and LRRCT). Another small group of sea urchin TLRs is characterized by a very short extracellular domain. The remaining majority of sea urchin TLR genes have structure more akin to those of vertebrates and Drosophila Toll 9. The presence of both these gene types in deuterostomes implies that a major divergence among TLR structural types had already taken place in the bilaterian common ancestor.
Patterns of diversity among members of the expanded sea urchin TLR families indicate that they are quickly evolving. Amino acid changes among closely related genes are highest in the ectodomain and are elevated in certain LRR motifs (Hibino and Rast, unpubl.). About 30% of these genes are pseudogenes, which is consistent with rapid gene turnover. Many of the TLR subfamilies are encoded in tandem arrays of closely related genes, while others are dispersed throughout the genome. Propositions regarding the role of positive selection in the evolution of sea urchin TLRs will have to be formally tested with rigorous phylogenetic analyses (e.g., see Alder et al., 2005). Nonetheless it is clear that as a gene family they differ greatly from the comparatively small number of vertebrate TLRs, which appear to be relatively conserved throughout the vertebrates (Roach et al., 2005).
The expression levels of members of many of the sea urchin TLR gene subfamilies are most elevated in coelomocytes and very low or absent in the embryo (Rast et al., 2006). Experiments to understand the mechanisms by which these large families of TLRs are expressed will answer important questions about how the vast innate diversity is managed by the sea urchin immune system. Various scenarios of restricted expression can in principle amount to levels of cellular diversity (Litman et al., 2007), and this is being tested with single-cell expression analyses.
Another unexpected finding from the sea urchin genome sequence is a large multigene family encoding NLR-like proteins. Before the sequencing of the sea urchin genome these genes were known only from vertebrates, though they share structural similarity to the plant Resistance (R) genes (Chisholm et al., 2006). In vertebrates these cytoplasmic proteins are implicated in the recognition of PAMP molecules, although the mechanisms of recognition are not entirely clear (Kufer et al., 2005). The sea urchin genome encodes more than 200 NLR genes (Hibino et al., 2006). The general structure of the encoded proteins is a Death domain followed by a nucleotide-binding NACHT domain and a series of LRR motifs. The LRR region of the sea urchin genes is not well defined for many subfamilies. In contrast to the TLRs, complex intron-exon structure makes gene modeling difficult, and cDNA sequencing will ultimately be critical to better understand these genes.
An interesting variation in the sea urchin NLR genes is the presence of an N-terminal Death domain on most of the gene family members. The majority of vertebrate NLR genes have caspase recruitment or pyrin domains in this position. Notably, a majority of the putative interacting adaptors and apoptosis mediators in the sea urchin also have Death domains rather than the caspase recruitment domains of mammalian homologs (Robertson et al., 2006), suggesting that parallel replacement of these homotypic interacting domains has modified the system as a whole (Hibino et al., 2006).
In vertebrates, NLRs are associated with gut immunity; in the sea urchin, the gut is the major site of NLR expression (Messier-Solek and Rast, unpubl.). In the sea urchin, these genes are highly diverse, and though they resemble the TLRs in the sense that many of the subfamilies appear to be recently expanded, they are overall a more varied and complex multigene family. The presence of this diverse recognition system primarily in the gut may in part explain the diversity of the sea urchin innate system. Gut immunity and maintenance of symbiotic microbial communities (McFall-Ngai, 2007) are primary forces that drive the evolution of immune systems.
The third family of diversified sea urchin immune genes encodes an immense diversity of multidomain SRCR proteins with structural similarity to phagocytosis receptors. Proteins with this general structure in vertebrates compose a class of scavenger receptors that is used by macrophages in the process of binding to microbes and aberrant self (Sarrias et al., 2004), generally by recognition of modified lipids (Mukhopadhyay and Gordon, 2004). The multidomain SRCR proteins were shown to be an immense family in sea urchin before the availability of the genome sequence (Pancer et al., 1999; Pancer, 2000, 2001). cDNA analysis of SRCR genes shows that some of these genes are secretory, while others are type-1 membrane proteins. Unlike the TLRs, they have a complex exon structure, and the genome sequence is just a starting point for characterizing this multigene family. Nonetheless, if we compare species on the basis of the raw number of SRCR domains contained in multi-SRCR proteins, then the sea urchin repertoire greatly exceeds that of vertebrates (e.g., 1100 in sea urchin vs. 80 in human).
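The scale of these expansions is easier to appreciate as ratios. The short Python calculation below uses only the approximate counts quoted in this section: more than 200 sea urchin TLR genes against about 10 in mammals and insects, about 30% TLR pseudogenes, and 1100 versus 80 SRCR domains. Because the text gives lower bounds and rounded figures, the results are rough illustrations rather than measurements, and the function and constant names are introduced here for the example.

```python
def fold_difference(sea_urchin_count, comparison_count):
    """Ratio of a sea urchin count to the count in a comparison group."""
    return sea_urchin_count / comparison_count


# Approximate figures quoted in the text; the TLR count is a lower bound.
TLR_SEA_URCHIN = 200
TLR_TYPICAL = 10
TLR_PSEUDOGENE_FRACTION = 0.30
SRCR_DOMAINS_SEA_URCHIN = 1100
SRCR_DOMAINS_HUMAN = 80

tlr_fold = fold_difference(TLR_SEA_URCHIN, TLR_TYPICAL)
srcr_fold = fold_difference(SRCR_DOMAINS_SEA_URCHIN, SRCR_DOMAINS_HUMAN)
tlr_pseudogenes = round(TLR_SEA_URCHIN * TLR_PSEUDOGENE_FRACTION)

print(f"TLR genes: at least {tlr_fold:.0f}-fold more than the typical count")
print(f"SRCR domains: about {srcr_fold:.2f}-fold more than in human")
print(f"TLR pseudogenes: roughly {tlr_pseudogenes} or more")
```

On these figures the TLR expansion is at least 20-fold and the SRCR domain count nearly 14-fold that of human, with roughly 60 or more of the TLR genes being pseudogenes.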
Although these three immune gene families are greatly expanded, other immune receptors and effectors are encoded in gene families of more moderate size (Fig. 2A). These include peptidoglycan recognition proteins, Gram-negative binding proteins, and complement factors. It is notable that representatives of each of the sea urchin families of expanded factors (TLR, NLR, and SRCR) are part of an immune circuit in vertebrates where the DMBT1 protein, structurally very similar to the sea urchin multidomain SRCR factors, acts downstream of TLR and NLR recognition in gut immunity (Rosenstiel et al., 2007).
Origins of vertebrate immunity
B- and T-cell–mediated adaptive immunity is among the most intensely studied of biological systems. Recognition specificity in jawed vertebrate adaptive immunity comes through precise clonal selection of somatically diversified TCRs and Igs. The essential elements of the diversifying system include orthologs of the six rearranging gene classes of T-cell receptors and Ig, as well as the dedicated enzymatic machinery including the recombination activating genes (Rag1/2) and terminal deoxynucleotidyl transferase (TdT) that carry out rearrangement along with constitutive DNA repair processes. The major histocompatibility Class I and II genes act in the specificity selection process. Orthologs of all of these molecules are present throughout the jawed vertebrates but had not been found outside of this group. Although the basic domain structures of some of these elements can be found in invertebrates, evidence for the presence of V(D)J rearrangement is entirely absent outside of jawed vertebrates.
Until recently, Drosophila melanogaster and Caenorhabditis elegans were the only bilaterian species outside the jawed vertebrates that could be considered comprehensively analyzed outgroups for comparison. None of the core dedicated immune factors that mediate jawed vertebrate adaptive immunity is present in the fruit fly genome. Given the large phylogenetic gap between jawed vertebrates and protostomes, intermediate outgroups such as echinoderms are clearly necessary to address the origins of this system.
The sea urchin genome encodes a wealth of genes that are relevant to understanding the origins of jawed vertebrate adaptive immunity (Hibino et al., 2006). These include a distant Rag1/2-like gene cluster, a TdT/pol µ homolog, and a variety of non-rearranging Ig variable region genes, some of which compose diversified gene families. Like much that comes from initial scans of genome sequences, none of these findings provides an immediate, clear-cut answer in itself. This is particularly true for Ig domain proteins, which tend to be highly divergent even within jawed vertebrates. More detailed information on their function and associated pathways will serve to clarify their relationship to vertebrate immunoglobulin domain receptors.
The Rag1/2-like cluster (Fugmann et al., 2006) is related to other Rag1-like elements that have been recently identified in a variety of species (Kapitonov and Jurka, 2005). The finding of these elements was unexpected given their absence in the Ciona genome (Azumi et al., 2003). The sea urchin cluster is present in a single locus that spans 20 kb, and both the Rag1- and Rag2-like elements are interrupted by multiple introns. The Rag2-like element is divergent in terms of primary sequence but has a predicted structure identical to the unique Kelch repeat-PHD structure of vertebrate Rag2. The orientation of the genes in the cluster is also identical to that of vertebrate Rag1/2. Both genes are transcribed and spliced in the embryo and adult coelomocytes. Additionally, the sea urchin Rag-like proteins associate with each other and with their vertebrate counterparts (Fugmann et al., 2006). The role of these genes is as yet unclear. They could represent a transposable element, although the large locus size, presence of introns, and active transcription suggest otherwise. Whatever their nature, they offer a fresh window into the origins of the jawed vertebrate V(D)J recombining gene system.
A variety of Ig domain proteins are present in the sea urchin genome, and some of these are candidates as distant non-rearranging relatives of Ig and TCR. Ig, TCR, and MHC proteins contain an Ig-like domain subtype designated as C1 (Williams and Barclay, 1988). This subset of the Ig domain superfamily is restricted to relatively few genes, including the rearranging adaptive immunity receptors (Du Pasquier, 2004). A small family of three V-C1-TM-Cytoplasmic region genes is found in the sea urchin genome. Additionally, a relatively large family of about 60 V-C2-TM-Cyt molecules is present (Fig. 2B). Ig genes diverge at a very fast rate, even among vertebrates. The precise implications of the presence of these gene families in the sea urchin genome are unclear, but elucidation of their roles in this organism will provide further insight into the origins of vertebrate immunity.
Immune regulatory genes of the sea urchin genome sequence
For the purposes of this genomic analysis, regulatory proteins include transcription factors and signaling systems (signaling receptors, ligands, transducers, and modulators). The regulatory genes that control hematopoiesis and immune reactions can be broadly divided between those that are specific to immunocyte systems and those that are shared by other biological processes. Individual transcription factors are invariably used in multiple cellular contexts, but the extent of usage for some factors can be limited such that they offer valuable insights into homology. The most informative transcription factors are those that are used repeatedly in the course of vertebrate hematopoietic and immune processes, especially when this applies across vertebrate paralog groups. These homologs are promising candidates as regulatory factors with an ancient role in immunity.
Orthologs of virtually all important vertebrate hematopoietic and immune transcription factors can be found in the sea urchin genome, including those that belong to deuterostome-specific subclasses such as the PU.1/SpiB/SpiC ETS factors (Hibino et al., 2006). The sea urchin genome generally encodes one subfamily member for each paralogous group of two to four genes in vertebrates (e.g., one homolog of vertebrate GATA-1, -2, -3 and one Scl factor that is orthologous to Scl, Tal-2, Lyl-1). In a few cases there are duplications specific to sea urchin (e.g., there are two Runx homologs; Howard-Ashby et al., 2006).
Cell signaling to activate and suppress immune responses and to regulate cell migration and proliferation is an essential component of immunity. Just as for transcription factors, the specificity of signaling to immune systems varies widely. For the most part, homologs of the broadly used signaling systems are all present, whereas many vertebrate immune signaling systems are absent or have diverged beyond recognition. Homologs of important receptor tyrosine kinases such as VEGFR and Tie1/2 are clearly present, whereas no clear homolog of the Flt3/Kit/PDGFR subfamily has been identified (Lapraz et al., 2006). Notably, both the VEGFRs and the Tie1/2 gene are expressed in coelomocytes (Smith et al., 1996; Messier-Solek and Rast, unpubl.).
A very different situation is found for homologs of vertebrate cytokines, interleukins, and chemokines (Hibino et al., 2006). No ligands or receptors of the four-helix bundle/hematopoietin family; the IL-10 and IL-12 families; interferons; and the CXC, CX3C, and CC chemokine families were identified in the sea urchin genome. These gene families are divergent even among the jawed vertebrates and have not been identified outside of this group. An IL-1 receptor homolog is present in the sea urchin genome, as is a greatly expanded family of about 30 IL-17 ligands and two IL-17 receptors. Four tumor necrosis family ligands and eight receptors were also identified. The sea urchin differs greatly from vertebrates and protostomes in its repertoire of these signaling factors. While many are absent and may be vertebrate-specific, others show signs of expansion. A caveat in searching for these genes is that they tend to be small, highly divergent genes in vertebrates and are often broken into many tiny exons. These characteristics greatly complicate computational gene searches. Nonetheless, this is another area of immunity in which radical differences among phyla can be expected. The rapid evolution of these immune cell communication systems within vertebrates is likely driven in part by molecular mimicry by microbes that exploit these systems to disrupt host immune responses (Alcami, 2003).
Representatives of most vertebrate cytosolic mediators of signal transduction are present, and these genes are generally conserved even with their fly counterparts. There is a moderate expansion of TLR signaling intermediates (e.g., about 25 MyD88, SARM-like genes and other cytosolic TIR domain proteins are found in the genome) that may relate to the highly expanded state of the TLR receptors. Factors beyond the immediate adaptor molecules that interact directly with members of the expanded recognition receptors tend to function in many biological systems and are relatively conserved compared to the immune genes themselves, as has also been seen in analyses among Drosophila immune mediators (Sackton et al., 2007).
Effector genes
Proteins that carry out killing and clearance of invading microbes include antimicrobial peptides, enzymes that generate reactive oxygen species, pore-forming proteins, and clotting factors that can physically separate non-self from self. Some of these factors (e.g., antimicrobial peptides) are among the most quickly evolving and difficult to recognize factors in the context of a genome sequence search. In a comparative study of innate immune genes from Drosophila genomes, immune effector genes as a class diverge most quickly and show greatest evidence of positive selection (Sackton et al., 2007). Many of these effectors will be identified in screens that are aided by the genome sequence and may ultimately be some of the more interesting finds from the genome. A few candidate effector genes were found in the sea urchin genome, including a family of 22 potential pore-forming membrane attack/perforin (MACPF) domain proteins. The founding member of this family was identified as an early-embryo protein whose function was unknown but that, interestingly, was secreted to the outside of the embryo (Haag et al., 1999). The diversity, multiplicity, and perforin-like domain structure, recognized from the genomic analysis, along with the spatial expression data suggest an immune function for this multigene family (Hibino et al., 2006). A recent survey for genes that respond to immune challenge in the Chinese amphioxus Branchiostoma belcheri identified probable homologs of these sea urchin MACPF genes as highly upregulated (Huang et al., 2007). These genes present an interesting case in which genome sequence analysis informed by older molecular data led to prediction of function.
| A New View of Animal Immunity Emerging From Genome Sequences |
|---|
Perhaps the most important findings that will emerge from a more comprehensive view of animal immunity will be the many independent solutions to immune problems that exist across different phyla. The most interesting of these systems will not be obvious from the genome sequences themselves, but the genome sequences will provide a critical organizational framework. Ultimately, much of what is found will probably be as unpredictable to us as it must necessarily be to the microbes against which it is directed.