We report a random survey of 1 to 2% of the somatic genome of the free-living ciliate <jats:italic>Paramecium tetraurelia</jats:italic> by single-run sequencing of the ends of plasmid inserts. As in all ciliates, the germ line genome of <jats:italic>Paramecium</jats:italic> (100 to 200 Mb) is reproducibly rearranged at each sexual cycle to produce a somatic genome of expressed or potentially expressed genes, stripped of repeated sequences, transposons, and AT-rich unique sequence elements limited to the germ line. We found the somatic genome to be compact (>68% coding, estimated from the sequence of several complete library inserts) and to feature uniformly small introns (18 to 35 nucleotides). This facilitated gene discovery: 722 open reading frames (ORFs) were identified by similarity with known proteins, and 119 novel ORFs were tentatively identified by internal comparison of the data set. Using random combined protein data from the project, we determined the phylogenetic position of <jats:italic>Paramecium</jats:italic> relative to eukaryotes with sequenced genomes by the distance matrix neighbor-joining method. The unrooted tree obtained is very robust and in excellent agreement with the accepted topology, providing strong support for the quality and consistency of the data set. Our study demonstrates that a random survey of the somatic genome of <jats:italic>Paramecium</jats:italic> is a good strategy for gene discovery in this organism.