Pan troglodytes

Genome assembly: Pan_troglodytes-2.1.3

This site displays version 2.1.3 of the chimpanzee genome assembly (known as Pan_troglodytes-2.1.3). The assembly covers about 97 percent of the genome and is based on 6X sequence coverage. It is composed of 192,898 contigs with an N50 length of 44 kb and 33,990 supercontigs with an N50 length of 8.4 Mb. The whole genome shotgun data from primary donor-derived reads (Clint, a captive-born male chimpanzee from the Yerkes Primate Research Center, Atlanta, USA) were assembled with PCAP (Huang 2006) using stringent parameters. These parameters were derived by eliminating detectable global mis-assemblies larger than 50 kb, that is, interchromosomal cross-overs identified by aligning the chimpanzee genome against the human genome.
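For readers unfamiliar with the N50 statistic quoted for contigs and supercontigs, the following is a minimal sketch of how it is commonly computed: the N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled length. The function name and the toy lengths are illustrative assumptions, not data from this assembly.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    summed length of all sequences.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")

    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable: running total always reaches half")


# Toy contig lengths (hypothetical, in kb), not taken from the assembly.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_contigs)
print(f"N50 of toy contigs: {toy_n50} kb")
```

In this toy set the total is 300 kb, and the two longest contigs (80 and 70 kb) already sum to 150 kb, so the N50 is 70 kb.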

The full gene build of the most recent chimpanzee assembly, Pan_troglodytes-2.1.4, can be found on our main website.

Gene annotation

What can I find? Protein-coding and non-coding genes, splice variants, cDNA and protein sequences, non-coding RNAs.


This preview site includes alignments of chimpanzee EST and cDNA sequences and human cDNA sequences downloaded from the NCBI. Alignments were generated using Exonerate, and those with a coverage or percent identity below 97% were removed, leaving 8,894 chimpanzee cDNA, 46,222 chimpanzee EST and 1,469,876 human cDNA alignments. Preliminary gene annotation was generated by aligning chimpanzee proteins from RefSeq and UniProtKB. From these two sources, 1,517 coding gene models were predicted. In addition, ab initio gene predictions and alignments of sequences from several public databases (e.g. UniGene, EMBL Vertebrate RNA, UniProtKB) are available.
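The filtering rule described above (discard any alignment whose coverage or percent identity falls below 97%) can be sketched as follows. This is only an illustration of the threshold logic: the Alignment record, its field names and the example values are hypothetical, and the sketch does not parse real Exonerate output or reproduce the actual pipeline.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Threshold stated in the text: both measures must reach 97%.
MIN_PERCENT = 97.0


@dataclass(frozen=True)
class Alignment:
    """Hypothetical alignment record; field names are assumptions."""
    query_id: str
    coverage: float          # percent of the query covered, 0-100
    percent_identity: float  # percent identity over the aligned region, 0-100


def passes_threshold(aln: Alignment, minimum: float = MIN_PERCENT) -> bool:
    """Keep an alignment only if coverage AND identity both reach the minimum."""
    return aln.coverage >= minimum and aln.percent_identity >= minimum


def filter_alignments(alignments: Iterable[Alignment],
                      minimum: float = MIN_PERCENT) -> List[Alignment]:
    """Return the alignments that survive the coverage/identity filter."""
    return [aln for aln in alignments if passes_threshold(aln, minimum)]


# Made-up example records, not real data.
examples = [
    Alignment("cdna_A", coverage=99.2, percent_identity=98.7),  # kept
    Alignment("cdna_B", coverage=96.9, percent_identity=99.5),  # low coverage
    Alignment("est_C", coverage=98.0, percent_identity=95.0),   # low identity
    Alignment("est_D", coverage=97.0, percent_identity=97.0),   # kept (at threshold)
]
kept_ids = [aln.query_id for aln in filter_alignments(examples)]
print(kept_ids)
```

Because the text removes alignments with either value "less than 97%", a record sitting exactly at 97% on both measures is retained in this sketch.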

Genome statistics

Assembly: Pan_troglodytes-2.1.3, Oct 2010
Database version: 75
Base Pairs: 2,996,674,299
Golden Path Length: 3,307,943,878
Genscan gene predictions: 56,616
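
The two length statistics above can be compared with a short calculation. The sketch below assumes that Golden Path Length counts the full top-level assembly including gaps, while Base Pairs counts only sequenced bases; that interpretation, along with the function and variable names, is an assumption and is not stated on this page.

```python
# Values copied from the genome statistics listed above.
BASE_PAIRS = 2_996_674_299
GOLDEN_PATH_LENGTH = 3_307_943_878


def unsequenced_bases(golden_path: int, base_pairs: int) -> int:
    """Positions in the golden path not backed by sequenced bases
    (under the assumption described in the lead-in)."""
    return golden_path - base_pairs


def sequenced_fraction(golden_path: int, base_pairs: int) -> float:
    """Fraction of the golden path made up of sequenced bases."""
    return base_pairs / golden_path


gap_bases = unsequenced_bases(GOLDEN_PATH_LENGTH, BASE_PAIRS)
fraction = sequenced_fraction(GOLDEN_PATH_LENGTH, BASE_PAIRS)

print(f"Unsequenced positions: {gap_bases:,}")
print(f"Sequenced fraction of golden path: {fraction:.1%}")
```

Under that assumption, the difference between the two figures is 311,269,579 positions.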