Hamadryas baboon

Genome assembly: Pham_1.0

This release features the first preliminary assembly of the hamadryas baboon (Papio hamadryas) genome, Pham_1.0, provided by the Baylor College of Medicine Human Genome Sequencing Center in November 2008. The genome was sequenced to approximately 5.3x coverage, corresponding to about 16 Gb of sequence from 47.6 million reads, which include Sanger whole-genome shotgun reads, 454 fragment reads and 454 paired-end reads from small-insert clones.
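As a rough sanity check on these figures, the sketch below derives the mean read length and the genome size implied by the quoted coverage. It assumes the simple relation coverage = total sequenced bases / genome size; the function names are illustrative only, and the inputs are the rounded values from the paragraph above, so the results are approximate.

```python
# Rounded figures quoted in the assembly description.
TOTAL_SEQUENCE_BP = 16e9   # about 16 Gb of raw sequence
READ_COUNT = 47.6e6        # 47.6 million reads
COVERAGE = 5.3             # approximately 5.3x


def mean_read_length(total_bp, read_count):
    """Average read length in bp across all read types combined."""
    return total_bp / read_count


def implied_genome_size(total_bp, coverage):
    """Genome size in bp implied by coverage = total_bp / genome_size."""
    return total_bp / coverage


print(round(mean_read_length(TOTAL_SEQUENCE_BP, READ_COUNT)))            # 336
print(round(implied_genome_size(TOTAL_SEQUENCE_BP, COVERAGE) / 1e9, 2))  # 3.02
```

The mean of roughly 336 bp blends the longer Sanger reads with the shorter 454 reads, so it does not describe either read type on its own.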

The combined sequence reads were assembled using the Atlas genome assembly system into a set of contigs and scaffolds. There are 387,373 scaffolds in this assembly, representing either sets of contigs that can be ordered and oriented with respect to each other, or isolated contigs that could not be linked. The total length of the assembly is 2.87 Gb, including 125 Mb of gaps. The N50 of the contigs is 7.1 kb and the N50 of the scaffolds is 85 kb. (The N50 size is the length such that 50% of the assembled genome lies in blocks of the N50 size or longer.)
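The N50 definition can be illustrated with a minimal sketch. The function name and the toy lengths below are hypothetical and are not taken from the Pham_1.0 assembly; the code only follows the definition given above (sort blocks from longest to shortest and stop once half of the total length is covered).

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    The N50 is the length L such that blocks of length L or longer
    together contain at least 50% of the total assembled length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Toy example: total length is 300, so the N50 is reached at 150.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))  # 70
```

In the toy example the two longest blocks (80 + 70 = 150) already cover half of the total, so the N50 is 70 even though five of the seven blocks are shorter than that.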


Gene annotation

What can I find? Protein-coding and non-coding genes, splice variants, cDNA and protein sequences, non-coding RNAs.

Annotation

Preliminary gene annotation in baboon has been generated by aligning proteins from two different sources: Ensembl human proteins from the May 2009 genebuild (Ensembl release 55, GRCh37 assembly) and Papio hamadryas baboon-specific proteins obtained from UniProtKB and NCBI. These produced 12,524 and 304 baboon gene models, respectively. For putative gene models generated by the alignment of Ensembl human proteins, links to the supporting evidence on the Ensembl human protein summary page are provided. In addition to the preliminary gene predictions, ab initio gene predictions, alignments of sequences from several public databases (e.g. UniGene, EMBL Vertrna, UniProt) and baboon cDNAs from NCBI can also be viewed at this site.

Genome statistics

Assembly: Pham, Nov 2008
Database version: 75
Base Pairs: 2,761,844,948
Golden Path Length: 2,867,548,133
Genscan gene predictions: 76,208