This release features the latest assembly of the fugu (Takifugu rubripes) genome, FUGU5, provided by The Fugu Genome Sequencing Consortium in October 2011. The assembly is composed of 22 autosomal chromosomes and has a total sequence length of 391 Mb with 40 Mb of gaps. There are 7,091 scaffolds comprising 30,861 contigs, with a scaffold N50 of 929 kb and a contig N50 of 52.8 kb. The N50 size is the length such that 50% of the assembled genome lies in blocks of the N50 size or longer.
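As an illustration of the N50 definition above, the following is a minimal sketch of how an N50 value can be computed from a list of scaffold or contig lengths. The function name `n50` and the toy lengths are assumptions made for this example; they are not taken from the FUGU5 assembly or from the consortium's tooling.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    The N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembled length.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")

    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2

    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy lengths (hypothetical, not real fugu scaffolds).
toy_lengths = [100, 80, 60, 40, 20]
print(n50(toy_lengths))  # prints 80
```

In the toy case the total is 300, so the running sum first reaches 150 at the second-longest sequence, giving an N50 of 80.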
The full gene build on an older Fugu assembly, FUGU4.0, can be found on our main website.
What can I find? Protein-coding and non-coding genes, splice variants, cDNA and protein sequences, non-coding RNAs.
Preliminary gene annotation in fugu has been generated by a combination of alignments of nucleotide and protein sequences from a number of different sources. We aligned 20,318 human and 18,523 fugu translations from Ensembl release 77 to provide 5,600 and 18,259 gene models, respectively. In addition to these, we made 1,006 models by aligning fugu proteins from UniProt and RefSeq. The mitochondrial genome and its annotation have been imported from RefSeq. Alignments of 23,783 fugu EST sequences and 1,145 fugu cDNA sequences are also included in this preview site. Ab initio gene predictions and alignments of sequences from UniProt, UniGene and the ENA vertebrate RNA collection are also provided.
We produced RNASeq-based gene models and an indexed BAM file for each of the three tissue samples used by the RNASeq pipeline, and also for the merged data from all tissues. Each RNASeq-based gene model represents only the best supported transcript model. We ran a BLASTp search of these transcript models against UniProt proteins with protein existence (PE) levels 1 and 2 in order to annotate the open reading frame. The best BLAST hit is displayed as transcript supporting evidence.
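To make the protein existence restriction concrete, here is a minimal sketch of how a UniProt FASTA file could be reduced to entries at levels 1 and 2 before building a BLAST database. It relies on the `PE=` field that UniProt writes into its FASTA headers. The function name `filter_fasta_by_pe` and the example records are hypothetical, and this is not necessarily how the pipeline described above selects its targets.

```python
import re

# UniProt FASTA headers carry the protein existence level as "PE=<digit>".
PE_PATTERN = re.compile(r"\bPE=(\d)\b")


def filter_fasta_by_pe(lines, allowed=(1, 2)):
    """Yield the FASTA lines of records whose header PE level is in `allowed`.

    Records with no PE field in the header are dropped.
    """
    keep = False
    for line in lines:
        if line.startswith(">"):
            match = PE_PATTERN.search(line)
            keep = match is not None and int(match.group(1)) in allowed
        if keep:
            yield line


# Invented records for illustration only (not real UniProt entries).
records = [
    ">sp|X00001|EXA1_TAKRU Example protein one OS=Takifugu rubripes PE=1 SV=1",
    "MKTAYIAKQR",
    ">tr|X00002|X00002_TAKRU Example protein two OS=Takifugu rubripes PE=4 SV=1",
    "MSSPLLNQWE",
    ">tr|X00003|X00003_TAKRU Example protein three OS=Takifugu rubripes PE=2 SV=1",
    "MALWTRLLPL",
]

kept = list(filter_fasta_by_pe(records))
print(sum(1 for line in kept if line.startswith(">")))  # prints 2
```

Filtering on the header keeps the sketch independent of any sequence parsing library; a real workflow would read the lines from a file and write the surviving records out before indexing them for BLAST.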
The tissue-specific sets of transcript models built using our RNASeq pipeline are as follows:
|Tissue||Number of gene models|
|Assembly:||FUGU5, Oct 2011|
|Golden Path Length:||391,484,715|
|Genscan gene predictions:||30,293|