This is the preliminary display of the GRCh38 assembly of the human genome (Homo sapiens, GCA_000001405.15), produced in December 2013 by the Genome Reference Consortium. It consists of 24 chromosomes (1-22, X and Y), 127 unplaced scaffolds and 42 unlocalized scaffolds. GRCh38 contains 261 alt loci scaffolds (including haplotypes for the MHC region on chromosome 6 and LRC region on chromosome 19), in 35 alternate assembly units. 72 of these alternate loci were previously available as NOVEL patches to GRCh37.
The N50 of the contigs of the submitted assembly is 56.4 Mb and the N50 of the scaffolds is 67.8 Mb. The N50 size is the length such that 50% of the assembled genome lies in blocks of the N50 size or longer. Modeled centromere sequences have been incorporated.
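The N50 definition above can be made concrete with a short calculation. The following is a minimal sketch, not the method used by the Genome Reference Consortium: the function name `n50` and the toy lengths are invented for illustration and are not taken from the GRCh38 assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    The N50 is the largest length L such that sequences of length L
    or longer together cover at least 50% of the summed length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point division.
        if 2 * running >= total:
            return length


# Toy contig lengths (hypothetical, not real assembly data).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))  # 70: the 80 and 70 blocks cover 150 of 300
```

Applied to the contig or scaffold lengths of a real assembly, the same calculation would give figures comparable to those quoted above, provided the same set of sequences is used as input.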
What can I find? Protein-coding and non-coding genes, splice variants, cDNA and protein sequences, non-coding RNAs.
Preliminary transcript structures based on available human protein sequences are shown along with structures based on projections from the Ensembl release 75 human gene set (GRCh37) and RefSeq genes from February 2014. Alignments of human cDNA and EST sequences are also provided, as are ab initio predictions and alignments of sequences from UniProt, UniGene and the ENA vertebrate RNA collection.
|Assembly:||GRCh38, Dec 2013|
|Golden Path Length:||3,099,750,718 bp|
|Preliminary transcript models:||128,417|
|Imported RefSeq genes:||26,670|
|GENCODE 19 genes projected from GRCh37:||61,349|
|Genscan gene predictions:||50,117|
|Projected short variants:||67,990,369|
What can I find? Short sequence variants from Ensembl release 75, projected onto the GRCh38 assembly. Variation consequences are also provided, both for variants overlapping transcripts that have been projected from Ensembl release 75 to the new assembly and based on RefSeq transcripts. Variation data are available as GVF and VCF data dumps.
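For readers working with the VCF dumps, the sketch below shows one way the fixed leading columns of a VCF data line (CHROM, POS, ID, REF, ALT) might be read. It is an illustration only: the function names and the example records are invented and do not come from the actual dump files, and a dedicated VCF library would be a better choice for real analyses.

```python
def parse_vcf_line(line):
    """Parse the first five tab-separated columns of a VCF data line."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, var_id, ref, alt = fields[:5]
    return {
        "chrom": chrom,
        "pos": int(pos),                           # 1-based position
        "id": None if var_id == "." else var_id,   # "." means no identifier
        "ref": ref,
        "alts": alt.split(","),                    # ALT alleles are comma-separated
    }


def read_vcf(lines):
    """Yield parsed records, skipping header lines (starting with '#') and blanks."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        yield parse_vcf_line(line)


# Hypothetical records for illustration only (not real variants).
EXAMPLE_LINES = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t1000\texample_1\tA\tG\t.\t.\t.",
    "X\t2500\t.\tC\tT,G\t.\t.\t.",
]

for record in read_vcf(EXAMPLE_LINES):
    print(record["chrom"], record["pos"], record["ref"], record["alts"])
```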