TY - JOUR
T1 - Improved reference genome for the domestic horse increases assembly contiguity and composition
AU - Kalbfleisch, Theodore S.
AU - Rice, Edward S.
AU - DePriest, Michael S.
AU - Walenz, Brian P.
AU - Hestand, Matthew S.
AU - Vermeesch, Joris R.
AU - O'Connell, Brendan L.
AU - Fiddes, Ian T.
AU - Vershinina, Alisa O.
AU - Saremi, Nedda F.
AU - Petersen, Jessica L.
AU - Finno, Carrie J.
AU - Bellone, Rebecca
AU - McCue, Molly E.
AU - Brooks, Samantha A.
AU - Bailey, Ernest
AU - Orlando, Ludovic
AU - Green, Richard E.
AU - Miller, Donald C.
AU - Antczak, Douglas F.
AU - MacLeod, James N.
PY - 2018/12/1
Y1 - 2018/12/1
N2 - Recent advances in genomic sequencing technology and computational assembly methods have allowed scientists to improve reference genome assemblies in terms of contiguity and composition. EquCab2, a reference genome for the domestic horse, was released in 2007. Although of equal or better quality compared to other first-generation Sanger assemblies, it had many of the shortcomings common to them. In 2014, the equine genomics research community began a project to improve the reference sequence for the horse, building upon the solid foundation of EquCab2 and incorporating new short-read data, long-read data, and proximity ligation data. Here, we present EquCab3. The count of non-N bases in the incorporated chromosomes is improved from 2.33 Gb in EquCab2 to 2.41 Gb in EquCab3. Contiguity has also been improved nearly 40-fold with a contig N50 of 4.5 Mb and scaffold contiguity enhanced to where all but one of the 32 chromosomes is comprised of a single scaffold.
UR - http://www.scopus.com/inward/record.url?scp=85060566078&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85060566078&partnerID=8YFLogxK
U2 - 10.1038/s42003-018-0199-z
DO - 10.1038/s42003-018-0199-z
M3 - Article
C2 - 30456315
AN - SCOPUS:85060566078
VL - 1
JO - Communications Biology
JF - Communications Biology
SN - 2399-3642
IS - 1
M1 - 197
ER -