Washington University School of Medicine SNP Research Facility


[Archived] SNP Discovery in C. briggsae (build 3)

A major aim of the C. briggsae genetic map consortium is to develop a reliable set of single nucleotide polymorphisms (SNPs) for the organism. The current release contains 42,730 SNPs from the "HK104" mapping strain.

Build Numbers for SNP Releases
For clarity and tracking purposes, build numbers have been introduced for releases of C. briggsae SNPs. Two previously released sets of SNPs were labeled retroactively; both used the set of 13,632 HK104 sequence traces. "HK104 build 1" SNPs were identified by extracting substitution polymorphisms with quality scores of at least 30 from CrossMatch output. In the "HK104 build 2" release I added SNPs discovered by older builds of Polyphred and Ssaha-SNP (the latter results provided by Jim Mullikin). Both of these releases used a supercontig-based reference genome (AF16) from GSC/Wormbase (cb25/agp8) that I modified using some updates from Sanger (update.041011). There were no previous releases of SNPs for the VT847 strain, but they will carry the same build number as the HK104 SNPs to avoid confusion.
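The build 1 selection rule (substitutions only, quality score of at least 30) can be sketched as follows. This is a minimal illustration, assuming the CrossMatch discrepancies have already been parsed into records. The `Discrepancy` fields, the contig names, and the `MIN_QUALITY` constant are hypothetical and do not reflect the actual CrossMatch output format or the script used for the release.

```python
from dataclasses import dataclass

MIN_QUALITY = 30  # threshold stated for "HK104 build 1"


@dataclass(frozen=True)
class Discrepancy:
    """One read-vs-reference difference (hypothetical, pre-parsed record)."""
    contig: str
    position: int
    kind: str     # "substitution", "insertion", or "deletion"
    quality: int  # quality score of the discrepant read base


def select_build1_candidates(discrepancies):
    """Keep only substitutions whose quality score is at least MIN_QUALITY."""
    return [
        d for d in discrepancies
        if d.kind == "substitution" and d.quality >= MIN_QUALITY
    ]


# Made-up example records, for illustration only.
example = [
    Discrepancy("ctgA", 101, "substitution", 42),
    Discrepancy("ctgA", 250, "substitution", 18),  # dropped: quality below 30
    Discrepancy("ctgB", 77, "deletion", 40),       # dropped: not a substitution
    Discrepancy("ctgB", 90, "substitution", 30),   # kept: threshold is inclusive
]
kept = select_build1_candidates(example)
```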

Current Release (build 3) from SSAHA-SNP Results
In the current release, build 3, SNP discovery was performed on shotgun sequence traces of strains HK104 and VT847. In this round only the ssahaSNP program (SSAHA2) was used; it was found to be robust, efficient, and user-friendly. Polyphred (v5.04) and Polybayes (v3.0), by contrast, were unable to run efficiently when the entire read set and reference genome sequences were provided as input. The reference genome used for SNP discovery was obtained from Wormbase (cb25/agp8) and is organized by ultra (fingerprint) contig. The flanking sequences for build 3 SNPs were repeat-masked to lower case by RepeatMasker with a customized C. briggsae repeat library. Note that nearby SNPs have NOT yet been marked in the flanking sequences for this build.
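Because repeats are masked to lower case rather than replaced, a user of the flanking sequences can test for masking with simple case checks. The sketch below assumes each SNP comes with a left and a right flank as plain strings; the function names and the rule "both adjacent bases are lower case" are illustrative assumptions, not the criterion used to produce the counts in this release.

```python
def masked_fraction(flank: str) -> float:
    """Fraction of bases in a flank that are lower case (repeat-masked)."""
    bases = [c for c in flank if c.isalpha()]
    if not bases:
        return 0.0
    return sum(c.islower() for c in bases) / len(bases)


def adjacent_bases_masked(left_flank: str, right_flank: str) -> bool:
    """True if the bases immediately on both sides of the SNP are masked.

    Assumes left_flank ends, and right_flank begins, at the SNP position.
    """
    if not left_flank or not right_flank:
        return False
    return left_flank[-1].islower() and right_flank[0].islower()


# Made-up flanks, for illustration only.
in_repeat = adjacent_bases_masked("ACGTacgt", "ttgaACGT")
not_in_repeat = adjacent_bases_masked("acgtACGT", "TTGAacgt")
```

A check like this could be used, for example, to avoid placing assay primers in masked sequence.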

SNP Discovery Results (build 3)
                               HK104     VT847
Sequence traces examined      13,632    14,976
Unique SNP loci detected      42,730    35,005
SNPs in repetitive regions    12,464    10,735
SNPs in homopolymer runs       3,974     3,018
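The counts above are easier to compare between strains as proportions. The short sketch below derives them from the table values; the dictionary layout and the `summarize` helper are illustrative only.

```python
# Counts copied from the build 3 results table.
RESULTS = {
    "HK104": {"traces": 13632, "snps": 42730, "repeat": 12464, "homopolymer": 3974},
    "VT847": {"traces": 14976, "snps": 35005, "repeat": 10735, "homopolymer": 3018},
}


def summarize(counts):
    """Derive percentages and SNPs per trace from one strain's counts."""
    return {
        "pct_repeat": round(100 * counts["repeat"] / counts["snps"], 1),
        "pct_homopolymer": round(100 * counts["homopolymer"] / counts["snps"], 1),
        "snps_per_trace": round(counts["snps"] / counts["traces"], 2),
    }


summary = {strain: summarize(counts) for strain, counts in RESULTS.items()}
for strain, stats in summary.items():
    print(strain, stats)
```

By this arithmetic, about 29.2% of HK104 SNPs and 30.7% of VT847 SNPs fall in repetitive regions, and about 9.3% and 8.6%, respectively, fall in homopolymer runs.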

Processing and Integration of C. briggsae HK104 SNPs
Flanking sequences from the 25,317 HK104 build 2 SNPs were used to merge as many of them as possible with the build 3 set. Despite the differences in methods and in the reference genome, 11,762 SNPs from build 2 are included in build 3 under their original SNP identifiers ("cbXXXXX"). The remaining SNPs in build 3 were assigned new SNP IDs starting at cb40000. Next, the full set of HK104 build 3 SNPs was integrated with the recombination map by cross-referencing it with the C. briggsae Genetic Map (v3.1). This yielded 22,511 SNPs whose ultracontigs are included in the genetic map; the chromosome and genetic distance(s) for the ultracontig are provided with each of these SNPs. The remaining 20,219 SNPs, which lie on ultracontigs that have not been genetically mapped, were labeled with chromosome "CbUn" and a zero value for genetic distance.
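The ID assignment and map annotation described above can be sketched as follows. This is a simplified illustration, not the procedure actually used: it assumes build 2 and build 3 SNPs are matched by an exact lookup on a flanking-sequence key, and that the genetic map is available as a mapping from ultracontig name to chromosome and distance. All function names, key formats, ultracontig names, and example values are hypothetical.

```python
FIRST_NEW_ID = 40000     # new build 3 IDs start at cb40000
UNMAPPED = ("CbUn", 0.0)  # label for ultracontigs absent from the genetic map


def assign_ids(build3_flanks, build2_ids_by_flank, first_new=FIRST_NEW_ID):
    """Reuse a build 2 ID when the flank matches; otherwise issue a new ID."""
    ids = {}
    next_new = first_new
    for flank in build3_flanks:
        if flank in build2_ids_by_flank:
            ids[flank] = build2_ids_by_flank[flank]
        else:
            ids[flank] = f"cb{next_new}"
            next_new += 1
    return ids


def annotate(ultracontig, genetic_map):
    """Return (chromosome, genetic distance) for a SNP's ultracontig."""
    return genetic_map.get(ultracontig, UNMAPPED)


# Made-up inputs, for illustration only.
build2 = {"ACGT[A/G]TTCA": "cb00123"}
build3 = ["ACGT[A/G]TTCA", "GGCA[C/T]AATG", "TTAG[G/T]CCAT"]
ids = assign_ids(build3, build2)

genetic_map = {"ultracontigA": ("I", 1.5)}
mapped = annotate("ultracontigA", genetic_map)
unmapped = annotate("ultracontigZ", genetic_map)

# Consistency checks on the counts reported in the text.
assert 22511 + 20219 == 42730  # mapped + unmapped = all HK104 build 3 SNPs
new_id_count = 42730 - 11762   # build 3 SNPs that received new IDs
```

The counts in the text are internally consistent: the mapped and unmapped SNPs sum to the 42,730 in the release, and 30,968 of them carry new identifiers.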

Download:   C. briggsae HK104 SNPs (build 3)


Copyright 2007, Washington University School of Medicine SNP Research Facility. All rights reserved.