Subject: Re: nema genetic nomenclature-2
From: Leon Avery
Date: Tue, 29 Oct 2002 13:16:28 -0600
To: Paul Sternberg, Jonathan Hodgkin
CC: Paul Sternberg, Bhagwati Gupta, Eric Haag, Marie-Anne Felix, Victor Ambros, CGC advisory, (Richard Durbin), Lincoln Stein, spieth John

question:  is Cb-tra-1 or Cbtra-1 preferable?

Cb-tra-1, definitely.  It puts a separator between different types of information.  Also, if the day ever arises when we need >3-letter gene classes, Cb-trac-1 can still be parsed, whereas Cbtrac-1 could be Cb-trac-1 or Cbt-rac-1.
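
To make the parsing argument concrete, here is a minimal Python sketch. It assumes, hypothetically, that species prefixes may be 2 or 3 letters long and gene classes 3 or 4 letters long; neither limit has been agreed on, and the function names are made up for illustration.

```python
def parse_hyphenated(name):
    """Split a name such as 'Cb-trac-1' at its hyphens: prefix, gene class, number."""
    prefix, gene_class, number = name.split("-")
    return [(prefix, gene_class, number)]


def parse_fused(name, prefix_lengths=(2, 3), class_lengths=(3, 4)):
    """List every reading of a name such as 'Cbtrac-1', where prefix and class run together."""
    letters, number = name.split("-")
    readings = []
    for n in prefix_lengths:
        prefix, gene_class = letters[:n], letters[n:]
        # Keep the split only if the remainder is a legal gene class length (assumed 3 or 4).
        if len(gene_class) in class_lengths:
            readings.append((prefix, gene_class, number))
    return readings


print(parse_hyphenated("Cb-trac-1"))  # one reading
print(parse_fused("Cbtrac-1"))        # two readings: Cb + trac, and Cbt + rac
```

With the separator there is only ever one reading; without it, the number of readings depends on which prefix and class lengths are allowed.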

Leon Avery                                        (214) 648-4931 (voice)
Department of Molecular Biology                            -1488 (fax)
University of Texas Southwestern Medical Center
6000 Harry Hines Blvd                  
Dallas, TX  75390-9148        

At 07:21 AM 10/29/2002 -0800, Paul Sternberg wrote:
Here is an attempt at convergence based on all the nice arguments.  I am particularly swayed by the wishes of those practicing intense classical genetics on other species (Ralf, Marie-Anne, Takao, Bhagwati, Eric), and thus think we need to use a dpy-LETTERSnumbers scheme for classically defined genes.
I propose that we submit draft guidelines to the WBG, and call for comments, since I am sure we have not run this by all interested parties.  A final version will then be sent to the next Gazette, and submitted for publication.

We still need some guidelines for orthologs. 

Revised genetic nomenclature for non-C. elegans nematode species  DRAFT 2002-10-29

Under the current nomenclature system for non-C. elegans nematode species, each gene class in a given species has a unique three-letter gene class name that does not overlap with the names used in other species. For example, the C. briggsae equivalent of dpy is cby, and the equivalent of unc is mip. However, with over 1100 gene classes in C. elegans and an increasing number of species under study, this will soon become unmanageable.  We therefore propose an alternative nomenclature system that will allow genes with similar mutant phenotypes in other species to keep the same gene class name.
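
As a rough back-of-the-envelope check on the scaling problem, the following Python sketch counts the available three-letter names. It assumes, purely for illustration, a 26-letter alphabet and that each species would eventually need about as many gene classes as C. elegans (1100); the second figure is an assumption, not a measurement.

```python
ALPHABET_SIZE = 26
CLASSES_PER_SPECIES = 1100  # C. elegans figure from the text; assumed similar for other species

total_names = ALPHABET_SIZE ** 3                        # all possible three-letter names
species_supported = total_names // CLASSES_PER_SPECIES  # species that fit without any overlap

print(total_names)        # 17576
print(species_supported)  # 15
```

Under these assumptions, non-overlapping three-letter names would run out after roughly 15 well-studied species.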

****Philosophy and constraints: 
a. From an informatician’s perspective, each genetic entity should have a unique name, and there should be an authority to maintain uniqueness. 
b. From a researcher’s perspective, the names should be easy-to-use and intuitive, and not generate confusing nicknames (think about what you would write on the side of your Petri plate). Subcommunities (e.g., those working on Pristionchus or briggsae) would tend to drop
c. If possible, the names should not stifle creativity.
d. From a classical geneticist’s point of view, there should be names that can be used for decades before the molecular identity of a locus is known.
e. From a molecular geneticist’s point of view, orthology should be obvious from the name.
f. However, the name should not confuse relationships among genes.
g. Other species names should not crowd out those in C. elegans.

Uniqueness (a) is the overriding concern.  Ease of use is the second priority.  Depending on the researcher, either (b, d) or maximizing (e) and minimizing (f) is more important.

N.B.  There are millions of species.

****The proposal:
1.  Orthologs will be given the same name but with a species prefix. For example, Cb-tra-1 for the C. briggsae ortholog of C. elegans tra-1.  In some cases, there will be paralogs and some confusion; we expect this to be minor compared to the convenience of orthologs having the same names.  (question:  is Cb-tra-1 or Cbtra-1 preferable?)

2. When a gene is identified in another species that belongs to a gene class with a clear equivalent in C. elegans, it should be given the same gene class name, but with a unique symbol. The symbol will include one or more letters followed by a number.
For example, C. briggsae genes could be dpy-cb1 OR dpy-B1 OR dpy-CAENORHABDITIISBRIGGSAE000000001 etc.  The organism’s community should decide on the exact implementation; this choice will be tracked by the CGC or WormBase.  A species prefix can be added but will be redundant, e.g., Cb-dpy-cb1 OR Cb-dpy-B1

3.  Gene classes with no equivalent in C. elegans or other species will be given unique three-letter-number names.  There could be a standard name class for “classically-defined genes” such as cdg- or gen- or ?

4.  For alleles, strains, polymorphisms, rearrangements, transgenes, and other variants, unique numbers (unique across all species) will be assigned by the relevant laboratory using the standard C. elegans nomenclature.  In all cases, a species prefix can be used, but is redundant.  For example, “syIs701” is an integrated transgene in C. briggsae from the Sternberg laboratory; it could be referred to as Cb-syIs701. In this case, syIs701 will never be used for something else, especially a C. elegans transgene.
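
The name forms in points 1 and 2 could be recognized with a single pattern, sketched below in Python. This is only an illustration under stated assumptions: that a species prefix is one capital letter followed by lowercase letters and a hyphen, that gene classes stay at three lowercase letters, and that the species-specific symbol is zero or more letters followed by a number. The names NAME_PATTERN and parse_gene_name are hypothetical, and the exact implementation is left to each organism's community.

```python
import re

NAME_PATTERN = re.compile(
    r"(?:(?P<species>[A-Z][a-z]+)-)?"  # optional species prefix, e.g. "Cb"
    r"(?P<gene_class>[a-z]{3})-"       # three-letter gene class, e.g. "dpy"
    r"(?P<letters>[A-Za-z]*)"          # species-specific letters; empty for ortholog names
    r"(?P<number>\d+)"                 # gene number, kept as a string
)


def parse_gene_name(name):
    """Return the parts of a gene name as a dict, or None if it does not fit the pattern."""
    match = NAME_PATTERN.fullmatch(name)
    return match.groupdict() if match else None


for example in ["Cb-tra-1", "dpy-cb1", "dpy-B1", "Cb-dpy-cb1"]:
    print(example, parse_gene_name(example))
```

Because the prefix is optional in this sketch, Cb-dpy-cb1 and dpy-cb1 yield the same gene class, letters, and number, which matches the statement that the prefix is redundant in that case.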

Responsibility for the numbering of a gene class will reside with the assigning laboratory, unless transferred by them to WormBase and the Caenorhabditis Genetics Center.  (As in the present practice, in some cases, if desirable, a small block of numbers can be assigned to another laboratory.)
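
One way to picture the uniqueness requirement for variant names is a single registry shared across species, as in the toy Python sketch below. The class NameRegistry and its methods are hypothetical and do not describe how the CGC or WormBase actually track names; the only rule taken from the text is that a name such as syIs701, once assigned, is never reused.

```python
class NameRegistry:
    """Toy registry: each variant name maps to exactly one (species, laboratory) record."""

    def __init__(self):
        self._records = {}

    def register(self, name, species, laboratory):
        # Refuse any name that is already in use, regardless of species.
        if name in self._records:
            raise ValueError(f"{name} is already assigned to {self._records[name]}")
        self._records[name] = (species, laboratory)

    def lookup(self, name):
        # Accept an optional, redundant species prefix such as "Cb-".
        prefix, sep, bare = name.partition("-")
        if sep and bare in self._records:
            name = bare
        return self._records.get(name)


registry = NameRegistry()
registry.register("syIs701", "C. briggsae", "Sternberg")
print(registry.lookup("Cb-syIs701"))

try:
    registry.register("syIs701", "C. elegans", "another laboratory")
except ValueError as err:
    print(err)
```

The second registration fails because the name is already taken, which is the behavior the draft asks for across all species.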

Jonathan Hodgkin, Paul Sternberg, Ralf Sommer, David Baillie, Marie-Anne Felix, Donald Riddle, Takao Inoue, Bhagwati Gupta, Eric Haag, Erik Jorgensen, Iva Greenwald, Susan Strome, Victor Ambros, ….
TO BE CONSULTED:   Ronald Ellis, David Fitch, Mark Viney, Mark Blaxter, Jim McCarter,….others?

A.  Should we allow more letters?

****Arguments (not for inclusion in the draft):
dpy(e1111)(e1112) is not viable for the e1112 allele of dpy(e1111). 

Subcommunities can still talk in shorthand.
For example,  would use dpy-1 rather than dpy-CAENORHABDITIISBRIGGSAE000000001 just as I might write ‘d1’ on a plate. 

An alternative to the letter/number scheme is to start with dpy-1000 for other species.

If someone has a particularly nice example of real alleles, that would be good.