Subject: nomenclature
From: Paul Sternberg
Date: Mon, 22 Jul 2002 16:14:22 -0700
To: jonathan.hodgkin@bioch.ox.ac.uk, dbaillie@gene.mbb.sfu.ca, felix@ijm.jussieu.fr (Marie-Anne Felix), ralf.sommer@tuebingen.mpg.de
CC: tinoue@its.caltech.edu, Bhagwati Gupta, pws@caltech.edu

We have started to clone some C. briggsae mutations for which there are clear C. elegans orthologs, and have thought about names. What do you think of the following general nomenclature for other nematodes? Thanks.
Paul

DRAFT 1: Possible revision of gene names for non-C. elegans nematode species.

Under the current nomenclature system for non-C. elegans nematode species, each gene class in a given species has a unique three-letter gene class name that does not overlap with those of other species. For example, the C. briggsae equivalent of dpy is cby, and the equivalent of unc is mip. However, with over 1100 gene classes in C. elegans and an increasing number of species under study, this will soon become unmanageable. We therefore propose an alternative nomenclature system, one that allows genes with similar mutant phenotypes in other species to keep the same gene class name.


When a gene identified in another species belongs to a gene class with a clear equivalent in C. elegans, it should be given the same gene class name, with a species identifier in square brackets as a postfix. For example, a new Tra mutation in C. briggsae could be named "tra[Cb]-1", and a new Unc mutation in P. redivivus could be named "unc[Pr]-1".

Gene classes with no equivalent in C. elegans or other species will be given unique identifiers. The postfix indicates the equivalent gene class but does not indicate an orthologous relationship. In other words, the C. briggsae genes tra[Cb]-1, tra[Cb]-2 and tra[Cb]-3 do not necessarily correspond to C. elegans tra-1, tra-2 and tra-3.

The preexisting system of indicating orthology by a prefix (e.g. Cb-tra-1 for the C. briggsae ortholog of C. elegans tra-1) will be retained and used in parallel. If tra[Cb]-1 is discovered to be the C. briggsae ortholog of C. elegans tra-2, it would be renamed Cb-tra-2 and tra[Cb]-1 would be treated as a synonym.


Although this system would lead to similar names with different meanings (e.g. tra[Cb]-1, Cb-tra-1, Ce-tra[Cb]-1 etc.), it should be easy to follow if one keeps in mind that the prefix indicates orthology at the gene level and the postfix indicates the equivalent gene class. Preexisting species-specific gene names like cby could be renamed (dpy[Cb]) or retained.
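
As a rough illustration of how the prefix and postfix conventions in this draft could be told apart mechanically, here is a minimal Python sketch. The names parse_gene_name and GeneName, and the assumptions that species identifiers are exactly two letters (as in Cb, Ce, Pr) and gene classes exactly three lowercase letters, are illustrative choices and not part of the proposal.

```python
import re
from typing import NamedTuple, Optional


class GeneName(NamedTuple):
    gene_class: str                  # e.g. "tra"
    number: int                      # e.g. 1
    ortholog_species: Optional[str]  # prefix, e.g. "Cb" in Cb-tra-1
    class_species: Optional[str]     # postfix, e.g. "Cb" in tra[Cb]-1


# Assumed shape: [Xx-]abc[[Xx]]-N, where the prefix and postfix are optional.
_PATTERN = re.compile(
    r"^(?:(?P<prefix>[A-Z][a-z])-)?"     # orthology prefix, e.g. "Cb-"
    r"(?P<cls>[a-z]{3})"                 # three-letter gene class
    r"(?:\[(?P<postfix>[A-Z][a-z])\])?"  # gene-class postfix, e.g. "[Cb]"
    r"-(?P<num>\d+)$"                    # gene number
)


def parse_gene_name(name: str) -> GeneName:
    """Split a gene name into class, number, prefix and postfix."""
    m = _PATTERN.match(name)
    if m is None:
        raise ValueError(f"unrecognized gene name: {name!r}")
    return GeneName(m["cls"], int(m["num"]), m["prefix"], m["postfix"])


# The prefix marks orthology at the gene level; the postfix only marks
# the equivalent gene class in the named species.
for example in ("tra-1", "tra[Cb]-1", "Cb-tra-1", "Ce-tra[Cb]-1"):
    print(example, parse_gene_name(example))
```

Under these assumptions, tra[Cb]-1 and Cb-tra-1 parse to the same gene class and number but differ in which field holds "Cb", which is the distinction the draft asks readers to keep in mind.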

Discussion
Bhagwati suggests that gene numbers should start from 101 or some other high number to avoid confusion (e.g. between Cb-lin-11 and lin[Cb]-11). (Or, higher with unc and let).

Takao prefers low numbers for several reasons.
1) High numbers would be redundant with the postfix.
2) Three digit gene names like let-617 are hard to remember.
3) We can start tra[Cb] with 101, but if we start other species with 101 also, there would be confusion among non-C. elegans species. If we assign different number blocks to different species, numbers would get very large very soon. Also, this would require someone to keep track of which numbers correspond to which species.
4) If we assign a unique number block for C. briggsae, the tendency would be to abbreviate the postfix, which could cause problems later, especially if other species have the same number block. Having overlapping numbers would enforce the use of the postfix.
5) Gene names would get very long with the postfix and three-digit gene numbers.
Takao is most concerned with the redundancy and awkwardness of three-digit gene names.

26 squared is 676 and 26 cubed is 17,576; neither is sufficient.
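
One possible reading of this closing line is that it counts the available two-letter and three-letter names and compares them with what a system of non-overlapping names per species would need. The short Python sketch below works through that arithmetic; the assumption that every species needs roughly as many gene classes as C. elegans (about 1100) is ours and is not stated in the draft.

```python
LETTERS = 26
GENE_CLASSES_PER_SPECIES = 1100  # "over 1100 gene classes in C. elegans"

two_letter_names = LETTERS ** 2    # 676
three_letter_names = LETTERS ** 3  # 17,576

# Assumption: each species gets its own non-overlapping set of names,
# about as large as the C. elegans set.
species_supported = three_letter_names // GENE_CLASSES_PER_SPECIES

print(two_letter_names, three_letter_names, species_supported)
# prints: 676 17576 15
```

On this reading, two-letter names could not even cover C. elegans alone, and three-letter names would run out after about 15 species.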