On Tue, 29 Oct 2002, Paul Sternberg wrote:
We still need some guidelines for orthologs.
Here's one idea. Say that Pristionchus pacificus has three
genes, paralogous to one another, but, as a group, orthologous to
let-60 in C. elegans (this is probably wildly counterfactual, but
let's ignore that for the sake of discussion!). Then we could use
the following scheme for naming them:
Ppa-let-60.1
Ppa-let-60.2
Ppa-let-60.3
which would keep both the intragenomic paralogy and the intergenomic
orthology clear, and would be easy to keep track of with respect to
other genes and genomes.
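
For what it's worth, here is a rough Python sketch of how names in this scheme could be generated and taken apart again. The function names (paralog_names, parse_name) and the regular expression are made up for illustration only; in particular, the pattern assumes a gene name is lowercase letters, a hyphen, and a number, as in let-60.

```python
import re

# Assumed shape: <species prefix>-<gene class>-<number>[.<paralog index>]
# e.g. "Ppa-let-60.2". This pattern is an illustrative assumption, not a standard.
_NAME = re.compile(
    r"^(?P<species>[A-Za-z]+)-(?P<gene>[a-z]+-\d+)(?:\.(?P<paralog>\d+))?$"
)


def paralog_names(species, ortholog, count):
    """Build names for `count` paralogs that are jointly orthologous to `ortholog`."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return [f"{species}-{ortholog}.{i}" for i in range(1, count + 1)]


def parse_name(name):
    """Split a name into (species prefix, orthologous gene, paralog index or None)."""
    m = _NAME.match(name)
    if m is None:
        raise ValueError(f"not a recognized gene name: {name!r}")
    paralog = m.group("paralog")
    return m.group("species"), m.group("gene"), int(paralog) if paralog else None


print(paralog_names("Ppa", "let-60", 3))
print(parse_name("Ppa-let-60.2"))
```

The point of the sketch is only that the species, the orthologous gene, and the paralog index each stay recoverable from the name by itself.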
Two letters for a species will probably not be enough. How
about three letters, the first being taken from the genus, the
second two from the species? So, "Cbr" for C. briggsae, "CCB" for
Caenorhabditis CB5161 [note that there's no classical species name
yet for this, or for Caenorhabditis PS1010], "Ppa" for Pristionchus
pacificus, etc.
(If we're really determined to not stumble over those millions
of species, maybe we can go to six letters: "Caebri" for C.
briggsae, "Pelpun" for Pelodera punctata, "Pripac" for Pristionchus
pacificus, etc.)
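
Both the three-letter and the six-letter abbreviations can be derived mechanically from the genus and species (or strain) name. A minimal Python sketch follows, assuming the name is given as two whitespace-separated words; species_prefix and its parameters are hypothetical names for illustration, not anything that exists in a database.

```python
def species_prefix(name, genus_len=1, epithet_len=2):
    """Abbreviate 'Genus epithet' to a prefix such as 'Cbr' or 'Caebri'.

    The letters of the second word keep their case, so that a strain
    designation such as 'CB5161' yields 'CCB' rather than 'Ccb'.
    """
    parts = name.split()
    if len(parts) < 2:
        raise ValueError(f"expected a genus and a species or strain: {name!r}")
    genus, epithet = parts[0], parts[1]
    return genus[:genus_len].capitalize() + epithet[:epithet_len]


# Three-letter form: one letter from the genus, two from the species.
print(species_prefix("Caenorhabditis briggsae"))
print(species_prefix("Caenorhabditis CB5161"))

# Six-letter form: three letters from each.
print(species_prefix("Pelodera punctata", genus_len=3, epithet_len=3))
```

Nothing in this sketch checks whether two species end up with the same abbreviation; that would still need a lookup against whatever list of assigned prefixes is kept.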
(question: is Cb-tra-1 or Cbtra-1 preferable?)
As Marie and Leon have both said already, "Cb-tra-1" (or
"Cbr-tra-1", or "Caebri-tra-1") -- it's clearer, and easier to
revise.
There could be a standard name class for “classically-defined
genes” such as cdg- or gen- or ?
This has to be the ultimate molecular biologist's way of
classifying genetic loci. But, yes, "gen-" might well be a useful
catch-all for those pesky actual mutations that haven't been
properly cloned and sequenced yet.
An alternative to the letter/number scheme is to start with dpy-1000 for
other species.
In practice this probably fails the "write on plate" test.
--Erich Schwarz