Why genes have such weird names
by C.A. Sharp
First published in Double Helix Network News, Summer 2009, Rev. May 2013
Remember the good old days when genes were so simple for a breeder to understand? They were dominant, recessive and occasionally incompletely dominant. We figured there was a gene for almost every trait, though a few were polygenic so there wasn’t much you could do about them. And the names! The names were easy: A, B or maybe M. A really fancy one might be Tw. Then came all the genome research and scientists found that dogs don’t have a hundred thousand genes, but 20-30 thousand. And the names! ALX4, EPM2B, and HSF4 – where do they come up with these?
The truth is it was never simple. But years ago science knew only a little and the average person (which included most dog breeders) knew even less, and most of what was known didn’t have a direct bearing on what we do. Now that science is able to read the genetic code and has started figuring out exactly what genes do, there’s a huge amount of information out there that actually can be applied to not only breeding in general, but to specific breeds of dog.
It turns out that genes are part of a complex interconnected network. This network links not only genes, but other parts of the DNA and molecules within the cell that regulate and control the activity of genes. The names of the genes can tell us something about what role they play in that network. Most of this knowledge isn’t something we are going to apply to our dog breeding efforts on a daily basis, but understanding why the old terminology we used to describe specific genes has changed can help us absorb new scientific findings that may have a direct impact on our dogs’ health and the choices we make as breeders. The more we understand, the better equipped we will be to breed better, healthier dogs.
Actually those old single-letter gene names were themselves short-hand for more descriptive terms. The names were almost always related to traits like coat color that were easy to identify. A, B, M and Tw were abbreviations for gene names: Agouti, Brown, Merle and Tweed (the gene for a variation of merle sometimes called “harlequin” in Australian Shepherds.)
I’ll be using bold capital letters to help the reader understand how the abbreviations are derived from the gene names. In actual usage they are not bold. The abbreviations, however, are always in caps – unless the researcher works with mice, in which case they capitalize only the first letter. (Some people just have to be different.) For our purposes as dog breeders we will go with what the dog researchers use, which is the all caps abbreviations.
Gene names in Drosophila, the fruit fly favored by generations of geneticists, often describe mutations associated with those genes: buttonhead, wingless or hunchback. ALX4 (Aristaless-like homeobox 4) is a gene dogs and many other species share with Drosophila and the name is used for all these species even though the trait the name describes applies to insects: “Arista” refers to the bristle-like appendages on the end of the flies’ antennae.
As you can see with ALX4, abbreviations are still in vogue, though they’ve gotten longer. Our old favorite A is now ASIP, short for Agouti SIgnaling Protein. We can still use the old short-hand as a convenience amongst ourselves, but if we want to look up recent research on a particular gene we need to know the current scientific abbreviation.
So how do they come up with these weird names? One of the most common naming conventions is to use the protein the gene produces. Remember B? It’s short for “brown,” or what we in dogs call variously liver, red, or chocolate. That gene is now called TYRP1 (TYrosinase Related Protein 1.) Another example among canine coat color genes is MLPH (MeLanoPHilin) which we are more familiar with as D, or Dilute. A third example is AP3 (Adaptor-related Protein complex 3.) A mutation of AP3 causes cyclic neutropenia, or “grey collie syndrome,” a lethal congenital blood disorder in collies that also features an unusual grey coat color.
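For readers who keep their records on a computer, the old-to-new naming described above can be pictured as a simple lookup table. The following is only a minimal sketch in Python, not a tool any lab or registry uses. The names CURRENT_SYMBOL and current_symbol are made up for this illustration, and the table holds only the three coat color genes named so far.

```python
# Hypothetical lookup from old breeder shorthand to current gene symbols.
# Only genes discussed so far are listed; all names here are illustrative.
CURRENT_SYMBOL = {
    "A": ("ASIP", "Agouti"),
    "B": ("TYRP1", "Brown"),
    "D": ("MLPH", "Dilute"),
}


def current_symbol(old_letter):
    """Return the current symbol for an old shorthand letter, or None if not listed."""
    entry = CURRENT_SYMBOL.get(old_letter)
    return entry[0] if entry else None


print(current_symbol("B"))  # TYRP1
print(current_symbol("Q"))  # None
```

A letter that is not in the table comes back as None instead of a guess, which is the safer behavior when the goal is to search the research literature under the right name.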
You may have noticed that several of the genes mentioned so far have numbers as part of their names. There is a reason for that: Some genes belong to “families,” groups of genes with similar but slightly different end products. The number signifies which one of that family it is. The IGF (Insulin-like Growth Factor) family includes two genes. In humans these genes may be associated with eating disorders. A particular variation of IGF1 in dogs is associated with small body size. IGF2 also has an interesting canine connection: Whippets with one copy of an IGF2 mutation tend to be faster than those that lack it. However, having two copies makes the dog hyper-muscled, a trait referred to in the breed as “bully whippets” for their resemblance to the more muscular bully breeds, like American Staffordshire Terriers.
Remember the fruit fly gene ALX4? It’s part of what might be considered a sub-family (Aristaless-like homeoboxes) of a larger group of homeobox genes. Homeoboxes are a type of DNA sequence that regulates developmental patterns. Here’s another example from canine coat color: MC1R (MelanoCortin 1 Receptor.) You’ll note that the numbers sometimes appear before the “R” for “receptor,” but this isn’t consistent. MC1R is our old familiar “E” (Extension,) variants of which can give a dog a facial mask or yellow color.
Genes may also be named for what they do: Remember the odd fruit fly gene names, like “wingless,” mentioned earlier. HSF4 (Heat Shock transcription Factor 4) is a member of a gene family that activates another group of genes called heat response genes under conditions of heat or other types of stress. In dogs, we know HSF4 best for its association with cataracts in Boston Terriers, Staffordshire Bull Terriers and Australian Shepherds. Another function-named gene, MITF (MIcrophthalmia Transcription Factor) is associated with abnormally small eyes (microphthalmia) in some species though not, apparently, in dogs. In dogs it is another color gene, producing some, but not all, white spotting patterns. Our old favorite S (Spotting) isn’t a single gene after all. MITF is but the first to be identified.
Genes may be named because of association with a disease. EPM2B (Epilepsy Progressive Myoclonus 2B) causes a particular type of epilepsy specific to wirehaired Dachshunds. The human version of this gene causes Lafora Disease, a lethal neurological disorder. However, the name of the human version of the gene, called a “homologue,” is NHLRC1, or NHL (Non-Hodgkins Lymphoma) Repeat Containing 1. In humans the gene is also associated with a particular form of lymphoma, hence the name.
Genes like EPM2B/NHLRC1 wind up with different names in different species because they were discovered independently by researchers. In some cases the researchers may have been investigating different problems in the same species. In many cases scientific bodies have designated a particular form of the name as official in a given species. Ultimately, to save confusion, this will probably be the case with all genes across species.
Most genes do have homologues in different species, particularly those that are closely related. It isn’t surprising that you would find homologues among different mammals, but we and our dogs share some genes with species that aren’t closely related at all, like insects. (Remember ALX4?)
Merle color – the old color gene designation is M – is the result of a version of the SILV (SILVer) gene. The name derives from the mouse, where it was determined to be the cause of the color variation of the same name. Because of the mouse origin of the name, you will often see it noted as “Silv.” SILV is associated with diluted color in a variety of species, including horses, cattle, and even chickens. But not all genes are shared, even among related species. Homologues of SILV are found in a number of species, including the chimpanzee, but it may be absent in humans.
For once we have a short, simple name in SILV. However, there is another name that is becoming less used but will be found in older research on this particular gene: PMEL17 (Pre-MELanosomal protein 17.) Melanocytes are pigment cells, so the gene is involved with the development of those cells. In the case of merle, something interrupts the developmental pathway and diluted pigment is produced on some areas of the body.
Merle dogs aren’t silver (or not completely, in the case of some blue merles) but if a gene is originally named in a different species – in this case the mouse – the name may be related to what was observed in that other species. In the case of SILV, it was the silver color of the mice. For our old friend ALX4 it was the lack of bristle-like features on the fruit fly’s antennae. Merle and silver are different, but similar, traits. However, I think it’s safe to say none of our dogs have bristles on their antennae!
A gene’s name may be changed as science learns more about it. There is one that we in Australian Shepherds are very familiar with that has recently undergone a name change. MDR1 (Multi-Drug Resistance 1) has a mutated form that can cause severe reactions to some medications. A DNA test has been available for several years and is commonly used by dog owners in several collie-type breeds, including the Aussie, and a couple of sighthounds. The gene’s original name stems from cancer research. It was found to confer resistance to chemotherapy drugs. This is why it had the, to us, confusing name of Multi-Drug Resistance 1 when we associate it with increased drug sensitivity in our dogs.
Research on this gene is ongoing and scientists have recently discovered that what we call MDR1 is actually a member of a superfamily of genes called ATP (adenosine triphosphate) Binding Cassettes, or ABC. ATP is the fuel for cell operations. No ATP, nothing happens. Our old familiar MDR1 is now ABCB1, for ABC family, B subfamily, Gene 1. The lab that offers the test is keeping the old MDR1 name to save confusion among the dog-owning public.
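The way ABCB1 breaks down into family, subfamily and gene number can be sketched in a few lines of Python. This is an illustration only: the function name split_abc_symbol and the pattern are assumptions made for this example, based solely on the “ABC family, B subfamily, Gene 1” reading given above, and they are not an official parser for gene names.

```python
import re

# Assumed pattern: "ABC", then one subfamily letter, then the gene number.
ABC_PATTERN = re.compile(r"^ABC([A-Z])(\d+)$")


def split_abc_symbol(symbol):
    """Split a symbol such as 'ABCB1' into (family, subfamily, gene number)."""
    # Upper-casing first lets a mouse-style spelling such as 'Abcb1' match too.
    match = ABC_PATTERN.match(symbol.upper())
    if match is None:
        return None
    return ("ABC", match.group(1), int(match.group(2)))


print(split_abc_symbol("ABCB1"))  # ('ABC', 'B', 1)
print(split_abc_symbol("MDR1"))   # None
```

The old name MDR1 returns None because it does not follow the family-based pattern, which is exactly why the two names look unrelated even though they refer to the same gene.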
On a related note, even chromosomes have names. Different species have different numbers. While many genes are shared across species, the arrangements of genes on the chromosomes can be very different, so science has developed a short-hand method of describing chromosomes. Dog chromosomes are designated by CFA, followed by the number of the chromosome, or X and Y for the sex chromosomes. So why CFA? Canis FAmiliaris, the scientific name for the species. Therefore, dog chromosomes might be identified as CFA1, CFA32 or CFAX. The same system is used for other species: HSA – Homo SApiens or BTA – Bos TAurus (the cow.) So if you read something that says ASIP is on CFA24, you know that A, the agouti gene, is on your dog’s 24th chromosome.
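The chromosome short-hand is regular enough that it can be taken apart mechanically. Here is a minimal Python sketch of that idea. The names SPECIES and parse_chromosome are invented for this example, and the table covers only the three species prefixes mentioned above.

```python
import re

# Species prefixes mentioned in the article; other species are not covered here.
SPECIES = {
    "CFA": "Canis familiaris (dog)",
    "HSA": "Homo sapiens (human)",
    "BTA": "Bos taurus (cow)",
}

# Three-letter species prefix, then a chromosome number or X or Y.
CHROMOSOME_PATTERN = re.compile(r"^([A-Z]{3})(\d+|X|Y)$")


def parse_chromosome(label):
    """Split a label such as 'CFA24' into (species, chromosome); None if unrecognized."""
    match = CHROMOSOME_PATTERN.match(label)
    if match is None or match.group(1) not in SPECIES:
        return None
    return (SPECIES[match.group(1)], match.group(2))


print(parse_chromosome("CFA24"))  # ('Canis familiaris (dog)', '24')
print(parse_chromosome("CFAX"))   # ('Canis familiaris (dog)', 'X')
```

The chromosome is kept as text, not converted to a number, so that the sex chromosomes X and Y can be handled the same way as the numbered ones.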
All this alphabet soup can seem confusing, but there is a logic and purpose to it. Knowing a bit about how these names arose and what the abbreviations stand for can help us better understand the genes we manipulate when breeding dogs and, for those so inclined, make it easier for us to do deeper study of genes that are of particular interest to us.
The author would like to thank Drs. Sheila Schmutz, University of Saskatchewan; Danika Bannasch, University of California – Davis; and Katrina Mealey, Washington State University; for their assistance with the development of this article.