Genome variations are differences in the sequence of DNA from one person to the next. Just as you can look at two people and tell that they are different, you could, with the proper chemicals and laboratory equipment, look at the genomes of two people and tell that they are different, too. In fact, people are unique in large part because their genomes are unique.
How different is one human genome from another?
The more closely related two people are, the more similar their genomes. Scientists estimate that the genomes of unrelated people (any two people plucked at random off the street) differ at about 1 in every 1,200 to 1,500 DNA bases, or “letters.” Whether that’s a little or a lot of variation depends on your perspective. There are more than three million differences between your genome and anyone else’s. On the other hand, we are all 99.9 percent the same, DNA-wise. (By contrast, we are only about 99 percent the same as our closest relatives, chimpanzees.)
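The arithmetic behind these figures can be sketched in a few lines of Python. This is only a rough check: it assumes a genome of about 3.2 billion bases, a commonly quoted round number that is not given in this article, and the function names are made up for illustration.

```python
# Assumption: one copy of the human genome is roughly 3.2 billion bases long.
# This figure is not stated in the article; it is a commonly quoted round number.
GENOME_SIZE = 3_200_000_000


def expected_differences(bases_per_difference, genome_size=GENOME_SIZE):
    """Rough count of differing bases if 1 base in `bases_per_difference` varies."""
    return genome_size / bases_per_difference


def percent_identical(bases_per_difference):
    """Percentage of bases that match if 1 base in `bases_per_difference` varies."""
    return 100 * (1 - 1 / bases_per_difference)


# Compare the "1 in 1,000" rate implied by 99.9 percent with the quoted range.
for rate in (1000, 1200, 1500):
    millions = expected_differences(rate) / 1e6
    print(f"1 in {rate}: about {millions:.1f} million differences, "
          f"{percent_identical(rate):.2f} percent identical")
```

Under this assumption the estimates come out at a few million differences, the same order of magnitude as the figures quoted above, and every rate rounds to 99.9 percent identical.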
Most genome variations are relatively small and simple, involving only a few bases—an A substituted for a T here, a G left out there, a short sequence such as CT added somewhere else, for example. Your genome probably doesn’t contain long stretches of DNA that someone else’s lacks.
If the genome were a book, every person’s book would contain the same paragraphs and chapters, arranged in the same order. Each book would tell more or less the same story. But my book might contain a typo on page 303 that yours lacks, and your book might use a British spelling on page 135 (“colour”) where mine uses the American spelling (“color”).
If every human genome is different, what does it mean to sequence “the” human genome?
The human genome sequence announced in June 2000, a working draft, is a “representative” genome sequence based on the DNA of just a few individuals. The scientific paper was published in the February 16, 2001 issue of Science. Over the longer term, scientists will study DNA from many different people to identify where and what variations between individual genomes exist. Sequencing a genome is such a Herculean task that capturing its person-to-person variability on the first pass would be next to impossible.
But that doesn’t mean that the representative sequence we have now will be useless—far from it. The vast majority of the genome’s sequence is the same from one person to the next, with the same genes in the same places. In other words, my genome is a pretty good approximation of yours, and if scientists sequenced your genome they would learn a lot about mine. Moreover, since every person’s genome is unique, no one person is any more or less “representative” than any other and it hardly matters whose genome is sequenced first.
Why is every human genome different?
Every human genome is different because of mutations, “mistakes” that occur occasionally in a DNA sequence. When a cell divides in two, it makes a copy of its genome, then parcels out one copy to each of the two new cells. Theoretically, the entire genome sequence is copied exactly, but in practice a wrong base is incorporated into the DNA sequence every once in a while, or a base or two might be left out or added. These mistakes (“changes” might be a more accurate word, because they are not always bad news) are called mutations.
When a mutation occurs in a sex cell—a sperm or an egg—it can be passed along to the next generation of people. Your genome contains about 100 “new” mutations—changes that occurred as your parents’ bodies made the egg and sperm cells that became you. These genome variations are uniquely yours. Other variations in your genome arose many generations ago and have been passed down from parent to child over the years, until they ended up in you. You probably share each one of these older variations with many other people all over the world, but still, no one else has the exact same combination of variations that you have.
Where are genome variations found?
Variations are found throughout the genome, on every one of the 46 human chromosomes. But this variation is by no means distributed evenly: It’s not as if there is one difference every 1,000 bases as regular as rain. Instead, some parts of the genome are “hot spots” of variability, with hundreds of possible variations of a sequence. Other parts of the genome, meanwhile, don’t vary much at all between individuals; in scientific parlance, they are said to be “stable.”
The majority of variations are found outside of genes, in the “extra” or “junk” DNA that, as far as is known, has little effect on a person’s characteristics. Mutations in these parts of the genome are rarely harmful, so variations can accumulate without causing problems. Genes, by contrast, tend to be stable because mutations that occur in genes are often harmful to an individual, and thus less likely to be passed on.
What kinds of genome variations are there?
Genome variations include mutations and polymorphisms. Technically, a polymorphism (a term that comes from the Greek words “poly,” or “many,” and “morphe,” or “form”) is a DNA variation in which each possible sequence is present in at least 1 percent of people. For example, a place in the genome where 93 percent of people have a T and the remaining 7 percent have an A is a polymorphism. If one of the possible sequences is present in less than 1 percent of people (for example, 99.9 percent of people have a G and 0.1 percent have a C), then the variation is called a mutation.
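The 1 percent rule is simple enough to write down as a short Python sketch. The function name and the dictionary layout are invented here for illustration; the threshold and the two examples come from the definition above.

```python
def classify_variation(allele_frequencies, threshold=0.01):
    """Apply the 1 percent rule to one place in the genome.

    `allele_frequencies` maps each base seen at that place to the fraction
    of people who carry it, for example {"T": 0.93, "A": 0.07}.
    """
    if len(allele_frequencies) < 2:
        raise ValueError("a variation needs at least two possible sequences")
    # Polymorphism: every possible sequence is present in at least 1 percent of people.
    if all(freq >= threshold for freq in allele_frequencies.values()):
        return "polymorphism"
    # Otherwise at least one sequence is rarer than 1 percent.
    return "mutation"


print(classify_variation({"T": 0.93, "A": 0.07}))    # polymorphism
print(classify_variation({"G": 0.999, "C": 0.001}))  # mutation
```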
Informally, the term mutation is often used to refer to a harmful genome variation that is associated with a specific human disease, while the word polymorphism implies a variation that is neither harmful nor beneficial. However, scientists are now learning that many polymorphisms actually do affect a person’s characteristics, though in more complex and sometimes unexpected ways.
About 90 percent of human genome variation comes in the form of single nucleotide polymorphisms, or SNPs (pronounced “snips”). As their name implies, these are variations that involve just one nucleotide, or base. Any one of the four DNA bases may be substituted for any other—an A instead of a T, a T instead of a C, a G instead of an A, and so on.
Theoretically, a SNP could have four possible forms, or alleles, since there are four types of bases in DNA. But in reality, most SNPs have only two alleles. For example, if some people have a T at a certain place in their genome while everyone else has a G, that place in the genome is a SNP with a T allele and a G allele.
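To make the idea of alleles concrete, the Python sketch below scans a handful of short sequences and reports each position where more than one base appears, along with how many times each base was seen. The sequences are made up, they are assumed to be already lined up and equal in length, and the function name is hypothetical.

```python
from collections import Counter


def alleles_by_position(sequences):
    """Map each variable position (counting from 0) to a count of the bases seen there.

    Assumes the sequences are already aligned and all the same length.
    """
    if len({len(seq) for seq in sequences}) > 1:
        raise ValueError("sequences must all have the same length")
    variable = {}
    # zip(*sequences) walks through the sequences one position at a time.
    for position, column in enumerate(zip(*sequences)):
        counts = Counter(column)
        if len(counts) > 1:  # more than one base here, so the position varies
            variable[position] = counts
    return variable


# Made-up example: four people, one of whom has a T where the others have a G.
samples = ["ACGTTA", "ACGTTA", "ACTTTA", "ACGTTA"]
for position, counts in alleles_by_position(samples).items():
    print(position, dict(counts))  # 2 {'G': 3, 'T': 1}
```

In this toy example the only variable position has two alleles, G and T, which matches the typical case described above.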