March 1, 2018

The largest family tree to date, which includes 13 million people and spans 11 generations and 500 years, offers new insights into marriage and death, and it was built entirely from public data.
The tree was created by a team led by Yaniv Erlich, a computer scientist at Columbia University who is also the scientific director of the genealogy company MyHeritage. Erlich's team downloaded 86 million public profiles from the ancestry site Geni.com (which is owned by MyHeritage). Many small family trees emerged, along with a huge one of 13 million people; around 85 percent of them are from the Western world. The tree, which is available online, includes anonymized data on when and where each person died. When Erlich's team analyzed the data for trends related to marriage and death, they found that genetics may play a smaller role in longevity than we think, and that the advent of mass transit was not the only reason we started to marry people outside the family. The results were published today in the journal Science.

In the family tree of 6,000 people above, individuals spanning seven generations are shown in green and marriages in red. Image: Columbia University

The project was possible partly because the Geni platform allows users to merge trees. "So if you put your tree and I put mine, and we share an Uncle Albert, the website would offer to merge the trees to create a much larger tree," says Erlich. This meant his team did not have to start from scratch and could rely on the work of the people who use the site. After downloading the raw data, the challenge was to clean it up and make sure it did not include records that were biologically impossible, such as people with three parents. Without clean data, the team could not run algorithms to analyze the information.
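As an illustration of the kind of cleanup described above, here is a minimal sketch of one sanity check: flagging anyone recorded with more than two parents. The article does not describe the team's actual pipeline, so the function name `find_impossible_parentage` and the `(child, parent)` pair format are assumptions made for this example.

```python
from collections import defaultdict


def find_impossible_parentage(parent_links):
    """Return {child: sorted parents} for every child with more than two parents.

    `parent_links` is a hypothetical input format: an iterable of
    (child_id, parent_id) pairs, one pair per recorded parent link.
    """
    parents_of = defaultdict(set)
    for child, parent in parent_links:
        # A set ignores duplicate links to the same parent.
        parents_of[child].add(parent)
    return {
        child: sorted(parents)
        for child, parents in parents_of.items()
        if len(parents) > 2
    }


# Made-up example records: "dan" is linked to three different parents.
links = [
    ("ann", "bob"), ("ann", "cat"),
    ("dan", "eve"), ("dan", "fay"), ("dan", "gus"),
]
bad = find_impossible_parentage(links)
print(bad)
```

Records flagged this way could then be reviewed or dropped before any analysis is run; which of those the researchers did is not stated in the article.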
For the analysis, the researchers focused on two topics: how long we live and who we choose to marry. By comparing the birthplaces of husbands and wives and tracking the distance between them over time, they found that, as expected, before the Industrial Revolution most Americans married someone born less than six miles from where they themselves were born. That person was likely a relative, a fourth cousin on average, says Erlich. After the Industrial Revolution, when transportation became more common, people began to marry those who were born farther away and who were more distantly related. (In 1950, people found their spouses within 60 miles of where they were born.)
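To make the distance measurement concrete, the sketch below computes the great-circle distance between two birthplaces with the haversine formula. The article does not say how the researchers computed distance, so the choice of formula, the function name `birthplace_distance_miles` and the latitude/longitude inputs are all assumptions for illustration.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, an approximation


def birthplace_distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    # Haversine formula
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


# Two hypothetical birthplaces one degree of longitude apart on the equator.
d = birthplace_distance_miles(0.0, 0.0, 0.0, 1.0)
print(round(d, 1))
```

Applied to each married couple and grouped by year of birth, a measure like this would yield the kind of trend the article describes, from under six miles to tens of miles.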
But the pattern shows that it's not just about transportation. Between 1800 and 1850, people traveled more and moved to cities en masse, but the genealogical distance between spouses stayed the same: in other words, people were still marrying their relatives. "This suggests that the advent of mass transit and the rail system was not the only reason why people decided not to marry their cousins," says Erlich. "There is a gap between the two, so it is likely that cultural factors also cause people to start marrying outside their group."
Next, death. The researchers analyzed the lifespans of 3 million relatives who were born between 1600 and 1910 and who lived beyond the age of 30 (the data did not include twins or people who died in wars). Genes obviously play a role in longevity: someone with a gene that makes them more likely to get cancer will probably have a shorter life. But environmental factors also matter a lot. By comparing each person's lifespan with that of their relatives, the researchers found that genes are responsible for about 16 percent of the variation in how long people lived. Peter Visscher, a quantitative geneticist at the University of Queensland who was not involved in the study, said he would have guessed that genes were responsible for 10 to 20 percent of longevity, which is in line with the authors' report, although some estimates have given numbers of up to 30 percent.
The results also suggest that the genes that influence longevity act independently rather than interacting with each other, a question that has been a major debate in the field of genomics. If the genetic variants worked together, there would be a much greater correlation in lifespan among relatives who are more closely related. For example, the correlation in lifespan should rise very rapidly when going from two first cousins to two identical twins. But that pattern did not appear in the data.
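For a rough sense of what "acting independently" predicts, the sketch below uses a textbook simplification: under a purely additive model, the expected correlation between two relatives is the heritability multiplied by their coefficient of relatedness. The 16 percent figure comes from the article; the function name and the model, which ignores shared environment and dominance effects, are assumptions for illustration and not the study's method.

```python
# Standard coefficients of relatedness (fraction of shared genetic variation).
RELATEDNESS = {
    "identical twins": 1.0,
    "full siblings": 0.5,
    "first cousins": 0.125,
}


def additive_expected_correlation(heritability, relatedness):
    """Expected trait correlation between relatives under a purely additive model."""
    return heritability * relatedness


HERITABILITY = 0.16  # about 16 percent, as reported in the article

for pair, r in RELATEDNESS.items():
    print(pair, additive_expected_correlation(HERITABILITY, r))
```

Under this simplification the correlation grows in proportion to relatedness. If variants interacted strongly, the correlation would instead climb much more steeply toward identical twins, which is the pattern the article says was absent from the data.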
Because the data is available for free online, there are many different questions it could help answer in the future, says Erlich, such as how migration affects fertility. In addition, MyHeritage now offers DNA testing. So if Geni.com users uploaded genetic data that matched the genealogy, scientists could answer even more questions about nature and nurture, Visscher wrote in an email to The Verge.
Meanwhile, it is really beautiful to see so many lives scattered all over the world displayed on an interlaced map.

The family tree above shows 70,000 relatives, connected through marriage (in red) and shared ancestors. Image: Columbia University


