With hundreds of thousands of plant species in the world today, researchers at the University of Arizona are building a road map of the history and evolution of plant life through gene sequencing.

Using genetic data from more than 1,100 species, the team helped to create the most comprehensive evolutionary tree for green plants to date.

The One Thousand Plant Transcriptomes Initiative, or 1KP, is a global collaboration of nearly 200 scientists to examine the diversification of plant species over the last 1 billion years. All of the plants living in the world today have evolved from a common ancestor. With this project, researchers hope to shed some light on how that single ancestor gave rise to today’s biodiversity.

For the past nine years, the team has collected data on gene families that have contributed to the evolution of green algae, mosses, ferns, conifers, flowering plants and all other lineages of green plants.

“In the tree of life, everything is interrelated,” said Gane Ka-Shu Wong, lead investigator and professor in the University of Alberta Department of Biological Sciences. “And if we want to understand how the tree of life works, we need to examine the relationships between species. That’s where genetic sequencing comes in.”

For Mike Barker, a professor of ecology and evolutionary biology at the UA and one of the study’s lead authors, examining the genomes of these plants allows researchers to answer important questions about the evolution of plant life. According to Barker, some of these gene families have duplicated over millions of years, causing new species of plants to form.

“Whole genome duplications are actually really common in plants,” Barker said. “This is also called polyploidy and it’s basically where instead of having a diploid individual, like you and I, with one chromosome from mom and one chromosome from dad, polyploid individuals have at least two chromosomes from mom and two from dad.”

One goal of the project is to identify a potential connection between these genetic duplications and plant innovations that have happened over time, such as the ability to grow tall and the development of flowers and seeds.

When the first plant genome, that of Arabidopsis thaliana, was sequenced in 2000, researchers found that the plant had five chromosomes and saw evidence of three different rounds of genome duplication throughout its history.

“Since then, we’ve had sort of these glimpses of duplication events deep in the history of plants, but we’ve never had a single framework to analyze and figure out exactly when and where these things happen,” Barker said. “The other big question that we’ve been interested in is not only knowing that the duplication happened, but really how many times it happened in the history of different species. Every time these events happen, they double all of the genes and then things go kind of crazy.”

Researchers believe these duplication events are correlated with some big variations in flowering plants over time. There’s also evidence that plants with duplicated genomes are more likely to survive through harsh environmental changes. For example, most of the plants that survived the dinosaur-killing asteroid that hit Earth 66 million years ago were polyploids.

“A lot of these events that we’re seeing are from some of these periods of mass extinction and climate change,” Barker said.

“So, figuring out precisely where these happen and how many times a species has experienced them in their history could tell us something about not only what happened over the last 100 million years or 200 million years, but also potentially where all of the diversity and the structure of the genome comes from today.”

In related research, Barker and other plant scientists found that polyploids have faster niche shifts, meaning they are more versatile and can better tolerate environmental changes.

One of the biggest surprises of the study, Barker said, was that they found no evidence of genome duplications in algae.

“We found that the average flowering plant genome has nearly four rounds of ancestral genome duplication dating as far back as the common ancestor of all seed plants more than 300 million years ago,” he said. “We also find multiple rounds of genome duplication in fern lineages, but there is little evidence of genome doubling in algal lineages.”

Being able to access a variety of plant species was another crucial aspect of the project. While genome data is available for some plants already, the researchers wanted to use plants that were difficult to find or haven’t been studied as much as others.

“We sequenced so many transcriptomes in all these plant genomes from all corners of the plant tree of life,” Barker said. “So instead of just having crop species, which is most of what we have because it’s economically important, we went out and got all sorts of weird stuff.”

Along with the work of researchers, supercomputers at the UA also played a key role in the project. The team used high-performance computing facilities at the university to process the genetic sequences from plant samples and map the data onto more than a half-million “family trees” showing the relationships among gene families.

UA undergraduates Thomas Kidder, Sally Galuska and Chris Reardon worked closely with Zheng Li, a doctoral student in Barker’s lab, to analyze hundreds of thousands of gene trees.

The team has been making all of this data available for other scientists to use with the help of CyVerse, a national project housed at the University of Arizona that provides computational infrastructure and data science training for life sciences research. CyVerse, which used to be called iPlant, was originally designed to support and enable the 1KP project, said Ramona Walls, CyVerse senior science informatician.

“Seeing 1KP come to fruition is almost like watching your kid graduate from college,” she said. “It is especially gratifying to have all of the data publicly available in one place on the CyVerse Data Commons, because it means that even more science can continue to come out of the project.”

The genome data that they’ve collected and published on CyVerse has been used in a variety of research, even outside of plant sciences, Barker said. For example, neuroscientists have used the diverse photoreceptors in plants, which are used to perceive light, to study brain function in other organisms.

“We brought together a huge team of people to sequence all of these things and it tells us a bunch of new stories about how plants may have evolved,” Barker said. “It really provides us with, for the first time, a genomic road map of the history of plant life.”



Contact reporter Jasmine Demers at jdemers@tucson.com

On Twitter: @JasmineADemers