Some plants are very sensitive to stresses, while others are hardy and continue to grow unperturbed when the field gets hotter or drier.
The growing burden on agriculture from accumulating stresses such as heat and drought is increasing the pressure on growers and breeders to use stable crop varieties that continue to produce good yields under these adverse conditions.
Figure 1: Camelina sativa (gold of pleasure). © Fornax, Wikipedia
Only with the right crop varieties can we continue to grow crops that are as productive and resource-efficient as possible in a changing climate (see “Growing the right crop varieties protects the environment and secures our food supply” in this blog). To this end, people are also revisiting varieties and species that were grown a long time ago and that are said to have a higher stress tolerance. Gold of pleasure is an oilseed, like rapeseed, that was cultivated for many centuries before being displaced by higher-yielding rapeseed. However, gold of pleasure shows a high natural tolerance to high temperatures and drought.
Understanding the mechanisms of stress tolerance in the ancient traditional oilseed gold of pleasure (Camelina sativa) is the goal of the UNTWIST project (short for “Uncover and Promote Tolerance to Temperature and Water Stress in Camelina sativa”). Eight partners from six different EU countries are working together on this project. UNTWIST, which is funded by the EU, is investigating at least 50 different gold of pleasure varieties, including wild varieties, landraces and commercial cultivars, for their tolerance to drought and heat.
The aim is to identify cultivars that can be used to breed particularly robust varieties for the future.
In the process, very large amounts of data will be generated at the various research sites in many different studies, and these data need to be recorded and interpreted as a whole. The project therefore needs a common storage location, a “database”, in which all information and data are kept together. In addition, the researchers in the project need a common language so that, for example, conditions, locations, varieties, plant parts and samples are always named in the same way and can be found again by all project partners and, in the future, by other researchers as well.
The database is therefore built strictly according to the so-called “FAIR” principles, which require data to be Findable, Accessible, Interoperable and Reusable. In Jülich, the IBG-4 institute division will organize the large data sets produced by the various partners, link them in a database, the “UNTWIST knowledge hub”, and make them retrievable and usable for everyone. The project will collect large sets of sample material and, subsequently, very large amounts of genome data, gene expression data, data on plant constituents (for example oil composition), and data on external appearance, stress responses and yield under different environmental and site conditions. In the end, this should give a picture of the overall response of the plants across many processes.
It is impossible to keep track of these data without an appropriate database. Gold of pleasure, for example, has a comparatively large genome, that is, a large amount of the hereditary substance DNA (deoxyribonucleic acid): 641 “megabases” (641,000,000 DNA building blocks), in which almost 90,000 genes are present.
Genes are the organized subunits of the hereditary substance, which in their sequence of building blocks represent the blueprints for proteins with corresponding functions in the plant.
Some of these many genes could be responsible for the stress tolerance of certain gold-of-pleasure varieties, or be involved in the expression of the tolerances.
If you now look at how these genes are switched on or off in gold of pleasure, for example once under optimal environmental conditions, once under heat stress and once under drought stress, you obtain information on 90,000 genes multiplied by three conditions. Doing this for 50 varieties (each has its own 90,000 genes in slightly varying form) already gives 13,500,000 data points (90,000 x 3 x 50).
To be sure that these data do not just vary randomly but really reflect the variety under investigation, at least three individual plants are usually examined per variety and environmental condition. That already makes 40,500,000 individual data points in the database in the area of gene expression alone!
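The arithmetic above can be retraced in a few lines of Python. This is only a sketch of the calculation in the text; the variable names are chosen for illustration, and the gene count is the rounded figure of 90,000 used above.

```python
# Figures taken from the text; the variable names are illustrative only.
GENES = 90_000      # rounded number of genes in gold of pleasure
CONDITIONS = 3      # optimal conditions, heat stress, drought stress
VARIETIES = 50      # varieties studied in the project
REPLICATES = 3      # individual plants per variety and condition

# One measurement per gene, condition and variety
per_replicate = GENES * CONDITIONS * VARIETIES

# Repeating every measurement on three individual plants
total = per_replicate * REPLICATES

print(f"{per_replicate:,}")  # 13,500,000
print(f"{total:,}")          # 40,500,000
```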
Figure 2: Barcodes attached to plant pots or sample containers lead to the corresponding information and data in the database.
It is important that such a database is available early in the project so that data and information can be entered from the start. To make clear for all data and samples what they contain, where they come from and under which conditions they were obtained, the partners have agreed on a shared set of terms. It is also clearly specified which information must be entered for samples and data.
This is the only way in which the wealth of valuable information on the 50 varieties studied can be put into context and finally interpreted and understood. The project also benefits from existing genome data on gold of pleasure and the corresponding descriptions of gene functions, which were integrated into IBG-4's own publicly accessible database this year. Information on experiments and on sample material can be easily tracked in the UNTWIST database.
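The idea of jointly agreed terms and mandatory information can be illustrated with a small Python sketch. The field names and the list of permitted conditions below are hypothetical and serve only as an example; the terms actually agreed on in UNTWIST are not listed in this article.

```python
# Hypothetical vocabulary and field names, for illustration only;
# the terms actually agreed on in UNTWIST are not given in this article.
ALLOWED_CONDITIONS = {"optimal", "heat", "drought"}
REQUIRED_FIELDS = ("sample_id", "variety", "location", "condition", "plant_part")


def validate_sample(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record is acceptable."""
    problems = [
        f"missing field: {name}" for name in REQUIRED_FIELDS if not record.get(name)
    ]
    condition = record.get("condition")
    if condition and condition not in ALLOWED_CONDITIONS:
        problems.append(f"unknown condition: {condition}")
    return problems


complete = {
    "sample_id": "S-0001",
    "variety": "variety-01",
    "location": "site-A",
    "condition": "heat",
    "plant_part": "leaf",
}
incomplete = {"sample_id": "S-0002", "variety": "variety-01", "condition": "hot"}

print(validate_sample(complete))    # no problems found
print(validate_sample(incomplete))  # two missing fields and one unknown term
```

A check of this kind at the moment of data entry is one way to ensure that every partner describes the same condition with the same word.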
Barcodes (Fig. 2) are also helpful here: attached to plant pots or sample containers, they lead directly to the corresponding information and data in the database.
Samples from an individual plant (e.g., from leaves or fruits of the same plant, or from different dates) and the various associated data can then always be traced back to individual plants and treatments. Graphical representations also help in viewing the data, showing, for example, the locations and applied environmental conditions of the plants studied in different experiments (Fig. 3).
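How a barcode can lead back to a plant and its other samples can be sketched as follows. This is a simplified illustration and not the structure of the UNTWIST database; the `Sample` record, its fields and the barcode values are assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """Hypothetical sample record; the fields are assumptions for illustration."""
    barcode: str
    plant_id: str
    plant_part: str
    sampled_on: str  # date in ISO format, e.g. "2021-06-01"


samples = [
    Sample("BC-001", "plant-7", "leaf", "2021-06-01"),
    Sample("BC-002", "plant-7", "fruit", "2021-07-15"),
    Sample("BC-003", "plant-9", "leaf", "2021-06-01"),
]

# Index 1: scanning a barcode returns the sample it belongs to
by_barcode = {s.barcode: s for s in samples}

# Index 2: all samples taken from the same individual plant
by_plant = defaultdict(list)
for s in samples:
    by_plant[s.plant_id].append(s)

scanned = by_barcode["BC-002"]
siblings = [s.barcode for s in by_plant[scanned.plant_id]]
print(scanned.plant_id, siblings)
```

Scanning the barcode of the fruit sample leads to the plant it came from and, through that plant, to the leaf sample taken on an earlier date.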
The aim is to make it as easy as possible for researchers to enter their data into the database, to find and reuse them, and ultimately to use them to obtain improved, stress-tolerant crops.
Figure 3: Measured values of plants under three different conditions, each marked in colour, on two different measurement days. The data are shown as a so-called “box plot” and as a “violin plot”, so that the distribution of the data around the mean (dashed line in the box) or the median (solid line in the box) can be examined. The width of the “violin” indicates how frequently values occurred in the experiment. Individual points are extreme values of the measurement.
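The quantities shown in such a box plot can be computed with Python's standard library. The measured values below are invented for illustration, and the rule used to flag extreme points (more than 1.5 times the box height beyond the box) is a common convention that is assumed here, not necessarily the one used for Figure 3.

```python
import statistics

# Invented example values, not data from the project
values = [4.1, 4.8, 5.0, 5.2, 5.3, 5.9, 9.7]

mean = statistics.mean(values)      # dashed line in the box
median = statistics.median(values)  # solid line in the box

# Lower and upper edge of the box (first and third quartile)
q1, _, q3 = statistics.quantiles(values, n=4)
box_height = q3 - q1

# Assumed convention: points further than 1.5 box heights from the box are extreme
extreme = [
    v for v in values
    if v < q1 - 1.5 * box_height or v > q3 + 1.5 * box_height
]

print(f"mean={mean:.2f} median={median:.2f} extreme={extreme}")
```

In this example a single very high value pulls the mean above the median, which is why it is useful to show both lines in the box.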