Agriculture x Data Science

Data-Driven Decision Making and Engineering

In this first section, we will discuss different implementations and current state-of-the-art research that uses machine learning as a core component of plant or crop science. First, we need to define the aim of scientists and engineers: the central objective of plant science is to model traits such as yield, taste, or resilience as a function of the available genomic, phenotypic, and environmental data.
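To make "a trait as a function of the data" concrete, here is a minimal sketch that fits a linear model of yield to one genomic, one phenotypic, and one environmental feature using ordinary least squares. Everything in it is an assumption made for illustration: the feature choice, the toy values, and the coefficients in TRUE_COEF are hypothetical, and real models in the literature are far richer than a linear fit.

```python
from typing import List, Sequence

# Hypothetical coefficients, used only to generate the toy yields below:
# intercept, effect per allele copy, per cm of early height, per degree C.
TRUE_COEF = [2.0, 0.5, 0.1, 0.05]


def solve_linear_system(A: List[List[float]], b: List[float]) -> List[float]:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_least_squares(X: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Return [intercept, w1, w2, ...] from the normal equations (X^T X) w = X^T y."""
    rows = [[1.0, *x] for x in X]  # prepend a bias column
    d = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(d)]
    return solve_linear_system(xtx, xty)


def predict(coef: Sequence[float], x: Sequence[float]) -> float:
    """Predicted trait value for one plant."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))


# Toy data (hypothetical): allele count at one locus, early height in cm,
# mean growing temperature in degrees C.
features = [
    (0, 10.0, 18.0),
    (1, 12.0, 20.0),
    (2, 11.0, 22.0),
    (0, 14.0, 25.0),
    (1, 9.0, 17.0),
    (2, 15.0, 21.0),
]
yields = [predict(TRUE_COEF, x) for x in features]  # noise-free synthetic yields

coef = fit_least_squares(features, yields)
print("fitted coefficients:", [round(c, 4) for c in coef])
```

Because the toy yields are generated without noise, the fit should recover the hypothetical coefficients. With real field data the interesting questions are which features to include and how well the model generalizes to new environments.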

Machine Learning Applied to Genetic Engineering


Machine learning approaches to genetic engineering and crop design [1] are currently an active research area. In fact, ML techniques are now used extensively to understand and predict how the machinery of the cell (i.e., its biochemistry) works.

These implementations focus mainly on predicting the localization of proteins of interest, genome crossovers, alleles, and epigenomic features, all of which are key to gene expression. These predictions are used in the crop design process to create accurate models and automate tasks, with the goal of understanding and, in the future, engineering genomes that translate into a viable phenotype.


Phenotype: Set of observable characteristics or traits of an organism. The term covers the organism's morphology or physical form and structure, its developmental processes, its biochemical and physiological properties, its behavior, and the products of behavior.


Locus (plural loci): Specific, fixed position on a chromosome where a particular gene or genetic marker is located.

These topics are very dense, so we will summarize only some key areas. The corresponding literature is listed at the end of the article.

Digital Plant Phenotyping


The concept of Digital Plant Phenotyping is fairly new, but its potential impact makes it attractive, and numerous research contributions have been published. The aim is simple: automatically derive a specific phenotype from image data and environmental parameters (temperature, light intensity, humidity, soil composition) of plants, and predict their future traits at an early stage of their development.
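As a very small illustration of deriving a phenotype from pixels, the sketch below estimates canopy cover (the fraction of an image occupied by green plant tissue) with the classic Excess Green index, ExG = 2g - r - b, computed on normalized color channels. This is a deliberately simple stand-in and not the deep-learning approach used by the systems discussed in this section; the threshold of 0.1 and the toy pixel values are assumptions chosen for the example.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each in 0..255


def excess_green(pixel: Pixel) -> float:
    """Excess Green index on chromatic coordinates: 2g - r - b."""
    r, g, b = pixel
    total = r + g + b
    if total == 0:  # pure black pixel: no color information
        return 0.0
    return (2 * g - r - b) / total


def canopy_cover(image: List[List[Pixel]], threshold: float = 0.1) -> float:
    """Fraction of pixels whose ExG exceeds the (assumed) threshold."""
    pixels = [p for row in image for p in row]
    if not pixels:
        return 0.0
    plant_pixels = sum(1 for p in pixels if excess_green(p) > threshold)
    return plant_pixels / len(pixels)


# Toy 2x2 "image" (hypothetical values): two leaf-like and two soil-like pixels.
LEAF: Pixel = (40, 160, 40)
SOIL: Pixel = (120, 90, 60)
toy_image = [
    [LEAF, SOIL],
    [SOIL, LEAF],
]

cover = canopy_cover(toy_image)
print(f"estimated canopy cover: {cover:.2f}")
```

Tracking a value like this over time, together with the environmental parameters, gives a simple growth curve. Learned models replace the hand-written index with features extracted from labeled images.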

The first system of this kind was introduced to study wheat development [2]. It used a convolutional neural network (CNN) to detect and classify, with good accuracy, spikes and spikelets in wheat images at their specific development stage.

From that point onward, several equivalent systems were created to study different species and different aspects of their development. Some incredible applications were impactful at a really large scale such as

Unfortunately, these systems need a large amount of data, and most of the time they must be species-specific (sometimes even location-specific), which leads to a tedious and heavy data-collection phase. Some first open-source data sets have appeared, but they are still insufficient.

Overview of plant phenotyping systems

Advanced Analytics Can Address Supply Chain Shortcomings 


Climate change, catastrophes, pandemics, and even regional wars produce food shortages through destroyed fields, disrupted transportation, or labor shortages. Nevertheless, these problems (for now!) don't even come close to the horrendous food waste produced by the rigidity and inefficiency of the supply chain.

“FAO estimates that 30-40 % of total production can be lost before it reaches the market, due to problems ranging from improper use of inputs to lack of proper post-harvest storage, processing or transportation facilities. These losses can be as high as 40-50 % for root crops, fruits and vegetables, 30 % for cereals and fish, and 20 % for oilseeds.” [6]

To tackle these problems, McKinsey & Company suggests a four-way approach to decrease supply chain rigidity [7] and calls on organizations and institutions to leverage available data. This will allow us to build global, useful data sets and create a digital twin of each product (vegetables, dairy, ...). Current problems can then be addressed with algorithmic approaches, for example to make transport more efficient or simply to see where the food waste clusters are.
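As a rough sketch of what "seeing where the food waste clusters are" could look like once shipment data is available, the snippet below aggregates losses by region and supply chain stage and flags the combinations whose loss rate exceeds a threshold. It is a plain group-by aggregation rather than a clustering algorithm, and the Shipment record, its field names, the sample figures, and the 30 % threshold are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Shipment(NamedTuple):
    """One (hypothetical) record from a supply chain data set."""
    region: str
    stage: str        # e.g. "storage" or "transport"
    shipped_kg: float
    lost_kg: float


Key = Tuple[str, str]  # (region, stage)


def loss_rates(shipments: List[Shipment]) -> Dict[Key, float]:
    """Aggregate loss rate (lost / shipped) per (region, stage)."""
    totals: Dict[Key, List[float]] = defaultdict(lambda: [0.0, 0.0])
    for s in shipments:
        totals[(s.region, s.stage)][0] += s.shipped_kg
        totals[(s.region, s.stage)][1] += s.lost_kg
    return {
        key: lost / shipped
        for key, (shipped, lost) in totals.items()
        if shipped > 0
    }


def waste_hotspots(shipments: List[Shipment], threshold: float = 0.3) -> List[Tuple[Key, float]]:
    """(region, stage) pairs at or above the threshold, worst first."""
    rates = loss_rates(shipments)
    flagged = [(key, rate) for key, rate in rates.items() if rate >= threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Made-up sample data for illustration only.
sample = [
    Shipment("north", "storage", 1000, 350),
    Shipment("north", "storage", 500, 250),
    Shipment("north", "transport", 1000, 100),
    Shipment("south", "storage", 800, 80),
    Shipment("south", "transport", 1000, 320),
    Shipment("south", "transport", 1000, 300),
]

for (region, stage), rate in waste_hotspots(sample):
    print(f"{region}/{stage}: {rate:.0%} lost")
```

With richer data (timestamps, product type, route), the same aggregation can be refined, or replaced by a proper clustering method, to locate where interventions would save the most food.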


Digital Twin: Digital representation of an intended or actual real-world physical product, system, or process that serves as the effectively indistinguishable digital counterpart of it for practical purposes, such as simulation, integration, testing, monitoring, and maintenance.

References

[1] Nur Shuhadah Mohd Saad, Ting Xiang Neik, William J.W. Thomas, Junrey C. Amas, Aldrin Y. Cantila, Ryan J. Craig, David Edwards, Jacqueline Batley, Advancing designer crops for climate resilience through an integrated genomics approach, Current Opinion in Plant Biology, Volume 67, 2022, 102220, ISSN 1369-5266, https://doi.org/10.1016/j.pbi.2022.102220.

[2] Bedo, J., Wenzl, P., Kowalczyk, A. et al. Precision-mapping and statistical validation of quantitative trait loci by machine learning. BMC Genet 9, 35 (2008). https://doi.org/10.1186/1471-2156-9-35

[3] Genze, N., Bharti, R., Grieb, M. et al. Accurate machine learning-based germination detection, prediction and quality assessment of three grain crops. Plant Methods 16, 157 (2020). https://doi.org/10.1186/s13007-020-00699-x

[4] Aalt Dirk Jan van Dijk, Gert Kootstra, Willem Kruijer, Dick de Ridder, Machine learning in plant science and plant breeding, iScience, Volume 24, Issue 1, 2021, 101890, ISSN 2589-0042, https://doi.org/10.1016/j.isci.2020.101890.

[5] Mohamed Bouni, Badr Hssina, Khadija Douzi, Samira Douzi, Towards an Efficient Recommender Systems in Smart Agriculture: A deep reinforcement learning approach, Procedia Computer Science, Volume 203, 2022, Pages 825-830, ISSN 1877-0509, https://doi.org/10.1016/j.procs.2022.07.124.

[6] Seeking end to loss and waste of food along production chain, Food and Agriculture Organization of the United Nations, https://www.fao.org/in-action/seeking-end-to-loss-and-waste-of-food-along-production-chain/en/

[7] How advanced analytics can address agricultural supply chain shocks, McKinsey & Company, https://www.mckinsey.com/industries/agriculture/our-insights/how-advanced-analytics-can-address-agricultural-supply-chain-shocks