[Audio] Hi all, in this presentation we are going to talk about machine learning applications for crop prediction models. My name is Sebastiaan Verbesselt, researcher in precision agriculture at ILVO, and in this video I will be replaced by an AI avatar who will hopefully convey the presentation comprehensibly.
[Audio] Here is a quick overview of the topics we are going to discuss today. First, a general introduction to artificial intelligence, machine learning and deep learning. Then we are going to explain how different sensors and platforms can be used to collect data for crop monitoring. We will touch on some of the computer vision algorithms that exist to analyse these data, and on some examples of research cases at ILVO where we use machine learning to monitor crops. Last, we will discuss some of the challenges of machine learning for agricultural research.
[Audio] Let's start with: what is artificial intelligence? It is a concept introduced by the English mathematician Alan Turing, father of the first "computer", which decoded the secret codes of the Germans in World War II. AI has been defined as "the ability of a digital computer or computer-controlled robot to perform tasks commonly associated with intelligent beings".
[Audio] But what is "intelligence"? Even for living organisms, biologists have trouble finding a correct definition (or the right characteristics) to say whether an organism is intelligent or not.
[Audio] This is because all living organisms show a level of "intelligent" behaviour, from bacteria to plants, animals and humans. All these organisms have strategies to survive, like finding food, escaping from predators, finding mates and so on. And we don't necessarily need "brains" to do this; think of plants, bacteria and even some animals.
[Audio] The same goes for computers, or machines steered by computers. Even calculators, which most people would not consider very "intelligent", can solve complex equations much faster than humans.
[Audio] What is easier to do is to rank organisms and computers from less intelligent to more intelligent. For example, we could rank humans as more intelligent than chimpanzees, chimpanzees more than mice, mice more than snails and snails more than plants. We can also rank a self-driving car as more intelligent than a deep learning object detection algorithm, an object detection algorithm more than a simple machine learning classification algorithm that classifies "pets" into cats and dogs based on snout length and ear geometry, the classification algorithm more than computations in spreadsheets like Excel, and Excel more than a simple calculator.
[Audio] So, there are many different ways to "categorise" AI. One way is by level of intelligence (as we did in the previous slide). All processes whereby a human has defined the rule set for the data analysis (for example in spreadsheets) can be considered "automation". Most machine learning and deep learning algorithms (we will discuss later what machine and deep learning are) can be considered "weak AI": good at one or a limited number of tasks. We see ourselves (humans) as generally intelligent, so we are "intelligent" across multiple tasks. Nowadays, we see the first "general artificial intelligence" emerging, for example ChatGPT. ChatGPT can do multiple tasks, like (1) translating texts, (2) summarising text, (3) generating new text and images and (4) sentiment analysis. Strong AI, whereby the AI is more intelligent than humans, is for now a theoretical concept.
[Audio] We can also categorise AI based on functionality. Reactive machines are algorithms that don't have any memory. They have predetermined, fixed actions for every input, e.g. chess-playing computers. They can be strong at their specific task, but cannot handle uncertainty or learn from memory. Limited memory AI includes most machine and deep learning algorithms of today. They use memory to learn from data in order to make predictions on new, unseen data. Theory of mind AI is still in a research phase. It includes AI that understands the needs of other intelligent entities (like humans), such as complex emotions, thought processes and ideas. Self-aware AI is AI that is, like the name says, self-aware. This is often used as a concept in science fiction films where AI algorithms take over the world. This is of course a more hypothetical form of AI.
[Audio] So most of the AI that is already present in our daily lives consists of reactive machines and limited memory AI.
[Audio] You can also divide AI based on application purpose. AI that understands and interprets human language is often referred to as natural language processing; for images, films and other visual information we call it computer vision; and for audio fragments and human speech, we call it speech recognition. Machine agents that interact with the physical world (whereby they often collect information about their surroundings and use AI for the interpretation) are called robots.
[Audio] Most AI experts will however categorise AI based on technical approach or learning method (explained in the next slide). The general term for intelligent behaviour of machines and computers is AI. A subdivision of AI is machine learning, whereby algorithms detect patterns within the data and learn from memory. Data scientists and engineers help these algorithms by selecting interesting features within the data to give to the algorithm. For deep learning, a subsection of machine learning, the algorithm will mostly find these features by itself. Last, we have generative AI (a subsection of deep learning), where algorithms can generate new content based on patterns and memory of previously seen data, for example ChatGPT and DALL-E.
[Audio] Last, we can divide AI (in this case, machine learning) based on learning method. We have: supervised learning, whereby the model is trained on labeled data, meaning each input has a corresponding correct output (example: image classification with labeled categories); unsupervised learning, whereby the model is trained on unlabeled data and must find hidden patterns or structure on its own (example: clustering customer groups in marketing); semi-supervised learning, whereby the model is trained on a mix of labeled and a large amount of unlabeled data, learning patterns from the unlabeled data while being guided by the labeled data (example: speech recognition with limited transcribed audio); and last, reinforcement learning, whereby the model (agent) learns by interacting with an environment and receiving rewards or penalties based on its actions (example: training a robot to walk).
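To make the supervised case concrete, here is a minimal sketch (not part of the presentation) of a nearest-centroid classifier in Python: the labelled training data plays the role of the "correct outputs", and prediction assigns a new input to the class whose mean feature vector is nearest. The pet features and all the values are invented for illustration.

```python
import numpy as np

# Toy labelled data: two pet features (snout length cm, ear height cm),
# label 0 = cat, 1 = dog. All values are made up for illustration.
X_train = np.array([[2.0, 4.0], [2.5, 4.5], [8.0, 10.0], [9.0, 12.0]])
y_train = np.array([0, 0, 1, 1])

# "Training": compute one centroid (mean feature vector) per class.
centroids = np.array([X_train[y_train == c].mean(axis=0) for c in (0, 1)])

def predict(x):
    """Assign the class whose centroid is nearest (Euclidean distance)."""
    return int(np.argmin(np.linalg.norm(centroids - x, axis=1)))

print(predict(np.array([2.2, 4.2])))   # near the cat centroid -> 0
print(predict(np.array([8.5, 11.0])))  # near the dog centroid -> 1
```

In unsupervised learning, by contrast, `y_train` would not exist: the algorithm would have to group the rows of `X_train` by itself.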
[Audio] We are going to illustrate the difference between machine learning and deep learning with the metaphor of teaching a child to ride a bicycle. In this metaphor, the child is the computer, which uses its brain, or algorithms, to learn. The data it has to analyse is the bicycle. The parent is in this example a data analyst, who can point out the interesting features or properties of the data to learn from: how to steer the bicycle, how to brake, how to sit on the saddle and push the pedals. The child or computer will optimise the problem by learning. The less the child falls, the better he or she can cycle. It will also store the information in memory and use it when it has to bike again (maybe with another bicycle).
[Audio] For deep learning, the role of the data analyst or parent disappears, so (almost) no features are pre-defined. The child or computer will just try to learn by itself and will still optimise the problem. It only takes more training time and data.
[Audio] The downside is that how it learns is often a 'black box'. The computer can sometimes come up with 'original' solutions. As long as this works, it is not a problem, but if it eventually goes wrong somewhere, it requires a lot of effort to open and analyse the black box.
[Audio] What is essential in order to conduct machine learning? First, data input: high volumes of relevant and high-quality data that 'represent' the problem to be solved. Data is analysed by algorithms, mathematical formulas that can analyse the data and recognise patterns (and pass these along as insight). For deep learning, these are often artificial neural networks, whose building blocks resemble the neuron cells in our brain. Computers with enough processing power are required to do the calculations and optimisations of the algorithms. This can be very energy-consuming: both bad for the environment and costly for the end user. In order to correctly collect and label data and find relevant machine learning applications, the developers' team or the data engineers themselves need sufficient domain knowledge. Collecting and annotating or labelling large datasets is time-consuming and often costly. Agriculture is a very complex sector, both business-wise and application-wise. Markets are volatile, and the interactions between living organisms (like the crops and plants) and their environment are complex, which creates variation in data and uncertainty within the machine learning models, lowering their accuracy and precision. The last thing to consider is the ethics of your AI or machine learning application. Are there potential risks or drawbacks of your application? Do you use private, personal or sensitive data? Are you transparent enough about your application towards end users? And so on.
[Audio] For crop monitoring, you need to understand the complex processes and interactions of plants with their environment. Weather, soil, management, genetics and other organisms can influence the plant's phenotype and how the plants grow. Some useful datasets you can collect to better understand and monitor your field are mentioned on the slide.
[Audio] For the direct monitoring of crops and other relevant organisms such as weeds, diseases and pests, we more and more use remote and proximal sensing technology, sometimes combined with a limited number of field observations. This technology uses different types of sensors, which can be divided into active and passive sensors. Active sensors send out a signal to an object, in this domain often the plants or crops. The signal hits the target and is reflected back to the camera or sensor, providing it with useful information like biomass, 3D structure, canopy information, water content and so on. Passive sensors measure the part of the sun's electromagnetic spectrum (like visible light, near infrared, thermal infrared and microwaves) reflected by the plants and terrain. This gives information about the chlorophyll content, biomass, cell structure, water and chemical content of plants. The signals can be used to monitor plant health, detect diseases and damage by pests, and distinguish weeds from crops.
[Audio] Sensors measure information about plants and soils in an indirect way, without making contact. They can be mounted on aerial platforms like satellites, helicopters, hot air balloons, planes and drones. We often refer to this technology as "remote sensing". Sensors and cameras can also be handheld (like digital cameras or cameras in your mobile phone), mounted on carts, robots and agricultural machines, or mounted on stationary poles and platforms. We often refer to this technology as "proximal sensing".
[Audio] For satellite remote sensing that is open source and used for civilian purposes, the image rasters often have low to medium spatial resolution. Famous examples are NASA's Landsat-8 satellite, with spatial resolutions between 15 and 100 meter per pixel, and ESA's Sentinel-2 satellites, with spatial resolutions between 10 and 60 meter per pixel. Commercial satellites can have higher resolution, like PlanetScope with 3 meter per pixel and WorldView-3 with 0.31 to 30 meter per pixel. Satellites revisit the same location regularly, which can lead to interesting time series data if atmospheric conditions are optimal, so not too many clouds. Pixels are however often larger than the objects we are interested in, in this case the crops. This is why the machine learning models operate at pixel level. The data can be used for plot boundary detection, anomaly detection (for example, changes in management, crop type, disturbances or damage by fire, floods, animals and so on), yield prediction at plot or subplot level, or land cover classification. Satellites collect vast amounts of data due to their large coverage: one satellite image of Sentinel-2 has a width of 290 kilometer.
[Audio] Other aerial platforms have medium to very high spatial resolution, depending on the flight altitude and the camera and lens type. Data collection is often costly and the data is not freely available. One flight can however cover a large area, often faster than measurements on the ground. Machine learning algorithms can function at pixel level (if the pixels are larger than the plants) or at object level, where multiple small pixels together represent a plant, insect, soil and so on. Typical computer vision models that work at object level will use image regression, image classification or recognition, object detection, semantic segmentation, instance segmentation or panoptic segmentation.
[Audio] Proximal sensing has both mobile and stationary platforms. The data is collected at very high spatial resolution but is costly to collect. Machine learning models will again work at object level. Sensors on agricultural machinery and robots, for example, will use machine learning to detect crop rows for automatic steering, to detect weeds to target them with local treatment, or to evaluate fruits for harvest.
[Audio] AI and machine learning models that work at pixel level will collect information on pixel colour, hue, intensity and texture to make predictions for regression, embedding, clustering or classification. Some typical examples are given in this slide.
[Audio] AI and machine learning models that work at object level will, besides collecting information on pixel colour, hue, intensity and texture, also look at the form and structure of these pixels within the object. Often deep learning algorithms like convolutional neural networks are used instead of traditional machine learning architectures to analyse these images.
[Audio] Let's give some examples of how machine learning and computer vision are used for proximal and remote sensing research at ILVO.
[Audio] Machine learning can be used in combination with crop growth simulation models to predict fertilization strategies for leek crops in Flanders. Within the Wikileeks project, soil samples together with soil scan data were collected before the growing season, while the crops were monitored with satellites and drones during the growing season. These data were used to simulate fertilization strategies for homogeneous zones within the field.
[Audio] Leeks are very nitrogen-demanding plants. In order to minimize the effect on the environment, lower input costs and maximize yield, these management zones were simulated. Machine learning models were used in particular to predict leek biomass and nitrogen uptake from multispectral data collected by the satellites and drones.
[Audio] Another project is the Flaxense 2 project, where Inagro, ILVO and VITO are working together on a digital visual monitoring tool for flax growers based on satellite imagery as input. This tool will provide them with remote information about the crop condition of their flax and advise them about the optimal sowing time, possible re-sowing and the use of inhibitors for a homogeneous crop growth..
[Audio] An earlier study (led by Inagro and ILVO) in close cooperation with a handful of flax growers showed that satellite images provide a good indication of crop growth in the field. In order to link useful cultivation advice to this, ILVO will calibrate and validate an existing flax growth model in this new project. To this end, Inagro and ILVO will annually collect data from some twenty practical plots that they will monitor both from the ground and with satellite images. In addition, Inagro and ILVO will set up specific trials to investigate the difference in growth between the varieties and to measure the impact of a growth regulator on the growth of the flax. Those trials will also be monitored with drones. With that data, we will set up a decision support tool around growth inhibition of the flax, so that flax growers can make informed decisions about whether inhibition is necessary or not, and where exactly. Through the use of soil moisture maps and satellite images, in combination with weather forecasts, we also want to draw up advice on the ideal sowing time. This data is shown in the WatchItGrow platform of partner VITO.
[Audio] Drone imagery can also be successfully used to classify or detect Colorado beetles and their larvae within potato fields. Especially the larvae are clustered within certain areas of the field. They do the most damage, but can also be treated more effectively with pesticides than the adults. Detected locations of these larvae can be converted into pest maps and task maps for location-specific treatment or spot spraying. Again, this reduces the total amount of chemical input, which is good both for the environment and for the farmers' costs. This concept is being tested (still ongoing) by ILVO within the KODA2030 project: towards a more sustainable cultivation of potato by 2030.
[Audio] Within the same project, ILVO also tests whether drone and satellite imagery can be used for location-specific potato haulm killing. We evaluated the haulm biomass after 0, 1, 2 and 3 herbicide treatments, whereby we used 2 different potato cultivars, two fertilization treatments and three herbicide volumes. The indices can be converted from a vegetation index into a binary map of living plant or not. The percentage of living biomass per subplot can then be evaluated. The goal is to test whether these platforms can be successful tools for monitoring, and how vegetation indices can be used to advise farmers where to spray herbicides and with what volume of herbicide, in the form of task maps.
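As an illustrative sketch of that last step, converting a vegetation index raster into a binary living/not-living map and a percentage of living biomass per subplot could look like this in Python. The raster values and the 0.4 cut-off are invented for illustration, not the project's actual parameters.

```python
import numpy as np

# Toy vegetation index raster (e.g. NDVI, values in [-1, 1]) for one
# subplot; real rasters come from the drone or satellite bands.
ndvi = np.array([[0.72, 0.65, 0.10],
                 [0.55, 0.08, 0.05],
                 [0.60, 0.70, 0.02]])

LIVING_THRESHOLD = 0.4  # assumed cut-off between living and dead/bare pixels

living = ndvi > LIVING_THRESHOLD    # binary map: living plant or not
pct_living = 100.0 * living.mean()  # percentage of living biomass per subplot
print(round(pct_living, 1))         # 5 of 9 pixels living -> 55.6
```

The same binary map, aggregated per subplot, is what a task map for haulm-killing herbicide could be derived from.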
[Audio] Computer vision models can also be used to detect weeds within crops. This was showcased at ILVO in 2021, where a drone sent its images during the flight to a deep learning model in the cloud. For this, a 5G antenna was installed by partner Proximus next to the field. This cloud computing enabled semi-real-time site-specific herbicide application in maize: 5 minutes after the flight, the results of the AI model were already converted into a task map for spot spraying and given to the sprayer. This demonstration is shown in the video in the next slide.
Remote and proximal sensing – research examples @ILVO.
[Audio] One year later, we did a similar demonstration. In this case, the weeds were not chemically controlled by spot spraying. Instead, the task map was given to our biggest robot, the CIMAT robot. The robot did thermal weed control by only burning away the weed plants.
[Audio] Another research topic is the detection of the leaf disease early blight, caused by the fungus Alternaria solani, in potato crops. Alternaria attacks and disrupts the leaf and stem cells and can be recognized by brown and black spots, called lesions, on the plant. At later stages, it can also attack the tubers of the plant. The disease can severely lower the quality and quantity of the potato harvest. The disease can be detected at an earlier stage, before it is visible to us, with specialized cameras that can detect the near infrared. This is illustrated with a hyperspectral camera, which scans from the blue visible light to green, red and eventually the near infrared: at a wavelength of 730 nm, you will see the lesions appearing on the leaves. By modifying a normal digital RGB camera, we could look in the near infrared region for better detection of the disease. The plants are no longer green but appear red or orange due to the near infrared filter.
[Audio] With this specialized camera, we performed flights at low altitude, 10 meters above the ground. A zoom lens provided ultra-high-resolution images of 0.3 millimeter per pixel. The drone images were cropped into smaller tiles and given to a convolutional neural network for classification of the tiles into healthy and infected patches. From these predictions, we can make infection maps and application maps for location-specific fungicide treatment.
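The tiling step can be sketched as follows; the 32-pixel tile size and the choice to drop incomplete edge tiles are assumptions for illustration, not the settings used in the study:

```python
import numpy as np

def tile_image(img, tile):
    """Crop an H x W x C image into non-overlapping tile x tile patches.

    Edge pixels that do not fill a whole tile are dropped, as a simple
    choice; the resulting array can be fed batch-wise to a classifier.
    """
    h = img.shape[0] // tile * tile
    w = img.shape[1] // tile * tile
    img = img[:h, :w]
    return (img.reshape(h // tile, tile, w // tile, tile, -1)
               .swapaxes(1, 2)
               .reshape(-1, tile, tile, img.shape[-1]))

# A fake 100 x 130 RGB "drone image": 3 x 4 full tiles of 32 x 32 remain.
tiles = tile_image(np.zeros((100, 130, 3)), 32)
print(tiles.shape)  # (12, 32, 32, 3)
```

Each patch would then get a healthy/infected prediction, which can be mapped back to its position in the field to build the infection map.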
[Audio] Sensors can also be mounted on robot platforms. The data is collected and processed in real time via edge computing for direct application. In this example, we used a multispectral camera mounted on our smallest robot. Via NDVI calculation of the images and thresholding, which is a simple case of green-on-brown detection, a prescription map was made for the spraying boom. Via adaptable nozzles, herbicide was applied only at locations on the soil strips under the blueberry bushes where green vegetation (in this case, the weeds) was detected. This is demonstrated in the video on the next slide. It is however in Dutch, but the images should be clear.
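A minimal sketch of this green-on-brown pipeline, assuming one nozzle section per image column and an NDVI threshold of 0.3 (both invented for illustration, not the robot's actual configuration):

```python
import numpy as np

# Toy red and near-infrared bands (rows x columns); real values come
# from the multispectral camera. eps avoids division by zero on dark pixels.
red = np.array([[0.30, 0.25, 0.05, 0.30],
                [0.28, 0.30, 0.06, 0.29]])
nir = np.array([[0.32, 0.28, 0.60, 0.31],
                [0.30, 0.33, 0.55, 0.30]])

ndvi = (nir - red) / (nir + red + 1e-9)  # NDVI = (NIR - Red) / (NIR + Red)
green = ndvi > 0.3                       # assumed green-on-brown threshold

# One nozzle per image column (an assumption): open it if any pixel
# in that strip was classified as green vegetation.
nozzle_open = green.any(axis=0)
print(nozzle_open.tolist())  # only the third strip contains vegetation
```

Living vegetation reflects much more near infrared than red light, so weed pixels stand out with a high NDVI against bare soil.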
Remote and proximal sensing – research examples @ILVO.
[Audio] We can also train robots to autonomously inspect vineyards via reinforcement learning. For this, we first made a 3D representation of a vineyard using multiple sources of sensor information: RGB cameras, LIDAR, stereo and depth cameras, real-time kinematic GPS information and odometry data were given to the model.
[Audio] A complete 3D map was created and used for a simulation environment. In this environment, the model could teach the agent (the robot) to navigate correctly via feedback loops. In the end, we want the robot to navigate with this sensor information in an adaptive environment. We cannot let the robot navigate based on a fixed 3D map, since the plants will grow and change in structure. Small errors in the GPS readings could also hamper fixed navigation paths, while a robot with a reinforcement learning model can correct its path in a smart manner.
[Audio] After the model for navigation was finished, the robot could be used to detect the position of grape bunches via a RealSense depth camera. A hyperspectral camera mounted on a robot arm was brought in front of the grapes to monitor their acidity and sugar content, thereby measuring the quality and the right timing for harvest. This is also demonstrated in the next slide. The video is unfortunately in Dutch.
Remote and proximal sensing – research examples @ILVO.
[Audio] Finally, we want to talk about the challenges of machine learning for agriculture. We have already touched on them in the previous slides, but here is a quick summary. Agricultural data is highly variable, which makes models uncertain, so be aware of the limits of your models and re-evaluate them over time. Be aware that sufficient domain knowledge is needed to make really good models: involve agricultural experts in the creation and application of your models. Data is not always readily available, and collecting and labelling it can be time-consuming and costly. Use simple models if you have less data available and if they are already sufficiently performant. Some models require a lot of energy and computing power: check whether there are alternatives for your problem, test new code with limited data and limit your training time. Only use lots of data and long training times after your exploration, when you really want to start training and validating your model. Some models are real black boxes: try to use explainable AI where possible. And think about the possible ethical implications of your machine learning applications.
[Audio] Our video ends here. Thanks for your attention; hopefully you found it interesting.