Big Data for Dairy Farms? How exactly?

30 April 2024 by
Big Data for Dairy Farms? How exactly?
HoliCow
| No comments yet

You must have read already that the HoliCow project is based on the use of Big Data to improve the resilience of small and medium-sized dairy farms. But how exactly is that going to work?

During past projects and for months, our partners and associated partners have been collecting data from farms and cows all around North-West Europe. The plan is now to use this huge amount of data to develop easy-to-use tools for farmers.

But first, let’s have a look at the data available.


As you can see, our database is transnational. If we add up all this data, we have at our disposal an impressive number of 63 million spectral data. This historical number will enable us to create a very accurate tool destined for farmers. This type of transnational database for spectral data is unique and we are very proud of this achievement, despite some difficulties related to data harmonization between countries.

This huge number of data comes indeed with challenges since it comes from many different sources. Some data have inconsistent formats, which makes it hard to make them interact with one another. The first big step in this project was therefore the standardisation of the data. It puts all the data on the same level and allows comparison. It is a crucial step that will enable the cooperation between partners from different countries, regions, and laboratories.

What does this data contain?

All these numbers represent a lot of information. There are animal characteristics, mid-infrared (MIR) spectral data, and common dairy traits (milk yield, fat, protein, lactose, somatic cell, urea, etc.). These data constitute the holistic database, that means that we look at data in its entirety, and not just at isolated pieces of information.

Some data are so similar (same breed, same region, same farm, etc.) that they actually repeat themselves in the database. One possibility is therefore to filter and reduce this information to keep only a representative set of data.

And now? How will we use this data?

Collecting information is one thing, but the most important thing is what we will do with it. Thanks to machine learning and the expertise of our scientific team, the data will be interpreted and transformed into a decision-making tool.

We will apply around 300 predictive equations on the selected spectra and obtain predictions for methane, dry matter intake, mastitis, fatty acids, etc. Our scientific teams invested quite a lot of time on listing these equations, on finding where they come from, on obtaining the rights for using them, and on their performance.

To be able to run all these prediction models, an API (Application programming interface) is currently being developed by the team of Gembloux Agro-Biotech (ULiège).

Based on these predictions, unsupervised learning statistical methods will enable the creation of 6 clusters (groups of phenotypes). These clusters will be incorporated in a tool that will provide feeding and management advice directly to farmers through an online platform or an application.

These holistic solutions will concern 6 different clusters and enable farmers to improve their management in these 6 areas:

What kind of tool?

One of our challenges is to decide how this information will be presented to farmers in an easy-to-understand and easy-to-use manner. Our main idea is to work with a spider web that would present a “diagnosis” to the farmer. This illustration would indicate, for each cluster, if there is no problem (green), if a problem is suspected, meaning there is something to assess or control (yellow), or if there is an alert needing direct action (red). This would enable livestock farmers to know the situation of their herd at a glance and, if necessary, to react accordingly as soon as possible.

Besides this spider web, our goal is that the tool can also offer practical solutions to the farmer. These solutions will be designed thanks to the expertise of our technicians and advisors, thanks to feedback from pilot farms, and with information from the scientific literature. These solutions will also be constantly updated thanks to continuous feedback from users. The objective is indeed to create a community and solution database to complete the holistic database. Thanks to feedback from pilot farms and, later, from the users themselves, we will have data in terms of management solutions, nutritional solutions, etc. directly from farmers and advisors. These data will help us improve the new tool and provide information and solutions that are in tune with the reality of livestock farmers.

Our priority for the HoliCow project is to work closely with farmers, so that they can co-create the tool depending on their needs and on what they find more practical. In a context where administrative matters are becoming more and more of a burden for farmers, our intention is to develop a tool that they could intuitively integrate to their daily life. That’s why we are working with pilot farms that will help us develop and, more importantly, test the tool. We can therefore take their feedback into account and make the necessary modifications before offering the tool to a larger audience of breeders.

Sign in to leave a comment