YOLO-Behaviour: A new and faster way to extract animal behaviours from video

Video recording is a long-established way for biologists to measure the behaviour of animals and humans. A video might show human subjects eating in a group in front of a camera at the University of Konstanz, or house sparrow parents visiting their nests on Lundy Island, UK. All these video datasets have one thing in common: after collecting them, researchers need to painstakingly watch each video and manually mark down who performs each behaviour of interest, and where and when it happens, a process known as “annotation”. This becomes a major bottleneck in data analysis, as manual annotation often takes a long time and can be susceptible to human biases.
With the recent explosion of AI and computer vision, tools like ChatGPT can read an image and understand what is happening. You can enter a description of a video and AI models can generate a photo-realistic sample of it. In the minds of many biologists, surely these powerful models can help them save time on laborious video annotation, right? Unfortunately, this might not be the case yet. The problem is that every experimental set-up is different, every researcher wants to extract different things, and the best computer vision solution changes with the type of video and the circumstances. This makes it hard for biologists to figure out which AI model to use, even though computer scientists have developed many solutions over the years.
https://youtu.be/YytSl9gEOHI?si=xRelyic-58IB7BZE
Example video of how YOLO-Behaviour automatically detects behaviours in different animal species and humans.
A diverse team of researchers from the Cluster of Excellence Collective Behaviour and the Max Planck Institute of Animal Behavior, led by PhD student Alex Chan Hoi Hang, hopes to solve this problem. In a paper published in the journal Methods in Ecology and Evolution, the team introduced YOLO-Behaviour, a robust, flexible framework for behavioural annotation from videos. Contrary to the common reading of “You only live once”, YOLO here stands for “You only look once”, a family of computer vision models that identify objects in images by scanning through each image once. The team demonstrated the flexibility of the framework by presenting case studies that range from controlled lab environments to data collected in the wild. The framework can automatically identify house sparrows visiting nest boxes on Lundy Island, UK, Siberian jays eating on a feeding stick in Swedish Lapland, humans eating in a lab in Konstanz, pigeons courting and feeding in Konstanz, and zebras and giraffes running and browsing in a national park in Kenya.
“The beauty of the approach is how easy the tool is to use, yet so effective across so many study systems”
Alex Chan, PhD student at the University of Konstanz
Lowering the barrier for automated analysis
YOLO-Behaviour is shown to work across a diverse range of study systems and case studies and, importantly, is very easy to train and implement, without requiring specialized coding expertise. To ensure that the method is widely usable, the authors also provided detailed documentation and instructions, as well as a video tutorial. “Every behavioural ecologist that had to manually code videos has dreamt of having a tool that automates everything,” adds Chan. “I hope this tool can be a step towards that dream, to help researchers in animal behaviour save some laborious coding time,” said Fumihiro Kano, senior author of the paper.
The method has already been used to analyze videos automatically in each study system. In the Lundy house sparrow system, it has been applied to extract parental visit rates for up to 2000 previously unanalyzed videos, more than doubling the amount of data available to gain insight into the causes and consequences of parental care behaviour. For the Siberian jay system, the method can be applied to process years of feeding videos, to study cooperative behaviour and how animals might have evolved to live in social groups. The method is also not limited to behavioural ecology, with potential applications across diverse fields like psychology, animal welfare or livestock management. The hope is that this tool can be readily applied to different systems around the world, increasing the speed at which data can be collected and analyzed, and so helping researchers to understand and measure animal behaviour.
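For readers curious how frame-by-frame detections might become a number such as a visit rate, here is a minimal Python sketch. It is not the authors' code. It assumes a detector has already flagged, for every video frame, whether the behaviour was seen, and it groups nearby flagged frames into events. The function names and the `max_gap` and `min_len` settings are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple


def frames_to_events(
    detected: List[bool], fps: float, max_gap: int = 5, min_len: int = 3
) -> List[Tuple[float, float]]:
    """Group per-frame detections into (start_seconds, end_seconds) events.

    max_gap: hypothetical number of missed frames tolerated inside one event.
    min_len: hypothetical minimum event span (in frames); shorter runs are
             treated as detector flicker and dropped.
    """
    events: List[Tuple[float, float]] = []
    start = None     # first frame of the event being built
    last_hit = None  # most recent frame with a detection

    def close_event() -> None:
        # Keep the event only if it spans enough frames.
        if last_hit - start + 1 >= min_len:
            events.append((start / fps, (last_hit + 1) / fps))

    for i, hit in enumerate(detected):
        if not hit:
            continue
        if start is None:
            start = last_hit = i
        elif i - last_hit - 1 <= max_gap:
            last_hit = i  # small gap: still the same event
        else:
            close_event()  # large gap: finish the old event, start a new one
            start = last_hit = i

    if start is not None:
        close_event()
    return events


def events_per_hour(
    events: List[Tuple[float, float]], n_frames: int, fps: float
) -> float:
    """Event rate per hour of video (for example, nest visits per hour)."""
    hours = n_frames / fps / 3600
    return len(events) / hours if hours > 0 else 0.0


# Toy example: made-up detection flags for a short clip at 10 frames per second.
flags = [False] * 10 + [True] * 20 + [False] * 2 + [True] * 10 + [False] * 50
flags += [True] + [False] * 20 + [True] * 15
visits = frames_to_events(flags, fps=10)
rate = events_per_hour(visits, n_frames=len(flags), fps=10)
```

In this toy clip, the two-frame dropout is bridged into a single visit and the isolated one-frame detection is discarded. How large a gap to tolerate, and how short an event to ignore, would need to be tuned to each study system.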
