EarthSense: How to build a vision-based Agriculture Robot with Michael McGuire
A few weeks ago, I was browsing LinkedIn when I saw an incredible post from an engineer working on autonomous agriculture robots. The post had over 1,000 likes and showed the "internal" view of an autonomous agriculture robot spraying an oil palm field. It was fascinating.
How did it work? I had to find out, because I suspected this engineer knew a lot more about agriculture robotics than I did, and probably more than most of my readers. So Michael and I got in touch and recorded a special episode of this show together. Will you learn something from Michael today? 100% guaranteed!
First, let me give you a brief intro...
What EarthSense is and how it works
Meet:
Michael started as an intern at EarthSense after graduating from the University of Illinois in the US. He got hired through a DeepSORT project and then grew into a Computer Vision Engineer role. Four years later, he agreed to take charge of the operations in Malaysia and became the Computer Vision Lead. At the time we recorded this episode, he had just come out of a demo of TerraMax, an oil palm robot.
And now, here is how he defines EarthSense, and how it works using something they named the "vanishing point algorithm".
If I asked 100 engineers to drive a robot autonomously in an oil palm field using vision only, many would tell me to use Stereo Vision. Some would say Visual SLAM. A few might mention Bird's Eye View. But the question of "how do you know where to go?" would still remain. Here is how Michael solved it:
Let's unpack this short clip; there are 2 big ideas here:
- Navigating in agricultural fields
- The Vanishing Point Algorithm
Navigation in structured agricultural fields
The first part I'm interested in is here, when Michael describes the environment they drive in:
"The core of how our autonomy functions is that we can heavily utilize the fact that the fields are highly structured. So an oil palm, [...] they tend to have relatively straightforward rows, predictable row widths, and then predictable lane turns at the end. And so the idea is what you want is a system that is capable of starting at one corner of the field, navigating down a row in the middle of the row without crashing into anything, and then stopping at the end, turning the lane, and then coming back down the next row. And if you can just do that on repeat, those are effectively the two operations that you need to deploy to any large number of acres, basically."
Fascinating, don't you think? It looks very simple, but the "without crashing into anything" part actually makes it more complex.
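To make those "two operations" concrete, here is a minimal sketch of that coverage loop in Python. Everything here (the function names, the callables) is hypothetical and only illustrates the structure Michael describes, not EarthSense's actual code:

```python
from typing import Callable

def cover_field(
    num_rows: int,
    end_of_row: Callable[[], bool],          # perception: have we reached the end of the row?
    row_steering: Callable[[], float],       # in-row steering, e.g. from the vanishing point
    drive: Callable[[float], None],          # send a steering command to the base
    turn_into_next_row: Callable[[], None],  # the "predictable lane turn" at the headland
) -> None:
    """Operation 1 (go down a row without crashing) + Operation 2 (turn the lane), on repeat."""
    for row in range(num_rows):
        while not end_of_row():       # Operation 1: keep driving down the middle of the row
            drive(row_steering())
        if row < num_rows - 1:
            turn_into_next_row()      # Operation 2: turn at the end, come back down the next row
```

The interesting work, of course, hides inside `end_of_row` and `row_steering`, and that's exactly where the rest of this post goes.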
How do you make sure you don't crash? Do you use an object detector? Segmentation? How do you handle objects you have never seen before? An occupancy map? Free-space detection?
Here is an illustration provided by EarthSense to explain it in more detail:
Now, I am NOT going to describe each of these; instead, I'd like to move to the second part of the clip, which is (I think) the most interesting of all. It discusses navigation.
The 'Vanishing Point' algorithm
It starts from this quote:
"When we're going down the row the chief algorithm that we rely on is the lane detection as you're describing,
I think Renaissance painters centuries ago figured out that a key way to make paintings look realistic was that vanishing line. So if you're looking down a tunnel, for example, the lines, the pillars, they all converge to one vanishing point. And so, we leveraged that geometry to tell us two pieces of information, one of which is how far are we from the center and another of which is how far are we tilted from the center.
And so once you have that information, you can then tell your robot exactly where in the row it needs to travel to relative to where it currently is now."
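Before looking at how EarthSense actually predicts it, here is what that "converging lines" geometry looks like in code. This is a purely classical sketch (Canny edges, Hough segments, then a least-squares intersection), with thresholds I picked arbitrarily; it is NOT EarthSense's implementation, just a way to see the vanishing point appear from the row structure:

```python
import cv2
import numpy as np

def estimate_vanishing_point(gray: np.ndarray) -> tuple[float, float] | None:
    """Least-squares intersection of the line segments detected along the row.

    Classical-geometry illustration only; a learned detector (like the one Michael
    describes) is far more robust to occlusions such as fronds over the camera.
    """
    edges = cv2.Canny(gray, 50, 150)
    segments = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=60,
                               minLineLength=40, maxLineGap=10)
    if segments is None:
        return None

    # Each segment (x1,y1)-(x2,y2) lies on the line a*x + b*y = c with normal (a, b).
    # Stack the normalized equations and solve for the point closest to all lines.
    A, c = [], []
    for x1, y1, x2, y2 in segments[:, 0]:
        a, b = float(y2 - y1), float(x1 - x2)
        norm = np.hypot(a, b)
        if norm < 1e-6:
            continue
        A.append([a / norm, b / norm])
        c.append((a * x1 + b * y1) / norm)
    if len(A) < 2:
        return None
    (u, v), *_ = np.linalg.lstsq(np.array(A), np.array(c), rcond=None)
    return float(u), float(v)   # pixel coordinates of the vanishing point
```

In a real field you would also filter out segments that don't point down the row before solving, and the more occluded the scene gets, the less reliable this becomes, which is one reason to learn the prediction instead.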
Can you see the idea? It's all happening here:
The prediction happens not via simple geometry but with a Deep Neural Network, which remains useful when the vanishing point is not at the center (for example, when turning): the offset tells you exactly how to turn.
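And here is what "exactly how to turn" can look like once you have the vanishing point and the two row boundaries. Again, a hedged sketch with made-up gains and sign conventions, not EarthSense's controller: the heading error comes from the horizontal offset of the vanishing point, the lateral error from where the row boundaries meet the bottom of the image, and a simple proportional law turns both into a steering command:

```python
import numpy as np

def row_errors(vp_u: float, left_line: tuple[float, float], right_line: tuple[float, float],
               image_w: int, image_h: int, fx: float) -> tuple[float, float]:
    """The two pieces of information Michael mentions, read from image geometry.

    Row boundaries are given as u = slope * v + intercept in pixel coordinates.
    Signs depend on your camera/vehicle conventions; treat everything as illustrative.
    """
    cx = image_w / 2.0

    # 1) How far we are tilted from the row direction: the vanishing point drifts
    #    horizontally away from the image center when the robot yaws.
    heading_error = float(np.arctan2(vp_u - cx, fx))                       # radians

    # 2) How far we are from the center of the row: compare the image center with
    #    the midpoint of the two row boundaries at the bottom of the image.
    v_bottom = image_h - 1
    u_left = left_line[0] * v_bottom + left_line[1]
    u_right = right_line[0] * v_bottom + right_line[1]
    lateral_error = ((u_left + u_right) / 2.0 - cx) / (u_right - u_left)   # fraction of row width

    return heading_error, lateral_error


def steering_command(heading_error: float, lateral_error: float,
                     k_heading: float = 1.0, k_lateral: float = 0.8) -> float:
    """Simple proportional steering that drives both errors to zero (gains are made up)."""
    return -(k_heading * heading_error + k_lateral * lateral_error)
```

Whether the vanishing point comes from classical geometry or from a network, the steering logic downstream can stay the same, which is part of why the idea feels so simple.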
Of course, it's more "complex" than it looks (it always is, isn't it?). Michael already mentioned the idea of "fronds" (giant palm leaves) covering the camera and disturbing the vanishing point detection... But there are other challenges, such as traversability, end-of-row identification, mapping, and more...
Still, the idea of the algorithm is surprisingly simple (and I LOVE simple ideas).
So these are the 2 things we learned from Michael in this clip. Of course, the full in-depth interview is available to members of The Edgeneer's Land, my community membership experience.
But right now, I would like to leave you with 2 things: a bonus video from Michael sharing his Top 3 Computer Vision skills, and an invite to an event on Thursday, March 19, where I'll host a live session to tell you all about Off-Road Robotics and the 3 core skills to build there.
Bonus Video: The Top 3 Skills of Computer Vision Engineers
Special Invite for readers of this post: The Off-Road Robotics Event
If you enjoyed this article, you're probably interested in learning more about any robot that goes "off-road". Good news: on Thursday, March 19, I will be hosting a LIVE experimentation of all the algorithms discussed with Michael, PLUS way more! All tickets are FREE, and the experience is unique, never to be repeated!
Click "Book Your Ticket" below to access it!