The probe homes in on one of Tesla’s most eyebrow-raising decisions when it comes to its driver assistance package: the insistence on relying exclusively on camera sensors instead of the LiDAR and radar its competitors use, technologies CEO Elon Musk has long derided as a “crutch.”
In 2022, the company went all-in on cameras, ditching ultrasonic sensors in its vehicles altogether — a decision that could prove to be a major mistake as it struggles to catch up with its competition and has now promised robust self-driving capabilities to owners who may lack the necessary sensor hardware.
Elon Musk often says that humans drive with their eyes, but it’s untrue. We actually have a wide array of sensory systems that help us drive. Firstly, we use our ears, eyes and sense of body motion to drive. Secondly, unlike a fixed camera mounted on a car, our heads are in constant motion. This means that we cover blind spots better than a fixed camera, and we are able to tell whether we’re looking at a small deer really close by or a large deer really far away. Our brains take multiple 3D images and stitch them together to determine size, distance and speed.
The best way to explain the “driving using your eyes” fallacy is to look at FPV (first-person view) RC cars and see how much sensory information you are robbed of while trying to pilot the vehicle.
Not only are our heads in constant motion; our eyes are always in motion too. We’re constantly, quickly and accurately shifting our attention to different points in our field of vision.
That’s mostly to account for the differing resolution and motion sensitivity in different parts of the eye. With enough cameras a car should be able to “see” more than we could at any one time.
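To put numbers on the deer example: a single still image only gives you how large something appears (its angular size), and that depends on the ratio of size to distance. Here is a minimal sketch; the deer heights and distances are made-up illustrative values, and `angular_size_deg` is a hypothetical helper name.

```python
import math


def angular_size_deg(height_m: float, distance_m: float) -> float:
    """Angle (in degrees) that an object of a given height subtends at a given distance."""
    return math.degrees(2 * math.atan(height_m / (2 * distance_m)))


# Illustrative values only: a 0.5 m fawn at 10 m and a 1.5 m adult at 30 m.
small_close = angular_size_deg(0.5, 10.0)
large_far = angular_size_deg(1.5, 30.0)

# Both have the same height-to-distance ratio, so they look the same size in one frame.
print(math.isclose(small_close, large_far))
```

Telling the two apart takes something extra: a second viewpoint, motion over time, or prior knowledge of how big the object really is.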
No, not really true.
The way AI systems have been implemented in cars produces a flat image, which is run through some fancy AI that then arrives at a conclusion. But what if one camera sees a child and, for whatever reason, the other sees a clear road? The AI is not trained to process vision the way we do, where we use all our various senses, including the conflicting info we get from each eye, to arrive at a conclusion. It just merges the inputs and then processes them. It should process each sensor separately, then reprocess the results to arrive at a conclusion.
To some extent you are correct, but also notice that the cameras in Teslas are not installed in pairs, so they don’t have depth perception. And since the cars don’t have lidar or radar, they have no alternate method to measure depth and distance.
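The difference between "merge, then decide" and "decide per sensor, then reconcile" can be shown with a toy example. This is only a sketch of the distinction drawn above, not a description of how any real vehicle works; the `Detection` class, the function names, the thresholds and the probabilities are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    camera: str           # hypothetical camera label
    obstacle_prob: float  # that camera's own estimate that something is in the road


def decide_by_averaging(detections: list[Detection], threshold: float = 0.5) -> bool:
    """Merge first: average all estimates, then make one decision."""
    mean = sum(d.obstacle_prob for d in detections) / len(detections)
    return mean >= threshold


def decide_per_sensor(
    detections: list[Detection],
    threshold: float = 0.5,
    conflict_gap: float = 0.4,
) -> tuple[bool, bool]:
    """Decide per sensor, then reconcile: returns (obstacle, sensors_conflict)."""
    probs = [d.obstacle_prob for d in detections]
    # Conservative rule: act on the most alarming sensor.
    obstacle = max(probs) >= threshold
    # Flag the disagreement so it can be handled explicitly.
    sensors_conflict = (max(probs) - min(probs)) >= conflict_gap
    return obstacle, sensors_conflict


# One camera is fairly sure it sees a child; the other sees a clear road.
frames = [Detection("left", 0.9), Detection("right", 0.05)]
print(decide_by_averaging(frames))  # the disagreement is averaged away
print(decide_per_sensor(frames))    # the alarming reading and the conflict both survive
```

With these made-up numbers the averaged estimate falls below the threshold, while the per-sensor version still reacts and also records that the cameras disagree.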
The cameras have overlaps which can be used to measure depth and distance.
There are multiple front cameras.
The side pillar camera overlaps with the side rear-facing camera.
The two side rear-facing cameras each overlap with the rear camera.
Edit: I imagine their weakest depth/distance perception with the current setup would be from the side pillar cameras. But they could probably also do some calculations based on how fast an object passes from the front cameras to the rear ones.
Nothing you said there can’t be done by cameras, other than sound, and the car has a microphone inside. We just might not have the capabilities yet and need to keep improving them.
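The overlap argument comes down to stereo triangulation: if two cameras a known distance apart both see the same point, its depth follows from how far it shifts between the two images (the disparity). Below is a rough sketch, assuming an idealized pair of parallel, calibrated cameras; the focal length, baseline and disparity values are invented for illustration and are not Tesla's actual specifications.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth of a point seen by two parallel cameras: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive (point must be visible in both views)")
    return focal_px * baseline_m / disparity_px


# Assumed values: 1000 px focal length, cameras 0.5 m apart, 25 px disparity.
print(depth_from_disparity(1000.0, 0.5, 25.0))  # 20.0 (metres)

# The same setup with the disparity measured one pixel short.
print(depth_from_disparity(1000.0, 0.5, 24.0))
```

Because depth is inversely proportional to disparity, small matching errors matter more the farther away the object is, and a short baseline or a narrow overlap region makes this worse.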
All it really means is maybe the car needs more cameras and more microphones.
Images from multiple angles over time can provide accurate distance and velocity estimates.
You’re not wrong, but that’s also a fantasy with current technology. Meanwhile, cars are dangerous, heavy, hard boxes travelling around at high speed while we “get the technology right”, and that’s unacceptable.
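As a minimal sketch of the idea, here is the simplest case: a single camera on a car moving in a straight line sees the same stationary object at two moments. It assumes the distance travelled between the two frames is known (for example from wheel odometry) and that the bearing to the object can be read from each image; the function name and the numbers are hypothetical.

```python
import math


def triangulate_static_point(
    bearing1_rad: float, bearing2_rad: float, travelled_m: float
) -> tuple[float, float]:
    """Locate a stationary point from two bearings taken travelled_m apart.

    Bearings are measured from the direction of travel. Returns (ahead_m, sideways_m)
    relative to the position where the first bearing was taken.
    """
    t1 = math.tan(bearing1_rad)
    t2 = math.tan(bearing2_rad)
    if math.isclose(t1, t2):
        raise ValueError("bearings are parallel; no parallax to triangulate from")
    # From tan(b1) = y / x and tan(b2) = y / (x - travelled_m).
    ahead = travelled_m * t2 / (t2 - t1)
    sideways = ahead * t1
    return ahead, sideways


# Made-up scene: object 20 m ahead and 5 m to the side; the car moves 2 m between frames.
b1 = math.atan2(5.0, 20.0)
b2 = math.atan2(5.0, 18.0)
print(triangulate_static_point(b1, b2, 2.0))  # approximately (20.0, 5.0)
```

This only works cleanly when the object is not moving and is not directly in the line of travel, where the bearing does not change. A moving object needs extra information, such as a second camera or assumptions about its motion, to separate its distance from its velocity.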