Tesla AI Day

Deep Understanding Tesla FSD Part 2: Vector Space

Jason Zhang
17 min read · Oct 19, 2021

From Tesla AI Day

This is the second article in my series on Deep Understanding Tesla FSD.

  1. Deep Understanding Tesla FSD Part 1: HydraNet
  2. Deep Understanding Tesla FSD Part 2: Vector Space
  3. Deep Understanding Tesla FSD Part 3: Planning & Control, Auto Labeling, Simulation
  4. Deep Understanding Tesla FSD Part 4: Labeling, Simulation, etc

In the previous article, we discussed the architecture of Tesla’s neural network, HydraNet. At present, HydraNet can only process the input from a single camera.

From: Tesla AI Day

Vector Space

When the Tesla AI team worked towards FSD, they quickly found that a single camera is not enough. They needed more cameras, and the predictions of the perception system had to be converted into a three-dimensional space, which is also the foundation of the Plan & Control system. Tesla calls this 3D space “Vector Space”. Information about the vehicle and its surroundings, such as the vehicle’s position and speed, lanes, signs, signal lights, and surrounding objects, is digitized and then visualized in this space.

From: Tesla AI Day
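
To make "converting image predictions into 3D space" more concrete, here is a minimal sketch of the textbook way to do it by hand: back-project a pixel through a pinhole camera model and intersect the ray with a flat ground plane. This is my own illustration, not Tesla's code. The function name `pixel_to_ground`, the intrinsics (`fx`, `fy`, `cx`, `cy`), the mounting parameters, and the flat-ground, zero-pitch assumptions are all hypothetical.

```python
import math

def pixel_to_ground(u, v, fx, fy, cx, cy, cam_height,
                    cam_yaw=0.0, cam_xy=(0.0, 0.0)):
    """Project pixel (u, v) onto a flat ground plane in the vehicle frame.

    Assumptions (illustrative only): ideal pinhole camera, no lens
    distortion, zero pitch and roll, perfectly flat road.
    Vehicle frame: x points forward, y points left, in meters.
    cam_yaw is the camera heading in radians (counterclockwise from forward).
    Returns (x, y), or None if the pixel is at or above the horizon.
    """
    # Ray direction in the camera frame (x right, y down, z forward).
    dx = (u - cx) / fx
    dy = (v - cy) / fy
    if dy <= 0:
        return None  # the ray never reaches the ground

    # The ground lies cam_height below the camera, so scale the ray to hit it.
    t = cam_height / dy
    forward = t          # distance along the camera's optical axis
    left = -t * dx       # camera "right" is vehicle "minus left"

    # Rotate by the camera yaw and shift by its mounting position.
    c, s = math.cos(cam_yaw), math.sin(cam_yaw)
    x = cam_xy[0] + c * forward - s * left
    y = cam_xy[1] + s * forward + c * left
    return (x, y)

# A pixel 100 rows below the image center, camera 1.5 m above the road.
print(pixel_to_ground(640, 460, fx=1000, fy=1000, cx=640, cy=360, cam_height=1.5))
```

With these made-up numbers the point lands 15 m straight ahead of the camera. The sketch only works as long as the road really is flat and the calibration is exact, which is a strong assumption for a real car.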

Occupancy Tracker

They developed a system named Occupancy Tracker in C++. It stitched together the curb detections from the images across camera scenes, across camera boundaries, and over time. But this design has two problems:

Problem 1: Across-camera fusion and the tracker are very difficult to write explicitly. Tuning the occupancy tracker and all of its hyperparameters was extremely complicated, and tuning C++ programs by hand is a nightmare for any programmer.

Problem 2: Image space is not the right output space. You should make predictions in the vector…
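
To give a feel for the first problem, here is a toy hand-written tracker. It is written in Python rather than C++ for brevity, and it is only a sketch of the general idea, not Tesla's Occupancy Tracker. The class name, the camera names, and the four hyperparameters (`cell_size`, `decay`, `hit_weight`, `threshold`) are all my own assumptions. It also assumes the curb points have already been projected into one fixed ground frame and ignores ego motion.

```python
import math
from collections import defaultdict

class OccupancyGridTracker:
    """Toy grid tracker that fuses curb points across cameras and time."""

    def __init__(self, cell_size=0.5, decay=0.8, hit_weight=1.0, threshold=1.5):
        self.cell_size = cell_size    # grid resolution in meters
        self.decay = decay            # how quickly old evidence fades per frame
        self.hit_weight = hit_weight  # evidence added per camera detection
        self.threshold = threshold    # evidence needed to call a cell occupied
        self.evidence = defaultdict(float)

    def _cell(self, x, y):
        # Map a ground-plane point (meters) to an integer grid cell.
        return (math.floor(x / self.cell_size), math.floor(y / self.cell_size))

    def update(self, detections_by_camera):
        """detections_by_camera: {camera_name: [(x, y), ...]} for one frame."""
        # Fusion over time: fade evidence from earlier frames.
        for cell in list(self.evidence):
            self.evidence[cell] *= self.decay
            if self.evidence[cell] < 1e-3:
                del self.evidence[cell]
        # Fusion across cameras: each camera votes at most once per cell.
        for points in detections_by_camera.values():
            for cell in {self._cell(x, y) for x, y in points}:
                self.evidence[cell] += self.hit_weight

    def occupied_cells(self):
        return {c for c, e in self.evidence.items() if e >= self.threshold}

tracker = OccupancyGridTracker()
# Two cameras see the same curb point near a camera boundary; one sees another.
tracker.update({"front": [(10.1, 2.1), (5.0, 0.0)], "left_pillar": [(10.2, 2.2)]})
print(tracker.occupied_cells())
```

Even in this tiny version, whether a curb cell counts as occupied depends on four numbers that interact with each other, and each one has to be tuned by hand. A real tracker also has to handle ego motion, calibration errors, and disagreement between cameras, which multiplies the number of knobs.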


Written by Jason Zhang

Software Engineer, Kaggle Competitions Expert
