With this in mind, scientists from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) created “VISTA 2.0,” a data-driven simulation engine where vehicles can learn to drive in the real world and recover from near-crash scenarios. What’s more, all of the code is being released open-source to the public.
“Today, only companies have software like the type of simulation environments and capabilities of VISTA 2.0, and this software is proprietary. With this release, the research community will have access to a powerful new tool for accelerating the research and development of adaptive robust control for autonomous driving,” says the senior author of a paper about the research, MIT Professor and CSAIL Director Daniela Rus.
VISTA is a data-driven photorealistic simulator for autonomous driving. In addition to live video, you can also simulate LiDAR data and event cameras, and incorporate other simulated vehicles to model complex driving situations. VISTA is open source and the code can be found below.
Built on the team’s previous model, VISTA, VISTA 2.0 is fundamentally different from existing AV simulators because it is data-driven: it was built from real-world data and rendered photorealistically, which enables direct transfer to reality. The first iteration supported only single-car lane following with one camera sensor, and achieving high-fidelity data-driven simulation required rethinking the foundations of how different sensors and behavioral interactions can be synthesized.
Enter VISTA 2.0, a data-driven system that can simulate complex sensor types as well as highly interactive scenarios and intersections at scale. Using far less data than previous models, the team was able to train self-driving cars that were far more robust than those trained on large amounts of real-world data.
“This is a huge leap in the capabilities of data-driven simulation for self-driving cars, as well as an increase in scale and in the ability to handle greater driving complexity,” says Alexander Amini, a CSAIL PhD student and co-author of the new paper together with fellow doctoral student Tsun-Hsuan Wang. “VISTA 2.0 demonstrates the ability to simulate sensor data well beyond 2D RGB cameras, including very high-dimensional 3D lidar with millions of points, irregularly timed event-based cameras, and even interactive and dynamic scenarios with other vehicles.”
The team of scientists was able to scale the complexity of interactive driving tasks such as passing, following, and negotiating, including multi-agent scenarios in highly realistic environments.
Training AI models for self-driving cars requires examples of all sorts of edge cases and strange, dangerous scenarios, and these are difficult to obtain because most of our data is (thankfully) nothing more than mundane, everyday driving. Logically, we can’t crash into other cars just to teach a neural network how to avoid crashing into them.
More recently, the field has moved away from classical, human-designed simulation environments toward environments built from real-world data. The latter are highly photorealistic, while the former can easily model virtual cameras and lidars. This paradigm shift raises an important question: can all the rich and complex sensors that autonomous vehicles need, such as lidar and event-based cameras, be synthesized accurately?
Lidar sensor data is much more difficult to interpret in a data-driven world: you are effectively trying to generate brand-new 3D point clouds containing millions of points from only sparse views of the world. To synthesize 3D lidar point clouds, the team took the data collected by the car, projected it into a 3D space derived from the lidar data, and then let a new virtual vehicle drive around locally from where the original vehicle was. Finally, with the help of neural networks, they projected all of that sensory information back into the field of view of this new virtual vehicle.
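The geometric core of that reprojection step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the VISTA 2.0 implementation: the function names `pose_matrix` and `reproject_points` are hypothetical, poses are simplified to planar (x, y, yaw), and the neural network step that fills in a dense, realistic scan is left out entirely.

```python
import numpy as np

def pose_matrix(x, y, yaw):
    """Hypothetical helper: 4x4 transform from a vehicle frame to the world frame."""
    c, s = np.cos(yaw), np.sin(yaw)
    T = np.eye(4)
    T[:2, :2] = [[c, -s], [s, c]]  # rotation about the vertical axis
    T[0, 3], T[1, 3] = x, y        # planar translation
    return T

def reproject_points(points, recorded_pose, virtual_pose):
    """Express points recorded in one vehicle frame in a nearby virtual vehicle frame."""
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    # recorded vehicle frame -> world frame -> virtual vehicle frame
    T = np.linalg.inv(virtual_pose) @ recorded_pose
    return (homogeneous @ T.T)[:, :3]

# Assumed toy data: two lidar returns, and a virtual car shifted 0.5 m sideways.
recorded = pose_matrix(0.0, 0.0, 0.0)
virtual = pose_matrix(0.0, 0.5, 0.0)
scan = np.array([[10.0, 0.0, 0.0], [5.0, 2.0, 1.0]])
shifted = reproject_points(scan, recorded, virtual)
```

A rigid transform like this only moves the points that were actually recorded. It cannot account for surfaces that become visible or hidden from the new viewpoint, which is presumably where the learned component described above comes in.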
Together with the simulation of event-based cameras, which operate at speeds greater than thousands of events per second, the simulator was able not only to simulate this multimodal information but to do so in real time. This makes it possible to train neural networks offline and also to test them online on the car in augmented reality setups for safe evaluation. “Whether multi-sensor simulation at this scale of complexity and photorealism was possible in the realm of data-driven simulation was very much an open question,” says Amini.
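For readers unfamiliar with event-based cameras, the sketch below shows one common textbook model of how such a sensor behaves: a pixel emits an event whenever its log brightness changes by more than a threshold. It is an illustration only and should not be read as the method VISTA 2.0 uses. The function name `events_from_frames` and the `threshold` and `eps` values are assumptions chosen for the example.

```python
import numpy as np

def events_from_frames(prev_frame, next_frame, threshold=0.2, eps=1e-3):
    """Return (rows, cols, polarity) for pixels whose log intensity changed enough.

    Assumed simplified model: one event per pixel per frame pair, with no timestamps.
    """
    delta = np.log(next_frame + eps) - np.log(prev_frame + eps)
    rows, cols = np.nonzero(np.abs(delta) >= threshold)
    polarity = np.sign(delta[rows, cols]).astype(int)  # +1 brighter, -1 darker
    return rows, cols, polarity

# Assumed toy frames: one pixel gets brighter and one gets darker.
prev_frame = np.full((2, 3), 0.5)
next_frame = prev_frame.copy()
next_frame[0, 1] = 1.0
next_frame[1, 2] = 0.1
rows, cols, polarity = events_from_frames(prev_frame, next_frame)
```

Because only changed pixels produce output, the result is a sparse, irregular stream rather than a full image, which is part of what makes this sensor type harder to synthesize than an ordinary camera.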
With that, the driving school becomes a party. In the simulation, you can move around, use different types of controllers, simulate different types of events, create interactive scenarios, and drop in brand-new vehicles that weren’t in the original data. The team tested lane following, lane changing, car following, and more dangerous scenarios such as static and dynamic passing (seeing obstacles and moving around them to avoid a collision). With multi-agency, both real and simulated agents can interact, and new agents can be dropped into the scene and controlled in any way.
Taking their full-size car out “in the wild,” that is, Devens, Massachusetts, the team saw that the results, both failures and successes, transferred immediately. They were also able to demonstrate the bold, magic word for self-driving car models: “robust.” They showed that AVs trained in VISTA 2.0 are so robust in the real world that they can handle the elusive tail of difficult failures.
Now, one guardrail that humans rely on and that cannot yet be simulated is human emotion: the friendly wave, nod, or blinker switch of acknowledgment, the type of nuance the team would like to implement in future work.
“The central algorithm of this research is how to take a dataset and build a fully synthetic world for learning and autonomy,” says Amini. “This is a platform that we believe can one day be extended along many different axes across robotics. Not just autonomous driving, but many other areas that rely on vision and complex behavior. We have released VISTA 2.0 so that the community can collect their own datasets, transform them into virtual worlds, directly simulate their own virtual autonomous vehicles, drive around these virtual terrains, train autonomous vehicles in these worlds, and then transfer them directly to full-size, real self-driving cars.”
Reference: “VISTA 2.0: An Open Data-Driven Simulator for Multimodal Sensing and Policy Learning in Autonomous Vehicles” by Alexander Amini, Tsun-Hsuan Wang, Igor Gilitschenski, Wilko Schwarting, Zhijian Liu, Song Han, Sertac Karaman and Daniela Rus, November 23, 2021, Computer Science > Robotics.
Amini and Wang wrote the paper together with Zhijian Liu, MIT CSAIL PhD student; Igor Gilitschenski, assistant professor of computer science at the University of Toronto; Wilko Schwarting, AI research scientist and MIT CSAIL PhD ’20; Song Han, associate professor in the Department of Electrical Engineering and Computer Science at MIT; Sertac Karaman, associate professor of aeronautics and astronautics at MIT; and Daniela Rus, MIT professor and director of CSAIL. The researchers presented this work at the IEEE International Conference on Robotics and Automation (ICRA) in Philadelphia.
This work was supported by the National Science Foundation and the Toyota Research Institute. The team acknowledges the support of NVIDIA with the donation of a Drive AGX Pegasus.