
Research Problem

Due to the cost and risk of deploying autonomous driving systems (ADS) in the real world, simulation testing has become an important complementary technique for the safety assessment of ADS. In essence, simulation testing is a scenario-driven approach whose effectiveness depends heavily on the quality of the given simulation scenarios. Moreover, simulation scenarios must be encoded into well-formatted files; otherwise, ADS simulation platforms cannot take them as inputs. Without large public datasets of simulation scenario files, both industrial and academic applications of ADS simulation testing are hindered.

To fill this gap, we propose SCTrans, a transformation-based approach that constructs simulation scenario files from existing traffic scenario datasets (i.e., naturalistic movement of road users recorded on public roads). Specifically, we transform existing traffic scenario recording files into simulation scenario files that are compatible with advanced ADS simulation platforms, and we formalize this task as a Model Transformation Problem. Following this idea, we implement SCTrans as an automated tool and use it to construct a dataset of over 1,900 diverse simulation scenarios.

Both the source code and the dataset of SCTrans are public; please see our sharing policy for more details.

Approach Overview

To address the lack of public simulation scenario files, we perform a format transformation: traffic scenario recording files are converted into simulation scenario files while the scenario semantics are preserved. To ensure the correctness of this conversion, we formalize the task as a Model Transformation Problem.

Specifically, we begin by creating Meta-Models for both traffic scenario recording files and simulation scenario files, referred to as the source Meta-Model and target Meta-Model respectively. These Meta-Models outline the abstract syntax of the respective languages, encompassing elements, structures, attributes, and their constraints.

Once the mappings between the elements and attributes of the source Meta-Model and those of the target Meta-Model are established, we formulate concrete transformation rules. These rules enable the automated conversion of a traffic scenario recording file into a simulation scenario file.
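
As a rough illustration of this idea, the following Python sketch defines a toy source Meta-Model, a toy target Meta-Model, and a few transformation rules between them. All class names, fields, constraints, and the KIND_TO_CATEGORY mapping are hypothetical and chosen only for illustration; they are not the Meta-Models or rules that SCTrans actually uses.

```python
from dataclasses import dataclass
from typing import Dict, List


# --- Source meta-model (hypothetical): a traffic scenario recording ---
@dataclass
class RecordedState:
    time_s: float        # timestamp in the recording's own clock
    x_m: float
    y_m: float
    heading_rad: float


@dataclass
class RecordedTrack:
    track_id: int
    kind: str            # e.g. "car", "pedestrian"
    states: List[RecordedState]


@dataclass
class Recording:
    map_name: str
    tracks: List[RecordedTrack]


# --- Target meta-model (hypothetical): a simulation scenario ---
@dataclass
class Waypoint:
    time_s: float        # seconds since scenario start
    x_m: float
    y_m: float
    heading_rad: float


@dataclass
class Agent:
    name: str
    category: str
    waypoints: List[Waypoint]


@dataclass
class Scenario:
    map_name: str
    agents: List[Agent]


# Assumed attribute mapping from source kinds to target categories.
KIND_TO_CATEGORY: Dict[str, str] = {
    "car": "vehicle",
    "truck": "vehicle",
    "bicycle": "bicycle",
    "pedestrian": "pedestrian",
}


def check_source_constraints(recording: Recording) -> None:
    """Reject recordings that violate the (assumed) source constraints."""
    for track in recording.tracks:
        if track.kind not in KIND_TO_CATEGORY:
            raise ValueError(f"track {track.track_id}: unmapped kind {track.kind!r}")
        if not track.states:
            raise ValueError(f"track {track.track_id}: no states")
        times = [s.time_s for s in track.states]
        if any(later <= earlier for earlier, later in zip(times, times[1:])):
            raise ValueError(f"track {track.track_id}: timestamps must increase")


def state_to_waypoint(state: RecordedState, start_time_s: float) -> Waypoint:
    """Rule: RecordedState -> Waypoint, with time rebased to scenario start."""
    return Waypoint(state.time_s - start_time_s, state.x_m, state.y_m, state.heading_rad)


def track_to_agent(track: RecordedTrack, start_time_s: float) -> Agent:
    """Rule: RecordedTrack -> Agent, mapping the kind to a target category."""
    return Agent(
        name=f"agent_{track.track_id}",
        category=KIND_TO_CATEGORY[track.kind],
        waypoints=[state_to_waypoint(s, start_time_s) for s in track.states],
    )


def recording_to_scenario(recording: Recording) -> Scenario:
    """Rule: Recording -> Scenario, applied after validating the source model."""
    check_source_constraints(recording)
    start_time_s = min((t.states[0].time_s for t in recording.tracks), default=0.0)
    return Scenario(
        map_name=recording.map_name,
        agents=[track_to_agent(t, start_time_s) for t in recording.tracks],
    )
```

In this sketch each rule maps one source element type to one target element type, so the conversion can be reviewed rule by rule. A real transformation would additionally have to serialize the target model into the concrete file formats expected by the simulators.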

Our approach is denoted as SCTrans.

[Figure: SCTrans running example]

Dataset

To evaluate the prototype of SCTrans, we randomly selected traffic scenario recording files from three representative traffic scenario datasets: CommonRoad[1], inD[2], and highD[3]. Transforming them yielded 1,994 simulation scenario files: 994 from CommonRoad, 500 from inD, and 500 from highD. These scenarios are built on 406 different maps and are compatible with advanced simulation platforms such as LGSVL[4]+Apollo[5] and CARLA[6]+Autoware[7].

Each output includes two scenario description files (VSE Scenario and OpenSCENARIO), four map description files (Apollo HD Map, Autoware Vector Map, OpenDRIVE, and Lanelet2), and one map assets file (LGSVL AssetBundle). Additionally, we provide scenario players for the two simulators.

We intend to release this comprehensive dataset, containing all curated scenario files generated by SCTrans. To access this dataset, please follow our sharing protocol.

Please note that, according to our policy, we cannot provide direct access to the source traffic scenarios. If you wish to access them, please refer to references [8, 9, 10] and adhere to their respective requirements.

Source Code

We are dedicated to providing comprehensive access to the source code of all SCTrans modules. This encompasses SCTrans itself, the scenario runner for two simulators, and any other modified code. For straightforward deployment, we now offer a Docker container to facilitate easy setup and usage.

Accessing the Source Code:

Community Contributions:

We welcome contributions from the community to improve the source code. If you have enhancements or find bugs, please share them by creating issues or submitting pull requests on our GitHub repository. Your contributions help us and the wider community by enhancing the tool’s effectiveness and reliability.

The complete source code for all SCTrans modules is available in our GitHub repository, SCTrans. If you find the repository useful, please consider citing our paper and giving the repository a star 🌟.

Supplemental Source

To complement the SCTrans tools and facilitate integration with popular simulation platforms, the following resources are available:

Open Source Protocol

Usage Limitation

We invite and encourage everyone to utilize the SCTrans tool and dataset for their projects and research. To preserve the integrity of our tool and dataset and to prevent potential misuse, we have established an authentication process to verify the identities of users seeking access. The SCTrans tool and dataset are exclusively intended for non-commercial academic research purposes. Use for commercial purposes, including but not limited to licensing, selling, or any activity aimed at commercial gain, is strictly prohibited. Additionally, redistribution of the dataset or source code to third parties requires our express consent.

Accessing the Dataset

To access the SCTrans source code and dataset, please follow these detailed instructions and send your request via email to our designated contacts.

Instructions for Request Email

For All Applicants: Please send your access request email containing the following items:

Emails should be sent to Jiarun Dai (jrdai14@fudan.edu.cn) and Bufan Gao (bfgao22@m.fudan.edu.cn).

Acknowledgements: If you use our dataset or tool in your publications, we would appreciate a citation to our ICSE 2024 paper.

Community Engagement and Error Reporting

We encourage active participation from our dataset community. If you encounter any discrepancies or errors within the dataset, please report them via email. After confirming the issue, we will:

Feel free to submit issues; your insights are valuable to the continuous improvement of SCTrans.

Transparency

We reserve the right to publicly display on the SCTrans homepage the names and logos of institutions, research laboratories, and companies that have successfully applied for dataset access. By sending an email request for access, you acknowledge and agree to abide by the above policy.

Paper Info

[ICSE 2024] SCTrans: Constructing a Large Public Scenario Dataset for Simulation Testing of Autonomous Driving Systems

Jiarun Dai, Bufan Gao, Mingyuan Luo, Zongan Huang, Zhongrui Li, Yuan Zhang, Min Yang.

Paper Link: ICSE24-SCTrans

Citation

@inproceedings{SCTrans-ICSE-2024,
    author={Jiarun Dai and Bufan Gao and Mingyuan Luo and Zongan Huang and Zhongrui Li and Yuan Zhang and Min Yang},
    booktitle={Proceedings of the 2024 International Conference on Software Engineering},
    title={Constructing a Large Public Scenario Dataset for Simulation Testing of Autonomous Driving Systems},
    year={2024},
}

Team Members

Reference

[1] M. Althoff, M. Koschi, and S. Manzinger. CommonRoad: Composable Benchmarks for Motion Planning on Roads. In Proceedings of the IEEE Intelligent Vehicles Symposium (IV), 2017.

[2] J. Bock, R. Krajewski, T. Moers, S. Runde, L. Vater, and L. Eckstein. The inD Dataset: A Drone Dataset of Naturalistic Road User Trajectories at German Intersections. In Proceedings of the IEEE Intelligent Vehicles Symposium (IV), 2020.

[3] R. Krajewski, J. Bock, L. Kloeker, and L. Eckstein. The highD Dataset: A Drone Dataset of Naturalistic Vehicle Trajectories on German Highways for Validation of Highly Automated Driving Systems. In Proceedings of the 21st International Conference on Intelligent Transportation Systems (ITSC), 2018.

[4] G. Rong, B. H. Shin, H. Tabatabaee, Q. Lu, S. Lemke, M. Možeiko, E. Boise, G. Uhm, M. Gerow, S. Mehta, et al. LGSVL Simulator: A High Fidelity Simulator for Autonomous Driving. In Proceedings of the 23rd IEEE International Conference on Intelligent Transportation Systems (ITSC), 2020.

[5] Open-sourced Version of Baidu Apollo. https://github.com/ApolloAuto/apollo, 2023.

[6] A. Dosovitskiy, G. Ros, F. Codevilla, A. Lopez, and V. Koltun. CARLA: An Open Urban Driving Simulator. In Conference on Robot Learning, 2017.

[7] Autoware-AI. https://github.com/autowarefoundation/autoware, 2023.

[8] CommonRoad. https://commonroad.in.tum.de, 2023.

[9] InD Dataset. https://www.ind-dataset.com, 2023.

[10] HighD Dataset. https://www.highd-dataset.com, 2023.