Abstract:
Autonomous Vehicle (AV) systems demand precise and timely processing of perception, prediction, and planning components to ensure safe and efficient operation. However, evaluating these components in real-world environments remains challenging. This thesis presents a novel framework for benchmarking AV systems, featuring a scalable gRPC-based data pipeline and fine-tuned models for tasks including object detection, semantic segmentation, tracking, and audio detection. Hosted across distributed locations to emulate edge-cloud architectures, the pipeline captures latency-accuracy tradeoffs under diverse network and computational constraints. We created a comprehensive dataset of standardized performance metrics, including latency, throughput, resource consumption, and time-sensitive accuracy, across various algorithms and scenarios, extending the Pylot platform's evaluation framework. Experimental results show that fine-tuned models outperform baseline implementations, while distributed hosting highlights the impact of network-induced latency on safety and efficiency. This work provides a robust, network-aware benchmarking resource, fostering reproducible, context-dependent AV research and advancing safer, more scalable autonomous driving systems.