Traffic Shaping and Control in Mixed Autonomy Environments – Strategies for Congestion Management
Advisor: Professor Alireza Talebpour
Abstract
Numerous congestion management strategies have been proposed over the years to tackle the ever-evolving problem of traffic congestion. Advances in connected and automated vehicle technology open the door to applying this technology in various fields of research. One such area, which connected and automated vehicles are expected to impact directly, is traffic control and management. This dissertation presents a comprehensive investigation of the interactions between human drivers and automated vehicles in real-world traffic environments and takes a deeper look at the impacts of automated vehicles on human driver behavior and on traffic conditions as a whole. It then proposes novel approaches to tackle traffic congestion from a micro-level perspective, through automated vehicle mandatory lane changing, and from a macro-level perspective, through traffic shaping and congestion alleviation that leverage the capabilities of connected and automated vehicles. The first part of this dissertation uses datasets collected in real traffic, while the latter parts build accurate simulations of relevant traffic events and use them to develop and test models and congestion management strategies.
Accordingly, the first study examines human-automated vehicle interactions and measures the difference in driving behavior between human drivers and automated vehicles. To that end, a comprehensive data collection effort was first conducted at multiple locations within the United States, such as the I-90/I-94 expressway in Chicago, IL and downtown Phoenix, AZ. The data collection involved introducing automated vehicles of varying levels of autonomy into the traffic stream (i.e., Level 1 to Level 4, where Level 1 refers to longitudinal adaptive cruise control only, and Level 4 refers to fully autonomous driving with no human driver behind the wheel). Interactions around the automated vehicles were captured, and vehicle trajectory information was acquired through a comprehensive trajectory extraction method. Subsequently, in order to measure similarity in driving behavior between automated vehicles and human drivers, a car-following model (i.e., the intelligent driver model) was calibrated once for the captured automated vehicle following behavior and again for human driver following behavior. The models were calibrated iteratively with a genetic algorithm over multiple following samples, which resulted in a probability distribution for each calibration parameter. The probability distributions of each parameter for human driving and automated driving were tested against each other for similarity through a number of statistical tests. The test results reveal no significant differences in driving behavior between human drivers and automated vehicles. This in turn means that the impacts of automated vehicles are not far different from those of regular vehicles driven by humans. This study paved the way for employing connected and automated vehicle technology in unconventional congestion management strategies that exploit the capabilities of this technology for an overall improvement in traffic conditions, rather than relying on automated vehicles' default decision making and driving behavior to improve traffic flow.
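As an illustration of the car-following model named above, the following minimal sketch implements the standard intelligent driver model acceleration equation. The function name, argument names, and default parameter values are assumptions chosen for illustration only; they are not the calibrated values obtained in this study, and the genetic algorithm calibration itself is not shown.

```python
import math


def idm_acceleration(v, gap, dv, v0=30.0, T=1.5, s0=2.0, a_max=1.0, b=1.5, delta=4.0):
    """Acceleration (m/s^2) of a following vehicle under the intelligent driver model.

    v    : follower speed (m/s)
    gap  : bumper-to-bumper distance to the leader (m), must be > 0
    dv   : approach rate, follower speed minus leader speed (m/s)
    v0, T, s0, a_max, b, delta : desired speed, desired time headway,
        jam spacing, maximum acceleration, comfortable deceleration, and
        acceleration exponent. Defaults are illustrative placeholders,
        not values calibrated in the study.
    """
    # Desired dynamic gap: jam spacing + headway term + braking interaction term.
    s_star = s0 + v * T + v * dv / (2.0 * math.sqrt(a_max * b))
    # Free-road term pulls the speed toward v0; interaction term keeps the gap.
    return a_max * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)


# Example: a follower at 20 m/s closing at 5 m/s on a leader 10 m ahead brakes.
closing = idm_acceleration(v=20.0, gap=10.0, dv=5.0)
```

In a calibration of the kind described, a parameter set such as (v0, T, s0, a_max, b) would be tuned per following sample so that simulated trajectories match observed ones, yielding one parameter vector per sample and hence a distribution per parameter.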
To that end, the next study employs connected and automated vehicle technology in a micro-level traffic management problem. The micro-level study proposes a reinforcement learning based framework for mandatory lane changing of automated vehicles in a non-cooperative environment. The objective is to create a reinforcement learning agent that performs lane-changing maneuvers successfully and efficiently, with minimal impact on traffic flow in the target lane. For this purpose, this study utilizes the double deep Q-Learning algorithm, which takes relevant traffic states as input and outputs the optimal actions (policy) for the automated vehicle. A realistic approach is put forward to deal with this problem: for instance, actions selected by the automated vehicle include steering angles and acceleration/deceleration values, and states are treated as continuous (x, y) coordinates rather than a simplistic grid representation of locations. The study shows that the reinforcement learning agent is able to learn optimal policies for the different scenarios it encounters and performs the lane-changing task safely and efficiently. This work not only illustrates the potential of reinforcement learning as a flexible framework for developing real-time lane-changing models that take into consideration multiple aspects of the road environment, but also demonstrates that connected and automated vehicle technology can be used to improve micro-level traffic operations and thus improve traffic flow as a whole.
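The distinguishing step of double deep Q-learning is how the learning target is formed: the online network selects the next action and a separate target network evaluates it. The sketch below shows only that target computation on plain lists of Q-values. The function name, the discount factor, and the assumption of a discretized action set (for example, a small set of steering and acceleration combinations) are illustrative and are not taken from the study.

```python
def double_dqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Learning target for one transition under double deep Q-learning.

    reward        : immediate reward for the transition
    done          : True if the next state is terminal
    q_online_next : Q-values of the online network at the next state,
                    one entry per discrete action (hypothetical action set)
    q_target_next : Q-values of the target network at the same next state
    gamma         : discount factor (illustrative default)
    """
    if done:
        # No bootstrapping from a terminal state.
        return reward
    # Action selection uses the online network ...
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    # ... while action evaluation uses the target network.
    return reward + gamma * q_target_next[best_action]


# Example with three hypothetical actions: the online network prefers action 1,
# so the target network's value for action 1 is the one that gets bootstrapped.
y = double_dqn_target(1.0, False, [0.2, 0.9, 0.1], [0.5, 0.3, 0.8], gamma=0.9)
```

Decoupling selection from evaluation in this way is what reduces the overestimation of action values that a single-network maximum tends to produce.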
The next study takes a macro-level view of traffic congestion. This study explores the performance of several multi-agent reinforcement learning systems in traffic shaping, i.e., achieving a particular flow level given a density, under a variety of traffic operation scenarios. In particular, this study investigates two main approaches: centralized and decentralized multi-agent reinforcement learning. Centralized systems are usually defined by a central model that transmits actions to all the agents under its control. The model is trained using inputs from all the agents, and depending on the model choice and algorithm used, it can either combine the agents' inputs during training or train on each agent's input separately. Subsequently, the trained model transmits actions to the agents in one of two ways: either one action is transmitted to all the agents to perform according to the global state of the system, or a unique action is transmitted to each agent based on its local state. Naturally, the two action transmittal methods require different state-action definitions for the model to be developed. Decentralized systems, by contrast, are more straightforward to define, since each agent is assigned a unique model that outputs actions directly to that agent based on its local observations. However, they are often more complex to build and harder to train and optimize due to the nonstationarity that becomes inherent in the system as each agent learns and changes its actions over time. Even so, it is hypothesized that in non-stationary traffic operations, decentralized systems that process local signals may be better equipped to deal with the different proposed tasks, since each agent chooses an action unique to itself based on the local observations it makes.
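The three ways of issuing actions described above can be contrasted in a short sketch. All names here are hypothetical, the policies are stand-in callables rather than trained networks, and forming the global state by concatenating local observations is an assumption made only for illustration.

```python
from typing import Callable, Dict, Sequence

Observation = Sequence[float]
Policy = Callable[[Observation], int]  # maps an observation to a discrete action index


def centralized_broadcast(policy: Policy, local_obs: Dict[str, Observation]) -> Dict[str, int]:
    """One shared model picks a single action from the global state for all agents."""
    # Assumption: the global state is the concatenation of all local observations.
    global_state = [x for agent in sorted(local_obs) for x in local_obs[agent]]
    action = policy(global_state)
    return {agent: action for agent in local_obs}


def centralized_per_agent(policy: Policy, local_obs: Dict[str, Observation]) -> Dict[str, int]:
    """One shared model is queried once per agent with that agent's local state."""
    return {agent: policy(obs) for agent, obs in local_obs.items()}


def decentralized(policies: Dict[str, Policy], local_obs: Dict[str, Observation]) -> Dict[str, int]:
    """Each agent owns a model and acts only on its own local observation."""
    return {agent: policies[agent](obs) for agent, obs in local_obs.items()}
```

The first variant needs a state definition covering the whole system, while the other two operate on per-agent states, which is why the state-action definitions differ between the transmittal methods.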
Centralized systems, by contrast, may be able to leverage combined knowledge from different sources to come up with more holistic solutions, but fall short when each agent has different local conditions, which is usually the case in traffic operations. The main reinforcement learning model (which takes the role of the controller) used for this study is based on the Deep Q-Learning algorithm (DQN). The two main systems differ in how and whether the agents are able to interact with each other, and in whether a central controller sends actions to the agents or a unique controller is assigned to each agent. A simulation environment of a single-lane loop road is the main setting within which the proposed systems are trained and tested. Since only single-lane environments are considered, the actions of the agents are confined to acceleration, deceleration, and do-nothing actions, as opposed to more general multi-lane scenarios that allow for lane-changing maneuvers. Loop roads are suitable for simulating congestion formation and dissipation events, which makes them ideal for the tasks proposed in this study. This study illustrates the advantages and disadvantages of reinforcement learning-based approaches, especially as they relate to traffic control and congestion mitigation tasks. It also demonstrates their ability to perform traffic shaping with relatively small market penetration rates.
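On a closed single-lane loop the number of vehicles is fixed, so density is constant and flow follows directly from the mean speed through the identity flow = density × space-mean speed. The sketch below computes these quantities and a simple shaping signal. The function names and the reward form (negative absolute deviation from a target flow) are assumptions for illustration, not the reward used in the study.

```python
def ring_density(num_vehicles, ring_length_m):
    """Density in vehicles per kilometer on a closed loop of the given length."""
    return num_vehicles / (ring_length_m / 1000.0)


def ring_flow(speeds_mps, ring_length_m):
    """Flow in vehicles per hour: density times space-mean speed."""
    # Averaging over all vehicles on the loop at one instant gives the space-mean speed.
    mean_speed_kmh = 3.6 * sum(speeds_mps) / len(speeds_mps)
    return ring_density(len(speeds_mps), ring_length_m) * mean_speed_kmh


def shaping_reward(speeds_mps, ring_length_m, target_flow_vph):
    """Hypothetical reward: zero at the target flow, more negative the further away."""
    return -abs(ring_flow(speeds_mps, ring_length_m) - target_flow_vph)


# Example: 20 vehicles at 10 m/s on a 500 m loop give 40 veh/km at 36 km/h.
flow = ring_flow([10.0] * 20, 500.0)
```

Because density cannot change on the loop, shaping traffic to a target flow in this setting amounts to steering the mean speed toward the target flow divided by the density.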
The final study in this dissertation builds on the previous study and compares the performance of different communication and control protocols (within the general umbrella of reinforcement learning) on several additional tasks, including alleviating congestion, maintaining flow levels for extended periods of time, and shaping traffic flow to a desired target value. This study explores practical issues in deploying connected and automated vehicles on the road and discusses challenges in the different methods of communication and cooperation between agents.
Finally, this dissertation opens the door for practitioners and researchers to look at traffic congestion from a new point of view. Rather than focusing on passive congestion management strategies that seek to mitigate congestion before traffic breakdown occurs and become irrelevant once it does, this work illustrates that it is possible to devise congestion management strategies that remain effective beyond traffic breakdown, and it paves the way for further exploration of this area of traffic flow theory and applications.