longitude trajectory optimization for autonomous ...

LONGITUDE TRAJECTORY OPTIMIZATION FOR AUTONOMOUS VEHICLES: AN APPROACH BASED ON SIMPLIFIED CAR-FOLLOWING MODEL

Yuguang Wei
School of Traffic and Transportation, Beijing Jiaotong University, Beijing, 100044, China
Email: [email protected]

Jiangtao Liu
School of Sustainable Engineering and the Built Environment, Arizona State University, Tempe, AZ 85281, USA
Email: [email protected]

Pengfei Li
School of Computing, Informatics and Decision Systems Engineering, Arizona State University, Tempe, AZ 85281, USA
Email: [email protected]

Xuesong Zhou*
School of Sustainable Engineering and the Built Environment, Arizona State University, Tempe, AZ 85281, USA
Email: [email protected]
Tel.: (1)-480-9655827

Word count: 4,659 words text + 11 tables/figures x 250 words (each) = 7,409 words *Corresponding Author

Submission Date: 08/01/2015 Submitted for presentation only

ABSTRACT

In this paper, we present an approach to optimizing longitudinal vehicle trajectories for autonomous vehicles based on the simplified (linear) car-following model, so as to minimize the vehicles' total travel cost while guaranteeing the minimal safety spacing between vehicles. Based on Newell's linear car-following model, we formulate the multi-vehicle simultaneous trajectory optimization problem as a linear programming (LP) problem that is solvable at a small scale with commercial solvers such as GAMS. In addition, we formulate the same problem using dynamic programming (DP) and develop efficient DP algorithms that guarantee the minimal safety distance between vehicles through a shifted state representation while searching for the optimal trajectories. Finally, several numerical experiments illustrate how to use the proposed approach to optimize vehicle trajectories under different scenarios.

Keywords: Traffic flow management, autonomous vehicle, vehicle trajectory optimization, car-following model, vehicle emissions, eco-drive.

1. INTRODUCTION

As population, economic growth, and personal travel activity continue to increase, traffic congestion remains an extremely challenging problem because of limited road capacity and limited budgets for expanding infrastructure. An emerging technology, the autonomous vehicle (AV), is likely to create a paradigm shift in the near future for real-time traffic system automation and control. At the core of automated highway systems, scheduling and managing AVs' maneuvers in real time requires transformative advances both in machine learning and automated control and in the traditional domains of transportation scheduling and traffic flow control. This paper focuses on (1) how the AV environment could influence the microscopic car-following model, (2) how to optimize and coordinate AVs' trajectories in real time, and (3) how to design theoretically rigorous and computationally efficient algorithms to solve this problem. A general review follows.

1.1 Overview of Car-following Models

To answer these questions, we first briefly review related car-following models and widely accepted AV driving behavior models. Since the 1950s, a wide range of car-following models have been proposed for human-driven vehicles. After the earliest car-following models developed by Reuschel (1) and Pipes (2), many car-following models were developed based on the response-stimulus mechanism between the lead vehicle and the following vehicle. For example, Kometani and Sasaki (3), Forbes (4), and Chandler, Herman, and Montroll (5) each developed nonlinear car-following models. To overcome the complexity of those nonlinear models, Newell (6) presented a simplified car-following theory that is consistent with the macroscopic triangular flow-density relationship.
1.2 Overview of AVs' Impact on Traffic Flow Characteristics

While most car-following models focus on human-operated vehicles, researchers in automated control and artificial intelligence started characterizing the driving behaviors of AVs and their potential impact on road capacity in the 1990s. Two types of research efforts have proceeded in parallel along this line: one focuses on the interactions between AVs, based on vehicle dynamics, to derive possible changes to traffic characteristics; the other focuses on the overall changes to road performance (e.g., capacity) brought by AVs under various conditions. As an example of the first type, Horowitz and Varaiya (7) described the findings from the automated highway system (AHS) development of the 1990s at the California PATH program. In general, actuators make AVs react much faster than a normal or even a sensitive human driver. Sensitive drivers can have a short perception-reaction time of 1.0 s to 1.5 s, as reported by NAHSC (8), compared to a typical perception-reaction time of 2.0 s to 2.5 s. Even shorter AV reaction times, such as the 0.7 s reported by Bose and Ioannou (9), can lead to closer spacing between cars and a higher roadway capacity. Another important motivation for developing AHS is optimal flow control: deterministic, and possibly optimized, vehicle trajectory planning and control can reduce or smooth the random errors of human drivers. An early prototype for single-lane vehicle platooning on automated highways was reported by Alvarez and Horowitz (10). Horowitz and Varaiya (7) also evaluated many platooning methods in simulation as well as in physical test beds.

1.3 Overview of Vehicle Trajectory Optimization and Algorithm Design

Vehicle trajectory optimization and control has been extensively studied in a broader domain than surface vehicles. As summarized by Betts (11), the formulations for vehicle trajectory control problems mainly include four types: nonlinear programming (NLP) with equality and inequality constraints; optimal control with dynamic constraints, algebraic equality constraints, singular arcs or algebraic inequality constraints and numerical analysis. The solution algorithms include direct shooting, indirect shooting, multiple shooting, and direct/indirect transcription. In addition, dynamic programming and genetic algorithms can also be applied to vehicle trajectory optimization. In particular, dynamic-programming (DP)-based optimization algorithms are the most widely used in vehicle trajectory optimization. To formulate the problem in DP, it is first necessary to define the boundary of the search scope, or map. Each vehicle is assumed to be fully aware of the whole scope, and the map is split into a grid of equal-sized unit areas (i.e., the map is discretized). Vehicles then plan a discretized path to their destinations so as to meet certain global goals, such as minimal fuel consumption or shortest travel time. At each time step, a vehicle needs to decide which grid cell it should move to at the next time step given the latest constraints, such as maximum acceleration or deceleration, road accessibility, or the locations of its surrounding vehicles. For example, Mensing et al. (12) proposed a DP-based vehicle trajectory optimization to minimize fuel consumption. Although DP-based trajectory optimization can reach the exact optimum, it is often too slow for real-time applications involving multiple vehicles.
To address this issue, Flint et al. (13) proposed an approximate DP algorithm for multiple vehicles to cooperatively search for targets.

The rest of this paper is organized as follows. In Section 2, we analyze how to use the underlying simplified safety-distance-based car-following models to represent the traffic system constraints and safe distance requirements. In Section 3, we develop a linear programming model to optimize vehicle speeds subject to minimal safe driving distances between cars. Dynamic-programming-based solution algorithms are then developed for different scenarios to optimize the vehicle speed profiles as well as to predict the optimized vehicle's impact on the following AVs and human-operated vehicles. Finally, in Section 4, we demonstrate the potential of optimal trajectory control for both a single AV and multiple AVs through numerical analysis.

2. VEHICLE KINEMATICS ANALYSIS BASED ON SIMPLIFIED CAR-FOLLOWING MODELS

2.1 Notations

$L$ = length of vehicle, e.g. 20 feet or 4 meters.
$d_{min}$ = minimum distance between the front of the leading car and the front of the following car.
$v_{max}$ = free-flow driving speed.
$v_r$ = reduced driving speed at a bottleneck.
$v_n$ = speed of the following car.
$v_l$ and $v_f$ = speeds of the leading and following cars before braking.
$a_l$ and $a_f$ = deceleration rates of the leading and following cars, respectively.
$x_n(t)$ and $x_{n-1}(t)$ = positions of the following and lead vehicles, respectively, at a given time $t$.
$S_n$ = distance headway between the positions of the lead and following vehicles.
$I_n(t)$ = time headway between the $n$th and $(n-1)$th vehicles.
$d_n$ = distance offset in a linear spacing-speed function.
$d_f$ = minimum safe rear-to-front distance before an emergency braking event.
$d_s$ = minimum safe rear-to-front distance after an emergency braking event.
$d_b$ = emergency braking distance of the platoon at the maximum permitted lane/track speed.
$T_{PR}$ = perception-response time (PRT).
$\tau_n$ = slope in a linear spacing-speed function.
$\tau_B$ = redundant time buffer used in the linear car-following model for automated driving cars.
$k_{jam}$ = jam density.
$w_b$ = backward wave speed.

2.2 Safety Distance-based Car Following Model

Focusing on the time dimension of car-following behavior, Forbes' model (4) considers two major elements: (i) the reaction time (e.g., 1.5 seconds) needed for human drivers to perceive the need to decelerate and apply the brakes; and (ii) the time needed for the lead vehicle to traverse its own length (to avoid collision). The equivalent distance headway can be derived as Eq. (1).

$$d_{min} = 1.5\, v_n + L \tag{1}$$

Based on field results, Figure 1 shows the transition from minimum safe time headways to minimum safe distance headways. The early model by Pipes (2) is similar, but it is based on a general rule derived from the minimum safe stopping distance; that is, the minimum safe distance headway increases linearly with driving speed. In general, in the boundary regions of very low and very high speed, the relationship between distance headway and driving velocity is nonlinear for human driving behavior, and the widely used Greenshields model corresponds to a nonlinear spacing-speed function.

[Figure 1 image not reproduced. Two panels of field results versus speed (km/hour): minimum safe time headway $h_{min}$ (sec) and minimum safe distance headway $d_{min}$ (meter).]

FIGURE 1 Minimum safe time headway and space headway relation as a function of driving speed (14).

Also assuming that the distance headway $S_n$ changes linearly with speed $v_n$, Newell (6) proposed a simplified car-following theory. His model uses a more general slope term $\tau_n$, as in Eq. (2). Note that, in Newell's model, the intercept $d_n$ should be determined or calibrated according to the critical jam density or spacing, which exceeds the vehicle length $L$ in Eq. (1) by an additional distance offset.

$$S_n = x_{n-1} - x_n = \tau_n v_n + d_n \tag{2}$$
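As a minimal sketch, the two linear spacing rules in Eqs. (1) and (2) can be written as small functions. The function names and the sample speed are illustrative assumptions, not part of the original models; the values $L$ = 4 m, $\tau$ = 2 s, and $d$ = 2 m are borrowed from the notation list and the experiment setup in Section 4.

```python
def forbes_min_headway(v_n, length, reaction_time=1.5):
    """Eq. (1): minimum front-to-front distance headway, d_min = 1.5 * v_n + L."""
    return reaction_time * v_n + length


def newell_spacing(v_n, tau_n, d_n):
    """Eq. (2): Newell's linear spacing, S_n = tau_n * v_n + d_n."""
    return tau_n * v_n + d_n


if __name__ == "__main__":
    v = 12.5  # illustrative speed in m/s (45 km/h)
    print(forbes_min_headway(v, length=4.0))      # L = 4 m, as in the notation list
    print(newell_spacing(v, tau_n=2.0, d_n=2.0))  # tau = 2 s, d = 2 m, as in Section 4
```

Both rules share the same linear form; they differ only in how the slope and the intercept are interpreted and calibrated.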

The above spacing-speed relationship is derived as follows. Consider one leading vehicle and one following vehicle under an emergency braking condition, as illustrated in Figure 2. The process has four steps.

Step 1: Before the braking event, the minimum safe rear-to-front distance is $d_f$.
Step 2: The lead vehicle $n_l$ brakes from position $B$ to position $D$.
Step 3: The following vehicle $n_f$ may take a time interval of $T_{PR}$ to detect the emergency braking of the lead vehicle before it starts braking from its current driving speed $v_f$.
Step 4: The following vehicle then brakes from position $A$ to position $C$. The minimum safe spacing after both cars stop is $d_s$, between positions $C$ and $D$, plus a vehicle length $L$, the distance between positions $D$ and $E$.

[Figure 2 image not reproduced. It shows the lead vehicle $n_l$ and the following vehicle $n_f$ before and after braking, with positions $A$ to $E$ along the location axis and the distances $l_1$, $l_2$, $d_f$, $d_s$, and $L$.]

FIGURE 2 Minimum safe distance before and after the emergency braking process, $d_f$ and $d_s$, between the front of the following vehicle and the rear of the lead vehicle

We now derive the minimum safe rear-to-front distance $d_f$ from the driving speeds of both cars before the emergency braking (namely $v_l$ and $v_f$). First, the braking distances $l_1$ and $l_2$ follow from the deceleration rates $a_l$ and $a_f$, as in Eqs. (3) and (4).

$$l_1 = d_f + \frac{v_l^2}{2 a_l} \tag{3}$$

$$l_2 = v_f \times T_{PR} + \frac{v_f^2}{2 a_f} \tag{4}$$

Then we can establish Eq. (5) between $d_f$ and $d_s$.

$$d_s = l_1 - l_2 = d_f + \frac{v_l^2}{2 a_l} - v_f \times T_{PR} - \frac{v_f^2}{2 a_f} \tag{5}$$

which can be rewritten as

$$d_f = d_s + v_f \times T_{PR} + \frac{v_f^2}{2 a_f} - \frac{v_l^2}{2 a_l} \tag{6}$$

If $v_l = v_f$ and $a_l \approx a_f$, the two braking-distance terms cancel and one obtains Eq. (7).

$$d_f = d_s + v_f \times T_{PR} \tag{7}$$
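The effect of the perception-response time on the required gap can be checked numerically. The sketch below evaluates Eq. (6), which reduces to Eq. (7) when both cars share the same speed and deceleration rate. The speed, deceleration, and $d_s$ values are illustrative assumptions, and the function names are hypothetical.

```python
def braking_distance(v, a):
    """Distance needed to stop from speed v at constant deceleration a."""
    return v * v / (2.0 * a)


def safe_gap_before_braking(d_s, v_f, v_l, a_f, a_l, t_pr):
    """Eq. (6): d_f = d_s + v_f * T_PR + v_f^2 / (2 a_f) - v_l^2 / (2 a_l)."""
    return d_s + v_f * t_pr + braking_distance(v_f, a_f) - braking_distance(v_l, a_l)


if __name__ == "__main__":
    # Illustrative values only: both cars at 20 m/s, braking at 5 m/s^2, d_s = 2 m.
    for t_pr in (1.5, 0.7, 0.0):
        gap = safe_gap_before_braking(2.0, v_f=20.0, v_l=20.0, a_f=5.0, a_l=5.0, t_pr=t_pr)
        print(f"T_PR = {t_pr:.1f} s -> d_f = {gap:.1f} m")
```

With equal speeds and deceleration rates, the gap grows linearly in $T_{PR}$, which is the term that automation shortens.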

We now discuss possible values of the perception-reaction time parameter $T_{PR}$ in two cases.

(I) If the leading car is human operated and the following one is an automated car, $T_{PR}$ should include a detection delay (about 0.3 s) and an emergency braking delay (about 0.4 s), as shown by Ioannou et al. (15).

(II) If both vehicles are AVs and their driving speed information is completely shared in real time, $T_{PR}$ can be as small as 0. This implies $d_f = d_s$. This case applies to an AV platoon.

To reduce the impact of unexpected secondary accidents, and to ensure overall system reliability and stability, the AHS designed by Varaiya (16) and Ioannou and Bose (17) adds an extra distance buffer $v_f \times \tau_B$, as shown in Eq. (8).

$$d_f = v_f \times \tau_B + d_s \tag{8}$$

where $\tau_B$ in Eq. (8) is the desired time headway of autonomous vehicles, defined as the time taken to cover the distance $d_f - d_s$. A unified formula for the slope parameter $\tau_n$ can be written as

$$\tau_n = \tau_P + \tau_R + \tau_B \tag{9}$$

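where $\tau_P$ and $\tau_R$ are the corresponding perception and reaction times, and $T_{PR} = \tau_P + \tau_R$.

TABLE 1 Different interpretations of linear car following model with sample setting data

The three time columns together make up the time offset $\tau_n$.

| Model | Equation | Perception time | Response time | Redundant time buffer | Distance offset |
|---|---|---|---|---|---|
| Forbes' model | $d_{min} = 1.5 \times v_n + L$ | 1.1 | 0.4 | n/a | $L$ |
| Newell's model | $S_n = \tau_n \times v_n + d_n$ | 1-1.3 | 0.4 | n/a | $d_n$ |
| Automated car model, Case (I): following a human-operated car | $S_n = \tau_n \times v_n + d_n + L$ | 0.3 | 0.4 | $\tau_B$ | $L + d_s$ |
| Automated car model, Case (II): both AVs with perfect information | $S_n = \tau_n \times v_n + d_n + L$ | 0 | 0 | $\tau_B$ | $L + d_s$ |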
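To make the rows of Table 1 concrete, the following sketch assembles the slope $\tau_n$ from its three components as in Eq. (9) and evaluates the resulting linear spacing. The buffer value `TAU_B`, the sample speed, and the 6 m distance offset are hypothetical placeholders, since the paper gives no numeric value for $\tau_B$; the class name is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearFollowingModel:
    name: str
    tau_p: float  # perception time (s)
    tau_r: float  # response time (s)
    tau_b: float  # redundant time buffer (s); 0.0 where Table 1 lists n/a

    @property
    def tau_n(self):
        """Eq. (9): tau_n = tau_P + tau_R + tau_B."""
        return self.tau_p + self.tau_r + self.tau_b

    def spacing(self, v_n, offset):
        """Linear spacing: tau_n * v_n + distance offset."""
        return self.tau_n * v_n + offset


TAU_B = 0.5  # hypothetical buffer (s); not a value from the paper

MODELS = [
    LinearFollowingModel("Forbes", 1.1, 0.4, 0.0),
    LinearFollowingModel("AV, case (I)", 0.3, 0.4, TAU_B),
    LinearFollowingModel("AV, case (II)", 0.0, 0.0, TAU_B),
]

if __name__ == "__main__":
    for m in MODELS:
        # 20 m/s and a 6 m offset are illustrative inputs only.
        print(f"{m.name}: tau_n = {m.tau_n:.1f} s, spacing = {m.spacing(20.0, 6.0):.1f} m")
```

Under these assumed numbers, the spacing at a given speed shrinks as the perception and response components shrink, which is the capacity argument made in Section 1.2.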

For autonomous vehicles, $d_n$ is mechanically determined by the minimum rear-to-front distance headway $d_s$, under both emergency braking and normal car-following conditions. For human drivers, on the other hand, $d_n$ should be calibrated using an average value of jam density, which could be significantly less than the bumper-to-bumper density. Table 1 further examines the differences among the above car-following models for both human-operated and autonomous vehicles.

3. VEHICLE TRAJECTORY OPTIMIZATION

In this research, we discretize the continuous vehicle dynamics in the space-time dimension and then characterize the vehicle dynamics models with space-time constraints for safety.

3.1 Linear Programming Formulation

The generalized travel cost we consider includes travel time and emission impact. It can be represented as

$$c[x_n(t), v_n(t)] = \mu \times VOT \times t_{total} + (1 - \mu) \times VOG \times g(v_n(t)) \tag{10}$$

where $\mu$ is the weight for the travel time, $VOT$ is the value of time, $VOG$ is the value of green, and $g(v)$ is the emission function of travel speed. To make travel time and fuel consumption comparable, we set $\mu = 0.2$, $VOT = 0.01$, and $VOG = 1$ in this paper. Since $v_n(t) = x_n(t+1) - x_n(t)$ under the assumption that the time step is 1, the generalized travel cost function can also be expressed as $c[x_n(t), x_n(t+1)]$. The objective function is

$$\min \sum_t \sum_n c[x_n(t), v_n(t)] \tag{11}$$

subject to:

(1) Location constraint at each time stamp:

$$0 \le x_n(t+1) - x_n(t) \le v_{max} \times 1, \quad \forall n, \forall t \tag{12}$$

(2) Newell's car-following constraint based on Eq. (2):

$$x_{n+1}(t + \tau_{n+1}) \le x_n(t) - d_{n+1}, \quad \forall n, \forall t \tag{13}$$
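As a rough illustration of objective (11) and constraints (12) and (13), the sketch below evaluates the generalized cost of given discrete trajectories and checks their feasibility. It is a checker, not an LP solver. The emission function `g` is hypothetical, a single `tau` and `d` are assumed for all vehicles (the formulation allows vehicle-specific values), and $t_{total}$ is accumulated one time step at a time.

```python
def step_cost(v, g, mu=0.2, vot=0.01, vog=1.0, dt=1.0):
    """One time step of Eq. (10): mu * VOT * dt + (1 - mu) * VOG * g(v)."""
    return mu * vot * dt + (1.0 - mu) * vog * g(v)


def total_cost(trajectories, g):
    """Objective (11): sum of step costs over all vehicles and time steps."""
    return sum(
        step_cost(x[t + 1] - x[t], g)
        for x in trajectories
        for t in range(len(x) - 1)
    )


def is_feasible(trajectories, v_max, tau, d):
    """Check constraints (12) and (13); trajectories[0] is the first (lead) vehicle."""
    for x in trajectories:
        for t in range(len(x) - 1):
            if not 0 <= x[t + 1] - x[t] <= v_max:  # constraint (12)
                return False
    for lead, follow in zip(trajectories, trajectories[1:]):
        for t in range(len(lead)):
            # constraint (13): x_{n+1}(t + tau) <= x_n(t) - d
            if t + tau < len(follow) and follow[t + tau] > lead[t] - d:
                return False
    return True


def g(v):
    """Hypothetical emission function of speed (not from the paper)."""
    return 1.0 + 0.25 * (v - 2) ** 2


if __name__ == "__main__":
    lead = [0, 2, 4, 6, 8]
    follow = [-3, -3, -1, 1, 3]  # starts upstream of the segment, for illustration
    print(is_feasible([lead, follow], v_max=2, tau=1, d=1))
    print(total_cost([lead, follow], g))
```

Because positions at consecutive time steps are the only decision variables and both constraints are linear in them, the same structure carries over directly to an LP solver when the cost is linear or piecewise linear in speed.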

Boundary condition: $x_n(t) = 0$ is given at the initial departure time $t$ of each vehicle, with $t \in T$, where $T$ is large enough for all vehicles to travel from the upstream node to the downstream node of the segment under study.

3.2 Formulation in Dynamic Programming

Based on the simplified car-following model and the discretized space-time representation in Figure 3, we transform the vehicle control model into a dynamic programming problem. A simple example illustrates the approach.

[Figure 3 image not reproduced. Space-time grid with time $t$ (0 to 6, with $T$ marked) on the horizontal axis and space $x$ (0 to 7, from $A$ to $B$, with $D$ marked) on the vertical axis, showing the trajectories of vehicles 1 and 2, the safety boundary of vehicle 1, and the backward wave with offsets $\tau$ and $d$.]

FIGURE 3 Discretized space-time network for vehicle trajectory optimization

In this example, the trajectories of two vehicles are to be optimized on the segment $AB$. Their initial departure times and positions are given. A large time $T$ ($T = 5$) is assumed to ensure that the two vehicles can pass through the segment $AB$ ($D = 5$). When building the space-time network for this case, we initialize it with a longer time and distance horizon. Constraint (13) shows that if the following vehicle 2 needs to arrive at point $B$, the leading vehicle 1 requires at least $D + d$ to keep the safe spacing $d$; similarly, if the leading vehicle 1 arrives at point $B$ at time $T$, the following vehicle 2 requires at least $T + \tau$ to have a response time $\tau$. In this space-time network, the discretized time unit is one time interval, and the discretized space unit is defined on the basis of the feasible vehicle speeds, which take three alternative values from 0 to the maximum speed limit in constraint (12).

Assume that the optimized vehicle trajectories have been found through dynamic programming, as shown in Figure 3. The green trajectory is for vehicle 1 and the blue one is for vehicle 2. The dashed purple line is the backward wave under the tight car-following condition. To satisfy constraint (13), the dashed red trajectory represents the safety boundary of vehicle 1, which no following vehicle can cross. To formulate the vehicle trajectory control problem, Table 2 lists the basic elements of dynamic programming for single-vehicle trajectory control (mode 1) and coupled two-vehicle trajectory control (mode 1+1).

TABLE 2 All elements of dynamic programming for single vehicle and two vehicle control

| No. | Element | Single vehicle (mode (1)) | Two vehicles (mode (1+1)) |
|---|---|---|---|
| (1) | Stage | $0, 1, \ldots, T$ | $0, 1, \ldots, T$ |
| (2) | State | $S = [x_1(t)]$ | $S = [x_1(t), x_2(t+\tau)]$ |
| (3) | Control/decision variables | $v_1(t)$, i.e., $x_1(t+1) - x_1(t)$ | $v_1(t), v_2(t+\tau)$ |
| (4) | Discrete time dynamics | $x_1(t+1) = x_1(t) + v_1(t)$ | $x_1(t+1) = x_1(t) + v_1(t)$; $x_2(t+\tau+1) = \min\{x_2(t+\tau) + v_2(t+\tau),\ x_1(t+1) - d_2\}$ |
| (5) | Cost function | $c[x_1(t), x_1(t+1)]$ | $c[x_1(t), x_1(t+1), x_2(t+\tau), x_2(t+\tau+1)]$ |
| (6) | Value function | $L(x_1(t))$ | $L(x_1(t), x_2(t+\tau))$ |
| (7) | Bellman equation | $L^*(x_1(t+1)) = \min\{L(x_1(t)) + c(x_1(t), v(t))\}$ | $L^*(x_1(t+1), x_2(t+\tau+1)) = \min\{L^*(x_1(t), x_2(t+\tau)) + c[x_1(t), x_1(t+1), x_2(t+\tau), x_2(t+\tau+1)]\}$ |
| (8) | Policy/control law | $v_1(t) = \arg\min\{L(x_1(t)) + c(x_1(t), x_1(t+1))\}$ | $[v_1(t), v_2(t+\tau)] = \arg\min\{L^*(t, x_1(t), x_2(t+\tau)) + c[t, x_1(t), x_1(t+1), x_2(t+\tau), x_2(t+\tau+1)]\}$ |

The state is defined as follows.

(a) Mode (1): the state can be represented by time and location as $S = [x_1(t)]$. For example, if the space-time network in Figure 4 is built for mode (1), the state at stage 1 is {0, 1, 2}, shown as the purple area.

(b) Mode (1+1): since the two vehicles need to keep a safe spacing, we call the state a safety state, represented as $S = [x_1(t), x_2(t+\tau)]$. If the space-time network in Figure 4 is used to illustrate mode (1+1), then, to satisfy constraint (13), the safety state at stage 2 for node (2, 4), shown as the yellow node, is {(3, 3), (3, 2), (3, 1), (3, 0)}, marked as the green area.

[Figure 4 image not reproduced. Space-time grid with time $t$ (0 to 5) on the horizontal axis and space $x$ (from $A$ to $B$, with $D$ marked and intermediate levels c to g) on the vertical axis, showing the backward wave with offsets $\tau$ and $d$.]

FIGURE 4 Illustration for states of "1" mode and "1+1" mode

The aforementioned problem has two principal features: (1) it is in essence a discrete-time (e.g., every second) dynamic (i.e., time-dependent) system; and (2) the total cost is additive, in the sense that the emissions incurred at each time step accumulate over time. The system's state at time $t+1$ is determined only by the decisions made at $t$ and its state at $t$. As a result, the optimal vehicle trajectories can be found using dynamic programming (DP). Without loss of generality, we consider the modes/scenarios shown in Table 3. The algorithms listed in Table 3 are detailed in Section 3.3.

TABLE 3 Different modes we consider for vehicle trajectory control

| No. | Mode | Description | Algorithm and cost function |
|---|---|---|---|
| (1) | ("1") | One single optimized autonomous vehicle | DP algorithm 1 |
| (2) | ("1+1") | Two jointly optimized autonomous vehicles (the second vehicle is not necessarily in following mode) | DP algorithm 2 |
| (3) | ("1+m'") | One optimized lead AV and m' following vehicles, each either autonomous or human operated | DP algorithm 3: use 3-detector Eq. (28) to predict the position of human-operated drivers |
| (4) | (1>1>1…) | Vehicles' trajectories are optimized sequentially | DP algorithm 4 |

3.3 Algorithms of Dynamic Programming

Mode ("1")

The DP-based optimization algorithm is described below.

Denote $L(t, x(t))$ as the value function of state $x(t)$ at time $t$, with $t \in [0, T]$ and $x \in [0, D]$, and $d^{max}$ as the maximum distance one vehicle can travel in one time step, equal to $v_{max}$ in this paper. All $L(t, x(t))$ values are initialized to positive infinity. After all iterations, search for the time index with the minimal fuel consumption at $D$ and trace back to obtain the optimal vehicle trajectory.

Total cost $L(t = T^*, x = D) = \min_t \{L(t, x = D)\}$

// initialization
L(t, x1(t)) := +∞;   // value of state (t, x1(t))
L(0, x1(0)) := 0;
for t = 0 to T − 1 do
begin
  for x1(t) = 0 to D do
  begin
    for x1(t+1) = x1(t) to min{D, x1(t) + d_max} do
    begin
      if L(t, x1(t)) + c[x1(t), x1(t+1)] < L(t+1, x1(t+1)) then
      begin
        L(t+1, x1(t+1)) := L(t, x1(t)) + c[x1(t), x1(t+1)];
      end;
    end;
  end;
end;
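The forward recursion for mode ("1") can be sketched in Python as follows. This is a minimal illustration of the label-correcting loop with backtracking, not the authors' implementation; the function names and the per-step cost, which uses a hypothetical emission function together with the weights of Section 3.1, are assumptions.

```python
import math


def dp_single_vehicle(T, D, d_max, cost):
    """Forward DP over (t, x); returns (minimal cost, positions from 0 to D)."""
    L = [[math.inf] * (D + 1) for _ in range(T + 1)]  # value function L(t, x)
    prev = [[None] * (D + 1) for _ in range(T + 1)]   # predecessor position
    L[0][0] = 0.0
    for t in range(T):
        for x in range(D + 1):
            if L[t][x] == math.inf:
                continue  # state not reachable
            for nx in range(x, min(D, x + d_max) + 1):
                cand = L[t][x] + cost(x, nx)
                if cand < L[t + 1][nx]:
                    L[t + 1][nx] = cand
                    prev[t + 1][nx] = x
    # arrival time with the minimal cost at D, then trace back
    t_star = min(range(T + 1), key=lambda t: L[t][D])
    if L[t_star][D] == math.inf:
        return math.inf, []
    path, x = [], D
    for t in range(t_star, -1, -1):
        path.append(x)
        x = prev[t][x]
    path.reverse()
    return L[t_star][D], path


def cost(x, nx, mu=0.2, vot=0.01, vog=1.0):
    """Hypothetical per-step cost: Eq. (10) with g(v) = 1 + 0.25 * (v - 2)^2."""
    g = 1.0 + 0.25 * (nx - x - 2) ** 2
    return mu * vot + (1.0 - mu) * vog * g


if __name__ == "__main__":
    total, path = dp_single_vehicle(T=6, D=5, d_max=2, cost=cost)
    print(total, path)
```

Storing one predecessor per state keeps the backtracking step linear in the arrival time; the triple loop itself visits on the order of $T \times D \times d^{max}$ transitions.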

Mode ("1+1")

// initialization
L(t, x1(t), x2(t+τ)) := +∞;   // value of state (x1(t), x2(t+τ)) at stage t
L(0, x1(0), x2(0+τ)) := 0;
for t = 0 to T − 1 do
begin
  for x1(t) = 0 to D do
  begin
    for x2(t+τ) = 0 to x1(t) − d do
    begin
      for x1(t+1) = x1(t) to x1(t) + d_max do
      begin
        for x2(t+τ+1) = x2(t+τ) to min{x1(t+1) − d, x2(t+τ) + d_max} do
        begin
          if L(t, x1(t), x2(t+τ)) + c[x1(t), x1(t+1), x2(t+τ), x2(t+τ+1)] < L(t+1, x1(t+1), x2(t+τ+1)) then
          begin
            L(t+1, x1(t+1), x2(t+τ+1)) := L(t, x1(t), x2(t+τ)) + c[x1(t), x1(t+1), x2(t+τ), x2(t+τ+1)];
          end;
        end;
      end;
    end;
  end;
end;
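A compact sketch of the joint recursion for mode ("1+1") is given below. The state is the shifted pair (x1(t), x2(t + τ)), so the time shift τ is implicit and the safety requirement reduces to x2 ≤ x1 − d within each state. Several modeling choices here are assumptions of this sketch rather than details from the paper: the leader's grid extends to D + d so that the follower can reach D, a vehicle that has arrived incurs no further cost, and the emission function is hypothetical.

```python
import math


def dp_two_vehicles(T, D, d, d_max, step_cost, x1_0, x2_0=0):
    """Forward DP over the shifted safety state (x1(t), x2(t + tau))."""
    goal = (D + d, D)                       # leader clears D by the spacing d
    layers = [{(x1_0, x2_0): (0.0, None)}]  # per stage: state -> (value, predecessor)
    for _ in range(T):
        nxt = {}
        for (x1, x2), (val, _pred) in layers[-1].items():
            for nx1 in range(x1, min(D + d, x1 + d_max) + 1):
                # the follower must stay at least d behind the leader's new position
                top = min(nx1 - d, x2 + d_max, D)
                for nx2 in range(x2, top + 1):
                    cand = val + step_cost(x1, nx1, D + d) + step_cost(x2, nx2, D)
                    if cand < nxt.get((nx1, nx2), (math.inf, None))[0]:
                        nxt[(nx1, nx2)] = (cand, (x1, x2))
        layers.append(nxt)
    # pick the stage at which the goal state is cheapest, then trace back
    stages = [t for t, layer in enumerate(layers) if goal in layer]
    if not stages:
        return math.inf, []
    t_star = min(stages, key=lambda t: layers[t][goal][0])
    path, state = [], goal
    for t in range(t_star, -1, -1):
        path.append(state)
        state = layers[t][state][1]
    path.reverse()
    return layers[t_star][goal][0], path


def step_cost(x, nx, dest, mu=0.2, vot=0.01, vog=1.0):
    """Hypothetical per-step cost in the spirit of Eq. (10); zero after arrival."""
    if x == dest:
        return 0.0
    g = 1.0 + 0.25 * (nx - x - 2) ** 2  # hypothetical emission function g(v)
    return mu * vot + (1.0 - mu) * vog * g


if __name__ == "__main__":
    total, path = dp_two_vehicles(T=6, D=4, d=1, d_max=2, step_cost=step_cost, x1_0=1)
    print(total, path)
```

Keeping only reachable states in a dictionary per stage avoids allocating the full two-dimensional grid, which matters because the joint state space grows with the square of the segment length.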

Mode ("1+m'")

// initialization
L(t, x1(t)) := +∞;   // value of state (t, x1(t))
L(0, x1(0)) := 0;
for t = 0 to T − 1 do
begin
  for x1(t) = 0 to D do
  begin
    for x1(t+1) = x1(t) to x1(t) + d_max do
    begin
      c := c(x1(t), x1(t+1));
      for m = 2 to m' do
      begin
        x_m(t+τ) := min{ x_m(0) + v_f × (t + Σ_{i=2..m} τ_i),  x1(t) − Σ_{i=2..m} d_i };
        x_m(t+τ+1) := min{ x_m(0) + v_f × (t + Σ_{i=2..m} τ_i + 1),  x1(t+1) − Σ_{i=2..m} d_i };
        c := c + c(x_m(t+τ), x_m(t+τ+1));
      end;
      if L(t, x1(t)) + c < L(t+1, x1(t+1)) then
      begin
        L(t+1, x1(t+1)) := L(t, x1(t)) + c;
      end;
    end;
  end;
end;
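The sketch below illustrates mode ("1+m'") under the same assumptions as the pseudocode: only the lead vehicle is a decision variable, and each follower's position is predicted as the minimum of its free-flow position and the leader's position shifted by the accumulated spacing. The cost function and all numeric inputs are hypothetical, and the function names are assumptions.

```python
import math


def follower_positions(x1_t, t, x0, v_f, taus, ds):
    """Predicted positions of followers m = 2..m' at their shifted times."""
    out, tau_sum, d_sum = [], 0, 0
    for x_m0, tau, d in zip(x0, taus, ds):
        tau_sum += tau
        d_sum += d
        # free-flow position versus the leader's position shifted by the spacing
        out.append(min(x_m0 + v_f * (t + tau_sum), x1_t - d_sum))
    return out


def dp_lead_with_followers(T, D, d_max, cost, x0, v_f, taus, ds):
    """Optimize the lead trajectory while charging the followers' predicted cost."""
    L = [[math.inf] * (D + 1) for _ in range(T + 1)]
    prev = [[None] * (D + 1) for _ in range(T + 1)]
    L[0][0] = 0.0
    for t in range(T):
        for x in range(D + 1):
            if L[t][x] == math.inf:
                continue
            now = follower_positions(x, t, x0, v_f, taus, ds)
            for nx in range(x, min(D, x + d_max) + 1):
                nxt = follower_positions(nx, t + 1, x0, v_f, taus, ds)
                c = cost(nx - x) + sum(cost(b - a) for a, b in zip(now, nxt))
                if L[t][x] + c < L[t + 1][nx]:
                    L[t + 1][nx] = L[t][x] + c
                    prev[t + 1][nx] = x
    t_star = min(range(T + 1), key=lambda t: L[t][D])
    if L[t_star][D] == math.inf:
        return math.inf, []
    path, x = [], D
    for t in range(t_star, -1, -1):
        path.append(x)
        x = prev[t][x]
    path.reverse()
    return L[t_star][D], path


def cost(v, mu=0.2, vot=0.01, vog=1.0):
    """Hypothetical per-step cost: Eq. (10) with g(v) = 1 + 0.25 * (v - 2)^2."""
    return mu * vot + (1.0 - mu) * vog * (1.0 + 0.25 * (v - 2) ** 2)


if __name__ == "__main__":
    # One follower starting 2 units upstream, with tau = 1 and d = 1 (illustrative).
    print(dp_lead_with_followers(T=4, D=4, d_max=2, cost=cost,
                                 x0=[-2], v_f=2, taus=[1], ds=[1]))
```

Because the followers' positions depend only on the stage and on the leader's current and next positions, the state space stays one-dimensional even though the cost accounts for every vehicle.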

Mode ("1>1>1…")

// initialization
L(t, x1(t)) := +∞;   // value of state (t, x1(t))
L(0, x1(0)) := 0;
// optimize the first leading vehicle
for t = 0 to T − 1 do
begin
  for x1(t) = 0 to D do
  begin
    for x1(t+1) = x1(t) to x1(t) + d_max do
    begin
      if L(t, x1(t)) + c[x1(t), x1(t+1)] < L(t+1, x1(t+1)) then
      begin
        L(t+1, x1(t+1)) := L(t, x1(t)) + c[x1(t), x1(t+1)];
      end;
    end;
  end;
end;
// optimize the following vehicles sequentially;
// x_{n−1}(·) is the trajectory already optimized for vehicle n − 1
for n = 2 to m do
begin
  for t = 0 to T − 1 do
  begin
    for x_n(t+τ) = 0 to x_{n−1}(t) − d_n do
    begin
      for x_n(t+τ+1) = x_n(t+τ) to min{x_{n−1}(t+1) − d_n, x_n(t+τ) + d_max} do
      begin
        if L(t, x_n(t+τ)) + c[x_n(t+τ), x_n(t+τ+1)] < L(t+1, x_n(t+τ+1)) then
        begin
          L(t+1, x_n(t+τ+1)) := L(t, x_n(t+τ)) + c[x_n(t+τ), x_n(t+τ+1)];
        end;
      end;
    end;
  end;
end;
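The sequential mode can be sketched by reusing a single-vehicle DP in which the previous vehicle's optimized trajectory supplies an upper bound on position. Several details below are assumptions of this sketch: vehicles enter at position 0 at given start times, a vehicle that has reached D is treated as having left the segment and no longer constrains its follower, all vehicles share one `tau` and one `d`, and the cost function is hypothetical.

```python
import math


def optimize(T, D, d_max, cost, start, upper):
    """Forward DP for one vehicle entering at x = 0 at time `start`.

    upper(t) is the largest position allowed at time t (math.inf if unconstrained).
    Returns (total cost, [(t, x), ...]), or (inf, []) if no trajectory is feasible.
    """
    L = [[math.inf] * (D + 1) for _ in range(T + 1)]
    prev = [[None] * (D + 1) for _ in range(T + 1)]
    if upper(start) >= 0:
        L[start][0] = 0.0
    for t in range(start, T):
        for x in range(D + 1):
            if L[t][x] == math.inf:
                continue
            for nx in range(x, min(D, x + d_max, upper(t + 1)) + 1):
                cand = L[t][x] + cost(nx - x)
                if cand < L[t + 1][nx]:
                    L[t + 1][nx] = cand
                    prev[t + 1][nx] = x
    t_star = min(range(T + 1), key=lambda t: L[t][D])
    if L[t_star][D] == math.inf:
        return math.inf, []
    traj, x = [], D
    for t in range(t_star, start - 1, -1):
        traj.append((t, x))
        x = prev[t][x]
    traj.reverse()
    return L[t_star][D], traj


def make_bound(lead_traj, tau, d):
    """Bound x_n(t) <= x_{n-1}(t - tau) - d induced by the previous trajectory."""
    lead = dict(lead_traj)
    t_in, t_out = lead_traj[0][0], lead_traj[-1][0]

    def upper(t):
        s = t - tau
        if s >= t_out:
            return math.inf  # assumption: the leader has left the segment
        if s < t_in:
            return -1        # leader not yet on the segment: follower may not enter
        return lead[s] - d

    return upper


def optimize_sequentially(T, D, d_max, cost, starts, tau, d):
    """Optimize vehicles one after another, each constrained by its predecessor."""
    results, upper = [], (lambda t: math.inf)  # the first vehicle is unconstrained
    for start in starts:
        total, traj = optimize(T, D, d_max, cost, start, upper)
        results.append((total, traj))
        if not traj:
            break
        upper = make_bound(traj, tau, d)
    return results


def cost(v, mu=0.2, vot=0.01, vog=1.0):
    """Hypothetical per-step cost: Eq. (10) with g(v) = 1 + 0.25 * (v - 2)^2."""
    return mu * vot + (1.0 - mu) * vog * (1.0 + 0.25 * (v - 2) ** 2)


if __name__ == "__main__":
    for total, traj in optimize_sequentially(12, 6, 2, cost, starts=[0, 2, 4], tau=1, d=2):
        print(round(total, 3), traj)
```

Each vehicle is solved with a one-dimensional state, so the work grows linearly with the number of vehicles; the price is that earlier vehicles never adjust their trajectories for the benefit of later ones.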

4. NUMERICAL EXPERIMENTS

In this section, we perform numerical experiments to demonstrate how to optimize vehicle trajectories to minimize emissions. The first two scenarios are: (1) optimize the trajectory of a single vehicle so that it incurs the minimal emission by the time it reaches its destination; and (2) simultaneously optimize the trajectories of two vehicles so that the total emission is minimal once both vehicles reach their destinations. A third experiment optimizes several vehicles sequentially. Figure 5a shows the hypothetical environment. The total length of the road segment is 400 meters and the maximum time horizon is 200 seconds. The speed limit is 45 kilometers per hour, and the stretch from 150 meters to 250 meters is a speed-reduction zone with a speed of 15 kilometers per hour. At 200 meters there is a traffic signal whose red phase lasts 30 s and whose green phase lasts 50 s. For the configuration of Newell's car-following model, we set $\tau = 2$ s and $d_0 = 2$ m. Vehicle emissions are calculated from the vehicles' instantaneous speeds. Figure 5b shows two hypothetical fuel-consumption curves.

[Figure 5a image not reproduced. Space-time layout of the segment ($D$ = 400 meters, $T$ = 200 s) with free-flow speed $V_f$ = 45 km/h, the reduced speed area (15 km/h) between 150 m and 250 m, and the signal at 200 m, with marks at 30 s and 80 s on the time axis.]

FIGURE 5a Layout of hypothetical road segment

[Figure 5b image not reproduced.]

FIGURE 5b Hypothetical emission functions for cars and trucks
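For readers who want to reproduce the setup, the hypothetical segment can be encoded with a few helper functions. The sketch below assumes that each signal cycle starts with the 30 s red phase (suggested by the 30 s and 80 s marks in Figure 5a, but not stated explicitly) and that a move across the stop line is blocked when the signal is red at the start of the time step. All names are illustrative.

```python
KMH_TO_MS = 1000.0 / 3600.0   # km/h to m/s
SEGMENT_LENGTH = 400.0        # D, meters
HORIZON = 200                 # T, seconds
STOP_LINE = 200.0             # signal location, meters


def speed_limit_ms(x):
    """Speed limit (m/s) at position x: 15 km/h inside [150, 250] m, else 45 km/h."""
    return (15.0 if 150.0 <= x <= 250.0 else 45.0) * KMH_TO_MS


def signal_is_red(t, red=30, green=50):
    """Assumed phase order: every 80 s cycle begins with 30 s of red."""
    return t % (red + green) < red


def may_advance(x, nx, t):
    """A step from x to nx starting at time t may not cross the stop line on red."""
    crosses = x < STOP_LINE <= nx
    return not (crosses and signal_is_red(t))


if __name__ == "__main__":
    print(speed_limit_ms(100.0), speed_limit_ms(200.0))
    print(signal_is_red(10), signal_is_red(40))
    print(may_advance(195.0, 205.0, 10), may_advance(195.0, 205.0, 40))
```

In a DP implementation, `speed_limit_ms` would cap the reachable positions at each step and `may_advance` would prune transitions, so both act as filters on the inner loop rather than as terms in the cost.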

4.1 Experiment One: Eco-drive trajectory optimization for single vehicle (Mode "1")

In this experiment, we optimize one vehicle's trajectory from zero to $D$ so as to minimize its overall emission. Assuming that the vehicle's initial speed is 0, the trajectory is optimized over time using dynamic programming. Figure 6 shows the optimal vehicle trajectory and speed incurring the minimal emission, together with the typical trajectory and speed under the same conditions. The resulting emissions under the best-fuel-efficiency trajectory and the Eco-drive trajectory (based on Eq. (10)) are 1.21 gram and 1.51 gram, respectively; that is, the Eco-drive trajectory emits about 24.8% more. However, the travel time of the best-fuel-efficiency trajectory is longer than that of the Eco-drive trajectory.

FIGURE 6 Optimal and typical vehicle trajectories and speeds

4.2 Experiment Two: simultaneous eco-drive trajectory optimization for two vehicles (Mode "1+1")

In this experiment, we consider two vehicles within the scope and optimize their trajectories simultaneously. The findings are particularly useful in automated vehicle research because the experiment can be easily extended to identify when two fleets should merge into one larger fleet and when a fleet should split into smaller fleets while many automated vehicles are on the road. Following DP algorithm 2 described in Section 3.3, Figure 7 shows the optimal trajectories of the two vehicles. Figure 7 indicates that both vehicles prefer to drive in a way that avoids stops at the intersection. This makes sense because, according to Figure 5b, the emission rate is highest when a vehicle is idle. After the two vehicles leave the intersection, they merge into a fleet for a short time and then separate again before reaching the destination. Figure 7 also compares the optimal and typical trajectories of the two vehicles. In this experiment, the resulting emissions of the two vehicles are 2.51 gram under the optimal trajectories and 3.37 gram under the typical trajectories; that is, the typical trajectories emit about 34% more.

FIGURE 7 Optimal trajectories for two vehicles

4.3 Experiment Three: Interaction between Eco-drive vehicle and other vehicles (Mode "1 → 1 → 1 …")

In this experiment, we sequentially optimize the Eco-drive trajectories of three autonomous vehicles based on algorithm 4. Each vehicle optimizes its own trajectory, whereas the following vehicles are restricted by the minimal safe distance from their lead vehicles. Without loss of generality, we consider three autonomous vehicles that enter the scope at $t = 0$, $t = 20$, and $t = 50$, respectively. Figure 8 shows the optimal trajectories.

FIGURE 8 Optimal trajectories for three autonomous vehicles

5. CONCLUSION

In this paper, we focus on the longitudinal vehicle trajectory optimization problem, which is closely related to vehicle dynamics. Longitudinal vehicle trajectory optimization is fundamental to many AV applications, such as vehicle platooning and adaptive cruise control. Since vehicle dynamics are nonlinear in nature, most existing formulations of this problem are also nonlinear and typically hard to solve. To provide fast and reliable solutions for real-time vehicle trajectory optimization, we first discretize the time domain into stages and describe the vehicle dynamics and the minimum safe distance using the simplified car-following model. As a result, we approximate the vehicle trajectory optimization problem as a linear programming (LP) problem, for which a rich body of mature optimization algorithms exists. In addition, we further discretize the space domain to create a grid over the time-space plane and cast the vehicle trajectory optimization problem as a dynamic program. Distance-based DP algorithms are also designed to find the optimal vehicle trajectory.

This paper also discusses simultaneous trajectory optimization for multiple vehicles. It is envisioned that AVs will frequently join or leave platoons once many AVs are on the streets. To our knowledge, most past research on vehicle trajectory optimization focuses either on a single vehicle or on the first vehicle in a platoon. In reality, however, some vehicles may not have to stay in a platoon for a long time to achieve the overall system optimum when multiple vehicles are within the scope. Through three numerical experiments, we demonstrate how to use dynamic programming to optimize AV trajectories to achieve various goals. In the future, we also plan to make more use of the relationship between the simplified car-following model and the macroscopic flow-density model for traffic state prediction.

REFERENCES

1. A. Reuschel. Vehicle Movements in a Platoon with Uniform Acceleration or Deceleration of the Lead Vehicle, Zeitschrift des Oesterreichischen Ingenieur- und Architekten-Vereines, No. 95, 1950, pp. 193-215.
2. L. A. Pipes. Operational Analysis of Dynamics, Journal of Applied Physics, 24(3), 1953, pp. 274-287.
3. E. Kometani and T. Sasaki. Dynamic Behavior of Traffic with a Non-linear Spacing-Speed Relationship, Theory of Traffic Flow Symposium Proceedings, 1961, pp. 105-119.
4. T. W. Forbes and Simpson. Driver and Vehicle Response in Freeway Deceleration Waves, Transportation Science, Vol. 2, No. 1, 1968, pp. 77-104.
5. R. E. Chandler, R. Herman, and E. W. Montroll. Traffic Dynamics: Studies in Car-following, Operations Research, Vol. 6, No. 2, 1958, pp. 165-184.
6. G. F. Newell. A Simplified Car-following Theory: A Lower Order Model, Transportation Research Part B, 36, 2002, pp. 195-205.
7. R. Horowitz and P. Varaiya. Control Design of an Automated Highway System, Proceedings of the IEEE, 2000.
8. National Automated Highway System Consortium (NAHSC). Milestone 2 Reporter, Hard Braking Safety Analysis Method and Detailed Results, Appendix J, 17, 1996.
9. P. Ioannou and A. Bose. Automated Vehicle Control, Handbook of Transportation Science, Ed. R. Hall, Kluwer Academic Publishers, 1999, pp. 187-233.
10. L. Alvarez and R. Horowitz. Safe Platooning in Automated Highway Systems. Part I: Safety Regions Design, Vehicle System Dynamics Special Issue: IVHS, Vol. 32, No. 1, July 1999, pp. 23-56.
11. John T. Betts. Survey of Numerical Methods for Trajectory Optimization, Journal of Guidance, Control, and Dynamics, Vol. 21, No. 2, 1998, pp. 193-207.
12. F. Mensing, R. Trigui, and E. Bideaux. Vehicle Trajectory Optimization for Application in ECO-driving, IEEE Vehicle Power and Propulsion Conference, 2011.
13. M. Flint, M. Polycarpou, and E. Fernandez-Gaucherand. Cooperative Path-Planning for Autonomous Vehicles Using Dynamic Programming, Proceedings of the 15th IFAC World Congress, Barcelona, 2002.
14. A. D. May. Traffic Flow Fundamentals, Prentice Hall, 1990.
15. P. A. Ioannou et al. Activity D: Lateral and Longitudinal Control Analysis, Final Report, 1994.
16. P. Varaiya. Smart Cars on Smart Roads: Problems of Control, IEEE Trans. Automat. Contr., Vol. 38, 1993, pp. 195-207.
17. Petros L. and A. Bose. Handbook of Transportation Science, International Series in Operations Research & Management Science, Volume 56, 2003, pp. 193-241.