Isolated Traffic Signal Control Using Nash Bargaining Optimization

# I. Introduction

Traffic congestion affects traveler mobility and accessibility and creates challenges for transportation agencies. Reducing traffic congestion improves these conditions while also reducing transportation-related energy and environmental impacts. Accordingly, optimizing the utilization of the available infrastructure using advanced control techniques has become increasingly necessary to mitigate traffic congestion in a world with growing pressure on financial and physical resources.

A signalized intersection is designed (controlled) to allow traffic to proceed efficiently and safely by separating conflicting movements in time rather than in space. Traffic signal control methods attempt to minimize various traffic measures (e.g., delay, queue length, and energy and emission levels) by optimizing traffic signal parameters, including the cycle length, phase scheme, phase split, and offset. Consequently, traffic signal optimization algorithms attempt to identify the optimal values of one or more traffic signal parameters for specific traffic conditions. Most of the currently implemented traffic signal systems can be categorized as one of the following: fixed-time plan (FP), actuated (ACT), or adaptive [1].

An FP is developed off-line using historical traffic data to compute traffic signal timings; real-time traffic data is not considered. Thereafter, the order and duration of all phases remain fixed and do not adapt to fluctuations in traffic demand. As a result, FPs are known to age with time and are suitable for relatively stable and regular traffic flows. However, because the traffic system is dynamic, one particular predefined traffic signal plan cannot efficiently fit all real-time traffic conditions [2]. Examples of software that compute signal timings are TRANSYT-7F and PASSER. TRANSYT-7F is a macroscopic deterministic optimization and simulation model that considers platoons of vehicles instead of individual vehicles. The model attempts to minimize a disutility index based on delay, stops, and queue lengths [3]. This approach has been found to be appropriate only for under-saturated conditions [4]. PASSER is an arterial-based bandwidth optimizer (i.e., it maximizes the green band to move the anticipated platoon of vehicles through the arterial signal system without stopping) that computes phase sequences, cycle lengths, and offsets for a maximum of 20 intersections in a single run [4]. PASSER works within a given cycle length and split to find offsets that maximize an arterial green band.

Actuated traffic signal control, on the other hand, responds to changes in traffic demand patterns. This type of control requires that vehicle detectors be installed at the approach stop lines of the intersection. The actuated timing plan responds to traffic demand by placing a call to the controller at the presence or absence of vehicles approaching or leaving the intersection, respectively. Once a call is received, the controller decides whether to extend or terminate the green phase in response to the actuation source. Note, however, that while actuated signal control has been shown to perform better than fixed-time traffic signal control in most cases, it does not offer any real-time optimization to properly adapt to traffic fluctuations. Consequently, actuated signal control is less sensitive to the traffic demand (i.e., number of vehicles) calling for the actuation and might result in very long queues in grid-like networks [5].

Adaptive systems use detector inputs, historical trends, and predictive models to predict traffic arrivals at intersections. Using these predictions, they determine the best gradual changes in cycle length, splits, and offsets to optimize an objective function, such as minimizing the delay or the queue length, for intersections within a predetermined subarea of a network [6]. Examples in this category are the SCOOT and SCATS systems. The SCOOT system minimizes a performance index that is a function of delay and number of vehicle stops at all approaches in the network [7]. SCOOT performs effectively in under-saturated traffic conditions, and is a macroscopic model that does not capture microscopic behavior such as gap acceptance and lane-changing behavior. SCATS monitors the traffic flows and headways at the stop bars [8]. Based on the volumes and headways gathered in one-minute intervals, green times (splits) are reallocated to the phases of greatest need. Other examples of adaptive systems are RHODES [9] and OPAC [10], which optimize an objective function for a specified rolling horizon (using traffic prediction models) and have pre-defined sub-areas (limited flexibility) in which the signals can be coordinated. RHODES and OPAC are based on dynamic programming, which requires a state transition probability model for the traffic environment that is difficult to obtain.

One of the main disadvantages of actuated and adaptive traffic control algorithms is that their operation is constrained by maximum and minimum values for cycle lengths, splits, and offsets. In addition, some of today's most sophisticated traffic control systems use hierarchies that either partially or completely centralize the decisions, making the systems more vulnerable to failures in one of the master controllers. In such events, the entire area of influence of the master traffic signal, which may include several intersections, will be compromised by a single failure. Hierarchies also make systems more difficult to scale up, as centralized computers will need to interconnect all intersections within pre-defined subareas, creating limitations and requirements as the network is expanded [11].

Traffic flow is highly dependent on factors such as time-of-day, day-of-the-week, weather, and unpredictable events such as incidents, special events, and work zones. Consequently, improvements to traffic control strategies could be made if the control system is able not only to respond to the actual conditions found in the field, but also to adapt its actions to transient conditions. Cycle-free strategies could also offer a new, less restrictive perspective to accommodate changes in traffic conditions.

Game theory is considered a suitable method that has the potential to adapt to traffic fluctuations and the randomness of traffic systems, and therefore to alleviate traffic congestion more effectively than the more commonly used FP and ACT systems [12]. Game theory studies the interactive cooperation between intelligent rational decision makers, and has been widely used in economics, military applications, and communications. Game theory has also been applied to model traveler route choice behavior [13], to control connected vehicle movements [14], and in route guidance [15]. Tan et al. [16] were the first to use Nash bargaining (NB) to optimize the operation of a two-phase traffic signal. The performance of their algorithm was assessed using the average speed of all vehicles in the network. Apart from this study, the literature on game-theoretic traffic signal control is very limited.

In this paper, we develop the NB algorithm, which uses a cycle-free control strategy to optimize isolated signalized intersection traffic signal timings. The algorithm is then tested on a signalized intersection located in the heart of downtown Toronto's financial district, with four approaches of three lanes each, considering different traffic demand levels. To evaluate the performance of the NB approach, each of the following is calculated per movement: average travel time, average stopped delay, average queue length, average vehicle speed, average vehicle throughput, average fuel consumption, and average emission levels. Results are then compared with the results obtained using FP and ACT controllers, given that it is difficult to find a benchmark with available operational details due to commercial reasons.

The paper is organized as follows. Section II describes the game theory concept and the NB solution, and describes how to control a signalized intersection using a game-theoretic framework. Section III discusses the simulation setup and the results for different traffic volume situations. Section IV summarizes and concludes the work.

# II. Traffic Signal Control Nash Bargaining Solution

This section describes the NB solution for two players, as shown in Section II-A, and how the approach is extended to four players to control a signalized intersection, as shown in Section II-B.

# a) NB Solution for Two Players Considering a Cooperative Game

A bargaining situation is defined as a situation in which multiple players with specific objectives cooperate and benefit by reaching a mutually agreeable outcome (agreement). In bargaining theory, there are two concepts: the bargaining process and the bargaining outcome. The bargaining process is the procedure that bargainers follow to reach an agreement (outcome), and the bargaining outcome is the result of the bargaining process. Nash adopted an axiomatic approach that abstracts the bargaining process and considers only the bargaining outcome [17], [18]. Bargaining theory is related to cooperative games through the concept of NB. The NB solution has been applied in a number of applications, including multimedia resource management [19], allocating multiuser channels to networks [20], a wireless cooperative relaying network [21], investment, wages and employment [22], [23], and downlink beamforming in an interference channel [24].

The bargaining problem consists of three basic elements: players, strategies, and utilities (rewards). Bargaining between two players is illustrated in the bimatrix shown in Table I. Each player, namely P1 and P2, has a set of possible actions A1 and A2, whose outcome preferences are given by the utility functions u and v, respectively, as they take relevant actions.

Table 1: Two Players Matrix Game

| | P2: A1 | P2: A2 |
|---|---|---|
| P1: A1 | u1, v1 | u2, v2 |
| P1: A2 | u3, v3 | u4, v4 |

The utility area (S) of the two-player cooperation game is shown in Fig. 1; the vertices of the area are the utilities where each player chooses their pure strategy. The disagreement or threat point d = (d1, d2) corresponds to the minimum utilities that the players want to achieve. The disagreement point is a benchmark, and its selection affects the bargaining solution. Each player attempts to choose their disagreement point in order to maximize their bargaining position. The NB solution can be obtained from the following maximization problem:

$$\max_{u,v}\ (u - d_1)(v - d_2), \quad \text{s.t. } (u, v) \in S,\ (u, v) \ge (d_1, d_2) \tag{1}$$

The NB solution (u*, v*) of this optimization problem is the point in the bargaining set that maximizes the product of the players' utility gains relative to a fixed disagreement point.

This section describes the game model and the NB solution for four players, and shows how the model is adapted and applied to control a four-phase signalized intersection. First, we use the standard NEMA phasing for a four-legged intersection to represent the intersection phases, as shown in Fig. 2, with protected, leading main-street left-turn phases.

Fig. 2: Standard NEMA phasing [25]

In the game model, the four phases represent the players P1, P2, P3, and P4 of a four-player cooperation game. For each player (phase), there are two possible actions: maintain (A1) or change (A2). These actions represent the state of the traffic signal. Specifically, maintain indicates that the state of the signal will not change (i.e., if it is green, it will remain green; if it is red, it will remain red). Change means the state of the signal will change (i.e., if it is green, it will switch to yellow and then red; if it is red, it will become green) in the simulated time interval. The combinations of phases offer four possibilities, where only one player holds the green indication and all others hold red indications.

In the simulation, the INTEGRATION traffic simulation software monitors the vehicle speeds and the vehicle flow approaching the intersection and continuously updates them for each lane connected to the signalized intersection. If the speed $s_v^t$ of vehicle v is less than a certain threshold speed ($s_{Th}$) at time t, the vehicle is assigned to the queue, and the current queue length associated with the corresponding lane (l) is updated. Once the vehicle's speed exceeds $s_{Th}$, the queue length is updated (i.e., shortened by the number of vehicles leaving the queue). This is formulated mathematically as

$$q_l^t = \sum_{v \in v_l^t} q_v^t \tag{2}$$

$$q_v^t = \begin{cases} 1 & \text{if } s_v^{t-1} > s_{Th} \text{ and } s_v^t \le s_{Th} \\ -1 & \text{if } s_v^{t-1} \le s_{Th} \text{ and } s_v^t > s_{Th} \\ 0 & \text{if } s_v^{t-1} \le s_{Th} \text{ and } s_v^t \le s_{Th}, \text{ or if } s_v^{t-1} > s_{Th} \text{ and } s_v^t > s_{Th} \end{cases} \tag{3}$$

where $q_l^t$ is the number of queued vehicles in lane l at time t.

The utilities (rewards) for each player (phase) in the game can be defined as the estimated sum of the queue lengths in each phase after applying a specific action. The estimated queue length after applying a specific action is calculated according to the following equation:

$$Q_P(t + \Delta t) = \sum_{l \in P} \left( q_l^t + Q_{in_l}\,\Delta t - Q_{out_l}\,\Delta t \right) \tag{4}$$

where $\Delta t$ is the updating time interval, $q_l^t$ is the current queue length at time t, $Q_P(t + \Delta t)$ is the estimated queue length after $\Delta t$ for phase P, $Q_{in_l}$ is the arrival flow rate (veh/h/lane), and $Q_{out_l}$ is the departure flow rate (veh/h/lane).
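The queue-based utilities described above can be combined with a Nash product to pick which phase should hold the green indication. The following Python sketch is only an illustration under stated assumptions, not the authors' implementation: the function names (`estimated_queue`, `nash_product`, `choose_green_phase`) are hypothetical, the updating interval is converted from seconds to hours to match flows in veh/h/lane, lane queues are clamped at zero, the green phase is assumed to discharge at a given saturation flow while red phases discharge nothing, and the example queues and flows are made up.

```python
from math import prod


def estimated_queue(queues, inflow, outflow, dt_s):
    """Estimate a phase's total queue after dt_s seconds, in the spirit of Eq. (4).

    queues  : current queued vehicles per lane of the phase
    inflow  : arrival flow rate per lane (veh/h/lane)
    outflow : departure flow rate per lane (veh/h/lane)
    """
    dt_h = dt_s / 3600.0  # assumption: seconds converted to hours to match veh/h
    # assumption: a lane's queue cannot drop below zero
    return sum(max(0.0, q + (qin - qout) * dt_h)
               for q, qin, qout in zip(queues, inflow, outflow))


def nash_product(utilities, threat):
    """Product of utility gains over the threat point; None if any gain is negative."""
    gains = [u - d for u, d in zip(utilities, threat)]
    if any(g < 0 for g in gains):
        return None  # this outcome violates u >= d for some player
    return prod(gains)


def choose_green_phase(queues, inflow, sat_flow, threat, dt_s):
    """Try each phase as the one holding green; return the index with the largest
    Nash product, or None if no candidate satisfies the threat point."""
    best_phase, best_value = None, None
    for green in range(len(queues)):
        utilities = []
        for p, lanes in enumerate(queues):
            # assumption: only the green phase discharges vehicles
            out = sat_flow[p] if p == green else [0.0] * len(lanes)
            # utility is minus the estimated queue length of the phase
            utilities.append(-estimated_queue(lanes, inflow[p], out, dt_s))
        value = nash_product(utilities, threat)
        if value is not None and (best_value is None or value > best_value):
            best_phase, best_value = green, value
    return best_phase


# Made-up example: four phases with one lane each.
example_queues = [[5], [20], [3], [10]]
example_inflow = [[360], [360], [360], [360]]        # veh/h/lane
example_sat_flow = [[1800], [1800], [1800], [1800]]  # veh/h/lane
example_threat = (-17, -55, -19, -51)                # minus queue capacities
print(choose_green_phase(example_queues, example_inflow,
                         example_sat_flow, example_threat, dt_s=10))
```

With these made-up numbers the sketch selects phase index 0: although its queue is short, its threat point is tight, so serving it yields the largest product of gains. Enumerating the four candidate outcomes is feasible here because only one phase can hold green at a time.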
The objective is to minimize and equalize the queue lengths across the different phases [26], [27]. We use minus queue length as the utility of each strategy. The NB solution is extended to four players with a four-dimensional utility space and disagreement points. The solution for the NB over the four phase combinations has the following formula:

$$\max_{u_1,\dots,u_4}\ \prod_{i=1}^{4} (u_i - d_i), \quad \text{s.t. } (u_1, \dots, u_4) \in S,\ (u_1, \dots, u_4) \ge (d_1, \dots, d_4) \tag{5}$$

The NB solution can be calculated as the vector that maximizes the product of the players' utility gains relative to a fixed disagreement point.

# III. Simulation Setup and Results

This section describes the testbed intersection used in the simulation study (Section III-A), the traffic simulator used in the simulation (Section III-B), the measures of effectiveness used to evaluate the performance of the system (Section III-C), the simulation parameters (Section III-D), and the simulation results when applying the various control strategies (Section III-E).

# a) Test bed Intersection

The simulations were conducted on an intersection with four approaches of three lanes each, located in the heart of downtown Toronto's financial district (intersection of Front and Bay streets) [28], as shown in Fig. 3. The traffic demand origin-destination (O-D) matrix provided in Table II [29] represents the highest total demand approaching the intersection during the afternoon rush hour (PM peak) for the year 2005.

Table 2: Origin Destination Demand Matrix

# b) Traffic Simulator

INTEGRATION software was used to model the intersection [30]. It is a microscopic traffic simulation model that traces individual vehicle movements every deci-second. Driver characteristics such as reaction times, acceleration and deceleration rates, desired speeds, and lane-changing behavior are examples of stochastic variables that are incorporated in INTEGRATION [31].

# c) Measures of Effectiveness (MOEs)

The following measures of effectiveness were used to evaluate the performance of the system:

* Average Total Delay (s/veh): the sum of the delay accumulated each deci-second by all vehicles over the entire simulation horizon, divided by the number of vehicles.
* Average Fuel (L): the total volume of fuel consumed by vehicles divided by the number of vehicles.
* Average CO2 (grams): the total amount of CO2 produced divided by the total number of vehicles.
* Last Vehicle Arrival Time (s): the arrival time of the last vehicle at its destination.

# d) Simulation Parameters

The fixed-time signal plan was optimized using the Webster method [2], with a yellow time of 3 s and an all-red time of 2 s. The optimized effective green times for the four phases shown in Fig. 2 were 19 s, 47 s, 14 s, and 32 s, respectively. The actuated control was implemented with a minimum green time of 10 s, a maximum green time of 78 s, and a green extension time of 5 s. The simulations were conducted using the following parameter values: speed at capacity = 60 km/h, free-flow speed = 80 km/h, jam density = 160 veh/km/lane, saturation flow rate = 1900 veh/h/lane, and threshold speed $s_{Th}$ = 4.5 km/h.

# e) Results and Discussion

An optimum FP controller and an ACT controller were simulated to serve as benchmarks to evaluate the performance of the NB approach. Vehicles were allowed to enter the links in the first hour, and the simulation ran for an extra half hour to guarantee that all vehicles exited the network. Three scenarios were simulated: the first for the original O-D demand shown in Table II, the second for a lower demand (L-D), i.e., 25% below the original demand, and the third for a higher demand (H-D), i.e., 25% above the original demand.

# 1) Original Demand (O-D)

The simulation results shown below were obtained using three signal control systems: FP, ACT, and NB. The MOEs are shown in Table III to quantify the effect of each control system on the performance of the signalized intersection.
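The three demand levels used in the study (original, 25% lower, 25% higher) amount to a uniform scaling of the base O-D matrix. The short Python sketch below illustrates that arithmetic only. It assumes the demand values as recovered from Table II, omits the origin-destination pairs that the table shows as "-", and applies no rounding because the paper does not state how fractional trips were handled; the names `BASE_OD`, `scale_demand`, and `total_demand` are hypothetical.

```python
# Base O-D demand as recovered from Table II:
# origin zone -> {destination zone: demand}.
# Pairs shown as "-" in the table are omitted (assumed to carry no demand).
BASE_OD = {
    1: {2: 1223, 6: 134, 8: 121},
    3: {4: 844, 6: 86, 8: 278},
    5: {2: 88, 4: 71, 6: 721},
    7: {2: 188, 4: 100, 8: 806},
}

# Scaling factors for the lower, original, and higher demand scenarios.
SCENARIOS = {"L-D": 0.75, "O-D": 1.00, "H-D": 1.25}


def scale_demand(od, factor):
    """Return a new O-D matrix with every demand multiplied by factor."""
    return {origin: {dest: demand * factor for dest, demand in row.items()}
            for origin, row in od.items()}


def total_demand(od):
    """Sum the demand over all O-D pairs."""
    return sum(demand for row in od.values() for demand in row.values())


for name, factor in SCENARIOS.items():
    print(name, total_demand(scale_demand(BASE_OD, factor)))
```

The base matrix sums to 4660, which matches the grand total reported in Table II, so the lower and higher scenarios correspond to totals of 3495 and 5825 under this uniform-scaling assumption.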
Five cases were conducted at different threat points (d) and at different updating intervals for NB (Δt) in order to study their effect on the performance of the NB algorithm. First, the performance of the intersection using the three control systems (FP, ACT, NB) was investigated (case 1). In this case, the threat point was chosen based on the number of cars that the left-turn pocket lanes could accommodate to prevent spillback into the through lane, where this number is duplicated for the right and the through movements.

Table 3: Overall Intersection Performance Measure For Different Control Systems

The simulation results in Table III show that the NB approach outperforms the optimum FP and ACT controllers. Since the traffic flow is high on all approaches, no considerable difference is reported between the FP and the ACT controllers. The NB approach exhibits significant savings in the average total delay, average stopped delay, average queue length, and average travel time. The NB approach also shows an increase in the average vehicle speed and in the throughput.

Subsequently, the performance of the intersection using the proposed NB approach was investigated using different threat point values at the same updating interval, with the following parameter values: case 2, d = (-17, -55, -19, -51), Δt = 15 s. In this case, the threat point was chosen based on the number of cars that each phase can accommodate based on the lane lengths, shown in Fig. 3, where the right-turn and through lanes can accommodate more cars than the left-turn lanes. The results in Table III show that the MOEs in case 2 outperform those in case 1.

Finally, three more simulations were conducted using the proposed NB algorithm to investigate the effect of the choice of the updating time interval on the algorithm performance using the same threat point values: case 3, Δt = 10 s; case 4, Δt = 5 s; and case 5, Δt = 20 s, each with d = (-17, -55, -19, -51). The results in Table III show that case 3 outperforms the other cases, as well as the FP and ACT approaches. Fig. 4 shows the average queue length and the standard deviation across all movements for each control system (FP, ACT, and NB). The NB algorithm shows a significant reduction in the queue length.

Fig. 4: Average queue length

Fig. 5 shows the average values and the standard deviations of the MOEs across all movements over the entire simulation time for each control system (FP, ACT, and NB). The NB algorithm outperforms both FP and ACT for all movements, with significant reductions in both the average values and the standard deviations for the total delay, stopped delay, arrival time, fuel consumption, and CO2 emission. In addition, the NB algorithm shows an increase in the average vehicle speed. The simulation results showed that the NB control approach exhibited major improvements in both the average values and the standard deviations of all MOEs for different movements, which indicates that the system efficiency is improved.

# 2) Lower and Higher Demand

To better evaluate the performance of the NB approach, two other simulations were conducted, one at lower demand (L-D) and the other at higher demand (H-D). Table IV shows the results of using the three control approaches at the O-D, L-D, and H-D levels using the following NB algorithm parameters: d = (-17, -55, -19, -51), Δt = 10 s. In addition, Table IV shows the percent improvement in MOEs using the proposed NB algorithm over using either the FP or the ACT approach. The analysis of the results in Table IV leads to the following finding: the proposed NB algorithm outperforms the FP and ACT approaches in terms of average stopped delay, average queue length, average travel time, average vehicle speed, average throughput, average fuel consumed, average CO2 emitted, and the time at which the last vehicle clears the network, for the different demand levels.

To further investigate the achieved improvements using the NB approach, simulations were conducted at different flow ratios (Y).
The flow ratio can be formulated mathematically as

$$y_i = \frac{v_i}{s_i}, \qquad Y = \sum_{j} y_{c,j} \tag{6}$$

where $y_i$ is the approach flow ratio for lane group i, $v_i$ is the traffic volume, $s_i$ is the saturation flow rate, $y_{c,j}$ is the critical flow ratio for all lane groups that discharge during phase j, and Y is the sum of all critical flow ratios for all phases. Fig. 6 shows the average queue length, the average total delay, and the average CO2 at different flow ratios; Y varies from 0.1 to 1.2. These results show that significant improvements are achieved using the NB approach at different traffic volumes. In summary, the simulation results showed that the NB control approach exhibited major improvements in the MOEs for all movements when compared to the FP and ACT algorithms, which improves the system efficiency.

# IV. Summary & Conclusions

The paper developed a Nash bargaining (NB) isolated traffic signal controller. The INTEGRATION microscopic traffic assignment and simulation software was used to evaluate the performance of the algorithm relative to an optimum fixed-time plan and an actuated controller on a major intersection in downtown Toronto using observed traffic data. Five NB algorithm cases were simulated considering different update time intervals and different threat point values to study the effect of these parameters on the algorithm's performance. The simulation results using the NB approach show that, using relatively short cycle lengths, it is possible to minimize delay and maximize traffic flow efficiency. To evaluate the benefits of using the proposed approach, three scenarios were simulated using the three control approaches for different traffic demand levels.
The results show significant reductions in the average total delay ranging from 41% to 64%, a reduction in the average queue length ranging from 58% to 77%, a reduction in the emission levels ranging from 6% to 17%, a reduction in the average travel time ranging from 37% to 65%, and a reduction in the network clearance time ranging from 1% to 12%. To further investigate the achieved improvements using the NB approach, simulations were conducted at different flow ratios. The simulation results demonstrate a significant potential for the NB approach over FP and ACT. Moreover, the results show that major improvements are achievable using the NB algorithm regardless of the traffic demand level. Ongoing research entails extending the work to test the NB algorithm on an arterial facility.

![Fig. 1: Utility region](image-2.png "Fig. 1")

![Fig. 3: Simulated intersection in downtown Toronto](image-4.png "Fig. 3")

![Eq. (5): max ∏ (u_i - d_i), s.t. (u_1, ..., u_4) ∈ S, (u_1, ..., u_4) ≥ (d_1, ..., d_4)](image-5.png "")

![Case 2: d = (-17, -55, -19, -51), Δt = 15 s. Case 3: d = (-17, -55, -19, -51), Δt = 10 s. Case 4: d = (-17, -55, -19, -51), Δt = 5 s. Case 5: d = (-17, -55, -19, -51), Δt = 20 s.](image-6.png "Case 2")

![d = (-17, -55, -19, -51), Δt = 10 s. (a) Average total delay. (b) Average stopped delay. (c) Average travel time. (d) Average vehicle speed.](image-7.png "")

![Fig. 5: Measure of effectiveness](image-8.png "Fig. 5")

![(a) Average queue length.](image-9.png "")

![Fig. 6: Measure of effectiveness vs. flow ratio](image-10.png "Fig. 6")

Fig. 7 shows the average queue length at two different flow ratios (Y) (i.e., 0.1 and 1.2). Considerable reductions in the queue lengths were found for all movements.

![Fig. 7: Average queue length vs. flow ratio.](image-11.png "Fig. 7")

Table 2: Origin Destination Demand Matrix

| Zone # | 2 | 4 | 6 | 8 | Total |
|---|---|---|---|---|---|
| 1 | 1223 | - | 134 | 121 | 1478 |
| 3 | - | 844 | 86 | 278 | 1208 |
| 5 | 88 | 71 | 721 | - | 880 |
| 7 | 188 | 100 | - | 806 | 1094 |
| Total | 1499 | 1015 | 941 | 1205 | 4660 |

Global Journal of Researches in Engineering (B), Volume XVI, Issue I, Version I, Year 2016. © 2016 Global Journals Inc. (US)

# References

* [1] Y. Dai, D. Zhao, Z. Zhang, "Computational intelligence in urban traffic signal control: A survey," IEEE Transactions on Systems, Man, and Cybernetics, vol. 42, no. 4, July 2012.
* [2] C. Daganzo, Fundamentals of Transportation and Traffic Operations, 1997.
* [3] A. M. Turky, M. S. Ahmad, M. Yusoff, B. T. Hammad, "Using genetic algorithm for traffic light control system with a pedestrian crossing," 4th International Conference RSKT, Gold Coast, Australia, July 2009.
* [4] X. Yang, "Comparison among computer packages in providing timing plans for Iowa arterial in Lawrence, Kansas," Journal of Transportation Engineering, vol. 127, 2001.
* [5] R. Roess, E. Prassas, W. McShane, Traffic Engineering, 2010.
* [6] L. J. French, M. S. French, "Benefits of signal timing optimization and ITS to corridor operations," French Engineering, LLC, Tech. Rep., August 2006.
* [7] P. B. Hunt, D. I. Robertson, R. D. Bretherton, R. I. Winton, "SCOOT: a traffic responsive method of coordinating signals," Transport and Road Research Laboratory, Tech. Rep., 1981.
* [8] A. G. Sims, K. W. Dobinson, "SCAT: the Sydney co-ordinated adaptive traffic system, philosophy and benefits," International Symposium on Traffic Control Systems, 1979.
* [9] K. L. Head, P. B. Mirchandani, D. Sheppard, "Hierarchical framework for real-time traffic control," Transportation Research Record, no. 1360, 1992.
* [10] N. H. Gartner, "OPAC: A demand-responsive strategy for traffic signal control," Transportation Research Record: Journal of the Transportation Research Board, no. 906, 1983.
* [11] M. R. Evans, "Balancing safety and capacity in an adaptive signal control system, phase 1," Federal Highway Administration, Tech. Rep., October 2010.
* [12] S. Tantawy, B. Abdulhai, H. Abdelgawad, "Multiagent reinforcement learning for integrated network of adaptive traffic signal controllers," IEEE Transactions on Intelligent Transportation Systems, 2013.
* [13] J. Chen, "Game-theoretic formulations of interaction between dynamic traffic control and dynamic traffic assignment," Transportation Research Record, no. 1617, 1998.
* [14] M. Elhenawy, A. A. Elbery, A. A. Hassan, H. A. Rakha, "An intersection game-theory-based traffic control algorithm in a connected vehicle environment," Proceedings of the 2015 IEEE 18th International Conference on Intelligent Transportation Systems, Washington, DC, USA.
* [15] L. Jun, "Study on game-theory-based integration model for traffic control and route guidance," Tian Jin University, 2003.
* [16] T. Linglong, Z. Xiaohua, H. Dunli, S. Yanzhang, W. Ren, "A study of single intersection traffic signal control based on two-player cooperation game model," WASE International Conference on Information Engineering, Beidaihe, Hebei, August 2010.
* [17] Z. Han, D. Niyato, W. Saad, T. Basar, A. Hjorungnes, Game Theory in Wireless and Communication Networks: Theory, Models, and Applications, 1st ed. New York: Cambridge University Press, 2012.
* [18] G. Lai, C. Li, K. Sycara, J. Giampapa, "Literature review on multiattribute negotiations," Carnegie Mellon University, Robotics Institute, Tech. Rep., December 2004.
* [19] H. Park, M. Van, "Bargaining strategies for networked multimedia resource management," IEEE Transactions on Signal Processing, vol. 55, 2007.
* [20] Z. Han, K. J. R. Liu, "Fair multiuser channel allocation for OFDMA networks using Nash bargaining solutions and coalitions," IEEE Transactions on Communications, vol. 53, August 2005.
* [21] Z. Zhang, H. H. Chen, M. Guizani, P. Qiu, "A cooperation strategy based on Nash bargaining solution in cooperative relay networks," IEEE Transactions on Vehicular Technology, vol. 57, July 2008.
* [22] P. A. Grout, "Investment and wages in the absence of binding contracts: A Nash bargaining approach," Econometrica: Journal of the Econometric Society, 1984.
* [23] A. M. McDonald, R. M. Solow, "Wage bargaining and employment," The American Economic Review, 1981.
* [24] E. Larsson, E. Jorswieck, "Competition versus cooperation on the MISO interference channel," IEEE Journal on Selected Areas in Communications, vol. 26, 2008.
* [25] P. Holm, D. Tomich, J. Sloboden, C. Lowrance, "Traffic analysis toolbox: Guidelines for applying CORSIM microsimulation modeling software," Office of Operations, Federal Highway Administration, Tech. Rep., January 2007.
* [26] E. Camponogara, W. Kraus, "Distributed learning agents in urban traffic control," 11th Portuguese Conference on Artificial Intelligence, 2003.
* [27] D. Oliveira, A. Bazzan, "Reinforcement learning-based control of traffic lights in non-stationary environments," Fourth European Workshop on Multi-Agent Systems, 2006.
* [28] S. El-Tantawy, B. Abdulhai, "An agent-based learning towards decentralized and coordinated traffic signal control," Annual Conference on Intelligent Transportation Systems, Madeira Island, Portugal, September 2010.
* [29] S. El-Tantawy, B. Abdulhai, H. Abdelgawad, "Design of reinforcement learning parameters for seamless application of adaptive traffic signal control," Intelligent Transportation Systems: Technology, Planning, and Operations, vol. 18, July 2014.
* [30] M. V. Aerde, H. A. Rakha, "INTEGRATION release 2.40 for Windows: User's guide, volume II: Advanced model features," Tech. Rep., June 2013.
* [31] J. Sangster, H. Rakha, "Enhancing and calibrating the Rakha-Pasumarthy-Adjerid car-following model using naturalistic driving data," Transportation Science and Technology, vol. 3, 2014.