Study on Identification of Driver Steering Behavior Characteristics Based on Pattern Recognition
Department of State Key Laboratory of Automotive Simulation and Control, Jilin University, China
Personalization is one of the general trends in automotive development worldwide [1]. Vehicle driving is an interactive process among driver, vehicle, and environment. A good vehicle dynamics design should consider not only safety but also comfort and driving pleasure [2]. The control strategy for vehicle dynamics should vary with drivers of different driving characteristics. On the premise of ensuring safety, the design of vehicle dynamics should be driver-oriented so as to realize the “personalized ideal vehicle dynamic characteristics” [3-4], that is, the vehicle dynamic characteristics preferred by drivers of different driving characteristics [5]. Taking steering as an example, an aggressive driver may prefer a car that responds with a larger yaw rate and lateral acceleration, whereas a cautious driver may prefer a car that responds with a smaller yaw rate and lateral acceleration at the same speed and steering wheel angle input. If the differences between drivers can be taken into account at the design stage of vehicle dynamics, “personalized driving” can be realized. In other words, the current situation of “Driver adapts to Car” changes to one of “Car adapts to Driver actively”.
With the growing popularity of advanced driver assistance systems (ADAS) on vehicles, the interaction between ADAS and drivers has become a prominent problem. The keys are to make the system work in a way that is familiar to the individual driver and to enhance user acceptance of ADAS [6-8]. Taking the control strategy of adaptive cruise control (ACC) as an example, the ideal spacing policy may differ between drivers. Under the same conditions, a cautious driver may want ACC to intervene earlier, whereas an aggressive driver may want it to intervene later in pursuit of driving pleasure [9]. To make the ACC system adapt to individual drivers, driver characteristics need to be considered in the design of the control algorithm.
Based on the above, and on the premise of ensuring driving safety and comfort, it is necessary to study the identification of driver characteristics in order to implement “Car adapts to Driver actively” and improve user acceptance of ADAS.
From the point of view of information processing, the driver characteristics are reflected in the process of information processing of perception, judgment and driving manipulation in driver-car-environmental interaction. The influences lie in physiology and psychology. The driver characteristics are a relatively stable internal behavior tendency and a combination of personal psychology, personality traits and behavior patterns [10].
Driver characteristics include driving habits, driving experience, driving skills, driving style and so on. Considering the risk of real road vehicle testing, much of the previous research on driver characteristics is based on widely used driving behavior questionnaires, such as the Driver Behaviour Questionnaire (DBQ) [11], Driving Anger Scale (DAS) [12], Driving Skill Inventory (DSI) [13] and Driving Behavior Inventory (DBI) [14]. Perceptual analysis and statistical analysis are applied to the data obtained from these questionnaires to improve road traffic safety from the perspective of the driver.
By comparison, research on driver characteristics based on virtual experiments, such as driving simulator experiments, is an effective method that deserves wider adoption. Most current research on driver characteristics serves driver fatigue monitoring [15], driver intention recognition [16], personalized driving assistance system development [8], improvement of road traffic safety [17], and performance optimization of the driver-vehicle closed-loop system [18]. In this paper, we mainly study driving habits when drivers face different traffic conditions, especially steering behavior. The drivers’ steering manipulations and the vehicle’s motion state are used to identify the drivers’ steering behavior characteristics. The overall process is shown in Figure 1. Steering experiments that capture drivers’ steering manipulations are designed on a driving simulator built from a real vehicle. From the raw data obtained, feature parameters are extracted as clustering samples. These samples are ultimately clustered into five classes with the aid of K-means and Gaussian Mixture Model (GMM). In the clustering process, K-means performs a rough classification and provides initial parameters for GMM, which improves the rate of convergence. Finally, from the clustering samples and the corresponding cluster labels, two identification models are built, one based on BP artificial neural network (BP_ANN) and one based on support vector machine (SVM). By comparison, BP_ANN has the higher testing accuracy and is adopted as the final identification model of driver steering behavior characteristics in this paper.
Figure 1: Process of identification of driver steering behavior characteristics.
Design of steering experiment
This paper aims to study the identification of driver steering behavior characteristics. The experiment design, especially the geometric alignment of the road, is therefore very important for letting drivers express their steering behavior characteristics. The road should contain many bends on which drivers perform steering manipulations. Considering that obtaining data from real vehicle tests is expensive and unsafe, the steering experiments are conducted on a driving simulator that provides drivers with a virtual traffic environment.
The driving simulator is rebuilt from a real A-class car with the aid of CarSim RT, MATLAB/Simulink, and the DS1103 real-time simulation system from dSPACE. Detailed descriptions of the driving simulator are shown in Figures 2 and 3. The DS1103, as the core of the hardware platform, collects the driving manipulations in real time for the CarSim dynamic model, and the calculated steering resistance moment is sent to the real A-class car over the CAN bus for road feel simulation. ControlDesk, as the debugging tool, monitors the overall status of the experiments.
The road is designed in CarSim RT, and its geometric alignment is shown in Figure 4. It is made up of many straightaways and bends. The straightaways let drivers adjust the vehicle position on the road and the vehicle speed before steering. The bends let drivers perform the steering manipulations in the way that feels right to them. The road length is 2234.5 m, and the road elevation is set to 0. There are two lanes, each 3.6 m wide. There are other traffic vehicles, houses, trees, and traffic cones along the road. Drivers can hear engine roar while accelerating and brake squeal while braking. A motor provides steering resistance for drivers. All of this helps drivers feel that they are driving a real vehicle on a real road.
Figure 2: Framework of the driving simulator.
Figure 3: Driving simulator.
Figure 4: Geometric alignment of experiment road.
Several drivers drive the simulator in the way they find comfortable, manipulating the steering wheel, the brake pedal and the accelerator pedal. At the same time, the experiment and debugging tool ControlDesk records the driver's manipulation information (steering wheel angle, and displacement of the brake pedal and the accelerator pedal) as well as the vehicle’s motion status information (speed and yaw velocity). The recording period is 0.005 s. Each driver performs the experiment on the driving simulator twice. If the vehicle runs off the road or becomes unstable, the test is unsuccessful and the data obtained are invalid.
Though only several drivers are selected as the experiment drivers, enough steering manipulations are recorded to analyze the driver steering behavior characteristics. The basic analysis unit is a single driver steering manipulation, which is called a sample in this paper. The drivers’ driving experience ranges from 2 to 20 years, with a mean of 5.5 years.
Feature parameters extracting
It has been shown that steering behavior characteristics differ from one driver to another, and on that basis the steering wheel angle is chosen as the driver steering behavior signal [19]. Vehicle speed and yaw rate are chosen as the vehicle’s state signals.
The perceptual analysis of the driving habits of drivers with different steering characteristics is as follows. First, compared to a cautious driver, an aggressive driver may tend to drive at a higher speed and turn the steering wheel more rapidly. Second, drivers’ habits may also differ in whether they frequently adjust the steering wheel angle within a small range. Third, drivers with different driving habits may be accustomed to different yaw rates. Based on the above, the mean rotation velocity of the steering wheel, the standard deviation of the steering wheel angle, the mean speed and the mean absolute value of the yaw rate while steering are selected as the feature parameters that profile each driver steering manipulation sample extracted from the raw signals.
The steering wheel’s rotation during a turn can be divided into two sections. First, the driver actively turns the steering wheel to the maximum angle. Then the steering wheel returns to 0 degrees, driven by the opposing aligning moment from the road surface. Feature parameters are extracted from the first section. If the maximum absolute value of the steering wheel angle during this section is less than 10 degrees, the section is rejected, because the driver is then considered not to be performing a steering manipulation and may merely be adjusting the steering wheel on a small scale. This is shown in Figure 5, in which the valid sections during steering are shown in blue and boldface.
Figure 5: Valid sections during steering.
In total, 541 valid sections are extracted from all the steering experiments. This yields 541 driver steering manipulation samples, each of which is a four-dimensional vector consisting of the mean rotation velocity of the steering wheel, the standard deviation of the steering wheel angle, the mean speed and the mean absolute value of the yaw rate while the vehicle is turning. To eliminate the effect of differing units and magnitudes, all the feature parameters are normalized to the range from 0 to 1. All the following data processing is based on the feature parameters after this normalization (also called uniformization in this paper).
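The feature extraction and normalization described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, the mean rotation velocity is assumed to be the total absolute angle travel divided by the section duration, and the population standard deviation is assumed, since the paper states neither definition.

```python
from statistics import mean, pstdev

MIN_PEAK_DEG = 10.0  # sections whose peak |angle| is below this are rejected


def extract_features(t, angle, speed, yaw_rate):
    """Return the four features of one steering section, or None if rejected."""
    if max(abs(a) for a in angle) < MIN_PEAK_DEG:
        return None  # small correction, not counted as a steering manipulation
    duration = t[-1] - t[0]
    # Assumed definition: total absolute angle travel divided by duration.
    rotation_velocity = sum(abs(b - a) for a, b in zip(angle, angle[1:])) / duration
    return [
        rotation_velocity,
        pstdev(angle),  # population standard deviation (an assumption)
        mean(speed),
        mean([abs(r) for r in yaw_rate]),
    ]


def min_max_normalize(rows):
    """Scale every column of a list of feature vectors to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]
```

Normalizing column by column keeps a feature with large numeric values, such as speed in km/h, from dominating the distance computations used later in clustering.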
Clustering process
Clustering, or cluster analysis, is the task of grouping data in such a way that samples in one group share some properties with the other samples in the same group but not with the samples in any other group. Clustering can be divided into hard clustering (an object either belongs to a cluster or does not) and soft clustering (an object belongs to each cluster to a certain degree, that is, with a certain likelihood of belonging to the cluster).
Gaussian Mixture Model (GMM) is a parametric estimate of a probability density function, represented as a weighted sum of a number of individual Gaussian distributions. The GMM form is as follows.
$P(x)=\sum_{i=1}^{k}\pi_{i}\,N(x|\mu_{i},\sigma_{i}^{2})$
(1)
Here k is the number of components, $\pi_{i}$ (i=1,2,…,k) are the mixture weights, and $N(x|\mu_{i},\sigma_{i}^{2})$ (i=1,2,…,k), with mean $\mu_{i}$ and variance $\sigma_{i}^{2}$, are the component Gaussian density functions. Each component density is a univariate Gaussian function of the form:
$N(x|\mu_{i},\sigma_{i}^{2})=\frac{1}{\sqrt{2\pi}\,\sigma_{i}}\exp\left[-\frac{1}{2\sigma_{i}^{2}}(x-\mu_{i})^{2}\right]$
(2)
The mixture weights $\pi_{i}$ indicate the percentage of the driver steering manipulation samples belonging to each category i and satisfy the constraint $\sum_{i=1}^{k}\pi_{i}=1$. The complete GMM is parameterized by the means $\mu_{i}$, variances $\sigma_{i}^{2}$ and mixture weights $\pi_{i}$ of all the component Gaussian densities. These parameters are collectively represented by the notation $\theta=(\pi_{i},\mu_{i},\sigma_{i}^{2})$.
The parameters $\theta$ of the GMM are estimated from given training data by utilizing the Expectation Maximization (EM) and Maximum Likelihood (ML) algorithms to maximize the log-likelihood
$J(\theta)=\ln\left[\prod_{j=1}^{M}P(x_{j})\right]=\sum_{j=1}^{M}\ln\left[P(x_{j})\right],\quad j=1,2,\dots,M$
(3)
M is the total number of driver steering manipulation samples, 541 in this paper. $x_{j}$ is a vector representing one driver steering manipulation sample, which consists of four parameters: the mean rotation velocity of the steering wheel, the standard deviation of the steering wheel angle, the mean speed and the mean absolute value of the yaw rate while steering. The Expectation-Maximization algorithm is a commonly used standard approach for estimating the parameters of a GMM, and the Maximum Likelihood algorithm can be considered part of it. EM is an iterative algorithm that estimates the parameters maximizing the likelihood of the driver steering manipulation samples. GMM is a soft clustering algorithm that gives the likelihood of each object belonging to each cluster [20].
In the rest of this paper, EM is short for Expectation Maximization Algorithm and GMM is short for Gaussian Mixture Model.
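As a hedged illustration of Equations (1) to (3), the following sketch evaluates the univariate Gaussian density, the mixture density and the log-likelihood in plain Python. The function names are hypothetical, and the code follows the univariate form printed in Equation (2); for four-dimensional samples the variances would be replaced by covariance matrices.

```python
import math


def gaussian_pdf(x, mu, var):
    """Univariate Gaussian density N(x | mu, var), as in Equation (2)."""
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def gmm_pdf(x, weights, means, variances):
    """Mixture density P(x): weighted sum of component densities, Equation (1)."""
    return sum(w * gaussian_pdf(x, m, v)
               for w, m, v in zip(weights, means, variances))


def log_likelihood(xs, weights, means, variances):
    """Log-likelihood J(theta) of all samples, Equation (3)."""
    return sum(math.log(gmm_pdf(x, weights, means, variances)) for x in xs)
```

Working with the sum of logarithms rather than the product of densities avoids numerical underflow when the number of samples is large.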
The most powerful attribute of the GMM is its ability to form smooth approximations of arbitrarily shaped densities. Considering this advantage, together with the soft assignment it provides, GMM is utilized in this paper for unsupervised clustering of the 541 driver steering manipulation samples. The issues in using GMM for unsupervised clustering are determining the number of components and initializing the parameters of each component.
The whole clustering process is organized as follows. First, the optimal clustering number for the driver steering manipulation samples is found, which is also the number of components k. Second, the samples are clustered roughly by the K-means clustering method, and the clustering result initializes the parameters of each component in GMM. Finally, all the driver steering manipulation samples are clustered with the aid of the Gaussian Mixture Model (GMM) by utilizing the Expectation Maximization (EM) and Maximum Likelihood (ML) algorithms.
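One EM iteration for a univariate GMM can be sketched as below. This is a generic textbook form under stated assumptions, not the authors' code: the names `em_step` and `hard_labels` are hypothetical, and the univariate case is used to match Equation (2), whereas the paper's samples have four dimensions.

```python
import math


def em_step(xs, weights, means, variances):
    """Run one EM iteration; return updated parameters and responsibilities."""
    k = len(weights)
    # E-step: responsibility of each component for each sample.
    resp = []
    for x in xs:
        dens = [w * math.exp(-((x - m) ** 2) / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)
                for w, m, v in zip(weights, means, variances)]
        total = sum(dens)
        resp.append([d / total for d in dens])
    # M-step: re-estimate weights, means and variances from responsibilities.
    n_i = [sum(r[i] for r in resp) for i in range(k)]
    new_weights = [n / len(xs) for n in n_i]
    new_means = [sum(r[i] * x for r, x in zip(resp, xs)) / n_i[i] for i in range(k)]
    new_vars = [sum(r[i] * (x - new_means[i]) ** 2 for r, x in zip(resp, xs)) / n_i[i]
                for i in range(k)]
    return new_weights, new_means, new_vars, resp


def hard_labels(resp):
    """Assign each sample to the component with the largest responsibility."""
    return [max(range(len(r)), key=lambda i: r[i]) for r in resp]
```

In practice the step is repeated until the log-likelihood stops increasing. Starting from K-means results rather than random values is what makes the first responsibilities already close to the final ones.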
Optimal clustering number for driver steering manipulation samples clustering
The jury is still out on the optimal classification number for clustering driver behavior characteristics. Previous studies classify driver behavior characteristics into three to six classes from different perspectives [21-24]. What they have in common is that most of these analyses are perceptual. It is necessary and meaningful to study the optimal number for clustering driver steering manipulation samples from the perspective of rational analysis, so clustering validity is taken into account to solve the problem.
The clustering validity index BWP, proposed by Zhou, is used: the bigger the clustering validity index, the better the clustering result [25]. Considering that clustering numbers of less than three or more than eight are of little practical concern, the candidate clustering numbers are set from three to eight. Each candidate from three to eight is evaluated to find the optimal clustering number. This step is repeated 50 times, and the mean optimal clustering number is 5.3. So the optimal clustering number for driver steering manipulation samples clustering is set to 5. One of the 50 optimization runs is shown in Figure 6.
Figure 6: Optimization process to find the optimal clustering number.
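The selection procedure (evaluate each candidate number, keep the best, repeat 50 times, average and round) can be sketched as follows. The `validity` callable is a hypothetical stand-in for clustering the samples with k clusters and returning the BWP index; the BWP formula itself is not reproduced here, and the toy scoring function is invented purely to make the example runnable.

```python
import random


def select_cluster_number(validity, candidates=range(3, 9), repeats=50, seed=0):
    """Return (mean of per-run best k, rounded k) over repeated runs.

    validity(k, run_seed) is assumed to cluster the samples into k groups,
    using run_seed for the random initialization, and to return a validity
    index where larger means better.
    """
    rng = random.Random(seed)
    best_per_run = []
    for _ in range(repeats):
        run_seed = rng.randrange(10 ** 6)
        scores = {k: validity(k, run_seed) for k in candidates}
        best_per_run.append(max(scores, key=scores.get))
    mean_best = sum(best_per_run) / repeats
    return mean_best, round(mean_best)


# Toy validity function (hypothetical): peaks at k = 5 regardless of the seed.
mean_k, chosen_k = select_cluster_number(lambda k, run_seed: -abs(k - 5))
```

Repeating the search matters because K-means style clustering depends on its random initialization, so a single run may favor a neighboring value of k.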
K-means clustering results
GMM is used in this paper for the final clustering of the driver steering manipulation samples, and the Expectation Maximization algorithm (EM) is usually applied to estimate the parameters of GMM [20]. EM is sensitive to the initial parameters, so the K-means clustering method is used first to cluster the driver steering manipulation samples roughly. The K-means clustering results, namely the proportion of each cluster in the total samples and the mean and variance of every cluster, initialize the parameters of each component in GMM. As determined in the previous subsection, the optimal clustering number for driver steering manipulation samples clustering is set to 5. The proportion of each cluster in the total samples is shown in Table 1. The mean of each cluster is shown in Table 2.
Cluster | 1 | 2 | 3 | 4 | 5
Proportion | 0.2218 | 0.2514 | 0.0684 | 0.3512 | 0.1072
Table 1: Proportion of each cluster of K-means clustering results.
Parameters (normalized to [0, 1]) \ Cluster Num | 1 | 2 | 3 | 4 | 5
Mean Rotation Velocity of Steering Wheel | 0.2027 | 0.0853 | 0.5273 | 0.1067 | 0.2830
Standard Deviation of Steering Wheel Angle | 0.0565 | 0.1059 | 0.2773 | 0.2896 | 0.5126
Mean Speed | 0.6527 | 0.3492 | 0.6638 | 0.5868 | 0.6341
Mean Absolute Value of Yaw Rate | 0.0945 | 0.1988 | 0.4158 | 0.4004 | 0.7103
Table 2: Mean of each cluster of K-means clustering results.
The variance of each cluster, given as a 4×4 covariance matrix of the four feature parameters, is as follows.
$val(:,:,1)=\left[\begin{array}{cccc}0.0143& 0.0018& -0.0036& 0.0019\\ 0.0018& 0.0024& 0.0002& 0.0024\\ -0.0036& 0.0002& 0.0150& -0.0003\\ 0.0019& 0.0024& -0.0003& 0.0037\end{array}\right]$
(4)
$val(:,:,2)=\left[\begin{array}{cccc}0.0102& -0.0016& -0.0009& -0.0022\\ -0.0016& 0.0041& 0.0001& 0.0045\\ -0.0009& 0.0001& 0.0070& 0.0013\\ -0.0022& 0.0045& 0.0013& 0.0088\end{array}\right]$
(5)
$val(:,:,3)=\left[\begin{array}{cccc}0.0296& 0.0018& -0.0061& 0.0007\\ 0.0018& 0.0107& 0.0044& 0.0102\\ -0.0061& 0.0044& 0.0114& 0.0023\\ 0.0007& 0.0102& 0.0023& 0.0171\end{array}\right]$
(6)
$val(:,:,4)=\left[\begin{array}{cccc}0.0044& -0.0003& 0.0001& 0.0014\\ -0.0003& 0.0097& 0.0047& 0.0014\\ 0.0001& 0.0047& 0.0159& -0.0020\\ 0.0014& 0.0014& -0.0020& 0.0097\end{array}\right]$
(7)
$val(:,:,5)=\left[\begin{array}{cccc}0.0174& 0.0015& -0.0053& 0.0088\\ 0.0015& 0.0167& 0.0036& 0.0067\\ -0.0053& 0.0036& 0.0197& -0.0026\\ 0.0088& 0.0067& -0.0026& 0.0168\end{array}\right]$
(8)
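The three quantities that K-means hands over to GMM (proportion, mean and covariance of every cluster) can be computed from the samples and their cluster labels as sketched below. The function name is hypothetical, labels are assumed to be 0-based, and the sample covariance with divisor m - 1 is an assumption, since the paper does not state which estimator was used.

```python
def init_from_labels(samples, labels, k):
    """Per-cluster weight, mean vector and covariance matrix from hard labels."""
    n, d = len(samples), len(samples[0])
    params = []
    for c in range(k):
        members = [s for s, lab in zip(samples, labels) if lab == c]
        m = len(members)  # each cluster is assumed to hold at least 2 samples
        mu = [sum(s[j] for s in members) / m for j in range(d)]
        cov = [
            [sum((s[a] - mu[a]) * (s[b] - mu[b]) for s in members) / (m - 1)
             for b in range(d)]
            for a in range(d)
        ]
        params.append({"weight": m / n, "mean": mu, "cov": cov})
    return params


# Toy two-dimensional data (invented for illustration only).
toy = init_from_labels([[0, 0], [2, 2], [10, 0], [12, 4]], [0, 0, 1, 1], k=2)
```

The resulting covariance matrices are symmetric, as are the five matrices listed above.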
GMM clustering results
The K-means clustering results, namely the proportion of each cluster in the total samples and the mean and variance of every cluster, initialize the parameters of each component in GMM. The Expectation-Maximization algorithm is then applied to estimate the parameters of the GMM. After GMM clustering, the sample numbers in the five clusters are 109, 116, 124, 107 and 85 respectively. Each steering manipulation sample then has one corresponding class label, from one to five. The mean of every cluster after GMM clustering, after reverse uniformization (that is, mapped from the normalized values back to the original units), is shown in Table 3.
Parameters \ Cluster Num | 1 | 2 | 3 | 4 | 5
Mean Rotation Velocity of Steering Wheel (deg/s) | 37.865 | 9.353 | 20.065 | 4.461 | 19.799
Standard Deviation of Steering Wheel Angle (deg) | 17.284 | 19.808 | 4.613 | 12.458 | 26.200
Mean Speed (km/h) | 45.090 | 40.308 | 41.403 | 32.899 | 39.078
Mean Absolute Value of Yaw Rate (deg/s) | 7.880 | 7.022 | 2.000 | 3.400 | 10.571
Table 3: Mean of every cluster of GMM clustering after reverse uniformization.
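Reverse uniformization is the inverse of the min-max scaling applied to the features. A minimal sketch follows; the function name and the minimum and maximum used in the example are hypothetical, since the paper does not report the original feature ranges.

```python
def denormalize(value, lo, hi):
    """Map a value from [0, 1] back to the original range [lo, hi]."""
    return value * (hi - lo) + lo


# Hypothetical range: a feature observed between 10 and 50 in original units.
restored = denormalize(0.5, 10.0, 50.0)
```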
The mean of every cluster describes, in general terms, the steering behavior characteristics of the drivers in that cluster. It can be seen that, while the vehicle is turning, drivers in cluster one are accustomed to turning the steering wheel rapidly, adjusting the steering wheel frequently, and driving at a relatively high speed and yaw rate, while drivers in cluster four show the opposite steering behavior characteristics. It may be inferred that drivers in cluster one are aggressive drivers and drivers in cluster four are cautious drivers. The steering behavior characteristics of drivers in the other clusters can also be read from Table 3.
Modeling process for identification of driver steering behavior characteristics
This part builds the identification model of driver steering behavior characteristics by designing a classifier. Two commonly used methods in the field of pattern recognition, BP artificial neural network (BP_ANN) and support vector machine (SVM), are applied in this part [26-28]. In the rest of this paper BP_ANN is short for BP Artificial Neural Network, and SVM is short for Support Vector Machine. The 541 samples are randomly divided into two parts: 441 samples for training and 100 samples for testing.
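A random 441/100 split of the 541 samples can be sketched as follows. The function name and the fixed seed are assumptions made for reproducibility of the example; the paper does not describe how its random division was generated.

```python
import random


def split_samples(samples, n_test=100, seed=0):
    """Randomly hold out n_test samples for testing; the rest are for training."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    test_idx = set(indices[:n_test])
    train = [s for i, s in enumerate(samples) if i not in test_idx]
    test = [s for i, s in enumerate(samples) if i in test_idx]
    return train, test


# Integers stand in for the 541 feature vectors.
train_set, test_set = split_samples(list(range(541)))
```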
BP_ANN modeling process
The BP_ANN modeling process is a process of training with training samples. The BP_ANN model applied in this paper has three layers: the input layer, the hidden layer and the output layer. Each of the 541 vectors, corresponding to the 541 driver steering manipulation samples, consists of four feature parameters and serves as the input of the BP neural network. The outputs corresponding to clusters one, two, three, four and five are encoded as [10000], [01000], [00100], [00010] and [00001] respectively, so that each cluster corresponds to exactly one code. The BP neural network therefore has four inputs and five outputs: the input layer has four neurons, one for each of the four normalized feature parameters, and the output layer has five neurons.
The number of hidden layer neurons has a great influence on the training time and model accuracy. Various factors must be taken into consideration to determine it, but there is no generally accepted method for doing so. Previous experience and trial and error are the only references. The number of hidden layer neurons usually follows the formula
$l\le \sqrt{m+n}+a$
In the formula, l is the number of hidden layer neurons, m is the number of output layer neurons, n is the number of input layer neurons, and a is a constant between 0 and 10. The optimal number of hidden layer neurons, however, mainly depends on the testing accuracy on the testing samples.
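The output coding described above is a one-hot encoding. A small sketch of encoding a cluster label and decoding a network output back to a label is given below; the function names are hypothetical, and taking the largest output as the predicted cluster is an assumption about how the network's outputs are interpreted.

```python
def one_hot(cluster, n_clusters=5):
    """Encode a 1-based cluster label, e.g. 1 -> [1, 0, 0, 0, 0]."""
    return [1 if i == cluster - 1 else 0 for i in range(n_clusters)]


def decode(outputs):
    """Return the 1-based cluster whose output neuron has the largest value."""
    return max(range(len(outputs)), key=lambda i: outputs[i]) + 1
```

For example, an output of [0.1, 0.05, 0.7, 0.1, 0.05] would be decoded as cluster three.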
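As a worked example of the empirical formula, the bound can be evaluated for the network in this paper, which has n = 4 inputs and m = 5 outputs. The function name is hypothetical.

```python
import math


def hidden_neuron_bound(n_inputs, n_outputs, a):
    """Empirical upper bound sqrt(m + n) + a on the hidden layer size."""
    return math.sqrt(n_inputs + n_outputs) + a


# With 4 inputs and 5 outputs, sqrt(9) = 3, so the bound runs from 3 to 13
# as the constant a goes from 0 to 10.
lower = hidden_neuron_bound(4, 5, 0)
upper = hidden_neuron_bound(4, 5, 10)
```

The formula is only a rule of thumb, which is why the final choice is made by testing accuracy.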
The training samples are used to train the BP_ANN model. The selected learning algorithm is the Levenberg-Marquardt (LM) algorithm, whose corresponding training function is trainlm. The learning rate is 0.05, and the training goal is 0.0000004. The maximum number of iterations is 1000. All the other parameters of BP_ANN training in this paper are left at their defaults. Next, the testing samples are applied to test the accuracy of the trained BP_ANN identification model. The candidate number of hidden layer neurons l is swept from 10 to 35 to find the number with the highest testing accuracy. This optimization process is shown in Figure 7. It can be seen from Figure 7 that the optimal number of hidden layer neurons is 23, and the corresponding maximum testing accuracy on the testing samples is 0.92. With the number of hidden layer neurons set to 23, the BP_ANN model training result is shown in Figure 8.
Figure 7: Optimization process to find the optimal hidden layer neuron number.
Figure 8: Training result of BP_ANN model.
SVM modeling process
Support vector machine (SVM) is one of the most popular classification techniques in the pattern recognition community and is commonly used in machine learning [29]. SVM assumes that all samples in the training set are independent and identically distributed. It solves a binary classification problem by constructing a hyperplane that separates two classes of data points, and it can be extended to classification problems with more than two classes, which is called multiclass SVM. The main advantage of SVM is that the separating hyperplane is determined by only a small percentage of the data points, those that lie at the minimum distance from the hyperplane. These points are called support vectors (SV). The key operating principle of SVM is that a nonlinear kernel maps the input data into a higher-dimensional feature space, improving the linear separability of the data. The kernel used for this task plays a significant role in the performance of the SVM.
The SVM toolbox developed by faruto, liyang, Chih-Chung Chang and Chih-Jen Lin is applied in this paper [30,31]. The parameter settings of the SVM are as follows. The kernel function is the RBF kernel. The cross-validation parameter v is set to 441, which equals the number of training samples. Other parameters are left at their defaults. The parameters c and g are very important [32]. Their exponents are swept from -3 to 7, that is, c and g both range from $2^{-3}$ to $2^{7}$, and the optimal c and g are selected by cross-validation accuracy. The optimization process is shown in Figure 9.
Figure 9: Optimization process to find the optimal c and g.
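The grid search over c and g can be sketched as below. The `cv_accuracy` callable is a hypothetical stand-in for training the SVM with the given c and g and returning the cross-validation accuracy; the toolbox's own functions are not reproduced here. The exponent step of 0.5 is an assumption, inferred from the reported optimum 22.6274, which equals 2 raised to the power 4.5.

```python
import math


def grid_search(cv_accuracy, lo=-3.0, hi=7.0, step=0.5):
    """Return (best accuracy, c, g) over c, g in {2**lo, ..., 2**hi}."""
    n_points = int(round((hi - lo) / step)) + 1
    exponents = [lo + i * step for i in range(n_points)]
    best = None
    for exp_c in exponents:
        for exp_g in exponents:
            c, g = 2.0 ** exp_c, 2.0 ** exp_g
            acc = cv_accuracy(c, g)
            if best is None or acc > best[0]:
                best = (acc, c, g)
    return best


# Toy scoring function (invented): peaks at c = 2**4.5 and g = 2**1.
def toy_accuracy(c, g):
    return -((math.log2(c) - 4.5) ** 2 + (math.log2(g) - 1.0) ** 2)


best_acc, best_c, best_g = grid_search(toy_accuracy)
```

Searching on a logarithmic grid is the usual choice here because the useful values of c and g span several orders of magnitude.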
It can be seen from the accuracy shown in Figure 9 that the best c and g are 22.6274 and 2 respectively. With c set to 22.6274 and g set to 2, the SVM model is trained with the training samples. The testing samples are then fed into the trained SVM model to obtain the corresponding testing accuracy, which is 0.81. Overall, the identification model of driver steering behavior characteristics built by BP_ANN has a higher testing accuracy on the testing samples (0.92) than the one built by SVM (0.81). So the BP_ANN model is adopted as the final identification model of driver steering behavior characteristics in this paper.
Figure 10: Predictive results for driver one.
Figure 11: Predictive results for driver two.