Video traffic analysis using computer vision

For the first time in Russia, transport and economic surveys were performed using computer vision as part of the design of the Northern Bypass of the city of Perm, carried out by Institute Giprostroymost JSC. Traffic intensity was recorded by video filming from an unmanned aerial vehicle (UAV), and the footage was then analyzed with the TrafficData software package.

With the advent of quadcopters, it has become possible to film entire interchanges on video, tracking the movement of cars in all directions. Manual processing of such footage, however, is laborious. Advances in computer vision and machine learning have made it possible to automate this task.

TrafficData significantly reduces the labor cost of collecting traffic-flow data for an intersection or road section. As a result, the real distribution of intensity throughout the day can be obtained within the framework of conventional transport and economic surveys, and there is no longer any need to apply the unevenness coefficients from [1] to estimate average daily intensity. Those coefficients were derived from sensor measurements on the Moscow Ring Road [2]; they describe a special case and cannot be applied everywhere. Data collection at an intersection or road section is thus automated, for example for the purpose of micromodeling traffic flows.

Analysis of the dependence of video analytics results on the quality of the initial data

Obviously, the quality of the result depends on the quality of the video loaded into the TrafficData software, so let us define the limits of its capabilities. According to [7], a traffic-flow analysis is considered valid if its accuracy is at least 85%. Accuracy is defined as the ratio of the number of detected and correctly classified vehicles to the total number of vehicles. That is, first, a vehicle must be detected; second, it must be assigned to the appropriate class; and third, to determine intensity by direction, its trajectory must be constructed, i.e., we must determine where the vehicle has moved in the next frame. Let us consider the requirements that must be met to implement these functions.
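The 85% validity threshold can be checked directly once counts are available. A minimal sketch of the accuracy metric as defined above (the vehicle counts are invented example numbers, not survey data):

```python
def analysis_accuracy(correctly_typed: int, total_vehicles: int) -> float:
    """Ratio of vehicles that were both detected and assigned to the
    correct class to the total number of vehicles in the footage."""
    if total_vehicles <= 0:
        raise ValueError("total_vehicles must be positive")
    return correctly_typed / total_vehicles

# Example: 437 of 480 vehicles detected and correctly classified.
acc = analysis_accuracy(437, 480)
print(f"{acc:.1%}, valid: {acc >= 0.85}")  # 91.0%, valid: True
```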

  1. Car size in the frame
    A car is distinguishable for a neural network if it is distinguishable for a person. As a rule, this means the car must occupy no fewer than 10 pixels in the frame. Its size in pixels depends on the distance to the object and the camera resolution. We have derived the following dependence of the required resolution on the distance to the detected object:
  2. Frames per second
    Between consecutive frames, a vehicle must not travel more than its own length; this is usually guaranteed at 25 frames per second. If a car does travel more than its length between frames, trajectory breaks are possible, degrading the quality of the direction-by-direction intensity counts. However, since TrafficData has built-in functions for interpolating object motion, discussed below, testing showed that acceptable track quality is already achieved at 15 frames per second.
  3. Overlapping objects
    Vehicles and pedestrians can be overlapped by each other and by other objects (bridges, road signs, trees, etc.), which can also cause trajectory breaks. When analyzing junctions, losing a vehicle’s trajectory as it passes under an overpass is a serious problem, since we then lose the ability to determine the traffic-flow correspondence matrix for the junction (see Fig. 1b). It is precisely this problem that TrafficData solves by predicting the area where an occluded car will reappear, based on its speed, and interpolating its trajectory. The software copes just as easily with other local obstacles (trees, signs, poles, billboards).
    However, in videos that record traffic at an intersection from a height of less than 5 m, vehicles may overlap one another, and if the cars are similar, a track can jump from one car to another. The TrafficData developers have promised to solve this problem in the next release.
  4. Shooting conditions
    Shooting in the dark. This turned out not to be a problem: TrafficData’s neural networks are trained to work both at night by headlights and with night-vision cameras, with quality of about 95% (see Fig. 2).
    Camera immobility. Significant camera movement, which is possible when shooting hand-held or from a quadcopter in windy weather, distorts vehicle trajectories and makes it difficult to place the sections that isolate directions. If the movements are smooth and do not exceed 50°, the issue is solved by the built-in video stabilization function; otherwise it is better to cut out the low-quality piece of video.
    Picture quality. A number of hindrances prevent high-quality video processing: highlights, glare, drops, dirt, poorly adjusted focus, and incorrect exposure leading to a blurred image, as well as transmission artifacts that arise during video streaming. In general, if the human eye cannot see a car in the video, the neural network will not recognize it either, since the training data is labeled by that same human eye.
    When the above conditions were met, the quality of TrafficData video analytics ranged from 83% to 96%. The lower limit occurred when processing footage from surveillance cameras, owing to periodic overlapping of objects, while footage from UAVs almost always yielded quality above 90%.
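The pixel-size condition in item 1 can be estimated with a simple pinhole-camera model: the on-image size of a vehicle follows from the distance, the camera’s field of view, and the sensor resolution. The FOV and resolution below are assumed values for typical consumer UAV optics, not parameters taken from the survey:

```python
import math

def object_size_px(object_m: float, distance_m: float,
                   h_fov_deg: float, image_width_px: int) -> float:
    """Approximate on-image width (in pixels) of an object seen by a
    pinhole camera with horizontal field of view h_fov_deg."""
    # Width of the scene covered by the frame at the given distance.
    scene_width_m = 2.0 * distance_m * math.tan(math.radians(h_fov_deg) / 2.0)
    return object_m / scene_width_m * image_width_px

# A 4.5 m car filmed from 200 m by a 4K camera (3840 px wide)
# with an assumed 84-degree horizontal FOV:
px = object_size_px(4.5, 200.0, 84.0, 3840)
print(px > 10)  # True - above the 10-pixel detection threshold
```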
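The frame-rate rule in item 2 (a vehicle must not travel more than its own length between frames) gives a simple lower bound, fps ≥ v / L. A minimal sketch, where the 90 km/h speed and 4.5 m car length are assumed example values:

```python
def min_fps(speed_kmh: float, vehicle_length_m: float) -> float:
    """Lowest frame rate at which a vehicle moves no more than
    its own length between consecutive frames: fps >= v / L."""
    return (speed_kmh / 3.6) / vehicle_length_m

def max_speed_kmh(fps: float, vehicle_length_m: float) -> float:
    """Highest speed at which the one-length rule still holds."""
    return fps * vehicle_length_m * 3.6

print(round(min_fps(90, 4.5), 2))   # 5.56 fps for a 4.5 m car at 90 km/h
print(max_speed_kmh(25, 4.5))       # 405.0 km/h
```

The geometric bound itself is low; at 25 fps a 4.5 m car would have to exceed 400 km/h to cover its own length between frames, which is why 25 fps “usually guarantees” the rule, while the higher rates mainly give the tracker a margin of robustness.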
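Item 3 describes recovering tracks through occlusions by predicting a vehicle’s reappearance from its speed and interpolating the gap. TrafficData’s internal implementation is not public; the sketch below is only a generic illustration of constant-velocity prediction and linear gap interpolation:

```python
from dataclasses import dataclass

@dataclass
class Track:
    x: float   # last observed position, px
    y: float
    vx: float  # last observed velocity, px per frame
    vy: float

def predict(track: Track, frames_ahead: int) -> tuple[float, float]:
    """Constant-velocity prediction of where an occluded vehicle
    (e.g. one passing under an overpass) should reappear."""
    return (track.x + track.vx * frames_ahead,
            track.y + track.vy * frames_ahead)

def fill_gap(before: tuple[float, float], after: tuple[float, float],
             n_missing: int) -> list[tuple[float, float]]:
    """Linearly interpolate positions for frames lost to occlusion."""
    (x0, y0), (x1, y1) = before, after
    step = n_missing + 1
    return [(x0 + (x1 - x0) * k / step, y0 + (y1 - y0) * k / step)
            for k in range(1, step)]

# A car moving 3 px/frame to the right, hidden for 4 frames:
print(predict(Track(100.0, 50.0, 3.0, 0.0), 4))   # (112.0, 50.0)
print(fill_gap((100.0, 50.0), (112.0, 50.0), 3))  # three evenly spaced points
```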

Prospects of the considered approach from the point of view of ITS development

Institute Giprostroymost considers three ways to increase population mobility:

  • construction of new interchanges;
  • improving the organization of traffic;
  • adaptive regulation.

The first is, of course, the most expensive, but is it always the most effective? Clearly it cannot be avoided once the transport system has exhausted its throughput capacity. In most cases, however, the existing transport network is not used to maximum efficiency, because it is neither balanced nor regulated.
Therefore, optimizing road traffic through the creation of ITS has long attracted engineers as an economical and efficient way to solve transport problems. Previously, the technologies needed to fully realize optimal use of the urban transport system were lacking; today the situation has changed. With the advent of computer vision, conventional CCTV cameras can be used to reach a more balanced decision on new construction, resorting to it only after the road traffic system has been brought to maximum efficiency by optimizing traffic flows.

1. ODM 218.2.020-2012, “Methodological Recommendations for Assessing the Throughput of Highways.” Rosavtodor, Moscow, 2012.
2. Mendeleev, G. A. Regularities of Change over Time of the Intensity of Urban Automobile Traffic. Cand. Tech. Sci. dissertation, 2001.
3. Order of the Ministry of Transport of the Russian Federation No. 114 of April 18, 2019, “On Approval of the Traffic Monitoring Procedure.”
4. Order of the Ministry of Transport of the Russian Federation No. 479 of December 26, 2018, “On Approval of Methodological Recommendations for the Development and Implementation of Measures on the Organization of Road Traffic in Terms of Calculating the Values of the Main Parameters of Road Traffic.”
5. SP 34.13330.2012, “Highways.”
6. SP 396.1325800.2018, “Streets and Roads of Settlements. Urban Planning Rules.”
7. Silyanov, V. V. Theory of Traffic Flows in the Design of Roads and Traffic Management. Moscow: Transport, 1977.