By Vlado Damjanovski
An Internet Protocol (IP) surveillance system is most often used to observe and protect people, objects and people’s activity inside and outside those objects, traffic and vehicles, money handling in banks, or games in a casino environment. These objects of interest may appear with different clarity when displayed on a workstation screen. Image clarity depends primarily on the camera used, its imaging sensor, its lens and the distance from the object.
In a typical installation, once all cables are run, IP addresses are allocated to the cameras and recorders, and the system is running, installers position the cameras and set the lenses for focused and sharp images. Most often, the choice is the widest possible angle of view (the shortest focal length of the lens). The installer then focuses the image, whether manually or automatically, sets the optimum recording parameters and completes the installation.
There is nothing wrong with this approach if the images are sharp and clear and the customer is happy with them, but very little attention is given to the clarity of the key objects for the given field of view, where clarity refers to the object size and the ability to recognise a person, a number plate or money, for example. Clearly, the closer the objects are to the camera, the easier it is to identify an intruder, and vice versa.
However, there is a better and more scientific approach. There is one parameter in IP CCTV that expresses the image clarity in a simple way – pixel density. The pixel density is usually expressed in pixels per metre (pix/m), at the object plane, although it can be expressed in pixels per foot. Pixel density in an IP CCTV sense should not be confused with the display pixel density quoted by various LCD display manufacturers, which defines the screen density in pixels per inch (PPI).
The advantage of expressing object clarity with its pixel density is that it combines the sensor size, pixel count, focal length and distance to the object in just one parameter. When using pixel density metrics, all variables are included and the details on an operator’s workstation screen will be universally understandable.
When designing a system, or a tender for a system, one can request pixel density for a particular image quality. So, instead of asking for a 6mm lens for a camera in a particular location for example (which means nothing without knowing the camera sensor it is used on), it would be much more useful if a particular pixel density is defined for the view. This will then be used to calculate the required lens for the camera used and the distance from the object. This will guarantee the clarity of the image (assuming the lens is focused optimally and there is sufficient light, of course).
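As a rough illustration of how these variables combine, the sketch below uses a simple pinhole-lens approximation (an assumption, not a formula taken from any standard): the scene width at the object plane is the sensor width multiplied by the distance and divided by the focal length, and the pixel density is the horizontal pixel count divided by that width. The function names and the example camera (1920 horizontal pixels, a 4.8mm wide sensor, a 10m distance) are hypothetical, and lens distortion and focus are ignored.

```python
def pixel_density(h_pixels, sensor_width_mm, focal_length_mm, distance_m):
    """Horizontal pixel density (pix/m) at the object plane.

    Pinhole approximation: scene width = sensor width * distance / focal length.
    """
    scene_width_m = sensor_width_mm * distance_m / focal_length_mm
    return h_pixels / scene_width_m


def required_focal_length(target_density, h_pixels, sensor_width_mm, distance_m):
    """Focal length (mm) needed to reach a target pixel density (pix/m)."""
    return target_density * sensor_width_mm * distance_m / h_pixels


# Hypothetical camera: 1920 pixels across a 4.8 mm wide sensor, object 10 m away.
density = pixel_density(1920, 4.8, 6.0, 10.0)
print(f"6 mm lens at 10 m: about {density:.0f} pix/m")

focal = required_focal_length(250, 1920, 4.8, 10.0)
print(f"Lens needed for 250 pix/m at 10 m: about {focal:.2f} mm")
```

With these assumed numbers, a 6mm lens gives roughly 240pix/m, which shows why the same 6mm lens on a different sensor or at a different distance would give a very different result.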
Pixel Densities for Different Objects
Pixel density can be used for any object that an IP CCTV user might be interested in: face, licence plate, playing card, money and similar.
One of the most commonly cited pixel densities is for face identification. Face identification in CCTV means sufficient image clarity to positively identify who the person on the screen is. According to Australian Standard AS 4806.2, face identification in analogue CCTV requires a person’s full height to fill 100 percent of the monitor screen height. This has been tested many times and has been verified to be sufficient for identification.
A PAL signal is composed of 576 active TV lines so, according to AS 4806.2, a person’s height would occupy all of the active lines. The head occupies around 15 percent of a person’s height, which is equivalent to around 86 lines (576 x 0.15 = 86.4), and the same number when converted to pixels (assuming recording is made in full TV frame mode, which is equal to two TV fields). Assuming that an average person’s height is 170cm, the head would occupy around 25cm of that. The pixel density at the object required for positive face identification according to AS 4806.2 is therefore 86 pixels per 25cm of head height. Since 25cm goes into 1m four times, this becomes 4 x 86 = 344pix/m. So, with a pixel density of 344pix/m at the object’s plane, it should be possible to positively identify a face, according to AS 4806.2.
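The arithmetic above can be checked with a few lines of Python. This is only a sketch of the calculation as described in the text; the function name is hypothetical, and the small difference between the two results comes from rounding 86.4 lines down to 86 before multiplying.

```python
def density_from_screen_fraction(active_lines, screen_fraction, object_size_m):
    """Pixel density (pix/m) when an object of a known size occupies
    a given fraction of the active picture height."""
    pixels_on_object = active_lines * screen_fraction
    return pixels_on_object / object_size_m


# Values from the text: 576 active PAL lines, head = 15% of height, head = 0.25 m.
exact = density_from_screen_fraction(576, 0.15, 0.25)

# The rounded figure used in the text: 86 whole lines, four heads per metre.
rounded = int(576 * 0.15) * 4

print(f"Unrounded: {exact:.1f} pix/m, rounded as in the text: {rounded} pix/m")
```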
Some other standards may require different values. One such (newer) standard is IEC 62676-4, which defines 250pix/m as sufficient (that is, it suggests that identification of a person is possible with a somewhat lower pixel density than the AS standard requires).
Clearly, this number is not set in concrete and it will depend on the observing ability of the operator, as well as other parameters (lens quality, illumination, compression artefacts and so on), but the key is to understand that such a pixel density can be calculated for any type of camera, irrespective of whether it is SD, HD, 4k or any other format.
The next image quality down, as defined by the standards, is for face recognition. The details of a face recognition image should be sufficient to recognise the gender of a person, what they are wearing and possibly make an assertion of who that person might be, if picked from a group of people who have already been identified somewhere else (for example, from a passport or driver’s licence photo). This is basically an image with half the pixel density of face identification, which according to AS 4806.2 should be around 172pix/m, while IEC 62676-4 suggests 125pix/m.
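The face identification and face recognition figures quoted so far can be collected into a small lookup to check which level a given view would satisfy under each standard. The sketch below uses only the four values mentioned in the text; the dictionary layout and the function name are hypothetical, and the thresholds should be treated as guides rather than hard limits, as noted above.

```python
# Minimum pixel densities (pix/m) quoted in the text for each standard.
THRESHOLDS = {
    "AS 4806.2": {"identification": 344, "recognition": 172},
    "IEC 62676-4": {"identification": 250, "recognition": 125},
}


def levels_met(density, standard):
    """Return the face image quality levels a pixel density satisfies,
    listed from the most demanding to the least demanding."""
    limits = THRESHOLDS[standard]
    ordered = sorted(limits.items(), key=lambda item: item[1], reverse=True)
    return [level for level, minimum in ordered if density >= minimum]


for name in THRESHOLDS:
    print(name, levels_met(300, name))
```

For a view measured at 300pix/m, this reports both levels under IEC 62676-4 but only recognition under AS 4806.2.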
Similarly, pixel density can be defined for visual recognition of vehicle licence plates (not automatic licence plate recognition by software). In AS 4806.2, this is defined as a character height of five percent of the display screen height, which is around 30 TV lines (or pixels; to be precise, 576 x 0.05 = 28.8). Assuming that a typical Australian number plate has characters of around 90mm in height, this equates to 11 x 30 pixels = 330pix/m. The number 11 is obtained by dividing 1000mm (1m) by 90mm. Visual licence plate recognition therefore requires a pixel density similar to that for face identification.
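The licence plate figure can be reproduced the same way. The sketch below (the function name is hypothetical) carries the unrounded values through, which gives about 320pix/m; the 330pix/m in the text comes from rounding 28.8 up to 30 lines and 11.1 down to 11 characters per metre. Either way the result is of the same order as the face identification density.

```python
def plate_density(active_lines=576, char_fraction=0.05, char_height_mm=90.0):
    """Pixel density (pix/m) when plate characters of a given height
    occupy a given fraction of the active picture height."""
    char_pixels = active_lines * char_fraction   # 576 x 0.05 = 28.8 lines
    chars_per_metre = 1000.0 / char_height_mm    # 1000 / 90 = about 11.1
    return char_pixels * chars_per_metre


unrounded = plate_density()
rounded_as_in_text = 30 * 11

print(f"Unrounded: {unrounded:.0f} pix/m, rounded as in the text: {rounded_as_in_text} pix/m")
```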
When money and playing cards are observed in banks or casinos, many practical tests have shown that at least 50 pixels are required across the longer side of the notes or cards in order to positively identify the values. According to the ISO 216 standard, standard playing cards are B8 size, which is 62mm x 88mm. So, the 88mm card length needs to be covered with at least 50 pixels for proper identification. This means around 550pix/m (1000mm / 88mm is approximately 11, so 50 pix x 11 = 550pix/m) should be sufficient for playing cards. A slightly better pixel density may be required for identifying money, since the size of notes is typically larger than playing cards, so using the face inspection pixel density of 1000pix/m should attain good identification, although as can be seen from the real-life example in Figure 5, even 770pix/m might be sufficient.
As can be concluded from the above examples, the pixel density can be defined for any object and any camera, large or small. The beauty of the pixel density parameter is, as said at the very beginning, that it includes all parameters influencing the clarity of the observed objects.
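The same reasoning works for any object once the number of pixels needed across it is known. The sketch below is illustrative only: the function names are hypothetical, and the 1920-pixel image width is an assumed example rather than a figure from the text. Without rounding 1000 / 88 down to 11, the playing card density comes out slightly higher than 550pix/m, and dividing a camera’s horizontal pixel count by that density shows how wide a table area one camera could cover.

```python
def density_for_object(pixels_across, object_length_mm):
    """Pixel density (pix/m) needed to put a given number of pixels
    across an object of a given length."""
    return pixels_across * 1000.0 / object_length_mm


def coverage_width_m(h_pixels, density):
    """Scene width (m) a camera can cover while keeping a given pixel density."""
    return h_pixels / density


# Playing card: at least 50 pixels across the 88 mm side (values from the text).
card_density = density_for_object(50, 88)
print(f"Playing cards: about {card_density:.0f} pix/m")

# Assumed example: a camera with 1920 horizontal pixels.
width = coverage_width_m(1920, card_density)
print(f"A 1920-pixel-wide image covers about {width:.2f} m at that density")
```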
For this reason, ViDi Labs has developed the ViDiLabs iOS calc (search ViDiLabs calc in the iTunes App Store), a unique tool for the surveillance industry, which can also be used in cinematography, photography and any other imaging application dealing with object details.
The table below can be used as a rough guide for various pixel densities:
Vlado Damjanovski is an internationally renowned CCTV author, lecturer, innovator and consultant. He can be reached via his company website www.vidilabs.com