The ViDiLabs Calc application for iOS is designed by Vlado Damjanovski, the author of many well-known books, to assist all professionals working with digital and IP cameras.
The calculator helps them determine the best lens for a desired coverage and calculate the required storage.
Although the ViDiLabs Calc has been designed specifically for the IP CCTV Industry, it may also be used by any professional using a digital camera of any size, including, but not limited to, photography, television, cinematography, medical, education, robotics and manufacturing.
The ViDiLabs Calc has a database of all commercially available image sensors, from the smallest 1/8” (1.6 mm x 1.2 mm), through 1/3” (4.8 mm x 3.6 mm), to full-frame FX (36 mm x 24 mm), and up to Medium L size 53.6 mm x 40.2 mm CMOS sensors. This database is used for precise calculations of various parameters and is regularly updated.
The application is designed to have landscape view due to the method used for entering and changing variables with scrolling windows, without the need for keyboard pop-ups.
The Visual module
The first module of the ViDiLabs Calc is the Visual module, which calculates a range of variables: angles of view (horizontal and vertical), the distance to an object needed to achieve a certain quality (expressed in pix/m), and the width and height of the viewed scene for a given sensor and lens. The Visual module can also calculate the projected motion blur of an object moving in front of a camera.
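As an illustration of the kind of geometry the Visual module works with, here is a minimal Python sketch of the angle of view and scene size calculations. It assumes a simple pinhole (thin lens) model with no lens distortion; the function names are hypothetical and this is not the application's own code.

```python
import math

def angle_of_view_deg(sensor_dim_mm, focal_length_mm):
    """Angle of view across one sensor dimension (pinhole model assumption)."""
    return math.degrees(2 * math.atan(sensor_dim_mm / (2 * focal_length_mm)))

def scene_size_m(sensor_dim_mm, focal_length_mm, distance_m):
    """Width or height of the scene at the object plane, by similar triangles."""
    return sensor_dim_mm * distance_m / focal_length_mm

# Example: 1/3" sensor (4.8 mm x 3.6 mm) with an 8 mm lens, object plane at 100 m
h_aov = angle_of_view_deg(4.8, 8.0)
v_aov = angle_of_view_deg(3.6, 8.0)
scene_w = scene_size_m(4.8, 8.0, 100.0)
scene_h = scene_size_m(3.6, 8.0, 100.0)

print(f"Horizontal angle of view: {h_aov:.1f} deg")
print(f"Vertical angle of view:   {v_aov:.1f} deg")
print(f"Scene at 100 m: {scene_w:.1f} m wide x {scene_h:.1f} m high")
```

With these inputs the scene at 100 m is about 60 m wide and 45 m high.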
The Digital module
The second module of the ViDiLabs Calc is the Digital module. It calculates the storage space required for various video and image compressions to achieve a certain number of days and quality of recording. It also determines the number of drives needed when using a JBOD arrangement of drives, or RAID-1, RAID-5 and RAID-6 redundant configurations.
In addition, a simulated compression of the ViDi Labs SD/HD test chart is shown to indicate the approximate visual appearance of the selected compression.
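The drive counts for the arrangements mentioned above can be sketched as follows. This is a simplified Python example with a hypothetical function name, assuming a single array of identical drives, no hot spares, and the usual overheads (mirroring for RAID-1, one parity drive for RAID-5, two for RAID-6); the application may count drives differently.

```python
import math

def drives_needed(required_tb, drive_tb, layout="JBOD"):
    """Number of identical drives for a required usable capacity (simplified)."""
    data_drives = math.ceil(required_tb / drive_tb)
    if layout == "JBOD":
        return data_drives                 # no redundancy
    if layout == "RAID-1":
        return 2 * data_drives             # every drive is mirrored
    if layout == "RAID-5":
        return max(data_drives + 1, 3)     # one drive's worth of parity, min 3
    if layout == "RAID-6":
        return max(data_drives + 2, 4)     # two drives' worth of parity, min 4
    raise ValueError(f"unknown layout: {layout}")

# Example: 40 TB of recordings on 8 TB drives
for layout in ("JBOD", "RAID-1", "RAID-5", "RAID-6"):
    print(layout, drives_needed(40, 8, layout))
```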
Formulas compliant with the standards
The ViDiLabs Calc complies with many world standard definitions of Face Recognition, Face Identification and Inspection requirements, as defined by IEC/ISO, AS, BS standards and others.
ViDi Labs has developed all formulas, and designed the application with the best intentions to offer an objective and accurate calculation of observed target details.
The simulated compression appearance of the ViDi Labs SD/HD test chart is a simulation only. It is based on our experience and on results obtained in many practical tests in our labs, and it is as close to real results with a test chart as we can make it.
The Video Compression simulations are based on H.264 compression, while the Image Compression simulations are based on JPEG compression. In real-life situations various encoders may produce slightly better or worse results, depending on the profile used, GOP settings and internal filtering. The examples shown in the Digital module of the ViDiLabs Calc do not simulate lens distortion or loss of resolution, so we recommend using them only as an approximate guide.
All the ViDiLabs Calc results can be saved as presets, or exported, making them ideal for compiling laborious projects with various lenses, distances and/or storage requirements.
About Pixel Densities and what they mean
An IP surveillance system may be used to observe and protect people, objects and people’s activity inside and outside the objects, traffic and vehicles, money handling in banks, or games in a casino environment. All of these objects of interest may have different clarity when displayed on a workstation screen. The image clarity depends primarily on the camera used, the imaging sensor, its lens and the distance from the object.
In IP CCTV, image clarity can be expressed in a simple way with just one parameter: Pixel Density. Pixel Density is usually expressed in pixels per metre (pix/m) at the object plane, although it can also be expressed in pixels per foot. Pixel Density in the IP CCTV sense should not be confused with the display pixel density quoted by LCD display manufacturers, which defines the screen density in pixels per inch (PPI).
The advantage of expressing object clarity with its Pixel Density is that it combines the sensor size, pixel count, focal length and distance to the object in just one parameter.
This is one of the main functionalities of the ViDi Labs Calc application.
Because the Pixel Density metric includes all of the variables above, it makes universally understandable what details will be visible on an operator’s workstation screen.
When designing a system, or a tender for a system, one can request a Pixel Density for a particular image quality. So, instead of asking for a 6 mm lens for a camera in a particular location, for example (which means nothing without knowing the camera sensor it is used on), it is much more useful to define a particular Pixel Density for the view. This is then used to calculate the required lens for the particular camera used and the distance from the object, which guarantees the clarity of the image (assuming the lens is focussed optimally and there is sufficient light, of course). This can be done very easily with the ViDiLabs Calc. Pixel Density can be used for any object that an IP CCTV user might be interested in: a face, licence plate, playing card, money and similar.
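To make the relationship concrete, the following Python sketch computes the Pixel Density for a given camera and, in reverse, the focal length needed to reach a requested Pixel Density. It assumes a pinhole lens model and measures density across the horizontal field of view; the function names and the example camera (1920 horizontal pixels on a 4.8 mm wide sensor) are illustrative assumptions, not taken from the application.

```python
def pixel_density(h_pixels, sensor_width_mm, focal_length_mm, distance_m):
    """Pixels per metre at the object plane (pinhole model assumption)."""
    scene_width_m = sensor_width_mm * distance_m / focal_length_mm
    return h_pixels / scene_width_m

def required_focal_length_mm(h_pixels, sensor_width_mm, distance_m, target_pix_per_m):
    """Focal length that yields the target pixel density at the given distance."""
    return target_pix_per_m * sensor_width_mm * distance_m / h_pixels

# Example: which lens gives 250 pix/m at 10 m on this camera?
focal = required_focal_length_mm(1920, 4.8, 10.0, 250.0)
check = pixel_density(1920, 4.8, focal, 10.0)
print(f"Required focal length: {focal:.2f} mm (gives {check:.0f} pix/m)")
```

Specifying 250 pix/m at 10 m for this camera leads to a lens of about 6.25 mm; a different sensor or pixel count would lead to a different lens for the same requested density.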
Let us now explore how many pixels per metre are attributed to various objects.
One of the most commonly quoted pixel densities is for Face Identification. Face Identification in CCTV means the image is clear enough that one can positively identify the person on the screen.
According to Australian Standard AS 4806.2, Face Identification in analogue CCTV requires a person’s full height to fill 100% of the monitor screen height. The details visible at 100% person height have been tested many times and verified as sufficient for such a person to be identified. A PAL signal is composed of 576 active TV lines, so, according to AS 4806.2, a person’s height would occupy all of the active lines. The head occupies around 15% of a person’s height, which is equivalent to around 86 lines (576 x 0.15 = 86.4), and the same number of pixels (assuming the recording is made in full TV frame mode, which is equal to two TV fields).
If we agree that an average person’s height is 170 cm, the head would occupy around 25 cm of that. The Pixel Density at the object required for positive Face Identification according to AS 4806.2 is therefore 86 pixels per 25 cm of head height. Since 25 cm fits 4 times into 1 m of height, this becomes 4 x 86 = 344 pix/m.
So, one can say that with a pixel density of 344 pix/m at the object plane it should be possible to positively identify a face, according to AS 4806.2.
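The arithmetic above can be generalised into a small Python helper that converts a "percentage of screen height" requirement into a pixel density. The function name is hypothetical, and the unrounded result differs slightly from the figure in the text, which rounds 86.4 lines down to 86 before multiplying.

```python
PAL_ACTIVE_LINES = 576  # active TV lines in a PAL frame, as stated in the text

def density_from_screen_fraction(fraction_of_height, object_size_m,
                                 active_lines=PAL_ACTIVE_LINES):
    """Pixels per metre when an object of the given size fills the given
    fraction of the screen height."""
    return active_lines * fraction_of_height / object_size_m

# Head: about 15% of screen height, about 0.25 m tall
face_identification = density_from_screen_fraction(0.15, 0.25)
print(f"Unrounded: {face_identification:.1f} pix/m")   # text rounds to 344
print(f"Rounded as in the text: {4 * 86} pix/m")
```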
Some other standards may require different values, and one such (newer) standard is the IEC 62676-4, which defines 250 pix/m to be sufficient (i.e. suggests that with slightly lesser pixel density than the AS standards one should be able to identify a person).
Clearly, this number is not set in stone: it depends on the observing ability of the operator as well as on other parameters (lens quality, illumination, compression artefacts, etc.). The key is to understand that such a Pixel Density can be calculated for any type of camera, irrespective of whether it is SD, HD, 4K or any other format.
Any number for Face Identification Pixel Density can be specified in the ViDiLabs Calc, although the shortcut buttons are designed to show the IEC standard suggestion of 250 pix/m.
The next image quality down, as defined by the standards, is Face Recognition. The details of a Face Recognition image should be sufficient to recognise the gender of a person, what he or she is wearing, and possibly to make an assertion of who that person might be if picked from a group of people who have already been identified somewhere else (e.g. from a passport or driver’s licence photo). This is basically an image with half the pixel density of Face Identification, which according to AS 4806.2 should be around 172 pix/m, while IEC 62676-4 suggests 125 pix/m.
The following examples are from real systems:
Similarly, pixel density can be defined for visual recognition of vehicle licence plates (not automatic software LPR). In AS 4806.2, this is defined as a character height of 5% of the display screen height, which is around 30 TV lines (pixels) (to be precise, 576 x 0.05 = 28.8). If we assume that a typical Australian number plate has characters of around 90 mm in height, then this equates to 11 x 30 pixels = 330 pix/m. The number 11 is obtained by dividing 1000 mm (1 m) by 90 mm. One may say that visual licence plate recognition requires a similar pixel density to face identification.
Licence Plates Recognition as per AS4806.2
When money and playing cards are observed in banks or casinos, many practical tests have shown that at least 50 pixels are required across the longer side of a note or card in order to positively identify its value. Standard playing card dimensions are B8 according to the ISO 216 standard, which is 62 mm x 88 mm. So, we need the 88 mm card length to be covered with at least 50 pixels for proper identification. This means around 550 pix/m (1000 mm / 88 mm ≈ 11, and 50 pix x 11 = 550 pix/m) should be sufficient for playing cards. We may require slightly better pixel density for identifying money, since note sizes are typically larger than playing cards, so if one takes the Face Inspection pixel density of 1000 pix/m, it should give pretty good identification, although, as can be seen from the real-life example below, even 770 pix/m might be sufficient.
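The same reasoning works for any object: decide how many pixels must cover it, then divide by its size. The Python sketch below uses hypothetical function names; note that the text rounds 1000 / 88 down to 11, so its 550 pix/m is slightly lower than the unrounded value.

```python
def density_for_pixels_across(pixels_needed, object_size_mm):
    """Pixel density (pix/m) so that the object is covered by the given pixels."""
    return pixels_needed * 1000.0 / object_size_mm

def pixels_across(pix_per_m, object_size_mm):
    """How many pixels cover an object of the given size at a given density."""
    return pix_per_m * object_size_mm / 1000.0

cards_density = density_for_pixels_across(50, 88)   # 88 mm playing card
card_pixels_at_770 = pixels_across(770, 88)

print(f"Playing cards need about {cards_density:.0f} pix/m")
print(f"At 770 pix/m a card is covered by about {card_pixels_at_770:.0f} pixels")
```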
Playing cards and money shown above with 770 pix/m
So the following table can be used as a rough guide for various pixel densities.
Object | Minimum required pixel density (Pix/m) |
Inspect (IEC-62676-4) | 1000 |
Face Identification (AS-4806.2) | 350 |
Face Identification (IEC-62676-4) | 250 |
Face Recognition (AS-4806.2) | 175 |
Face Recognition (IEC-62676-4) | 125 |
Observe (IEC-62676-4) | 60 |
Intrusion Detection (AS-4806.2) | 35 |
Detect (IEC-62676-4) | 30 |
Licence Plates visual identification (AS-4806.2) | 300 |
Playing cards | 500 |
Casino chips (39mm) | 1200 |
Money (notes) | 800 |
Money (coins) | 1500 |
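As one possible way of using the table, the Python sketch below turns each minimum pixel density into the maximum distance at which a given camera still achieves it. The densities are copied from the table; the camera (1920 horizontal pixels, 4.8 mm sensor width, 8 mm lens), the pinhole model and the names used are illustrative assumptions.

```python
# Minimum pixel densities (pix/m) from the table above
MIN_DENSITY = {
    "Inspect (IEC-62676-4)": 1000,
    "Face Identification (AS-4806.2)": 350,
    "Face Identification (IEC-62676-4)": 250,
    "Face Recognition (AS-4806.2)": 175,
    "Face Recognition (IEC-62676-4)": 125,
    "Observe (IEC-62676-4)": 60,
    "Intrusion Detection (AS-4806.2)": 35,
    "Detect (IEC-62676-4)": 30,
    "Licence Plates visual identification (AS-4806.2)": 300,
    "Playing cards": 500,
    "Casino chips (39mm)": 1200,
    "Money (notes)": 800,
    "Money (coins)": 1500,
}

def max_distance_m(h_pixels, sensor_width_mm, focal_length_mm, pix_per_m):
    """Furthest object distance at which the pixel density is still reached."""
    return h_pixels * focal_length_mm / (sensor_width_mm * pix_per_m)

for name, density in MIN_DENSITY.items():
    d = max_distance_m(1920, 4.8, 8.0, density)
    print(f"{name}: up to {d:.1f} m")
```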
About the pixel blur effect of moving objects
Most objects that we observe in IP CCTV, such as people and vehicles, are not static, but moving. When objects are moving their image will never be as sharp and clear as that of static objects. The faster the object moves, the more blurry it will appear. The closer the moving object is to the camera, the more blurry it will appear. The longer the camera exposure is, the more blurry the object will appear. The camera sensor size and the focal length of the lens also play a role in how blurry the image will appear. And finally, the angle at which such an object moves relative to the camera viewing direction also plays a role. So, there is a very complex correlation between all the parameters mentioned above. The ViDiLabs Calc has been designed to calculate the effects of such motion in the recorded video, and show it as pixel blur. To put it simply, this calculation shows how smudged the image of a moving object is.
This blurriness is an unwanted effect, as it makes it difficult to recognise the details of the moving object even if the camera is in focus at that point. By knowing how many “blurry pixels” will appear for a given object speed and camera exposure setting, it is possible to use the ViDiLabs Calc to find the camera exposure setting that will produce less, or at least acceptable, blur.
To produce “live” video in CCTV, we require at least 25 fps (or 30 fps). Each TV frame is therefore typically exposed for 1/25 s = 40 ms (in analogue, 1/50 s for TV fields). In bright daylight, the auto iris lens closes to reduce the amount of light for a correct exposure. If it is very bright, then the sensor electronic exposure kicks in. In low light, the auto iris lens opens up fully, and if this is not sufficient, the sensor electronic exposure increases further (this is usually called “integration”).
The formula for calculating pixel blur (pixel shift) is shown below.
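One formulation that is consistent with the worked bicycle example later in this section is given here. It is a reconstruction under a pinhole lens assumption, not necessarily the exact expression used in the application, and the symbol names are chosen for illustration:

```latex
\text{blur (pixels)} = v_p \cdot t \cdot \frac{N_h \, f}{w \, d}
```

Here v_p is the projected object speed in m/s, t is the exposure time in seconds, N_h is the number of horizontal pixels, f is the focal length in mm, w is the sensor width in mm, and d is the distance to the object in metres. The fraction is the pixel density at the object plane in pix/m, so the blur is simply the distance travelled during the exposure multiplied by the pixel density.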
Here are some practical examples.
If the object is moving at an angle relative to the camera optical axis, the same rules apply, but this time the projected speed has to be used in place of the real speed “v” of the moving object.
The projected speed is the speed “v” multiplied by the cosine of the angle alpha between the direction of movement and the plane perpendicular to the optical axis.
For example, if a bicycle rider moves with 40 km/h at an angle of 30˚ relative to the optical axis, this would produce an angle of 60˚ between the direction of movement of the bicycle rider and the perpendicular plane to the optical axis. Then, the cos 60˚ = 0.5, which means the projected speed of 40 km/h will be 20 km/h for the purpose of calculating the pixel shift.
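The projection in this example can be written as a short Python function. The function name is hypothetical; the angle is given relative to the optical axis, as in the example, and converted to the angle alpha against the perpendicular plane.

```python
import math

def projected_speed(speed, angle_to_optical_axis_deg):
    """Speed component perpendicular to the optical axis (same unit as input)."""
    # alpha: angle between the movement direction and the plane
    # perpendicular to the optical axis
    alpha_deg = 90.0 - angle_to_optical_axis_deg
    return speed * math.cos(math.radians(alpha_deg))

v_projected = projected_speed(40.0, 30.0)   # 40 km/h at 30 degrees to the axis
print(f"Projected speed: {v_projected:.1f} km/h")
```

An object moving straight across the view (90˚ to the optical axis) keeps its full speed, while one moving directly along the axis has a projected speed of zero.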
To continue with the same example, let’s assume the bicycle rider is passing 100 m away from the camera, riding at the angle mentioned above. Let’s also assume we have an HD camera with a 1/3” sensor and an 8 mm lens installed. If we use the normal camera shutter of 1/25 s to produce live video, the resulting motion blur from such movement will be 7.1 pixels. Over 7 pixels of smudged moving image might be just too much to be able to recognise the rider. So, we need to shorten the exposure so that far fewer pixels are blurred. Using a 1/250 s exposure brings the blur to less than 1 pixel (0.7 in our example), which is much more acceptable.
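The figures in this example can be reproduced with a short Python sketch. It assumes that "HD" means 1920 horizontal pixels, that the 1/3” sensor is 4.8 mm wide, and a pinhole lens model; the function name is hypothetical and the application may use a more detailed calculation.

```python
def pixel_blur(projected_speed_kmh, exposure_s, h_pixels,
               sensor_width_mm, focal_length_mm, distance_m):
    """Motion blur in pixels: distance travelled during the exposure
    multiplied by the pixel density at the object plane."""
    speed_m_per_s = projected_speed_kmh / 3.6
    pix_per_m = h_pixels * focal_length_mm / (sensor_width_mm * distance_m)
    return speed_m_per_s * exposure_s * pix_per_m

# Bicycle rider: 20 km/h projected speed, 100 m away, 8 mm lens
blur_live = pixel_blur(20.0, 1 / 25, 1920, 4.8, 8.0, 100.0)
blur_fast = pixel_blur(20.0, 1 / 250, 1920, 4.8, 8.0, 100.0)
print(f"1/25 s exposure:  {blur_live:.1f} pixels")
print(f"1/250 s exposure: {blur_fast:.1f} pixels")
```

Under these assumptions the sketch gives about 7.1 pixels at 1/25 s and about 0.7 pixels at 1/250 s, matching the values quoted above.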
About the storage calculation
The ViDiLabs Calc can calculate the digital storage space required for a particular system, with a number of IP cameras using video or image compression, to achieve a certain number of days of recording.
The two major groups of compression, image compression and video compression, are treated slightly differently. Video compression defines the amount of storage needed by its Mb/s requirement, and it works with Groups Of Pictures (GOPs) and motion vector prediction. Image compression “doesn’t care” about any “history” of images before or after a given image, so both the frame size and the number of such frames produced every second are needed.
An IP camera encoder that streams video at 4 Mb/s, for example, produces 4 megabits every second, and this is multiplied by the number of seconds, minutes, hours and days to calculate the required recording capacity. How many images per second are captured at the sensor level doesn’t affect the storage requirement, only the quality. It is a common misunderstanding that, with video compression, the number of images per second influences the storage requirement. This is not the case: at a given bit rate, the images per second captured by the sensor only define the image quality, not how long such a stream can be stored. It is the bit rate in Mb/s, which describes the compression strength, that defines how many days a certain IP camera stream can be recorded on a particular storage capacity.
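The video storage arithmetic described above can be sketched in Python as follows. The function names are hypothetical, Mb/s is taken as 1,000,000 bits per second, and capacity is counted in decimal terabytes (10^12 bytes); file system and container overheads are ignored.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def video_storage_tb(bitrate_mbps, days, cameras=1):
    """Storage in decimal TB for constant-bit-rate video streams."""
    bytes_per_second = bitrate_mbps * 1_000_000 / 8
    return bytes_per_second * SECONDS_PER_DAY * days * cameras / 1e12

def recording_days(capacity_tb, bitrate_mbps, cameras=1):
    """How many days of recording fit on a given capacity."""
    return capacity_tb / video_storage_tb(bitrate_mbps, 1, cameras)

one_camera_day = video_storage_tb(4, 1)          # one 4 Mb/s camera, one day
system_total = video_storage_tb(4, 30, cameras=16)
print(f"One camera, one day: {one_camera_day * 1000:.1f} GB")
print(f"16 cameras, 30 days: {system_total:.2f} TB")
```

Note that frames per second does not appear anywhere in the calculation: only the bit rate, the duration and the number of cameras matter.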
If a particular camera cannot produce a video stream, but rather image compression stream (JPG or Motion-JPG for example), then the storage calculation is slightly different, and needs to include compressed image size as well as images per second produced by such a camera, in order to calculate the storage required for the days set in the calculator.
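For image compression, the equivalent Python sketch needs the compressed image size and the images per second. The function name and the example values (100 kB per image, 10 images per second) are illustrative assumptions, with sizes in decimal units.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def image_storage_tb(image_kb, images_per_second, days, cameras=1):
    """Storage in decimal TB for a stream of individually compressed images."""
    bytes_per_second = image_kb * 1000 * images_per_second
    return bytes_per_second * SECONDS_PER_DAY * days * cameras / 1e12

mjpeg_total = image_storage_tb(100, 10, 30)   # one camera, 30 days
print(f"100 kB images at 10 images/s for 30 days: {mjpeg_total:.3f} TB")
```

Here, unlike with video compression, doubling the images per second doubles the storage required.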
One thing that might be useful for system designers with this ViDiLabs Calc is the visual representation of the compression quality using the ViDi Labs SD/HD test chart as a reference. Although this is a simulated representation, it has been made to be very close to the actual compression appearance, which could be useful in determining the setting one may have on a particular camera.