It was a week ago when I received an email from my research colleague in the Faculty of Technology and Information Science, UKM, about an individual at the university who claimed to have snapped a photo of a UFO while he was in Putrajaya. The individual added that he took all these UFO photos using his Samsung smartphone.

The photo of the said object was sent via email together with other images of the same event. The email was circulated among peers, and from there the photo spread to the public. The photos even managed to get the attention of a few independent news websites.

Therefore, we conducted an analysis of the authenticity of this UFO photo. Here we lay out the methodology we used to debunk this UFO myth.

The Analysis

The analysis was divided into three parts: 1) the EXIF data examination, 2) the Structural Similarity Index Measurement (SSIM) analysis, and 3) the blur and light direction analysis.

1. The EXIF data analysis

The EXIF data of the photos showed that they were taken using the Camera360 app. Okay, remember the saying that first impressions always count? This one about Camera360 is not a good one. It has tonnes of effects that can be used to modify an image, and that includes aliens and UFOs. From a Google search we also found the UFO template used to make the UFO effect.
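For readers who want to try a similar check themselves, here is a minimal sketch of how the EXIF block of a JPEG can be located and searched for the name of an editing app. It is not the tool we used: the function names and the watch-list of app names are my own assumptions, and it does a naive text search instead of properly parsing the EXIF tags.

```python
import struct

def exif_segment(jpeg_bytes):
    """Return the raw payload of the first APP1/Exif segment, or None."""
    if jpeg_bytes[:2] != b"\xff\xd8":          # every JPEG starts with the SOI marker
        return None
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            return None                        # lost sync with the segment markers
        marker = jpeg_bytes[pos + 1]
        if marker == 0xDA:                     # start of scan: metadata is over
            return None
        # the 2-byte big-endian length counts itself but not the marker
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        payload = jpeg_bytes[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            return payload
        pos += 2 + length
    return None

def editing_software_hits(jpeg_bytes, names=(b"Camera360", b"Photoshop")):
    """List the app names (a hypothetical watch-list) found in the Exif block."""
    payload = exif_segment(jpeg_bytes)
    if payload is None:
        return []
    return [name.decode() for name in names if name in payload]
```

Calling `editing_software_hits(open("photo.jpg", "rb").read())` on a file (the filename is only an example) returns the watch-list names that appear in its EXIF block. A hit only tells you which software wrote the file, so treat it as a first impression and not as proof of tampering.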




2. The Structural Similarity Index Measurement (SSIM) analysis

The SSIM is traditionally used for assessing the quality of an image. It is particularly used to assess the similarity between an image before and after processing. This kind of assessment is popularly applied in image compression assessment. In this case, among the photos provided by the individual, we found one image that looks similar to the UFO photo. We therefore measured the similarity with SSIM. The result of the analysis is as follows:


The SSIM result is 99.7% similar. This indicates the two images are almost identical. The highest SSIM score is 100% (identical images), while a score near 0% means no structural similarity. The SSIM brightness map shows that the area around the UFO object has been tampered with. The pixels surrounding the UFO contrast in brightness with the rest of the image. These pixels also form a rectangle, which shows how the UFO image was pasted onto the original image.
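To give an idea of what sits behind such a score, here is a minimal sketch of the SSIM formula in plain Python. It is a simplification and not the tool we used: the standard SSIM averages scores over a sliding Gaussian window, whereas this version (with function names of my own choosing) scores non-overlapping blocks of a grayscale image given as a list of rows.

```python
def ssim(xs, ys, dynamic_range=255.0):
    """SSIM of two equal-length lists of grayscale pixel values (one window)."""
    n = len(xs)
    c1 = (0.01 * dynamic_range) ** 2           # stabilising constants from the SSIM paper
    c2 = (0.03 * dynamic_range) ** 2
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    return ((2 * mean_x * mean_y + c1) * (2 * cov + c2)) / (
        (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2))

def ssim_map(img_a, img_b, block=8):
    """Score each non-overlapping block; low scores mark regions that differ."""
    scores = {}
    for top in range(0, len(img_a) - block + 1, block):
        for left in range(0, len(img_a[0]) - block + 1, block):
            cells = [(r, c) for r in range(top, top + block)
                     for c in range(left, left + block)]
            a = [img_a[r][c] for r, c in cells]
            b = [img_b[r][c] for r, c in cells]
            scores[(top, left)] = ssim(a, b)
    return scores
```

Blocks where the two photos agree score close to 1, so a pasted object shows up as a small cluster of low-scoring blocks, which is the same idea as the rectangular patch in our SSIM map.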

3. Blur & Light Direction Analysis

The light direction is due to the movement of the hand that snapped the photo. The building lights in the background are all smeared in one direction (as shown in the figure below). The UFO object, on the other hand, shows no similar movement. This indicates the UFO was never in the picture when the photo was snapped.


4. The Conclusion

Okay, it's an obvious fake. Maybe, even without the analysis, we could logically deduce how fake this photo is. But it is surely interesting to analyse it in order to find out how the tampering was done. I guess it's cool and fun once in a while to prank your friends, family and peers with a photo such as this. But taking it to the public is hardly a laughing matter. 😉


Can we reverse the video degradation factors?

The major challenge in video forensics, especially involving CCTV surveillance, is to reverse the degradation factors which contribute to the low clarity of objects in a video exhibit. A video here can come from any source, whether a CCTV surveillance system, a digital camera, a smartphone camera or an iPad. These degradation factors come in many forms, such as noise, illumination variation and lens blur, to name a few. The problem is made worse by another factor introduced when the video is stored: compression artifacts. These artifacts are observed in many CCTV surveillance video exhibits. When we extract probe information, these problems affect the quality of the information in many video forensic processes.

There are many workarounds for these problems. Many opt for 'off the shelf' solutions provided by forensic software or image editing software. In our own experience, these are not always the best solutions. Most of the time, we found the answer to be beyond what this software offers.

Much research has contributed to a new generation of enhancement beyond linear methods. Researchers no longer think that simple filtering or resizing can do the job. The newer approaches are super-resolution and blind deconvolution, or even better a combination of both. A more recent finding is that both restoring the quality and synthesizing a higher resolution from a low resolution can be carried out via sparse representation methods, which produce much better results.

Despite the promising trend in this field of research, to say it is the holy grail that we video forensic practitioners seek is a bit too optimistic. Nonetheless, the research has given rise to a new chapter in image processing research and technology: a chapter where we can take low-definition information and synthesize it into a higher definition with superb quality. The hope is that the quality will be so high that no criminal can escape once they are recorded in any surveillance video.

The Story in Your Eyes – The Experiment

Here’s a very interesting article written by Hany Farid of Dartmouth on photo forensics from eyes. It's very interesting how we can extract meaningful information from the lights reflected in the eyes in photos, but I still found myself skeptical about this.

So I conducted a simple experiment to test whether the claim is true. Armed with a Nikon D3100 DSLR, Photoshop and Super-Resolution code which I wrote using the library provided by EPFL, I started my own MythBusters. Last but not least, I managed to get my supportive 8-year-old son to be the model for my experiment. Here I lay out the 4 simple steps that I carried out.

Step 1 : Snap a picture

The setup was simple: I asked my son to sit on the sofa under the living room's fluorescent lights while I snapped his smiley face. And that's all.

Step 2 : Enhance the photo

Next is the preparation. The image is 4608×3072 pixels. Enlarging the whole image just to get a better zoom on the pupil would consume a lot of computing time and memory. So it is better to just crop out the right eye and use it for the analysis.

Here the size of the cropped image is about 1240×920 pixels.

Using Photoshop, I adjusted the image brightness, hue and saturation to improve the clarity of the reflection in his right eye. Here you can clearly see some objects mirrored in the eye.

Step 3 : Super-Resolution 

I loaded the image into the Super-Resolution code I had written. I set the zoom to x3, with Vandewalle selected for estimating the rotation/shift and Robust Super-Resolution for the interpolation. The original EPFL code only reads TIFF images, so as to take advantage of the SamplePerPixel information from the image metadata for the SR algorithm. As I'm using a JPEG, I changed the code a little to override the SamplePerPixel call. The result is a good-quality enlarged image at 3720×2760 resolution (three times 1240×920). As you can see from the image below, the quality is very similar to the original. From here on I only used the pupil to get the reflection information.

Step 4 : Second round of enhancement 

This is the final step: I cropped out the pupil and did the lens distortion correction. I then further enhanced the image with another round of hue/saturation and brightness correction, followed by deblurring. And voila!! The result is in:

If you find it hard to figure out what the objects in the reflection are, here I will make it easier for you:

Mind you, this work used a DSLR image, so it's a very good quality source to begin with. I don't think it will work on lower quality sources such as webcams or surveillance cameras. But it's a brilliant idea! Thanks to FourAndSix for sharing their thoughts on this matter.

Video File Carving for CCTV Surveillance Video Exhibits

Another mystery debunked by us.

First of all, we once received a case where an investigation officer asked for our assistance to analyze a CD containing some videos extracted from a CCTV system at a crime scene. The problem was that we were unable to play back the videos. The reason lies in the nature of the CCTV video files themselves: the video can only be viewed via a proprietary video player. This is true of most CCTV DVR systems, as a security measure. You simply cannot view the video without its intended player. In this case, of course, no video player software was included, which explains our difficulties.

What we did was study the structure of the video files. Armed with WinHex, we identified the video file structure, and from there we further investigated the header pattern and its associated indicators. In detail, we identified the H.264 stream in the CCTV file, carved it out and placed it into an AVI container. We tested the carved-out videos by playing them in several generic video players, e.g. VLC and Windows Media Player Classic, with positive results.
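The carving idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the procedure for any particular DVR format: it assumes the H.264 data is stored in Annex B byte-stream form (NAL units separated by 00 00 01 start codes) after a proprietary header, and the function names are made up for the example.

```python
def find_nal_units(data):
    """Yield (offset, nal_type) for every Annex B start code 00 00 01 in data."""
    pos = data.find(b"\x00\x00\x01")
    while pos != -1 and pos + 3 < len(data):
        yield pos, data[pos + 3] & 0x1F        # NAL type is the low 5 bits of the header byte
        pos = data.find(b"\x00\x00\x01", pos + 3)

def carve_h264(data):
    """Return the bytes from the first SPS (NAL type 7) onward, or b'' if none."""
    for offset, nal_type in find_nal_units(data):
        if nal_type == 7:
            # keep the extra leading zero when the start code is the 4-byte form
            start = offset - 1 if offset > 0 and data[offset - 1] == 0 else offset
            return data[start:]
    return b""
```

The carved bytes are a raw H.264 elementary stream, which still has to be wrapped into a container such as AVI before generic players will take it. Real DVR files often interleave their own headers between frames, so expect to strip those too once you have worked out the pattern in a hex editor.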

Enter soft-biometrics – a new paradigm for individual identification in video forensics

In video forensics we often deal with video quality issues more than with the investigation objectives at hand. This gets even harder with subsequent degradation of the quality of the information in the video that is crucial to the investigation. Facial identification, for instance, is made harder by recording factors, e.g. noise, lens distortion, resolution or compression artifacts, which are always found coupled with other factors such as facial pose, facial orientation, illumination variation and occlusions. Therefore, biometric analysis in video forensics is a tough nut to crack.
Biometric technology, in the traditional concept, is a system which always requires some degree and form of cooperation from its users. Fingerprint recognition, for example, requires a user to cooperate by presenting his or her finger to the machine. Another example, iris recognition, requires users to present their eyes in order to verify who they are and gain access to a certain room or facility. In forensics, however, there is no such cooperation. The biometric evidence is just as it was recorded by a device (in this case the CCTV)… and of course no criminal would be obliging enough to smile at the cameras for that matter.
I was a firm believer in traditional biometrics: uniquely identifying an individual by the characteristics they own. Years of involvement in this research had really ingrained that idea in me. When I joined forensics and was asked to develop a methodology and SOP for face recognition, I was shocked by the nature of the evidence that showed up in our investigations. For the first time, I questioned my every stand on and understanding of the technology. I mean, even just identifying a face in a video exhibit takes a lot of effort and resources to ensure the result is accurate and robust. In my opinion, with no cooperation whatsoever from the user, it can hardly be called a biometric system. For example, how many of us really care to face a surveillance camera when we see one? Yes, I can guarantee the odds are almost none. By the rule of thumb of face recognition, the face should be full-frontal and at a reasonably good resolution in the field of view. In video evidence, this is hard to come by. Most people in a video exhibit appear to be looking away due to 1) the camera position and 2) the position of the face in the view. These really affect the forensic analysis.
So what is really important to an investigation? Is it to uniquely identify the suspect in a video, or to uniquely describe the suspect? This is what defines soft-biometrics: a method of describing biometric traits in a video or image without the subject's cooperation. This fits the forensic analysis methodology well. According to Simon [1]:
“Soft biometrics are characteristics that can be used to describe, but not uniquely identify an individual. These include traits such as height, weight, gender, hair, skin and clothing colour. Unlike traditional biometrics (i.e. face, voice) which require cooperation from the subject, soft biometrics can be acquired by surveillance cameras at range without any user cooperation. Whilst these traits cannot provide robust authentication, they can be used to provide coarse authentication or identification at long range, locate a subject who has been previously seen or who matches a description, as well as aid in object tracking.”
So how is soft-biometrics analysis conducted? What features are taken into consideration? According to Simon, the analysis can be conducted by extracting the soft-biometrics models (the head, the torso and the legs) from the surveillance video. The model is segmented into these sections and each is treated separately. The crucial part is the segmentation, for which colour segmentation is applied. The segmented data can then be analyzed for facial information, attire and gait information. The drawback of this methodology is its sensitivity to illumination. Therefore, more robust algorithms, for example Graph-Cut segmentation and the Active Appearance Model (AAM), should be considered.
The challenges for using soft-biometrics in forensics (as far as I can think of) all lie in the other seized exhibits which can be associated with the probe individuals in the video. For example, to confirm that the suspect was wearing the same shirt and skirt at the time of the crime, law enforcement should also seize the attire belonging to the suspect. Furthermore, to incorporate facial and gait analysis, the suspect's face and the way he/she walks should also be recorded as enrollment. In other words, law enforcement should think of ways to obtain this information when conducting forensic analysis in their investigations.
There is a case where we applied a similar methodology. In that case, apart from conducting face recognition, we were also required to conduct attire matching. The problem was that the shirts they wore in the video and the ones seized from their belongings were of the same type but not the same colour. This was due to the recording settings of the CCTV system on the premises, which (for some unknown reason) degraded the video colour into a different colour space. A stripe of black and red appeared blue and purple in the video. To make things worse, correcting the colour of the exhibit video would be considered biased. Therefore, we brought all the clothes back to the crime scene and recorded them with the same cameras which recorded the video exhibits. The idea was to establish a connection in terms of colour between the suspects' seized clothes and the ones they wore at the time of the crime. From there we managed to prove that the same attire was used. Plus, the face recognition results also showed a positive match between their faces and the faces of the probes in the video.
As Simon claimed, soft-biometrics is not as accurate as traditional biometrics. Perhaps multi-modal biometric approaches can further enhance the current method. If we can find both the face and the gait, why not combine the two? Plus, with information about the clothes or any other unique characteristics of the suspect taken into consideration for the description, the analysis can without doubt be a strong one.
[1].  Simon Denman, Clinton Fookes, Alina Bialkowski, Sridha Sridharan: Soft-Biometrics: Unconstrained Authentication in a Surveillance Environment. DICTA 2009: 196-203


The technique of video and audio authentication analysis has now crossed the boundary of ears and eyes. The advanced technique introduced by Catalin Grigoras has shown that our digital audio and video contain more than just signals and pixels. They contain more information than we thought was there. ENF, or Electrical Network Frequency, is the background hum that can be observed in many digital recordings. It is the noise caused by our national electric grid. In many countries the ENF is stable, but its pattern still shows small differences from day to day. Tampering with audio can be traced if more than one ENF pattern is observed in a recording. The analysis also allows us to compare the ENF pattern found in the audio with the national grid's ENF to determine the real date and time, and possibly the geographic location, at which the recording was made.

The pattern of ENF may look like the following figure. For this we viewed the ENF of an audio recording on our Cedar workstation by observing the spectrogram of signal level (in decibels) against frequency.


For further reading, you may find this blog interesting.
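If you would like to experiment with ENF yourself, the sketch below shows one simple way to read the hum frequency out of a recording. It is not how the Cedar workstation does it: I am assuming a 50 Hz grid, a list of audio samples already loaded into memory, and a brute-force search over candidate frequencies, and all the function names and parameters are my own.

```python
import math

def tone_power(samples, sample_rate, freq):
    """Power of `samples` at one frequency (a single-bin DFT)."""
    re = im = 0.0
    step = 2.0 * math.pi * freq / sample_rate
    for n, s in enumerate(samples):
        re += s * math.cos(step * n)
        im -= s * math.sin(step * n)
    return re * re + im * im

def estimate_enf(samples, sample_rate, nominal=50.0, span=1.0, resolution=0.05):
    """Return the frequency within nominal +/- span that carries the most power."""
    steps = int(round(2 * span / resolution))
    candidates = [nominal - span + i * resolution for i in range(steps + 1)]
    return max(candidates, key=lambda f: tone_power(samples, sample_rate, f))

def enf_track(samples, sample_rate, window_seconds=2.0, **kwargs):
    """Estimate the ENF once per non-overlapping window of the recording."""
    size = int(window_seconds * sample_rate)
    return [estimate_enf(samples[i:i + size], sample_rate, **kwargs)
            for i in range(0, len(samples) - size + 1, size)]
```

The list returned by `enf_track` is the ENF pattern over time. A sudden jump between neighbouring windows is the kind of discontinuity that hints at an edit, and the whole track is what would be matched against the grid's own frequency records. A practical tool would band-pass filter the audio first and use much longer recordings than this toy setup suggests.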


There was a case where the investigation officer asked us to analyze the authenticity of a video exhibit. The other exhibit included was a hand-cam with which the video was said to have been recorded. To elaborate, the video exhibit was on a mini-DVD burned directly from the hand-cam.

The question now is how we can establish a connection between the video on the mini-DVD and the camera when there is absolutely nothing on the camera that we can refer to for authentication. We found the answer: PRNU.

What is PRNU? Firstly, it is an abbreviation for Photo Response Non-Uniformity. In plain English? It's a way to relate an image or video to the defects of its recording device's sensor. This may be surprising, but every digital image and video sensor has its own unique defect characteristics, and these can serve as the fingerprint of a camera sensor. Therefore, any image or video produced by a camera contains PRNU noise which is unique to that camera. Even among many cameras of the same model, the PRNU noise contained in a video can be traced to the camera it came from.

We have done some research on PRNU. For the research we estimated the PRNU of several cameras. For the test, we took a video treated as an exhibit and processed it with the same enhancement used to extract the PRNU. In the following figure we show the correlation between the video exhibit and the PRNU of its origin camera. Also included in the test are videos from other types of cameras, correlated against the original camera's PRNU.


To rate the similarity between the correlation data we used Maximum Likelihood Estimation. The next result shows the likelihood computed from the correlation results.

In conclusion, PRNU is effective in linking a video to the camera via its sensor defect information.
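The correlation test described above can be sketched as follows. This is a toy version and not our research code: PRNU work normally uses a wavelet-based denoiser, whereas here I assume a plain 3×3 mean filter as a stand-in, grayscale frames given as lists of rows, and function names of my own.

```python
import math

def residual(img):
    """Noise residual: image minus a 3x3 mean-filtered copy (interior pixels only)."""
    out = []
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            mean = sum(img[r + dr][c + dc]
                       for dr in (-1, 0, 1) for dc in (-1, 0, 1)) / 9.0
            out.append(img[r][c] - mean)
    return out

def fingerprint(frames):
    """Average the residuals of many frames to estimate the sensor pattern."""
    residuals = [residual(frame) for frame in frames]
    return [sum(values) / len(values) for values in zip(*residuals)]

def correlation(a, b):
    """Normalised correlation coefficient of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0
```

In use, `fingerprint` is built from many frames recorded by the seized camera, and `correlation(residual(exhibit_frame), camera_fingerprint)` is then compared against the same score for frames from other cameras. A clearly higher score for the seized camera supports the link between the video and that camera.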


Colloquium on Facial Identification analysis in video forensics

Yesterday, 18th November 2011, I was invited to give a talk on face identification in video forensics. The talk started at 1500hrs and ended an hour later. Many of the Pattern Recognition group's students and lecturers attended the talk.

Here I would like to express my heartiest thanks to everybody who came to see the presentation, and also to those who gave their support during my preparation for the event.


ERGS UPDATE : Finally the acquisition system is completed

Finally, things are going smoothly with the facial acquisition workstation. Today we managed to finish setting it up. This is where we will enroll faces and also run the testing phase. It will also be the test bed in our quest to study the impact of video quality on public safety.

It was quite a wait for all the equipment to be shipped. Some parts were out of stock. But I guess it was all worth the wait. We took 3 days to fully assemble the cameras and the cabling, plus another day to set up the working area (where, as you can see from the gallery, there is a 'spider web' on the floor). Those lines on the floor act as guides for the distances of the camera poles. In this working area we will acquire videos of subjects' faces from 16 different angles, from which we will generate 3D profiles.

For now, we are going to examine the recording quality of each camera. As you can see from the gallery, we have summoned our mannequin (whom we named Takeshi Kaneshiro) to help us out with the system settings and fine tuning. Takeshi is a mannequin we use for simulating video surveillance recordings. He has helped us in some cases where we were required to verify objects under investigation, such as the attire worn by criminals and also their heights.


Progress Trackback 20111017

On 17th October, it was just me, Prof Madya Dr Norul Huda and our intern Jinjuli at the lab. The first job that day was to get the lab keys duplicated. Actually, we had the keys duplicated last week, but out of 3 sets of copies only 1 worked well. Therefore I took the keys back to the shop for repairs.

The task continued with preparing some working samples for exploring the AAM algorithm and its 3D aspects. For that, Jinjuli and I snapped frontal, left-side and right-side images of both our faces. What we are trying to achieve is something like FaceGen. The generated 3D face will then be used to generate a range image, which is what is required for recognition.

And to wrap up the evening, I switched the DVR on to let it run the recording for days.
