Significant words

In Copy-Move forgery, a part of the image itself is copied and pasted into another part of the same image. This can make objects disappear. Because the copied part comes from the same image, its colour palette, noise components, dynamic range, and other properties are compatible with the rest of the image, making the forgery very difficult to detect.

Image Splicing involves composing or merging two or more images, changing the original image significantly to produce a forged one. When images with differing backgrounds are merged, it is very difficult to make the borders and boundaries indiscernible. Splicing detection remains a hard problem, in which the composite regions are investigated by a variety of methods.

In Image Retouching, the image is modified less than in other types of tampering: it merely enhances some features of the image. There are several subtypes of digital image retouching, mainly technical retouching and creative retouching. Retouching may require rotation, scaling, or stretching of an image before combining it with another image. Cloning a part of the image is also very common in image retouching. Because there is no radical change between different parts of the image, detection is very difficult.

Active digital image tampering detection approaches require some pre-processing of the digital image, such as watermark embedding or signature generation, at the time the image is created. This makes them impractical for detecting tampering in images from arbitrary sources.

Passive digital image tampering detection approaches rely on the assumption that although tampering may leave no visual clues, it may alter the underlying statistics of an image.

  • pixel-based techniques detect statistical anomalies introduced at the pixel level
  • format-based techniques use the statistical correlations introduced by a specific lossy compression scheme
  • camera-based techniques exploit artefacts introduced by the camera lens, sensor, or on-chip post-processing
  • physically-based techniques explicitly model and detect anomalies in the three-dimensional interaction between physical objects, light, and the camera
  • geometric-based techniques make measurements of objects in the world and their positions relative to the camera

Imagine an m x n image as the surface of a lake. The image is represented by a bitmap: for each pixel at some location (x,y), the bitmap holds a real number representing its intensity value.

The image can also be expressed as a sum of waves (sines or cosines). Instead of having m x n pixels, we have m x n waves, where each wave has a different amplitude and frequency. The “pixel” value at (x,y), now called a coefficient, contains the amplitude of some predetermined sine or cosine with a known frequency. This transformation of an image into sines or cosines is invertible - we can transform the image back and forth between the image space and the transform space. In practice, the discrete transform is linear and orthogonal, and is therefore very efficient and easy to use.
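A minimal sketch of this invertibility, assuming NumPy: transform a small toy "image" into its m x n frequency coefficients and back, recovering every pixel exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((8, 8))            # an 8 x 8 grid of pixel intensities

coeffs = np.fft.fft2(image)           # m x n coefficients: one per wave
restored = np.fft.ifft2(coeffs).real  # inverse transform back to image space

print(np.allclose(image, restored))   # the transform is invertible
```

The same round-trip property holds for the DCT and DWT discussed below; the FFT is used here only because NumPy ships it out of the box.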

Consider a song. A Fourier Transform (FT) decomposes a signal (the song) into its frequency components. The temporal information of the time-domain signal is encoded indirectly in the phase of the frequency-domain signal. After a Fourier transform, I do not know which note is being played when. Using the Short-Time Fourier Transform (STFT), the signal (song) is chopped up into fixed-size chunks and an FT is applied to each chunk, resulting in frequency characteristics for each time interval.
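A toy illustration of the STFT idea, assuming NumPy (the chunking is deliberately naive: no overlap, no window function). A "song" switches notes halfway through, and the per-chunk spectra reveal when each note is played:

```python
import numpy as np

def stft(signal, window_size):
    """Chop the signal into fixed-size chunks and FFT each chunk."""
    n_chunks = len(signal) // window_size
    chunks = signal[: n_chunks * window_size].reshape(n_chunks, window_size)
    return np.fft.rfft(chunks, axis=1)   # one spectrum per time interval

fs = 1000                                # samples per second
t = np.arange(fs) / fs
# A "song" whose note changes at 0.5 s: a 5 Hz tone, then a 20 Hz tone.
song = np.where(t < 0.5, np.sin(2 * np.pi * 5 * t), np.sin(2 * np.pi * 20 * t))

spectra = np.abs(stft(song, window_size=250))
# The dominant frequency bin per chunk shows *when* each note is played,
# which a single FT over the whole song could not.
print(spectra.argmax(axis=1))
```

A plain `np.fft.rfft(song)` would report both tones but lose the "when"; the chunked version trades frequency resolution for that temporal information.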

The problem with the STFT is that a fixed window size may be too small (to measure) for low-frequency components or too large (redundant) for high-frequency components. The wavelet transform is a mathematically rigorous way of adapting the window to the frequency: small windows are allocated for high frequencies and large windows for low frequencies. In other words, wavelets are mathematical functions that cut up data into different frequency components and then study each component with a resolution matched to its scale. They have advantages over traditional Fourier methods in analysing physical situations where the signal contains discontinuities and sharp spikes.

Wavelet transforms are broadly divided into three classes: continuous, discrete and multi-resolution-based. Wavelet 'families' (like Haar, Daubechies) are different ways of decomposing a signal into chunks.

In Fourier analysis, the Discrete Fourier Transform (DFT) decomposes a signal into sinusoidal basis functions of different frequencies. No information is lost in this transformation: the original signal can be recovered from its DFT representation (computed efficiently by the FFT). Discrete wavelet transforms can be used to reduce the features of an image, giving an approximate image from the lowest-frequency sub-band.

In wavelet analysis, the Discrete Wavelet Transform (DWT) decomposes a signal into a set of mutually orthogonal wavelet basis functions. These functions differ from sinusoidal basis functions in that they are spatially localized – that is, non-zero over only part of the total signal length. Furthermore, wavelet functions are dilated (scaled) and translated versions of a common function ψ, known as the mother wavelet. As with Fourier analysis, the DWT is invertible, so the original signal can be completely recovered from its DWT representation.
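A hand-rolled sketch of one level of the 2-D Haar DWT, assuming only NumPy (libraries such as PyWavelets do this properly; this is just the pairwise average/difference idea). The LL sub-band is the half-size approximate image mentioned above, and the transform is exactly invertible:

```python
import numpy as np

def haar_dwt2(x):
    """One level of the 2-D Haar DWT: returns (LL, LH, HL, HH) sub-bands."""
    # 1-D Haar along rows: pairwise sums (low-pass) and differences (high-pass)
    lo = (x[:, 0::2] + x[:, 1::2]) / np.sqrt(2)
    hi = (x[:, 0::2] - x[:, 1::2]) / np.sqrt(2)
    # ... then the same along the columns of each result
    ll = (lo[0::2, :] + lo[1::2, :]) / np.sqrt(2)
    lh = (lo[0::2, :] - lo[1::2, :]) / np.sqrt(2)
    hl = (hi[0::2, :] + hi[1::2, :]) / np.sqrt(2)
    hh = (hi[0::2, :] - hi[1::2, :]) / np.sqrt(2)
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Invert one Haar level, recovering the original image exactly."""
    lo = np.empty((ll.shape[0] * 2, ll.shape[1]))
    hi = np.empty_like(lo)
    lo[0::2], lo[1::2] = (ll + lh) / np.sqrt(2), (ll - lh) / np.sqrt(2)
    hi[0::2], hi[1::2] = (hl + hh) / np.sqrt(2), (hl - hh) / np.sqrt(2)
    x = np.empty((lo.shape[0], lo.shape[1] * 2))
    x[:, 0::2], x[:, 1::2] = (lo + hi) / np.sqrt(2), (lo - hi) / np.sqrt(2)
    return x

image = np.arange(64, dtype=float).reshape(8, 8)
ll, lh, hl, hh = haar_dwt2(image)       # LL is a 4 x 4 approximate image
print(np.allclose(haar_idwt2(ll, lh, hl, hh), image))  # fully invertible
```

Repeating `haar_dwt2` on the LL sub-band gives the multi-resolution pyramid: each level halves the resolution of the approximation while the detail sub-bands record what was lost.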

A discrete cosine transform (DCT) expresses a finite sequence of data points in terms of a sum of cosine functions oscillating at different frequencies. It is a Fourier-related transform similar to the discrete Fourier transform (DFT), but using only real numbers. Like Discrete Wavelet Transforms, Discrete Cosine Transforms can also be used for feature reduction. The DCT is, for example, used in JPEG image compression, and in MJPEG, MPEG, DV, Daala, and Theora video compression. The DCT can be applied to the individual blocks obtained by dividing a DWT-decomposed image, after which the blocks can be compared on the basis of correlation coefficients.
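A minimal sketch of that last step, assuming NumPy (the DCT-II is built by hand here rather than via SciPy, and the fixed 8x8 block grid and simulated clone are illustrative only): cloned blocks yield identical DCT feature vectors, so their correlation coefficient stands out.

```python
import numpy as np

def dct2(block):
    """2-D orthonormal DCT-II of a square block, via the 1-D DCT matrix."""
    n = block.shape[0]
    k, i = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    c = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2 / n)
    c[0, :] /= np.sqrt(2)                # orthonormal scaling for the DC row
    return c @ block @ c.T

rng = np.random.default_rng(1)
image = rng.random((16, 16))
image[8:16, 8:16] = image[0:8, 0:8]      # simulate a copy-move forgery

# One DCT feature vector per 8x8 block, compared by correlation coefficient.
blocks = [image[r:r + 8, c:c + 8] for r in (0, 8) for c in (0, 8)]
feats = [dct2(b).ravel() for b in blocks]
corr = np.corrcoef(feats)
print(corr[0, 3])    # the cloned pair of blocks correlates perfectly (~1)
```

Real copy-move detectors slide overlapping blocks over the whole image, truncate or quantize the DCT coefficients to tolerate re-compression, and then sort or hash the feature vectors so matching pairs can be found without comparing every block against every other.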