Module I
Introduction to Image Processing Systems
1
DIGITAL IMAGE PROCESSING
Unit Structure
1.0 Objectives
1.1 Introduction
1.2 An Overview
1.2.1 What is an Image?
1.2.2 What is a Digital image?
1.2.3 Types of Image
1.2.4 Digital Image Processing
1.3 Image Representation
1.4 Basic Relationship between Pixels
1.4.1 Neighbors of a Pixel
1.4.2 Adjacency, Connectivity, Regions and Boundaries
1.4.3 Distance Measures
1.4.4 Image operations on a Pixel Basis
1.5 Elements of Digital Image Processing system
1.6 Elements of Visual Perception
1.6.1 Structure of Human Eye
1.6.2 Image Formation in the Eye
1.6.3 Brightness
1.6.4 Contrast
1.6.5 Hue
1.6.6 Saturation
1.6.7 Mach band effect
1.7 Simple Image Formation Model
1.8 Vidicon and Digital Camera Working Principle
1.8.1 Vidicon
1.8.2 Digital Camera
1.9 Colour Image Fundamentals
1.9.1 RGB
1.9.2 CMY
1.9.3 HSI Model
1.9.4 2D Sampling
1.9.5 Quantization
1.10 Summary
1.11 References
1.12 Unit End Exercises
1.0 OBJECTIVES
After going through this unit, you will be able to:
❖ Gain knowledge about the evolution of digital image processing
❖ Analyse the limits of digital images
❖ Derive the representation of pixels and the relationships between them
❖ Describe the functioning of a digital image processing system
❖ Specify the color models used in image processing, such as RGB, CMY
and HSI
1.1 I NTRODUCTION
Digital images play a major role in day-to-day life. Visual material has a stronger effect than any other medium: when we see an image, we understand the concept without anything being said or explained.
Evolution of Digital Images:
Digital images began their journey in the newspaper industry. The first digital images were pictures sent by submarine cable between London and New York.
1921
The Bartlane cable picture transmission system used specialized printing equipment to code pictures, which were then reproduced on a telegraph printer fitted with typefaces simulating a halftone pattern. This technology reduced the time required to transmit a picture across the Atlantic to less than 3 hours.
The level of coding of images was 5. Figure 1.1 shows the picture transmitted in
this way.
1922
Visual quality was improved through the selection of printing procedures and the distribution of intensity levels. A technique based on photographic reproduction made from tapes perforated at the telegraph receiving terminal was introduced. The level of coding of images was 5. Figure 1.2 shows the picture transmitted in
this way.
1929
The intensity level was increased to 15. Figure 1.3 shows the picture
transmitted in this way.
1964
Digital images processed by digital computers and the advanced techniques that followed led to digital image processing. The Ranger 7 spacecraft of the U.S. took the first image of the moon, shown in Figure 1.4. The enhanced methods learned from this imaging served as the basis for the Surveyor missions to the moon, the Mariner series missions to Mars, the Apollo manned flights to the moon, and others.
1970
In parallel with space applications, digital image processing was applied to medical imaging, remote earth resources and astronomy, e.g., CAT (Computerized Axial Tomography) and X-rays use DIP.
1992
Berners-Lee uploaded the first image to the internet in 1992. It was of Les Horribles Cernettes, a parody pop band founded by CERN employees.
1997
Fractals: Computer-generated images were introduced, based on the iterative reproduction of a basic pattern according to some mathematical rules.
Figures 1.1–1.4: Pictures transmitted by the systems described above.
1.2 AN OVERVIEW
1.2.1. What is an Image?
A visual representation of an object is called an image. An image is a two-dimensional function that represents a measure of some characteristic, such as brightness or color, of a viewed scene.
Fig. 1.5 Sample Image
(https://www.designyourway.net/diverse/amazingworld/28899053723.jpg)
1.2.2 What is a digital image?
A digital image is composed of a finite number of elements, each having a particular location and value. These elements are called picture elements, image elements, pels or pixels.
A real image can be represented as a two-dimensional continuous light intensity function g(x,y), where x and y denote the spatial coordinates and the value of g is proportional to the brightness (or gray level) of the image at that point.
1.2.3 Types of Image
Generally the images can be classified into two types. They are
i) Analog Image
ii) Digital Image
i) Analog Image
An image in which the physical quantity varies continuously over the spatial coordinates x and y is known as an analog image. An analog image can be mathematically represented as a continuous range of values representing position and intensity. The images produced on the screen of a CRT monitor or a television, and many medical images, are analog images.
ii) Digital Image
A digital image is composed of picture elements called pixels with discrete data. Pixels are the smallest samples of an image. A pixel represents the brightness at one point. Common formats of digital images are TIFF, GIF, JPEG, PNG and PostScript.
Advantages of Digital Images
i) The processing of images is faster and cost-effective.
ii) Digital images can be effectively stored and efficiently transmitted
from one place to another.
iii) Immediate output display to see the image.
iv) Copying a digital image is easy. The quality of the digital image will not be degraded even if it is copied several times.
v) The reproduction of the image is both faster and cheaper.
vi) Digital technology supports various image manipulations.
Drawbacks of Digital Images
i) Misuse of images has become easier.
ii) When enlarging the image, the quality of the image may be compromised.
iii) A large volume of memory is required to store and process the images.
iv) Fast processors are required to run digital image processing algorithms.
1.2.4. Digital Image Processing (DIP)
Processing images using digital computers is termed Digital Image Processing.
Digital image processing concepts are applied in the fields of defence, medical diagnosis, astronomy, archaeology, industry, law enforcement, forensics, remote sensing, etc.
Flexibility and Adaptability
Modification of hardware components is not required in order to reprogram digital computers to solve different tasks. This feature makes digital computers an ideal device for processing image signals adaptively.
Data Storage and Transmission
Digital data can be effectively stored, since the development of different image compression algorithms is ongoing. Digital data can also be easily transmitted from one place to another and from one device to another using computers and related technologies.
Different image processing techniques include image enhancement, image restoration, image fusion and image watermarking for effective applications.
1.3 IMAGE REPRESENTATION
● Represented as an M × N matrix.
● Each element in the matrix is a number that represents a sampled intensity.
● M × N gives the resolution in pixels.
Figure 1.6. Coordinate convention
used to represent digital images.
A digital image is a finite collection of discrete data samples (pixels) of any visible object. The pixels represent a two- or higher-dimensional “view” of the object, each pixel having its own discrete value in a finite range. The pixel values may represent the amount of visible light, infra-red light, absorption of x-rays, electrons, or any other measurable value such as ultrasound wave impulses.
The result of sampling and quantization is a matrix of real numbers. Assume that an image f(x,y) is sampled so that the resulting digital image has M rows and N columns. The values of the coordinates (x,y) now become discrete quantities; thus the value of the coordinates at the origin is (x,y) = (0,0). The next coordinate values along the first row of the image are (x,y) = (0,1), and so on.
f(x,y) =
[ f(0,0)      f(0,1)      ...   f(0,N-1)
  f(1,0)      f(1,1)      ...   f(1,N-1)
  ...         ...               ...
  f(M-1,0)    f(M-1,1)    ...   f(M-1,N-1) ]
Fig 1.7 Matrix representation format of a digital image
The right side of this equation is by definition a digital image. Each element of this matrix array is called an image element, picture element, pixel, or pel.
Or the same can be represented as
A =
[ a(0,0)      a(0,1)      ...   a(0,N-1)
  a(1,0)      a(1,1)      ...   a(1,N-1)
  ...         ...               ...
  a(M-1,0)    a(M-1,1)    ...   a(M-1,N-1) ]
1.4 BASIC RELATIONSHIP BETWEEN PIXELS
There are several important relationships between pixels in a digital
image.
1.4.1 Neighbors of a Pixel
A pixel p at coordinates (x,y) has four horizontal and vertical neighbors whose coordinates are given by:
(x+1, y), (x-1, y), (x, y+1), (x, y-1)
This set of pixels, called the 4-neighbors of p, is denoted by N4(p). Each pixel is one unit distance from (x,y), and some of the neighbors of p lie outside the digital image if (x,y) is on the border of the image. The four diagonal neighbors of p have coordinates
(x+1, y+1), (x+1, y-1), (x-1, y+1), (x-1, y-1)
and are denoted by ND(p). These points, together with the 4-neighbors, are called the 8-neighbors of p, denoted by N8(p). The 4-neighborhood of p can be pictured as:
            (x, y-1)
(x-1, y)    p(x,y)    (x+1, y)
            (x, y+1)
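The following is a minimal sketch (not part of the original text) of how these neighborhoods can be computed; the Python function name and the (row, column) indexing are assumptions made only for illustration.

def neighbors(x, y, M, N):
    """Return N4(p), ND(p) and N8(p) for pixel p = (x, y) in an M x N image,
    discarding neighbors that fall outside the image."""
    n4 = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    nd = [(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)]
    inside = lambda p: 0 <= p[0] < M and 0 <= p[1] < N
    n4 = [p for p in n4 if inside(p)]
    nd = [p for p in nd if inside(p)]
    return n4, nd, n4 + nd

n4, nd, n8 = neighbors(0, 0, 5, 5)   # a corner pixel keeps only 2 of its 4-neighbors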
1.4.2 Adjacency, Connectivity, Regions and Boundaries
● To define adjacency, the set of gray-level values V is considered.
● In a binary image, adjacency of pixels with value 1 is referred to as V = {1}.
● In a gray-scale image the idea is the same, but V typically contains more elements, for example V = {100, 101, …, 150}, a subset of the 256 values from 0 to 255.
Types of Adjacency:
(i) 4-adjacency: two pixels p and q with values from V are 4-adjacent if q is in the set N4(p).
(ii) 8-adjacency: two pixels p and q with values from V are 8-adjacent if q is in the set N8(p).
(iii) m-adjacency: two pixels p and q with values from V are m-adjacent if
a) q is in N4(p), or
b) q is in ND(p) and the set N4(p) ∩ N4(q) has no pixel whose values are from V.
Mixed adjacency is a modification of 8 -adjacency. It is introduced to
eliminate the ambiguities that often arise when 8 -adjacency is used.
Fig. 1.8 Arrangement of pixels; Fig. 1.9 Pixels that are 8-adjacent (dashed lines) to the center pixel; Fig. 1.10 m-adjacency
Digital Path:
A digital path from pixel p(x,y) to pixel q(s,t) is a sequence of distinct pixels with coordinates (x0,y0), (x1,y1), …, (xn,yn), where (x0,y0) = (x,y), (xn,yn) = (s,t), and pixels (xi,yi) and (xi-1,yi-1) are adjacent for 1 ≤ i ≤ n; n is the length of the path.
If (x0,y0) = (xn,yn), the path is closed.
Based on the type of adjacency, paths are specified as 4-, 8- or m-paths.
In Figure 1.9 the paths between the top right and bottom right pixels are 8-paths, and the path between the same two pixels in Figure 1.10 is an m-path.
The pixel arrangement used in Figs. 1.8–1.10 is:
0 1 1
0 1 0
0 0 1
Connectivity:
Let S represent a subset of pixels in an image, two pixels p and q are said
to be connected in S if there exists a path between them consisting entirely
of pixels in S.
For any pixel p in S , the set of pixels that are connected to it in S is called
a connected component of S. If it only has one connected component, then
set S is called a connected set .
Region and Boundary:
Region: Let R be a subset of pixels in an image. R is a region of the image if R is a connected set. Any pixels in the boundary of the region that happen to coincide with the border of the image are included implicitly as part of the region boundary.
Boundary: The boundary of a region R is the set of pixels in the region
that have one or more neighbors that are not in R.
If R is an entire image, then its boundary is defined as the set of pixels in
the first and last rows and columns in the image. There are no neighbors
beyond the pixels’ borders.
1.4.3 Distance Measures
For pixels p, q and z with coordinates (x,y), (s,t) and (v,w) respectively, D is a distance function or metric if
(a) D(p,q) ≥ 0 (D(p,q) = 0 iff p = q),
(b) D(p,q) = D(q,p), and
(c) D(p,z) ≤ D(p,q) + D(q,z)
The Euclidean distance between p and q is defined as:
De(p,q) = [(x − s)² + (y − t)²]^(1/2)
Pixels having a distance less than or equal to some value r from (x,y) are the points contained in a disk of radius r centered at (x,y).
The D4 distance (also called city-block distance) between p and q is defined as:
D4(p,q) = | x – s | + | y – t |
Pixels having a D4 distance from (x,y) less than or equal to some value r form a diamond centered at (x,y).
Example:
The pixels with distance D4 ≤ 2 from (x,y) form the following contours of constant distance. The pixels with D4 = 1 are the 4-neighbors of (x,y):
2
2 1 2
2 1 0 1 2
2 1 2
2
The D8 distance (also called chessboard distance) between p and q is defined as:
D8(p,q) = max(| x – s |, | y – t |)
Pixels having a D8 distance from (x,y) less than or equal to some value r form a square centered at (x,y).
2 2 2 2 2
2 1 1 1 2
2 1 0 1 2
2 1 1 1 2
2 2 2 2 2
Example: The pixels with D8 distance ≤ 2 from (x,y) form the contours of constant distance shown above.
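As an illustration only (not from the original text), the three distance measures can be computed as follows in Python; the (x, y) tuple format for points is an assumption.

import math

def d_euclidean(p, q):
    (x, y), (s, t) = p, q
    return math.sqrt((x - s) ** 2 + (y - t) ** 2)

def d4(p, q):                              # city-block distance
    (x, y), (s, t) = p, q
    return abs(x - s) + abs(y - t)

def d8(p, q):                              # chessboard distance
    (x, y), (s, t) = p, q
    return max(abs(x - s), abs(y - t))

p, q = (0, 0), (2, 1)
print(d_euclidean(p, q), d4(p, q), d8(p, q))   # 2.236..., 3, 2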
Dm Distance:
Dm is the length of the shortest m-path between the points. In this case, the distance between two pixels depends on the values of the pixels along the path, as well as the values of their neighbors.
Example:
Consider the following arrangement of pixels and assume that p, p2 and p4 have value 1 and that p1 and p3 can each have a value of 0 or 1:

p3  p4
p1  p2
p

Consider adjacency of pixel value V = {1}. Compute the Dm distance between points p and p4.
There are 4 cases:
Case 1: If p1 = 0 and p3 = 0, the length of the shortest m-path (the Dm distance) is 2: p, p2, p4.
Case 2: If p1 = 1 and p3 = 0, then p2 and p are no longer m-adjacent, and the length of the shortest m-path becomes 3: p, p1, p2, p4.
Case 3: If p1 = 0 and p3 = 1, the shortest m-path again has length 3: p, p2, p3, p4.
Case 4: If p1 = 1 and p3 = 1, the shortest m-path has length 4: p, p1, p2, p3, p4.
1.4.4 Image operations on a Pixel Basis
For arithmetic and logic operations between images, the corresponding pixels in the images are involved in the operations.
If one image is divided by another, the division is carried out between the corresponding pixels in the two images.
Let f and g be two images. Applying the division operation, h = f/g: the first element of image h is the result of the first pixel of image f divided by the first pixel of image g.
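A minimal sketch of these pixel-by-pixel operations using NumPy arrays (the use of NumPy and the sample values are assumptions; the text itself does not prescribe any particular tool):

import numpy as np

f = np.array([[100.0, 50.0], [30.0, 200.0]])
g = np.array([[ 10.0,  5.0], [ 3.0,  40.0]])

h_add = f + g      # pixel-wise addition
h_sub = f - g      # pixel-wise subtraction (image subtraction)
h_div = f / g      # pixel-wise division: h_div[0, 0] = 100.0 / 10.0 = 10.0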
1.5 ELEMENTS OF DIGITAL IMAGE PROCESSING
SYSTEMS:
The basic elements of digital image processing systems are
i) Image Acquisition devices
ii) Image storage devices
iii) Image processing elements
iv) Image display devices
i) Image Acquisition devices
The term image acquisition refers to the process of capturing real-world images and storing them in a computer. Conventional silver-based photographs in the form of negatives, transparencies or prints can be scanned using a variety of scanning devices. Digital cameras, which capture images directly in digital form, are more popular nowadays. Films are not used in digital cameras. Instead, they use a charge-coupled device (CCD) or CMOS device as the image sensor that converts light into electrical charges. An image sensor is a 2D array of light-sensitive elements that convert photons to electrons. Most digital cameras use either a CCD or a CMOS image sensor.
A solid-state image sensor consists of: a) discrete photo-sensing elements, b) a charge-transport mechanism, and c) an output circuit.
❖ The photo-sensitive sites convert the incoming photons into electrical charges and integrate these charges into a charge packet.
❖ The charge packet is then transferred through the transport mechanism to the output circuit, where it is converted into a measurable voltage.
❖ The types of photo-sensing elements used in solid-state imagers include photodiodes, MOS capacitors, Schottky-barrier diodes and photoconductive layers.
❖ The output circuit typically consists of a floating diffusion and a source-follower amplifier.
❖ In practical applications, image sensors are configured in a one-dimensional (linear devices) or two-dimensional manner.
Fig. 1.11 Elements of a DIP system: image acquisition devices (CCD sensor, CMOS sensor, image scanners), image storage devices (computer memory, frame buffers, magnetic tapes, optical disks), image processing elements (computer), and image display devices (CRT, computer monitor, printer, TV monitor, projector).
ii) Image storage devices
If the image is not compressed, an enormous volume of storage is required.
There are three categories of storage devices:
a) Short-term storage b) Online storage c) Archival storage
Short-term storage: Used at the time of processing. Example: computer memory, frame buffers. Frame buffers store more than one image and can be accessed rapidly at video rates. Image zoom, scrolling and pan shifts are done through frame buffers.
Online storage: Used when the data is accessed often; it enables fast recall. Example: magnetic disks or optical media.
Archival storage: Characterized by infrequent access. Example: magnetic tapes and optical disks. It requires a large amount of storage space, and the stored data is accessed infrequently.
iii) Image processing elements
The computer and its related devices are the image processing elements for various applications.
iv) Image display devices
Image displays are typically color TV monitors. These monitors are driven by the outputs of image and graphics display cards, which are part of the computer system.
1.6 ELEMENTS OF VISUAL PERCEPTION
1.6.1 Structure of Human Eye
Characteristics of Eye
❖ Nearly spherical
❖ Approximately 20 mm in diameter
❖ Three membranes
i) Cornea and Sclera
ii) Choroid
iii) Retina
i) Cornea and Sclera
The cornea is a tough, transparent tissue that covers the anterior (front) surface of the eye. The sclera is an opaque membrane that is continuous with the cornea and encloses the remaining portion of the eye.
ii) Choroid
It is located directly below the sclera. It contains a network of blood vessels that provides nutrition to the eye. The outer cover of the choroid is heavily pigmented to reduce the amount of extraneous light entering the eye. It also contains the iris diaphragm and the ciliary body.
Iris diaphragm
It contracts and expands to control the amount of light entering the eye. The central opening of the iris, which appears black, is known as the pupil, whose diameter varies from 2 mm to 8 mm.
Lens
The lens is made up of many layers of fibrous cells. It is suspended from and attached to the ciliary body. It contains 60% to 70% water, about 6% fat, and more protein. The lens is colored by a slightly yellow pigmentation. This coloring increases with age, which leads to clouding of the lens. Excessive clouding of the lens, which happens in extreme cases, is known as a cataract. This leads to poor color discrimination and loss of clear vision.
The lens absorbs approximately 8% of the visible light spectrum, with relatively higher absorption at shorter wavelengths. Both infrared and ultraviolet light are absorbed appreciably by proteins within the lens structure and, in excessive amounts, can damage the eye.
iii) Retina
The retina is the innermost membrane; objects are imaged on its surface. The central portion of the retina is called the fovea. The two types of receptors in the retina are rods and cones.
Rods are long, small receptors and cones are shorter and thicker in structure. The rods and cones are not distributed evenly around the retina.
Fig. 1.12 Structure of the human eye
Cones
Cones are highly sensitive to color and are located in the fovea. There are 6 to 7 million cones. Each cone is connected to its own nerve end; therefore humans can resolve fine details with the use of cones. Cones respond to higher levels of illumination; their response is called photopic vision or bright-light vision.
Rods
Rods are more sensitive to low illumination than cones. There are about 75 to 150 million rods. Many rods are connected to a single, common nerve, so the amount of detail recognizable is less. Therefore rods provide only a general, overall picture of the field of view. Because vision at low illumination is due to stimulation of the rods, objects that appear colored in daylight appear colorless in moonlight. This phenomenon is called scotopic vision or dim-light vision.
Fig 1.13 Rods and Cones in Retina
Receptor density measured in degrees from the fovea (the angle formed
between the visual axis and a line extending from the center of the lens to
the retin a
1.6.2 Image Formation in the Eye
The lens of the eye is flexible, whereas an optical lens is not. The radius of curvature of the anterior surface of the lens is greater than the radius of its posterior surface. The tension in the fibers of the ciliary body controls the shape of the lens.
To focus on distant objects (farther than about 3 m), the controlling muscles flatten the lens, which then has its lowest refractive power.
Fig. 1.14 Graphical representation of the eye. Point C is the optical center of the lens.
To focus on nearer objects, the muscles allow the lens to become thicker, giving it its strongest refractive power.
The distance between the center of the lens and the retina is called the focal length. It ranges from about 17 mm to 14 mm as the refractive power increases from its minimum to its maximum.
1.6.3 Brightness
The following terms are used to define colored light:
i) Brightness or luminance: the amount of light received by the eye regardless of color.
ii) Hue: the predominant spectral color in the light.
iii) Saturation: the spectral purity of the color in the light.
Fig. 1.15 Color attributes
The range of light intensity levels to which the human visual system can adapt is enormous, from the scotopic threshold to the glare limit. Subjective brightness is a logarithmic function of the light intensity incident on the eye.
Brightness adaptation: The human visual system has the ability to operate over a wide range of illumination levels. Dilation and contraction of the iris of the eye can account for a change of only about 16 times in the light intensity falling on the retina. The process which allows great extension of this range by changes in the sensitivity of the retina is called brightness adaptation.
1.6.4 Contrast
The response of the eye to changes in the intensity of illumination is non-linear. This does not hold at very low or very high intensities, and it is dependent on the intensity of the surround.
Perceived brightness and intensity
Perceived brightness is not a simple function of intensity. This can be explained by simultaneous contrast and the Mach band effect.
Simultaneous contrast
The small squares in each image have the same intensity. Because of the different background intensities, the small squares do not appear equally bright. Perceiving the two squares on different backgrounds as different, even though they are in fact identical, is called the simultaneous contrast effect. Psychophysically, we say this effect is caused by the difference in the backgrounds.
The term contrast is used to emphasise the difference in luminance of objects. The perceived brightness of a surface depends upon the local background, which is illustrated in Fig. 1.16. In Fig. 1.16, the small square on the right-hand side appears brighter compared to the square on the left-hand side, even though the gray level of both squares is the same. This phenomenon is termed ‘simultaneous contrast’. It is to be noted that simultaneous contrast can make the same colours look different.
Fig. 1.16 Simultaneous contrast
1.6.5 Hue
Hue refers to the dominant color family, like yellow, orange, red, violet, blue and green; tertiary colors, mixed colors where neither component is dominant, are also considered hues.
The pure hues lie around the perimeter of the color circle. The closer a color is to the center of the circle, the more desaturated it is, with white at the center. Fig. 1.17 shows hues, saturation and lightness.
Fig. 1.17 Hue
1.6.6 Saturation
Saturation is how “pure” a color is. For example, if its hue is cyan, its saturation would be how purely cyan it is. Less saturated means more whitish or grayish. If a color has greater-than-zero values for all three of its red, green and blue primaries, then it is somewhat desaturated.
1.6.7 Mach band effect
The Mach band effect describes an effect in which the human brain subconsciously increases the contrast between two surfaces with different luminance. The Mach band effect is illustrated in Fig. 1.18. The intensity is uniform over each bar.
The visual appearance of each strip is darker at its right side than its left. The spatial interaction of luminance from an object and its surroundings creates the Mach band effect, which shows that brightness is not a monotonic function of luminance.
The Mach band effect is caused by lateral inhibition of receptors in the eye.
Receptors that receive light deplete a light-sensitive chemical compound. Receptors directly on the lighter side of a boundary can pull in unused compound from the darker side and so produce a stronger response, while receptors on the darker side of the boundary give a weaker response.
The luminance within each block is constant, yet the apparent lightness of each strip varies across its length: close to the left edge of the strip it appears lighter than at the centre, and close to the right edge it appears darker than at the centre.
The visual system exaggerates the difference in luminance (contrast) at each edge in order to detect it. This shows that the human visual system tends to undershoot or overshoot around the boundary regions of different intensities.
Fig. 1.18 Mach band effect
• The intensity is uniform over the width of each bar.
• However, the visual appearance is that each strip is darker at its right side than its left.
1.7 SIMPLE IMAGE FORMATION MODEL
An image is denoted by a two-dimensional function of the form f(x,y). The value or amplitude of f at spatial coordinates (x,y) is a positive scalar quantity whose physical meaning is determined by the source of the image. When an image is generated by a physical process, its values are proportional to the energy radiated by a physical source. As a consequence, f(x,y) must be non-zero and finite; that is,
0 < f(x,y) < ∞
The function f(x,y) may be characterized by two components:
i) Illumination component: the amount of source illumination i(x,y) incident on the scene being viewed;
ii) Reflectance component: the amount of the illumination r(x,y) reflected back by the objects in the scene.
The two functions combine as a product to form f(x,y):
f(x,y) = i(x,y) r(x,y)
The intensity of a monochrome image at any coordinates (x,y) is called the gray level l of the image at that point, l = f(x,y), with
Lmin ≤ l ≤ Lmax
Lmin is required to be positive and Lmax must be finite, where
Lmin = imin rmin
Lmax = imax rmax
The interval [Lmin, Lmax] is called the gray scale. In practice the interval is shifted to [0, L−1], where l = 0 is considered black and l = L−1 is considered white on the gray scale. All intermediate values are shades of gray varying from black to white.
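A short sketch of this image formation model and of mapping the resulting gray scale onto [0, L−1] (the illumination and reflectance values below are arbitrary assumptions used only for illustration, and the use of NumPy is an assumption):

import numpy as np

i = np.full((4, 4), 90.0)                       # illumination incident on the scene
r = np.random.uniform(0.1, 0.9, size=(4, 4))    # reflectance of the objects, 0 < r < 1

f = i * r                                       # f(x,y) = i(x,y) r(x,y)

L = 256                                         # shift [Lmin, Lmax] onto [0, L-1]
l = np.round((f - f.min()) / (f.max() - f.min()) * (L - 1)).astype(np.uint8)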
1.8 VIDICON AND DIGITAL CAMERA WORKING
PRINCIPLE
Vidicon
The vidicon is a storage -type camera tube in which a charge -density
pattern is formed by the imaged scene radiation on a photoconductive
surface which is then scanned by a beam of low velocity electrons.
The Vidicon operates on the principle of photoconductivity: the resistance of the target material shows a marked decrease when exposed to light.
Vidicon is a short tube with a length of 12 to 20 cm and diameter between
1.5 and 4 cm.
Its life is estimated to be between 5000 and 20,000 hours.
The target consists of a thin photoconductive layer of either selenium or antimony compounds, which behaves like an insulator. This is deposited on a transparent conducting film coated on the inner surface of the face plate. This conductive coating is known as the signal electrode or plate.
With light focused on it, the photon energy enables more electrons to go to the conduction band, and this reduces its resistivity.
The image side of the photolayer, which is in contact with the signal electrode, is connected to a DC supply through the load resistance.
The beam that emerges from the electron gun is focused on the surface of the photoconductive layer by the combined action of the uniform magnetic field of an external coil and the electrostatic field of grid No. 3.
Grid No. 4 provides a uniform decelerating field between itself and the photoconductive layer, so that the electron beam approaches the layer with a low velocity to prevent any secondary emission.
The fluctuating voltage coupled out to a video amplifier can be used to reproduce the image of the target.
Digital camera
A digital camera is a camera that captures images and turns them into digital form.
A digital camera has an optical system which uses a lens with a variable diaphragm to focus light onto an image pickup device. The diaphragm and shutter admit the correct amount of light to the imager.
A digital camera contains an image sensor that captures the incoming light rays and turns them into electrical signals. The image sensor can be of two types: i) a charge-coupled device (CCD) or ii) a CMOS image sensor.
Light from the object passes into the camera lens. This incoming light hits the image sensor, which breaks it up into millions of pixels. The sensor measures the color and brightness of each pixel and stores it as a number. The output digital photograph is effectively a long string of numbers describing the exact details of each pixel it contains.
1.9 COLOUR IMAGE FUNDAMENTALS
1.9.1 RGB
In the RGB model, an image consists of three independent image planes, one in each of the primary colors: red, green and blue. (The standard wavelengths for the three primaries are as shown in the figure.) A particular color is specified by the amount of each of the primary components present. Figure 1.21 shows the geometry of the RGB color model for specifying colors using a Cartesian coordinate system. The grayscale spectrum, i.e. those colors made from equal amounts of each primary, lies on the line joining the black and white vertices.
Fig. 1.21 Schematic of the RGB color cube. The gray scale spectrum lies on the line joining the black and white vertices.
This is an additive model, i.e. the colors present in the light add to form new colors, and it is appropriate for the mixing of colored light, for example. The image on the left of Figure 1.23 shows the additive mixing of red, green and blue primaries to form the three secondary colors yellow (red + green), cyan (blue + green) and magenta (red + blue), and white (red + green + blue). The RGB model is used for color monitors and most video cameras.
Fig. 1.22 24-bit color cube
Fig. 1.23 The figure on the left shows the additive mixing of red, green and blue primaries to form the three secondary colors yellow (red + green), cyan (blue + green) and magenta (red + blue), and white (red + green + blue). The figure on the right shows the three subtractive primaries and their pairwise combinations to form red, green and blue, and finally black by subtracting all three primaries from white.
Fig. 1.23 Generating the RGB image of the cross-sectional color plane. Fig. 1.24 The 216 safe RGB colors and gray in a 256-color RGB system.
Pixel Depth:
The number of bits used to represent each pixel in RGB space is called the pixel depth. If each image plane is represented by 8 bits, then the pixel depth of each RGB color pixel = 3 × number of bits per plane = 3 × 8 = 24.
A full-color image is a 24-bit RGB color image. Therefore the total number of colors in a full-color image is (2^8)^3 = 16,777,216.
Safe RGB colors:
Most systems in use are limited to 256 colors. Independent of the hardware capabilities of a particular system, a subset of colors can be reproduced reliably; this subset is called the set of safe RGB colors, or the set of all-systems-safe colors.
It is assumed that a minimum of 256 colors can be reproduced by any system. Among these, 40 colors are found to be processed differently by different operating systems. The remaining 216 colors are called the standard safe colors.
Component values of safe colors:
Each of the 216 safe colors is formed from three RGB component values, but each component value can be selected only from the set {0, 51, 102, 153, 204, 255}, in which successive numbers are obtained by adding 51 and each is divisible by 3. Therefore the total number of possible values is 6 × 6 × 6 = 216.
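A small sketch (for illustration only, not from the original text) that enumerates the 216 safe colors and their hexadecimal codes discussed in the next subsection:

from itertools import product

values = [0, 51, 102, 153, 204, 255]             # allowed component values
safe_colors = list(product(values, repeat=3))    # every (R, G, B) combination
hex_codes = ['{:02X}{:02X}{:02X}'.format(r, g, b) for r, g, b in safe_colors]

print(len(safe_colors))              # 216
print(hex_codes[0], hex_codes[-1])   # 000000 FFFFFF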
Hexadecimal representation
The component values in the RGB model are often written using the hexadecimal number system. The decimal numbers 0, 1, 2, …, 14, 15 correspond to the hex digits 0, 1, 2, …, 9, A, B, C, D, E, F. The equivalent representations of the safe-color component values are given in the table:
Hex:      00   33   66   99    CC    FF
Decimal:  0    51   102  153   204   255
Applications:
Color monitors, Color video cameras
Advantages:
● Image color generation
● Changing to other models such as CMY is straight forward
● It is suitable for hardware implementation
● It is based on the strong perception of the human visual system for the red, green and blue primaries.
Disadvantages:
● A color image is not naturally thought of as a combination of three primary images.
● This model is not suitable for describing colors in a way which is practical for human interpretation.
1.9.2 CMY
The CMY (Cyan, Magenta, Yellow) model is a subtractive model appropriate to the absorption of colors; the CMY model asks what is subtracted from white. The primary colors are cyan, magenta and yellow, and the secondary colors are red, green and blue.
When a surface coated with cyan pigment is illuminated by white light, no red light is reflected; similarly, magenta absorbs green and yellow absorbs blue. The relationship between the RGB and CMY models is given by:

[C]   [1]   [R]
[M] = [1] - [G]
[Y]   [1]   [B]

The CMY model is used by printing devices and filters.
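A minimal sketch of the RGB–CMY relationship above (component values are assumed normalised to [0, 1], and the use of NumPy is an assumption):

import numpy as np

def rgb_to_cmy(rgb):
    return 1.0 - np.asarray(rgb, dtype=np.float64)

def cmy_to_rgb(cmy):
    return 1.0 - np.asarray(cmy, dtype=np.float64)

print(rgb_to_cmy([1.0, 0.0, 0.0]))   # pure red -> [0. 1. 1.] (no cyan, full magenta and yellow)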
1.9.3 HSI MODEL
Colors are specified by the three quantities hue, saturation and intensity, which is similar to the way humans interpret color.
Hue: a color attribute that describes a pure color.
Saturation: a measure of the degree to which a pure color is diluted by white light.
Intensity: a measurable and interpretable descriptor of monochromatic images, also called the gray level.
i) Hue:
The hue of a color can be determined from the RGB color cube. If the three points black, white and any one color are joined, a triangle is formed. All the points inside the triangle have the same hue. This is because the black and white components cannot change the hue.
HSI color space
The HSI color space is represented by a vertical intensity axis and the locus of color points that lie on planes perpendicular to this axis. The shape of the locus is defined by the intersection of these planes with the faces of the cube. As the planes move up and down along the intensity axis, the shape can be either a triangle or a hexagon. In HSI space, the primary colors are separated by 120°. The secondary colors are also separated by 120°, and the angle between the secondaries and the primaries is 60°.
Representation of Hue:
The hue of a color point is determined by an angle from some reference point. A color point lying on the red axis (angle 0°) has zero hue. As the angle from the red axis increases in the counterclockwise direction, the hue increases.
Fig. 1.25 Conceptual relationship between the RGB and HSI color models
ii) Intensity:
The intensity can be extracted from an RGB image because an RGB color image may be viewed as three monochrome intensity images.
Intensity axis:
A vertical line joining the black vertex (0,0,0) and the white vertex (1,1,1) is called the intensity axis. The intensity axis represents the gray scale.
iii) Saturation:
All points on the intensity axis are gray, which means that the saturation, i.e. the purity, of points on the axis is zero. As the distance of a color from the intensity axis increases, the saturation of that color also increases.
Representation of saturation
Saturation is described by the length from the vertical axis. In HSI space it is represented by the length of the vector from the origin (on the intensity axis) to the color point: the greater the length, the higher the saturation, and vice versa.
Fig. 1.26 HSI components of the images
Converting colors from RGB to HSI
Given an image in RGB color format, the H component of each RGB pixel is obtained using the equation
H = θ          if B ≤ G
H = 360° − θ   if B > G
where
θ = cos⁻¹ { ½[(R − G) + (R − B)] / [(R − G)² + (R − B)(G − B)]^(1/2) }
The saturation and intensity components are given by
S = 1 − 3 min(R, G, B) / (R + G + B)
I = (R + G + B) / 3
Converting colors from HSI to RGB
The conversion equations depend on the value of H (hue). For the three sectors the conversion equations are given below:
RG (Red, Green) sector (0° ≤ H < 120°): When H is in this sector, the RGB components are given by
B = I (1 − S)
R = I [1 + S cos H / cos(60° − H)]
G = 3I − (R + B)
GB (Green, Blue) sector (120° ≤ H < 240°): When H is in this sector, first subtract 120° (H = H − 120°); the RGB components are then given by
R = I (1 − S)
G = I [1 + S cos H / cos(60° − H)]
B = 3I − (R + G)
BR (Blue, Red) sector (240° ≤ H ≤ 360°): When H is in this sector, first subtract 240° (H = H − 240°); the RGB components are then given by
G = I (1 − S)
B = I [1 + S cos H / cos(60° − H)]
R = 3I − (G + B)
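The following sketch implements the conversions above for a single pixel with components normalised to [0, 1] (a minimal illustration, not from the original text; the small epsilon added to the denominators to avoid division by zero is an assumption):

import numpy as np

def rgb_to_hsi(r, g, b):
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b)) + 1e-12
    theta = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))
    h = theta if b <= g else 360.0 - theta
    s = 1.0 - 3.0 * min(r, g, b) / (r + g + b + 1e-12)
    i = (r + g + b) / 3.0
    return h, s, i

def hsi_to_rgb(h, s, i):
    def term(a):          # the common term I [1 + S cos H / cos(60 - H)]
        return i * (1.0 + s * np.cos(np.radians(a)) / np.cos(np.radians(60.0 - a)))
    if h < 120.0:                              # RG sector
        b = i * (1.0 - s); r = term(h); g = 3.0 * i - (r + b)
    elif h < 240.0:                            # GB sector
        h -= 120.0
        r = i * (1.0 - s); g = term(h); b = 3.0 * i - (r + g)
    else:                                      # BR sector
        h -= 240.0
        g = i * (1.0 - s); b = term(h); r = 3.0 * i - (g + b)
    return r, g, b

print(hsi_to_rgb(*rgb_to_hsi(0.6, 0.3, 0.1)))  # recovers approximately (0.6, 0.3, 0.1)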
Advantages of HSI model:
● It describes colors in terms that are suitable for human interpretation.
● The model allows independent control over the color-describing quantities, namely hue, saturation and intensity.
● It can be used as an ideal tool for developing image processing algorithms based on color descriptions.
1.9.4 2D SAMPLING
To create a digital image, the continuous sensed data must be converted into digital form. This involves two processes:
i) Sampling
ii) Quantization
An image f(x,y) may be continuous with respect to the x- and y-coordinates, and also in amplitude. To convert it to digital form, the function must be sampled in both coordinates and in amplitude.
Digitizing the coordinate values is called sampling.
The one-dimensional function in Fig. 1.27(b) is a plot of amplitude (intensity level) values of the continuous image along the line segment AB in Fig. 1.27(a).
To sample this function, equally spaced samples are taken along line AB, as depicted in Fig. 1.27(c). The spatial location of each sample is indicated by a vertical tick mark.
The samples are shown as small white squares superimposed on the function. The set of these discrete locations gives the sampled function. However, the values of the samples still span (vertically) a continuous range of intensity values. The intensity values must be quantized to form a digital function.
The right side of Fig. 1.27(c) shows the intensity scale divided into eight discrete intervals, ranging from black to white. The vertical tick marks indicate the specific value assigned to each of the eight intensity intervals. The continuous intensity levels are quantized by assigning one of the eight values to each sample. The assignment is made depending on the vertical proximity of a sample to a vertical tick mark. The digital samples resulting from both sampling and quantization are shown in Fig. 1.27(d). Starting at the top of the image and carrying out this procedure line by line produces a two-dimensional digital image.
Fig. 1.27 Generating a digital image. (a) Continuous image. (b) A scan line from A to B in the continuous image, used to illustrate the concepts of sampling and quantization. (c) Sampling and quantization. (d) Digital scan line.
1.9.5 QUANTIZATION
Digitizing the amplitude values is called quantisation. Quantisation involves representing the sampled data by a finite number of levels based on some criterion, such as minimisation of quantiser distortion.
Quantisers can be classified into two types, namely, i) scalar quantisers
and ii) vector quantisers. The classification of quantisers is shown in
Fig. 1.29.
The number of individual mechanical increments at which the sensor is activated to collect data determines the spatial sampling. Limits on sampling accuracy are determined by factors such as the quality of the optical components of the system.
Mechanical motion in the other direction can be controlled more accurately, but it makes little sense to try to achieve a sampling density in one direction that exceeds the sampling limits established by the number of sensors in the other.
The accuracy achieved in quantization is highly dependent on the noise content of the sampled signal. The method of sampling is determined by the sensor arrangement used to generate the image.
When an image is generated by a single sensing element combined with mechanical motion, the output of the sensor is quantized as given in Fig. 2.18; the image after sampling and quantization is shown in Fig. 2.18(b).
The quality of a digital image is determined to a large degree by the number of samples and discrete intensity levels used in sampling and quantization.
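A small sketch of uniform sampling and quantization of a one-dimensional intensity profile, in the spirit of Fig. 1.27 (the profile itself, and the use of NumPy, are assumptions made only for illustration):

import numpy as np

x = np.linspace(0.0, 1.0, 1000)
profile = 0.5 + 0.5 * np.sin(2.0 * np.pi * 3.0 * x)   # continuous scan line, values in [0, 1]

samples = profile[::25]                               # sampling: keep every 25th value (40 samples)

levels = 8                                            # quantization: 8 discrete intensity levels
quantized = np.round(samples * (levels - 1)).astype(int)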
1.10 SUMMARY
Since 1921, when the Bartlane cable picture transmission system was introduced, digital images have been evolving. In 1964 computers were used to process digital images, and digital image processing proper began.
A digital image is composed of elements called pixels. Digital images are used because of their immediate display, fast processing and compact storage.
The positions of the pixels that represent a digital image are described through the neighbors of the pixels, adjacency, boundaries and connectivity of the pixels.
Image acquisition devices, image storage devices, image processing elements and image display devices are the basic elements of the digital image processing system used to process digital images. The structure of the human eye helps humans to understand and sense the colors and structure of images.
RGB and CMY are useful for representing images with different colors, brightness and contrast.
1.11 REFERENCES
1. R. C. Gonzalez & R. E. Woods, Digital Image Processing, Pearson Education, 3rd edition, ISBN-13: 978-0131687288
2. S. Jayaraman, Digital Image Processing, TMH (McGraw Hill) publication, ISBN-13: 978-0-07-0144798
3. William K. Pratt, “Digital Image Processing”, John Wiley, NJ, 4th Edition, 2007
4. The Origins of Digital Image Processing & Application Areas in Digital Image Processing Medical Images, Mukul, Sajjansingh, and Nishi
1.12 UNIT END EXERCISES
1. Define image and digital image.
2. Classify the types of images.
3. Write the advantages and disadvantages of digital images.
4. What is digital image processing?
5. How do you represent digital images? Explain.
6. Describe the relationships between pixels.
7. How do you measure the distance between pixels?
8. Explain the elements of a digital image processing system.
9. Explain the structure of the human eye.
10. Write short notes on i) Hue ii) Mach band effect.
11. Elucidate the working principle of a digital camera with a neat diagram.
12. Write short notes on i) RGB ii) CMY.
Module II
Image Enhancement in the spatial domain
2
SPATIAL DOMAIN METHODS
Unit Structure
2.0 Objectives
2.1 Introduction
2.2 An Overview
2.3 Spatial Domain Methods
2.3.1 Point Processing
2.3.2 Intensity transformations
2.3.3 Histogram Processing
2.3.4 Image Subtraction
2.4 Let us Sum Up
2.5 List of References
2.6 Bibliography
2.7 Unit End Exercises
2.0 OBJECTIVES
Enhancement's main goal is to improve the quality of an image so that it
may be used in a certain process.
● Enhancement of images Enhancement in the spatial domain and
Frequency domain f all into two categories.
● The word spatial domain refers to the Image Plane itself, which is
DIRECT pixel manipulation.
● Frequency domain processing approaches work by altering an image's
Fourier transform.
2.1 INTRODUCTION
The aggregate of pixels that make up an image is known as the spatial domain. Spatial domain methods are procedures that operate directly on these pixels:
g(x,y) = T[f(x,y)]
where f(x,y) is the input image, T is an image operator, and g(x,y) is the processed image. T can also operate on a set of images.
The neighborhood over which T operates is defined around a point (x,y). The most basic form of neighborhood is a single pixel, in which case
s = T(r)
where T is a transformation function and r and s are the grey levels of f(x,y) and g(x,y), respectively. The most typical neighborhood, however, is a rectangular sub-image region centred at (x,y).
● SPATIAL DOMAIN METHODS
The value of a pixel with coordinates (x,y) in the enhanced image is the outcome of performing some operation on the pixels in the neighbourhood of (x,y) in the input image, f.
Neighbourhoods can be any shape, however they are most commonly rectangular.
When the operator T just acts on a pixel neighborhood in the input image,
it is the simplest kind of an operation because it only depends on the value
of F at that point (x,y). This is a greyscale mapping or transformation.
Thresholding is the simplest case, in which the intensity profile is replaced
with a step function that is active at a set threshold value. In this scenario,
any pixel in the input image w ith a grey level below the threshold is
mapped to 0 in the output image. The rest of the pixels are set to 255.
Figure 1 depicts further greyscale adjustments.
● EQUALIZATION OF HISTOGRAMS
Equalization of histograms is a typical approach for improving the appearance of photographs. Suppose we have a largely dark image. The visual detail is compressed towards the dark end of the histogram, and the histogram is skewed towards the lower end of the greyscale. The image would be much clearer if we could stretch out the grey levels at the dark end to obtain a more uniformly distributed histogram. Figure 2 shows the original image and histogram and the equalised versions. Both images have been quantized to a total of 64 grey levels.
Finding a grey scale translation function that produces an output image
with a uniform histogram is the goal of histogram equalisation (or nearly
so).
What is the procedure for determining the grey scale transformation
function? Assume that our grey levels are continuous and that they have
been normalised to a range of 0 to 1.
We need to identify a transformation T that converts the grey values r in
the input image F to grey values s = T(r) in the converted image.
The assumption is that
● T is single valued and monotonically increasing, and
● 0 ≤ T(r) ≤ 1 for 0 ≤ r ≤ 1.
The inverse transformation from s to r is given by
r = T-1(s).
If we take the histogram of the input image and normalise it so that the area under the histogram is 1, we have a probability distribution Pr(r) for the grey levels in the input image.
What is the probability distribution Ps(s) if we transform the input image
to s = T(r)?
It turns out that, according to probability theory,
Ps(s) = Pr(r) |dr/ds|
where r = T⁻¹(s).
Consider the transformation
s = T(r) = ∫₀ʳ Pr(w) dw
This is the cumulative distribution function of r. Using this definition of T, the derivative of s with respect to r is
ds/dr = Pr(r)
Substituting this back into the expression for Ps, we get
Ps(s) = Pr(r) · 1/Pr(r) = 1 for all s between 0 and 1.
Thus, Ps(s) is now a uniform distribution function, which is what we want.
● DISCRETE FORMULATION
The probability distribution of grey levels in the input image must first be determined. Now
Pr(rk) = nk / N
where nk is the number of pixels having grey level k, and N is the total number of pixels in the image.
The transformation now becomes
sk = T(rk) = Σ (j = 0 to k) Pr(rj) = Σ (j = 0 to k) nj / N
Note that 0 ≤ sk ≤ 1, the index k = 0, 1, 2, …, 255, and 0 ≤ rk ≤ 1.
So that the output values of this transformation span from 0 to 255, the values of sk must be scaled up by 255 and rounded to the nearest integer.
As a result of the discretization and rounding of sk to the nearest integer, the modified image's histogram will not be exactly uniform.
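A compact sketch of this discrete histogram equalization for an 8-bit greyscale image (the NumPy implementation and the synthetic dark test image are assumptions made for illustration):

import numpy as np

def equalize(image):
    hist = np.bincount(image.ravel(), minlength=256)   # n_k for k = 0..255
    p = hist / image.size                              # P(r_k) = n_k / N
    s = np.cumsum(p)                                   # s_k = sum of P(r_j) for j <= k
    lut = np.round(255.0 * s).astype(np.uint8)         # scale to 0..255 and round
    return lut[image]                                  # map every pixel through T

dark = np.clip(np.random.normal(40.0, 15.0, (64, 64)), 0, 255).astype(np.uint8)
equalized = equalize(dark)    # grey levels are spread over the full range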
● SMOOTHING AN IMAGE
Image smoothing is used to reduce the impact of camera noise, erroneous
pixel values, missing pixel values, and other factors. Image smoothing can
be done in a variety of ways; we'll look at neighborhood averaging and
edge -preserving smoothing.
● NEIGHBOURHOOD AVERAGING
Each pixel value in the smoothed image is obtained as the average pixel value in a neighbourhood of (x,y) in the input image. For example, if we use a 3×3 neighbourhood around each pixel, we would use the mask

1/9 1/9 1/9
1/9 1/9 1/9
1/9 1/9 1/9
Each pixel value is multiplied by 1/9, the products are summed, and the result is placed in the output image. This mask is moved across the image in steps until every pixel has been covered. The image is convolved with this smoothing mask (also known as a spatial filter or kernel).
The value of a pixel, on the other hand, is normally expected to be more
strongly related to the values of pixels nearby than to those further away.
This is because most points in a picture are spatially coherent with their
neighbours; in fact, this hypothesis is only false at edge or feature points.
As a result, the pixels towards the mask's center are usually given a higher
weight than those on the edges.
The rectangular weighting function (which just takes the average over the
window), a triangular weighting function, and a Gaussian are all typical
weighting functions.
Although Gaussian smoothing is the most widely utilized, there isn't much
of a difference between alternative weighting functions in practice.
Gaussian smoothing is characterized by the smooth modification of the
image's frequency components.
Smoothing decreases or attenuates the image's higher frequencies. Other
mask shapes can c ause strange things to happen to the frequency
spectrum, but we normally don't notice much in terms of image
appearance.
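A minimal sketch of neighbourhood averaging with a k x k rectangular mask (illustration only; edge handling by border replication is an assumption, and a Gaussian-weighted version would simply replace the uniform 1/k² weights with Gaussian weights):

import numpy as np

def mean_filter(image, k=3):
    pad = k // 2
    padded = np.pad(image.astype(np.float64), pad, mode='edge')
    out = np.zeros(image.shape, dtype=np.float64)
    for dy in range(k):                 # sum the k*k shifted copies of the image
        for dx in range(k):
            out += padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    return (out / (k * k)).astype(image.dtype)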
Edge-preserving smoothing
Because the image's high frequencies are suppressed, neighborhood averaging or Gaussian smoothing will tend to blur edges. Median filtering is a viable alternative: the grey level of each pixel is set to the median of the pixel values in its immediate vicinity.
The median m of a set of values is the value such that half of the values are less than m and the other half are greater. Assume that the pixel values in a given 3×3 neighborhood are (10, 20, 20, 15, 20, 20, 20, 25, 100). Ordering the values gives (10, 15, 20, 20, |20|, 20, 20, 25, 100), and the median is 20.
The result of median filtering is that pixels with outlying values are forced
to become more like their neighbors while maintaining edges. Median
filters, by definition, are non -linear.
Median filtering is related to morphological operations. When we erode an image, pixel values are replaced with the smallest value in the neighborhood. When dilating an image, the greatest value in the neighborhood is used
to replace pixel values. Median filtering replaces pixels with the
neighborhood's median value. The type of morphological o peration is
determined by the rank of the value of the pixel used in the neighborhood.
Figure 3: Image of Genevieve with salt and pepper noise, averaging result, and median filtering result.
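A small sketch of median filtering (illustration only; the 3 x 3 test image reuses the example values from the text, and border replication is an assumption):

import numpy as np

def median_filter(image, k=3):
    pad = k // 2
    padded = np.pad(image, pad, mode='edge')
    out = np.empty_like(image)
    for y in range(image.shape[0]):
        for x in range(image.shape[1]):
            out[y, x] = np.median(padded[y:y + k, x:x + k])
    return out

noisy = np.array([[10, 20, 20],
                  [15, 20, 20],
                  [20, 25, 100]], dtype=np.uint8)
print(median_filter(noisy)[1, 1])   # 20: the outlier 100 does not drag the centre value up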
2.2 AN OVERVIEW
The spatial domain technique is a well-known denoising technique. It is a noise-reduction approach in which spatial filters are applied directly to digital images. Linear and nonlinear spatial filters are the two types of spatial filtering algorithms (Sanches et al., 2008). Filtering is a method used in image processing to do several preprocessing and other tasks such as interpolation, resampling, denoising, and so on. The type of task to be performed and the type of digital image determine the filter method to be used. Filter methods are used in digital image processing to remove undesirable noise from digital photographs while preserving the original image (Priya et al., 2018; Agostinelli et al., 2013).
Nonlinear filters are used in a variety of ways, the most common of which
is to remove a certain sort of unwanted noise from digital photographs.
There is no built -in way for detecting noise in the digital image with this
method. Nonlinear filters often eliminate noise to a certain point while
blurring images and hiding edges. Several academics have created various sorts of median (nonlinear) filters to solve this challenge over the previous decade. The median filter, partial differential equations, nonlocal mean, and total variation are the most used nonlinear filters. A linear filter
is a denoising technique in which the image's output results vary in a
linear fashion. Denoising outcomes are influenced by the image's input.
As the image's input changes, the image's output changes linearly. The
processing time of linear filters for picture denoising is determined by the
input signals and the output signals. The mean linear filter is the most
effective filter for removing Gaussian noise from digital medical pictures.
This approach is a simple way to denoise digital photos (Wieclawek and Pietka, 2019). In the mean filter, the average (mean) of the neighbouring pixel values is calculated first and then used to replace each pixel of the digital image. It is a very useful linear filtering approach for reducing noise in a digital image. Wiener filtering is another linear filtering
technique. This technique requires all additive noise, noise spectra, and
digital picture inputs, and it works best if all of the input signals are in
good working order. This strategy reduces the mean square error of the
intended and estimated random processes by removing noise.
2.3 SPATIAL DOMAIN METHODS
For image enhancement there are primarily two approaches: one operates on images in the spatial domain and the other in the frequency domain. The first approach is based on directly manipulating the individual pixels of an image, whereas the second is based on altering the image's Fourier transform.
Spatial domain methods
Here, image processing functions can be expressed as:
g(x,y) = T[f(x,y)]
where f(x,y) is the input image, g(x,y) is the processed image (i.e. the result or output image), and T is an operator on f defined over some neighbourhood N of (x,y). We usually employ a rectangular subimage centred at (x,y) as the neighbourhood N.
a) N is a 1×1 neighbourhood (point -processing)
N encompasses exactly one pixel in this case. The operator T then reduces to a gray-level transformation function, which is written as:
s = T(r)
The gray levels of f(x,y) and g(x,y) are represented by r and s, respectively. We can produce some interesting effects with this technique, such as contrast stretching and bi-level mapping (where an image is converted so that it contains only black and one other color, white). The challenge is to define T in such a way that it darkens grey levels below a particular threshold k and brightens grey levels above it. A black-and-white image is created when the darkening and brightening are both complete (pure black and pure white). This technique is known as 'point-processing' since s depends only on the value (i.e. the gray level) of f at a single pixel.
b) N is a m×m neighb ourhood (spatial filtering)
In this situation, N refers to a small region. Note that this technique is not limited to image enhancement; it can also be used to smooth images, among other things. The value of g(x,y) is computed from the values in a predefined neighbourhood (i.e. the mask/filter) of (x,y). The value of m typically ranges from 3 to 10. These procedures are known as 'mask processing' or 'filtering'.
METHODS IN THE FREQUENCY DOMAIN
The convolution theorem is at the heart of these techniques. It can be stated as follows:
Assume that g(x,y) is the convolution of an image f(x,y) with a linear, position-invariant operator h(x,y):
g(x,y) = h(x,y) * f(x,y)
Applying the convolution theorem yields:
G(u,v) = H(u,v) F(u,v)
where F, G and H are the Fourier transforms of f, g and h, respectively. The following is the result of applying the inverse Fourier transform to G(u,v):
g(x,y) = F⁻¹[ H(u,v) F(u,v) ]
H(u,v), for example, may enhance the high-frequency components of F(u,v), resulting in an image g(x,y) with exaggerated edges.
Some interesting features can be noticed when looking at the theory of linear systems (see Figure 1): a system that produces an output image g(x,y) from an input image f(x,y) is referred to as h(x,y). The Fourier notation for this operation is equivalent.
Figure 1 : Linear systems.
2.3.1 POINT PROCESSING
When making a film, it is common to lower the overall intensity to create a particular atmosphere. Some go overboard, with the effect that the observer can only see blackness. So what do you do? You take out your remote and press the brightness button to alter the light intensity. When you do this, you are performing a type of image processing called point processing.
Let's say we have an input image f(x,y) that we want to alter to get a different image, which we'll call the output image g(x,y). When altering the brightness of a movie, the input image is the one stored on the DVD you're watching, and the output image is the one that appears on the television screen. Point processing is defined as an operation that calculates the new value of a pixel in g(x,y) based only on the value of the same pixel in f(x,y) and some operation. That is, the values of a pixel's neighbours in f(x,y) have no influence, hence the name point processing. The neighbouring pixels will play a significant role in upcoming topics. Figure 2.1 depicts the principle of point processing. Some of the most fundamental point processing operations are explained in this topic.
When you use your remote to adjust the brightness, you are actually changing the value of b in the following equation:

g(x,y) = f(x,y) + b

The value of b is increased every time you press the '+' brightness button, and vice versa. As b is increased, a higher and higher value is added to each pixel in the input image, making the image brighter. The image becomes brighter if b > 0, and darker if b < 0. Figure 2.2 depicts the effect of altering the brightness.
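As a minimal MATLAB sketch of this brightness operation (the test image 'pout.tif' and the offset value are arbitrary choices; uint8 arithmetic conveniently saturates at 0 and 255):
f = imread('pout.tif');            % 8-bit grayscale example image
b = 50;                            % brightness offset, b > 0 brightens the image
g = f + b;                         % uint8 addition saturates outside [0, 255]
imshowpair(f, g, 'montage')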
Figure 2.1: The point-processing principle. A pixel in the input image is processed, and the result is saved in the output image at the same location.
Figure 2.2: The resulting image will be identical to the input image if b in Eq. 2.1 is zero. If b is negative, the resulting image will be darker.
If b is positive, the brightness of the resulting image will be increased.
The use of a graph, as shown in Fig. 2.3, is often a more convenient way of illustrating the brightness operation. The graph depicts the mapping of pixel values in the input image (horizontal axis) to pixel values in the output image (vertical axis). Such a graph is called a gray-level mapping. In the first graph the mapping does nothing, i.e., g(x,y) = f(x,y).
In the following graph, all pixel values are increased (b > 0), resulting in a brighter image. This has two effects: i) no pixel in the output image will be fully dark, and ii) some pixels in the output image would have a value greater than 255. The latter is not possible due to the upper limit of an 8-bit image, hence all pixels above 255 are set to 255, as shown by the horizontal section of the graph. When b < 0, some pixels will have negative values and are set to zero in the output, as shown in the previous graph.
You can adjust the contrast in the same way that you can adjust the brightness on your TV. An image's contrast describes how distinct its gray-level values are. When we look at two adjacent pixels with values 112 and 114, the human eye has trouble distinguishing them, and we say there is low contrast. If the pixels are 112 and 212, on the other hand, we can readily differentiate them and say the contrast is high.
Three instances of gray-level mapping are shown in Figure 2.3. The input is shown at the top. The three additional images are the result of applying the three gray-level mappings to the input. Eq. 2.1 is used in all three gray-level mappings.
Figure 2.4: If the value of a in Eq. 2.2 is one, the output image will be the same as the input image. If a is less than one, the resulting image will have less contrast; if a is greater than one, the resulting image will have more contrast.
Changing the slope of the graph changes the contrast of an image:

g(x,y) = a · f(x,y)

If a is greater than one, the contrast is increased; if it is less than one, the contrast is reduced. When a = 2, the pixels 112 and 114, for example, will have the values 224 and 228, respectively. The contrast is raised by a factor of two because the difference between them is increased by a factor of two. The effect of adjusting the contrast can be observed in Fig. 2.4. When the equations for brightness (Eq. 2.1) and contrast (Eq. 2.2) are combined, we get

g(x,y) = a · f(x,y) + b

which is the equation of a straight line. Consider an example of how to use this equation. Let's say we're interested in a section of the input image where the contrast isn't quite right. We therefore determine the range of pixel values in this region of the image and map them to the complete [0, 255] range in the output image. Assume that the region's minimum and maximum pixel values are 100 and 150, respectively.
Changing the contrast then means that in the output image all pixel values below 100 are set to zero, and all pixel values above 150 are set to 255. Eq. 2.3 is used to map the pixels in the range [100, 150] to [0, 255], where a and b are defined as follows:

a = 255 / (150 - 100) = 5.1,    b = -a · 100 = -510
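A minimal MATLAB sketch of this linear mapping (the test image name is an arbitrary choice; the clamping step implements the behaviour described above):
f = double(imread('pout.tif'));          % gray levels in [0, 255]
a = 255 / (150 - 100);                   % slope for the range of interest
b = -a * 100;                            % offset
g = a * f + b;                           % Eq. 2.3 applied to every pixel
g = min(max(g, 0), 255);                 % clamp values outside [0, 255]
imshow(uint8(g))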
Non-linear Gray-Level Mapping
Gray-level mapping is not confined to the linear mappings defined by Eq. 2.3. In fact, the designer is free to specify the gray-level mapping as desired, as long as each input value has exactly one output value. Rather than creating a new equation or graph, the designer will frequently use one that is already defined. The following are three of the most common non-linear mapping functions.
Gamma Mapping
Gamma mapping is a non-linear mapping of the gray-level values. Because humans have a non-linear sense of contrast, it is useful in various cameras and display devices (for example, flat panel televisions) to be able to adjust the contrast in the dark grey levels and the light grey levels separately. Gamma mapping is a typical non-linear mapping, defined for positive γ as

g(x,y) = f(x,y)^γ
Fig. 2.5: Curves of gamma-mapping for various gammas
Figure 2.5 depicts a few gamma-mapping curves. We get the identity mapping if γ = 1. For 0 < γ < 1 we boost the mid-levels to increase the dynamics in the dark regions. For γ > 1 we decrease the mid-levels to increase the dynamics in the bright regions. The gamma mapping is set up so that both the input and output pixel values lie between 0 and 1. Before applying the gamma transformation, the input pixel values must therefore first be transformed by dividing each pixel value by 255. After the gamma transformation, the output values are scaled back from [0, 1] to [0, 255].
A specific case is presented. A pixel with the value vin = 120 in a gray-scale picture is gamma mapped with γ = 2.22. First, the pixel value is divided by 255 to map it to the interval [0, 1]: v = 120/255 = 0.4706. Second, the gamma mapping is applied: v^γ = 0.4706^2.22 = 0.1876. Finally, the result is mapped back to the interval [0, 255]: vout = 0.1876 · 255 = 47. Figure 2.6 depicts some examples.
Figure 2.6: Gamma mapping with γ = 0.45 (left) and γ = 2.22 (right). The original image is in the middle.
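A short MATLAB sketch of the normalized gamma mapping described above (the image name is an arbitrary choice; im2double already maps the pixel values to [0, 1]):
f = im2double(imread('pout.tif'));     % pixel values in [0, 1]
gamma = 2.22;                          % gamma > 1 darkens the mid-levels
g = f .^ gamma;                        % gamma mapping
imshowpair(f, g, 'montage')
% single-pixel check from the worked example: floor((120/255)^2.22 * 255) gives 47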
Mapping on a Logarithmic Scale
An alternative non-linear mapping uses the logarithm operator: each pixel is replaced by the logarithm of its value. Low-intensity pixel values are amplified as a result. It is commonly employed when an image's dynamic range is too high to display or when there are a few bright spots on a dark background. Because the logarithm of zero is not defined, the mapping is defined as

g(x,y) = c · log(1 + f(x,y))

where c is a scaling constant that guarantees a maximum output value of 255. It is calculated as

c = 255 / log(1 + vmax)

where vmax is the input image's maximum pixel value. The behaviour of the logarithmic mapping can be altered by first changing the pixel values of the input image with a linear mapping. Figure 2.7 shows the logarithmic mapping from [0, 255] to [0, 255]. This mapping stretches low-intensity pixels while suppressing the contrast of high-intensity pixels. Figure 2.7 shows one example.
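As a minimal MATLAB sketch of this logarithmic mapping (the image name is an arbitrary choice; c is computed exactly as defined above):
f = double(imread('tire.tif'));        % example image with a wide dynamic range
c = 255 / log(1 + max(f(:)));          % scaling constant
g = c * log(1 + f);                    % logarithmic mapping
imshow(uint8(g))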
2.3.2 INTENSITY TRANSFORMATIONS
When working with grayscale images, it is common to want to change the intensity levels. For example, you might wish to flip the black and white intensities or make the darks darker and the lights lighter. Intensity transformations can be used to increase the contrast between particular intensity values so that details in an image become visible. The next two photos, for example, illustrate an image before and after an intensity transformation.
The cameraman's jacket was originally black, but an intensity transformation enhanced the contrast between the black intensity values, which were previously too close, allowing the buttons and pockets to be seen. (This example is taken from the Image Processing Toolbox User's Guide, Version 5 (MATLAB documentation), found in the help menu or online.)
In general, Intensity Transformation Functions are used to adjust the intensity. The four main intensity transformation functions are discussed in the following sections:
1. photographic negative (using imcomplement)
2. gamma transformation (using imadjust)
3. logarithmic transformations (using c*log(1+f))
4. contrast-stretching transformations (using g = 1./(1 + (m./(double(f)+eps)).^E))
● PHOTOGRAPHIC NEGATIVE
The photographic negative is the most straightforward of the intensity transformations. Assume we are dealing with grayscale double arrays with black equal to 0 and white equal to 1. The idea is that 0s become 1s and 1s become 0s, with any gradients in between reversed as well. In terms of intensity, this means that true black becomes true white and vice versa. MATLAB provides the function imcomplement(f) for producing photographic negatives. The graph below displays the mapping between the original values (x-axis) and the result of the imcomplement function, with a=0:.01:1.
An example of a photographic negative is shown below. Take note of how much easier it is to read the text in the middle of the tire now than it was before:
Original
Photographic Negative
The MATLAB code that created these two images is:
I=imread('tire.tif');      % read the input image
imshow(I)
J=imcomplement(I);         % photographic negative
figure, imshow(J)
● GAMMA TRANSFORMATIONS
Gamma transformations allow you to curve the grayscale components either to brighten the intensity (when gamma is less than one) or to darken it (when gamma is greater than one). These gamma transformations are created using the MATLAB function:
imadjust(f, [low_in high_in], [low_out high_out], gamma)
The input image is f, gamma defines the curve, and [low_in high_in] and [low_out high_out] define the clipping. Values below low_in and above high_in are clipped to low_out and high_out, respectively. In this lab, [] is used for both [low_in high_in] and [low_out high_out], which indicates that the full range of the input is mapped to the full range of the output. The plots below show the effect of varying gamma with a=0:.01:1. Notice that the red line has gamma=0.4, which creates an upward curve and will brighten the image.
The outcomes of three of the gamma transformations indicated in the plot above are shown below. Notice how values greater than one result in a darker image, whilst values between 0 and 1 result in a brighter image with more contrast in dark places, allowing you to appreciate the tire's details.
Original (and
gamma=1)
gamma=3
gamma=0.4
The MATLAB code that created these three images is:
I=imread('tire.tif');
J=imadjust(I,[],[],1);     % gamma = 1 (identity)
J2=imadjust(I,[],[],3);    % gamma = 3 (darker)
J3=imadjust(I,[],[],0.4);  % gamma = 0.4 (brighter)
imshow(J);
figure,imshow(J2);
figure,imshow(J3);
The gamma transformation is a crucial step in the image display process, and it is worth reading more about it. Charles Poynton, a digital video systems expert who previously worked for NASA, has an excellent gamma FAQ that is recommended reading, especially if you plan to handle CGI. He also dispels several common misunderstandings concerning gamma.
● LOGARITHMIC TRANSFORMATIONS
Logarithmic transformations (like gamma transformations with gamma < 1) can be used to brighten an image's intensity. They are most commonly used to boost the detail (or contrast) of low-intensity values, and are particularly good at bringing out detail in Fourier transforms (covered in a later lab). The equation for obtaining the logarithmic transform of image f in MATLAB is:
g = c*log(1 + double(f))
The constant c is typically used to scale the range of the log function to fit the input domain: c = 255/log(1+255) for a uint8 image, or c = 1/log(1+1) (about 1.45) for a double image. It can also be used to boost contrast: the higher the c value, the brighter the image appears. Used in this manner, however, the log function can produce values that are too bright to display. The graphic below shows the result for various values of c when a=0:.01:1. For the plots of c=2 and c=5, the min function clamps the y-values at 1 (teal and purple lines, respectively).
The original image and the outcomes of applying three of the transformations from above are shown below. When c=5, the image is the brightest, and the radial lines on the interior of the tire can be seen (these lines are barely visible in the original because there is not enough contrast in the lower intensities).
The MATLAB code that created these images is:
I=imread('tire.tif');
imshow(I)
I2=im2double(I);
J=1*log(1+I2);
J2=2*log(1+I2);
J3=5*log(1+I2);
figure, imshow(J)
figure, imshow(J2)
figure, imshow(J3)
Notice how the bright sections lose detail when intensity levels are capped. Any values generated by the scaling that are greater than one are displayed as 1 (full intensity) and should be clamped. The min(matrix, upper_bound) and max(matrix, lower_bound) functions in MATLAB can be used to clamp data, as indicated in the legend for the plot above.
Although logarithms can be calculated in a variety of bases, including MATLAB's built-in log10, log2, and log (natural log), the resulting curve is the same for all bases when the range is scaled to match the domain. Instead, the curve's shape is determined by the range of values to which it is applied. Here are some log curve examples for a variety of input values:
If you want to use logarithm transformations properly, you should be
aware of this effect. Here's what happens when you scale an image's
values to those ranges before applying the logarithm transform:
The MATLAB code that produced these images is:
tire = imread('tire.tif');
d = im2double(tire);
figure, imshow(d);
%log on domain [0,1]
f = d;
c = 1/log(1+1);
j1 = c*log(1+f);
figure, imshow(j1);
%log on domain [0, 255]
f = d*255;
c = 1/log(1+255);
j2 = c*log(1+f);
figure, imshow(j2);
%log on domain [0, 2^16]
f = d*2^16;
c = 1/log(1+2^16);
j3 = c*log(1+f);
figure, imshow(j3);
The effects of the logarithm transform are barely evident in domain
[0, 1], but they are greatly accentuated in domain [0, 65535]. It's also
worth noting that, unlike linear scaling and clamping, gross detail
remains visible in light areas.
● CONTRAST-STRETCHING TRANSFORMATIONS
Contrast-stretching transformations increase the contrast between the darks and the lights. In lab 1 we saw a simplified version of the automated contrast adjustment of section 5.3 of the textbook. That adjustment simply expanded the histogram to fill the image's intensity domain while keeping all levels at roughly the same relative position. Every now and again you might instead want to push the intensities towards a particular level of interest, so that there are only a few shades of grey around that level: everything darker becomes much darker and everything lighter becomes much lighter. In MATLAB, you can use the following function to make a contrast-stretching transformation:
g = 1./(1 + (m./(double(f) + eps)).^E)
The slope of the function is controlled by E, and the mid-point m is where you wish to switch from dark to bright values. eps is a MATLAB constant representing the distance between 1.0 and the next largest number that can be represented in double-precision floating point. It is used in this equation to prevent division by zero if the image contains any zero-valued pixels. The outcomes of adjusting both m and E are represented in the two plot sets below. Given a=0:.01:1 and m=0.5, the results for various values of E are plotted below.
The original image and the outcomes of applying the three transformations from above are shown below. The m value used in the following examples is the average of the image intensities (0.2104). For very high E values the function behaves like a thresholding function with threshold m: the resulting image is then more black-and-white than grayscale.
The MATLAB code that created these images is:
I=imread('tire.tif');
I2=im2double(I);
m=mean2(I2)
contrast1=1./(1+(m./(I2+eps)).^4);
contrast2=1./(1+(m./(I2+eps)).^5);
contrast3=1./(1+(m./(I2+eps)).^10);
imshow(I2)
figure,imshow(contrast1)
figure,imshow(contrast2)
figure,imshow(contrast3)
This second plot shows how changes to m (using E=4) affect the contrast
curve:
The following shows the original image and the results of applying the three transformations from above. The m values used below are 0.2, 0.5, and 0.7. Notice that 0.7 produces a darker image with fewer details for this tire image.
The MATLAB code that created these images is:
I=imread('tire.tif');
I2=im2double(I);
contrast1=1./(1+(0.2./(I2+eps)).^4);
contrast2=1./(1+(0.5./(I2+eps)).^4);
contrast3=1./(1+(0.7./(I2+eps)).^4);
imshow(I2)
figure,imshow(contrast1)
figure,imshow(contrast2)
figure,imshow(contrast3)
● The intrans and changeclass Functions
With the exception of the contrast stretching transform, the file intrans.m from Digital Image Processing Using MATLAB [2] provides a function that performs all of the intensity transformations discussed above. You should go through the code and figure out how to implement that missing feature. A second function named changeclass is used by the intrans function. The comments of the intrans function, which begin on the second line, explain how to use it. Take note of the description of the missing contrast stretch transform, which states that it should take a varying number of arguments, and which defaults to use for missing values. The table below shows how calls to intrans correspond to the four intensity transformation functions. Assume I=imread('tire.tif');

Transformation | Intensity Transformation Function | Corresponding intrans call
photographic negative | neg=imcomplement(I); | neg=intrans(I,'neg');
logarithmic | I2=im2double(I); log=5*log(1+I2); | log=intrans(I,'log',5);
gamma | gamma=imadjust(I,[],[],0.4); | gamma=intrans(I,'gamma',0.4);
contrast-stretching | I2=im2double(I); contrast=1./(1+(0.2./(I2+eps)).^5); | contrast=intrans(I,'stretch',0.2,5);
2.3.3 HISTOGRAM PROCESSING
● HISTOGRAMS INTRODUCTION
In digital image processing, the histogram is a graphical representation of a digital image: it plots the number of pixels at each tonal value. Today's digital cameras display the image histogram, and photographers use it to see the distribution of the tones captured.
The horizontal axis of the graph represents the tonal values, whereas the vertical axis represents the number of pixels at each particular tone. The left side of the horizontal axis depicts the black and dark areas, the middle represents medium grey, and the right side represents the light and pure white areas; the height of the graph reflects the size of each area.
Histogram of the scenery
APPLICATIONS OF HISTOGRAMS
1. Histograms are employed in software for simple computations in digital image processing.
2. They are a tool for analyzing images. A careful examination of the histogram can be used to predict image properties.
3. The image's brightness can be modified by looking at the features of its histogram.
4. Information on the x-axis of a histogram allows you to modify the image's contrast according to your needs.
5. They are used to equalize images. To create a high-contrast image, the grey level intensities are spread out along the x-axis.
6. Histograms are utilized in thresholding, which improves the image's appearance.
7. If we have the input and output histograms of an image, we can determine which type of transformation was applied.
HISTOGRAM PROCESSING TECHNIQUES
● HISTOGRAM SLIDING
In histogram sliding, the entire histogram is shifted to the right or to the left. When a histogram is shifted to the right or left, the brightness of the image changes clearly. The brightness of an image is determined by the intensity of light emitted by a particular light source.
● HISTOGRAM STRETCHING
The contrast of an image is boosted through histogram stretching. The contrast of an image is defined as the difference between the maximum and minimum pixel intensity values.
If we wish to increase the contrast of an image, we expand its histogram until it covers the entire dynamic range.
We may determine whether an image has low or high contrast by looking at its histogram.
● HISTOGRAM EQUALIZATION
Histogram equalization redistributes all of an image's pixel values. The transformation is carried out in such a way that the histogram becomes approximately flat.
Histogram equalization broadens the dynamic range of pixel values and aims for each level to contain an equal number of pixels, resulting in a flat histogram with high contrast.
When stretching a histogram, its shape remains the same; when equalizing a histogram, its shape changes. In both cases a single output image is generated.
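A minimal MATLAB sketch contrasting histogram stretching and histogram equalization (the low-contrast test image 'pout.tif' is an arbitrary choice; stretchlim/imadjust perform the stretching and histeq the equalization):
f = imread('pout.tif');                 % low-contrast grayscale image
g1 = imadjust(f, stretchlim(f), []);    % histogram stretching to the full range
g2 = histeq(f);                         % histogram equalization
figure, imhist(f)                       % original histogram
figure, imhist(g1)                      % stretched: same shape, wider range
figure, imhist(g2)                      % equalized: shape changes, roughly flat
imshowpair(g1, g2, 'montage')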
2.3.4 IMAGE SUBTRACTION
● IMAGE SUBTRACTION
Image enhancement and segmentation (where an image is divided into various 'interesting' elements like edges and regions) are two applications of this approach. The foundation is the subtraction of two images, defined as computing the difference between each pair of corresponding pixels in the two images. It can be written as

g(x,y) = f(x,y) - h(x,y)
A fascinating application is in medicine, where h(x,y) is called a mask and is subtracted from a succession of images fi(x,y), yielding some fascinating results. In this way it is possible, for example, to watch a dye propagate through the arteries of a person's brain. The portions that look the same in the images are darkened each time the difference is computed, while the differences become more highlighted (they are not subtracted out of the resulting image).
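A short MATLAB sketch of image subtraction (here a blurred copy of the image plays the role of the mask h(x,y); the image name and the blur amount are arbitrary choices):
f = imread('cameraman.tif');       % example image f(x,y)
h = imgaussfilt(f, 3);             % example mask image h(x,y)
d = imabsdiff(f, h);               % pixel-wise difference |f - h|
imshow(d, [])                      % the differences are highlighted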
● IMAGE AVERAGING
Consider a noisy image g(x,y), which is created by adding a certain amount of noise n(x,y) to an original image f(x,y):

g(x,y) = f(x,y) + n(x,y)

The noise is assumed to be uncorrelated (thus homogeneous across the image) and to have an average value of zero at each pair of coordinates (x,y). The goal is to lessen the noise effects by averaging a set of noisy images gi(x,y).
Assume we have an image that was created by averaging M noisy images:

ḡ(x,y) = (1/M) · Σ gi(x,y),  i = 1, ..., M

We now calculate the expected value of ḡ(x,y), which is

E{ ḡ(x,y) } = f(x,y)
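A minimal MATLAB sketch of noise reduction by image averaging (the image name, the noise level, and the number of observations M are arbitrary choices):
f = im2double(imread('cameraman.tif'));
M = 32;                                  % number of noisy observations
gbar = zeros(size(f));
for i = 1:M
    gi = f + 0.1*randn(size(f));         % g_i(x,y) = f(x,y) + n_i(x,y)
    gbar = gbar + gi;
end
gbar = gbar / M;                         % averaged image; noise variance drops by 1/M
imshowpair(f + 0.1*randn(size(f)), gbar, 'montage')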
2.4 LET US SUM UP
Enhancement aims to improve the quality of an image so that it may be used in a certain process. The word spatial refers to the image plane itself, i.e. direct pixel manipulation. Frequency domain processing approaches work by altering an image's Fourier transform. Histogram equalization is a typical approach for improving the appearance of images. We need to identify a transformation T that converts grey values r in the input image F to grey values s = T(r) in the converted image.
Figure 2 shows the original image, its histogram, and the equalized versions. Image smoothing can be done in a variety of ways; we also looked at edge-preserving smoothing. In averaging, each output pixel value is obtained from the average pixel value in a neighbourhood of (x,y) in the input image. Other mask shapes can cause strange things to happen to the image's frequency spectrum.
2.5 LIST OF REFERENCES
1. https://www.mv.helsinki.fi/home/khoramsh/4Image%20Enhancement%20in%20Spatial%20Domain.pdf
2. https://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/OWENS/LECT5/node3.html
3. https://www.sciencedirect.com/topics/engineering/spatial-domain
4. http://www.faadooengineers.com/online-study/post/cse/digital-imge-processing/674/spatial-domain-methods
5. https://www.google.com/search?q=point+processing+in+image+processing&rlz=1C1CHZN_enIN974IN974&oq=POINT+PROCESSING&aqs=chrome.1.0i512l10.1767j0j15&sourceid=chrome&ie=UTF-8
6. https://www.cs.uregina.ca/Links/class-info/425/Lab3/
7. https://www.javatpoint.com/dip-histograms#:~:text=In%20digital%20image%20processing%2C%20histograms,the%20details%20of%20its%20histogram
8. http://www.faadooengineers.com/online-study/post/ece/digital-image-processing/1123/image-subtraction-and-image-averaging
2.6 BIBLIOGRAPHY
1. https://www.mv.helsinki.fi/home/khoramsh/4Image%20Enhancement%20in%20Spatial%20Domain.pdf
2. https://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/OWENS/LECT5/node3.html
3. https://www.sciencedirect.com/topics/engineering/spatial-domain
4. http://www.faadooengineers.com/online-study/post/cse/digital-imge-processing/674/spatial-domain-methods
5. https://www.google.com/search?q=point+processing+in+image+processing&rlz=1C1CHZN_enIN974IN974&oq=POINT+PROCESSING&aqs=chrome.1.0i512l10.1767j0j15&sourceid=chrome&ie=UTF-8
6. https://www.cs.uregina.ca/Links/class-info/425/Lab3/
7. https://www.javatpoint.com/dip-histograms#:~:text=In%20digital%20image%20processing%2C%20histograms,the%20details%20of%20its%20histogram
8. http://www.faadooengineers.com/online-study/post/ece/digital-image-processing/1123/image-subtraction-and-image-averaging
2.7 UNIT END EXERCISES
1. What is the goal of spatial domain image enhancement?
2. What are the different types of filters used in the spatial domain?
3. What is meant by digital image shrinking?
4. What are intensity transformations and how do they work?
5. Which of the following processes broadens the range of intensity levels?
6. In digital image processing, what is histogram processing?
7. What exactly is the point of image subtraction?
8. How does applying an average filter to a digital image affect it?
9. What are the most common applications for smoothing filters?
10. Why is the frequency domain preferable to the time domain?
3
IMAGE AVERAGING SPATIAL FILTERING
Unit Structure
3.0 Objectives
3.1 Introduction
3.2 An Overview
3.3 Image Averaging Spatial Filtering
3.3.1 Smoothing Filters
3.3.2 Sharpening Filters
3.4 Frequency Domain Methods
3.4.1 Low Pass Filtering
3.4.2 High Pass Filtering
3.4.3 Homomorphic Filter
3.5 Let us Sum Up
3.6 List of References
3.7 Bibliography
3.8 Unit End Exercises
3.0 OBJECTIVES
● The spatial filtering technique operates directly on the pixels of an image. A mask is generally taken to be of odd size so that it has a distinct centre pixel. This mask is moved over the image so that its centre visits every pixel of the image.
● Spatial filtering is also frequently used to "clean up" laser output, reducing aberrations in the beam caused by poor, unclean, or damaged optics, or by fluctuations in the laser gain medium itself.
3.1 INTRODUCTION
Spatial filtering is a method of modifying the features of an optical image by selectively deleting certain spatial frequencies that make up an object, as in processing video data received from satellites and space probes, or removing the raster from a television broadcast or scanned image.
Average (or mean) filtering is a technique for smoothing images by lowering the intensity variation between adjacent pixels. The average filter moves through the image pixel by pixel, replacing each value with the average value of the neighbouring pixels, including itself.
Filtering is a method of altering or improving an image. In a spatial domain operation or filtering, the processed value for the current pixel depends on both itself and the adjacent pixels. Filters, or masks, will be defined below.
3.2 AN OVERVIEW
IMAGE ENHANCEMENT OVERVIEW
By working with noisy images we can study how to filter signals from noise in two dimensions. Two types of noise are considered: binary and Gaussian.
In the binary case, the user specifies a percentage value (a number between 0 and 100). This value gives the percentage of pixels in the image whose values will be completely lost; those pixels are randomly selected and set equal to the maximum grey level (corresponding to a white pixel).
In the Gaussian case, the value of each pixel x(k,l) is changed by additive white Gaussian noise to x(k,l)+n, with the noise n~N(0,v) being normally distributed and the variance v set by the user (a number between 0 and 2 in this exercise).
With binary noise the image is unchanged except for a set of points where the pixels are set to white. In the case of Gaussian noise the noisy image looks blurred.
Original Image
Image with binary noise
Image with Gaussian noise
Image enhancement is the process of removing noise from or sharpening images to increase image quality. Even though image enhancement is a well-established field, we will concentrate on two strategies based on the notion of filtering an original image to produce a restored or better image. Our filters may perform both linear and nonlinear operations.
1. Median filtering
In median filtering, a pixel is replaced by the median of the pixels in a window around it. That is,

y(k,l) = median{ x(i,j) : (i,j) in W }

where W is a suitable window surrounding the pixel. The median filtering algorithm entails sorting the pixel values in the window in ascending or descending order and selecting the middle value. In most cases, a square window of odd size is chosen.
2. Spatial averaging
In the case of spatial averaging, each pixel is replaced by the average of its neighbouring pixels. That is,

y(k,l) = (1/Nw) · Σ x(i,j),  summed over (i,j) in W

where W is the window and Nw is the number of pixels in the window. Because spatial averaging causes a distortion in the form of blurring, the size of the window W is limited in practice.
The idea is to add noise to an image and then recover it using the techniques described above. You will notice that the best image enhancement strategy depends on the type of noise as well as the amount and level of noise in the image.
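The following MATLAB sketch tries both approaches on an image corrupted with binary (salt-and-pepper) noise; the image name, the noise density, and the 3x3 window size are arbitrary choices:
f = imread('eight.tif');                       % example grayscale image
g = imnoise(f, 'salt & pepper', 0.05);         % binary noise on 5% of the pixels
avg = imfilter(g, fspecial('average', 3));     % 3x3 spatial averaging
med = medfilt2(g, [3 3]);                      % 3x3 median filtering
imshowpair(avg, med, 'montage')                % the median filter handles this noise better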
3.3 IMAGE AVERAGING AND SPATIAL FILTERING
SPATIAL FILTERING AND ITS TYPES
The spatial filtering technique operates directly on the pixels of an image. A mask is generally taken to be of odd size so that it has a distinct centre pixel. This mask is moved over the image so that its centre visits every pixel of the image.
Classification using linearity as a criterion:
There are two kinds:
1. Linear Spatial Filter
2. Non-linear Spatial Filter
Classification in general:
Smoothing Spatial Filter: A smoothing spatial filter is used to blur an image and reduce noise. Blurring is a pre-processing technique for removing minor details, and it is used to achieve noise reduction.
Types of Smoothing Spatial Filter:
1. Linear Filter (Mean Filter)
2. Order Statistics (Non-linear) Filter
These are explained in the next paragraphs.
1. Mean Filter: A linear spatial filter simply averages the pixels in the neighbourhood of the filter mask. The goal is to replace the value of each pixel in an image with the average of the grey levels in the neighbourhood of the filter mask.
Types of mean filter:
(i) Averaging filter: This filter is used to reduce image detail. All the coefficients are equal.
(ii) Weighted averaging filter: In this filter, pixels are multiplied by different coefficients; the centre pixel is given a higher weight than the others.
2. Order Statistics Filter:
This filter is based on the ordering of the pixels within the image region it covers. It substitutes the value indicated by the ranking result for the value of the centre pixel. This kind of filtering preserves edges better.
(i) Minimum filter: The 0th percentile filter is the minimum filter. The smallest value in the window replaces the value at the centre.
(ii) Maximum filter: The 100th percentile filter is the maximum filter. The largest value in the window replaces the value at the centre.
(iii) Median filter: Every pixel in the image is considered in turn. The surrounding pixels are first sorted, and the original value of the pixel is then replaced by the median of the list.
Sharpening Spatial Filter:
A sharpening spatial filter (also known as a derivative filter) serves the exact opposite purpose of the smoothing spatial filter. Its primary goal is to remove blurring and highlight the edges. First and second-order derivatives are used.
First order derivative:
● Must be zero in flat segments.
● Must be non-zero at the onset of a grey-level step or ramp.
● Must be non-zero along ramps.
The first order derivative in 1-D is given by:
f' = f(x+1) - f(x)
Second order derivative:
● Must be zero in flat areas.
● Must be non-zero at the onset and end of a grey-level step or ramp.
● Must be zero along ramps of constant slope.
The second order derivative in 1-D is given by:
f'' = f(x+1) + f(x-1) - 2f(x)
3.3.1 SMOOTHING FILTERS
SMOOTHING FILTERS
To reduce the amount of noise in an image, smoothing filters such as the Gaussian, Maximum, Mean, Median, Minimum, Non-Local Means, Percentile, and Rank filters can be used. Although these filters can efficiently reduce noise, they must be applied with caution so that crucial information in the image is not altered. It is also worth noting that, in most cases, edge detection or enhancement should follow smoothing.
● GAUSSIAN
● MEAN
● MEAN SHIFT
● MEDIAN
● NON-LOCAL MEANS
● GAUSSIAN
When you apply the Gaussian filter to an image, it blurs it and removes detail and noise. In this regard it is comparable to the mean filter. However, it uses a kernel that represents a Gaussian or bell-shaped hump. Unlike the mean filter, which produces an evenly weighted average, the Gaussian filter produces a weighted average of each pixel's neighbourhood, with the average weighted more towards the value of the central pixels. As a result, the Gaussian filter smooths the image more gently and preserves edges better than a mean filter of comparable size.
The frequency response of the Gaussian filter is one of the main justifications for adopting it for smoothing. Most convolution-based smoothing filters act as lowpass frequency filters; they remove high spatial frequency components from an image. By selecting an adequately large Gaussian you can be quite certain about what range of spatial frequencies will remain in the image after filtering, which is not the case with the mean filter. Computational biologists are also interested in the Gaussian filter since it has some biological plausibility; for example, some cells in the brain's visual pathways respond in a Gaussian fashion.
Because many edge-detection filters are sensitive to noise, Gaussian smoothing is typically applied before edge detection.
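A short MATLAB sketch of Gaussian smoothing (the image name, sigma, and kernel size are arbitrary choices; imgaussfilt and the fspecial-based kernel are Image Processing Toolbox functions):
f = imread('cameraman.tif');
g1 = imgaussfilt(f, 2);                          % Gaussian smoothing with sigma = 2
g2 = imfilter(f, fspecial('gaussian', 9, 2));    % equivalent explicit-kernel version
imshowpair(f, g1, 'montage')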
MEAN
Mean filtering is a straightforward technique for smoothing images and reducing noise by removing pixel values that are not representative of their surroundings. Mean filtering replaces each pixel value in an image with the mean or average of its neighbours, including itself.
The mean filter, like other convolution filters, is based on a kernel, which describes the shape and size of the neighbourhood sampled for calculating the mean. The most common kernel size is 3x3, but larger kernels may be used for more severe smoothing. It is worth noting that a small kernel can be applied multiple times to achieve a similar, but not identical, result to a single pass with a large kernel.
Although noise is reduced after mean filtering, the image is softened or blurred, and high-frequency detail is lost. This is mainly caused by the filter's limitations, which are as follows:
• A single pixel with a very atypical value can have a considerable impact on the mean value of all the pixels in its vicinity.
• When the filter neighbourhood straddles an edge, the filter will interpolate new values for the pixels on the edge. If crisp edges are required in the output, this can be a problem.
The median filter, which is more commonly employed for noise reduction than the mean filter, can address both of these concerns. Smoothing is also often done with other convolution filters that do not calculate the mean of a neighbourhood; the Gaussian filter is one of the most popular.
MEAN SHIFT
Mean shift filtering is based on a data clustering algorithm widely used in image processing and can be used for edge-preserving smoothing. For each pixel of the image, with its spatial location and grayscale value, the set of neighbouring pixels is determined. For this set of neighbouring pixels, the new spatial centre (spatial mean) and the new grayscale mean value are calculated. The calculated mean values serve as the new centre for the next iteration. The procedure is iterated until the spatial and grayscale means stop changing. At the end of the iteration, the final mean value is assigned to the starting point of the iteration.
MEDIAN
The median filter is typically used to reduce image noise, and it can often preserve image detail and edges better than the mean filter. Like the mean filter, this filter examines each pixel in the image individually and compares it with its neighbours to determine whether it is representative of its surroundings. Instead of replacing the pixel value with the mean of the neighbouring pixel values, however, the median of those values is used. Median filters are especially good at removing the random intensity spikes that commonly appear in microscope images.
The operation of this filter is depicted in the diagram below. The median is derived by numerically ordering all of the pixel values in the surrounding neighbourhood, in this case a 3x3 square, and then replacing the pixel in question with the middle pixel value.
Median filter
The centre pixel value of 150, as seen in the picture, is not typical of the surrounding pixels and is replaced with the median value of 124. It is worth noting that larger neighbourhoods result in more severe smoothing.
The median filter has two key advantages over the mean filter, since it calculates the median value of a neighbourhood rather than the mean:
● The median is more robust than the mean, so a single very unrepresentative pixel in a neighbourhood will not substantially affect the median value. This matters, for example, in datasets contaminated with salt-and-pepper noise (scattered dots).
● Since the median value must be the value of one of the pixels in the neighbourhood, the median filter does not create unrealistic pixel values when the filter straddles an edge. For this reason, it is much better at preserving sharp edges than the mean filter.
● However, the median filter is sometimes not as subjectively good at dealing with large amounts of Gaussian noise as the mean filter. It is also relatively complex to compute.
NON-LOCAL MEANS
Unlike the mean filter, which smooths an image by taking the mean of a set of pixels surrounding a target pixel, the non-local means filter takes the mean of all pixels in the image, weighted by their similarity to the target pixel. Compared with mean filtering, this filter can yield better post-filtering clarity with minimal loss of detail. When smoothing noisy images, the non-local means or bilateral filter should be your first choice in many circumstances.
It is worth noting that non-local means filtering works best when the noise in the data is white noise, in which case most visual characteristics, including small and thin ones, will be preserved.
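A minimal MATLAB sketch of non-local means denoising, assuming a recent Image Processing Toolbox release that provides imnlmfilt (the image name and noise variance are arbitrary choices):
f = im2double(imread('cameraman.tif'));
g = imnoise(f, 'gaussian', 0, 0.01);     % additive white Gaussian noise
d = imnlmfilt(g);                        % non-local means filtering
imshowpair(g, d, 'montage')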
3.3.2 SHARPENING FILTERS
Image preprocessing has long been a part of computer vision, and it can considerably improve the performance of machine learning models. Image processing includes applying several sorts of filters to an image. Filters can help reduce image noise while also enhancing the image's features.
Sharpening filters are discussed below.
• When compared to smooth and blurry images, sharpening filters make the transitions between features more recognizable and evident.
• What happens when a sharpening filter is applied to an image? Pixels that are brighter than their neighbours are made brighter still (boosted).
Sharpening or blurring an image can be reduced to a series of matrix arithmetic operations. When we apply a filter to an image, we perform a convolution operation on it with a given kernel. A kernel is a square matrix of dimensions n×n.
CONVOLUTION AND KERNEL
Each image can be represented as a matrix, with its features represented as numerical values, and we use convolution with various kernels to extract or enhance different features.
Convolution is the process of adding each element of the image to its nearby neighbours, weighted by the kernel; it corresponds to the mathematical operation of convolution. Although it is denoted by "*", the matrix operation being performed (convolution) is not ordinary matrix multiplication.
The kernel determines the type of operation we are doing, such as sharpening, blurring, edge detection, Gaussian blurring, and so on.
A commonly used 3x3 sharpening kernel, for example, is

 0  -1   0
-1   5  -1
 0  -1   0
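To see the effect of convolving with such a kernel, here is a minimal MATLAB sketch (the image name and the boundary option are arbitrary choices):
f = im2double(imread('cameraman.tif'));
k = [0 -1 0; -1 5 -1; 0 -1 0];              % common 3x3 sharpening kernel
g = imfilter(f, k, 'replicate', 'conv');    % convolve the kernel with the image
imshowpair(f, g, 'montage')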
SHARPENING
• Sharpening is a technique for enhancing the transitions between features and details by emphasizing and highlighting the edges. Sharpening, however, does not distinguish between the image's original features and the noise associated with it: it enhances both.
Blurring vs Sharpening
● Blurring: Blurring/smoothing is accomplished in the spatial domain by averaging each pixel with its neighbours, which produces a blurring effect. It is an integration process.
● Sharpening: Sharpening is a technique for identifying and emphasizing differences in the neighbourhood. It is a differentiation process.
Sharpening Filters of Various Types
1) High Boost Filtering and Unsharp Masking
Using a smoothing filter, we can sharpen an image or perform edge enhancement:
1. Blur the image. Blurring suppresses most of the high-frequency components.
2. Output (Mask) = Original Image - Blurred Image. Most of the high-frequency components that were blocked by the blurring filter are present in this output.
3. By adding the mask to the original image, the high-frequency components are enhanced.
This procedure is called UNSHARP MASKING, since we are using a blurred image to create our own mask. The unsharp mask m(x,y) can therefore be written as

m(x,y) = f(x,y) - fb(x,y)

where
● f(x,y) = original image
● fb(x,y) = blurred image
Adding this mask to the original image enhances the high-frequency components:

g(x,y) = f(x,y) + k · m(x,y)

The value k determines how much weight is given to the mask that is being added:
1. k = 1 corresponds to unsharp masking.
2. k > 1 corresponds to high boost filtering, since we are boosting the high-frequency components (edge features) by adding higher weights of the mask to the image.
Like most other sharpening filters, this approach will not yield adequate results if the image contains noise. A sketch of the procedure is given below.
We can also obtain the mask directly, without subtracting a blurred image from the original, by using a negative Laplacian filter.
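A minimal MATLAB sketch of unsharp masking / high boost filtering as described above (the image name, the blur sigma, and the value of k are arbitrary choices):
f = im2double(imread('moon.tif'));     % example image
fb = imgaussfilt(f, 2);                % blurred copy
m = f - fb;                            % the unsharp mask
k = 1;                                 % k = 1: unsharp masking; k > 1: high boost
g = f + k*m;                           % add the weighted mask back to the original
imshowpair(f, g, 'montage')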
2) Laplacian Filters
A Laplacian filter is a second-order derivative mask. It responds to both inward and outward edges. The use of the second-order derivative helps determine whether the changes we are seeing are caused by gradual pixel changes in continuous regions or by an edge.
A general Laplacian kernel has either a positive centre with negative values in a cross pattern around it, or, with the opposite sign convention, a negative centre with positive values in the cross.
To derive this kernel matrix, knowledge of partial derivatives and the Laplacian operator is required. Let us consider our image as a function of two variables, f(x,y). We will be dealing with partial derivatives along the two spatial axes.
Discrete form of the Laplacian:

∇²f = f(x+1,y) + f(x-1,y) + f(x,y+1) + f(x,y-1) - 4 f(x,y)

The resulting Laplacian matrix is

 0   1   0
 1  -4   1
 0   1   0

Effects of Laplacian operators:
• They emphasize and intensify grey-level discontinuities while de-emphasizing continuous regions (regions without edges), i.e. slowly varying intensities.
We will use some approximate Laplacian filters for our programming. Let us perform sharpening using different methods.
Let us perform sharpening using different methods
Using OpenCV as a tool
OpenCV is a python -based library for dealing with computer visi on issues.
Let's have a look at the code below and figure out what's going on.
● We'll start by importing the libraries we'll need to sharpen our image.
● Numpy -> For conducting quick matrix operations OpenCV -> For
image operations
● cv2.imread -> cv2.imread -> cv2.imread -> To read the input image
from our disc in the form of a numpy array.
● cv2.scale -> To resize our image to fit in the dimensions of (400, 400).
munotes.in
Page 75
Spatial Domain Methods
75 • kernel -> kernel is a 3X3 matrix that we define based on how we want to
slide the picture acros s for convolution.
• cv2.filter2D -> cv2.filter2D -> cv2.filter2D To convolve a kernel with an
image, Opencv includes a function called filter2D.
It accepts three parameters as input:
1. img -> picture input
2. ddepth -> the depth of the output image
3. ke rnel-> kernel of convolution
This is how we can use OpenCV to conduct sharpening.
Changing the magnitudes of the kernel matrix allows us to experiment
with the kernel to obtain different levels of sharpened images.
Original Image
● ImageFilter has a number of pre-defined filters, such as sharpen and blur, that may be used with the filter() method.
● We sharpen our image twice and save the results in the sharp1 and sharp2 variables.
Image after 1st sharp operation
Image after 2nd sharp operation
Sharpening effects can be seen, with the features becoming brighter and
more distinguishable.
3.4 FREQUENCY DOMAIN METHODS
Frequency domain methods
In the frequency domain, image enhancement is conceptually simple. To create the enhanced image, we compute the Fourier transform of the image to be enhanced, multiply the result by a filter (rather than convolving in the spatial domain), and then take the inverse transform.
The idea of blurring an image by reducing the magnitude of its high-frequency components, or sharpening an image by increasing the magnitude of its high-frequency components, is intuitively simple. However, implementing similar operations as convolutions with small spatial filters in the spatial domain is usually more computationally efficient.
Understanding frequency domain concepts is nevertheless important, since it leads to enhancement techniques that might otherwise go unnoticed if attention were focused solely on the spatial domain.
Filtering
Low pass filtering removes the high-frequency components of an image. This blurs the image (and thus reduces the sharp transitions associated with noise). An ideal low pass filter would retain all low-frequency components and eliminate all high-frequency components. Ideal filters, however, suffer from two problems: blurring and ringing. These problems are caused by the shape of the corresponding spatial domain filter, which has a large number of undulations. Smoother transitions in the frequency-domain filter, such as in the Butterworth filter, produce much better results.
Figure 5: An ideal low pass filter's transfer function.
3.4.1 LOW PASS FILTERING
• The high-frequency content of an image's Fourier transform is heavily influenced by edges and sudden changes in gray values.
• Regions of relatively uniform gray values in an image contribute to the low-frequency content of its Fourier transform.
• As a result, an image can be smoothed in the frequency domain by attenuating the high-frequency content of its Fourier transform. This is a lowpass filter.
• For simplicity, we discuss only real and radially symmetric filters.
• An ideal lowpass filter with cutoff frequency r0 is

H(u,v) = 1  if D(u,v) ≤ r0
H(u,v) = 0  if D(u,v) > r0

where D(u,v) = sqrt(u² + v²) is the distance from the origin of the frequency plane.
Ideal LPF with r0 = 57
The origin (0, 0) is in the image's center, not its corner (remember the "fftshift" operation).
• The sudden transition of the transfer function H(u,v) from 1 to 0 is impossible to achieve in practice with electrical components. It can, however, be simulated on a computer.
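A minimal MATLAB sketch of such a simulation (the image name is an arbitrary choice; r0 = 57 matches the figure referred to above):
f = im2double(imread('cameraman.tif'));
[M, N] = size(f);
F = fftshift(fft2(f));                                     % centred spectrum
[u, v] = meshgrid(-floor(N/2):ceil(N/2)-1, -floor(M/2):ceil(M/2)-1);
D = sqrt(u.^2 + v.^2);                                     % distance from the origin
r0 = 57;                                                   % cutoff frequency
H = double(D <= r0);                                       % ideal lowpass transfer function
g = real(ifft2(ifftshift(H .* F)));                        % filtered image (note the ringing)
imshowpair(f, g, 'montage')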
Ideal LPF examples
The blurred images have a pronounced ringing effect, which is a hallmark of ideal filters. The discontinuity in the filter transfer function is to blame.
Choosing the cutoff frequency of an ideal LPF
• The cutoff frequency r0 of the ideal LPF determines the number of frequency components passed by the filter.
• The smaller the value of r0, the more image components are removed by the filter.
• In general, the value of r0 is selected so that most components of interest pass through while most non-interesting components are removed. This is usually a set of conflicting requirements; we will look at some of the details in image restoration.
• A useful way to establish a set of standard cutoff frequencies is to compute circles that contain a given fraction of the total image power.
• The total image power is

PT = Σu Σv P(u,v),  for u = 0, ..., M-1 and v = 0, ..., N-1,  where P(u,v) = |F(u,v)|²

• Consider, as the cutoff frequency, a circle of radius r0(α) relative to a threshold α such that

Σu Σv P(u,v) = α · PT

where the sum is taken over the points (u,v) inside the circle.
• We can then fix a threshold α and compute an appropriate cutoff frequency r0(α).
• A two-dimensional Butterworth lowpass filter has the transfer function

H(u,v) = 1 / (1 + [D(u,v)/r0]^(2n))

• r0: cutoff frequency, n: filter order.
• Because its frequency response does not have the sharp transition of the ideal LPF, it is better suited for image smoothing: it does not introduce ringing.
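A minimal MATLAB sketch of Butterworth lowpass filtering (the image name, cutoff, and order are arbitrary choices; the frequency grid is built as in the ideal-LPF sketch above):
f = im2double(imread('cameraman.tif'));
[M, N] = size(f);
[u, v] = meshgrid(-floor(N/2):ceil(N/2)-1, -floor(M/2):ceil(M/2)-1);
D = sqrt(u.^2 + v.^2);
r0 = 18; n = 2;                                        % cutoff frequency and filter order
H = 1 ./ (1 + (D./r0).^(2*n));                         % Butterworth LPF transfer function
g = real(ifft2(ifftshift(H .* fftshift(fft2(f)))));    % smoothed image, little ringing
imshowpair(f, g, 'montage')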
Butterworth LPF example
Original Image LPF image, r0 =18
Butterworth LPF example: False contouring
Image with false contouring due to insufficient bits used for quantization
Lowpass filtered version of previous image
Butterworth LPF example: Noise filtering
Low-pass Gaussian filters
• In two dimensions, a Gaussian lowpass filter has the form

H(u,v) = e^( -D(u,v)² / (2σ²) )

where D(u,v) = sqrt(u² + v²) is the distance from the origin of the frequency plane.
• The parameter σ represents the spread or dispersion of the Gaussian curve. The larger the value of σ, the higher the cutoff frequency and the milder the filtering.
• The filter falls to 0.607 of its maximum value of 1 when D(u,v) = σ.
3.4.2 HIGH PASS FILTERING
HIGHPASS FILTERING
• The high-frequency content of an image's Fourier transform is heavily influenced by edges and sudden transitions in gray values.
• Regions of relatively uniform gray values in an image contribute to the low-frequency content of its Fourier transform.
• As a result, image sharpening in the frequency domain can be accomplished by attenuating the low-frequency content of the Fourier transform. This is a highpass filter.
• For simplicity, only real and radially symmetric filters will be considered.
• An ideal highpass filter with cutoff frequency r0 is

H(u,v) = 0  if D(u,v) ≤ r0
H(u,v) = 1  if D(u,v) > r0
Ideal HPF with r0 = 36
The origin (0, 0) is in the image's centre, not its corner (remember the "fftshift" operation).
• The sudden transition of the transfer function H(u,v) from 1 to 0 is impossible to achieve in practice with electrical components. It can, however, be simulated on a computer.
Ideal HPF examples
• Note how the output images have a strong ringing effect, which is a hallmark of ideal filters. The discontinuity in the filter transfer function is to blame.
• A two-dimensional Butterworth highpass filter has the transfer function

H(u,v) = 1 / (1 + [r0/D(u,v)]^(2n))

• n: filter order, r0: cutoff frequency.
• Because its frequency response does not have the sharp transition of the ideal HPF, it is better suited for image sharpening: it does not introduce ringing.
Butterworth HPF example
High-pass Gaussian filters
• In two dimensions, a Gaussian highpass filter has the form

H(u,v) = 1 - e^( -D(u,v)² / (2σ²) )

where D(u,v) = sqrt(u² + v²) is the distance from the origin of the frequency plane.
• The larger the value of σ, the higher the cutoff frequency and the harsher the filtering.
3.4.3 HOMOMORPHIC FILTER
HOMOMORPHIC FILTERING
Images are formed from light reflected from objects. The image F(x,y) has two basic components: (1) the amount of source light incident on the scene being viewed, and (2) the amount of light reflected by the objects in the scene. These are called the illumination and reflectance components of the light, and are denoted i(x,y) and r(x,y), respectively. The image function F is the product of the functions i and r:

F(x,y) = i(x,y) r(x,y),

where 0 < i(x,y) < ∞ and 0 < r(x,y) < 1. We cannot easily use this product to operate separately on the frequency components of illumination and reflectance, because the Fourier transform of the product of two functions is not separable; that is,

ℱ{ F(x,y) } ≠ ℱ{ i(x,y) } ℱ{ r(x,y) }.

Suppose, on the other hand, that we define

z(x,y) = ln F(x,y) = ln i(x,y) + ln r(x,y).

Then

ℱ{ z(x,y) } = ℱ{ ln i(x,y) } + ℱ{ ln r(x,y) },

or

Z(u,v) = I(u,v) + R(u,v),

where Z, I, and R are the Fourier transforms of z(x,y), ln i(x,y), and ln r(x,y), respectively. The function Z thus represents the Fourier transform of the sum of two images: a low-frequency illumination image and a high-frequency reflectance image.
Figure 6: Transfer function for homomorphic filtering.
We may now suppress the illumination component and enhance the reflectance component by using a filter whose transfer function suppresses low-frequency components and enhances high-frequency components. Thus

S(u,v) = H(u,v) Z(u,v) = H(u,v) I(u,v) + H(u,v) R(u,v),

where S is the Fourier transform of the result. In the spatial domain,

s(x,y) = ℱ⁻¹{ S(u,v) } = ℱ⁻¹{ H(u,v) I(u,v) } + ℱ⁻¹{ H(u,v) R(u,v) }.

By letting

i'(x,y) = ℱ⁻¹{ H(u,v) I(u,v) }  and  r'(x,y) = ℱ⁻¹{ H(u,v) R(u,v) },

we get

s(x,y) = i'(x,y) + r'(x,y).

Finally, because z was obtained by taking the logarithm of the original image F, the inverse operation (exponentiation) produces the desired enhanced image:

g(x,y) = e^{s(x,y)} = e^{i'(x,y)} · e^{r'(x,y)}.
As a result, the homomorphic filtering process can be summarised by the following figure:
Figure 7: The process of homomorphic filtering.
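A minimal MATLAB sketch of this homomorphic filtering pipeline (the image name, the particular high-emphasis transfer function, and the parameters gL, gH, c, and D0 are arbitrary, assumed choices):
f = im2double(imread('pout.tif'));
z = log(1 + f);                                  % logarithm (1 added to avoid log(0))
[M, N] = size(f);
[u, v] = meshgrid(-floor(N/2):ceil(N/2)-1, -floor(M/2):ceil(M/2)-1);
D2 = u.^2 + v.^2;
gL = 0.5; gH = 2.0; c = 1; D0 = 30;              % assumed filter parameters
H = (gH - gL) * (1 - exp(-c * D2 / D0^2)) + gL;  % suppress low, boost high frequencies
s = real(ifft2(ifftshift(H .* fftshift(fft2(z)))));
g = exp(s) - 1;                                  % invert the logarithm
imshow(mat2gray(g))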
3.5 LET US SUM UP
The spatial filtering technique operates directly on the pixels of an image. A mask is generally taken to be of odd size so that it has a distinct centre pixel. Average (or mean) filtering is a technique for smoothing images by lowering the intensity variation between adjacent pixels. The best image enhancement strategy depends on the type of noise as well as the amount and level of noise in the image. Our filters may perform both linear and nonlinear operations.
We concentrated on two strategies based on the notion of filtering an original image. The averaging filter is used to reduce image detail, while order-statistics filtering preserves edges better. A sharpening spatial filter serves the exact opposite purpose of the smoothing spatial filter: its primary goal is to remove blurring and highlight the edges. First and second-order derivatives are used.
3.6 LIST OF REFERENCES
1. https://www.geeksforgeeks.org/spatial-filtering-and-its-types/
2. http://www.seas.ucla.edu/dsplab/ie/over.html
3. https://www.geeksforgeeks.org/spatial-filtering-and-its-types/
4. https://www.theobjects.com/dragonfly/dfhelp/3-5/Content/05_Image%20Processing/Smoothing%20Filters.htm#:~:text=Mean%20filtering%20is%20a%20simple,of%20its%20neighbors%2C%20including%20itself
5. http://saravananthirumuruganathan.wordpress.com/2010/04/01/introduction-tomean-shift-algorithm/
6. https://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/OWENS/LECT5/node4.html#:~:text=Image%20enhancement%20in%20the%20frequency,to%20produce%20the%20enhanced%20image
7. file:///E:/MY%20IMP%20documents/Lecture_9.pdf
8. file:///E:/MY%20IMP%20documents/Lecture_9.pdf
3.7 BIBLIOGRAPHY
1. https://www.geeksforgeeks.org/spatial-filtering-and-its-types/
2. http://www.seas.ucla.edu/dsplab/ie/over.html
3. https://www.geeksforgeeks.org/spatial-filtering-and-its-types/
4. https://www.theobjects.com/dragonfly/dfhelp/3-5/Content/05_Image%20Processing/Smoothing%20Filters.htm
5. http://saravananthirumuruganathan.wordpress.com/2010/04/01/introduction-tomean-shift-algorithm/
6. https://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/OWENS/LECT5/node4.html
7. file:///E:/MY%20IMP%20documents/Lecture_9.pdf
8. file:///E:/MY%20IMP%20documents/Lecture_9.pdf
9. https://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/OWENS/LECT5/node4.html
3.8 UNIT END EXERCISES
Q1. Why does the averaging filter cause the image to blur?
Q2. How does applying an average filter to a digital image affect it?
Q3. What does it mean to sharpen spatial filters?
Q4. What is the primary purpose of image sharpening?
Q5. What is the best way to sharpen an image?
Q6. How do you figure out what a low -pass filter's cutoff frequency is?
Q7. What is the purpose of a low -pass filter?
Q8. What is the effect of high pass filtering on an image?
Q9. In homomorphic filtering, which filter is used?
Q10. In homomorphic filtering, which high -pass filter is used?
Module III
4
DISCRETE FOURIER TRANSFORM -I
Unit Structure
4.1 Objectives
4.2 Introduction
4.3 Properties of DFT
4.4 FFT algorithms – direct, divide and conquer approach
4.4.1 Direct Computation of the DFT
4.4.2 Divide-and-Conquer Approach to Computation of the DFT
4.5 2D Discrete Fourier Transform (DFT) and Fast Fourier Transform
(FFT)
4.5.1 2D Discrete Fourier Transform (DFT)
4.5.2 Computational speed of FFT
4.5.3 Practical considerations
4.6 Summary
4.7 References
4.8 Unit End Exercises
4.1 OBJECTIVES
After going through this unit, you will be able to:
● Understand the fundamental concepts of digital image processing
● Discuss mathematical transforms
● Describe the DCT and DFT techniques
● Classify different types of image transforms
● Examine the use of Fourier transforms for image processing in the
frequency domain
4.2 INTRODUCTION
In the realm of image processing, the Fourier transform is commonly employed. An image is a function that varies in space. Decomposing an image into a series of orthogonal functions, one choice being the Fourier basis functions, is one technique for analysing spatial fluctuations. An intensity image is transformed into the spatial frequency domain using the Fourier transform.
The sampling process converts a continuous-time signal x(t) into a discrete-time signal x(nT), where T is the sample interval:
x(t) --sampling--> x(nT)
The Fourier transform of a finite-energy discrete-time signal x(nT) is given by [1]
X(e^jω) = Σ (n = −∞ to ∞) x(nT) e^(−jωn)
where X(e^jω) is a continuous function of ω and is known as the Discrete-Time Fourier Transform (DTFT).
The relationship between ω and Ω is defined by
ω = ΩT
Replacing Ω by 2πf
ω =2πf ×T
where T is the sampling interval and is equal to 1/fs. Replacing T by 1/fs
ω = 2π f × 1/fs
where fs is the sampling frequency
ω = 2π (f / fs) = 2πk, where k = f / fs is the normalised frequency.
To limit the infinite number of frequency values to a finite number, the equation is modified by evaluating it at N equally spaced frequencies ω = 2πk/N, k = 0, 1, ..., N − 1.
The Discrete Fourier Transform (DFT) of a finite-duration sequence x(n) is defined as
X(k) = Σ (n = 0 to N−1) x(n) e^(−j2πnk/N), where k = 0, 1, ..., N − 1.
The DFT thus maps a discrete signal onto a basis of complex sinusoids.
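A direct implementation of this definition (a sketch, not part of the text) makes the O(N²) structure explicit and can be checked against a library FFT:

import numpy as np

def dft(x):
    """Direct O(N^2) evaluation of X(k) = sum_n x(n) exp(-j 2 pi n k / N)."""
    x = np.asarray(x, dtype=complex)
    N = x.size
    n = np.arange(N)
    W = np.exp(-2j * np.pi * np.outer(n, n) / N)   # W[k, n] = e^{-j 2 pi k n / N}
    return W @ x

x = np.array([1.0, 2.0, 3.0, 4.0])
assert np.allclose(dft(x), np.fft.fft(x))          # agrees with the library FFT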
4.3 PROPERTIES OF DFT
Consider applying the DFT to a combination of two periodic sequences x1(n) and x2(n). Because the DFT is defined over a single period, the DFT of the combination must have a single periodicity to be well defined. As in the continuous case, there are three types of combination: the linear combination a·x1 + b·x2, the convolution of x1 and x2, and the product x1·x2. In each case, every sample x1(i) is paired with the corresponding sample x2(i). As a result, x1(n) and x2(n) must have the same periodicity N, and the resulting sequence also has periodicity N. If the two sequences have distinct periodicities N1 and N2, zero padding transforms the sequence of periodicity N1 into one of periodicity N2 by adding zeros at the end of the N1-point sequence.
i) Linearity Property :
Let X1(k) = DFT of x1(n) & X2(k) = DFT of x2(n)
∴ DFT {a x1(n) + b x2(n) } = a X1(k) + b X2(k) where a,b are
constants.
ii) Periodicity :
If a sequence x(n) is periodic with periodicity N, then its N-point DFT X(k) is also periodic with periodicity N.
Let x(n + N) = x(n) for all n.
Then X(k + N) = X(k) for all k.
iii) Circular Time shift :
It states that if a discrete-time signal is circularly shifted in time by m units, then its DFT is multiplied by e^(−j2πkm/N).
If DFT{x(n)} = X(k), then DFT{ x((n − m) mod N) } = X(k) e^(−j2πkm/N).
iv) Circular Frequency shift :
If a discrete-time signal is multiplied by e^(j2πmn/N), then its DFT is circularly shifted by m units.
If DFT{x(n)} = X(k), then DFT{ x(n) e^(j2πmn/N) } = X((k − m) mod N).
v) Multiplication :
The DFT of the product of two discrete-time sequences is equal to the circular convolution of the DFTs of the individual sequences, scaled by the factor 1/N.
If DFT{x1(n)} = X1(k) and DFT{x2(n)} = X2(k), then DFT{ x1(n) x2(n) } = (1/N) [ X1(k) ⊛ X2(k) ].
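The circular time-shift and multiplication properties can be verified numerically. The short sketch below (an illustration using NumPy and random test sequences) checks both identities:

import numpy as np

N = 8
x  = np.random.rand(N)
x1 = np.random.rand(N)
x2 = np.random.rand(N)
k  = np.arange(N)
m  = 3                                              # shift amount

# Circular time shift: DFT{x((n - m) mod N)} = X(k) * exp(-j 2 pi k m / N)
lhs = np.fft.fft(np.roll(x, m))
rhs = np.fft.fft(x) * np.exp(-2j * np.pi * k * m / N)
assert np.allclose(lhs, rhs)

# Multiplication: DFT{x1 * x2} = (1/N) * circular convolution of X1 and X2
X1, X2 = np.fft.fft(x1), np.fft.fft(x2)
circ_conv = np.array([np.sum(X1 * np.roll(X2[::-1], q + 1)) for q in range(N)])
assert np.allclose(np.fft.fft(x1 * x2), circ_conv / N)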
4.4 FFT ALGORITHMS – DIRECT, DIVIDE AND CONQUER APPROACH [2]
DFT calculation is made more efficient using FFT algorithms. The method, which employs a divide-and-conquer strategy, reduces a DFT of size N, where N is a composite number, to the computation of smaller DFTs from which the bigger DFT is computed. We describe essential computational strategies, known as fast Fourier transform (FFT) algorithms, for computing the DFT when the size N is a power of two or a power of four.
The computational problem for the DFT is to compute the sequence {X(k)} of N complex-valued numbers given another sequence of data x(n) of length N:
X(k) = Σ (n = 0 to N−1) x(n) W_N^(kn), k = 0, 1, ..., N − 1,
where
W_N = e^(−j2π/N).
Similarly, the IDFT is given as
x(n) = (1/N) Σ (k = 0 to N−1) X(k) W_N^(−kn), n = 0, 1, ..., N − 1.
We see that direct computation of X(k) requires N complex multiplications (4N real multiplications) and N − 1 complex additions (4N − 2 real additions) for each value of k. As a result, computing all N DFT values necessitates N² complex multiplications and N² − N complex additions.
Direct DFT computation is inefficient primarily because it does not take advantage of the symmetry and periodicity properties of the phase factor W_N. These two properties in particular are:
Property of symmetry: W_N^(k + N/2) = −W_N^k
Property of periodicity: W_N^(k + N) = W_N^k
These two essential properties of the phase factor are used by the computationally efficient algorithms presented in this section, commonly known as fast Fourier transform (FFT) algorithms.
4.4.1 Direct Computation of the DFT
For a complex-valued sequence x(n) of N points, the DFT may be expressed as
XR(k) = Σ (n = 0 to N−1) [ xR(n) cos(2πkn/N) + xI(n) sin(2πkn/N) ]
XI(k) = −Σ (n = 0 to N−1) [ xR(n) sin(2πkn/N) − xI(n) cos(2πkn/N) ]
The direct computation of the above equations requires:
● 2N² evaluations of trigonometric functions.
● 4N² real multiplications.
● 4N(N − 1) real additions.
● A number of indexing and addressing operations.
These are common operations in DFT computational techniques. The DFT values XR(k) and XI(k) are obtained by the procedures in items 2 and 3. To retrieve the data x(n), 0 ≤ n ≤ N − 1, and the phase factors, as well as to store the results, indexing and addressing procedures are required. Each of these computing operations is optimized differently by the various DFT methods.
4.4.2 Divide-and-Conquer Approach to Computation of the DFT
If we take a divide-and-conquer approach, we can design computationally efficient DFT algorithms. This approach is based on decomposing an N-point DFT into smaller and smaller DFTs. The FFT algorithms are a class of computationally efficient algorithms based on this basic principle.
To illustrate, consider the computation of an N-point DFT, where N can be factored as a product of two integers, that is,
N = LM
Because we can pad any sequence with zeros to secure a factorization of the above form, the condition that N is not a prime integer is not limiting.
As shown in Fig. 1, the sequence x(n), 0 ≤ n ≤ N − 1, can be stored either in a one-dimensional array indexed by n or in a two-dimensional array indexed by l and m, where 0 ≤ l ≤ L − 1 and 0 ≤ m ≤ M − 1. The row index is l, and the column index is m. As a result, the sequence x(n) can be saved in a rectangular format.
Fig. 1 Two-dimensional data array for storing the sequence x(n), 0 ≤ n ≤ N − 1
The sequence x(n) can be stored in this array in a variety of ways, each of which depends on the mapping of the index n to the pair of indices (l, m). For example, suppose that we select the mapping
n = Ml + m
This leads to an arrangement in which the first row consists of the first M elements of x(n), the second row consists of the next M elements of x(n), and so on, as illustrated in Fig. 2(a). On the other hand, the mapping
n = l + mL
stores the first L elements of x(n) in the first column, the next L elements
in the second column, and so on, as illustrated in Fig.2(b).
Fig. 2 Two arrangements for the data arrays
The computed DFT values can be stored in a similar manner.
The mapping is specifically from the index k to a pair of indices (p, q), with 0 ≤ p ≤ L − 1 and 0 ≤ q ≤ M − 1.
The DFT is stored on a row-by-row basis if the mapping
k = Mp + q
is chosen, with the first row containing the first M elements of the DFT
X(k), the second row containing the following set of M elements, and so
on.
The mapping
k = qL+ p,
leads to column-wise X(k) storage, with the first L elements stored in the
first column, the second set of L elements in the second column, and so
on.
Assume that x(n) is mapped to the rectangular array x(l, m) and that X(k) is mapped to a corresponding rectangular array X(p, q). The DFT can therefore be written as a double sum over the elements of the rectangular array multiplied by the phase factors. With the mappings n = mL + l and k = Mp + q,
X(p, q) = Σ (m = 0 to M−1) Σ (l = 0 to L−1) x(l, m) W_N^( (mL + l)(Mp + q) )
But
W_N^( (mL + l)(Mp + q) ) = W_N^(MLmp) W_N^(mLq) W_N^(Mpl) W_N^(lq) = W_M^(mq) W_L^(lp) W_N^(lq),
since W_N^(MLmp) = 1, W_N^(mLq) = W_M^(mq) and W_N^(Mpl) = W_L^(lp). Hence
X(p, q) = Σ (l = 0 to L−1) W_L^(lp) { W_N^(lq) [ Σ (m = 0 to M−1) x(l, m) W_M^(mq) ] }
The expression involves the computation of DFTs of length M and length L. To elaborate, let us subdivide the computation into three steps:
i) We compute the M-point DFTs
F(l, q) = Σ (m = 0 to M−1) x(l, m) W_M^(mq), 0 ≤ q ≤ M − 1,
for each of the rows l = 0, 1, ..., L − 1.
ii) We compute a new rectangular array G(l, q) defined as
G(l, q) = W_N^(lq) F(l, q), 0 ≤ l ≤ L − 1, 0 ≤ q ≤ M − 1.
iii) Finally, we compute the L-point DFTs
X(p, q) = Σ (l = 0 to L−1) G(l, q) W_L^(lp)
for each column q = 0, 1, ..., M − 1, of the array G(l, q).
On the surface, the computational procedure outlined above appears to be more complicated than the direct DFT computation. The first step entails computing L DFTs with M points each; it therefore requires LM² complex multiplications and LM(M − 1) complex additions. The second step requires LM complex multiplications. Finally, the third step of the algorithm requires ML² complex multiplications and ML(L − 1) complex additions. As a result, the computational complexity is
Complex multiplications: N(M + L + 1)
Complex additions: N(M + L − 2)
where N = ML.
As a result, the number of multiplications has decreased from N² to N(M + L + 1), while the number of additions has decreased from N(N − 1) to N(M + L − 2).
To summarize, the algorithm that we have introduced involves the
following computations:
Algorithm 1
1. Store the signal column -wise.
2. Compute the M-point DFT of each row.
3. Multiply the resulting array by the phase factors W_N^(lq).
4. Compute the L -point DFT of each column
5. Read the resulting array row -wise.
An additional algorithm with a similar computational structure can be
obtained if the input signal is stored row -wise and the resulting
transformation is column-wise. In this case we select
n = Ml + m
k = qL + p
This choice of indices leads to the formula for the DFT in the form
X(p, q) = Σ (m = 0 to M−1) W_M^(mq) { W_N^(mp) [ Σ (l = 0 to L−1) x(l, m) W_L^(lp) ] }
Thus we obtain a second algorithm.
Algorithm 2
1. Store the signal row -wise.
2. Compute the L-point DFT of each column.
3. Multiply the resulting array by the factors W_N^(mp).
4. Compute the M -point DFT of each row.
5. Read the resulting array column -wise.
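The five steps of Algorithm 1 translate directly into code. The sketch below is an illustration with arbitrarily chosen factors L = 3 and M = 4; it stores the signal column-wise, applies the twiddle factors W_N^(lq), and checks the result against a library FFT:

import numpy as np

def dft_divide_and_conquer(x, L, M):
    """N-point DFT via the N = L*M decomposition (Algorithm 1 style)."""
    N = L * M
    assert x.size == N
    W_N = np.exp(-2j * np.pi / N)

    a = x.reshape(M, L).T                # store column-wise: a[l, m] = x(l + m*L)
    F = np.fft.fft(a, axis=1)            # 1) M-point DFT of each row l
    l = np.arange(L).reshape(L, 1)
    q = np.arange(M).reshape(1, M)
    G = F * (W_N ** (l * q))             # 2) multiply by the twiddle factors W_N^{lq}
    X = np.fft.fft(G, axis=0)            # 3) L-point DFT of each column q
    # X[p, q] corresponds to k = M*p + q, so reading row-wise recovers X(k).
    return X.reshape(-1)

x = np.random.rand(12)
assert np.allclose(dft_divide_and_conquer(x, L=3, M=4), np.fft.fft(x))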
4.5 2D DISCRETE FOURIER TRANSFORM (DFT) AND
FAST FOURIER TRANSFORM (FFT)[1]:
4.5.1 2D Discrete Fourier Transform (DFT) :
The 2D DFT of a rectangular image f(m, n) of size M × N is represented as F(k, l):
f(m, n) --2D DFT--> F(k, l)
where F(k, l) is defined as
F(k, l) = Σ (m = 0 to M−1) Σ (n = 0 to N−1) f(m, n) e^( −j2π( km/M + ln/N ) )
For a square image f(m, n) of size N × N, the 2D DFT is defined as
F(k, l) = Σ (m = 0 to N−1) Σ (n = 0 to N−1) f(m, n) e^( −j2π( km + ln )/N )
The inverse 2D Discrete Fourier Transform is given by
f(m, n) = (1/N²) Σ (k = 0 to N−1) Σ (l = 0 to N−1) F(k, l) e^( j2π( km + ln )/N )
The Fourier transform F(k, l) is given by
F(k,l) = R(k,l) + jI(k,l)
where R(k, l ) repre sents the real part of the spectrum and I(k, l) represents
the imaginary part.
The Fourier transform F(k, l) can be expressed in polar coordinates as
F(k, l) = |F(k, l)| e^( jφ(k, l) )
where |F(k, l)| = ( R²{F(k, l)} + I²{F(k, l)} )^(1/2) is called the magnitude spectrum of the Fourier transform and
φ(k, l) = tan⁻¹ [ I{F(k, l)} / R{F(k, l)} ]
is the phase angle or phase spectrum. Here, R{F(k, l)} and I{F(k, l)} are the real and imaginary parts of F(k, l), respectively.
The Fast Fourier Transform (FFT) is the most computationally efficient way of evaluating the DFT.
The FFT of an image can be represented in one of two ways: (a) the standard (conventional) representation or (b) the optical representation.
In the standard representation, high frequencies are collected at the centre of the spectrum, whereas low frequencies are distributed at the edges, as seen in Fig. 1. The null (zero) frequency is located in the upper-left corner.
Fig. 1 – Standard representation of FFT of an image [1,3]
The frequency range is [0, N] X [0, M], where M is the image's horizontal
resolution and N is the image's ver tical resolution.
Fig. 2 Optical representation of the FFT of the same image
Discreteness in one domain leads to periodicity in the other, as illustrated in Fig. 2. As a result, the spectrum of a digital image is unique only over the range −π to π (or, equivalently, 0 to 2π).
4.5.2 Computational speed of FFT [4]:
The direct DFT requires N² complex multiplications. At each stage of the FFT (i.e. each halving), N/2 complex multiplications are required to combine the results of the previous stage. Since there are log₂N stages, the number of complex multiplications required to evaluate an N-point DFT with the FFT is approximately (N/2) log₂N.
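For a sense of scale, the two operation counts can be tabulated directly (a simple illustration, not part of the text):

import numpy as np

for N in (64, 256, 1024, 4096):
    direct = N ** 2                      # complex multiplications for the direct DFT
    fft = (N / 2) * np.log2(N)           # approximate count for the FFT
    print(f"N={N:5d}  direct={direct:>10.0f}  FFT={fft:>8.0f}  speed-up ~ {direct / fft:6.1f}x")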
4.5.3 Practical considerations [4] :
If N is not a power of 2, there are two strategies available to complete an N-point FFT:
1. Take advantage of whatever factors N possesses. For example, if N is divisible by 3 (e.g. N = 48), the final decimation stage would include a 3-point transform.
2. Pack the data with zeroes; e.g. include 16 zeroes with the 48 data points (for N = 48) and compute a 64-point FFT. (However, you should again be wary of abrupt transitions between the trailing (or leading) edge of the data and the following (or preceding) zeroes; a better approach might be to pack the data with more realistic "dummy values".) Zero padding cannot improve the resolution of spectral components, because the resolution is proportional to 1/M, where M is the number of actual data samples, rather than to 1/N, the padded length. Zero padding is, however, very important for fast DFT implementation (FFT).
4.6 SUMMARY :
When the DFT is applied to finite images of M × N pixels, effects such as frequency smoothing and frequency leakage arise. The DFT operates on discretely sampled pictures (pixels), which can suffer from aliasing, and it implies periodic boundary conditions, which must be taken into account in centering, edge effects, and convolution. Because images have borders and are truncated (finite), frequency smoothing and leakage result. The FFT overcomes the computational drawback of the direct DFT.
4.7 REFERENCES
1] S. Jayaraman Digital Image Processing TMH (McGraw Hill)
publication, ISBN - 13:978 -0-07- 0144798
2] John G. Proakis, Digital Signal Processing: Principles, Algorithms,
And Applications, 4/E
3] Gonzalez, Woods & Steven, Digital Image Processing using MATLAB,
Pearson Education, ISBN -13:978 -0130085191
4] https://www.robots.ox.ac.uk/~sjrob/Teaching/SP/l7.pdf
4.8 UNIT END EXERCISES
1. Find the N × N point DFT of the following 2D image f(m, n), 0 ≤ m, n ≤ N − 1.
2. Prove that the DFT diagonalises a circulant matrix.
3. Which of the following is true regarding the number of computations
required to compute an N -point DFT?
a) N² complex multiplications and N(N−1) complex additions
b) N² complex additions and N(N−1) complex multiplications
c) N² complex multiplications and N(N+1) complex additions
d) N² complex additions and N(N+1) complex multiplications
Answer : a
4. Which of the following is true regarding the number of computations required to compute the DFT at any one value of 'k'?
a) 4N -2 real multiplications and 4N real additions
b) 4N real multiplications and 4N -4 real additions
c) 4N -2 real multiplications and 4N+2 real additions
d) 4N real multiplications and 4N -2 real additions
Answer : d
5. Divide -and-conquer approach is based on the decomposition of an N -
point DFT into successively smaller DFTs. This basic approach leads to
FFT algorithms.
a) True
b) False
Answer : a
6. How many complex multiplications are performed in computing the N-point DFT of a sequence using the divide-and-conquer method if N = LM?
a) N(L+M+2)
b) N(L+M -2)
c) N(L+M -1)
d) N(L+M+1)
Answer : d
7. Define discrete Fourier transform and its inverse.
8. State and prove the translation property.
9. Give the drawbacks of DFT.
10. Give the property of symmetry and Periodicity of Direct DFT.
5
DISCRETE FOURIER TRANSFORM -II
Unit Structure
5.1 Objectives
5.2 Introduction
5.2.1 Image Transforms
5.2.2 Unitary Transform
5.3 Properties of 2 -D DFT
5.4 Classification of Image transforms
5.4.1 Walsh Transform
5.4.2 Hadamard Transform
5.4.3 Discrete cosine transform
5.4.4 Discrete Wavelet Transform
5.4.4.1 Haar Transform
5.4.4.2 KL Transform
5.5 Summary
5.6 References
5.7 Unit End Exercises
5.1 OBJECTIVES
After going through this unit, you will be able to:
● Understand the fundamental concepts of digital image processing
● Discuss mathematical transforms
● Describe the DCT and DFT techniques
● Classify different types of image transforms
● Examine the use of Fourier transforms for image processing in the
frequency domain
5.2 INTRODUCTION
5.2.1 Image Transforms
An image transform is an alternative representation of an image. The reasons for transforming an image from one representation to another are as follows:
i. The transformation may isolate critical components of the image pattern so that they are directly accessible for analysis.
ii. The transformation may place the image data in a more compact form so that they can be stored and transmitted efficiently.
5.2.2 Unitary Transform [1] :
A discrete linear transform is unitary if its transform matrix conforms to the unitary condition
A × A^H = I
where A is the transformation matrix, A^H is the Hermitian (conjugate transpose) matrix of A,
A^H = (A*)^T,
and I is the identity matrix.
When the transform matrix A is unitary, the defined transform is called a unitary transform.
Example) Check whether the DFT matrix is unitary or not [1].
Step 1 : Determination of the matrix A
Finding the 4-point DFT (where N = 4). The formula to compute a DFT of order 4 is
X(k) = Σ (n = 0 to 3) x(n) e^(−j2πnk/4), where k = 0, 1, ..., 3
1. Finding X(0)
X(0) = x(0) + x(1) + x(2) + x(3)
2. Finding X(1)
X(1) = x(0) − jx(1) − x(2) + jx(3)
3. Finding X(2)
X(2) = x(0) − x(1) + x(2) − x(3)
4. Finding X(3)
X(3) = x(0) + jx(1) − x(2) − jx(3)
Collecting the coefficients of X(0), X(1), X(2) and X(3), we get
A =
1  1  1  1
1 -j -1  j
1 -1  1 -1
1  j -1 -j
(A*)^T = A^H =
1  1  1  1
1  j -1 -j
1 -1  1 -1
1 -j -1  j
With the unitary normalisation factor 1/√N = 1/2 applied to A (and hence to A^H), the product A × A^H is the identity matrix, which shows that the Fourier transform satisfies the unitary condition.
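The same check can be carried out numerically. The sketch below (an illustration; variable names are arbitrary) builds the 4-point DFT matrix with the 1/√N normalisation and verifies that A × A^H is the identity matrix:

import numpy as np

N = 4
n = np.arange(N)
A = np.exp(-2j * np.pi * np.outer(n, n) / N) / np.sqrt(N)   # unitary (normalised) DFT matrix
AH = A.conj().T                                             # Hermitian (conjugate transpose)
assert np.allclose(A @ AH, np.eye(N))                       # A * A^H = I, so A is unitary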
Sequency - It refers to the number of sign changes. The sequency for a DFT matrix of order 4 is given below.
1 1 1 1
1 j - 1 - j
1 - 1 1 -1
1 - j - 1 j
5.3 PROPERTIES OF 2-D DFT [1] :
The properties of 2D DFT are shown in table 1.
Table 1 - properties of 2D DFT [1]
5.4 CLASSIFICATION OF IMAGE TRANSFORMS :
A) Walsh transform : transforms with non-sinusoidal orthogonal basis functions
B) Hadamard transform : transforms with non-sinusoidal orthogonal basis functions
C) Discrete cosine transform : transforms with orthogonal basis functions
D) Discrete wavelet transform
● Haar Transform : transforms with non-sinusoidal orthogonal basis functions
● KL transform : transforms whose basis functions depend on the statistics of the input data
5.4.1 Walsh Transform [1] :
The representation of a signal by a set of orthogonal sinusoidal waveforms
is known as Fourier analysis. The frequency components are the
coefficients of this representation, and the waveforms are arranged by
frequency. To express these functions, Walsh created a comprehensive set
of orthonormal square -wave functions. The Walsh function's
computational simplicity stems from the fact that it is a real function with
only two possible values: +1 or –1.
The one-dimensional Walsh transform basis can be given by the following equation [1]:
W(n, k) = (1/N) ∏ (i = 0 to m−1) (−1)^( b_i(n) · b_(m−1−i)(k) )
where n = time index,
k = frequency index,
N = order,
m = number of bits needed to represent a number,
b_i(n) = the i-th bit (counting from the LSB) of the decimal number n represented in binary.
The value of m is given by m = log₂ N.
The two-dimensional Walsh transform of a function f(x, y) of size N × N is given by [1]
W(u, v) = (1/N) Σ (x = 0 to N−1) Σ (y = 0 to N−1) f(x, y) ∏ (i = 0 to m−1) (−1)^( b_i(x) b_(m−1−i)(u) + b_i(y) b_(m−1−i)(v) )
Example) Find the 1D Walsh basis for the fourth -order system (N = 4).
The value of N is given as four. From the value of N, the value of m is calculated as
N = 4
m = log₂ N = log₂ 4 = log₂ 2² = 2 log₂ 2 = 2
In this case N = 4, so n and k take the values 0, 1, 2 and 3, and i varies from 0 to m − 1. From the above computation, m = 2, so i takes the values 0 and 1.
The construction of Walsh basis for N = 4 is given in Table 1.
When k or n is equal to zero, the basis value will be 1/N.
Table 1 : Construction of Walsh basis for N = 4 [1]
Sequency : The Walsh functions may be ordered by the number of zero
crossings or sequency, and the coefficients of the representation may be
called sequency components. The sequency of the Walsh basis function
for N = 4 is shown in Table 2.
Table 2 : Walsh transform basis for N = 4 [1]
Likewise, all the values of the Walsh transform can be calculated. After
the calculation of all values, the basis for N = 4 is given below [1].
Note: When looking at the Walsh basis, every entity has the same
magnitude (1/N), with the only difference being the sign (whether it is
positive or negative). As a result, the following is a shortcut approach for
locating the sign:
Step 1 Write the binary representation of n.
Step 2 Write the binary representation of k in the reverse order.
Step 3 Check for the number of overlaps of 1 between n and k.
Step 4 If the number of overlaps of 1 is
i) zero then the sign is positive
ii) even then the sign is positive
iii) odd then the sign is negative
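The bit-overlap rule above translates directly into code. The sketch below (an illustration, not from the text) builds the N × N Walsh basis with entries ±1/N and can be used to reproduce the table for N = 4:

import numpy as np

def walsh_basis(N):
    """Walsh basis of order N (a power of two): W[k, n] = (1/N) * prod_i (-1)^(b_i(n) b_{m-1-i}(k))."""
    m = int(np.log2(N))
    W = np.empty((N, N))
    for k in range(N):
        for n in range(N):
            sign = 0
            for i in range(m):
                b_n = (n >> i) & 1              # i-th bit of n (from the LSB)
                b_k = (k >> (m - 1 - i)) & 1    # bits of k taken in reverse order
                sign += b_n * b_k
            W[k, n] = ((-1) ** sign) / N
    return W

print(walsh_basis(4))     # every entry is +1/4 or -1/4; rows/columns with k or n = 0 are all 1/4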
5.4.2 Hadamard Transform :
The Hadamard transform is similar to the Walsh transform with the
exception that the rows of the transform matrix are re -ordered.
The elements of a Hadamard transform's mutually orthogonal basis
vectors are either +1 or –1, resulting in a minimal computing complexity
in calculating the transform coefficients.
The following approach can be used to create Hadamard matrices for N = 2^n.
The order N = 2 Hadamard matrix is given as
H2 =
1  1
1 -1
The Hadamard matrix of order 2N can be generated by the Kronecker product operation:
H2N =
HN  HN
HN -HN
Substituting N = 2 in the above equation,
H4 =
H2  H2
H2 -H2
=
1  1  1  1
1 -1  1 -1
1  1 -1 -1
1 -1 -1  1
Similarly, substituting N = 4 in the H2N equation gives the order-8 matrix H8.
The Hadamard matrix of order N = 2^n may thus be generated from the order-two core matrix; it is not necessary to store the entire matrix.
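The Kronecker-product construction is easy to code recursively. The sketch below is an illustration (the function name is arbitrary):

import numpy as np

def hadamard(N):
    """Hadamard matrix of order N (a power of two) built from the order-2 core matrix."""
    if N == 1:
        return np.array([[1]])
    H = hadamard(N // 2)
    return np.block([[H, H], [H, -H]])       # H_2N = [[H_N, H_N], [H_N, -H_N]]

H4 = hadamard(4)
assert np.array_equal(H4 @ H4.T, 4 * np.eye(4))   # rows are mutually orthogonal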
5.4.3 Discrete cosine transform :
Discrete cosine transforms are members of a family of real-valued discrete sinusoidal unitary transforms. A discrete cosine transform is made up of a set of sampled cosine functions that form a set of basis vectors. The DCT is a signal compression technique that breaks down a signal into its fundamental frequency components.
If x[n] is a signal of length N, the Fourier transform of the signal x[n] is given by X[k], where
X[k] = Σ (n = 0 to N−1) x[n] e^(−j2πnk/N)
and k varies between 0 and N − 1.
Consider extending the signal x[n], which is indicated by xe[n] , so that the
expanded sequence has a length of 2N. There are two ways to expand the
sequence x[n].
Consider the following sequence (original sequence) of length four: x[n] =
[1, 2, 3, 4]. Fig. 1 depicts the original sequence. There are two ways to
lengthen the sequence. By simply copying the original sequence again, as shown in Fig. 2, the original sequence can be extended.
As demonstrated in Fig. 2, the expanded sequence can be created by
simply replicating the original sequence. The biggest disadvantage of this method is the abrupt change in sample value between n = 3 and n = 4.
Fig. 1 Original sequence
Fig. 2 Extended sequence obtained by simply copying the original
sequence
Fig. 3 Extended sequence obtained by folding the original sequence
The phenomenon of 'ringing' is unavoidable due to this extreme fluctuation. A second approach to producing the expanded sequence, as illustrated in Fig. 3, is to copy the original sequence in a folded fashion. Comparing Figs. 2 and 3, it is obvious that the change in sample value between n = 3 and n = 4 in Fig. 3 is the smallest. The expanded sequence created by folding the initial sequence is therefore shown to be the better choice.
The length of the expanded sequence is 2N if N is the length of the
original sequence, as seen in both Figs. 2 and 3.
In this example, the length of the original sequence is 4 (refer to Fig. 1) and the length of the extended sequence is 8 (refer to Figs. 2 and 3).
The Discrete Fourier Transform (DFT) of the extended sequence is given by Xe[k], where
Xe[k] = Σ (n = 0 to 2N−1) xe[n] e^(−jπnk/N)
Split the interval 0 to 2N − 1 into two parts,
Xe[k] = Σ (n = 0 to N−1) x[n] e^(−jπnk/N) + Σ (n = N to 2N−1) xe[n] e^(−jπnk/N)
Let m = 2N − 1 − n. Substituting in the above equation and using xe[n] = x[2N − 1 − n] in the second sum,
Xe[k] = Σ (n = 0 to N−1) x[n] e^(−jπnk/N) + Σ (m = 0 to N−1) x[m] e^(−jπ(2N−1−m)k/N)
But
e^(−jπ(2N−1−m)k/N) = e^(−j2πk) e^(jπ(m+1)k/N) = e^(jπ(m+1)k/N)
Replacing m by n and multiplying both sides by e^(−jπk/(2N)), upon simplification,
e^(−jπk/(2N)) Xe[k] = Σ (n = 0 to N−1) x[n] [ e^(−jπ(2n+1)k/(2N)) + e^(jπ(2n+1)k/(2N)) ] = 2 Σ (n = 0 to N−1) x[n] cos( π(2n+1)k / (2N) )
Thus, the kernel of a one-dimensional discrete cosine transform is given by
C(k, n) = cos( π(2n+1)k / (2N) ), k, n = 0, 1, ..., N − 1.
The process of reconstructing a set of spatial domain samples from the
DCT coefficients is called the inverse discrete cosine transform (IDCT).
The inverse discrete cosine transformation is given by
x[n] = Σ (k = 0 to N−1) α(k) X[k] cos( π(2n+1)k / (2N) ), where α(0) = √(1/N) and α(k) = √(2/N) for k ≠ 0.
The forward 2D discrete cosine transform of a signal f(m, n) of size N × N is given by
F(u, v) = α(u) α(v) Σ (m = 0 to N−1) Σ (n = 0 to N−1) f(m, n) cos( π(2m+1)u / (2N) ) cos( π(2n+1)v / (2N) )
The 2D inverse discrete cosine transform is given by
f(m, n) = Σ (u = 0 to N−1) Σ (v = 0 to N−1) α(u) α(v) F(u, v) cos( π(2m+1)u / (2N) ) cos( π(2n+1)v / (2N) )
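A direct implementation of the 1-D kernel, using the orthonormal normalisation assumed above, can be cross-checked against the folded-extension derivation given earlier; the sketch below is an illustration, not part of the text:

import numpy as np

def dct_1d(x):
    """Orthonormal 1-D DCT: X[k] = a(k) * sum_n x[n] * cos(pi*(2n+1)*k / (2N))."""
    x = np.asarray(x, dtype=float)
    N = x.size
    n = np.arange(N)
    k = n.reshape(-1, 1)
    C = np.cos(np.pi * (2 * n + 1) * k / (2 * N))          # DCT kernel C(k, n)
    a = np.full(N, np.sqrt(2.0 / N))
    a[0] = np.sqrt(1.0 / N)
    return a * (C @ x)

# Cross-check against the folded-extension route used in the derivation above:
# xe = [x, x reversed] has length 2N and Xc[k] = Re{exp(-j*pi*k/(2N)) * Xe[k]} / 2.
x = np.array([1.0, 2.0, 3.0, 4.0])
N = x.size
Xe = np.fft.fft(np.concatenate([x, x[::-1]]))
k = np.arange(N)
unnormalised = np.real(np.exp(-1j * np.pi * k / (2 * N)) * Xe[:N]) / 2.0
a = np.full(N, np.sqrt(2.0 / N))
a[0] = np.sqrt(1.0 / N)
assert np.allclose(dct_1d(x), a * unnormalised)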
5.4.4 Discrete Wavelet Transform: Haar Transform, KL Transform
5.4.4.1 Haar Transform :
The Haar transform is based on a class of orthogonal matrices whose elements are 1, –1 or 0, multiplied by powers of √2. The Haar transform is computationally efficient since it only requires 2(N – 1) additions and N multiplications to transform an N-point vector.
Algorithm to Generate Haar Basis [1]: The algorithm to generate Haar
basis is given below:
Step 1 Determine the order of N of the Haar basis.
Step 2 Determine n where n = log₂ N.
Step 3 Determine p and q.
(i) 0 ≤ p ≤ n – 1
(ii) If p = 0 then q = 0 or q = 1
(iii) If p ≠ 0, 1 ≤ q ≤ 2^p
Step 4 Determine k.
k = 2^p + q – 1
Step 5 Determine z, where z = x/N for x = 0, 1, ..., N – 1 (so 0 ≤ z < 1).
Step 6 If k = 0 then H(z) = 1/√N.
Otherwise,
H(z) = (1/√N) · 2^(p/2)   for (q – 1)/2^p ≤ z < (q – 0.5)/2^p
H(z) = –(1/√N) · 2^(p/2)  for (q – 0.5)/2^p ≤ z < q/2^p
H(z) = 0                  otherwise.
The flow chart to compute the Haar basis is given in Fig. 4.
Fig. 4 Flow chart to compute Haar basis
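The algorithm can be coded directly. The sketch below (an illustration) follows the p, q indexing above and evaluates the basis at z = x/N:

import numpy as np

def haar_basis(N):
    """N x N Haar transform matrix H[k, x] evaluated at z = x / N (N a power of two)."""
    n = int(np.log2(N))
    H = np.zeros((N, N))
    H[0, :] = 1.0 / np.sqrt(N)                      # k = 0 row
    for p in range(n):                              # 0 <= p <= n - 1
        for q in range(1, 2 ** p + 1):              # 1 <= q <= 2^p
            k = 2 ** p + q - 1
            for x in range(N):
                z = x / N
                if (q - 1) / 2 ** p <= z < (q - 0.5) / 2 ** p:
                    H[k, x] = 2 ** (p / 2) / np.sqrt(N)
                elif (q - 0.5) / 2 ** p <= z < q / 2 ** p:
                    H[k, x] = -(2 ** (p / 2)) / np.sqrt(N)
    return H

H = haar_basis(4)
assert np.allclose(H @ H.T, np.eye(4))              # rows are orthonormal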
5.4.4.2 KL Transform (KARHUNEN–LOEVE TRANSFORM) :
Harold Hotelling was the first to study the discrete formulation of the KL transform, which is why it is also known as the Hotelling transform. The KL transform is a reversible linear transform that takes advantage of the statistical properties of a vector representation.
The orthogonal eigenvectors of the covariance matrix of a data set form the basis functions of the KL transform. The input data are optimally decorrelated by a KL transform. After a KL transform, the majority of the 'energy' of the transform coefficients is concentrated within the first few components; this is the energy compaction property of the KL transform.
Drawbacks of KL transform :
i. A KL transform is input-dependent: its basis functions must be determined for each signal model on which it acts. There is no unique mathematical structure in the KL bases that allows for fast implementation.
ii. The KL transform necessitates multiply/add operations of the order of O(m²), whereas O(m log₂ m) operations suffice for the DFT and DCT computed with fast algorithms.
Applications of KL Transforms [1] :
(i) Clustering Analysis : Used to determine a new coordinate system for
sample data where the largest variance of a projection of the data lies on
the first axis, the next largest variance on the second axis, and so on.
(ii) Image Compression : It is heavily utilised for performance evaluation
of compression algorithms since it has been proven to be the optimal
transform for the compression of an image sequence in the sense that the
KL spectrum contains the largest number of zero -valued coefficients.
Example) Perform KL transform for the following matrix:
Step 1 - Formation of vectors from the given matrix
The given matrix is a 2×2 matrix; hence two vectors can be extracted from
the given matrix. Let it be
x0 and x1.
Step 2 - Determination of the covariance matrix
The formula to compute the covariance of the matrix is
C = E[ x x^T ] − x̄ x̄^T
In the formula for the covariance matrix, x̄ denotes the mean vector of the input data. The formula to compute the mean of the given matrix is
x̄ = (1/M) Σ (k = 0 to M−1) x_k
where M is the number of vectors in x.
The mean value is calculated from the two vectors x0 and x1. Multiplying the mean value by its transpose yields x̄ x̄^T. To find E[ x x^T ], in our case M = 2, hence
E[ x x^T ] = (1/2) ( x0 x0^T + x1 x1^T ).
Step 3 - Determination of the eigen values of the covariance matrix
To find the eigen values λ, we solve the characteristic equation
det( C − λI ) = 0
which, for the covariance matrix obtained above, reduces to
λ² – λ – 4 = 0
From this equation we find the eigen values λ0 and λ1 by solving the quadratic.
Step 4 - Determination of eigen vectors of the covariance matrix
The first eigen vector φ0 is found from the equation
( C − λ0 I ) φ0 = 0
and the second eigen vector φ1 from ( C − λ1 I ) φ1 = 0.
Step 5 - Normalisation of the eigen vectors
The normalisation formula for the eigen vector φ0 is
φ0(normalised) = φ0 / ||φ0||
Similarly, the normalisation of the eigen vector φ1 is given by
φ1(normalised) = φ1 / ||φ1||
Step 6 - KL transformation matrix from the eigen vectors of the covariance matrix
From the normalised eigen vectors, we have to form the transformation
matrix.
Step 7 - KL transformation of the input matrix
To find the KL transform of the input matrix, the formula used is Y = T[x], where T is the transformation matrix whose rows are the normalised eigen vectors; applying it to the data vectors gives the final transform matrix Y.
Step 8 - Reconstruction of the input values from the transformed coefficients
From the transform matrix, we reconstruct the values of the given sample matrix X using the formula X = T^T Y.
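The eight steps of the example can be carried out numerically. The sketch below is an illustration only: the 2 × 2 input data are made-up values (the textbook's example matrix is not reproduced here), and, following the formulas above, the transform is applied as Y = T x and the reconstruction as X = T^T Y.

import numpy as np

# Each column of X is one data vector extracted from a 2x2 input matrix.
# These are made-up illustrative values, not the textbook's example data.
X = np.array([[4.0, 3.0],
              [2.0, 5.0]])
M = X.shape[1]                                   # number of vectors

mean = X.mean(axis=1, keepdims=True)             # mean vector of the data
C = (X @ X.T) / M - mean @ mean.T                # covariance: E[x x^T] - mean mean^T

eigvals, eigvecs = np.linalg.eigh(C)             # eigen values and normalised eigen vectors
order = np.argsort(eigvals)[::-1]                # sort by decreasing eigen value
T = eigvecs[:, order].T                          # rows of T are the normalised eigen vectors

Y = T @ X                                        # Step 7: KL transform, Y = T[x]
X_rec = T.T @ Y                                  # Step 8: reconstruction, X = T^T Y
assert np.allclose(X_rec, X)                     # T is orthonormal, so the data are recovered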
5.5 SUMMARY
Different transform-based compression approaches can be tested and compared in order to find a suitable image transformation methodology for images of various sizes and modalities. Image classification is a complicated process that relies on several factors; some of the available approaches and their difficulties have been discussed here, and the focus should be on up-to-date classification algorithms for improving characterisation precision.
5.6 REFERENCES
1] S. Jayaraman Digital Image Processing TMH (McGraw Hill)
publication, ISBN - 13:978 -0-07- 0144798
2] John G. Proakis, Digital Signal Processing: Principles, Algorithms,
And Applications, 4/E
3] Gonzalez, Woods & Steven, Digital Image Processing using MATLAB,
Pearson Education, ISBN -13:978 -0130085191
4] https://www.robots.ox.ac.uk/~sjrob/Teaching/SP/l7.pdf
5.7 UNIT END EXERCISES
1. Compute the discrete cosine transform (DCT) matrix for N = 4.
2. Generate one Haar Basis for N = 2.
3. Compute the Haar basis for N = 8.
4. Compute the basis of the KL transform for the input data x1 = (4, 4, 5)T, x2 = (3, 2, 5)T, x3 = (5, 7, 6)T and x4 = (6, 7, 7)T.
5. Compute the 2D DFT of the 4 × 4 grayscale image given below.
6. State and prove the separability property of the 2D DFT.
7. Let f(m, n) denote a digital image of size 256 × 256. In order to compress this image, we take its Discrete Cosine Transform F(u, v), u, v = 0, …, 255, and keep only the Discrete Cosine Transform coefficients for