Title:
IMAGE EDGE DIRECTION AND ORIENTATION DETECTING METHOD
Document Type and Number:
WIPO Patent Application WO/2008/119939
Kind Code:
A2
Abstract:
A method and apparatus for encoding a portion of an image comprising a plurality of pixels. The method comprises performing a plurality of edge detection operations on said portion of said image, each edge detection operation generating an output value, classifying said image portion into one of a predetermined plurality of classes based upon relationships between said plurality of output values, and encoding said portion based upon said classification.

Inventors:
AL-FAYADH ALI HASSAN NASSER (GB)
JUMEILY DHIYA AL (GB)
HUSSAIN ABIR (GB)
Application Number:
PCT/GB2008/001029
Publication Date:
October 09, 2008
Filing Date:
March 26, 2008
Assignee:
UNIV LIVERPOOL JOHN MOORES (GB)
AL-FAYADH ALI HASSAN NASSER (GB)
JUMEILY DHIYA AL (GB)
HUSSAIN ABIR (GB)
International Classes:
H04N7/26; G06T5/00
Other References:
JHING-FA WANG ET AL: "A Novel Fast Algorithm for Intra Mode Decision in H.264/AVC Encoders", Circuits and Systems, 2006 (ISCAS 2006), Proceedings of the 2006 IEEE International Symposium, Kos, Greece, 21-24 May 2006, Piscataway, NJ, USA, IEEE, pages 3498-3501, XP010939305, ISBN: 978-0-7803-9389-9
ER E L: "Class notes of the course CPSC 320-00 Computer Vision (Fall '04 12-week), Week 5 (Smoothing & Edge Detection)", Internet citation, 21 September 2004, XP002444467, retrieved from the Internet: URL:http://cs.hiram.edu/walkerel/cs320/ [retrieved on 2007-07-26]
Attorney, Agent or Firm:
KENRICK, Mark, Lloyd (Sussex House, 83-85 Mosley Street, Manchester M2 3LG, GB)
Claims:

CLAIMS

1. An image encoding method for encoding a portion of an image comprising a plurality of pixels, the method comprising: performing a plurality of edge detection operations on said portion of said image, each edge detection operation generating an output value; classifying said image portion into one of a predetermined plurality of classes based upon relationships between said plurality of output values; and encoding said portion based upon said classification.

2. A method according to claim 1, wherein performing each of said plurality of edge detection operations comprises convolving said portion of said image with an appropriate mask.

3. A method according to claim 1 or 2, wherein each of said output values indicates a strength of an edge of a particular direction within said portion of said image.

4. A method according to any preceding claim, wherein one of said plurality of edge detection operations detects horizontal edges.

5. A method according to any preceding claim, wherein one of said plurality of edge detection operations detects vertical edges.

6. A method according to claim 5 as dependent upon claim 4, wherein said portion of said image is classified as containing a horizontal edge if said edge detection operation detecting horizontal edges provides a stronger output value than said edge detection operation detecting vertical edges.

7. A method according to claim 5 as dependent upon claim 4, wherein said portion of said image is classified as containing a vertical edge if said edge detection operation detecting vertical edges provides a stronger output value than said edge detection operation detecting horizontal edges.

8. A method according to any preceding claim, wherein two of said plurality of edge detection operations detect diagonal edges of respective orientations.

9. A method according to claim 8, wherein said two edge detection operations respectively detect diagonal edges of first and second orientations, and said portion of said image is classified as containing a diagonal edge of first orientation if said edge detection operation detecting diagonal edges of first orientation provides a stronger output value than said edge detection operation detecting diagonal edges of second orientation.

10. A method according to claim 8 or 9 as dependent upon claim 6 or 7, wherein said portion of said image is classified as containing a horizontal or vertical edge if but only if output values of said edge detection operation detecting horizontal edges and said edge detection operation detecting vertical edges satisfy a predetermined relationship with output values of said edge detection operations detecting diagonal edges of respective orientations.

11. A method according to claim 10, wherein said portion of said image is classified as containing a horizontal or vertical edge if but only if the absolute value of the difference between output values of said edge detection operation detecting horizontal edges and said edge detection operation detecting vertical edges satisfies a predetermined relationship with the absolute value of the difference between output values of said edge detection operations detecting diagonal edges of respective orientations.

12. A method according to claim 11, wherein said predetermined relationship is a less than relationship.

13. A method according to any preceding claim, further comprising:

determining whether said portion of said image comprises an edge, and if said portion of said image does not comprise an edge classifying said portion of said image in a predetermined shade class.

14. A method according to any preceding claim, wherein each of said predetermined classes has an associated set of encodings.

15. A method according to claim 14, wherein each encoding is a vector.

16. A method according to claim 14 or 15, further comprising: selecting a set of encodings based upon a class to which said portion of said image has been classified; and selecting an encoding from said selected set of encodings to represent said portion of said image.

17. A method according to claim 16, wherein selecting one of said selected set of encodings comprises selecting an encoding having a minimum mean square error with the portion of the image.

18. A method according to any preceding claim, wherein said portion of said image is a plurality of pixels taken from said image.

19. A method according to claim 18, wherein said plurality of pixels are a contiguous block of pixels in said image.

20. A method according to claim 18, wherein the image comprises a plurality of blocks, and each of said blocks is processed in turn.

21. A carrier medium carrying computer readable code for controlling a computer to carry out the method of any one of claims 1 to 20.

22. A computer apparatus for encoding a portion of an image, the apparatus comprising: a program memory storing processor readable instructions; and a processor configured to read and execute instructions stored in said program memory; wherein the processor readable instructions comprise instructions controlling the processor to carry out the method of any one of claims 1 to 20.

23. Apparatus for encoding a portion of an image comprising a plurality of pixels, the apparatus comprising: a processor configured to perform a plurality of edge detection operations on said portion of said image, each edge detection operation generating an output value; a classifier configured to classify said image portion into one of a predetermined plurality of classes based upon relationships between said plurality of output values; and an encoder configured to encode said block based upon said classification.

24. Apparatus for encoding a portion of an image comprising a plurality of pixels, the apparatus comprising: means for performing a plurality of edge detection operations to said portion of said image each edge detection operation generating an output value; means for classifying said image portion into one of a predetermined plurality of classes based upon relationships between said plurality of output values; and means for encoding said block based upon said classification.

25. A method for generating a set of data for use in image encoding operations, said set of data comprising a plurality of data items, and each data item being used to represent a portion of an image to be encoded, the method comprising: processing a set of training data to generate said set of data items, said processing comprising performing singular value decomposition on a matrix based upon said training data.

26. A method according to claim 25, wherein said training data comprises a plurality of pixel values representing an image.

27. A method according to claim 26, wherein said training data comprises a plurality of pixel values representing a concatenation of images.

28. A method according to any one of claims 25 to 27, further comprising: processing said training data to generate a plurality of training portions.

29. A method according to claim 28, wherein each of said training portions is a contiguous block of pixel values taken from said training data.

30. A method according to claim 28 or 29, further comprising classifying said training portions into a predetermined plurality of classes.

31. A method according to claim 30, further comprising generating a training data subset for each of said plurality of classes and processing each of said training data subsets by performing said singular value decomposition on a matrix based upon a respective training data subset.

32. A method according to claim 31, wherein each of said training data subsets comprises training portions classified in a respective class.

33. A method according to claim 31 or 32, wherein said classes include a plurality of edge classes and a shade class.

34. A method according to any one of claims 25 to 33, wherein said singular value decomposition is performed on a matrix comprising a plurality of columns, each column representing a block of pixel values taken from said training data.

35. A method according to any one of claims 25 to 34, wherein said data items are generated by multiplying predetermined portions of matrices generated from said singular value decomposition.

36. A method according to claim 35, wherein said predetermined portions of said matrices are selected with reference to a rank value and a number of data items to be in said set of data items to be generated.

37. A carrier medium carrying computer readable code for controlling a computer to carry out the method of any one of claims 25 to 36.

38. A computer apparatus for generating a set of data for use in image encoding operations, the apparatus comprising: a program memory storing processor readable instructions; and a processor configured to read and execute instructions stored in said program memory; wherein the processor readable instructions comprise instructions controlling the processor to carry out the method of any one of claims 24 to 34.

Description:

IMAGE ENCODING METHOD

The present invention relates to a method for encoding image data, and more particularly to an encoding method allowing image data to be compressed.

Computers are ubiquitous in modern society and are used for a wide variety of applications in both home and business environments. Many applications in which computers are employed require images to be processed and displayed. It is desirable that images are displayed so as to appear as clear as possible to a user. However, this aim must be balanced with that of managing image storage requirements. It is typically the case that images taking up relatively large amounts of storage space will appear clearer than images taking up only a small amount of storage space.

It is important to correctly balance the competing goals of image quality and storage requirements. This is not only because all computers have finite storage space, but also because it is now very common for images to be transmitted between computers along communications links. For example web pages which are accessible over the Internet normally include images. Communications links along which images are transmitted have limited bandwidth, and consequently larger images take a longer time to download.

With the problems outlined above in mind, various image encoding methods have been developed which encode input image data into compressed image data requiring less storage space. The compressed image data is subsequently decoded to generate reconstructed image data which is displayed to a user. Ideally, the reconstructed image data is the same as the input image data. However, if differences between the input image data and reconstructed image data are imperceptible to a user, the image encoding method will still be widely applicable. It is, in this context, important to note that minor differences between the input image data and reconstructed image data, perhaps affecting only a few pixels of the image, will be imperceptible to a user.

Various image encoding methods of the general type described above have been developed. One of the most popular is the JPEG format developed by the Joint Photographic Experts Group and described in W. Pennebaker, J. Mitchell, "JPEG: Still Image Data Compression Standard", Van Nostrand Reinhold, 1993.

Although image encoding methods such as JPEG have enjoyed widespread success, differences between input image data and reconstructed image data are sometimes perceptible to a user. This is particularly problematic in situations where image quality is of particular importance.

A particular problem with the JPEG encoding scheme relates to edge-degradation. That is, if an input image includes an edge (for example if the image represents a boundary between two real world objects), the area of the image representing the edge may be unclear in the reconstructed image data.

It is an object of some embodiments of the present invention to obviate or mitigate at least some of the problems outlined above.

According to a first aspect of the present invention, there is provided an image encoding method and apparatus for encoding a portion of an image comprising a plurality of pixels. The method comprises performing a plurality of edge detection operations on said portion of said image, each edge detection operation generating an output value, classifying said image portion into one of a predetermined plurality of classes based upon relationships between said plurality of output values, and encoding said portion based upon said classification.

It has been found that classifying image portions based upon output values of edge detection operations, and subsequently encoding image portions based upon those classes, provides efficient and effective image encoding.

Performing each of the plurality of edge detection operations may comprise convolving said portion of said image with an appropriate mask. Each of the output values may indicate a strength of an edge of a particular direction within said portion of said image. For example the magnitude of each output value may indicate the strength of the detected edge.

One of the plurality of edge detection operations may detect horizontal edges, while one of the edge detection operations may detect vertical edges. The portion of said image may be classified as containing a horizontal edge if the edge detection operation detecting horizontal edges provides a stronger output value than the edge detection operation detecting vertical edges. The portion of said image may be classified as containing a vertical edge if the edge detection operation detecting vertical edges provides a stronger output value than the edge detection operation detecting horizontal edges.

Two of the plurality of edge detection operations may detect diagonal edges of respective orientations. The two edge detection operations may respectively detect diagonal edges of first and second orientations. For example one edge detection operation may detect 45° diagonal edges while another edge detection operation may detect 135° diagonal edges. The portion of the image may be classified as containing a diagonal edge of first orientation if said edge detection operation detecting diagonal edges of first orientation provides a stronger output value than said edge detection operation detecting diagonal edges of second orientation.

The portion of said image may be classified as containing a horizontal or vertical edge if but only if output values of said edge detection operation detecting horizontal edges and said edge detection operation detecting vertical edges satisfy a predetermined relationship with output values of said edge detection operations detecting diagonal edges of respective orientations.

The portion of said image may be classified as containing a horizontal or vertical edge if but only if the absolute value of the difference between output values of said edge detection operation detecting horizontal edges and said edge detection operation detecting vertical edges satisfies a predetermined relationship with the absolute value of the difference between output values of said edge detection operations detecting diagonal edges of respective orientations. The predetermined relationship may be a less than relationship.

The method may further comprise determining whether said portion of said image comprises an edge, and if said portion of said image does not comprise an edge classifying said portion of said image in a predetermined shade class.

Each of the predetermined classes may have an associated set of encodings, referred to as a codebook. Each encoding may be a vector. The method may further comprise selecting a set of encodings based upon a class to which said portion of said image has been classified; and selecting an encoding from said selected set of encodings to represent said portion of said image. Selecting one of the selected set of encodings may comprise selecting an encoding having a minimum mean square error with the portion of the image.

The portion of said image is preferably a plurality of pixels taken from said image. The plurality of pixels may be a contiguous block of pixels in said image. Each block may be a square of size n x n, and in such a case the encodings described above include n² values; for example each encoding may be a vector of length n². The image may comprise a plurality of blocks, and each of said blocks may be processed in turn.

According to a second aspect of the present invention, there is provided a method for generating a set of data for use in image encoding operations, said set of data comprising a plurality of data items, and each data item being used to represent a portion of an image to be encoded, the method comprising: processing a set of training data to generate said set of data items, said processing comprising performing singular value decomposition on a matrix based upon said training data.

It has been found that applying singular value decomposition to a training data set allows a set of data items for use in image encoding operations to be effectively and efficiently created.

The training data may comprise a plurality of pixel values representing an image. The training data may comprise a plurality of pixel values representing a concatenation of images.

The method may further comprise processing said training data to generate a plurality of training portions. Each of said training portions may be a contiguous block of pixel values taken from said training data. The method may further comprise classifying said training portions into a predetermined plurality of classes. The method may further comprise generating a training data subset for each of said plurality of classes and processing each of said training data subsets by performing said singular value decomposition on a matrix based upon a respective training data subset. Each of said training data subsets may comprise training portions from said training data classified in a respective class.

The classes may include a plurality of edge classes and a shade class.

The singular value decomposition may be performed on a matrix comprising a plurality of columns, each column representing a block of pixel values taken from said training data. The data items may be generated by multiplying predetermined portions of matrices generated from said singular value decomposition.

The predetermined portions of said matrices may be selected with reference to a rank value. Additionally or alternatively, the predetermined portions of said matrices may be selected with reference to a number of data items to be in said set of data items to be generated.

The invention can be implemented in any convenient form including by way of suitable methods and apparatus. The invention can be implemented by appropriate programming of a computer, and accordingly the invention provides appropriate computer programs and carrier media carrying such programs. The invention can also be implemented in special bespoke hardware configured to carry out image encoding operations.

Embodiments of the present invention will now be described, by way of example, with reference to the accompanying drawings, in which:

Figure 1 is a schematic illustration of an image encoding process;

Figure 2 is a schematic illustration of an image decoding process;

Figure 3 is a flowchart of the encoding process of Figure 1;

Figure 4 is a flowchart of the decoding process of Figure 2;

Figure 5 is a flowchart showing part of the encoding process of Figure 3 in further detail;

Figure 6 is a schematic illustration of a block of pixels;

Figure 7A is a flowchart of a process for generation of a codebook used in an embodiment of the invention;

Figure 7B is a flowchart showing part of the process of Figure 7A in further detail;

Figure 8 is a schematic illustration of a matrix of training data used in the process of Figure 7A;

Figure 9 is a schematic illustration showing how part of the training data of Figure 8 is processed;

Figure 10 is a high-level schematic illustration of an embodiment of the invention;

Figures 11A to 11F are images which were used to assess performance of an embodiment of the invention; and

Figure 12 is a graph comparing performance of an embodiment of the invention with that of other image compression techniques.

Embodiments of the present invention provide a method of encoding image data to allow compression of that image data. Figure 1 is a high level schematic illustration of the image encoding process. It can be seen that input image data 1 is processed by an encoder 2 to generate compressed image data 3. Compression of image data in this way has a number of benefits which are well known. Such compression is particularly useful in systems having limited storage space, or systems in which image data is transmitted between computers, and such transmission must be carried out over limited-bandwidth communications links.

It will however be appreciated that corresponding decoding is required to allow the compressed image data to be reconstructed for display to a user. A suitable decoding process is shown in Figure 2. Here, the compressed image data 3 is input to a decoder 4 which produces as output reconstructed image data 5. Optimally, the reconstructed image data 5 is identical to the input image data 1. However, useful compression with acceptable results can be achieved where differences exist between the input image data 1 and the reconstructed image data 5, so long as those differences are imperceptible or barely perceptible to a user.

Figure 3 is a flowchart showing the processing of Figure 1 in further detail. At step S1 the input image data 1 defined by a plurality of pixels is processed to define a plurality of blocks, each comprising an array of pixels. For example, each block can be defined by 16 pixels arranged in a four-by-four square. Such blocks have been found to provide reconstructed image data having good subjective image quality. In alternative embodiments blocks of different arrangements are processed. For example, square blocks of different sizes may be employed, as may rectangular or non-rectangular blocks. Indeed, any suitable distribution of pixels may be defined to make up a block for the purposes of processing. Each block is processed in turn to encode the input image data, as is described in further detail below. At step S2 a vector from a set of vectors which is determined to be a closest fit to the processed block is selected, and an index associated with this vector is stored at step S3. A check is carried out at step S4 to determine whether further blocks remain to be processed, and if this is the case, processing returns to step S2. Otherwise, processing continues at step S5 where the compressed image data is stored, the compressed image data comprising a plurality of indices each indicating a vector representing a block of the input image data.
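By way of illustration only, the loop of steps S1 to S5 might be sketched as follows in Python/NumPy; the function name, the choose_index callback and the default block size of 4 are assumptions, not part of the specification:

import numpy as np

def encode_image(image, choose_index, n=4):
    """Sketch of steps S1 to S5: split the image into n x n blocks and store,
    for each block, the index returned by a caller-supplied selection routine."""
    img = np.asarray(image, dtype=float)
    indices = []
    rows, cols = img.shape                       # sides assumed multiples of n
    for r in range(0, rows, n):                  # step S1: define n x n blocks
        for c in range(0, cols, n):
            block = img[r:r + n, c:c + n]
            indices.append(choose_index(block))  # steps S2/S3: pick and store an index
    return indices                               # step S5: the compressed data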

Figure 4 is a flowchart showing the decoding process of Figure 2 in further detail. At step S6 a vector associated with a vector index included in the compressed image data is obtained, and a corresponding image block is reconstructed at step S7. At step S8 a check is carried out to determine whether further blocks remain to be processed. If there are further blocks to be processed, processing returns to step S7; otherwise, processing passes to step S9 where the reconstructed image data is output.
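A corresponding sketch of the decoding loop of steps S6 to S9, assuming for simplicity a single codebook indexed directly by the stored indices:

import numpy as np

def decode_image(indices, codebook, image_shape, n=4):
    """Sketch of steps S6 to S9: rebuild each block from its codebook vector."""
    rows, cols = image_shape
    img = np.zeros((rows, cols))
    it = iter(indices)
    for r in range(0, rows, n):
        for c in range(0, cols, n):
            vector = codebook[next(it)]                               # step S6
            img[r:r + n, c:c + n] = np.asarray(vector).reshape(n, n)  # step S7
    return img                                                        # step S9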

Having provided an overview of image encoding and decoding operations, the processing of each block of the input image data to determine a vector index is now described in further detail with reference to Figure 5. For the purposes of the following description, reference is made to an n x n block B, defined by equation (1):

B = { b_ij : 1 ≤ i ≤ n, 1 ≤ j ≤ n }   (1)

It can be seen that b_ij is the pixel value (for example a greyscale value) for the pixel in the ith row and jth column of the block B. Figure 6 is a schematic illustration of the block B. It can be seen that the block B is shown to have n columns and n rows of pixels.

Referring to Figure 5, at step S10, a normalisation process is carried out. The normalisation process comprises computing the mean pixel value of pixels in the block B, and subtracting this mean value from the value of each pixel in the block B. It is these modified pixel values that are used for subsequent processing.
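A minimal sketch of the normalisation of step S10 (Python/NumPy; the function name is illustrative):

import numpy as np

def normalise_block(block):
    """Step S10: subtract the mean pixel value from every pixel of the block."""
    block = np.asarray(block, dtype=float)
    return block - block.mean()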

At step S11 a gradient of the block B in an x-direction is computed according to equation (2):

G_x = (2 / (n × n)) [ Σ_{i=n/2+1}^{n} Σ_{j=1}^{n} b_ij − Σ_{i=1}^{n/2} Σ_{j=1}^{n} b_ij ]   (2)

where G_x is the gradient in the x-direction.

It can be seen from equation (2) and Figure 6 that the gradient in the x-direction is computed by summing pixel values from each row in a lower portion of the block, and subtracting from this sum the summation of pixel values from each row in an upper portion of the block. The result of the subtraction is normalised by appropriate multiplication.

Having computed the gradient of the block B in the x-direction at step S11, the gradient in a y-direction is computed at step S12. The computation of the gradient in the y-direction is carried out according to equation (3):

G_y = (2 / (n × n)) [ Σ_{i=1}^{n} Σ_{j=1}^{n/2} b_ij − Σ_{i=1}^{n} Σ_{j=n/2+1}^{n} b_ij ]   (3)

where G_y is the gradient in the y-direction.

Here, it can be seen from Figure 6 that the gradient in the y-direction is computed by summing pixel values from each column in a left hand portion of the block, and subtracting from this sum the summation of pixel values from each column in a right hand portion of the block. Again, the result of the subtraction is normalised.
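A sketch of equations (2) and (3) in Python/NumPy; the normalising factor 2/(n x n) is assumed from the partly legible published equation, and the function name is illustrative:

import numpy as np

def block_gradients(block):
    """Gradients of an n x n block as in equations (2) and (3)."""
    block = np.asarray(block, dtype=float)
    n = block.shape[0]
    half = n // 2
    # Equation (2): lower rows minus upper rows (x-direction gradient).
    g_x = 2.0 / (n * n) * (block[half:, :].sum() - block[:half, :].sum())
    # Equation (3): left columns minus right columns (y-direction gradient).
    g_y = 2.0 / (n * n) * (block[:, :half].sum() - block[:, half:].sum())
    return g_x, g_y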

Having computed gradients in both x and y directions, a gradient magnitude is computed at step S13, according to equation (4):

|G| = √(G_x² + G_y²)   (4)

where |G| is the gradient magnitude.

At step S14, a spectral radius for the block B is computed. The spectral radius is the modulus of the dominant eigenvalue of the block B. That is:

ρ(B) = max(|λ_i|), 1 ≤ i ≤ n   (6)

where ρ(B) is the spectral radius of the block B, and λ_i, 1 ≤ i ≤ n, are the eigenvalues of the block B.

At step S15 a check is carried out to determine whether the block B is to be classified as an edge block or a shade block. This check involves a comparison between the gradient magnitude computed at step S13 by equation (4) and the spectral radius computed at step S14 by equation (6). If the gradient magnitude is less than the spectral radius it is determined that the block B does not include an edge and it is consequently classified as a shade block at step S16. Otherwise, the block B is classified to be an edge block, and processing continues at step S17, where a further classification of the edge block is carried out. Specifically, edge blocks are classified into vertical, horizontal, 45° diagonal and 135° diagonal edge blocks.
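A sketch of the shade/edge decision of steps S13 to S15, assuming the block has already been mean-removed at step S10 and that g_x and g_y come from the previous sketch; NumPy's eigvals is used for the eigenvalues of equation (6):

import numpy as np

def classify_shade_or_edge(block, g_x, g_y):
    """Steps S13 to S15: compare the gradient magnitude of equation (4)
    with the spectral radius of equation (6)."""
    grad_magnitude = np.hypot(g_x, g_y)                          # equation (4)
    eigenvalues = np.linalg.eigvals(np.asarray(block, dtype=float))
    spectral_radius = np.abs(eigenvalues).max()                  # equation (6)
    return 'shade' if grad_magnitude < spectral_radius else 'edge'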

At step S17 four masks are convolved with the block B. Each mask is an edge detector for one of the four detailed edge classifications. A first mask M1 is a vertical edge detector, a second mask M2 is a horizontal edge detector, a third mask M3 is a 45° diagonal (or upper diagonal) edge detector, while a fourth mask M4 is a 135° diagonal (or lower diagonal) edge detector. The four masks are defined by equations (7), (8), (9) and (10). The horizontal edge detector M2 has the form:

M2 = [  2   2   2   2
        1   1   1   1
       -1  -1  -1  -1
       -2  -2  -2  -2 ]   (8)

Each of the four convolutions carried out at step S17 generates a respective value: V_1, V_2, V_3, V_4.

The convolution operations involve multiplying each pixel of the block B with the corresponding element of a mask, and summing the results of the multiplications. An example convolution operation, where the block B has the form shown in equation (11), is now described.

Here, with the pixel values of the block B denoted p_1, p_2, ..., p_16, the value V_1 obtained by convolving the block B with the mask M1 is a weighted sum of the form given by equation (12):

V_1 = −2·p_1 − 1·p_2 + ... − 2·p_9 − 1·p_10 + ...

At step S18, the values V_1, V_2, V_3, V_4 are processed to determine the nature of the edge.

The block is considered to comprise a vertical edge if:

V_1 > V_2  ∧  |V_1 − V_2| > |V_3 − V_4|   (12)

That is, the edge is classified as a vertical edge if the vertical edge detector provides a higher value response than the horizontal edge detector, and if the magnitude of the difference between the response of the vertical and horizontal edge detectors is greater than the magnitude of the difference between the responses of the two diagonal edge detectors.

The block is considered to comprise a horizontal edge if:

V_1 < V_2  ∧  |V_1 − V_2| > |V_3 − V_4|   (13)

That is, the edge is classified as a horizontal edge if the vertical edge detector provides a lower value response than the horizontal edge detector, and if the magnitude of the difference between the responses of the vertical and horizontal edge detectors is greater than the magnitude of the difference between the responses of the two diagonal edge detectors.

The block is considered to comprise a 45° diagonal edge if:

V_3 > V_4  ∧  |V_3 − V_4| > |V_1 − V_2|   (14)

That is, the edge is classified as a 45° diagonal edge if the 45° diagonal edge detector provides a higher value response than the 135° diagonal edge detector, and if the magnitude of the difference between the responses of the two diagonal edge detectors is greater than the magnitude of the difference between the response of the vertical and horizontal edge detectors.

The block is considered to comprise a 135° diagonal edge if:

V "3 5 < ^ V " λ 4 A V 3, - "V4, > V ' 1 1 - V ",2 (15)

That is, the edge is classified as a 135° diagonal edge if the 45° diagonal edge detector provides a lower value response than the 135° diagonal edge detector, and if the magnitude of the difference between the responses of the two diagonal edge detectors is greater than the magnitude of the difference between the response of the vertical and horizontal edge detectors.
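Purely as an illustration, the convolutions of step S17 and the decision rules of equations (12) to (15) might be sketched as follows; M2 follows the values recoverable from equation (8), while M1, M3 and M4 are plausible stand-ins (M1 is here taken as the transpose of M2), not the masks of the specification:

import numpy as np

# Illustrative 4 x 4 masks; only M2 is taken from the published text.
M2 = np.array([[ 2,  2,  2,  2],
               [ 1,  1,  1,  1],
               [-1, -1, -1, -1],
               [-2, -2, -2, -2]], dtype=float)   # horizontal edge detector
M1 = M2.T                                        # vertical edge detector (assumed)
M3 = np.array([[ 0,  1,  1,  2],
               [-1,  0,  1,  1],
               [-1, -1,  0,  1],
               [-2, -1, -1,  0]], dtype=float)   # 45 degree detector (assumed)
M4 = np.rot90(M3)                                # 135 degree detector (assumed)

def classify_edge_block(block):
    """Steps S17 and S18: convolve with the four masks, then apply (12)-(15)."""
    block = np.asarray(block, dtype=float)
    v1, v2, v3, v4 = ((m * block).sum() for m in (M1, M2, M3, M4))
    if abs(v1 - v2) > abs(v3 - v4):
        return 'vertical' if v1 > v2 else 'horizontal'   # equations (12), (13)
    return '45 degree' if v3 > v4 else '135 degree'      # equations (14), (15)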

The processing described above associates each processed block with one of five predetermined classes. Each class has an associated set of vectors referred to as a codebook. Each vector in each codebook has n² elements. That is, each vector has a number of elements equal to the number of elements in a block. For each block one of the vectors from the appropriate codebook is selected. It should be noted that this processing has advantages from the point of view of computational complexity, given that each block need only be compared with vectors in the appropriate codebook, that codebook being determined by the block's classification. Selection of a vector is carried out by locating the vector having the lowest mean square error with the currently processed block. Mean square error is defined by equation (16):

MSE = (1/n²) Σ_{i=1}^{n²} (b_i − y_i)²   (16)

where MSE is the mean square error, b_i are the elements of the block B, Y is a vector in the codebook, and y_i is an element of that vector.

It can therefore be seen that equation (16) compares elements of the block B with corresponding elements of the vector Y.
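As a sketch only, the minimum-MSE selection of equation (16) might be written as follows in Python/NumPy; the function and argument names are not taken from the specification:

import numpy as np

def nearest_codebook_vector(block, codebook):
    """Equation (16): pick the codebook vector with the minimum mean square error.

    block    -- n x n array (after mean removal)
    codebook -- array of shape (N_cb, n*n), one candidate vector per row
    Returns the index of the best vector."""
    b = np.asarray(block, dtype=float).reshape(-1)
    mse = ((np.asarray(codebook, dtype=float) - b) ** 2).mean(axis=1)
    return int(np.argmin(mse))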

The sub-codebooks are generated using a singular value decomposition (SVD) based technique, as is now described. In general terms, singular value decomposition is a known method in linear algebra for diagonalising a rectangular m x n matrix A by factorising it into three matrices U, S and V, such that:

A = U S V^T   (17)

Where:

S is a diagonal m x n matrix with elements s_i along its principal diagonal and zeros at all other positions;

U is an orthonormal matrix of size m x m; and

V is an orthonormal matrix of size n x n.

The matrix U is referred to as the left singular matrix, while the matrix V is referred to as the right singular matrix. The diagonal matrix S is referred to as the singular value matrix.

SVD is an approximation technique which effectively reduces any matrix into a smaller invertible and square matrix. One special feature of SVD is that it can be performed on any real m x n matrix.

Calculating the SVD of the matrix A involves finding the eigenvalues and eigenvectors of the matrices AA^T and A^T A, where A^T is the transpose of the matrix A.

The eigenvectors of A^T A make up the columns of V, while the eigenvectors of AA^T make up the columns of U. The eigenvalues of A^T A or AA^T are the squares of the singular values of A. The singular values are the diagonal entries of the matrix S and are typically arranged in descending order. The singular values are always real numbers if the matrix A is a real matrix, and the matrices U and V are then also real.

Equation (17) can be expressed as:

A = Σ_{i=1}^{p} s_i u_i v_i^T   (18)

where u_i and v_i are the ith column vectors of U and V respectively, s_i are the singular values, and p = min(m, n).

If the singular values are ordered so that s_1 ≥ s_2 ≥ ... ≥ s_p, and if the matrix A is set to have a rank r < p, then the last p − r singular values are set to be equal to zero, and the matrix A can be approximated by a matrix A* with rank r. That is, the SVD can be expressed by equation (19):

A* = Σ_{i=1}^{r} s_i u_i v_i^T   (19)

The approximation error matrix E_r is dependent on the performance accuracy of the quantisation and/or truncation by parameter r, which can be described as:

E_r = A − A*   (19a)

The 2-norm of a matrix may be calculated from the singular values. Therefore the 2-norm of the error matrix (A − A*) is equal to the next singular value not used in A*, that is, the 2-norm of the approximation error is calculated by:

||A − A*||_2 = s_{r+1}   (20)

As the singular values are in descending order, it can be seen that the error decreases towards zero in the 2-norm sense.

One important result of the SVD of A is A* in equation (19), which is the closest rank r matrix to A. The term 'closest' means that A* minimises the sum of the squares of the differences between the elements of A and A*, Σ_{i,j} |a_ij − a*_ij|².

It will be appreciated that some standard libraries include a function performing SVD. Such a function is provided, for example, in the MATLAB environment.
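For instance, NumPy's numpy.linalg.svd provides such a function (MATLAB's svd behaves analogously); the following sketch, using an arbitrary random matrix, merely illustrates equations (17), (19) and (20):

import numpy as np

# Decompose an arbitrary matrix and check A = U S V^T (equation (17)).
A = np.random.rand(16, 200)                    # e.g. 4 x 4 blocks stacked as columns
U, s, Vt = np.linalg.svd(A, full_matrices=False)
assert np.allclose(A, U @ np.diag(s) @ Vt)

# Rank-r approximation A* of equation (19): keep only the r largest singular values.
r = 7
A_star = U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]

# The 2-norm of the error equals the next singular value, s_{r+1} (equation (20)).
print(np.linalg.norm(A - A_star, 2), s[r])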

Having provided an overview of SVD, the generation of a sub-codebook using SVD is now described with reference to Figure 7A. The codebook is generated from one or more training images, and more particularly from blocks of pixels within those images. It is preferred that at least two images are used for training purposes. In such a case the images can simply be concatenated to generate training data. The training data is obtained at step S20. At step S21 non-overlapping n x n blocks are defined within the training data, and the blocks are used to create a matrix. The matrix is arranged such that each column of the matrix represents a block. The matrix is shown in Figure 8. It can be seen that the matrix has N_b columns, where N_b is the number of blocks in the training data, and n x n rows, where the blocks are n x n blocks as described above. Each block is represented by a respective column in the matrix.
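A minimal sketch of steps S20 and S21 in Python/NumPy, assuming the training images have already been concatenated into a single 2-D array; the function name is illustrative:

import numpy as np

def training_matrix(training_image, n=4):
    """Steps S20/S21: stack non-overlapping n x n blocks as columns of a matrix.

    The returned matrix has n*n rows and one column per training block,
    as illustrated in Figure 8."""
    img = np.asarray(training_image, dtype=float)
    rows, cols = img.shape
    blocks = [img[r:r + n, c:c + n].reshape(-1)
              for r in range(0, rows - n + 1, n)
              for c in range(0, cols - n + 1, n)]
    return np.column_stack(blocks)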

At step S22 the mean of all values in the matrix of Figure 8 is calculated. At step S23 this mean value is subtracted from each value of the matrix. At step S24 an operation is carried out to determine the number of columns in the matrix, and consequently the number of blocks in the training data.

At step S25, the blocks of the training data are classified into five classes (a shade class and four edge classes) as described above. Implementation of the classification process is described in further detail below with reference to Figure 7B. The remaining processing of Figure 7A is repeated for blocks of each of the five classes in turn.

At step S26 a class to be processed is selected. At step S27 a matrix is created from the matrix of Figure 8 comprising columns representing blocks of the training data which are classified in the currently processed class. This involves selecting columns of the matrix of Figure 8 which represent blocks in the appropriate class. The matrix created at step S27 has a column for each block classified in the currently processed class.

At step S28 singular value decomposition is carried out on the matrix created at step S27. At step S29 a size for the codebook to be created is set, while at step S30 a rank value is set. The codebook is then generated at step S31, and transposed at step S32. At step S33 a check is carried out to determine whether further classes remain to be processed. If this is the case, processing returns to step S26, otherwise processing ends at step S34.

Generation of a codebook at steps S28 to S32 is now described in further detail with reference to Figure 9. Figure 9 shows a matrix 101 created at step S27 of Figure 7A. The matrix has n x n rows and a number of columns equal to the number of blocks in the currently processed class, denoted N_b^Class_i, where Class_i is the currently processed class. The SVD operation of step S28 creates three matrices U, S, V as described above. The dimensions of these matrices are shown in Figure 8. It can be seen that the matrix U is a square matrix of dimension n x n, the matrix V is a square matrix of dimension N_b^Class_i, while the matrix S has n x n rows and N_b^Class_i columns.

The three matrices U, S and V are processed to generate the codebook at step S31. The rank value r is selected at step S30 to give good compression results. Empirical testing has shown that a rank value of 7 works well in some embodiments of the present invention. The codebook size N_cb is chosen so as to balance the competing goals of good image quality and small storage space. A codebook with 200 entries is used in some embodiments of the invention. It will be recalled that the codebook will include the desired number of vectors, each vector having length n x n.

The codebook is created by multiplying the matrices U, S and V, but to produce the desired size of codebook only parts of the matrices are multiplied together. Specifically, only the first r columns of the matrix U are used. A square portion of the matrix S is used, that square portion being of size r x r and being located at the top left hand corner of the matrix S. A rectangular portion of the matrix V is used, that portion comprising the first N_cb rows and first r columns.

The codebook is created at step S31 by multiplying the indicated parts of the matrices U and S with the transpose of the indicated part of the matrix V, and transposing the result of the multiplication.

That is:

Codebook = (U_1 S_1 V_1^T)^T   (21)

where U_1, S_1 and V_1 are the parts of the matrices U, S and V detailed above.
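By way of illustration, steps S28 to S32 and equation (21) might be sketched as follows in Python/NumPy, assuming the per-class training matrix of step S27 has already been formed (mean-removed, steps S22/S23) and contains at least N_cb blocks; the function and argument names are not taken from the specification:

import numpy as np

def make_codebook(class_matrix, rank=7, codebook_size=200):
    """Steps S28 to S32: build a sub-codebook for one class.

    class_matrix  -- (n*n, N_b^Class_i) matrix, one column per training block
    rank          -- the rank value r of step S30
    codebook_size -- the number of codebook vectors N_cb of step S29
    The defaults 7 and 200 are the values quoted in the description."""
    U, s, Vt = np.linalg.svd(np.asarray(class_matrix, dtype=float),
                             full_matrices=False)
    U1 = U[:, :rank]                    # first r columns of U
    S1 = np.diag(s[:rank])              # top-left r x r portion of S
    V1t = Vt[:rank, :codebook_size]     # first N_cb rows and first r columns of V, transposed
    return (U1 @ S1 @ V1t).T            # equation (21): one codebook vector per row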

It was indicated above that the implementation of step S25 of Figure 7A would be described in further detail with reference to Figure 7B. Step S25 is concerned with processing blocks of the training data to identify classes with which the blocks are associated. At step S40 of Figure 7B a vector having a length equal to the number of blocks in the training data is created. At step S41 a classification process is applied to the blocks, and the vector stores an integer value in the range 1 to 5 indicating the class with which the block represented by the corresponding column of the matrix is associated. At step S42 a vector for each class is created, the vector indicating the columns associated with that class. The vector created at step S42 is used to generate the matrix 101 at step S27.

Figure 10 is a high level schematic illustration of the processing described above. The image 1 is processed in terms of a plurality of blocks, and one of those blocks 6 is shown in Figure 10. It can be seen that the block is input to a classifier 7, which is configured to select between a shade classification 8 and an edge classification 9. The edge classification 9 comprises four sub-classifications as indicated above. These sub-classifications are a vertical-edge classification 10, a horizontal-edge classification 11, a 45° diagonal-edge classification 12, and a 135° diagonal-edge classification 13.

Thus, the classifier 7 classifies the block 6 into one of five classes. Each class has an associated sub-codebook as described above, and these sub-codebooks 14, 15, 16, 17, 18 are shown in Figure 10. The processed block 6 is thus classified into one of the five described classes, and a vector is subsequently selected from the sub-codebook associated with the class. The selected vector is then stored by reference to its index 19.

The method described above has been implemented in MATLAB and has been shown to have good performance. A database of six grey level images was developed to evaluate the method's effectiveness, and to compare it with that of other image compression techniques.

Performance was characterised using mean square error (MSE) and Peak Signal to Noise Ratio (PSNR) measures.

The PSNR is defined as follows:

PSNR = 10 log_10 (255² / MSE)   (22)

where MSE is the mean square of the error between the original and the reconstructed image and is defined by equation (23):

MSE = (1 / MN) Σ_{i=1}^{M} Σ_{j=1}^{N} (x_ij − y_ij)²   (23)

where MN is the total number of pixels in the image, x is the input image, and y is the reconstructed image.
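A sketch of equations (22) and (23) in Python/NumPy; the peak value of 255 is an assumption appropriate to 8-bit grey level images, and the function name is illustrative:

import numpy as np

def psnr(original, reconstructed, peak=255.0):
    """Equations (22) and (23): PSNR between an image and its reconstruction."""
    x = np.asarray(original, dtype=float)
    y = np.asarray(reconstructed, dtype=float)
    mse = ((x - y) ** 2).mean()                 # equation (23)
    return 10.0 * np.log10(peak ** 2 / mse)     # equation (22)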

Table 1 shows the results obtained:

TABLE 1

The Barbara image is shown in Figure 11A, the Peppers image is shown in Figure 11B, the Lena image is shown in Figure 11C, the Baboon image is shown in Figure 11D, the Bird image is shown in Figure 11E and the Goldhill image is shown in Figure 11F.

Results shown in the portion of Table 1 marked "Described method" were obtained using the method described above, where the codebooks were constructed using SVD as described above, trained using the Barbara and the Peppers images.

Results shown in the portion of Table 1 marked "VQ" were obtained using the method described in Linde Y., Buzo A.H., and Gray R.M.: "An Algorithm for Vector Quantizer Design", IEEE Trans Commun. COM-28 (1980), 84-95. Results shown in the portion of Table 1 marked "JPEG-2000" were obtained using the JPEG-2000 algorithm implemented as described in Gonzalez R.C., Woods R.E., and Eddins S.L.: "Digital Image Processing Using Matlab (2004)."

The results shown in Table 1 indicate that, in all cases, the described method using SVD-based codebook generation outperformed the VQ method and the JPEG-2000 implementation, using PSNR as the image quality measure.

The simulation results also indicated that the proposed method allowed faster training than the VQ method. Furthermore, the method described above required shorter CPU encoding time than the VQ method.

Figure 12 is a graph showing how the PSNR for the Lena image varied according to bit rate for the method described above as well as the VQ method and the JPEG-2000 implementation. Again, it can be seen that the described method provided results of higher quality than the VQ method or the JPEG-2000 method.

Although preferred embodiments of the present invention have been described above, it will be appreciated that various modifications can be made to the described embodiments without departing from the spirit and scope of the present invention, as defined by the appended claims.