AUTOMATED METHOD FOR LOSSLESS DATA COMPRESSION AND DECOMPRESSION OF A BINARY STRING

Title:

AUTOMATED METHOD FOR LOSSLESS DATA COMPRESSION AND DECOMPRESSION OF A BINARY STRING

Document Type and Number:

WIPO Patent Application WO/2004/051863

Kind Code:

A1

Abstract:

The present invention concerns an automated method for lossless compression method, of a binary string (S) of input data into a compressed binary string (E) of output data, which subdivides the input string (S) into words (1) each comprising n bits and, on the basis of a sorting ; O = &lcub B₀, B₁, B₂,…,B_2n-1&rcub = &lcub B_k&rcub _{k=0,1,2,…,2}ⁿ-1 of the 2ⁿbinary values (B) which may be assumed by a binary word, for each binary value B_k of the sorting O, for k = 0, 1, 2,(2ⁿ-2), assuming that (S₀ = S), it locates the words having binary value B_k within the string S_k, it sores in an array A_k the distance d_i^B_k of each located word with respect to the preceding word having binary value B_k, it juxtaposes to the output string (E) the elementsd_i^B_k stored in the array A_k and it constructs a string S_k+1 obtained by eliminating from the string S_k the N_k located words, the method finally juxtaposing to the output string (E) the number N₂ⁿ_-1 of words of the last string S₂ⁿ_-1. The present invention also concerns the related automated method for decompressing a compressed binary string (E) of input data into a binary string (S) of output data. The present invention further concerns the instruments necessary to perform the automated methods and to the apparatus performing the same.

Inventors:

LEANZA ELENA (IT)

Application Number:

PCT/IT2002/000762

Publication Date:

June 17, 2004

Filing Date:

December 04, 2002

Export Citation:

Click for automatic bibliography generation Help

Assignee:

ATOP INNOVATION SPA (IT)
LEANZA ELENA (IT)

International Classes:

H03M7/30; (IPC1-7): H03M7/30

Foreign References:

US6388585B1	2002-05-14
DE3523247A1	1987-01-02

Other References:

HOSANG M.: "A character elimination algorithm for lossless data compression", MADE PUPLIC DURING DATA COMPRESSION CONFERENCE 2002, 2 April 2002 (2002-04-02) - 4 April 2002 (2002-04-04), pages 1 - 9, XP002246070, Retrieved from the Internet [retrieved on 20030617]
HOSANG M.: "A character elimination algorithm for lossless data compression", PROCEEDINGS DATA COMPRESSION CONFERENCE 2002, 2 April 2002 (2002-04-02) - 4 April 2002 (2002-04-04), Snowbird, UT, USA, pages 457, XP002246068
MOFFAT A ET AL: "SELF-INDEXING INVERTED FILES FOR FAST TEXT RETRIEVAL", ACM TRANSACTIONS ON INFORMATION SYSTEMS, ASSOCIATION FOR COMPUTING MACHINERY, NEW YORK, US, vol. 14, no. 4, 1 October 1996 (1996-10-01), pages 349 - 379, XP000635100, ISSN: 1046-8188
PECHURA M. A., MCINTYRE D. R.: "Data Compression using static Huffman code-decode tables", COMMUNICATIONS OF THE ACM, vol. 28, no. 6, June 1985 (1985-06-01), pages 612 - 616, XP002246069

Attorney, Agent or Firm:

Scilletta, Andrea (Via Piemonte 26, Roma, IT)

Download PDF:

View/Download PDF PDF Help

Claims:

CLAIMS

1.

Automated method for compressing, without information loss, or lossless compression method, a binary string S of input data into a com pressed binary string E of output data, characterised in that it comprises the following steps: A. fixing a sorting O of the 2"binary values (B), which may be assumed by a binary word comprising n bits, according to a ruel ([1]) of sort ing: B. inizialising the output string E to the void string: E _, C. subdividing the input string S into words (1) each comprising n bits; D. for each binary value Bk of the sorting O, for k = 0,1, 2,..., (2n 2), as suming that So = S, D. 1 locating within the string Sk the Nk words, with Nk ! 0, having the binary value Bk, D. 2 in the case when Nk > 0, storing in an array Ak the distance dkBk, for i = 0,1, 2,..., Nk1, of each word located during step D. 1 with respect to the preceding word having binary value Bk, in such a way that the distance d B, of the first located word is equal to the number of words of the string Sk which precede it and the distance dBk of the located words following the first one is equal to the number of words interposed between the word located during step D. 1 and the preceding word having binary value Bk, D. 3 juxtaposing to the output string E a datum related to the pres ence of at least one word having binary value Bk within the string Sk and, in the case when Nk > 0, the elements d,., for i = 0, 1, 2,..., Nk 1, stored in the array Ak during step D. 2, D. 4 constructing a string Sk+l obtained by eliminating from the string Sk the Nk words, with Nk 2 O, located during step D. 1 having the binary value Bk, and repeating loop D, E. juxtaposing to the output string E the number N2n_1 of words of the last string S2.,.

2.	Method according to claim 1, characterised in that the datum re lated to the presence of at least one word having binary value Bk within the string Sk, which is juxtaposed to the output string E during step D. 3, is a flag bit.

3.	Method according to claim 2, characterised in that said flag bit is equal to"1", in the case when at least one word having binary value Bk is present within the string Sk, or otherwise is equal to"0", in the case when no word having binary value Bk is present within the string Sk.

4.	Method according to claim 1, characterised in that the datum re lated to the presence of at least one word having binary value Bk within the string Sk, which is juxtaposed to the output string E during step D. 3, is the number Nk of words located within the string Sk.

5.	Method according to anyone of the preceding claims, character ised in that the sorting rule assumes the 2"binary values (B) as positive binary values and the sorting O fixed during step A is either the decreas ing sorting or the increasing sorting of such positive binary values.

6.

Method according to anyone of the claims from 1 to 4, character ised in that during step A the method counts the number of occurrences of each binary value Bk within the input binary string S, and the sorting rule fixes the sorting O according to either the decreasing sorting or the in creasing sorting of the occurrences of the binary values Bk within the in put binary string S, the method further comprising the step: F. juxtaposing to the output string E data related to the sorting O fixed during step A.

7.	Method according to anyone of the preceding claims, character ised in that during step D. 3 the elements dBk are juxtaposed to the output string E according to a binary encoded representation.

8.

Method according to claim 7, characterised in that said encoded representation defines a first positive integer value gk, with gk > 0, and a second positive integer value xk, with xk ! 0, and the value of dBk is represented according to: a first mode, in the case ( [2]) when a second mode, in the case ( [4]) when a third mode, in the case ( [6]) when.

9.

Method according to claim 8, characterised in that according to said first mode, djBk is represented: in the case when gk > 0, by (gk + 1) binary bits according to the formula ([3]): where"&"is the juxtaposition operator and [Z] is the positive two's complement binary representation having t bits of the value Z ; and in the case when gk = 0, by only one binary bit equal to "0".

10.	Method according to claim 8 or 9, characterised in that according to said second mode, diBk is represented by (gk + xk + 2) binary bits ac cording to the formula ( [5]) :.

11.

Method according to anyone of the claims from 8 a 10, character ised in that according to said third mode, d,. is represented B _ in the case when xk > 0 Bk29k 2 8, +xt 2 gFk + X. T] bits, where In [X] is the operator returning the integer part of X, according to the formula ( [7]) : is the operator constructing a string including W bits rn' equal to"1"and Recez L Y J is the operator returning the remainder of the division y, i. e. it is equal to wIn ; and dBx _ 2g. y y in the case when xk = 0, by gk + xk + 1 + In t binary (in \. y bits according to the formula ([8]):.

12.	Method according to anyone of the claims from 8 to 11, charac terised in that for each binary value Bk of the sorting O, for k = 0, 1, 2,..., (2n2), it is gk =g xk = x.

13.	Method according to claim 12, characterised in that it also com prises the step: G. juxtaposing to the output string E data related to the first positive in teger value g and to the second positive integer value x.

14.

Method according to anyone of the claims from 8 to 12, charac terised in that it determines for each array Ak, for k = 0,1, 2,..., (2n2), the optimum first and second positive integer values gk and xk, i. e. which are apt to minimise the size of memory needed to represent the array Ak according to the binary encoded representation, the method juxtaposing to the output string E, during step D. 3, data related to the optimum first and second positive integer values gk and xk.

15.	Method according to claim 14, characterised in that it determines the optimum first and second positive integer values gk and xk by means of a trial and error process.

16.

Method according to claim 15, characterised in that the trial and error process verifies the size of memory needed to represent the array Ak according to the binary encoded representation for all the values of the first and second positive integer values gk and xk ranging, respec tively, from 0 and G, with G > 0, and from 0 and X, with X > 0.

17.	Method according to claim 16, characterised in that G=12 and/or X=3.

18.	Method according to anyone of the preceding claims, character ised in that the number n of bits, included in the words into which the input string S is subdivided during step C, is an involution of 2: n = 2m.

19.	Method according to anyone of the preceding claims, character ised in that it furthermore comprises the step: H. juxtaposing to the output string E data related to the number n of bits included in the words into which the input string S is subdivided dur ing step C.

20.	Method according to anyone of the preceding claims, character ised in that it still comprises the step : I. juxtaposing to the output string E data related to the size of the input string S.

21.	Method according to anyone of the preceding claims, character ised in that it is iterated, being applied at each hth iteration, with h > 1, to the output string E obtained by the preceding (hl)th iteration.

22.

Method according to claim 21, characterised in that, for at least two consecutive iterations (h1) and h, with h > 1, the corresponding num bers nh, and nh of bits, included in the words into which the corre sponding input strings S are subdivided during step C, are different with respect to one another: nh1 $ nh.

23.

Automated method for decompressing a compressed binary string E of input data into a binary string S of output data, characterised in that the compressed binary string E of input data has been obtained by applying to the binary string S the automated compression method ac cording to anyone of claims from 1 to 20, and in that it comprises the fol lowing steps: J. reading within the compressed string E the number N2., of words having the binary value B2.,, K. constructing a string S2n_1 comprising N2n_l words having the bi nary value B2n_l ; L. for each binary value Bk of the sorting 0, for k = (2n 2),(2n 3),..., 2,1, 0, constructing a corresponding string Sk executing the steps of the loop : L. 1 initialising the string Sk to the string Su+,, L. 2 reading within the compressed string E the datum related to the presence of at least one word having binary value Bk within the string Sk, L. 3 in the case when at least one word having binary value Bk is present within the string Sk, reading within the compressed string E the values dB*, for i=0, 1, 2,..., Nk1, with Nk >0, of the distances of each word having binary value Bk from the preceding word having binary value Bk, and inserting into the string Sk words having the binary value Bk at the positions lo cated by the values d Bk L. 4 repeating the loop L, M. assuming the binary string S of output data equal to the string SO : S =SO.

24.	Method according to claim 23, characterised in that it is iterated, the compressed binary string E of input data having being obtained by applying to the binary string S the automated compression method ac cording to claim 21 or 22.

25.	Electronic apparatus, comprising at least one central processing unit and at least one memory unit, characterised in that it executes the automated compression method according to anyone of the claims from 1 to 22.

26.	Electronic apparatus, comprising at least one central processing unit and at least one memory unit, characterised in that it executes the automated decompression method according to claim 23 or 24.

27.	Electric, magnetic or electromagnetic signal modulated by a digital signal, characterised in that said digital signal comprises at least one compressed binary string E obtained by means of the automated compression method according to anyone of the claims from 1 to 22.

28.	Computer program comprising code means adapted to execute, when running on a computer, the automated compression method ac cording to anyone of the claims from 1 to 22.

29.	Memory medium readable by a computer, storing a program, characterised in that the program is the computer program according to claim 28.

30.	Computer program comprising code means adapted to execute, when running on a computer, the automated decompression method ac cording to claim 23 or 24.

31.	Memory medium readable by a computer, storing a program, characterised in that the program is the computer program according to claim 30.

Description:

AUTOMATED METHOD FOR COMPRESSING BINARY STRINGS WITHOUT INFORMATION LOSS, AND RELATED AUTOMATED METHOD FOR DECOMPRESSING.

******* The present invention concerns an automated method for com- pressing binary strings without information loss, or lossless compression method, and related automated method for decompressing, which makes possible, in a simple, fast and inexpensively producible way, to minimise in a lossless way the dimension of binary strings.

The present invention further concerns the instruments necessary to perform the automated methods and to the apparatus performing the same.

It is known that one of the main problems of computer technology and, more broadly, of telecommunications, lies in the larger and larger amount of data needed for processing. This causes a heavy occupation of memories readable by computers, such as removable (e. g. CD-ROMs and floppy disks) or fixed (the so-called Hard Disks of computers) memory media, and a consequent long occupation of telecommunications net- works during information transmission, which involves higher transmission error probability, higher transmission cost, and physical limitation to the feasibility of specific real time applications. In this regard, it is possible to consider some applications for transmitting large data quantity by em- ploying transmission channels having limited capacity; in particular, it is possible to recall the attempt for carrying out an interactive television

service, known as Video-on-Demand, based upon the twin wire telephone network, that, due to the great limitations of the channel, has to necessar- ily transmit a smaller data quantity, making consequently the television image resolution worse.

Considering that the quantity of information presently stored in com- puters and/or exchanged through telecommunications networks is rapidly growing, also due to the big push exerted by the Internet which exten- sively uses information having large data quantity (such as digital im- ages), the problem of reducing the quantity of data representing both in- formation and application programs is extremely critical.

In order to solve such problem, many solutions have been devel- oped, providing for an encoding of data, represented in binary format, ac- cording to suitable methods in order to reduce the memory occupation.

Generally, such encoding methods are particularly advantageous when applied to specific types of data to be encoded, having specific charac- teristics. In other words, each individual encoding method is more or less advantageous depending on if it is applied, for instance, to data repre- senting stationary images or to data representing video images or to data representing sounds.

The compression methods can be classified according to the con- gruence between decompressed data and original data, dividing the con- cerned compression methods into methods having no information loss or "lossless"methods, when the data reconstructed or decompressed from the compressed data are identical to the original ones, and methods hav-

ing information losses or"lossy"methods, in which the reconstructed data lose a portion of the original data information. Obviously lossless methods are preferable and, in some specific applications when all the information must be kept, are the only ones that may be used.

Examples of lossless methods are the Huffman encoding method and the Lempel-Ziv-Welch, or LZW, encoding method.

Such methods have some drawbacks.

For instance, the Huffman method operates only on static strings, creating a corresponding dictionary for each string, while the LZW method operates on dynamic strings, creating a dynamically updated dictionary.

Consequently, both said methods are complex and their computer execu- tion is slow.

Moreover, just because each encoding method is suitably designed in connection with specific types of data to be encoded, newer and newer encoding requirements for data having particular characteristics are being encountered. This causes a certain difficulty of use, especially in case of transmission of composite data, requiring the use of different encoding methods.

Therefore, the search of improved encoding methods aimed at even more compressing the data to be stored and/or transmitted is being even more developed.

It is an object of this invention, therefore, to provide an automated method for compressing binary strings without information loss, or lossless compression method, which is simple, fast executed by a com-

puter, and which makes possible minimise the number of bits needed to represent the number under consideration.

It is another object of this invention to provide such a compression method which generates compressed data strings suitably organised so as to be possibly further compressed.

It is still object of this invention to provide the related automated method for decompressing, and the instruments necessary to perform the automated methods and the apparatus performing the same.

It is specific subject matter of this invention an automated method for compressing, without information loss, or lossless compression method, a binary string S of input data into a compressed binary string E of output data, characterised in that it comprises the following steps: A. fixing a sorting O of the 2"binary values (B), which may be assumed by a binary word comprising n bits, according to a rule of sorting: B. inizialising the output string E to the void string : E ="" : ="" ; C. subdividing the input string S into words each comprising n bits; D. for each binary value Bk of the sorting O. for k = 0,1, 2,..., (2n-2), as- suming that So = S, D. 1 locating within the string Sk the Nk words, with Nk > 0, having the binary value Bk, D. 2 in the case when Nk > 0, storing in an array Ak the distance , for i = 0,1, 2,..., Nk-1, of each word located during step D. 1

with respect to the preceding word having binary value Bk, in such a way that the distance d0Bk of the first located word is equal to the number of words of the string Sk which precede it and the distance d,'. k of the located words following the first one is equal to the number of words interposed between the word located during step D. 1 and the preceding word having binary value Bk, D. 3 juxtaposing to the output string E a datum related to the pres- ence of at least one word having binary value Bk within the string Sk and, in the case when Nk > 0, the elements dBk, for i = 0,1, 2,..., Nk-1, stored in the array Ak during step D. 2, D. 4 constructing a string Sk+1 obtained by eliminating from the string Sk the Nk words, with Nk 2 O, located during step D. 1 having the binary value Bk, and repeating loop D, E. juxtaposing to the output string E the number N2.-, of words of the last string S2.-,.

Preferably according to the invention, the datum related to the pres- ence of at least one word having binary value Bk within the string Sk, which is juxtaposed to the output string E during step D. 3, is a flag bit, still more preferably equal to"I", in the case when at least one word having binary value Bk is present within the string Sk, or otherwise is equal to "0", in the case when no word having binary value Bk is present within the string Sk.

Alternatively, the output string E may further comprise an initial flag

bit for indicating whether at least one binary value Bk is not present within the input string S (and, therefore, the specific flags of each individual bi- nary value Bk must be read) or all the binary values Bk are present (and, therefore, the output string E does not comprise specific flags of each in- dividual binary value Bk because all the values must be read).

Always according to the invention, the datum related to the presence of at least one word having binary value Bk within the string Sk, which is juxtaposed to the output string E during step D. 3, may be the number Nk of words located within the string Sk.

Still according to the invention, the sorting rule may assume the 2n binary values (B) as positive binary values and the sorting O fixed during step A may be either the decreasing sorting or the increasing sorting of such positive binary values.

Furthermore according to the invention, during step A the method may count the number of occurrences of each binary value Bk within the input binary string S, and the sorting rule may fix the sorting O according to either the decreasing sorting or the increasing sorting of the occur- rences of the binary values Bk within the input binary string S, the method further comprising the step : F. juxtaposing to the output string E data related to the sorting O fixed during step A.

In this way, the method is automatically adaptive to the type of file to be compressed, in conformity to the occurrences of the binary values.

Preferably according to the invention, during step D. 3 the elements

dBk are juxtaposed to the output string E according to a binary encoded representation.

Still more preferably according to the invention, said encoded repre- sentation defines a first positive integer value gk, with gk #0, and a sec- ond positive integer value xk, with xk # 0, and the value of diBk is repre- sented according to: - a first mode, in the case when - a second mode, in the case when - a third mode, in the case when Always according to the invention, according to said first mode, dBk may be represented : - in the case when gk > 0, by (gk + 1) binary bits according to the formula ([3]) : where"&"is the juxtaposition operator and [Zl is the positive two's complement binary representation having t bits of the value Z; and in the case when gk = 0, by only one binary bit equal to "0" Still according to the invention, according to said second mode, djBk

may be represented by (gk + xk + 2) binary bits according to the formula : Furthermore according to the invention, according to said third mode, diBk is represented: - in the case when xk > 0, by alBt _ 2gk gk +xk +2+In binary --/ bits, where In [X] is the operator returning the integer part of X, according to the formula :

is the operator constructing a string including W bits equal to"1"and Re- LYJ is the operator returning the remainder of the division-, i. e. it is equal to bu-2 bk -in the case when xk=O, by gk+xk+I+In di-binary 2 gk +xk bits according to the formula : Always according to the invention, for each binary value Bk of the sorting O, for k = 0, 1, 2,..., (2"-2), it may be gk =g Xk =x Still according to the invention, the method may also comprise the step: G. juxtaposing to the output string E data related to the first positive in- teger vialue g and to the second positive integer value x.

Preferably according to the invention, the method may determine for each array Ak, for k = 0,1, 2,..., (2n-2), the optimum first and second posi- tive integer values gk and xk, i. e. which are apt to minimise the size of memory needed to represent the array Ak according to the binary en- coded representation, the method juxtaposing to the output string E, dur- ing step D. 3, data related to the optimum first and second positive integer values gk and xk.

Furthermore according to the invention, the method may determine the optimum first and second positive integer values gk and xk by means of a trial and error process.

Always according to the invention, the trial and error process may verify the size of memory needed to represent the array Ak according to the binary encoded representation for all the values of the first and sec- ond positive integer values gk and xk ranging, respectively, from 0 and G, with G > 0, and from 0 and X, with X > 0.

Preferably according to the invention, G=12 and/or X=3.

Still preferably according to the invention, the number n of bits, in- cluded in the words into which the input string S is subdivided during step C, is an involution of 2: n=2m Furthermore according to the invention, the method may furthermore comprise the step: H. juxtaposing to the output string E data related to the number n of bits included in the words into which the input string S is subdivided dur- ing step C.

Preferably according to the invention, the method still comprises the step: I. juxtaposing to the output string E data related to the size of the input string S.

Still preferably according to the invention, the method is iterated, being applied at each h-th iteration, with h > 1, to the output string E ob- tained by the preceding (h-1)-th iteration.

Always according to the invention, for at least two consecutive itera- tions (h-1) and h, with h > 1, the corresponding numbers n*_l and nh of bits, included in the words into which the corresponding input strings S are subdivided during step C, may be different with respect to one another: n*_l X nh It is still specific subject matter of this invention an automated method for decompressing a compressed binary string E of input data into a binary string S of output data, characterised in that the compressed bi-

nary string E of input data has been obtained by applying to the binary string S the just described automated compression method, and in that it comprises the following steps: J. reading within the compressed string E the number N2n_l of words having the binary value B2n_l ; K. constructing a string S2"_l comprising N2.-, words having the binary value B2 L. for each binary value Bk of the sorting O, for k = (2n - 2),(n2 - 3),..., 2,1, 0, constructing a corresponding string Sk executing the steps of the loop : L. 1 initialising the string Sk to the string Sk+1, L. 2 reading within the compressed string E the datum related to the presence of at least one word having binary value Bk within the string Sk, L. 3 in the case when at least one word having binary value Bk is present within the string Sk, reading within the compressed string E the values dBi, for i=0, 1,2,..., Nk-1, with Nk # 0, of the distances of each word having binary value Bk from the preceding word having binary value Bk, and inserting into the string Sk words having the binary value Bk at the positions lo- cated by the values diBk L. 4 repeating the loop L, M. assuming the binary string S of output data equal to the string So S=So.

Always according to the invention, the decompression method may be iterated, the compressed binary string E of input data having being obtained by applying to the binary string S the iterated automated com- pression method.

It is also subject matter of this invention an electronic apparatus, comprising at least one central processing unit and at least one memory unit, characterised in that it executes the automated compression method previously illustrated.

It is further subject matter of this invention an electronic apparatus, comprising at least one central processing unit and at least one memory unit, characterised in that it executes the automated decompression method described before.

It is still subject matter of this invention an electric, magnetic or elec- tromagnetic signal modulated by a digital signal, characterised in that said digital signal comprises at least one compressed binary string E obtained by means of the automated compression method previously illustrated.

It is another subject matter of this invention a computer program comprising code means adapted to execute, when running on a computer, the automated compression method previously described.

It is further subject matter of this invention a memory medium read- able by a computer, storing a program, characterised in that the program is the computer program just described.

It is another subject matter of this invention a computer program comprising code means adapted to execute, when running on a computer,

the automated decompression method illustrated before.

It is further subject matter of this invention a memory medium read- able by a computer, storing a program, characterised in that the program is the computer program just described.

The present invention will be now described, by way of illustration and not by way of limitation, according to its preferred embodiments, by particularly referring to the Figures of the enclosed drawings, in which: Figure 1 schematically shows a portion of a binary string S of input data to be compressed by means of a preferred embodiment of the auto- mated method according to the invention; Figure 2 schematically shows the portion of the string S of Figure 1; Figure 3 schematically shows a flow chart of the trial and error proc- ess for determining the first and the second optimum positive integer val- ues gk and xk according to the preferred embodiment of the automated method according to the invention; Figure 4 schematically shows a first portion of the compressed bi- nary string E of output data generated by the preferred embodiment of the automated method according to the invention; Figure 5 schematically shows a portion of a second string S1 gener- ated during the execution of the preferred embodiment of the automated compression method according to the invention; Figure 6 schematically shows a second portion of the compressed binary string E of output data generated by the preferred embodiment of the automated method according to the invention;

Figure 7 schematically shows a third portion of the compressed bi- nary string E of output data generated by the preferred embodiment of the automated method according to the invention; and Figure 8 schematically shows a flow chart of the preferred embodi- ment of the automated compression method according to the invention.

In the following of the description same references will be used to indicate alike elements in the Figures.

With reference to Figure 1, it may be observed that the automated compression method firstly subdivides into words 1 of equal length the string S, having binary representation, which is to be compressed. Each word 1 comprises n bits, where in a preferred embodiment of the invention n = 8. Advantageously, in case the size of the string S is not a multiple of n, in order to obtain a last word of n bits the string S may be padded with a tail of bits equal to"0".

Each word 1 may assume one binary value B among 2"different bi- nary values, which, for n = 8, are equal to 2 = 256. The method fixes a sorting O of such 2n values according to a corresponding sorting rule : In the embodiment of the method shown in the Figures, the words 1 are considered as positive binary numbers the sorting of which is the de- creasing sorting from the decimal value 255, equal to the binary value "11111111", to the value 0, equal to the binary value"00000000".

With reference to Figure 2, the method scans the string S for deter- mining the position of the words 1'having the binary value Bo which is in

the first position within the sorting O, and it stores their reciprocal dis- tances dB° in a memory array Ao ; in the embodiment of Figure 1, such first binary value is the value"11111111". In particular, in the embodiment of Figure 2, the distance d0B0 of the first word 1' equal to Bo="11111111" is equal to the number of words 1 of the string S which precede it, while the distance dB° (for i > 0) between two successive words 1'equal to Bo"11111111"is equal to the number of words 1 which separate them, in such a way that in the case when two words equal to Bo="11111111"are consecutive, as in the case of the words 1"of Figure 2, their reciprocal distance is equal to 0, so that in Figure 2 it is d3B0 = 0.

Assuming that the string S comprises No words, with No > 0, having binary value Bo, at the end of such step of scanning the string S for searching all the words having binary value Bo, the array Ao contains No elements related to the distances dB°, for i = 0,1, 2,..., No-1, which locate the positions within the string S of the No words 1'having binary value Bo.

In the embodiment of the method shown in the Figures, each one of the No elements of the array Ao, in case No > 0, is represented accord- ing to a suitable encoded binary representation. In particular, being de- fined two positive integer values go and xo, with go 2 0 and xo : 0, de- pending on the value of this may be represented in one of three dif- ferent ways: 1) in the case when diB0 < 2g0 [2] dB° is represented by (go +1) binary bits, the first of which is equal to "0"and the following go bits form the positive two's complement bi- nary representation having go bits of the value of dB,, ; in other words,

in this case zizis represented according to the formula : where"&"is the juxtaposition operator and [Zl is the positive two's complement binary representation having t bits of the value Z; 2) in the case when diB° is represented by (g0 + x0 + 2) binary bits the first of which is equal to"1", the second one is equal to"0"and the following (go +xo) bits are the positive two's complement binary representation having (g0 + x0) bits of the value of (diB°-2go) ; in other words, in this case diB0 is represented according to the formula: 3) finally, in the case when 2 +x°) dB° i dB° _ 2g° zizis represented by go + +2+In binary bits, where 2 g° +x° In [X] is the operator returning the integer part of X, in such a way that the first bit of the encoded representation is equal to "1", the following bits are equal to"1", the following bit is equal to"0" and the last (go +xo) bits are the positive two's complement binary representation having (g0+Xo) bits of the value of is the operator returning the remainder of the division , i. e. it is equal to W-InCW- ; y Y- in other words, in this third case

dB° is represented according to the formula : is the operator constructing a string including W bits equal to"I".

In the embodiment of the method shown in the Figures, the two posi- tive integer values go and xo are fixed through a trial and error process which finds the optimum values of go and xo that minimise the size of memory needed to represent the array Ao according to the encoded rep- resentation. Specifically, with reference to Figure 3, immediately under- standable to one skilled in the art, for each trial value of go between 0 and G, with G > 0, the method verifies the memory occupation of the rep- resentation of the array Ao for each value of xo between 0 and X, with X > 0, and it considers as optimum values of go and xo the ones which minimise such memory occupation. Preferably, G=12 and X=3.

With reference to Figure 4, it may be observed that at the end of the step of encoded representation of the array Ao, the method constructs an output string E comprising: - a first flag bit equal to"1", for indicating that at least one word of value Bo exists within the string S (i. e. the array Ao comprises at least one element dB°), - the values of go and xo, and - the No elements of the array Ao having encoded representation.

As it will be illustrated later on, it is not necessary to insert into the output string E the value of No. However, alternative embodiments of the method may provide, instead of the first flag bit of indicating the presence of at least one word of value Bo within the string S, for the insertion into the output string E of the value of No (which assumes the value 0, in case of absence of words of value Bo within the string S).

Moreover, as shown in Figure 5, a second binary string S1 is con- structed, either physically or logically, which is obtained by eliminating from the initial string S all the words 1'having the binary value Bo.

In the case when the string S does not comprise any word of value equal to Bo, that is No = 0, the output string E comprises only one flag bit equal to"0".

Afterward, similarly to what executed during the preceding step, the method scans the string S1 for determining the position of the words hav- ing the binary value Bl which is in the second position within the sorting O, and it stores their reciprocal distances dB'in a corresponding memory

array A1 ; in the embodiment of the method shown in the Figures, such second binary value is the value "11111110".

Assuming that the string S1 comprises N1 words, with N1 #0, hav- ing the binary value Bl, at the end of such scanning step of the string S1 for searching all the words having the binary value B1, the array A1 con- tains N1 elements related to the distances diB1, for i=0, 1,2,..., N1=1, which locate the positions within the string S1 of the N1 words having bi- nary value Bl.

Similarly to what done for the array A0, in case N1 > 0, the embodi- ment of the method shown in the Figures fixes two positive integer values g1 and x1 which minimise the size of memory needed to represent the array A1 according to the encoded representation, illustrated above with reference to the array Ao.

With reference to Figure 6, it may be observed that, in case N1 > 0, at the end of the step of encoded representation of the array A1, the method juxtaposes at the head of the output string E - a first flag bit of flag equal to"1", for indicating that at least one word of value B1 exists within the string S (i. e. the array A1 comprises at least one element dB), - the values of g1 and x1, and -the N1 elements of the array Al having encoded representation.

In case the string S does not comprises any word of value equal to B1, that is N1 = 0, only one flag bit equal to"0"is juxtaposed to the output string E.

A third binary string S2 is then constructed, either physically or logi- cally, which is obtained by eliminating from the second string S1 all the words having the binary value Bl.

The method executes similar operations on the third string S2, scanning it for determining the position of the words having the binary value B2 which is in the third position within the sorting O, and it stores their reciprocal distances dB2 in a memory array A2 ; in the embodiment of the method shown in the Figures, such third binary value is the value "11111101". Assuming that the string S2 comprises N2 words, with N2 > 0, having the binary value B2, at the end of such scanning step the array A2 contains N2 elements related to the distances diB2, for i = 0, 1, 2,..., N2-1. Similarly to what previously done, in case N2 > 0 the method fixes two positive integer values g2 and x2 which minimise the size of memory needed to represent the array A2 according to the already illustrated encoded representation, and, as shown in Figure 7, at the head of the output string E there are juxtaposed: -a first flag bit equal to"1", - the values of g2 and x2, and - the N2 elements of the array A2 having encoded representation.

Also, in case the string S does not comprise any word of value equal to B2, that is N2 = 0, only one flag bit equal to"0"is juxtaposed to the output string E.

Afterward, a fourth binary string S3 is constructed, either physically or logically, which is obtained by eliminating from the third string S2 all

the words having the binary value B2.

In other words, the method iterates the preceding steps for all the 2n different binary values Bi according to the sorting O. However, for the last binary value B2n_1, that in the embodiment of Figure 1 is the value "00000000", the method stores only the number N2"_1 of words having such binary value B2"_1, since the last string S2"_l comprises only words having such binary value B2"-1, so that the related reciprocal distances d, 2n-} have constant value equal to 0. Therefore, for the last binary value B2"_1 it is not even executed an encoded representation as previously illustrated.

Finally, the method advantageously stores the initial size of the compressed string S, without the possible tail of bits equal to"0".

The embodiment of the method shown in the Figures provides for the following specific cases of the encoded representation for specific values of gk and xk, for k = 0, 1, 2,..., 2n-2 : - in the case when gk = 0, if the inequality [2] occurs, the element dBt is represented only by the bit "0", and - in the case when xk = 0, the inequality [4] may never occur and, therefore, in case the inequality [6] occurs, the encoded repre- sentation of the element dBk does not comprise the first bit"1", i. e. this is represented according to the formula

Figure 8 shows a flow chart which schematically summarises the method steps illustrated above.

The embodiment of the method shown in the Figures may be iter- ated, that is the output string E may be subjected to the same steps of the compression method, schematically shown in Figure 8, being treated as it were an input string S so as to obtain a new output string El. In this case, the initial size of the input string S is advantageously stored only once at the head of the last output string.

In particular, the string E is subdivided in words comprising the same number n of bits, if the new output string El which is obtained has smaller size, otherwise the string E is subdivided in words comprising a number n1 of bits that is different from n bits, preferably nl < n. Still more prefera- bly, the sizes nh of the words in which the strings to be compressed are subdivided ad each h-th iteration are equal to involutions of 2, according to the formula : nh = 2mh The number nh of bits of the words into which the input string to be compressed has been subdivided is stored at the output string head.

The automated decompression method reconstructs the input string S starting from the output string E. In particular, the output string E is read according to a sorting reversed with respect to the sorting O used for con- structing it, carrying out the following steps: - reading the number nh of bits of the words forming the string S ; - constructing a string S2"_1 comprising N2"_1 words having the binary

value B2n_l ; - for each successive binary value Bk, for k decreasing from (2n-2) to 0, - reading the flag bit and, in case it is equal to"1", reading the values of gk and xk and constructing a string Sk by inserting into the string Sk+1 words having the binary value Bk at the po- sitions located by the values dBk interpreted according to the encoded representation of formulas [3], [5], [7] and [8].

It is clear that, knowing the number of words included in the string Sk+1, it is not necessary to know the number Nk of words having binary value Bk to be inserted into the string Su+1, thanks to the fact that the at- tempt of reading dBk, for i > Nk, would be inconsistent with the number of words included in the string Sk+1.

In case the compression method has been iterated, the decompres- sion method must also be iterated up to reconstructing the input string S.

Advantageously, an input data string to be compressed may prelim- narily be subdivided in equal parts or"segments", all of which may sepa- rately be compressed for being then, for example, transmitted and, at re- ception, decompressed and composed so as to form again the original string. This segmentation is done in order on the one hand to lighten the compression process and on the other hand to allow data to be sent in several segments along one of common data transmission lines, in such a way that the receiving apparatus may incrementally reconstruct the origi- nal input data string starting from the different consecutive segments, by

incrementally juxtaposing the segments obtained from decompression of the compressed segments.

The preferred embodiments have been above described and some modifications of this invention have been suggested, but it should be un- derstood that those skilled in the art can make other variations and changes, without so departing from the related scope of protection, as defined by the following claims.

Previous Patent: AUTOMATED METHOD FOR COMPRESSING BINARY STRINGS WITHOUT INFORMATION LOSS, AND RELATED AUTOMATED METH...

Next Patent: DYNAMIC REAL TIME GENERATION OF 3GPP TURBO DECODER INTERLEAVER SEQUENCE