Pages

Monday, March 2, 2015

Compress your data or file using UTL_COMPRESS package





Oracle UTL_COMPRESS

The UTL_COMPRESS package can be used to compress and uncompress large objects (raw, blob or bfile). For all practical purposes, think of UTL_COMPRESS as PLSQL’s means of compressing or zipping files. When looking at the various procedures and functions, many are prefixed with LZ.

Oracle’s documentation for the UTL_COMPRESS package is sparse in terms of showing a full-scale example. Sparse is not even the correct word here. Nonexistent is a better descriptor for UTL_COMPRESS documentation.

 The actual implementation of the UTL_COMPRESS package is pretty easy to use once an example is seen. Many of the examples shown elsewhere include a length comparison among the input, the compressed, and the uncompressed lengths. Ideally, the input and uncompressed lengths should be the same. You may find that the compressed length is longer than the input length. This occurs when the input length is small or short. The overhead of building a dictionary can make the compressed length longer than the input length.

In this simple example, we create one’s own input, compress it, uncompress it, and evaluate the lengths of each.

DECLARE  
 l_in_blob      BLOB;  
 l_compressed_blob  BLOB;  
 l_uncompressed_blob BLOB;  
 BEGIN  
 -- Set some values  
 l_in_blob   := TO_BLOB(UTL_RAW.CAST_TO_RAW ('This is a long string of words used for this example'));  
 l_compressed_blob  := TO_BLOB('0');  
 l_uncompressed_blob := TO_BLOB('0');  
 -- Compress the string  
 UTL_COMPRESS.lz_compress(src => l_in_blob, dst => l_compressed_blob);  
 -- Uncompress the string  
 UTL_COMPRESS.lz_uncompress(src => l_compressed_blob, dst =>l_uncompressed_blob);  
 -- Compare the results with the input  
 DBMS_OUTPUT.put_line('Input length is  : ' || LENGTH(l_in_blob));  
 DBMS_OUTPUT.put_line('Compressed length : ' || LENGTH(l_compressed_blob));  
 DBMS_OUTPUT.put_line('Uncompressed length: ' || LENGTH(l_uncompressed_blob));  
  -- Caller responsibility to free up temporary LOBs  
 -- See Operational Notes in the documentation  
 DBMS_LOB.FREETEMPORARY(l_in_blob);  
 DBMS_LOB.FREETEMPORARY(l_compressed_blob);  
 DBMS_LOB.FREETEMPORARY(l_uncompressed_blob);  
 END;  
 /  
 Input length is  : 52  
 Compressed length : 67  
 Uncompressed length: 52   
 PL/SQL procedure successfully completed.  

Note that the compressed length is longer than the input length. Adding a few more characters to the input string yields the following:

DECLARE  
 l_in_blob      BLOB;  
 l_compressed_blob  BLOB;  
 l_uncompressed_blob BLOB;  
 BEGIN  
 -- Set some values  
 l_in_blob   := TO_BLOB(UTL_RAW.CAST_TO_RAW ('This is a long string of words used for this example.Now is the time for all good men to come to the aid of their country'));  
 l_compressed_blob  := TO_BLOB('0');  
 l_uncompressed_blob := TO_BLOB('0');  
 -- Compress the string  
 UTL_COMPRESS.lz_compress(src => l_in_blob, dst => l_compressed_blob);  
 -- Uncompress the string  
 UTL_COMPRESS.lz_uncompress (src => l_compressed_blob, dst => l_uncompressed_blob);  
 -- Compare the results with the input  
 DBMS_OUTPUT.put_line('Input length is  : ' || LENGTH(l_in_blob));  
 DBMS_OUTPUT.put_line('Compressed length : ' || LENGTH(l_compressed_blob));  
 DBMS_OUTPUT.put_line('Uncompressed length: ' || LENGTH(l_uncompressed_blob));  
 -- Caller responsibility to free up temporary LOBs  
 -- See Operational Notes in the documentation  
 DBMS_LOB.FREETEMPORARY(l_in_blob);  
 DBMS_LOB.FREETEMPORARY(l_compressed_blob);  
 DBMS_LOB.FREETEMPORARY(l_uncompressed_blob);  
 END;  
 /  
 Input length is  : 122  
 Compressed length : 113  
 Uncompressed length: 122  
 PL/SQL procedure successfully completed.  

That little bit extra for the input string pushed it over the top in terms of having the compression take any real effect.

Note that the LZ_COMPRESS subprogram is overloaded since more than one signature method can be used to invoke it. The quality parameter is set to a default of 6. This parameter provides a trade-off between speed of compress and quality of compression. It takes quite a few more words or length of input before there is a difference in what the quality input does.

“Quality is an optional compression tuning value. It allows the UTL_COMPRESS user to choose between speed and compression quality, meaning the percentage of reduction in size.
A faster compression speed in UTL_COMPRESS will result in less compression of the data. A slower compression speed will result in more compression of the data. Valid values are [1..9], with 1=fastest and 9=slowest. The default 'quality' value is 6.”

At a setting of 1, the compressed length is 248, and at 9, the length becomes 245. The strength or utility of using UTL_COMPRESS comes into play when dealing with truly large objects and not just experimenting with contrived strings. For this use case, use a document, but then, a table will also be needed.

The setup steps are to create a table to hold the BLOB (Binary Large Object), create a directory with read/write for the user, put a file into the directory, initialize the record, insert a BLOB, compress it with UTL_COMPRESS, and compare the lengths.

create table compress_blob (indx integer, y blob);  
 create directory MYDIR as 'C:\temp';  
 --We assume the user doing this will have read/write on MYDIR  
 --Copy a file into the directory, e.g., A_57KB_Word_doc.doc  
 --This block will take care of the insert for you, or you  
 --could create a separate procedure to do this  
 --This is simple code, does not address other PL/SQL errors  
 DECLARE  
  ablob blob;  
  abfile bfile := bfilename('MYDIR', 'A_57KB_Word_doc.doc');  
  -- Gets a pointer to the file.  
  a_compressed_blob blob;  
  amount integer;  
  asize  integer;  
  quality integer := 9;  
cursor blob_cur is select * from compress_blob;  
 BEGIN  
 --  
 -- compress_blob table is initialized with one record because   
 -- the PL/SQL BLOB locator (ablob) must point to a specific  
 -- EXISTING NON-NULL database BLOB.   
 --  
 -- initialize the blob locator  
  insert into compress_blob values (1, empty_blob());  
  select y into ablob from compress_blob where indx = 1;  
  -- open the bfile and get the initial file size  
  dbms_lob.fileopen(abfile);  
  asize := dbms_lob.getlength(abfile);  
  dbms_output.put_line('Size of input file: ' || asize);  
  -- load the file and get the size  
  dbms_lob.loadfromfile(ablob, abfile, asize);  
  dbms_output.put_line('After loadfromfile');  
  asize := dbms_lob.getlength(ablob);  
  dbms_output.put_line('Size of blob: ' || asize);  
  -- compress the blob  
  -- you can experiment with varying the quality  
  a_compressed_blob := utl_compress.lz_compress(ablob, quality);  
  -- insert the compressed blob  
  insert into compress_blob values (2, a_compressed_blob);  
  -- compare the sizes of the blobs in the table  
  dbms_output.put_line  
  ('Sizes before and after insertion/compression -->');  
  for c1_rec in blob_cur  
   loop  
   asize := dbms_lob.getlength(c1_rec.y);  
   dbms_output.put_line(asize);  
  end loop;  
 end;  
 /  
 Size of input file: 57856  
 After loadfromfile  
 Size of blob: 57856  
 Sizes before and after insertion/compression -->  
 18722  
 57856  
 PL/SQL procedure successfully completed. 

When using a quality value of 1, the compressed size was 20868, so a value of 9 represents a 10% or so improvement. As file sizes increase, so will the amount of compression.

1 comment:

  1. The Best Sports Tournaments & Sportsbooks in South Africa
    Live Betting and Sportsbook · The William 토토사이트 Hill™ Sportsbook · Betvictor · Bet365 · Betway · Bet365 · Betway · PointsBet · Ladbrokes

    ReplyDelete