[Java] Compressing data

Started by iago, May 17, 2006, 10:33:47 PM

iago

I did a lot of searching and wasn't able to find a good way to compress data in Java. The only way I could find was to write the data to a compressed file and read it back. It's not the best solution, but it's quick for any reasonable amount of data and it works perfectly. Here's the code:

[Java 1.5 or higher only]

Quote

package util;

import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class Compressor
{
   public static byte []compress(byte []data)
   {
      try
      {
         File tempFile = File.createTempFile("RCCompression", null);
         tempFile.deleteOnExit();
         try
         {
            // Closing the GZIP stream finishes the compressed data and closes the file.
            GZIPOutputStream out = new GZIPOutputStream(new FileOutputStream(tempFile));
            try
            {
               out.write(data);
            }
            finally
            {
               out.close();
            }

            // Read the compressed file back in full.
            return readAll(new FileInputStream(tempFile));
         }
         finally
         {
            tempFile.delete();
         }
      }
      catch(IOException e)
      {
         System.err.println("Error compressing data: " + e);
         System.exit(1);
         return null;
      }
   }

   public static byte []decompress(byte []data)
   {
      try
      {
         File tempFile = File.createTempFile("RCCompression", null);
         tempFile.deleteOnExit();
         try
         {
            // Write the compressed bytes out so they can be read back through GZIP.
            FileOutputStream fileOut = new FileOutputStream(tempFile);
            try
            {
               fileOut.write(data);
            }
            finally
            {
               fileOut.close();
            }

            FileInputStream fileIn = new FileInputStream(tempFile);
            try
            {
               return readAll(new GZIPInputStream(fileIn));
            }
            finally
            {
               // Also covers the case where the GZIP header is invalid.
               fileIn.close();
            }
         }
         finally
         {
            tempFile.delete();
         }
      }
      catch(IOException e)
      {
         System.err.println("Error decompressing data: " + e);
         return new byte[0];
      }
   }

   // Reads the stream until end-of-file, then closes it.
   private static byte []readAll(InputStream in) throws IOException
   {
      try
      {
         ByteArrayOutputStream bytes = new ByteArrayOutputStream();
         byte []chunk = new byte[4096];
         int count;
         while((count = in.read(chunk)) >= 0)
            bytes.write(chunk, 0, count);
         return bytes.toByteArray();
      }
      finally
      {
         in.close();
      }
   }
   
   
   public static void main(String []args)
   {
      byte []data = "aaaaaaaaaaaaaa1 plumbing".getBytes();
      byte []data2 = "A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  ".getBytes();
      
      byte []compressedData = Compressor.compress(data);
      byte []compressedData2 = Compressor.compress(data2);
      
      System.out.format("Old length = %d\n", data.length);
      System.out.format("New length = %d\n", compressedData.length);
      System.out.format("Decompressed = %d\n", Compressor.decompress(compressedData).length);
      
      System.out.println();
      
      System.out.format("Old length = %d\n", data2.length);
      System.out.format("New length = %d\n", compressedData2.length);
      System.out.format("Decompressed = %d\n", Compressor.decompress(compressedData2).length);

      System.out.println();
      System.out.println();
      
      System.out.format("Decompressed data: %s\n", new String(Compressor.decompress(compressedData)));
      System.out.format("Decompressed data: %s\n", new String(Compressor.decompress(compressedData2)));
   }
}


And the output:
Quote

Old length = 24
New length = 33
Decompressed = 24

Old length = 551
New length = 53
Decompressed = 551


Decompressed data: aaaaaaaaaaaaaa1 plumbing
Decompressed data: A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data.....  A lot of repeated data..... 


Notice that for a short string, the data length actually increases, since the gzip format wraps the compressed data in a 10-byte header and an 8-byte trailer.
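
For what it's worth, the temp file isn't strictly required: the GZIP streams wrap any stream, including in-memory ones. Here is a minimal sketch of that idea, assuming Java 7 or later for try-with-resources. The class name InMemoryGzip is made up for illustration and is not part of the code above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: same GZIP format as the temp-file version, but in memory.
final class InMemoryGzip
{
   static byte[] compress(byte[] data) throws IOException
   {
      ByteArrayOutputStream buffer = new ByteArrayOutputStream();
      // Closing the GZIP stream writes the trailer, so close it before reading the buffer.
      try (GZIPOutputStream out = new GZIPOutputStream(buffer))
      {
         out.write(data);
      }
      return buffer.toByteArray();
   }

   static byte[] decompress(byte[] data) throws IOException
   {
      ByteArrayOutputStream buffer = new ByteArrayOutputStream();
      try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data)))
      {
         byte[] chunk = new byte[4096];
         int count;
         // read() returns -1 at the end of the compressed stream.
         while ((count = in.read(chunk)) >= 0)
            buffer.write(chunk, 0, count);
      }
      return buffer.toByteArray();
   }

   public static void main(String[] args) throws IOException
   {
      byte[] data = "aaaaaaaaaaaaaa1 plumbing".getBytes("UTF-8");
      byte[] packed = compress(data);
      System.out.format("Old length = %d%n", data.length);
      System.out.format("New length = %d%n", packed.length);
      System.out.format("Decompressed = %d%n", decompress(packed).length);
   }
}
```

Unlike the original, this version lets IOException propagate instead of printing and exiting, which is a design choice for the sketch rather than a requirement.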

MyndFyre

Doesn't Java have zlib support built-in, where one of the benefits is that you actually ensure that compressed data is not larger than the original data?
Quote from: Joe on January 23, 2011, 11:47:54 PM
I have a programming folder, and I have nothing of value there

Running with Code has a new home!

Quote from: Rule on May 26, 2009, 02:02:12 PMOur species really annoys me.

iago

I saw something about zlib, but then I worried that it had external dependencies so I didn't go into it. 

How does zlib ensure that the compressed data isn't larger? 

MyndFyre

Quote from: iago on May 18, 2006, 08:07:43 AM
I saw something about zlib, but then I worried that it had external dependencies so I didn't go into it. 

How does zlib ensure that the compressed data isn't larger? 
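Oooh, sorry. It isn't *much* larger. Per RFC 1951, in cases where it works out better, the compressor can emit a "stored" block: a short block header plus a four-byte length field (LEN and its complement NLEN), followed by the original data uncompressed.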
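
One way to see the small overhead from Java is to run incompressible data through java.util.zip.Deflater, which produces zlib-format output. The following is a rough sketch, not a benchmark: the class name DeflateOverhead is hypothetical, and the exact number of extra bytes may vary with the zlib version bundled with the JVM.

```java
import java.io.ByteArrayOutputStream;
import java.util.Random;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical demo class: compares input and output sizes for random bytes.
final class DeflateOverhead
{
   static byte[] deflate(byte[] data)
   {
      Deflater deflater = new Deflater();
      try
      {
         deflater.setInput(data);
         deflater.finish();
         ByteArrayOutputStream out = new ByteArrayOutputStream();
         byte[] chunk = new byte[4096];
         while (!deflater.finished())
         {
            int count = deflater.deflate(chunk);
            out.write(chunk, 0, count);
         }
         return out.toByteArray();
      }
      finally
      {
         deflater.end(); // release native zlib memory
      }
   }

   static byte[] inflate(byte[] data) throws DataFormatException
   {
      Inflater inflater = new Inflater();
      try
      {
         inflater.setInput(data);
         ByteArrayOutputStream out = new ByteArrayOutputStream();
         byte[] chunk = new byte[4096];
         while (!inflater.finished())
         {
            int count = inflater.inflate(chunk);
            // No progress and nothing left to read means the input was cut short.
            if (count == 0 && (inflater.needsInput() || inflater.needsDictionary()))
               throw new DataFormatException("truncated or unsupported input");
            out.write(chunk, 0, count);
         }
         return out.toByteArray();
      }
      finally
      {
         inflater.end();
      }
   }

   public static void main(String[] args) throws DataFormatException
   {
      // Random bytes are effectively incompressible.
      byte[] noise = new byte[1000];
      new Random(42).nextBytes(noise);
      byte[] packed = deflate(noise);
      System.out.format("Old length = %d%n", noise.length);
      System.out.format("New length = %d%n", packed.length);
      System.out.format("Decompressed = %d%n", inflate(packed).length);
   }
}
```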

From the Zlib FAQ
Is there a Java version of zlib?
Probably what you want is to use zlib in Java. zlib is already included as part of the Java SDK in the java.util.zip package. If you really want a version of zlib written in the Java language, look on the zlib home page for links.

iago

Ah, I looked at java.util.zip, but it seemed like it was more overhead.  You have to do things like create zip entries, and I really didn't feel like figuring out how.
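
Zip entries only come into play when writing .zip archives with ZipOutputStream. For a plain zlib stream, java.util.zip also provides DeflaterOutputStream and InflaterInputStream, which need no entries at all. A minimal sketch, assuming Java 7 or later for try-with-resources; the class name ZlibStreams is made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper: zlib-format compression with no zip entries involved.
final class ZlibStreams
{
   static byte[] compress(byte[] data) throws IOException
   {
      ByteArrayOutputStream buffer = new ByteArrayOutputStream();
      // Closing the stream finishes the zlib data before the buffer is read.
      try (OutputStream out = new DeflaterOutputStream(buffer))
      {
         out.write(data);
      }
      return buffer.toByteArray();
   }

   static byte[] decompress(byte[] data) throws IOException
   {
      ByteArrayOutputStream buffer = new ByteArrayOutputStream();
      try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(data)))
      {
         byte[] chunk = new byte[4096];
         int count;
         while ((count = in.read(chunk)) >= 0)
            buffer.write(chunk, 0, count);
      }
      return buffer.toByteArray();
   }
}
```

The zlib wrapper is smaller than the gzip one, so short inputs should grow by fewer bytes than they do with GZIPOutputStream.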