Image Compression

Soldato
Joined
22 Oct 2005
Posts
2,884
Location
Moving...
I've been given an assignment where we have to carry out some kind of image compression. What kind of compression is up to us but I will hopefully be using a mixture of huffman coding, arithmetic coding, run-length coding, quad trees, quantization and transform coding, just because they are the one's we have discussed in lectures.

To help he has given us test images (.ppm format) together with some source code that can save and load the images. Here is the code:

Header:
Code:
#include <stdio.h>
#include <string.h>

#define RED   0
#define GREEN 1
#define BLUE  2

class _ppm{
public:
	_ppm();
	~_ppm();

	int load_ppm(const char* filename, int xx = 0, int yy = 0);
	int save_ppm(const char* filename);

	int get_image_height(){return height;};
	int get_image_width(){return width;};
	int get_image_depth(){return depth;};

	int get_pixel(int x, int y, int channel);
	int set_pixel(int x, int y, int channel, int value);

protected:

	void clear();

	int* data;
	int width,height,depth;

private:

	bool loaded;
};

Source:
Code:
#include "PPM.h"
#include <iomanip>

_ppm::_ppm()
{
loaded = false;
}

_ppm::~_ppm()
{
if(loaded)
	clear();
}

/* save_ppm - Saves the image as PPM format to the file stream.
 * Arguments: filename to save into, x1, y1 to x2, y2 area to save.
 * Returns: 0 on success, -1 on error
 */

int _ppm::save_ppm(const char* filename)
{
  if(!loaded)
	  return -1;

  FILE * fi;
  int x, y, rgb;

  int x1 = 0;
  int y1 = 0;
  int x2 = width;
  int y2 = height;
	
  if((fi = fopen(filename,"w+b"))==0)
	  return -2;
  
  if (x1 < 0 || x1 > width || y1 < 0 || y1 > height ||
      x2 < 0 || x2 > width || y2 < 0 || y2 > height)
		return -1;
  
  
  /* Dump PPM header: */
  
  fprintf(fi, "P6\n %d %d\n%d\n", x2 - x1, y2 - y1, depth);
  

  /* Dump data: */
  
  for (y = y1; y < y2; y++)
    for (x = x1; x < x2; x++)
      for (rgb = 0; rgb < 3; rgb++)
			fputc(data[((y*width+x)*3)+rgb], fi);
  
  return 0;
}

/* load_ppm - Loads a PPM format image.
 * Arguments: filename to load from,x and y locations to start 
 * loading into, and opacity with which to add the data to the 
 * existing image (default is 255 - full opacity).
 * Returns: 0 on success, -1 on error
 */

int _ppm::load_ppm(const char* filename, int xx, int yy)
{
  if(loaded)
	  clear();

  FILE * fi;
  char temp[10240];
  int x, y, rgb, m; 

  if((fi=fopen(filename,"rb"))==0)
	  return -1;
  
  /* Load PPM header: */
  
  fscanf(fi, "%s %d %d %d", &temp, &width, &height, &m); 
  depth = m;

  /* Create storage array for data */

  data = new int[(width*height)*3];

  /* Get some data from the ppm */

    fgetc(fi);

  /* Load real data: */

  for (y = 0; y < height; y++)
	{
	for (x = 0; x < width; x++)
		{
		for (rgb = 0; rgb < 3; rgb++)
			{
			data[((y*width+x)*3)+rgb] = fgetc(fi);
			}
		}
	}

  loaded = true;
  return 1;
}



int _ppm::set_pixel(int x, int y, int channel, int value)
{
if(!loaded)
	return -1;

if (x < 0 || x > width || y < 0 || y > height)
	return -1;

if(channel > 2 || channel < 0)
	return -1;

if(value < 0 || value > 255)
	return -1;

data[(((y*width)+x)*3)+channel] = value;

return 0;
}

int _ppm::get_pixel(int x, int y, int channel)
{
if(!loaded)
	return -1;

if (x < 0 || x > width || y < 0 || y > height)
	return -1;

return data[(((y*width)+x)*3)+channel];
}

void _ppm::clear()
{
delete[] data;
height = 0;
width = 0;
depth = 0;
loaded = false;
}

Now I havn't done any kind of compression before but I would have thought the main area we alter is how the images are saved and loaded?

However, he STRONGLY recommends that we use the code he's provided. This just writes integers between 0-255 to the file doesn't it? Surely we need to work at a lower level than this? Or am I mistaken?

Thanks.
 
No, compression is about decreasing the amount of data in the file, not how it is written to the disk.

If your lecturer STRONGLY recommends you use his code, then I STRONGLY recommend you listen to him.
 
I think I know what you're getting at. He has provided code for loading/saving a ppm format image but there is no way to save compressed data in the provided code.

Does the assignment ask you to save the compressed images? If so you'll need to write the code to do that. Use his code for dealing with ppm images.

Best if you fire off an email or ask at the end of his next lecture to confirm :)
 
Unforunately our lecture with him this morning was cancelled so I still don't know what we're supposed to be doing. I've sent him an email as well but he very rarely answers them.

So for the minute I'm gonna have to assume. Am I right in thinking that if I am to do any kind of compression I'm going to have to write new functions to save an load images because I will have to write binary data?

For example huffman coding involves assigning shorter codes for the more frequent symbols, so if the image was nearly all white I would give the RGB value of 255 a code of 0, then if there was one pixel of black in the image it would have amassive code like 11111111. So I write these binary values to file.

The thing I dont understand is that these images will need to be opened in some kind of program (e.g. photoshop) so you can view them. If i'm writing these random codes how the hell is photoshop supposed to know what they are?

I'm sure I'm just being an idiot but I'm really confused as to how to create these compression algorithms. I understand the actual technique, i.e. I know how arithmetic coding works, but I don't know how we are supposed to implement it.

Any advice?

Thanks again.
 
The PNG image format is implemented using the zlib library, which uses your Huffman coding, etc. All graphics tools should be able to read PNG files; they're fairly common.

Are you sure that an external program has to be able to read your compressed image? If so, then go for something like PNG as previously mentioned. If not, then it would be easier if you wrote the image rendering code yourself, as this will allow you to define the structure of the compressed image to your own specifications.
 
Last edited:
From what you've said I would think the flow is:

Load ppm image (using professor's code) -> compress using choice of algorithms you mentioned in the OP (your code)-> output compressed file (your code).

Professor is impressed by size of compressed image file compared to the original and wants to check file is valid and of reasonable quality.

Load compressed image (your code) -> decompress image (your code) -> save as ppm file for him to view (his code)

Even if this is not exactly what he wants it will be close and very good practice too. The specifics will be quite easy to conform to if you get the above working.

Am I right in thinking that if I am to do any kind of compression I'm going to have to write new functions to save an load images
I would say so. His class only provides methods for loading/saving an image in ppm format and accessing it's pixels by x, y and RGB channel.
 
ok from what I can see, your Prof. has given you a class for loading and saving ppm images (as previous stated by others). What it sounds like he's asking you to do is write a corresponding class that does the same thing, only this time it compresses the image data.

I'd start by taking a copy of the "_ppm" class (or inherit from it and overload the load and save methods), and adapt the load and save methods to include an extra step where they compress/decompress the image data.

Actually looks like a reasonably fun assignment (but I'm like that :P)

akakjs
 
Thanks for your replies. I'm not sure whether an external program needs to read it. At a guess I would say no, so for the minute I'm going to follow Fourstar's advice:

Load ppm image (using professor's code) -> compress using choice of algorithms you mentioned in the OP (your code)-> output compressed file (your code).

Professor is impressed by size of compressed image file compared to the original and wants to check file is valid and of reasonable quality.

Load compressed image (your code) -> decompress image (your code) -> save as ppm file for him to view (his code)

My next question is how to actually read and write the compressed data. I'm using C++ so will be using fopen statements to access the file. Now my understanding is that the mode parameter, e.g. wb, specifies to write a file in binary mode. But what exactly is binary mode? I would have thought it means writing in binary, i.e. when you open the file in say a notepad it is made from 0's and 1's, however it is actually filled with jibereish like:
7+'Y 2+'J0-B)#0+!<

So if it doesn't write in binary, how exactly do I compress something. For example if I had a 3x3 image with the pixels values:
255 255 255
255 0 255
255 255 255

Obviously it would be more beneficial to write a single 0's instead of 11111111's (255's) and a single 1 instead of 00000000(0). But if it's actually writing a string of characters that make up the numbers between 0 and 255, then surely all numbers are going to take up the same amount of space?

Is this possible or am I looking at things the wrong way?

Sorry if this is a stupid question!:o
 
Basically you encode patterns in the data. So for instance with a trivial run length encoding instead of having say AAAA you could have 4A thus saving two characters.

I don't know what compression algorithms your meant to be doing. Huffman and Run Length encoding are by far the most common ones and are not that hard to implement.
 
Basically you encode patterns in the data. So for instance with a trivial run length encoding instead of having say AAAA you could have 4A thus saving two characters.

I don't know what compression algorithms your meant to be doing. Huffman and Run Length encoding are by far the most common ones and are not that hard to implement.

He hasn't given us any algorithms that we have to include, he said we should just include as many methods as possible. He's broken down the general encodeing process into the following parts:

INPUT IMAGE ----> Mapper ----> Quantiser -----> Symbol encoder -----> CODED DATA

The coding methods that are in our notes are:
-Huffman coding
-Run length coding
-Arithmatic coding
-Quad trees
-Block truncation coding

There's also the quantizer aspect which is either done using a scalar method or a vector method, but I understand this part fine.

The other section is a mapper, the only mapper he mentions is Transform coding using DCT transforms but I dont think this should be a problem as I'm not writing and symbol codes in this section, it's more of a technique to get rid of redundant data before you get to the coding aspect. However I havent really looked at this much yet so I may be completely wrong!

I'm pretty sure we can get a first by just using these methods, it's only if we want to go for the big marks (which I dont:p) we should look at other methods.

I just can't get my head around how the code is written to the file and how it is all constructed, i.e. how much space is taken by each string/character/number etc.
 
Back
Top Bottom