Tuts 4 You

String compression algorithms


CodeExplorer

Posted (edited)

String compression algorithms:

1. First simple algorithm:

Input string "Acccc";

We store the first 2 chars "Ac" +

a char:

- the first bit should be set to 1 to mark that we have compression here

- the rest of the bits will be filled with the number of duplicates - in this case 3

"Acccc" -> Ac[3]

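A minimal Python sketch of the first scheme, assuming 7-bit ASCII input so the top bit of every byte is free to act as the marker. The names rle_encode and rle_decode are made up for illustration, and splitting runs of more than 127 duplicates across several marker bytes is an assumption the post does not cover.

```python
def rle_encode(text: str) -> bytes:
    """Literal byte, then optional marker bytes (0x80 | number of extra duplicates)."""
    data = text.encode("ascii")  # assumption: 7-bit input, high bit is unused
    out = bytearray()
    i = 0
    while i < len(data):
        # Measure the run of identical bytes starting at i.
        run = 1
        while i + run < len(data) and data[i + run] == data[i]:
            run += 1
        out.append(data[i])      # the character itself, stored once
        extra = run - 1          # duplicates that follow it
        while extra > 0:
            n = min(extra, 127)  # only 7 bits are left for the count
            out.append(0x80 | n)
            extra -= n
        i += run
    return bytes(out)


def rle_decode(blob: bytes) -> str:
    out = bytearray()
    for b in blob:
        if b & 0x80:
            if not out:
                raise ValueError("marker byte with no preceding character")
            # Repeat the previous character (b & 0x7F) more times.
            out.extend(out[-1:] * (b & 0x7F))
        else:
            out.append(b)
    return out.decode("ascii")
```

With this layout "Acccc" becomes the three bytes b"Ac\x83". A run of exactly two characters costs two bytes either way, so the scheme only pays off for runs of three or more.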
2. Second algorithm:

Input string "the Udrea, the Basescu";

* We store "the Udrea, " +

First char:

- the first bit should be set to 1 to mark that we have compression here

- the rest of the bits will be filled with the length of the duplicate string - in this case 3

Second char:

- all bits should contain the position from where to take the string - in this case 11

"the Udrea, the Basescu" -> "the Udrea, [11,3] Basescu"

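The second scheme could be sketched in Python roughly as follows. This assumes 7-bit ASCII input and reads "position 11" as a distance counted backwards from the current output position, which is only one possible interpretation of the post. The names lz_encode and lz_decode, the minimum match length of 3, and the brute-force greedy search are all illustrative choices.

```python
MIN_MATCH = 3    # a token costs 2 bytes, so shorter matches do not pay off
MAX_LEN = 127    # 7 bits for the length in the first byte
MAX_DIST = 255   # 8 bits for the backward distance in the second byte


def lz_encode(text: str) -> bytes:
    data = text.encode("ascii")  # assumption: high bit of literals is free
    out = bytearray()
    i = 0
    while i < len(data):
        best_len, best_dist = 0, 0
        # Brute-force search of the last MAX_DIST bytes for the longest match.
        for start in range(max(0, i - MAX_DIST), i):
            length = 0
            while (length < MAX_LEN and i + length < len(data)
                   and data[start + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - start
        if best_len >= MIN_MATCH:
            out += bytes([0x80 | best_len, best_dist])  # [marker+length, distance]
            i += best_len
        else:
            out.append(data[i])                          # plain literal
            i += 1
    return bytes(out)


def lz_decode(blob: bytes) -> str:
    out = bytearray()
    i = 0
    while i < len(blob):
        b = blob[i]
        if b & 0x80:
            length, dist = b & 0x7F, blob[i + 1]
            if dist == 0 or dist > len(out):
                raise ValueError("back-reference points outside the output")
            for _ in range(length):
                out.append(out[-dist])  # byte by byte, so overlapping copies work
            i += 2
        else:
            out.append(b)
            i += 1
    return out.decode("ascii")
```

Decoding the post's own token, lz_decode(b"the Udrea, \x83\x0b Basescu"), gives back "the Udrea, the Basescu". The greedy encoder here finds a slightly longer match than the post marks, "the " with the trailing space, so it emits length 4 instead of 3 and the result is 20 bytes instead of 22.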
Edited by CodeRipper
Posted

I think most of us have already thought of the first algo. You can even make it shorter by removing the second bracket.

Ac$3 for instance, where $ is the indicator.

I ran some tests with it and you barely save any space (unless the input has a lot of repeated whitespace).

Posted

If you just use one separator you can't have strings like Accccc123, because that becomes Ac$5123

A + 5123 times c

Not really what you intended
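
To make the ambiguity concrete, here is a small Python sketch. It assumes the text format used in this reply, where the number after $ is the total repeat count of the preceding character. The helper names expand_open and expand_closed are hypothetical, and the closing $ is just one possible way to end the count.

```python
import re


def expand_open(s: str) -> str:
    # Single separator: every digit after '$' is swallowed into the count.
    return re.sub(r"(.)\$(\d+)",
                  lambda m: m.group(1) * int(m.group(2)), s, flags=re.S)


def expand_closed(s: str) -> str:
    # Hypothetical fix: a second '$' terminates the count.
    return re.sub(r"(.)\$(\d+)\$",
                  lambda m: m.group(1) * int(m.group(2)), s, flags=re.S)


wrong = expand_open("Ac$5123")      # 'A' followed by 5123 times 'c'
right = expand_closed("Ac$5$123")   # 'Accccc123'
```

The closed form costs one more character per run, and a literal $ in the input would still need some kind of escaping, which this sketch does not handle.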

And yeah, none of these really beat Huffman unless you have something like 5000 of the same character
