There were two different races you could compete in on the C64 cracking scene: the race for the first release, or the fight over the best release (in Triad’s publication “Gamer’s Guide”). In the definition of best, size was the main factor, so knowing your compression was imperative to be competitive (as was having a small intro, so as not to waste bytes on it).
After the game was isolated, you normally had a 250-block chunk of data (the main file for a multi-level game, or the raw game file for a one-filer). You should normally have been able to insert the trainer in this chunk as well. At 254 data bytes per disk block, that is roughly 62 KB, close to the C64’s full 64 KB of memory, so there is no space left to add an intro to a file of this size. This is why you packed it. I use “pack” to describe what’s technically called RLE packing, where you substitute runs of equal bytes with a code and can thereby represent the data in a shorter way. On the negative side you add a depacker and a depacking time, but on the positive side you have a more manageable chunk of data whose size allows adding an intro to it.
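To make the packing step concrete, here is a minimal sketch of RLE packing in Python. It is not the format of any particular C64 packer: the escape byte `0xFE`, the `(escape, count, value)` layout, and the `min_run` threshold are all assumptions chosen for illustration.

```python
def rle_pack(data: bytes, escape: int = 0xFE, min_run: int = 4) -> bytes:
    """Replace runs of equal bytes with (escape, count, value) triples.

    The escape value and the triple layout are illustrative assumptions.
    """
    out = bytearray()
    i = 0
    while i < len(data):
        value = data[i]
        run = 1
        # Count how many times the same byte repeats (capped at 255).
        while i + run < len(data) and data[i + run] == value and run < 255:
            run += 1
        if run >= min_run or value == escape:
            # Long runs, and any literal escape byte, are stored as a triple.
            out += bytes([escape, run, value])
        else:
            # Short runs are cheaper to store as plain bytes.
            out += bytes([value]) * run
        i += run
    return bytes(out)


def rle_unpack(packed: bytes, escape: int = 0xFE) -> bytes:
    """Reverse rle_pack: expand every triple back into a run."""
    out = bytearray()
    i = 0
    while i < len(packed):
        if packed[i] == escape:
            count, value = packed[i + 1], packed[i + 2]
            out += bytes([value]) * count
            i += 3
        else:
            out.append(packed[i])
            i += 1
    return bytes(out)


game = bytes([0] * 100 + [1, 2, 3] + [0xFE] + [7] * 300)
packed = rle_pack(game)
print(len(game), "->", len(packed))
```

A run shorter than `min_run` is left alone because the three-byte code would not save anything. The depacker is only a few lines of logic, which matches the small overhead mentioned above.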
OK, then you add the intro, and the new BLOB is the original RLE-packed chunk with a bit of intro added.
And last but not least, you crunch (which is what I call sequence crunching, often based on LZ77 or adaptations of it). The BLOB that results from this step is the final product.
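For readers who have not seen sequence crunching, the following is a toy LZ77-style sketch in Python. It only illustrates the idea of replacing a repeated sequence with a back-reference. The names `lz_crunch` and `lz_decrunch`, the window size, and the token representation are assumptions, and the tokens are kept as Python values instead of being packed into a bit stream the way a real cruncher would do it.

```python
def lz_crunch(data: bytes, window: int = 255, min_match: int = 3,
              max_match: int = 255) -> list:
    """Toy LZ77: emit literal bytes or (offset, length) back-references."""
    tokens = []
    i = 0
    while i < len(data):
        best_len, best_off = 0, 0
        # Search the window behind the current position for the longest match.
        for start in range(max(0, i - window), i):
            length = 0
            while (i + length < len(data) and length < max_match
                   and data[start + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_off = length, i - start
        if best_len >= min_match:
            tokens.append((best_off, best_len))
            i += best_len
        else:
            tokens.append(data[i])
            i += 1
    return tokens


def lz_decrunch(tokens: list) -> bytes:
    """Rebuild the data by replaying literals and back-references."""
    out = bytearray()
    for token in tokens:
        if isinstance(token, tuple):
            offset, length = token
            for _ in range(length):
                # Copy byte by byte so overlapping references work.
                out.append(out[-offset])
        else:
            out.append(token)
    return bytes(out)


blob = b"ABCABCABC" * 10 + b"XYZ"
tokens = lz_crunch(blob)
print(len(blob), "bytes ->", len(tokens), "tokens")
```

The brute-force search is slow and only meant to show the principle. The actual gain of a real cruncher depends on how compactly the offsets and lengths are encoded.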
Some Q&A here:
Q: Why RLE and not just crunching twice?
A: It’s important to understand that packers move whole bytes around, while crunchers work at the bit level.
- In theory, the second run of a cruncher works on data that is so scrambled that it can’t do a good job. Try zipping a zipped file on a PC and you will see that it gets longer, not shorter. In the example above, the second pass could only compress the intro effectively, and if the intro is short and efficient there is very little data left that can be compressed.
- They also compress in very different ways that do not interfere with each other, rather the opposite. RLE represents the data in a way that can expose new sequences. It reduces the entropy, but the sequence entropy that LZ77 eats still remains. Compressing twice with the same algorithm addresses the same entropy.
- Also, RLE makes the file smaller, which makes the distances between sequences shorter. That means the relative addressing between sequences can, in theory, be encoded with fewer bits.
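The "zip a zipped file" experiment from the first point can be reproduced with Python's standard `zlib` module (DEFLATE, the same family of compression that ZIP uses). This is only an illustration on made-up test data, not a measurement on real C64 files, and the skewed byte distribution is an assumption chosen so that the first pass has something to work with.

```python
import random
import zlib

# Made-up test data: 20000 bytes drawn from a skewed alphabet,
# so a first compression pass has redundancy to remove.
rng = random.Random(1)
data = bytes(rng.choice(b"AAAAAAAABBBBCCD ") for _ in range(20000))

once = zlib.compress(data, 9)   # first pass removes most of the redundancy
twice = zlib.compress(once, 9)  # second pass sees near-random bytes

print("original:", len(data))
print("once:    ", len(once))
print("twice:   ", len(twice))
```

On data like this the second pass typically comes out a few bytes larger than the first, because the output of the first pass has very little structure left and the second pass only adds its own header and bookkeeping.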
Q: Why the RLE in the first place?
A: Well, the benefits of RLE are several (apart from possible compression gains):
- The BLOB of the main game could be too big to add the intro to without some sort of prior compression. If the BLOB is small enough, the RLE step has less value.
- If you RLE pack, the BLOB you add to your intro has a consistent start address and it installs itself properly in memory. Without the RLE packer, you need to manually transfer stuff around in memory. The RLE depacker does the relocation for you, at a very small overhead.
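The relocation point can be sketched as follows, again in Python and under stated assumptions. The packed data carries a two-byte little-endian target address in front (the same byte order a C64 PRG file uses for its load address), and the depacker writes its output straight to that address in a simulated 64 KB memory. The header layout, the escape byte, and the name `depack_into` are hypothetical.

```python
ESCAPE = 0xFE  # hypothetical escape byte marking an (ESCAPE, count, value) run


def depack_into(memory: bytearray, packed: bytes) -> int:
    """Unpack RLE data into memory at the address stored in its 2-byte header.

    Returns the first address after the unpacked data.
    """
    dest = packed[0] | (packed[1] << 8)  # little-endian target address
    i = 2
    while i < len(packed):
        if packed[i] == ESCAPE:
            count, value = packed[i + 1], packed[i + 2]
            memory[dest:dest + count] = bytes([value]) * count
            dest += count
            i += 3
        else:
            memory[dest] = packed[i]
            dest += 1
            i += 1
    return dest


memory = bytearray(0x10000)  # simulated 64 KB address space
# Header $4000, two literal bytes, a run of five $EA bytes, one literal byte.
packed = bytes([0x00, 0x40, 0xA9, 0x00, ESCAPE, 5, 0xEA, 0x60])
end = depack_into(memory, packed)
print(hex(end), memory[0x4000:end].hex())
```

The packed data itself can sit anywhere in memory. Only the depacker needs to know where the game finally belongs, so no separate copy routine is required.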
I should say this was one of my special tricks back in the day, along with having access to crunchers that were not public. Today, most crunchers are publicly available and the algorithms have improved. I’m hence most interested in feedback on this text.