Computer & OS Fundamentals · beginner · ~9 min
Distinguish archiving from compression and know the common formats and their risks.
Archiving bundles files (tar); compression shrinks them (gzip/xz/zip). Lossless compression removes redundancy, so high-entropy data barely shrinks — a hint it's encrypted/packed. Archives carry real risks: zip-slip path traversal and zip bombs.
Backups and exfiltrated data come as archives, so handling them is routine. Zip-slip and zip bombs are concrete vulnerability classes in any code that extracts user-supplied archives — both offensive and defensive review targets.
Archive vs compress: tar bundles; gzip/xz/zip shrink.
.tar.gz vs .zip: Unix vs cross-platform norms.
Entropy tell: data that won't compress is likely encrypted or packed.
Zip-slip: malicious paths escape the extract dir.
Zip bomb: tiny input, huge expansion → DoS.
Archiving bundles many files into one; compression shrinks data by removing redundancy. They're often combined.
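To make the split concrete, here is a minimal sketch using Python's standard tarfile and gzip modules. The file names and contents are made up for illustration, and everything stays in memory. It bundles two files into a plain tar, then gzips the result, so the sizes of each stage can be compared.

```python
import gzip
import io
import tarfile


def make_tar(files: dict[str, bytes]) -> bytes:
    """Bundle in-memory files into an uncompressed tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


# Hypothetical sample data: highly repetitive, so it should compress well.
files = {
    "notes.txt": b"the same line repeats\n" * 2000,
    "log.txt": b"GET /index.html 200\n" * 2000,
}

raw_total = sum(len(data) for data in files.values())
tar_bytes = make_tar(files)           # archiving only: headers and padding are added
tgz_bytes = gzip.compress(tar_bytes)  # compression applied to the whole bundle

print("raw payload bytes:", raw_total)
print("tar only bytes:   ", len(tar_bytes))
print("tar + gzip bytes: ", len(tgz_bytes))
```

The tar stage should come out slightly larger than the raw payload because it only adds headers and padding; the savings appear only once gzip runs over the bundle.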
Archive only: .tar (no compression by itself).
Compression only: .gz, .bz2, .xz.
Combined: .tar.gz (a.k.a. .tgz) = tar then gzip, the Unix norm.
Lossless compression replaces repeated patterns with shorter references. Already-random or already-compressed data (encrypted blobs, JPEGs) barely shrinks. That gives a useful tell: high-entropy data that won't compress is often encrypted or already packed.
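The entropy tell can be checked with a short sketch. The helper names shannon_entropy and compression_ratio are illustrative, not from any library, and os.urandom output stands in for encrypted data, which is an assumption for the demo. The script compares byte entropy and zlib compression ratio for repetitive text and random bytes.

```python
import math
import os
import zlib
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte, from 0.0 (constant) to 8.0 (uniform)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size; close to 1.0 means no savings."""
    if not data:
        raise ValueError("need at least one byte")
    return len(zlib.compress(data, 9)) / len(data)


# Hypothetical samples: a repetitive log line versus random bytes.
text = b"user=alice action=login status=ok\n" * 3000
random_like = os.urandom(len(text))  # stand-in for an encrypted blob

for label, blob in [("repetitive text", text), ("random bytes", random_like)]:
    print(
        f"{label}: entropy={shannon_entropy(blob):.2f} bits/byte, "
        f"ratio={compression_ratio(blob):.3f}"
    )
```

A ratio near 1.0 together with entropy near 8 bits per byte is only a hint. It cannot tell encrypted data apart from data that is already compressed, which is why the tell is phrased as "often" rather than "always".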
Archives turn up as backups (.tar.gz), exfil bundles, and source dumps. Knowing how to list and extract them is routine.
Zip-slip: an archive entry named ../../etc/cron.d/x can write outside the extraction directory if the extractor doesn't sanitise paths, which makes it a real vulnerability class.
tar archives; gzip, xz, and zip compress; and high-entropy data resists compression. Archive handling is everyday work, and extractors must defend against zip-slip path traversal and zip bombs.
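As a sketch of what defending against zip-slip and zip bombs can look like, the following pre-extraction check uses Python's standard zipfile module. The names check_zip, UnsafeArchive, build_zip, and verdict are hypothetical, and so are the 100 MiB budget and 100:1 ratio limit. The check resolves each entry's target path against the extraction directory and inspects the sizes declared in the archive headers.

```python
import io
import os
import tempfile
import zipfile

MAX_TOTAL_BYTES = 100 * 1024 * 1024  # hypothetical budget for uncompressed data
MAX_RATIO = 100                      # hypothetical per-entry expansion limit


class UnsafeArchive(Exception):
    """Raised when an archive fails a pre-extraction check."""


def check_zip(zf: zipfile.ZipFile, dest: str) -> None:
    """Reject entries that escape dest or declare an implausible expansion."""
    dest_root = os.path.realpath(dest)
    total = 0
    for info in zf.infolist():
        # Zip-slip: resolve where the entry would land and require it under dest.
        target = os.path.realpath(os.path.join(dest_root, info.filename))
        try:
            inside = os.path.commonpath([dest_root, target]) == dest_root
        except ValueError:  # e.g. different drives on Windows
            inside = False
        if not inside:
            raise UnsafeArchive(f"path escapes extraction dir: {info.filename!r}")
        # Zip bomb: use the sizes declared in the archive headers.
        total += info.file_size
        if total > MAX_TOTAL_BYTES:
            raise UnsafeArchive("declared size exceeds budget")
        if info.compress_size and info.file_size / info.compress_size > MAX_RATIO:
            raise UnsafeArchive(f"suspicious compression ratio: {info.filename!r}")


def build_zip(entries: dict[str, bytes]) -> io.BytesIO:
    """Build a small in-memory zip for the demo."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in entries.items():
            zf.writestr(name, data)
    buf.seek(0)
    return buf


def verdict(entries: dict[str, bytes], dest: str) -> str:
    """Return 'ok' or a rejection message for an archive built from entries."""
    with zipfile.ZipFile(build_zip(entries)) as zf:
        try:
            check_zip(zf, dest)
        except UnsafeArchive as exc:
            return f"rejected: {exc}"
    return "ok"


with tempfile.TemporaryDirectory() as dest:
    print(verdict({"docs/readme.txt": b"hello archive\n"}, dest))
    print(verdict({"../../etc/cron.d/x": b"* * * * * root id\n"}, dest))
    print(verdict({"zeros.bin": b"\0" * 5_000_000}, dest))
```

Two caveats apply. Python's own ZipFile.extract and extractall already strip leading slashes and ".." components from member names, so the path check here mainly illustrates what any extractor has to do. Declared sizes can also be forged, so a more robust defence would count bytes while streaming each entry and stop once the budget is exceeded.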