Computer & OS Fundamentals · beginner · ~12 min
Tell encoding apart from encryption and recognise ASCII, Unicode, Base64, and hex.
Encoding is reversible representation, not encryption. ASCII (0–127, 1 byte), Unicode/UTF-8 (all languages, 1–4 bytes), hex (base-16, 2 digits/byte), and Base64 (3 bytes→4 printable chars) are the four to recognise.
Pentesters constantly decode Base64/hex in tokens, traffic, and configs, and encoding tricks (UTF-8 normalisation) bypass naive filters. Mistaking encoding for encryption — e.g. "the password is Base64'd, so it's safe" — is a finding in itself.
Encoding ≠ encryption: no key, trivially reversible.
ASCII: 0–127, one byte.
UTF-8: variable-length Unicode, ASCII-compatible; normalisation can defeat filters.
Hex: two digits per byte; hashes, dumps, packets.
Base64: 3→4 printable chars; auth headers, JWTs, attachments.
Encoding represents data in a particular format. It is not encryption — there's no secret key, and anyone can reverse it. Confusing the two is a classic beginner (and audit) mistake.
ASCII maps the basic English characters (plus digits, punctuation, and control codes) to the numbers 0–127, for example A=65, a=97, space=32. Each character is stored in one byte.
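The mapping above can be checked directly in Python. This is a minimal sketch; the helper name `ascii_codes` is made up for illustration.

```python
def ascii_codes(text: str) -> list[int]:
    """Return the ASCII code of each character in text.

    Raises UnicodeEncodeError if text contains anything outside 0-127.
    """
    return list(text.encode("ascii"))


# Character -> number, using the examples from the text.
print(ascii_codes("Aa "))  # [65, 97, 32]

# Number -> character.
print("".join(chr(code) for code in [72, 105]))  # Hi

# One byte per character: the byte count equals the character count.
print(len("hello".encode("ascii")))  # 5
```

Anything outside the 0–127 range (an accented letter, for instance) makes `encode("ascii")` raise an error, which is why Unicode is needed for everything else.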
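Unicode assigns a code point to characters from virtually every writing system. UTF-8 encodes each code point as 1–4 bytes and is byte-for-byte identical to ASCII for the first 128 code points. It is the web's default encoding. Encoding tricks (overlong byte sequences, or look-alike characters that normalise to a dangerous one) can bypass naive input filters, which makes this a real web-security topic.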
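A short sketch of the three ideas above: variable length, overlong sequences, and normalisation. The function names and the deliberately weak `naive_filter_blocks` check are illustrative assumptions, not a real filter.

```python
import unicodedata


def utf8_len(ch: str) -> int:
    """Number of bytes UTF-8 uses for a single character."""
    return len(ch.encode("utf-8"))


def is_valid_utf8(raw: bytes) -> bool:
    """True if Python's strict UTF-8 decoder accepts the bytes."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def naive_filter_blocks(text: str) -> bool:
    """Hypothetical naive filter: blocks only the literal ASCII '<script'."""
    return "<script" in text.lower()


# 1-4 bytes depending on the code point: A, e-acute, euro sign, emoji.
for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
    print(f"U+{ord(ch):04X}", utf8_len(ch), ch.encode("utf-8").hex())

# 0xC0 0xAF is an overlong encoding of '/'; a strict decoder rejects it.
print(is_valid_utf8(b"\xc0\xaf"))  # False

# Fullwidth '<' and '>' slip past the filter, then NFKC turns them into ASCII.
payload = "\uff1cscript\uff1e"
normalised = unicodedata.normalize("NFKC", payload)
print(naive_filter_blocks(payload), naive_filter_blocks(normalised))  # False True
```

The last line shows the ordering problem: if an application filters first and normalises afterwards, the dangerous string only appears after the check has already passed.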
Hexadecimal (hex) is base-16, using the digits 0-9 and a-f. Each byte is two hex digits (0xFF = 255). It is used wherever binary must be shown as text: hashes, memory dumps, packet bytes. A "hex dump" shows raw bytes as hex alongside their printable ASCII.
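Python's built-in `bytes.hex()` and `bytes.fromhex()` cover the conversion in both directions. The `hex_dump` helper below is a simplified sketch of the layout tools such as `xxd` or `hexdump -C` produce, not a reimplementation of either.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Render bytes as offset, hex digits, and printable ASCII per row."""
    rows = []
    pad = width * 3 - 1  # two digits plus a space per byte, minus the last space
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk).ljust(pad)
        # Non-printable bytes are shown as '.' in the ASCII column.
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        rows.append(f"{offset:08x}  {hex_part}  |{ascii_part}|")
    return "\n".join(rows)


# Bytes -> hex text: two digits per byte.
print(b"test".hex())  # 74657374

# Hex text -> bytes, and 0xFF as a number.
print(bytes.fromhex("74657374"))  # b'test'
print(int("ff", 16))  # 255

print(hex_dump(b"GET / HTTP/1.1\r\nHost: example\r\n"))
```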
Base64 encodes arbitrary bytes using 64 printable characters (A-Z, a-z, 0-9, +, /). Every 3 bytes become 4 characters, with = padding when the input length is not a multiple of 3. It carries binary through text channels: HTTP Basic auth, JWTs (which use the URL-safe variant with - and _ and no padding), email attachments, data URIs. Base64 is not encryption: dGVzdA== decodes to test instantly. Spotting Base64 in traffic and decoding it is routine recon.
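To make the "no key needed" point concrete, here is a sketch using Python's standard `base64` module. The credentials `user:pass`, the sample JWT-style header, and the helper names are invented for illustration.

```python
import base64


def decode_basic_auth(header_value: str) -> str:
    """Recover 'username:password' from an HTTP Basic Authorization value."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic auth header")
    return base64.b64decode(token).decode("utf-8")


def b64url_decode(segment: str) -> bytes:
    """Decode URL-safe Base64 (as used in JWT segments), restoring stripped padding."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


# The example from the text: no key, no secret, just a lookup table.
print(base64.b64decode("dGVzdA=="))  # b'test'

# 3 bytes in, 4 characters out.
print(base64.b64encode(b"abc"))  # b'YWJj'

# A Basic auth header is only Base64 of "user:pass".
header = "Basic " + base64.b64encode(b"user:pass").decode("ascii")
print(header, "->", decode_basic_auth(header))

# A JWT-style segment: URL-safe alphabet with the '=' padding stripped.
segment = base64.urlsafe_b64encode(b'{"alg":"none"}').decode("ascii").rstrip("=")
print(segment, "->", b64url_decode(segment))
```

Anyone who can see the header can run the same two lines, which is why a Base64'd password counts as plaintext for reporting purposes.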
ASCII, UTF-8, hex, and Base64 are reversible encodings — representation, not protection. Recognising and decoding them is daily recon work, and treating Base64 as security is a real mistake to flag.