cybersecurity · beginner · ~15 min

Identify file type from the first bytes

Match a file's leading bytes against a per-format allow-list of magic-byte signatures.

Challenge

Given a buffer holding up to the first 64 bytes of a file, return the file format as an integer:

  • 1 = PNG (\x89PNG\r\n\x1a\n)
  • 2 = JPEG (\xff\xd8\xff)
  • 3 = GIF (GIF87a or GIF89a)
  • 4 = PDF (%PDF-)
  • 0 = unknown

Implement int sniff_format(const unsigned char *buf, int len).
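
The signature list above can be written down as data before any matching logic exists. The following is a minimal sketch of such an allow-list; the names magic_entry, MAGIC_TABLE and the SIG_ arrays are hypothetical and not part of the exercise.

```c
#include <stddef.h>

/* Hypothetical helper type: one allow-list entry per recognised signature. */
struct magic_entry {
    const unsigned char *bytes; /* signature bytes expected at offset 0 */
    size_t               len;   /* number of signature bytes */
    int                  code;  /* format code returned on a match */
};

static const unsigned char SIG_PNG[]   = {0x89, 'P', 'N', 'G', '\r', '\n', 0x1a, '\n'};
static const unsigned char SIG_JPEG[]  = {0xff, 0xd8, 0xff};
static const unsigned char SIG_GIF87[] = {'G', 'I', 'F', '8', '7', 'a'};
static const unsigned char SIG_GIF89[] = {'G', 'I', 'F', '8', '9', 'a'};
static const unsigned char SIG_PDF[]   = {'%', 'P', 'D', 'F', '-'};

/* Anything not listed here is treated as unknown (code 0). */
static const struct magic_entry MAGIC_TABLE[] = {
    {SIG_PNG,   sizeof SIG_PNG,   1},
    {SIG_JPEG,  sizeof SIG_JPEG,  2},
    {SIG_GIF87, sizeof SIG_GIF87, 3},
    {SIG_GIF89, sizeof SIG_GIF89, 3},
    {SIG_PDF,   sizeof SIG_PDF,   4},
};
```

Keeping the signatures in one table makes the longest entry (PNG, 8 bytes) easy to see, and a new format becomes a one-line addition.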

Why this matters

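Upload pipelines that trust the file extension can be tricked into accepting a hostile file under a harmless name, because the uploader chooses the extension. Magic-byte sniffing is the second line of defence: it checks that the content at least starts like the format it claims to be.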

Input format

A byte buffer buf holding the start of the file, and its length len in bytes.

Output format

An integer format code from 0 to 4, as listed in the challenge.

Constraints

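Read at most the first 8 bytes of buf (the longest signature, PNG, is 8 bytes long). Check len before every comparison so that no byte at or beyond index len is read.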
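
One way to satisfy both constraints in a single place is a small prefix-comparison helper. This is a sketch under assumptions: has_prefix and its parameters are hypothetical names, and the 8-byte cap is hard-coded to mirror the constraint above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if buf starts with sig, 0 otherwise.
 * It never reads more than sig_len bytes, and never more than 8. */
static int has_prefix(const unsigned char *buf, int len,
                      const unsigned char *sig, int sig_len)
{
    if (buf == NULL || sig == NULL) return 0;
    if (sig_len <= 0 || sig_len > 8) return 0; /* stay within the first 8 bytes */
    if (len < sig_len) return 0;               /* bound check; also rejects negative len */
    return memcmp(buf, sig, (size_t)sig_len) == 0;
}
```

Because the length test comes before memcmp, a buffer shorter than the signature is rejected without being read.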

Starter code

int sniff_format(const unsigned char *buf, int len) { /* TODO */ (void)buf; (void)len; return 0; }

Common mistakes

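Searching for %PDF- anywhere in the buffer instead of requiring it at offset 0. A file of another type can contain those bytes further in and would then be misreported as PDF.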
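
The difference between an anchored match and a search can be illustrated with two small functions. Both names are hypothetical, and pdf_anywhere is shown only as the mistake to avoid.

```c
#include <stddef.h>
#include <string.h>

/* Anchored check: the signature must sit at offset 0. */
static int pdf_at_start(const unsigned char *buf, int len)
{
    return buf != NULL && len >= 5 && memcmp(buf, "%PDF-", 5) == 0;
}

/* Buggy, for contrast only: accepts the signature at any offset. */
static int pdf_anywhere(const unsigned char *buf, int len)
{
    if (buf == NULL) return 0;
    for (int i = 0; i + 5 <= len; i++) {
        if (memcmp(buf + i, "%PDF-", 5) == 0) return 1;
    }
    return 0;
}
```

For a buffer that begins with GIF89a and contains %PDF- a few bytes later, pdf_at_start returns 0 while pdf_anywhere returns 1, which is exactly the misidentification the anchored check prevents.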

Edge cases to handle

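Very short buffers, including len < 4 and len = 0. A signature longer than len cannot match, so return 0 rather than reading past the end. The signatures are 8 (PNG), 3 (JPEG), 6 (GIF) and 5 (PDF) bytes long, so a 3-byte buffer can still be identified as JPEG.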
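
The short-buffer cases can be handled by guarding each comparison with its own length check. The following is a sketch of one possible solution, not the reference answer; the early return for a NULL buffer is an added assumption, since the exercise does not say whether buf may be NULL.

```c
#include <stddef.h>
#include <string.h>

int sniff_format(const unsigned char *buf, int len)
{
    static const unsigned char png[]  = {0x89, 'P', 'N', 'G', '\r', '\n', 0x1a, '\n'};
    static const unsigned char jpeg[] = {0xff, 0xd8, 0xff};

    /* The shortest signature (JPEG) is 3 bytes, so anything shorter is unknown.
     * The NULL check is an assumption about how the harness may call us. */
    if (buf == NULL || len < 3) return 0;

    /* Each test checks len first, so no byte at or past index len is read. */
    if (memcmp(buf, jpeg, sizeof jpeg) == 0) return 2;
    if (len >= 5 && memcmp(buf, "%PDF-", 5) == 0) return 4;
    if (len >= 6 && (memcmp(buf, "GIF87a", 6) == 0 ||
                     memcmp(buf, "GIF89a", 6) == 0)) return 3;
    if (len >= 8 && memcmp(buf, png, sizeof png) == 0) return 1;

    return 0; /* not on the allow-list */
}
```

A truncated signature such as the 4 bytes GIF8 falls through every test and yields 0, and so does a PNG header cut to 7 bytes.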

Complexity

O(1): at most 8 bytes are compared, regardless of the file size.
