file-handling · beginner · ~10 min

Detect a byte-order mark

Recognise BOM byte signatures with correct ordering.

Challenge

Your job

#include <stdint.h>
#include <stddef.h>
int detect_bom(const uint8_t *buf, size_t n);

Return: 1 for UTF-8 (EF BB BF), 2 for UTF-16LE (FF FE), 3 for UTF-16BE (FE FF), or 0 if there is no BOM. Also return 0 when buf is NULL or n is too small to hold the signature being tested.

Hints

  1. Check the 3-byte UTF-8 BOM first.
  2. Then the 2-byte UTF-16 BOMs.

Why this matters

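A leading BOM is not part of the text. If the loader does not skip it, the BOM bytes are parsed as content and can silently corrupt the first token, key, or header. Detecting the BOM is the first step in robust text loading.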
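As a rough illustration of the "skip it" step, the sketch below maps a detect_bom return code to the number of bytes to skip. The names bom_length and skip_bom are hypothetical helpers, not part of the exercise, and the sketch assumes the code was produced by detect_bom on the same buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: byte length of the BOM for a detect_bom return code. */
size_t bom_length(int code) {
    switch (code) {
    case 1:  return 3;  /* UTF-8: EF BB BF */
    case 2:             /* UTF-16LE: FF FE (falls through) */
    case 3:  return 2;  /* UTF-16BE: FE FF */
    default: return 0;  /* no BOM */
    }
}

/* Hypothetical helper: returns a pointer to the first content byte and
 * reduces *n by the number of BOM bytes skipped. */
const uint8_t *skip_bom(const uint8_t *buf, size_t *n, int code) {
    size_t len = bom_length(code);
    if (buf == NULL || n == NULL || len > *n) {
        return buf;  /* nothing to skip, or code inconsistent with length */
    }
    *n -= len;
    return buf + len;
}
```

A caller would typically run detect_bom once on the first bytes of the file, then hand the returned pointer and the reduced length to the parser.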

Starter code

#include <stdint.h>
#include <stddef.h>
int detect_bom(const uint8_t *buf, size_t n) {
    /* TODO */
    (void)buf; (void)n;
    return 0;
}

Common mistakes

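Swapping LE/BE: FF FE is little-endian, FE FF is big-endian. Reading buf[1] or buf[2] before checking n, which reads past a short buffer. Checking the 2-byte BOM before the 3-byte one: these three signatures share no prefix, so the order does not change the result here, but longest-first is the safe habit because it breaks as soon as UTF-32LE (FF FE 00 00) is added, which begins with the UTF-16LE BOM.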
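One possible way to avoid these mistakes is to test the length before every byte read and to check the longest signature first. The sketch below is only one shape a solution might take; the name detect_bom_sketch is hypothetical and chosen so it does not clash with the exercise's detect_bom.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch of a BOM detector with length-guarded comparisons. */
int detect_bom_sketch(const uint8_t *buf, size_t n) {
    if (buf == NULL) {
        return 0;  /* NULL input counts as "no BOM" */
    }
    /* n is tested first, so && short-circuits before any out-of-range read. */
    if (n >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF) {
        return 1;  /* UTF-8 */
    }
    if (n >= 2 && buf[0] == 0xFF && buf[1] == 0xFE) {
        return 2;  /* UTF-16LE */
    }
    if (n >= 2 && buf[0] == 0xFE && buf[1] == 0xFF) {
        return 3;  /* UTF-16BE */
    }
    return 0;      /* no recognised BOM, or input too short */
}
```

Putting the length test on the left of each && chain keeps the bounds check and the byte comparison in one expression, so they cannot drift apart.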

Edge cases to handle

No BOM. Only one byte present. NULL.
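These edge cases all expect 0, so a function that always returns 0 passes them. A local check is more useful if it also includes the three positive signatures. The table-driven harness below is a sketch under that assumption; bom_fn, starter_stub, and count_failures are hypothetical names, and starter_stub simply mirrors the starter code.

```c
#include <stddef.h>
#include <stdint.h>

typedef int (*bom_fn)(const uint8_t *, size_t);

/* Mirrors the starter code: always reports "no BOM". */
int starter_stub(const uint8_t *buf, size_t n) {
    (void)buf; (void)n;
    return 0;
}

/* Hypothetical harness: returns how many cases fn gets wrong. */
int count_failures(bom_fn fn) {
    static const uint8_t utf8[]    = {0xEF, 0xBB, 0xBF, 'a'};
    static const uint8_t utf16le[] = {0xFF, 0xFE, 'a', 0x00};
    static const uint8_t utf16be[] = {0xFE, 0xFF, 0x00, 'a'};
    static const uint8_t plain[]   = {'a', 'b', 'c'};

    const struct { const uint8_t *buf; size_t n; int want; } cases[] = {
        {utf8,    sizeof utf8,    1},
        {utf16le, sizeof utf16le, 2},
        {utf16be, sizeof utf16be, 3},
        {plain,   sizeof plain,   0},  /* no BOM */
        {utf8,    1,              0},  /* only one byte present */
        {utf8,    2,              0},  /* truncated UTF-8 BOM */
        {utf16le, 1,              0},  /* truncated UTF-16 BOM */
        {NULL,    0,              0},  /* NULL buffer */
        {NULL,    4,              0},  /* NULL buffer with nonzero n */
    };

    int failures = 0;
    for (size_t i = 0; i < sizeof cases / sizeof cases[0]; i++) {
        if (fn(cases[i].buf, cases[i].n) != cases[i].want) {
            failures++;
        }
    }
    return failures;
}
```

Passing starter_stub yields three failures, one per positive signature; a finished detect_bom should bring the count to zero.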

Complexity

O(1) time and space: at most three bytes are inspected, regardless of n.
