cybersecurity · beginner · ~20 min

Bidirectional %XX URL encoding with an allow-list

Per-byte allow-list encoding — the RFC 3986 unreserved set.

Challenge

Implement int url_encode(const char *in, char *out, int cap).

For each byte of in:

  • If it's [A-Za-z0-9-._~] (the RFC 3986 'unreserved' set), copy as-is.
  • Otherwise emit three bytes: %XX where XX is the byte's value in uppercase hex.

NUL-terminate out. Return the number of bytes written (excluding the NUL), or -1 if the encoded result and its terminating NUL wouldn't fit in cap bytes.
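One possible shape for a solution is sketched below. It is not the reference answer, only an illustration under two assumptions the statement does not spell out: cap counts the terminating NUL, and the contents of out are unspecified when -1 is returned. The helper is_unreserved is a hypothetical name introduced here.

```c
#include <stddef.h>

/* Hypothetical helper: 1 if c is in the RFC 3986 unreserved set, else 0.
   Explicit range checks avoid the locale dependence of isalnum(). */
static int is_unreserved(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9') ||
           c == '-' || c == '.' || c == '_' || c == '~';
}

int url_encode(const char *in, char *out, int cap)
{
    static const char hex[] = "0123456789ABCDEF";  /* uppercase digits */
    int n = 0;  /* bytes written so far, excluding the NUL */

    if (cap <= 0)
        return -1;  /* no room even for the terminating NUL */

    /* Read the input as unsigned bytes so values >= 0x80 index hex[] safely. */
    for (const unsigned char *p = (const unsigned char *)in; *p != '\0'; p++) {
        int need = is_unreserved(*p) ? 1 : 3;

        /* Assumption: cap includes the NUL, so one byte is always kept spare. */
        if (need > cap - 1 - n)
            return -1;

        if (need == 1) {
            out[n++] = (char)*p;
        } else {
            out[n++] = '%';
            out[n++] = hex[*p >> 4];    /* high nibble */
            out[n++] = hex[*p & 0x0F];  /* low nibble */
        }
    }

    out[n] = '\0';
    return n;
}
```

With this sketch, url_encode("a b~", buf, 16) writes a%20b~ and returns 6, while url_encode("/", buf, 3) returns -1 because %2F plus the NUL needs four bytes.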

Why this matters

Any byte placed into a URL component as data must be %XX-encoded unless it is in the safe set; otherwise reserved characters such as /, ?, & and = can be mistaken for delimiters. Getting this right is the most basic step in any URL builder.

Input format

A NUL-terminated input string, an output buffer, and the buffer's capacity cap in bytes.

Output format

The encoded length in bytes (excluding the NUL), or -1.

Constraints

Uppercase hex; the unreserved set ONLY.

Starter code

int url_encode(const char *in, char *out, int cap) { /* TODO */ (void)in; (void)out; (void)cap; return -1; }

Common mistakes

Encoding ~ (it's in the safe set). Emitting lowercase hex (some validators reject it).
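The lowercase-hex mistake often comes from a format string such as %02x. A related pitfall, not listed above, is formatting a plain char directly: where char is signed, a byte like 0xE9 is promoted to a negative int and no longer formats as two digits. The snippet below is a minimal sketch of one way to avoid both; percent_escape is a hypothetical helper name, not part of the exercise.

```c
#include <stdio.h>

/* Hypothetical helper: writes "%XX" plus a NUL into dst (4 bytes). */
static void percent_escape(char c, char dst[4])
{
    /* %02X (capital X) yields uppercase digits. The cast through
       unsigned char keeps bytes >= 0x80 in the range 0..255. */
    snprintf(dst, 4, "%%%02X", (unsigned)(unsigned char)c);
}
```

Under these assumptions a space becomes %20, a slash becomes %2F rather than %2f, and the byte 0xE9 becomes %E9.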

Edge cases to handle

Empty input. All-safe input. All-unsafe input. cap=0.

Complexity

O(n) time, where n is the length of in.
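Because each input byte produces either one or three output bytes, the encoded length can be computed in a single pass, and a buffer of 3n + 1 bytes is always large enough for an input of n bytes. The following sketch illustrates that count; url_encoded_len is a hypothetical helper introduced here for illustration and is not required by the exercise.

```c
#include <stddef.h>

/* Hypothetical helper: length of the encoded form of in, excluding the NUL. */
size_t url_encoded_len(const char *in)
{
    size_t len = 0;

    for (const unsigned char *p = (const unsigned char *)in; *p != '\0'; p++) {
        /* Same unreserved set as the challenge: A-Z a-z 0-9 - . _ ~ */
        int safe = (*p >= 'A' && *p <= 'Z') || (*p >= 'a' && *p <= 'z') ||
                   (*p >= '0' && *p <= '9') ||
                   *p == '-' || *p == '.' || *p == '_' || *p == '~';

        len += safe ? 1 : 3;  /* copied as-is, or expanded to %XX */
    }
    return len;
}
```

A caller could use the result plus one as the cap to pass to url_encode, trading a second pass over the input for a buffer that is guaranteed to fit.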

Solve this exercise in the browser editor — compile and run against the test harness, no setup required.