Networking in C · intermediate · ~15 min

Parse an HTTP/1.1 request line in C

Walk an HTTP request line and pull out method, path, and version into a struct.

Overview

Three tokens, bounded copies into a struct, reject on overflow or bad terminator.

Why it matters

The request line is the first part of an HTTP request an attacker controls. A bounded, allow-list parser stops the easy attacks at the door.

Lesson

Why this matters

Every web proxy, WAF, and reverse-proxy access log starts the same way: it reads the first line of an HTTP request and pulls out three tokens (method, request-target, version). The line is plain ASCII ending in \r\n. HTTP/1.1 is a text-based protocol, so a C parser for it is small and worth reading.

Burp, mitmproxy, and nginx's access logging all contain a parser like this one; here we write it ourselves.

What the wire looks like

GET /index.html HTTP/1.1\r\n
Host: example.com\r\n
\r\n

The request line is three tokens separated by single spaces, followed by \r\n.

Your job

Implement int parse_request_line(const char *buf, http_req_t *out) where http_req_t has bounded method[8], path[256], version[16] char arrays. Return 0 on success, -1 on any malformed input.
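A minimal sketch of the declarations the task describes. The field names and sizes come from the task statement; the assumption that each bound includes the terminating NUL (so method holds at most 7 visible characters) is ours.

```c
#include <stddef.h>

/* Sizes are taken from the task statement. Assumption: each array must
 * also hold a terminating NUL, so the usable lengths are 7, 255 and 15. */
typedef struct {
    char method[8];
    char path[256];
    char version[16];
} http_req_t;

/* Returns 0 on success, -1 on any malformed input. */
int parse_request_line(const char *buf, http_req_t *out);
```

With these bounds, the seven-character methods OPTIONS and CONNECT still fit in method with room for the NUL.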

Rules

  • Reject if any field would overflow its bound. Never copy with an unbounded strcpy.
  • Reject if the line does not end in \r\n.
  • The path can contain /, alphanumerics, ?, &, =, ., -, _. Reject anything else for this exercise.
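The overflow and allow-list rules above can be sketched as two small helpers. The names path_char_ok and copy_token are hypothetical, not part of the exercise; the character set is the one listed in the rules, written as explicit ASCII ranges so the check does not depend on the locale.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: 1 if c is in the exercise's path allow-list
 * (/, alphanumerics, ?, &, =, ., -, _), 0 otherwise. */
static int path_char_ok(char c)
{
    if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
        (c >= '0' && c <= '9'))
        return 1;
    /* strchr would also match the terminating NUL, so exclude it. */
    return c != '\0' && strchr("/?&=.-_", c) != NULL;
}

/* Hypothetical helper: copy len bytes of src into dst and NUL-terminate.
 * cap is the full size of dst, including the NUL.
 * Returns 0 on success, -1 if the token is empty or would not fit. */
static int copy_token(char *dst, size_t cap, const char *src, size_t len)
{
    if (len == 0 || len >= cap)
        return -1;
    memcpy(dst, src, len);
    dst[len] = '\0';
    return 0;
}
```

Checking len >= cap before the memcpy is what makes the copy bounded: the function rejects the token rather than truncating it.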

Common mistakes

  • Using sscanf(buf, "%s %s %s", ...) without field widths. That's an unbounded write.
  • Forgetting the \r\n terminator check.
  • Allowing absurdly long paths because the per-field bound was never enforced.
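For the first mistake, here is a sketch of what field widths look like, assuming the struct from the task statement. The function name scan_request_line is hypothetical. Note that widths only fix the memory-safety problem: an over-long token is silently split across fields instead of being rejected, and the \r\n terminator is not checked, so this is not a complete solution to the exercise.

```c
#include <stdio.h>

typedef struct {
    char method[8];
    char path[256];
    char version[16];
} http_req_t;

/* Hypothetical helper: each width is the array size minus one, leaving
 * room for the NUL that sscanf appends. Writes stay in bounds, but
 * over-long tokens are split rather than rejected. */
static int scan_request_line(const char *buf, http_req_t *out)
{
    int n = sscanf(buf, "%7s %255s %15s",
                   out->method, out->path, out->version);
    return n == 3 ? 0 : -1;
}
```

Given a ten-character method such as ABCDEFGHIJ, this returns 0 with method set to ABCDEFG and the leftover HIJ stored as the path, which is why the explicit per-field length check is still needed.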

What this is NOT

  • A full HTTP parser. Headers, body, chunked encoding — all skipped.
  • A smuggling detector. That lives in parse-http-smuggling-defence.

Summary

Walk the line, split on spaces, copy into fixed-size fields with explicit length checks.
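Putting the pieces together, one possible shape for parse_request_line is sketched below. It is an illustration under stated assumptions, not a reference solution: take_token and PATH_ALLOWED are hypothetical names, the method and version are checked for length only (the rules constrain just the path), and a space inside the version token is treated as malformed.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    char method[8];
    char path[256];
    char version[16];
} http_req_t;

/* Allow-list for path characters, taken from the Rules section. */
static const char PATH_ALLOWED[] =
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789/?&=.-_";

/* Hypothetical helper: copy bytes from p into dst up to the delimiter.
 * Returns a pointer just past the delimiter, or NULL if the token is
 * empty, would overflow cap, or the input ends before the delimiter. */
static const char *take_token(const char *p, char delim,
                              char *dst, size_t cap)
{
    size_t n = 0;
    while (p[n] != '\0' && p[n] != delim) {
        if (n + 1 >= cap)           /* keep one byte for the NUL */
            return NULL;
        dst[n] = p[n];
        n++;
    }
    if (n == 0 || p[n] != delim)
        return NULL;
    dst[n] = '\0';
    return p + n + 1;
}

int parse_request_line(const char *buf, http_req_t *out)
{
    http_req_t tmp;                 /* fill a scratch copy first */
    const char *p = buf;

    if (buf == NULL || out == NULL)
        return -1;

    p = take_token(p, ' ', tmp.method, sizeof tmp.method);
    if (p == NULL)
        return -1;
    p = take_token(p, ' ', tmp.path, sizeof tmp.path);
    if (p == NULL)
        return -1;
    /* The version runs up to '\r', which must be followed by '\n'. */
    p = take_token(p, '\r', tmp.version, sizeof tmp.version);
    if (p == NULL || *p != '\n')
        return -1;

    /* strspn counts the leading allowed bytes; stopping before the end
     * of the string means a disallowed character was found. */
    if (tmp.path[strspn(tmp.path, PATH_ALLOWED)] != '\0')
        return -1;
    /* Assumption: a space inside the version token is malformed. */
    if (strchr(tmp.version, ' ') != NULL)
        return -1;

    *out = tmp;                     /* commit only on full success */
    return 0;
}
```

Parsing into a scratch struct and copying it to out only at the end means a rejected line never leaves out half-filled.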
