Secure Coding in C · intermediate · ~20 min
Write parsers that refuse malformed input early and obviously.
Defensive parsing is strict by construction. You write a grammar; the parser refuses anything outside it; on rejection it returns a clear error code and frees any partial state.
Parser bugs recur across vulnerability classes: HTTP request smuggling, XML external entity (XXE) injection, JSON prototype pollution (in C extensions), file-format escape, and archive zip-slip. Strict parsing is often the cheapest fix.
Cap before parse. Read at most MAX_INPUT bytes. Refuse anything longer instead of truncating it.
Tokenize, don't regex. Tokens have explicit grammars; regexes hide bugs.
Default reject. Every switch / branch should end with a default: return -1; that explicitly refuses unknown input.
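The three rules above can be sketched together. This is a minimal illustration, not a reference implementation: the grammar (zero or more `word=number;` pairs), the MAX_INPUT value, and the names next_token and parse_pairs are all assumptions made for the example.

```c
#include <stddef.h>

#define MAX_INPUT 256  /* assumed cap; choose one that fits your protocol */

enum tok_kind { TOK_WORD, TOK_NUMBER, TOK_EQUALS, TOK_SEMI, TOK_END };
enum state { WANT_KEY, WANT_EQ, WANT_VAL, WANT_SEMI };

/* Reads one token from buf[*pos..len). Returns 0 on success and -1 on any
   byte outside the grammar [a-z]+ | [0-9]+ | '=' | ';'. */
static int next_token(const char *buf, size_t len, size_t *pos,
                      enum tok_kind *kind)
{
    size_t i = *pos;

    if (i == len) { *kind = TOK_END; return 0; }

    if (buf[i] == '=') { *kind = TOK_EQUALS; i++; }
    else if (buf[i] == ';') { *kind = TOK_SEMI; i++; }
    else if (buf[i] >= 'a' && buf[i] <= 'z') {
        *kind = TOK_WORD;
        while (i < len && buf[i] >= 'a' && buf[i] <= 'z') i++;
    } else if (buf[i] >= '0' && buf[i] <= '9') {
        *kind = TOK_NUMBER;
        while (i < len && buf[i] >= '0' && buf[i] <= '9') i++;
    } else {
        return -1;  /* unknown byte: reject, do not skip */
    }
    *pos = i;
    return 0;
}

/* Accepts exactly (word '=' number ';')* and counts the pairs.
   Returns 0 on success, -1 on anything else. */
int parse_pairs(const char *buf, size_t len, size_t *count)
{
    enum state st = WANT_KEY;
    enum tok_kind kind;
    size_t pos = 0, n = 0;

    if (!buf || !count || len > MAX_INPUT) return -1;  /* cap before parse */

    for (;;) {
        if (next_token(buf, len, &pos, &kind) != 0) return -1;
        switch (st) {
        case WANT_KEY:
            if (kind == TOK_END) { *count = n; return 0; }
            if (kind != TOK_WORD) return -1;
            st = WANT_EQ;
            break;
        case WANT_EQ:
            if (kind != TOK_EQUALS) return -1;
            st = WANT_VAL;
            break;
        case WANT_VAL:
            if (kind != TOK_NUMBER) return -1;
            st = WANT_SEMI;
            break;
        case WANT_SEMI:
            if (kind != TOK_SEMI) return -1;
            n++;
            st = WANT_KEY;
            break;
        default:
            return -1;  /* default reject: unknown state */
        }
    }
}
```

With this sketch, "a=1;bc=22;" is accepted with a count of 2, while "a=1" (missing ';'), "a==1;" and "A=1;" are all refused. Input can only end cleanly in the WANT_KEY state, so a truncated pair is never silently accepted.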
Pentester mindset. When two parsers disagree about what a piece of input means, you have a vulnerability (request smuggling, HTTP/2 desync, etc.). Strictness reduces the disagreement surface.
Defensive coding habit. Fail closed: on any error, free partial state and return a non-zero code. Never carry on with corrupted intermediate state.
See the state-machines lesson for the FSM skeleton and the input-validation lesson for the boundary discipline.
A safe parser is one that: (1) caps input length; (2) refuses anything not in its grammar; (3) reports failure with a specific error code; (4) leaves the program in a clean state on failure. The opposite approach, 'parse what you can, ignore the rest', is a common source of parser CVEs.
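#include <string.h> /* strlen; MAX_INPUT is defined by the application */
int parse_strict(const char *s, ...){
    /* s must be NUL-terminated; check the length before any parsing work */
    if (!s || strlen(s) > MAX_INPUT) return -1;
    /* ... parse, refusing anything unexpected ... */
    return 0; /* reached only when the whole input matched the grammar */
}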
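The cap is most useful when it is applied while reading, before the bytes ever reach a parser. A minimal sketch of that idea follows; the function name read_capped, the MAX_INPUT value, and the decision to refuse embedded NUL bytes are assumptions for illustration.

```c
#include <stdio.h>
#include <string.h>

#define MAX_INPUT 4096  /* assumed cap; tune per protocol */

/* Reads the rest of fp into buf, which must hold MAX_INPUT + 1 bytes, and
   NUL-terminates it. Returns 0 and sets *len on success. Returns -1 on a
   read error, on input longer than MAX_INPUT, or on an embedded NUL byte. */
int read_capped(FILE *fp, char *buf, size_t *len)
{
    size_t n;

    if (!fp || !buf || !len) return -1;

    /* Ask for one byte more than the cap so oversize input is detectable. */
    n = fread(buf, 1, MAX_INPUT + 1, fp);
    if (ferror(fp)) return -1;
    if (n > MAX_INPUT) return -1;          /* too long: refuse, do not truncate */
    if (memchr(buf, '\0', n)) return -1;   /* would hide data from strlen-based code */

    buf[n] = '\0';
    *len = n;
    return 0;
}
```

Requesting MAX_INPUT + 1 bytes is what separates "exactly at the cap" from "over the cap" without a second read. Refusing embedded NUL bytes keeps the byte count and strlen in agreement, so later string-based code sees the same input the reader saw.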
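#include <errno.h>
#include <stdlib.h>
/* Accepts only -?[0-9]+ with a value in [lo, hi]. */
int parse_strict_int(const char *s, int lo, int hi, int *out){
    if (!s || !out) return -1;
    /* strtol skips leading whitespace and accepts '+'; refuse both up front */
    if (*s != '-' && (*s < '0' || *s > '9')) return -1;
    char *end;
    errno = 0;
    long v = strtol(s, &end, 10);
    if (end == s || *end != '\0') return -1; /* no digits or trailing garbage: reject */
    if (errno == ERANGE) return -1; /* value overflowed long: reject */
    if (v < lo || v > hi) return -1; /* out of range: reject */
    *out = (int)v;
    return 0;
}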
Hand-roll a corpus of malformed inputs and run your parser against each; assert each is rejected.
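One way to set up such a corpus is a table of strings and a loop. The sketch below is an assumption-laden example: it repeats a strict variant of parse_strict_int so the block compiles on its own, and both the corpus entries and the name run_corpus are illustrative.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Strict variant of parse_strict_int, repeated here so the block is
   self-contained. Accepts only -?[0-9]+ with a value in [lo, hi]. */
int parse_strict_int(const char *s, int lo, int hi, int *out)
{
    char *end;
    long v;

    if (!s || !out) return -1;
    if (*s != '-' && (*s < '0' || *s > '9')) return -1;
    errno = 0;
    v = strtol(s, &end, 10);
    if (end == s || *end != '\0') return -1;
    if (errno == ERANGE) return -1;
    if (v < lo || v > hi) return -1;
    *out = (int)v;
    return 0;
}

/* Hand-rolled malformed inputs for the range [0, 100]. */
static const char *const bad_inputs[] = {
    "", " ", " 42", "42 ", "42\n", "+42", "4 2", "42abc", "0x2A", "4.2",
    "-", "--1", "101", "-1", "99999999999999999999999",
};

/* Returns the number of failed checks; 0 means the parser behaved. */
int run_corpus(void)
{
    int failures = 0;
    int v = 0;
    size_t i;

    for (i = 0; i < sizeof bad_inputs / sizeof bad_inputs[0]; i++) {
        if (parse_strict_int(bad_inputs[i], 0, 100, &v) == 0) {
            fprintf(stderr, "accepted malformed input #%lu: \"%s\"\n",
                    (unsigned long)i, bad_inputs[i]);
            failures++;
        }
    }
    if (parse_strict_int(NULL, 0, 100, &v) == 0) failures++;

    /* Sanity check: a valid input must still be accepted. */
    if (parse_strict_int("42", 0, 100, &v) != 0 || v != 42) failures++;

    return failures;
}
```

Keeping the corpus as a plain table makes it cheap to append a new entry every time a bug report or fuzzer run turns up an input the parser should have refused.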
Always free partial allocations on the failure path. A common bug is allocating into a struct, hitting a parse error, and returning without freeing.
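A common way to keep the failure path honest in C is to build the result in a local temporary and route every error through a single cleanup label. The sketch below assumes a hypothetical "name=value" record format; struct record, dup_range, and parse_record are names invented for the example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical parse result, used only for this illustration. */
struct record {
    char *name;
    char *value;
};

/* Copies n bytes of s into a fresh NUL-terminated string, or returns NULL. */
static char *dup_range(const char *s, size_t n)
{
    char *p = malloc(n + 1);
    if (!p) return NULL;
    memcpy(p, s, n);
    p[n] = '\0';
    return p;
}

/* Parses "name=value" with a non-empty name, a non-empty value, and exactly
   one '='. Returns 0 on success. On failure returns -1, frees anything
   allocated so far, and leaves *out with NULL members. */
int parse_record(const char *s, struct record *out)
{
    struct record tmp = { NULL, NULL };
    const char *eq, *val;

    if (!s || !out) return -1;

    eq = strchr(s, '=');
    if (!eq || eq == s) goto fail;          /* no '=' or empty name */

    tmp.name = dup_range(s, (size_t)(eq - s));
    if (!tmp.name) goto fail;

    val = eq + 1;
    /* tmp.name is already allocated here; a bare "return -1" would leak it. */
    if (*val == '\0' || strchr(val, '=')) goto fail;

    tmp.value = dup_range(val, strlen(val));
    if (!tmp.value) goto fail;

    *out = tmp;                             /* commit only after full success */
    return 0;

fail:
    free(tmp.name);                         /* free(NULL) is a no-op */
    free(tmp.value);
    out->name = NULL;
    out->value = NULL;
    return -1;
}
```

Because the caller's struct is written only at the commit point, a failed parse never exposes a half-filled record, and the caller owns (and must free) the two strings only when the function returns 0.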
This applies to HTTP parsers, config-file readers, packet decoders, and image-format loaders: every interface between your program and untrusted bytes.
Cap, tokenize, default-reject, fail closed. The four-step recipe for parsers that don't ship CVEs.