Secure Coding in C · intermediate · ~10 min

Input validation

Reject invalid input loudly, early, and explicitly.

Overview

Input validation is the discipline of rejecting malformed or malicious data at the boundary, before it flows deeper into your program. Every untrusted source — command-line args, environment variables, file contents, HTTP requests, sockets — must pass through a validator.

Why it matters

Many injection attacks (SQL, command, format-string, cross-site scripting) start with input that was never validated. Beyond security, validation also kills entire bug classes: code below the boundary can trust ranges, formats, and lengths, which simplifies every downstream check.

Core concepts

Allow-lists beat deny-lists. Specify what's valid, not what's forbidden: there are infinitely more invalid inputs than valid ones.

Validate types. Integers in range, strings of bounded length, IPs in canonical form.

Canonicalise first. Two filenames that mean the same thing (./foo and foo) should compare equal, so canonicalise before allow-listing.

Fail closed. If validation fails, refuse the operation; never 'best-effort' your way through.
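As a rough illustration of the allow-list idea, the sketch below accepts a string only if every character comes from an explicit set and its length is within bounds. The name is_hex_token and the 1..16 length limit are assumptions made for this example, not something the lesson prescribes.

```c
#include <string.h>

/* Hypothetical rule: accept 1..16 lowercase hex digits and nothing else. */
int is_hex_token(const char *s) {
    static const char allowed[] = "0123456789abcdef";
    if (s == NULL) return 0;               /* fail closed on missing input */
    size_t len = strlen(s);
    if (len < 1 || len > 16) return 0;     /* bounded length */
    /* strspn returns the length of the leading run made only of allowed
       characters; the input is valid only if that run covers all of it. */
    return strspn(s, allowed) == len;
}
```

Note that the function never lists what is forbidden. Uppercase letters, spaces, and shell metacharacters are all rejected simply because they are not in the allowed set.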

Syntax notes

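#include <stdlib.h>

int valid_port(const char *s) {
    char *end;
    if (!s || *s < '0' || *s > '9') return 0; // must start with a digit: rejects NULL, "", whitespace, signs
    long n = strtol(s, &end, 10);     // parse; overflow saturates at LONG_MAX
    if (*end != '\0') return 0;       // trailing garbage
    if (n < 1 || n > 65535) return 0; // out of range
    return 1;
}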

Lesson

Trust no input. Validate length, character set, and semantics before acting on data. Prefer strict allow-lists (only digits) over deny-lists (no semicolons).

A failed validation should produce a clear error and stop processing — never "fix up" malicious input silently. That's how injection bugs slip through.
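To make the point about silent fix-ups concrete, here is a small sketch contrasting a naive sanitiser with a validator that simply refuses. Both function names (strip_dotdot, path_is_acceptable) are hypothetical, and the rejection rule is deliberately stricter than a real application might need.

```c
#include <string.h>

/* Naive "sanitiser": removes each "../" in one left-to-right pass.
   Shown only as what NOT to rely on. The caller must supply an out
   buffer of at least strlen(in) + 1 bytes. */
void strip_dotdot(const char *in, char *out) {
    while (*in) {
        if (strncmp(in, "../", 3) == 0) { in += 3; continue; }
        *out++ = *in++;
    }
    *out = '\0';
}

/* Validator: refuse the input instead of repairing it. */
int path_is_acceptable(const char *p) {
    if (p == NULL || *p == '\0') return 0;
    if (p[0] == '/') return 0;               /* absolute path */
    if (strstr(p, "..") != NULL) return 0;   /* any ".." at all: strict but simple */
    return 1;
}
```

Feeding "....//etc/passwd" to strip_dotdot removes the inner "../" and leaves "../etc/passwd", so the repaired string is exactly the attack it was meant to prevent. The validator rejects the same input outright.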

Code examples

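#include <limits.h>

int parse_safe_int(const char *s, int *out) {
    if (!s || !*s || !out) return -1;
    int neg = 0;
    if (*s == '-') { neg = 1; s++; if (!*s) return -1; }
    int n = 0, d = 0;              // n accumulates as a non-positive value so INT_MIN fits
    while (*s >= '0' && *s <= '9') {
        int digit = *s - '0';
        if (n < (INT_MIN + digit) / 10) return -1;  // n*10 - digit would overflow
        n = n*10 - digit; s++; d++;
    }
    if (d == 0 || *s) return -1;   // need at least one digit and no trailing junk
    if (!neg && n == INT_MIN) return -1;  // the positive counterpart of INT_MIN does not fit in int
    *out = neg ? n : -n;
    return 0;
}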

Debugging tips

Log every rejected input with its reason. Run your validator against a corpus of known-bad examples as a regression test, and use a fuzzer to generate the inputs you did not think of. If you have a parse_or_default function, audit every call to make sure the default is safe.
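One possible way to combine reason logging with a known-bad corpus is sketched below. The reason codes, the check_digits rule (1..8 decimal digits), and the corpus entries are all assumptions chosen for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical reason codes; a real project would define its own. */
enum v_result { V_OK = 0, V_NULL, V_EMPTY, V_TOO_LONG, V_BAD_CHAR };

static const char *v_reason(enum v_result r) {
    switch (r) {
    case V_OK:       return "ok";
    case V_NULL:     return "null input";
    case V_EMPTY:    return "empty";
    case V_TOO_LONG: return "too long";
    case V_BAD_CHAR: return "disallowed character";
    }
    return "unknown";
}

/* Example rule (an assumption): 1..8 decimal digits. */
enum v_result check_digits(const char *s) {
    if (s == NULL) return V_NULL;
    if (*s == '\0') return V_EMPTY;
    size_t len = strlen(s);
    if (len > 8) return V_TOO_LONG;
    if (strspn(s, "0123456789") != len) return V_BAD_CHAR;
    return V_OK;
}

/* Runs the validator over a small corpus of known-bad inputs, logging the
   reason for each rejection. Returns how many bad inputs were wrongly
   accepted, so 0 means the validator held.
   Note: production logging should escape control characters in the input
   before printing it; that step is omitted here for brevity. */
int run_reject_corpus(void) {
    static const char *bad[] = { "", "12a", "-1", " 7", "123456789", "1;ls" };
    int accepted = 0;
    for (size_t i = 0; i < sizeof bad / sizeof bad[0]; i++) {
        enum v_result r = check_digits(bad[i]);
        if (r == V_OK) accepted++;
        else fprintf(stderr, "rejected \"%s\": %s\n", bad[i], v_reason(r));
    }
    return accepted;
}
```

Returning a reason code rather than a bare 0 or 1 keeps the log message and the test expectations tied to the same source of truth.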

Memory safety

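Length validation is itself a memory-safety check. When dst is an array in scope, if (strlen(s) >= sizeof dst) return -1; stops a buffer overflow before it happens. If dst is a pointer, sizeof dst is the size of the pointer rather than the buffer, so pass the buffer size explicitly.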
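A minimal sketch of that check wrapped in a helper might look like the following. The name copy_checked is hypothetical; the point is that the destination size travels with the pointer and that oversized input is rejected rather than truncated.

```c
#include <string.h>

/* Copies src into dst only if it fits, including the terminating NUL.
   dst_size is passed explicitly because sizeof applied to a pointer
   parameter yields the pointer size, not the buffer size.
   Returns 0 on success, -1 on rejection. */
int copy_checked(char *dst, size_t dst_size, const char *src) {
    if (dst == NULL || src == NULL || dst_size == 0) return -1;
    size_t len = strlen(src);
    if (len >= dst_size) return -1;   /* no room for the terminator: reject */
    memcpy(dst, src, len + 1);        /* len + 1 copies the NUL as well */
    return 0;
}
```

A caller with a local array would write copy_checked(buf, sizeof buf, input) and treat a -1 return as a validation failure.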

Real-world uses

URL routing, SQL parameter binding, command-line argument parsing, packet header validation in firewalls, config file loaders, every signup form.

Practice tasks

  1. Write int valid_username(const char *) that allows lowercase, digits, _, - only, length 3..32.
  2. Write a function that parses an IPv4 address strictly.
  3. Reject paths containing .. or absolute paths in a 'safe' file open.

Summary

Input validation is the boundary discipline. Allow-list, canonicalise, fail closed. Done well, it eliminates whole classes of bugs and security vulnerabilities — done poorly, it's the first crack in every exploit chain.
