File Handling · beginner · ~15 min
Stream-process a log file and aggregate per-key statistics.
Log parsing is one of the most common tasks in operational pen testing and incident response. The recipe: open the file → loop with fgets → classify each line → update counters → emit a summary at end-of-stream.
Logs are the historical record. They contain forensic evidence, anomaly markers, and the trail of any incident. Reading them fast and accurately is a daily skill for defenders and red-teamers alike.
Stream, don't slurp. Read line-by-line; never load multi-gigabyte logs into memory.
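Strip CR/LF. fgets keeps the trailing newline, and in CRLF files the carriage return before it. Strip both before any string compare.
Cap line length. Read into a fixed buffer (e.g. 4 KB) and reject or flag lines that do not fit. fgets hands back an oversized line in buffer-sized chunks, so a full buffer with no newline is the signal.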
Pentester mindset. Attackers inject \r\n into fields they control (filename, User-Agent, etc.) to forge log lines. Detect by scanning user-controlled fields for control bytes before logging.
Defensive coding habit. Any field you log that came from outside must have CR/LF stripped first; this prevents log injection (CWE-117).
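The defensive habit above can be sketched as a small filter applied to every externally supplied field before it is written to a log. The helper name sanitize_log_field and the choice of '_' as the replacement byte are assumptions made for illustration; a real logger might escape the bytes instead of replacing them.

```c
#include <stddef.h>

/* Hypothetical helper: copy `in` to `out`, replacing every control byte
 * (anything below 0x20, plus DEL; this includes CR, LF and TAB) with '_'.
 * The output is truncated to fit `cap` and is NUL-terminated when cap > 0.
 * Returns the number of bytes replaced, which is also a cheap detection
 * signal: a non-zero result means the field carried control bytes. */
static size_t sanitize_log_field(const char *in, char *out, size_t cap)
{
    size_t replaced = 0;
    size_t i = 0;

    if (cap == 0)
        return 0;

    for (; in[i] != '\0' && i + 1 < cap; i++) {
        unsigned char c = (unsigned char)in[i];
        if (c < 0x20 || c == 0x7f) {
            out[i] = '_';
            replaced++;
        } else {
            out[i] = (char)c;
        }
    }
    out[i] = '\0';
    return replaced;
}
```

Calling it on a forged User-Agent such as "curl/8.0\r\nFailed password for root" yields a single-line field, and the non-zero return value can itself be logged as an injection attempt.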
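FILE *fp = fopen(path, "r");
if (!fp) { perror(path); return -1; } /* fopen can fail; assumes an int-returning caller */
char line[4096];
while (fgets(line, sizeof line, fp)) {
    line[strcspn(line, "\r\n")] = '\0'; /* strip trailing CR/LF */
    /* classify / aggregate */
}
fclose(fp);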
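The loop above silently splits an oversized line into several chunks. One way to enforce the length cap is sketched below; read_line_capped and its oversized flag are hypothetical names, and the policy of keeping the first chunk while discarding the rest is an assumption. The sketch also assumes the data contains no embedded NUL bytes, since it relies on strlen.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line into buf and strip CR/LF.
 * Returns 1 when a line was read, 0 at end-of-file or on error.
 * *oversized is set to 1 when the line did not fit in buf; the rest of
 * that line is then discarded so the next call starts on a fresh line. */
static int read_line_capped(FILE *fp, char *buf, size_t cap, int *oversized)
{
    *oversized = 0;
    if (cap < 2 || !fgets(buf, (int)cap, fp))
        return 0;

    size_t len = strlen(buf);
    int has_nl = len > 0 && buf[len - 1] == '\n';

    /* Full buffer without a newline: the line may continue. Drain it. */
    if (!has_nl && len + 1 == cap) {
        int c;
        while ((c = getc(fp)) != EOF && c != '\n')
            *oversized = 1;   /* at least one byte did not fit */
    }

    buf[strcspn(buf, "\r\n")] = '\0';
    return 1;
}
```

A caller can then skip or count lines for which oversized is set, rather than classifying a truncated fragment as if it were a complete record.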
Logs are line-delimited text. A parser walks them one line at a time, classifies each line (by severity such as info/warn/error, or by key such as IP, user, or route), and updates a counter or running statistic for that class.
unsigned long count_failed = 0; /* lines containing the marker */
char line[4096];
while (fgets(line, sizeof line, fp)) {
    if (strstr(line, "Failed password")) count_failed++;
}
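The single counter above generalizes to one counter per class. The sketch below classifies by severity; the tag spellings "ERROR", "WARN" and "INFO" and the names classify and count_levels are assumptions, since real log formats differ. Lines longer than the buffer are counted once per chunk in this simplified version.

```c
#include <stdio.h>
#include <string.h>

/* Severity buckets; the tag spellings matched below are assumptions. */
enum level { LVL_INFO, LVL_WARN, LVL_ERROR, LVL_OTHER, LVL_COUNT };

/* Classify one line, checking tags in priority order (ERROR first). */
static enum level classify(const char *line)
{
    if (strstr(line, "ERROR")) return LVL_ERROR;
    if (strstr(line, "WARN"))  return LVL_WARN;
    if (strstr(line, "INFO"))  return LVL_INFO;
    return LVL_OTHER;
}

/* Stream fp and add one to counts[level] for every line read.
 * The caller is expected to zero the counts array first. */
static void count_levels(FILE *fp, unsigned long counts[LVL_COUNT])
{
    char line[4096];
    while (fgets(line, sizeof line, fp)) {
        line[strcspn(line, "\r\n")] = '\0';
        counts[classify(line)]++;
    }
}
```

Indexing an array by the enum keeps the aggregation step to a single increment, and adding a new class only needs a new enumerator and one more test in classify.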
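/* Copy the text between "from " and " port" into out (capacity cap).
   Returns 1 on success, 0 if a marker is missing or out is too small. */
static int extract_ip(const char *line, char *out, size_t cap){
    const char *from = strstr(line, "from ");
    if (!from) return 0;
    from += 5; /* skip past "from " */
    const char *port = strstr(from, " port");
    if (!port) return 0;
    size_t n = (size_t)(port - from);
    if (n + 1 > cap) return 0; /* need room for n bytes plus the NUL */
    memcpy(out, from, n); out[n] = 0;
    return 1;
}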
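To turn the extracted address into a per-key statistic, each address needs its own counter. The sketch below uses a small fixed-size table with linear search; the names ip_table, ip_table_bump and count_failed_by_ip, the table size, and the sshd-style line format are assumptions for illustration. extract_ip is repeated so the example compiles on its own.

```c
#include <stdio.h>
#include <string.h>

#define MAX_IPS 256  /* assumed table size for this sketch */
#define IP_CAP  46   /* room for a textual IPv6 address plus NUL */

struct ip_count { char ip[IP_CAP]; unsigned long hits; };
struct ip_table { struct ip_count rows[MAX_IPS]; size_t used; };

/* Same logic as extract_ip in the text, repeated for self-containment. */
static int extract_ip(const char *line, char *out, size_t cap)
{
    const char *from = strstr(line, "from ");
    if (!from) return 0;
    from += 5;
    const char *port = strstr(from, " port");
    if (!port) return 0;
    size_t n = (size_t)(port - from);
    if (n + 1 > cap) return 0;
    memcpy(out, from, n);
    out[n] = '\0';
    return 1;
}

/* Add one hit for ip. Returns 0 only when the table is full. */
static int ip_table_bump(struct ip_table *t, const char *ip)
{
    for (size_t i = 0; i < t->used; i++) {
        if (strcmp(t->rows[i].ip, ip) == 0) {
            t->rows[i].hits++;
            return 1;
        }
    }
    if (t->used == MAX_IPS)
        return 0;
    snprintf(t->rows[t->used].ip, IP_CAP, "%s", ip);
    t->rows[t->used].hits = 1;
    t->used++;
    return 1;
}

/* Count "Failed password" lines per source address. The caller is
 * expected to zero the table before the first call. */
static void count_failed_by_ip(FILE *fp, struct ip_table *t)
{
    char line[4096], ip[IP_CAP];
    while (fgets(line, sizeof line, fp)) {
        line[strcspn(line, "\r\n")] = '\0';
        if (!strstr(line, "Failed password"))
            continue;
        if (extract_ip(line, ip, sizeof ip))
            ip_table_bump(t, ip);
    }
}
```

Linear search costs one pass over the table per matching line, which is acceptable for a few hundred distinct addresses; a hash table would be the usual replacement for larger key sets.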
Run your parser against an empty file, a one-line file, a file without a trailing newline, and a file with embedded NUL bytes. Each of those breaks naive parsers.
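The four edge cases can be exercised without fixture files by writing the bytes to a temporary file. The sketch below is one possible harness; scan_stream, scan_bytes and the nul_suspects heuristic are hypothetical. The heuristic flags a chunk that has no visible newline, does not fill the buffer, and was not ended by end-of-file, which leaves an embedded NUL as the only explanation. It misses a NUL on a final line that lacks a newline.

```c
#include <stdio.h>
#include <string.h>

struct scan_result { size_t lines; size_t nul_suspects; };

/* Count fgets chunks and flag those that look cut short by a NUL byte. */
static struct scan_result scan_stream(FILE *fp)
{
    struct scan_result r = {0, 0};
    char buf[4096];

    while (fgets(buf, sizeof buf, fp)) {
        size_t len = strlen(buf);  /* stops at the first NUL byte */
        int has_nl = len > 0 && buf[len - 1] == '\n';

        r.lines++;
        if (!has_nl && len + 1 < sizeof buf && !feof(fp))
            r.nul_suspects++;
    }
    return r;
}

/* Write n bytes to a temporary file, rewind, and scan it.
 * Returns a zeroed result if the temporary file cannot be created. */
static struct scan_result scan_bytes(const char *data, size_t n)
{
    struct scan_result r = {0, 0};
    FILE *fp = tmpfile();

    if (!fp)
        return r;
    fwrite(data, 1, n, fp);
    rewind(fp);
    r = scan_stream(fp);
    fclose(fp);
    return r;
}
```

For example, scan_bytes("a\nb", 3) exercises the missing trailing newline, and scan_bytes("ab\0cd\nok\n", 9) exercises the embedded NUL, where strlen sees only "ab" although fgets consumed the whole line.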
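On success, fgets always NUL-terminates the buffer; when it returns NULL, do not rely on the buffer contents. The size argument includes the NUL byte, so at most size - 1 characters are read. fgets(buf, sizeof buf, fp) is the correct idiom when buf is an array; with a pointer, sizeof gives the pointer size, so pass the real capacity instead.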
SSH brute-force detection, web-server analytics, anomaly alerts, billing aggregation, incident-response timeline reconstruction.
Stream, classify, aggregate. Always cap line length; always strip CR/LF; always treat content as untrusted.