cybersecurity · intermediate · ~15 min · safe pentest lab

Count suspicious IPs in a sample log

Walk a text buffer line by line, group by a key, count occurrences, and threshold the result.

Challenge

Spot the noisy IPs in an access log

You're sitting in front of a small access-log buffer that an HTTP server kept. Most lines are normal traffic — one or two visits per IP. A handful of IPs hit the server far more often. Your job is to count them.

Real-world frame

This is the bread-and-butter of any defensive log review: before you go hunting for what an attacker did, you scan for who's making a suspicious amount of noise. Brute-force tools, dumb scanners, and crawlers all show up as one IP with an outsized request count.

Task

Implement:

int suspicious_ip_count(const char *log, int threshold);

It should return the number of distinct source IPs that appear at least threshold times in the buffer.

Input

  • log: a single multi-line string of Common-Log-Format-style lines. Each line starts with an IP, followed by a space, followed by the rest of the line.
  • threshold: the minimum request count for an IP to be flagged.
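The "IP, then a space, then the rest" rule can be sketched as a bounded search for the first space within one line. This is only an illustration, not the grader's reference code; the helper name ip_token_len is hypothetical and does not appear in the exercise.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the exercise API).
 * Returns the length of the leading IP token in line[0..line_len), that is,
 * the number of bytes before the first space. Returns 0 when the line has
 * no space, or starts with one, so the caller can skip the line. */
static size_t ip_token_len(const char *line, size_t line_len) {
    /* memchr never looks beyond line_len bytes, so the search cannot
     * spill into the next line or past the end of the buffer. */
    const char *sp = memchr(line, ' ', line_len);
    if (sp == NULL) {
        return 0;               /* no space on this line: malformed */
    }
    return (size_t)(sp - line); /* 0 if the line starts with a space */
}
```

For a line such as "10.0.0.1 - - [x]" the helper reports 8, the length of "10.0.0.1". Passing the line length explicitly, rather than relying on strchr, is what keeps the search inside the current line.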

Output

  • Return the count of IPs whose appearance count is >= threshold.

Examples

Log                                         threshold   returns
10.0.0.1 ...\n10.0.0.1 ...\n10.0.0.2 ...    2           1 (only 10.0.0.1)
same log                                    1           2
same log                                    5           0
"" (empty)                                  1           0

Edge cases

  • Empty log buffer → 0.
  • A line with no space → skip it.
  • Same IP appearing many times: count once toward "distinct".
  • An IP exactly at the threshold counts.

Rules

  • Operate on the string buffer the grader passes — never read from a real file.
  • Don't allocate dynamic memory if you can help it; a small fixed-size hash table is fine.
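One way to stay within these rules is an open-addressing table held in static storage. The sketch below is one possible approach, not the reference solution. The capacity TAB_SIZE, the field width IP_MAX, the struct layout, and the FNV-1a hash are all assumptions chosen for illustration, and lines are silently dropped if the table ever fills up.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IP_MAX   46    /* assumed: room for an IPv6 text address plus NUL */
#define TAB_SIZE 1024  /* assumed capacity; must be a power of two */

struct entry {
    char ip[IP_MAX];   /* NUL-terminated key; ip[0] == '\0' marks a free slot */
    int  count;
};

/* 32-bit FNV-1a over the token bytes (an arbitrary choice of hash). */
static uint32_t hash_bytes(const char *s, size_t n) {
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < n; i++) {
        h ^= (unsigned char)s[i];
        h *= 16777619u;
    }
    return h;
}

int suspicious_ip_count(const char *log, int threshold) {
    /* Static storage avoids malloc and a large stack frame, at the cost of
     * making the function non-reentrant. */
    static struct entry tab[TAB_SIZE];
    memset(tab, 0, sizeof tab);            /* reset state between calls */

    if (log == NULL) {
        return 0;                          /* assumption: treat NULL as empty */
    }

    const char *p = log;
    while (*p != '\0') {
        /* A line ends at '\n' or, for the last line, at the NUL. */
        const char *eol = strchr(p, '\n');
        size_t line_len = eol ? (size_t)(eol - p) : strlen(p);

        /* The IP is everything before the first space inside this line. */
        const char *sp = memchr(p, ' ', line_len);
        size_t ip_len = sp ? (size_t)(sp - p) : 0;

        /* Skip lines with no space, an empty token, or an oversized token. */
        if (ip_len > 0 && ip_len < IP_MAX) {
            size_t i = hash_bytes(p, ip_len) & (TAB_SIZE - 1);
            for (size_t probes = 0; probes < TAB_SIZE; probes++) {
                struct entry *e = &tab[i];
                if (e->ip[0] == '\0') {    /* free slot: first sighting */
                    memcpy(e->ip, p, ip_len);
                    e->ip[ip_len] = '\0';
                    e->count = 1;
                    break;
                }
                if (strncmp(e->ip, p, ip_len) == 0 && e->ip[ip_len] == '\0') {
                    e->count++;            /* same IP seen again */
                    break;
                }
                i = (i + 1) & (TAB_SIZE - 1);  /* linear probing */
            }
        }

        p += line_len;
        if (*p == '\n') {
            p++;                           /* step over the newline, if any */
        }
    }

    /* Each occupied slot is one distinct IP; count those at or above threshold. */
    int flagged = 0;
    for (size_t i = 0; i < TAB_SIZE; i++) {
        if (tab[i].ip[0] != '\0' && tab[i].count >= threshold) {
            flagged++;
        }
    }
    return flagged;
}
```

Counting in a final pass over the table, rather than while scanning, keeps the "distinct" requirement simple: an IP contributes at most one to the result no matter how often it appears.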

Why this matters

The same small count-and-threshold function sits at the core of many brute-force-detection tools. Get it right once and you can reuse it wherever you review logs.

Input format

A NUL-terminated multi-line string + an integer threshold.

Output format

A non-negative integer count.

Constraints

No file I/O. No network. Pure string + table logic.

Starter code

#include <stdio.h>
#include <string.h>

int suspicious_ip_count(const char *log, int threshold) {
    /* TODO: count distinct source IPs that appear at least `threshold` times. */
    return 0;
}

Common mistakes

  • Re-scanning the whole table for every line and accidentally adding duplicate rows.
  • Forgetting that the terminating \n is optional on the last line.
  • Reading past the end of the buffer when the last line is a bare IP with no space after it.
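The two buffer-boundary mistakes can be avoided by computing each line's bounds before looking inside it. The following is a minimal sketch under that idea; next_line and count_wellformed_lines are hypothetical names introduced only for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical iterator over a NUL-terminated multi-line buffer.
 * On success it stores the start and length of the next line (without the
 * '\n'), advances *cursor past it, and returns 1. Returns 0 at end of buffer. */
static int next_line(const char **cursor, const char **line, size_t *len) {
    const char *p = *cursor;
    if (*p == '\0') {
        return 0;                          /* nothing left to read */
    }
    const char *eol = strchr(p, '\n');
    *line = p;
    /* The last line may end at the NUL instead of at a newline. */
    *len = eol ? (size_t)(eol - p) : strlen(p);
    *cursor = eol ? eol + 1 : p + *len;
    return 1;
}

/* Counts the lines that contain a space within their own bounds. */
static int count_wellformed_lines(const char *log) {
    const char *cur = log;
    const char *line;
    size_t len;
    int n = 0;
    while (next_line(&cur, &line, &len)) {
        /* Bounded search: a bare IP on the final line simply has no space. */
        if (memchr(line, ' ', len) != NULL) {
            n++;
        }
    }
    return n;
}
```

Because every search is limited to len bytes, a final line such as "10.0.0.9" with neither a space nor a newline is reported as malformed instead of causing a read past the terminator.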

Edge cases to handle

  • Empty log.
  • A single line without a trailing newline.
  • An IP token longer than the fixed-size field that stores it (sizeof tab[0].ip, if your table is an array named tab): treat it as malformed input and skip it.

Solve this exercise in the browser editor — compile and run against the test harness, no setup required.