linux-sysprog · intermediate · ~10 min

Demonstrate a race condition

Confirm by experiment that `counter++` is not atomic — and that the result depends on thread scheduling.

Challenge

See a race condition with your own eyes

Two threads. One shared counter. Each thread bumps the counter a few hundred thousand times. The expected total is the sum — but the actual total is almost always less. That's a race condition, and seeing it once cures every "++ is atomic, right?" misconception forever.
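To make the lost update concrete before any threads are involved, here is a minimal single-threaded replay of one unlucky interleaving. The function name lost_update_replay and the variables reg_a and reg_b are hypothetical, standing in for the registers two threads would use; this is an illustration of the load, add, store sequence, not the exercise solution.

```c
/* Deterministic replay of one lost update. No threads are created: the
   statements are simply written in the order an unlucky schedule could
   run them. reg_a and reg_b stand in for each thread's private register. */
long lost_update_replay(void) {
    long counter = 0;

    long reg_a = counter;  /* thread A loads 0 */
    long reg_b = counter;  /* thread B loads 0 before A has stored */

    reg_a += 1;            /* A computes 1 */
    reg_b += 1;            /* B computes 1 */

    counter = reg_a;       /* A stores 1 */
    counter = reg_b;       /* B stores 1, overwriting A's update */

    return counter;        /* two increments ran, but the result is 1 */
}
```

Each counter++ in a real thread goes through the same three steps, and the scheduler decides how they interleave.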

Task

Implement:

long unsafe_counter(int nthreads, int per_thread);

The function should spawn nthreads pthreads, have each of them increment a shared long counter per_thread times without any lock, join them all, and return the final counter value.

Input

  • nthreads: how many threads to spawn (the harness uses 1, 4, 8).
  • per_thread: how many increments each thread performs.

Output

  • The final value of the shared counter.
  • For nthreads = 1, no race is possible — return value equals per_thread exactly.
  • For nthreads >= 2, lost updates can only lower the total, so in practice the return value is at most nthreads * per_thread and usually falls short of it.

Examples

nthreads  per_thread  expected (no race)  observed (race)
1         1000        1000                1000 (no race)
4         100000      400000              usually less
8         50000       400000              usually less

Edge cases

  • nthreads <= 0 → return -1.
  • nthreads > 64 → return -1.
  • per_thread == 0 → return 0.

Hints

  1. Conceptual: counter++ looks atomic but is actually a read-modify-write: load the value, add one, store the result. Two threads can load the same value, both add one, and both store the same result, so one of the two increments is lost.
  2. Implementation: a global long g_counter = 0, a worker that loops per_thread times doing g_counter++, then standard pthread_create + pthread_join.
  3. Common bug: declaring the counter volatile and assuming that makes the increment safe. volatile only forces the compiler to emit a real load and store for every access; the load, add, and store remain separate steps, so updates can still be lost.
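The hints above can be sketched roughly as follows. This is one possible shape, not a reference solution. Assumptions not in the exercise text: the names worker and MAX_THREADS, returning -1 if pthread_create fails, and declaring the counter volatile so that an optimizing compiler does not fold the loop into a single addition (it does not make the increment atomic). Formally, a data race is undefined behavior in C11; on mainstream compilers and hardware it shows up as lost updates.

```c
#include <pthread.h>

#define MAX_THREADS 64  /* upper bound from the edge cases; the name is hypothetical */

/* Shared counter. volatile is an assumption of this sketch: it keeps the
   compiler from collapsing the loop into one add. It adds no atomicity. */
static volatile long g_counter = 0;

static void *worker(void *arg) {
    int n = *(const int *)arg;   /* every thread reads the same per_thread value */
    for (int i = 0; i < n; i++) {
        g_counter++;             /* unprotected load, add, store */
    }
    return NULL;
}

long unsafe_counter(int nthreads, int per_thread) {
    if (nthreads <= 0 || nthreads > MAX_THREADS) return -1;
    if (per_thread == 0) return 0;

    pthread_t tids[MAX_THREADS];
    int started = 0;

    g_counter = 0;               /* reset the global on every call */

    while (started < nthreads &&
           pthread_create(&tids[started], NULL, worker, &per_thread) == 0) {
        started++;
    }

    /* Join whatever was started, even if a later create failed. */
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
    }

    if (started != nthreads) return -1;  /* assumption: treat a failed spawn as an error */
    return g_counter;
}
```

Calling unsafe_counter(4, 100000) several times in a row will typically print a different total on each run, which is the scheduling dependence the exercise is after.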

Common mistakes

  • Reaching for _Atomic long. It is not needed to show the race, and it would actually remove it; this exercise wants the broken version.
  • Forgetting to reset the global counter between test runs.
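For contrast, here is a rough sketch of what an atomic counter would look like, which is exactly why it does not belong in this exercise: with it, the total always comes out exact. The names atomic_counter, atomic_worker, and g_atomic_counter are hypothetical; the sketch assumes a C11 compiler with <stdatomic.h>.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Atomic shared counter; static storage starts it at zero. */
static atomic_long g_atomic_counter;

static void *atomic_worker(void *arg) {
    int n = *(const int *)arg;
    for (int i = 0; i < n; i++) {
        /* One indivisible read-modify-write: no update can be lost. */
        atomic_fetch_add(&g_atomic_counter, 1);
    }
    return NULL;
}

long atomic_counter(int nthreads, int per_thread) {
    if (nthreads <= 0 || nthreads > 64) return -1;

    pthread_t tids[64];
    int started = 0;

    atomic_store(&g_atomic_counter, 0);  /* reset between calls */

    while (started < nthreads &&
           pthread_create(&tids[started], NULL, atomic_worker, &per_thread) == 0) {
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
    }

    if (started != nthreads) return -1;  /* assumption: a failed spawn is an error */
    return atomic_load(&g_atomic_counter);
}
```

With this version, atomic_counter(4, 100000) returns 400000 on every run, so there is nothing left to observe.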

Learning connection

This is the class of bug behind many "intermittent" crashes in multithreaded C codebases. The cure (pthread_mutex_t) is the next exercise: fix-race-with-mutex.

Pro tip

Build your solution with gcc -fsanitize=thread -g -pthread and run the resulting binary to see the race reported explicitly. ThreadSanitizer is one of the most useful tools for debugging concurrent C code.

Why this matters

One run shows you the bug. After that, you stop trusting ++ on shared data for the rest of your career.

Input format

Two integers.

Output format

One long — the final counter value.

Constraints

Do NOT add a mutex. The whole point is to observe the race.

Starter code

#include <pthread.h>
long unsafe_counter(int nthreads, int per_thread) {
    /* TODO: spawn threads that increment a shared counter WITHOUT a mutex.
       That's the whole point — observe the race. */
    return -1;
}

Common mistakes

Adding a mutex (defeats the point). Allocating per-thread counters (also defeats the point). Forgetting to reset the global on each call.

Edge cases to handle

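Single-threaded run: no race. Very small loop count: the race is rare, because a thread can finish its loop before the next one has even started. Very large loop count: lost updates are almost guaranteed.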
