linux-sysprog · intermediate · ~45 min
Stateful character-by-character parsing with multiple modes (in-word / in-quote).
Implement int shell_split(const char *line, char **argv, int max_argv):
Treat "..." as one token (without the quote characters).
Fill argv[0..ret-1], NULL-terminates after.
Modify line? No: allocate each token via strdup. Caller frees.
A POSIX shell is at heart a tokenizer plus a fork/exec loop. The trickiest bit is splitting a command line into argv while respecting quotes. Every shell on every UNIX-like OS does this dance, and getting it right is a great pointer/string exercise.
line is a null-terminated ASCII string. argv has space for max_argv pointers.
Number of tokens written, or -1 on parse error.
Use a small state machine. Don't use strtok: it writes NUL bytes into the string it scans, which is not allowed on a const input, and it has no notion of quotes.
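As a warm-up for the state machine, here is a minimal sketch that only counts tokens and allocates nothing. The helper name count_tokens and the three state names are illustrative, not part of the exercise. It assumes that a quote directly adjacent to word characters continues the same token, which the task text does not specify.

```c
#include <ctype.h>
#include <stddef.h>

/* Illustrative state names; the exercise does not prescribe them. */
enum split_state { BETWEEN, IN_WORD, IN_QUOTE };

/* Hypothetical helper: returns the number of tokens in line,
 * or -1 if a double quote is never closed. */
int count_tokens(const char *line) {
    enum split_state st = BETWEEN;
    int count = 0;

    for (const char *p = line; *p != '\0'; p++) {
        unsigned char c = (unsigned char)*p;
        switch (st) {
        case BETWEEN:
            if (c == '"') {             /* a quote starts a token, even an empty one */
                st = IN_QUOTE;
                count++;
            } else if (!isspace(c)) {   /* any other non-space starts a word */
                st = IN_WORD;
                count++;
            }
            break;
        case IN_WORD:
            if (c == '"')
                st = IN_QUOTE;          /* assumption: quote continues the same token */
            else if (isspace(c))
                st = BETWEEN;           /* whitespace ends the token */
            break;
        case IN_QUOTE:
            if (c == '"')
                st = IN_WORD;           /* closing quote; whitespace inside was kept */
            break;
        }
    }
    return st == IN_QUOTE ? -1 : count;
}
```

The same three states carry over to the real function; the only additions are a buffer for the characters of the current token and the strdup call when a token ends.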
#include <stddef.h>
int shell_split(const char *line, char **argv, int max_argv) { /* TODO */ return 0; }
Mishandling escape sequences (we don't require them — keep it simple); forgetting to NULL-terminate argv; not freeing tokens on the error path.
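One possible shape for a solution that avoids the last two pitfalls is sketched below; treat it as a sketch under stated assumptions, not the reference answer. The helper free_tokens is a hypothetical name. The sketch assumes the NULL terminator occupies one of the max_argv slots (so at most max_argv - 1 tokens fit), that running out of slots counts as an error, and that an unterminated quote is the parse error.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes strdup under strict -std= modes */
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: free the first n tokens and clear their slots. */
static void free_tokens(char **argv, int n) {
    for (int i = 0; i < n; i++) {
        free(argv[i]);
        argv[i] = NULL;
    }
}

int shell_split(const char *line, char **argv, int max_argv) {
    if (line == NULL || argv == NULL || max_argv < 1)
        return -1;

    /* Scratch buffer: no token can be longer than the line itself. */
    char *buf = malloc(strlen(line) + 1);
    if (buf == NULL)
        return -1;

    enum { BETWEEN, IN_WORD, IN_QUOTE } st = BETWEEN;
    size_t len = 0;   /* characters collected for the current token */
    int n = 0;        /* tokens stored so far */
    int ok = 1;

    for (const char *p = line; ok; p++) {
        unsigned char c = (unsigned char)*p;

        if (st == IN_QUOTE) {
            if (c == '\0') ok = 0;              /* unterminated quote */
            else if (c == '"') st = IN_WORD;    /* closing quote */
            else buf[len++] = (char)c;          /* whitespace is kept */
            continue;
        }
        if (c == '"') {                         /* opening quote, not copied */
            st = IN_QUOTE;
            continue;
        }
        if (c != '\0' && !isspace(c)) {         /* ordinary word character */
            st = IN_WORD;
            buf[len++] = (char)c;
            continue;
        }
        /* Whitespace or end of line: finish the current token, if any. */
        if (st == IN_WORD) {
            buf[len] = '\0';
            if (n >= max_argv - 1 || (argv[n] = strdup(buf)) == NULL)
                ok = 0;                         /* no room left, or out of memory */
            else
                n++;
            len = 0;
            st = BETWEEN;
        }
        if (c == '\0')
            break;
    }

    free(buf);
    if (!ok) {
        free_tokens(argv, n);                   /* error path: leak nothing */
        return -1;
    }
    argv[n] = NULL;                             /* always NULL-terminate */
    return n;
}
```

Funnelling every failure through the single ok flag means there is exactly one cleanup site, so a token allocated before the error cannot be leaked. On success the caller frees argv[0..ret-1] itself.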
Empty line returns 0. Trailing whitespace. Two consecutive spaces. Quoted empty string "".
O(n) time, O(tokens) memory.