Before a pipe can connect two programs, both programs must exist as
running processes. This page covers the three system calls that make
that happen: fork, execve, and waitpid.
What is a process
Every running program is a process. The kernel assigns it a process ID
(PID), an independent address space, and a set of file descriptors. When
you run ./pipeline, the shell forks a child process — a copy of itself
— and that child calls execve to replace its memory image with
pipeline.
Processes form a tree. Every process has a parent. When a parent exits
before its child, the child is reparented to PID 1 (the init process).
When a child exits before its parent, the kernel keeps the child's exit
status in a table until the parent reads it — that is a zombie process.
waitpid is how the parent reads and clears it.
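The parent/child relationship can be checked from C. The sketch below is illustrative only: `child_sees_parent` is a hypothetical helper, not part of pipeline. It records the parent's PID, forks, and has the child compare `getppid()` against that value. The parent then reaps the child with `waitpid`, so no zombie is left behind.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if the child saw the caller as its
   parent, 1 if it did not, -1 if fork or waitpid failed. */
int child_sees_parent(void)
{
    pid_t parent = getpid(); /* recorded before the fork */
    pid_t pid = fork();
    int status;

    if (pid < 0)
        return (-1);
    if (pid == 0)
        _exit(getppid() == parent ? 0 : 1); /* child: compare its PPID */
    if (waitpid(pid, &status, 0) < 0)       /* parent: read and clear the exit status */
        return (-1);
    return (WIFEXITED(status) ? WEXITSTATUS(status) : -1);
}
```

The child uses `_exit` rather than `exit` so it does not flush stdio buffers it inherited from the parent.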
Inspecting processes
The kernel exposes every running process through /proc — a virtual
filesystem mounted at boot. Nothing in /proc lives on disk; the
kernel generates each entry on demand when you read it. Every process
gets a directory at /proc/<PID>/.
ls /proc/$$

$$ expands to the current shell's PID. The directory contains files
the kernel generates on demand:
| Entry | Contents |
|---|---|
| status | Name, state, PID, PPID, memory usage |
| fd/ | One symlink per open file descriptor |
| maps | Memory regions: address ranges, permissions, backing files |
cat /proc/$$/status
ls -l /proc/$$/fd

ps -p $$ shows the same process in a human-readable table: the same
ps from f01/08, reading /proc directly. top and htop do the same across all
processes. Both are interfaces over the same virtual filesystem.
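A C program can read the same files that ps reads. The following is a minimal, Linux-only sketch. The helper name `proc_status_field` is an assumption made for illustration. It opens /proc/self/status (self always refers to the calling process) and returns the numeric value of one field.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns the numeric value of a field such as
   "Pid" or "PPid" from /proc/self/status, or -1 if it cannot be read. */
long proc_status_field(const char *field)
{
    FILE *fp = fopen("/proc/self/status", "r");
    char line[256];
    size_t flen = strlen(field);
    long value = -1;

    if (!fp)
        return (-1);
    while (fgets(line, sizeof line, fp)) {
        /* lines look like "Pid:\t1234" */
        if (strncmp(line, field, flen) == 0 && line[flen] == ':') {
            if (sscanf(line + flen + 1, "%ld", &value) != 1)
                value = -1;
            break;
        }
    }
    fclose(fp);
    return (value);
}
```

Calling `proc_status_field("Pid")` should give the same number as `getpid()`.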
The fd/ directory becomes relevant starting on the next page. After
fork and dup2, you can inspect /proc/<PID>/fd/ on a running child
to see exactly which file descriptors are open and what they point to.
c05/08 uses strace -f to make the same transitions visible as system
calls in real time.
fork
Unix has no primitive to create a process from nothing. The only way to
start a new program is to split an existing process in two, then have
the child replace itself. Every command you type in a shell — ls,
grep, cat — starts that way: the shell forks, and the child calls
execve to become the new program. That is the pattern this chapter
builds. For the pipeline, it happens once per command in the chain.
fork creates an exact copy of the calling process. Both the parent
and the child return from fork — in the parent, fork returns the
child's PID; in the child, it returns zero. On failure it returns -1
and no child is created.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

pid_t pid;

pid = fork();
if (pid < 0) {
    perror("fork");
    exit(1);
}
if (pid == 0) {
    /* child: pid == 0 */
} else {
    /* parent: pid == child's PID */
}

The child inherits a copy of the parent's memory, file descriptors, and
signal handlers. It is a copy: changes in the child do not affect the
parent. The open files behind those descriptors, however, are shared at
the kernel level: if both parent and child hold an open file descriptor
for the write end of the same pipe, both must close it before the reader
sees EOF.
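The copy semantics can be observed directly. In this sketch, `fork_copy_demo` is a hypothetical function written for illustration. The child changes a local variable and reports its own value through its exit code, while the parent's variable keeps its original value.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: stores the parent's view of `value` and the child's
   exit code. Returns 0 on success, -1 on failure. */
int fork_copy_demo(int *parent_value, int *child_code)
{
    int value = 1;
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return (-1);
    if (pid == 0) {
        value = 2;    /* modifies the child's copy only */
        _exit(value); /* report the child's value as its exit code */
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return (-1);
    *parent_value = value;             /* unchanged in the parent */
    *child_code = WEXITSTATUS(status); /* what the child saw */
    return (0);
}
```

After a successful call, `*parent_value` is 1 and `*child_code` is 2.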
execve
execve replaces the current process's memory image with a new program.
If it succeeds, it never returns — the code after the call is gone.
If it fails, it returns -1 and the process continues.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
char *argv[] = { "ls", "-l", NULL };
char *envp[] = { NULL };
execve("/bin/ls", argv, envp);
perror("execve"); /* only reached on failure */
exit(127);

Three arguments:
- path: the absolute or relative path to the executable
- argv: a NULL-terminated array of argument strings; argv[0] is
  conventionally the program name
- envp: a NULL-terminated array of KEY=VALUE strings; pass the process's
  own environ to preserve the environment (Linux treats a NULL envp as an
  empty environment, but that is not portable)
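To show the envp argument in use, here is a minimal sketch that passes the caller's environment through to the new program. `run_with_environ` is a hypothetical helper, not part of pipeline. It assumes the usual fork-then-exec arrangement and returns the child's exit code to the caller.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ; /* the calling process's environment */

/* Hypothetical helper: runs path with argv in a child and hands the
   parent's environment to it. Returns the exit code, or -1 on failure. */
int run_with_environ(const char *path, char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return (-1);
    if (pid == 0) {
        execve(path, argv, environ);
        _exit(127); /* only reached if execve failed */
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return (-1);
    return (WEXITSTATUS(status));
}
```

With `argv` set to `{ "sh", "-c", "exit 7", NULL }` and `path` set to `/bin/sh`, the function returns 7. A path that does not exist yields 127.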
Exit code 127 after a failed execve is the shell convention for
"command not found". pipeline uses it the same way.
PATH search
execve does not search PATH: it needs a path to the executable file
itself. The shell knows ls means /bin/ls because it searches $PATH,
the variable introduced in f01/09.
PATH is a colon-separated list of directories:
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

To find a command, split PATH on : and check each directory.
Joining a directory and a command name into a path is a two-length
allocation — tci_strlen(dir) + 1 + tci_strlen(cmd) + 1 — followed
by two tci_strcpy calls:
#include <stdlib.h>
#include <unistd.h>
#include "libtciutil.h"
static char *join_path(const char *dir, const char *cmd)
{
    char *full;
    size_t dlen;
    size_t clen;

    dlen = tci_strlen(dir);
    clen = tci_strlen(cmd);
    full = malloc(dlen + 1 + clen + 1); /* +1 for '/', +1 for '\0' */
    if (!full)
        return (NULL);
    tci_strcpy(full, dir);
    full[dlen] = '/';
    tci_strcpy(full + dlen + 1, cmd);
    return (full);
}

char *find_cmd(char *cmd)
{
    char **dirs;
    char *path;
    char *full;
    int i;

    path = getenv("PATH");
    if (!path)
        return (NULL);
    dirs = tciu_split(path, ':');
    if (!dirs)
        return (NULL);
    i = 0;
    while (dirs[i]) {
        full = join_path(dirs[i], cmd);
        if (full && access(full, X_OK) == 0) {
            /* free dirs array before returning */
            return (full);
        }
        free(full);
        i++;
    }
    /* free dirs array */
    return (NULL);
}

access(path, X_OK) returns 0 if the file exists and is executable, so
there is no need to attempt the exec to check. If no directory in PATH
contains the command, find_cmd returns NULL and the caller exits
127.
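For comparison, the same search can be sketched with only the standard C library, walking the list in place instead of splitting it first. This is an illustration under stated assumptions, not the page's implementation: `find_in_list` is a hypothetical name, it takes the directory list as a parameter so it can be tried without touching PATH, and it skips empty entries (which a shell would treat as the current directory).

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for find_cmd: searches a colon-separated
   directory list for an executable named cmd.
   Returns a malloc'd path (caller frees) or NULL. */
char *find_in_list(const char *list, const char *cmd)
{
    const char *start = list;

    while (*start) {
        size_t dlen = strcspn(start, ":"); /* length of this directory */
        size_t need = dlen + 1 + strlen(cmd) + 1; /* '/' and '\0' */
        char *full = malloc(need);

        if (!full)
            return (NULL);
        snprintf(full, need, "%.*s/%s", (int)dlen, start, cmd);
        if (dlen > 0 && access(full, X_OK) == 0)
            return (full);
        free(full);
        start += dlen;
        if (*start == ':')
            start++; /* skip the separator */
    }
    return (NULL);
}
```

Because nothing is split into an array, there is no directory array to free on the early-return path. The only allocation that outlives the call is the result.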
A minimal exec wrapper
Put the PATH search and exec together in exec.c:
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include "libtciutil.h"
static char *join_path(const char *dir, const char *cmd)
{
    char *full;
    size_t dlen;
    size_t clen;

    dlen = tci_strlen(dir);
    clen = tci_strlen(cmd);
    full = malloc(dlen + 1 + clen + 1); /* +1 for '/', +1 for '\0' */
    if (!full)
        return (NULL);
    tci_strcpy(full, dir);
    full[dlen] = '/';
    tci_strcpy(full + dlen + 1, cmd);
    return (full);
}

static char *find_in_path(char *cmd)
{
    char **dirs;
    char *path;
    char *full;
    int i;

    path = getenv("PATH");
    if (!path)
        return (NULL);
    dirs = tciu_split(path, ':');
    if (!dirs)
        return (NULL);
    i = 0;
    full = NULL;
    while (dirs[i] && !full) {
        full = join_path(dirs[i], cmd);
        if (full && access(full, X_OK) != 0) {
            free(full);
            full = NULL;
        }
        i++;
    }
    i = 0; /* restart from 0: the search loop may have stopped mid-array */
    while (dirs[i])
        free(dirs[i++]);
    free(dirs);
    return (full);
}

void exec_cmd(char **argv)
{
    char *path;
    char *envp[] = { NULL }; /* stripped env: programs that read PATH or HOME will fail */

    if (!argv || !argv[0])
        exit(1);
    path = find_in_path(argv[0]);
    if (!path) {
        tci_printf("pipeline: command not found: %s\n", argv[0]);
        exit(127);
    }
    execve(path, argv, envp);
    perror(path); /* execve failed */
    free(path);
    exit(127);
}

exec_cmd never returns: on success the new program takes over, and on
failure it exits. The caller forks first, then calls exec_cmd in the
child. In the parent, execution continues after the fork.
waitpid
waitpid suspends the calling process until the specified child exits.
It fills in a status integer that encodes how the child terminated.
#include <stdio.h>
#include <sys/wait.h>

int status;
pid_t result;

result = waitpid(pid, &status, 0); /* 0: block until child exits */
if (result < 0)
    perror("waitpid");

WIFEXITED(status) is true when the child exited normally. In that
case, WEXITSTATUS(status) extracts the exit code: the same value
as $? in the shell from f01/08, accessed here in C instead of the shell.
if (WIFEXITED(status)) /* false if the child was killed by a signal */
    tci_printf("exit code: %d\n", WEXITSTATUS(status));

Calling waitpid(-1, &status, 0) waits for any child, which is useful when
you have forked several children and want to collect them all.
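Collecting several children usually takes the form of a loop that calls `waitpid(-1, ...)` until no children remain. The sketch below is hypothetical (`spawn_and_reap` is not part of pipeline). Child number i exits with code i, and the parent sums the exit codes as the children finish. When the last child has been collected, `waitpid` fails with `errno` set to `ECHILD`.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: forks n children, child i exits with code i, then
   collects them all. Returns the sum of the exit codes, or -1 on failure. */
int spawn_and_reap(int n)
{
    int i;
    int status;
    int sum = 0;

    for (i = 0; i < n; i++) {
        pid_t pid = fork();

        if (pid < 0)
            return (-1);
        if (pid == 0)
            _exit(i); /* child: exit immediately with its index */
    }
    /* waitpid(-1, ...) returns children in whatever order they finish */
    while (waitpid(-1, &status, 0) > 0) {
        if (WIFEXITED(status))
            sum += WEXITSTATUS(status);
    }
    return (errno == ECHILD ? sum : -1); /* ECHILD: no children left */
}
```

Note that this loop reaps every child of the calling process, including any that were forked elsewhere in the program.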
In g01b/05,
SDL_PollEvent replaced blocking read — it returned immediately
with or without an event. waitpid has the same switch: pass WNOHANG
as the third argument and it returns immediately, yielding 0 if no
child has finished yet. The pattern is the same as the game loop —
non-blocking check, act on what is ready, continue.
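A minimal sketch of that polling pattern follows. `poll_child` and its timings are assumptions chosen for illustration. The child sleeps briefly to stand in for real work, and the parent checks on it with `WNOHANG`, pausing between checks where a real program would do something useful.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical demo: polls a short-lived child without blocking.
   Returns the child's exit code, or -1 on failure. */
int poll_child(void)
{
    struct timespec tick = { 0, 10 * 1000 * 1000 }; /* 10 ms between checks */
    int status;
    pid_t done;
    pid_t pid = fork();

    if (pid < 0)
        return (-1);
    if (pid == 0) {
        struct timespec work = { 0, 50 * 1000 * 1000 }; /* 50 ms */

        nanosleep(&work, NULL); /* child: pretend to work */
        _exit(5);
    }
    /* waitpid returns 0 while the child is still running */
    while ((done = waitpid(pid, &status, WNOHANG)) == 0)
        nanosleep(&tick, NULL); /* parent: other work would go here */
    if (done < 0 || !WIFEXITED(status))
        return (-1);
    return (WEXITSTATUS(status));
}
```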
A single-command test
Update main.c to fork one child and exec the command from argv:
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>
#include "libtciutil.h"

void exec_cmd(char **argv);

int main(int argc, char **argv)
{
    pid_t pid;
    int status;

    if (argc < 2) {
        tci_printf("usage: ./pipeline cmd [args]\n");
        return (1);
    }
    pid = fork();
    if (pid < 0) {
        perror("fork");
        return (1);
    }
    if (pid == 0)
        exec_cmd(argv + 1); /* child: exec argv[1] with remaining args */
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return (1);
    }
    if (WIFEXITED(status))
        return (WEXITSTATUS(status));
    return (1); /* child did not exit normally, e.g. killed by a signal */
}

Build and test:
make re
./pipeline ls -l
./pipeline echo hello world
./pipeline nonexistent_command
echo $?

ls -l lists the directory. echo hello world prints the arguments.
nonexistent_command prints a "command not found" message, and echo $?
then shows 127.
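main.c returns 1 whenever the child did not exit normally. Shells report a child killed by a signal as 128 plus the signal number. The following sketch shows one way to reproduce that convention. Both `status_to_code` and `run_child` are hypothetical helpers added for illustration, not part of pipeline.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: maps a waitpid status to a shell-style exit code. */
int status_to_code(int status)
{
    if (WIFEXITED(status))
        return (WEXITSTATUS(status));
    if (WIFSIGNALED(status))
        return (128 + WTERMSIG(status)); /* shell convention for signals */
    return (1);
}

/* Hypothetical demo: the child sends itself sig, or exits with code 3
   when sig is 0. Returns the shell-style code, or -1 on failure. */
int run_child(int sig)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return (-1);
    if (pid == 0) {
        if (sig)
            kill(getpid(), sig); /* terminate by signal */
        _exit(3);                /* normal exit path */
    }
    if (waitpid(pid, &status, 0) < 0)
        return (-1);
    return (status_to_code(status));
}
```

`run_child(0)` yields 3, and `run_child(SIGKILL)` yields `128 + SIGKILL`.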
The binary forks, execs, and waits. That is the complete process lifecycle. The next page connects two of these with a pipe.