./sort test.txt runs and prints sorted output. Looks correct.
Looks is the operative word — the program is leaking memory on
every run, and you have no way to see it without help.
This is what valgrind is for.
Running under valgrind
valgrind runs your program inside a virtual machine of its own
making. It watches every read, every write, every malloc, and
every free. It reports bad accesses as they happen and, when the program exits, lists any memory that was never freed.
It is slower than running the program directly — sometimes 10× to
50× — but for memory bugs there is nothing better.
Try the program under it now, with the leak-checking option turned all the way up:
valgrind --leak-check=full ./sort test.txt
The sorted output prints as usual. Mixed into it, and after it,
valgrind adds reports of its own:
==12345== Invalid write of size 1
==12345== at 0x484F428: strcpy (vg_replace_strmem.c:566)
==12345== by 0x109289: main (sort.c:46)
==12345== Address 0x4a8b045 is 0 bytes after a block of size 6 alloc'd
==12345== at 0x484880F: malloc (vg_replace_malloc.c:442)
==12345== by 0x10930E: main (sort.c:45)
==12345==
==12345== HEAP SUMMARY:
==12345== in use at exit: 17 bytes in 3 blocks
==12345== total heap usage: 4 allocs, 1 frees, 145 bytes allocated
==12345==
==12345== 17 bytes in 3 blocks are definitely lost in loss record 1 of 1
==12345== at 0x484880F: malloc (vg_replace_malloc.c:442)
==12345== by 0x10930E: main (sort.c:45)
==12345==
==12345== LEAK SUMMARY:
==12345== definitely lost: 17 bytes in 3 blocks
==12345== indirectly lost: 0 bytes in 0 blocks
==12345== possibly lost: 0 bytes in 0 blocks
==12345== still reachable: 0 bytes in 0 blocks
==12345== suppressed: 0 bytes in 0 blocks
The ==12345== prefix on each line is the process ID, which valgrind
uses to mark its own output so you can tell it from the program's.
Strip those off mentally and read the rest.
Two distinct complaints. Invalid write of size 1 — the program
wrote one byte to memory it did not own. And definitely lost: 17
bytes — the program asked malloc for some memory, then lost
track of where it was before ever calling free on it.
We will fix the leak on this page because it is the simpler one to read off the report. The invalid write is the third planted bug; the next page introduces a tool that makes it impossible to ignore.
Reading the leak report
The most important line in the leak section is definitely lost:
17 bytes in 3 blocks. To understand what that means, picture how
malloc works: you ask it for some bytes, and it hands you back
the address of those bytes — a pointer.
From that moment on, the only way you can ever free that memory
is to give that exact address to free. If you overwrite the
variable holding the address, or return from the function without
saving it, or in any other way lose track of where those bytes are,
you have leaked them.
The memory is still allocated; the operating system still has
it reserved for the program. But the program has no pointer left to
hand to free. Nothing reclaims that memory until the process
itself exits, at which point the OS releases everything at once.
With three lines of input we leaked three blocks — one per line.
Above the leak summary, valgrind tells you exactly where the
leaked memory came from:
==12345== 17 bytes in 3 blocks are definitely lost in loss record 1 of 1
==12345== at 0x484880F: malloc (vg_replace_malloc.c:442)
==12345== by 0x10930E: main (sort.c:45)
This is a stack trace, same shape as gdb's backtrace. The at
line is where the allocation happened (inside malloc); the by
line is who called malloc. sort.c:45 is the line in our code:
copy = malloc(strlen(buf));
That statement allocates a fresh block for each line of input. We never free those blocks.
The bug
Look at the cleanup code:
free(lines);
free(lines) releases only the array: the block of char * pointers.
But each of those pointers points at another block, allocated
with malloc(strlen(buf)). Freeing the outer array does not free
the inner blocks. They are still allocated, but we have just thrown
away the only pointers we had to them.
The fix is to free each entry first, then the array:
for (i = 0; i < count; i++)
    free(lines[i]);
free(lines);
Recompile and rerun under valgrind:
gcc -Wall -Wextra -g sort.c -o sort
valgrind --leak-check=full ./sort test.txt
The heap summary should now read:
==12345== HEAP SUMMARY:
==12345== in use at exit: 0 bytes in 0 blocks
==12345== total heap usage: 4 allocs, 4 frees, 145 bytes allocated
==12345==
==12345== All heap blocks were freed -- no leaks are possible
That is the line you want to see at the end of every valgrind
run: all heap blocks were freed. If you do not see it, the
program allocated something it never released.
The invalid-write report is still in the output. We are not done.
The leak rule
Every malloc needs a matching free. If malloc happens inside
a loop — which it does for sort — free has to happen inside its
own loop on the way out. C gives you direct control over memory:
every allocation is yours from the moment malloc returns until
the moment you call free on it. You allocate it, you own it, you
free it.
Two bugs down. The third one is the invalid write valgrind is
still reporting. We could fix it from valgrind's output alone,
and many programmers do. But the next page introduces a different
tool — AddressSanitizer — that ships the same diagnostics in a
form that crashes the program the instant the bad write happens,
with a stack trace and a labelled diagram of which allocation got
written past. It is the modern equivalent of valgrind's
memory checking, and the next page meets it on its own ground.