thecodingidiot.com

Pointers

Every function in libtci takes at least one pointer. tci_strlen takes a const char *. tci_memset takes a void *. tci_strdup returns a char *. Before writing any of them, you need to understand what a pointer actually is — not the syntax, but the concept.

The tci_* names that appear throughout this page are functions you will write in the pages that follow. They are named here to give the pointer concepts a concrete destination — not because you need to implement them now.

In f05 you used pointers several times. char **argv was described as "array of strings, read it as that for now." FILE *in was something you passed to fopen and fgets without thinking about it. &lines was passed to lines_init and something happened. This page is where "for now" ends.

Variables live in memory

When you declare a variable:

int count;
count = 7;

the compiler reserves space in memory (in this case, 4 bytes for an int on most platforms) and associates the name count with that location. When you write count = 7, the value 7 is stored in those 4 bytes. When you read count in an expression, the compiled program fetches the value from those 4 bytes.

The location in memory has an address: a number that identifies exactly which bytes are being used. Think of it as a house number: the address tells you where the variable lives, not what its value is. The hex values in the Valgrind output from f05/04 (0x484F428, 0x109289, 0x4a8b045) are addresses. That is what they look like.

The & operator

The & operator gives you the address of a variable:

int  count;
int  *p;
 
count = 7;
p = &count;

&count is "the address of count." The type of &count is int * — a pointer to an int. In the declaration, int *p means "p is a variable that holds a pointer to an int." Writing p = &count stores the address of count inside p.

Two things to notice about the * character. In the declaration int *p, the * is part of the type — it means "pointer to". In an expression, *p means something different: it follows the pointer and gives you the value stored there.

Reading and writing through a pointer

printf appears in these examples because tci_printf — the version you will build in c02 — does not exist yet. From c02 onward, use tci_printf.

When p holds an address, *p is "the thing at that address":

int  count;
int  *p;
 
count = 7;
p = &count;
printf("%d\n", *p);    /* prints 7 */
*p = 10;
printf("%d\n", count); /* prints 10 */

*p in an expression is the dereference operator. It follows the pointer and gives you the value stored at the address. Writing *p = 10 follows the pointer and writes 10 there — which changes count, because p holds count's address.

The * has two meanings: in a declaration it builds the pointer type; in an expression it follows a pointer. The context always makes it clear which one is in use.

Why pointers exist

The most common reason to use a pointer is to let a function modify a variable in its caller. Without pointers, every function receives a copy of its arguments — the scope rules from f04/04 mean each function owns its parameters. Modifying that copy does not affect the original:

void add_one(int n)
{
    n = n + 1; /* modifies the copy, not the caller's variable */
}
 
int main(void)
{
    int x;
 
    x = 5;
    add_one(x);
    printf("%d\n", x); /* still 5 */
    return (0);
}

With a pointer, the function receives an address and can write directly to the memory it points at — wherever that memory lives:

void add_one(int *n)
{
    *n = *n + 1; /* modifies whatever n points at */
}
 
int main(void)
{
    int x;
 
    x = 5;
    add_one(&x);        /* pass the address of x */
    printf("%d\n", x);  /* 6 */
    return (0);
}

add_one now takes int *n — a pointer to an int. The caller passes &x — the address of x. Inside the function, *n is x, so *n = *n + 1 increments x directly. This pattern appears constantly in C. scanf uses it. The lines_init in f05/06 used it.

NULL

NULL is the zero address. It means "this pointer points at nothing." Every pointer variable that does not yet point at anything valid should be set to NULL:

int *p;
 
p = NULL;

Dereferencing NULL (writing *p when p is NULL) is undefined behavior in C; on typical desktop operating systems it crashes the program immediately with a segmentation fault. That crash is a feature, not a bug: it tells you exactly where the bad access happened, which is easier to debug than reading garbage or writing to random memory. In libtci, functions that allocate memory return NULL on failure, and callers check for it before dereferencing.

Pointer arithmetic

Adding an integer to a pointer moves it forward by that many elements, where the element size is determined by the pointer's type:

int  arr[3];
int  *p;
 
arr[0] = 10;
arr[1] = 20;
arr[2] = 30;
 
p = arr;                   /* p points at arr[0] */
printf("%d\n", *p);        /* 10 */
printf("%d\n", *(p + 1));  /* 20 — one int forward */
printf("%d\n", *(p + 2));  /* 30 — two ints forward */

p + 1 does not add 1 to the address; it adds sizeof(int) bytes, typically 4. The compiler knows the type of p and scales the arithmetic accordingly. A char * advanced by 1 moves one byte. An int * advanced by 1 moves sizeof(int) bytes, four on most platforms. This is why the type of a pointer matters: the same arithmetic means different things for different types.

Arrays and pointers

An array is a block of contiguous memory divided into fixed-size slots. You have seen them already — char buf[4096] in the sort program, rndtable[256] in the guessing game — but without examining the layout. Each slot holds one element of the declared type, and the slots sit immediately next to each other in memory, one after the other. That layout is what connects arrays to pointers.

An array name, used in most expressions, is converted to a pointer to its first element (the exceptions are sizeof arr and &arr):

int  arr[3];
 
arr[0] = 10;

arr by itself is equivalent to &arr[0] — the address of the first element. The subscript syntax arr[i] is defined as *(arr + i): advance the pointer by i elements and dereference. These two lines do identical things:

printf("%d\n", arr[1]);     /* subscript notation */
printf("%d\n", *(arr + 1)); /* pointer notation */

You will use subscript notation throughout libtci. It reads more clearly. But understanding that s[i] is *(s + i) is important when you reach functions like tci_memcpy that walk two pointers forward in parallel.

char * as a string

In C, a string is a sequence of char values ending with \0 (the null terminator, the byte with value zero). A char * pointing at the first character, together with the convention that \0 marks the end, is a string:

char  *greeting;
 
greeting = "hello"; /* points at 'h'; memory is h e l l o \0 */

The declaration says that greeting is a pointer to a single char: the first one. The null terminator is what tells functions like tci_strlen where the string ends. Without it, there would be no way to know: the pointer alone only gives you the start.

const with pointers

Two of the most common pointer declarations in libtci use const:

const char *s
char * const p

const char *s means "a pointer to constant char" — you can change where s points, but you cannot change the characters it points at through s. This is what tci_strlen takes: it promises not to modify the string.

char * const p means "a constant pointer to char" — you cannot change where p points, but you can change what it points at. The distinction is about what is protected: const char *s protects the data (the pointer can move, but the characters cannot be written through it), while char * const p protects the pointer itself (it cannot be reassigned, but the characters it addresses can be written). A fixed base into a buffer is the natural use:

char          buf[64];
char * const  base = buf;
 
base[0] = 'h'; /* allowed — modifying what base points at */
base = NULL;   /* compile error — base cannot be reassigned */

In a function parameter this is almost never useful. The pointer is already a local copy, so restricting reassignment only constrains the function's own code. tci_memset and tci_memcpy walk their working pointer through the buffer, which requires reassignment — so they take plain void *.

When you see const char * in a function parameter, read it as "I will not modify this string." It is a promise from the function to its caller.

Double pointers

A pointer is just another variable. It has a type, it lives somewhere in memory, and it has an address. If you need a function to modify a pointer variable in its caller, you apply exactly the same reasoning as before: pass a pointer to it.

A pointer to a pointer is written **. The pattern is the same as add_one above — just one level deeper:

void set_string(char **s)
{
    *s = "hello"; /* modifies the char * variable in the caller */
}
 
int main(void)
{
    char  *str;
 
    str = NULL;
    set_string(&str);
    printf("%s\n", str); /* hello */
    return (0);
}

str is a char *. &str is its address — type char **. Inside set_string, *s is the char * variable back in main. Writing *s = "hello" changes what str holds.

The same pattern applies to any pointer type. You will encounter it again with other pointer types in later chapters.

char **argv in main is the same idea from a different angle. Each command-line argument is a string — a char *. The full list of arguments is an array of those strings. argv is a pointer to the first element of that array, so its type is char **: pointer to char *. argv[0] is the first char *, argv[1] is the second, and so on.

Triple pointers (***) exist but are rare. In libtci and the c-tier, you will only encounter **.

What this means for libtci

Every tci_* function that works on strings takes const char * for anything it reads and char * for anything it writes. Every function that needs to modify a pointer variable in its caller takes char **. Every size and count is size_t. Every function that can fail returns NULL on failure.

You now have the foundation. The next page builds the first three functions of the library.