A C string is a sequence of char values with a zero byte at the
end. That zero byte — \0, the null terminator — is the entire
mechanism. There is no length stored anywhere. There is no struct
wrapping the characters. There is just the bytes, and the agreement
that the first zero marks the end.
This design has one important consequence: any function that needs to
know a string's length must walk the entire string to find it.
tci_strlen does not look up a stored number. It counts. Every time.
The implication runs through every string function you will write:
if you allocate space for a string, you must allocate length + 1
bytes to hold the null terminator. Forgetting the + 1 is one of the
most common string bugs in C.
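As a small illustration of the length + 1 rule, here is a sketch that
duplicates a string on the heap. It uses the standard strlen and
memcpy rather than the tci_ versions so it compiles on its own, and
the helper name dup_string is hypothetical, not part of libtci.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of libtci): returns a heap copy of s,
 * or NULL if the allocation fails. The caller must free() the result. */
char *dup_string(const char *s)
{
    size_t len;
    char *copy;

    len = strlen(s);            /* characters only, '\0' not counted */
    copy = malloc(len + 1);     /* + 1 makes room for the terminator */
    if (copy == NULL)
        return (NULL);
    memcpy(copy, s, len + 1);   /* copies the characters and the '\0' */
    return (copy);
}
```

With malloc(len) instead of malloc(len + 1), the memcpy would write
one byte past the end of the allocation.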
The null terminator
When you write a string literal in C:

char *s;
s = "hello";

the compiler places six bytes in memory: h, e, l, l, o,
\0. The \0 at the end is not something you typed; the compiler
adds it. The pointer s holds the address of the h. Functions that
work with s read forward until they hit the \0.
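One way to see the extra byte is to compare sizeof, which reports the
bytes the compiler stored for the literal, with the standard strlen,
which stops counting at the \0. The two helper names below are
hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: bytes the compiler stores for the literal "hello". */
size_t literal_bytes(void)
{
    return (sizeof("hello"));   /* 6: five letters plus the '\0' */
}

/* Hypothetical helper: characters counted before the '\0'. */
size_t literal_chars(void)
{
    return (strlen("hello"));   /* 5: the terminator is not counted */
}
```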
You can also declare a string as an array, and the \0 is still
required:

char name[6];

name[0] = 'D';
name[1] = 'o';
name[2] = 'o';
name[3] = 'm';
name[4] = '\0'; /* without this it is not a valid C string */

\0 is how you write a literal null byte in C source code. The
backslash-zero is the character with value zero, the same zero that
terminates strings.
tci_strlen
tci_strlen returns the number of characters in a string, not
counting the null terminator — "hello" gives 5, "" gives 0.
Because C stores no length alongside the bytes, the only way to find
it is to walk forward until the \0:
Create tci_strlen.c:

#include "libtci.h"

size_t tci_strlen(const char *s)
{
    size_t len;

    len = 0;
    while (s[len])      /* s[len] is falsy when it reaches '\0' */
        len++;
    return (len);       /* count does not include the '\0' itself */
}

s[len] is the character at index len. When it is zero, that is, when
the null terminator is reached, the condition is false and the loop
stops. The returned len does not include the \0.
Run man 3 strlen: the return type is size_t, not int. On most
platforms int is 32 bits, so INT_MAX, the largest value it can hold,
is 2,147,483,647. A string longer than that, which is possible on a
64-bit platform, would overflow an int counter. Signed overflow is
undefined behaviour in C; in practice the count often wraps to a
negative number. size_t cannot be negative and is wide enough to hold
the size of any object, so it can count any valid string.
Add tci_strlen.c to SRCS, and the declaration to libtci.h:

size_t tci_strlen(const char *s);

tci_strcpy
tci_strcpy copies a string from src into dst, including the null
terminator. In C, assigning one string pointer to another with =
copies only the pointer — both variables end up pointing at the same
bytes. tci_strcpy copies the characters themselves, giving dst its
own independent copy.
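The difference between copying a pointer and copying characters can
be checked directly. The sketch below uses the standard strcpy in
place of tci_strcpy so it stands alone; both function names are
hypothetical.

```c
#include <string.h>

/* Returns 1 if a change to the original is visible through alias,
 * which shows that = copied only the address. */
int alias_shares_bytes(void)
{
    char original[6] = "hello";
    char *alias;

    alias = original;           /* copies the pointer, not the characters */
    original[0] = 'j';
    return (alias[0] == 'j');
}

/* Returns 1 if a strcpy'd copy is unaffected by a later change
 * to the original. */
int copy_is_independent(void)
{
    char original[6] = "hello";
    char copy[6];               /* 5 characters + '\0' */

    strcpy(copy, original);     /* copies the bytes, terminator included */
    original[0] = 'j';
    return (copy[0] == 'h');
}
```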
Create tci_strcpy.c:

#include "libtci.h"

char *tci_strcpy(char *dst, const char *src)
{
    size_t i;

    i = 0;
    while (src[i]) {    /* stop when src reaches '\0', before copying it */
        dst[i] = src[i];
        i++;
    }
    dst[i] = '\0';      /* loop exits without copying the terminator; write it manually */
    return (dst);       /* return destination for call chaining */
}

The loop copies every character until it hits \0, then writes
\0 explicitly after the loop ends. The \0 write is separate
because the loop condition while (src[i]) exits before copying
the terminator. Without the line dst[i] = '\0', the destination
would not be a valid C string.
The caller is responsible for ensuring dst has enough space. strcpy
does not check — it writes however many bytes the source contains.
Passing a destination too small to hold the source corrupts memory.
Run man 3 strcpy — the manual is explicit that the caller is
responsible for ensuring dst is large enough; there is no bounds
check inside the function.
Add tci_strcpy.c to SRCS, and the declaration to libtci.h:

char *tci_strcpy(char *dst, const char *src);

tci_strncpy
tci_strncpy copies at most n bytes from src to dst. If src
is shorter than n, the remainder of dst is padded with null
bytes. If src has n or more characters, the destination is not
null-terminated, a subtle trap that catches many programmers.
Create tci_strncpy.c:

#include "libtci.h"

char *tci_strncpy(char *dst, const char *src, size_t n)
{
    size_t i;

    i = 0;
    while (i < n && src[i]) {   /* stop at budget or end of src, whichever comes first */
        dst[i] = src[i];
        i++;
    }
    while (i < n)               /* src shorter than n: pad remaining bytes with '\0' */
        dst[i++] = '\0';        /* if src had n or more characters, this loop runs zero times: no terminator */
    return (dst);
}

The first loop copies characters until it runs out of source or runs
out of budget. The second loop pads with zeros to reach exactly n
bytes. If the source has n or more characters, the second loop runs
zero times and no terminator is written.
Run man 3 strncpy — the DESCRIPTION names this case explicitly:
it is specified behaviour, not a bug.
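A common way to live with this behaviour is to copy at most size - 1
bytes and then write the terminator yourself. The sketch below uses
the standard strncpy so it compiles on its own; the wrapper name
copy_terminated is hypothetical and not part of libtci.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: copies src into a buffer of size bytes and
 * guarantees a terminator, truncating src if it does not fit. */
char *copy_terminated(char *dst, const char *src, size_t size)
{
    if (size == 0)
        return (dst);               /* no room even for the '\0' */
    strncpy(dst, src, size - 1);    /* may stop without writing a '\0' */
    dst[size - 1] = '\0';           /* so write one unconditionally */
    return (dst);
}
```

Copying "hello" into a 4-byte buffer this way leaves "hel" followed
by the terminator, instead of four unterminated characters.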
Add tci_strncpy.c and its declaration:

char *tci_strncpy(char *dst, const char *src, size_t n);

BSD functions: tci_strlcpy and tci_strlcat
strcpy has no length limit: it writes however many bytes the source
contains without checking the destination size. strncpy adds a
limit, but its truncation behaviour is a trap: if the source has n or
more characters, the destination is left without a null terminator.
strlcpy and strlcat are BSD replacements that solve both problems.
They null-terminate the result whenever there is room for at least
one byte, always take the destination's total buffer size, and return
the length of the string they tried to create rather than the number
of bytes written. The return value enables truncation detection: if
it is greater than or equal to the buffer size, the result was
truncated:
char buf[8];
size_t needed;

needed = tci_strlcpy(buf, "hello world", sizeof(buf));
if (needed >= sizeof(buf)) {
    /* truncation: source was 11 bytes, buffer holds 7 + '\0' */
}

Both functions originate in BSD libc. GNU libc did not provide them
before version 2.38, so many Linux systems lack them; see the note in
Setup. libtci provides them so c01–c05 projects can use them on Linux.
tci_strlcpy copies src into dst, writing at most size - 1
characters and always appending \0. It returns the length of src:
Create tci_strlcpy.c:

#include "libtci.h"

size_t tci_strlcpy(char *dst, const char *src, size_t size)
{
    size_t src_len;

    src_len = tci_strlen(src);  /* measure once; needed for the branch and the return */
    if (size == 0)              /* no buffer at all: nothing to write */
        return (src_len);       /* still report what was needed for truncation detection */
    if (src_len < size)         /* source fits: copy characters and terminator together */
        tci_memcpy(dst, src, src_len + 1);  /* +1 includes the '\0' */
    else {                      /* source longer than buffer: truncate */
        tci_memcpy(dst, src, size - 1);     /* size-1 reserves the last byte for '\0' */
        dst[size - 1] = '\0';   /* always terminate, even when truncated */
    }
    return (src_len);           /* source length, not bytes written: caller detects truncation */
}

When src_len < size, the entire source fits: copy characters and
terminator in one operation. When the source is longer, copy size - 1
characters and write the terminator manually. Either way, as long as
size is not zero, dst ends up a valid C string.
Run man 3 strlcpy. On Linux this may require man-db and the BSD
manual pages package, because GNU libc only gained the function in
version 2.38.
Add tci_strlcpy.c and its declaration:

size_t tci_strlcpy(char *dst, const char *src, size_t size);

tci_strlcat appends src to the end of dst, writing into at most
size total bytes (including the existing content of dst). It
normally returns dst_len + src_len, the total length the result would
have had without truncation. The same comparison detects truncation:
if the return value is greater than or equal to size, the append was
cut short.
Create tci_strlcat.c:

#include "libtci.h"

size_t tci_strlcat(char *dst, const char *src, size_t size)
{
    size_t dst_len;
    size_t src_len;

    dst_len = tci_strlen(dst);  /* find where dst ends; append starts here */
    src_len = tci_strlen(src);  /* measure source; needed for the return value */
    if (size <= dst_len)        /* buffer already full or smaller than dst: nothing to append */
        return (size + src_len);    /* BSD convention: dst counts as size bytes long here */
    tci_strlcpy(dst + dst_len, src, size - dst_len);    /* dst+dst_len points at the '\0'; size-dst_len is remaining space */
    return (dst_len + src_len); /* combined length without truncation: caller detects if append was cut short */
}

dst + dst_len is a pointer to the null terminator of dst, the
point where appending begins. size - dst_len is the remaining space
in the buffer. If size <= dst_len, the destination is already full
or longer than the budget: nothing is written, and the function
returns size + src_len, matching BSD strlcat, which treats dst as
size bytes long when it finds no terminator within size. That value
is still greater than or equal to size, so the caller's truncation
check works unchanged. Note that this version assumes dst is
null-terminated, because tci_strlen reads until it finds the \0.
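To show the two return values working together, here is a sketch
that joins a directory and a file name into a fixed buffer and
reports truncation. So that it compiles on its own, it carries
stand-ins named sketch_strlcpy and sketch_strlcat that are assumed to
follow the same contract as the tci_ functions; those names and
join_path are hypothetical, not part of libtci.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in assumed to behave like tci_strlcpy. */
static size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t src_len;

    src_len = strlen(src);
    if (size == 0)
        return (src_len);
    if (src_len < size)
        memcpy(dst, src, src_len + 1);
    else {
        memcpy(dst, src, size - 1);
        dst[size - 1] = '\0';
    }
    return (src_len);
}

/* Stand-in assumed to behave like tci_strlcat; dst must be terminated. */
static size_t sketch_strlcat(char *dst, const char *src, size_t size)
{
    size_t dst_len;

    dst_len = strlen(dst);
    if (size <= dst_len)
        return (size + strlen(src));
    return (dst_len + sketch_strlcpy(dst + dst_len, src, size - dst_len));
}

/* Hypothetical helper: builds "dir/file" in out. Returns 0 on success,
 * -1 if any step reported a length that does not fit in size bytes. */
int join_path(char *out, size_t size, const char *dir, const char *file)
{
    if (sketch_strlcpy(out, dir, size) >= size)
        return (-1);
    if (sketch_strlcat(out, "/", size) >= size)
        return (-1);
    if (sketch_strlcat(out, file, size) >= size)
        return (-1);
    return (0);
}
```

Every call is checked with the same comparison, return value >= size,
which is the main convenience of this pair of functions.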
Add tci_strlcat.c and its declaration:

size_t tci_strlcat(char *dst, const char *src, size_t size);

tci_strcmp and tci_strncmp
In C, comparing two strings with == compares the pointers. It
answers "are these the same memory location?", not "do these contain
the same characters?" tci_strcmp compares the characters. It walks
both strings in parallel, returning the difference between the first
characters that differ. If the strings are identical, it returns zero.
Create tci_strcmp.c:

#include "libtci.h"

int tci_strcmp(const char *s1, const char *s2)
{
    size_t i;

    i = 0;
    while (s1[i] && s1[i] == s2[i]) /* advance while s1 has chars and both match */
        i++;
    return ((unsigned char)s1[i] - (unsigned char)s2[i]);   /* cast avoids sign-extension on bytes above 127 */
}

The loop condition has two parts. s1[i] is truthy as long as the
character is not the null terminator, so it stops the loop when s1
ends. s1[i] == s2[i] stops the loop the moment the characters
differ. Both must be true to keep advancing: not at the end of s1,
and the characters still match. If s2 is shorter, its null
terminator will not match the character in s1, and the loop exits.
After the loop, i sits at the first position where the strings
differ, or at the null terminator if they were identical throughout.
Subtracting s2[i] from s1[i] gives a positive number if s1's
character sorts later, negative if it sorts earlier, and zero if both
are the null terminator, meaning the strings are equal.
The (unsigned char) cast is necessary because plain char can be
signed. A character above 127 — an accented letter, for example —
would be treated as a negative value before the subtraction, producing
a wrong result. Casting both sides to unsigned char first ensures
a consistent ordering regardless of whether char is signed on the
platform.
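The effect of the cast can be seen on a single byte. In the sketch
below, byte 0xE9 (an accented e in Latin-1, used here only as an
example) is compared with 'a' twice: once through signed char, where
it reads as -23 on typical 8-bit two's-complement platforms, and once
through unsigned char, where it reads as 233. Both function names are
hypothetical.

```c
/* Difference as a signed-char platform would compute it without the
 * cast: byte 0xE9 arrives here as -23. */
int diff_signed(signed char a, signed char b)
{
    return (a - b);
}

/* Difference with both bytes viewed as unsigned char, which is what
 * the cast in tci_strcmp arranges: byte 0xE9 arrives as 233. */
int diff_unsigned(unsigned char a, unsigned char b)
{
    return (a - b);
}
```

The signed version reports that the byte sorts before 'a' (a negative
result), while the unsigned version reports that it sorts after 'a',
which is the ordering strcmp is specified to produce.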
tci_strncmp adds an upper limit on how many bytes to compare:
Create tci_strncmp.c:

#include "libtci.h"

int tci_strncmp(const char *s1, const char *s2, size_t n)
{
    size_t i;

    if (n == 0)
        return (0);     /* zero bytes to compare: always equal */
    i = 0;
    while (i < n - 1 && s1[i] && s1[i] == s2[i])    /* n-1: reserve last step for final comparison */
        i++;
    return ((unsigned char)s1[i] - (unsigned char)s2[i]);   /* cast avoids sign-extension on bytes above 127 */
}

Run man 3 strcmp: the return value is described as the sign of
the difference between the first differing bytes treated as
unsigned char; that is exactly what the cast in our
implementation enforces.
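A typical use of the bounded comparison is a prefix test: compare
only as many bytes as the prefix has. The sketch below uses the
standard strncmp and strlen so it stands alone; the helper name
has_prefix is hypothetical.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if s begins with prefix, 0 otherwise. */
int has_prefix(const char *s, const char *prefix)
{
    /* Limiting the comparison to strlen(prefix) bytes ignores
     * whatever follows the prefix in s. */
    return (strncmp(s, prefix, strlen(prefix)) == 0);
}
```

An empty prefix gives n == 0, which is why the n == 0 case must
report equality.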
Add both to SRCS and declare them:

int tci_strcmp(const char *s1, const char *s2);
int tci_strncmp(const char *s1, const char *s2, size_t n);

tci_strchr and tci_strrchr
tci_strchr finds the first occurrence of a character in a string.
tci_strrchr finds the last. Both return a pointer to the found
character, or NULL if the character is not present.
The target character is passed as int — the same libc convention
used by tci_memset. Both functions search for the low byte of the
value.
Note that both functions must also search for \0. If c is zero,
tci_strchr should return a pointer to the null terminator, not
NULL. The loop must include the terminator in the search.
Create tci_strchr.c:

#include "libtci.h"

char *tci_strchr(const char *s, int c)
{
    unsigned char target;

    target = (unsigned char)c;  /* narrow to one byte before comparing */
    while (*s) {
        if ((unsigned char)*s == target)
            return ((char *)s); /* cast strips const: libc convention */
        s++;
    }
    if ((unsigned char)*s == target)    /* check '\0' itself: c==0 must return the terminator, not NULL */
        return ((char *)s);
    return (NULL);
}

while (*s) dereferences the pointer to read the current character
and uses it directly as the condition. A character with value zero,
the null terminator, is falsy, so the loop stops there. It is
equivalent to writing while (*s != '\0'). The loop checks each
character, advancing s with s++ rather than an index counter.
Both styles work; pointer increment is common in functions that walk
through a string without needing to return an index. After the loop,
one more check handles the case where c is the null terminator
itself.
The cast (char *)s strips const to match the return type. The
libc function does the same: the parameter is const char * but the
return is char *, because the caller who knows the underlying string
is not const can use the result to modify it.
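Two short uses of the first-occurrence search, sketched with the
standard strchr so the example stands alone. The helper names
value_part and length_via_strchr are hypothetical; the second one
relies on the rule that searching for \0 returns the terminator.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the text after the first '=' in s,
 * or NULL if s contains no '='. */
const char *value_part(const char *s)
{
    const char *eq;

    eq = strchr(s, '=');
    if (eq == NULL)
        return (NULL);
    return (eq + 1);            /* first character after the '=' */
}

/* Hypothetical helper: searching for '\0' finds the terminator, so
 * its distance from the start equals the string's length. */
size_t length_via_strchr(const char *s)
{
    return ((size_t)(strchr(s, '\0') - s));
}
```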
Create tci_strrchr.c. The logic is the same, but it walks the entire
string first and keeps track of the last match:

#include "libtci.h"

char *tci_strrchr(const char *s, int c)
{
    unsigned char target;
    const char *last;

    target = (unsigned char)c;  /* narrow to one byte before comparing */
    last = NULL;                /* no match found yet */
    while (*s) {
        if ((unsigned char)*s == target)
            last = s;           /* record position; keep walking to find a later match */
        s++;
    }
    if ((unsigned char)*s == target)    /* check '\0' itself in case c == 0 */
        last = s;
    return ((char *)last);      /* NULL if never matched; cast strips const */
}

Run man 3 strchr: the manual confirms that searching for \0
must return a pointer to the null terminator; the post-loop check
in both functions implements this requirement.
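A familiar use of the last-occurrence search is picking the file name
out of a path. The sketch below uses the standard strrchr so it
stands alone; the helper name file_part is hypothetical.

```c
#include <string.h>

/* Hypothetical helper: returns the part of path after the last '/',
 * or the whole path if it contains no '/'. */
const char *file_part(const char *path)
{
    const char *slash;

    slash = strrchr(path, '/');
    if (slash == NULL)
        return (path);
    return (slash + 1);         /* first character after the last '/' */
}
```

Using the first-occurrence search here would return everything after
the first slash instead, which is why the last match matters.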
Add both files to SRCS and their declarations to the header.
tci_strnstr
tci_strnstr searches for the string needle within haystack, but
only within the first len bytes — it will not read beyond that limit
or past a null terminator. The standard strstr searches the entire
string with no upper bound; tci_strnstr adds a length constraint. It
originates in BSD libc and is not in GNU libc on Linux — see the note
in Setup. libtci includes it so c01–c05 projects can use it on
Linux.
The use case is substring search within a bounded region: scanning a
fixed-size buffer without risking a read past its end. Returns a
pointer to the first match, or NULL if not found.
Create tci_strnstr.c:

#include "libtci.h"

char *tci_strnstr(const char *haystack, const char *needle,
    size_t len)
{
    size_t nlen;
    size_t i;

    if (!*needle)       /* empty needle matches at the start */
        return ((char *)haystack);
    nlen = tci_strlen(needle);
    if (nlen > len)     /* needle longer than the window: impossible match */
        return (NULL);
    i = 0;
    while (i <= len - nlen && haystack[i]) {    /* stop when remaining window is too short or string ends */
        if (tci_strncmp(haystack + i, needle, nlen) == 0)
            return ((char *)haystack + i);      /* cast strips const: libc convention */
        i++;
    }
    return (NULL);
}

Add tci_strnstr.c and its declaration:

char *tci_strnstr(const char *haystack, const char *needle,
    size_t len);

Build and test
Update SRCS to include all ten new files. Run make and confirm
no warnings. The library now has fifteen functions. The next page adds
the character classification group — ten short functions that teach
an important lesson about how C represents characters. tci_atoi
follows at the end of that page: it depends on tci_isspace and
tci_isdigit, so it lives where those functions are declared.