The C standard library is a set of functions and types available on
every conforming C implementation. You have already used one of them: printf from <stdio.h>.
sizeof is a built-in C operator, not part of the library.
This page covers the parts you need to build words.c.
Reading a line: fgets
scanf with %s stops at whitespace, which makes it unsuitable for
reading a whole sentence. fgets reads up to a full line:

    #include <stdio.h>

    int main(void)
    {
        char line[256];

        /* fgets returns NULL on end of input or a read error. */
        if (fgets(line, sizeof(line), stdin) == NULL)
            return (1);
        printf("You entered: %s", line);
        return (0);
    }

fgets(buffer, size, stream) reads at most size - 1 characters from
stream, stores them in buffer, and appends a null terminator \0.
It includes the trailing newline in the buffer if it fits. stdin is
the standard input stream — the same source scanf reads from.
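
Because fgets keeps the newline, programs often remove it before using the line. The sketch below shows one common way to do that with strcspn from <string.h>. The helper name strip_newline is made up for this illustration and is not part of the standard library.

```c
#include <string.h>

/* Hypothetical helper: cut the string at the first '\n', if there is one. */
static void strip_newline(char *s)
{
    /* strcspn returns the index of the first '\n' in s,
       or strlen(s) when s contains no newline at all.
       Either way, writing '\0' at that index is safe. */
    s[strcspn(s, "\n")] = '\0';
}
```

Calling strip_newline(line) right after a successful fgets leaves the text without its trailing newline. A line that was cut short because it did not fit in the buffer has no newline and is left unchanged.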
char line[256] declares an array of 256 characters. An array is a
fixed-size block of contiguous memory. line[0] is the first
character, line[255] is the last. The \0 at the end marks where
the string ends — C strings are null-terminated by convention.
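
The difference between the size of the array and the length of the string stored in it can be sketched as follows. The helper room_left is hypothetical; it assumes the caller passes the real array size, for example sizeof(line).

```c
#include <string.h>

/* Hypothetical helper: how many more characters fit in buf?
   capacity is the size of the array, e.g. sizeof(line) at the call site. */
static size_t room_left(const char *buf, size_t capacity)
{
    size_t used = strlen(buf) + 1;   /* the characters plus the '\0' */

    return (capacity > used) ? capacity - used : 0;
}
```

With char line[256] = "hello", the array still occupies 256 bytes, the string uses 6 of them (five letters and the \0 in line[5]), and room_left(line, sizeof(line)) is 250.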
String functions: string.h

    #include <string.h>

    size_t len = strlen("hello"); /* 5: does not count the \0 */

strlen counts characters up to (but not including) the null
terminator. The return type is size_t, an unsigned integer type
large enough to hold the size of any object on the platform. Print it
with %zu.
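
As a small illustration of strlen together with %zu, the sketch below prints a string and its length. The function name print_length is invented for this example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: print a string with its length, return the length. */
static size_t print_length(const char *s)
{
    size_t len = strlen(s);   /* does not count the '\0' */

    /* %zu is the conversion that matches size_t. */
    printf("\"%s\" has %zu characters\n", s, len);
    return len;
}
```

Using %d here instead of %zu would be a type mismatch, and gcc -Wall warns about it.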
Other common functions:
| Function | What it does |
|---|---|
| strlen(s) | length of string s |
| strcpy(dst, src) | copy src into dst |
| strcat(dst, src) | append src to dst |
| strcmp(s1, s2) | compare; returns 0 if equal |
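
The functions in the table can be combined as in the sketch below. Both helper names (same_string and build_hello) are hypothetical, and the 12-byte buffer size is chosen by hand to fit exactly the text being built.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if a and b hold the same characters. */
static int same_string(const char *a, const char *b)
{
    return strcmp(a, b) == 0;   /* 0 means "equal", not "false" */
}

/* Hypothetical helper: build "hello world" in out.
   out must have room for at least 12 chars: 11 characters plus the '\0'. */
static void build_hello(char *out)
{
    strcpy(out, "hello");    /* out now holds "hello" */
    strcat(out, " world");   /* appended after the existing text */
}
```

A frequent mistake is writing if (strcmp(a, b)) to test for equality. Since strcmp returns 0 for equal strings, that condition is true when the strings differ.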
Be careful with strcpy and strcat: neither checks that dst is large
enough, and writing past the end of an array is undefined behaviour.
Whether they are safe to use comes down to whether you can guarantee
that the destination buffer is large enough.
If the sizes are known and controlled by your own code, strcpy and
strcat are fine. If the input comes from the outside world (user
input, a network packet, a file), use strncpy or snprintf, which take
a maximum length. Note that strncpy does not null-terminate the
destination when the source has that many characters or more, so
snprintf is the safer general-purpose choice. The rule is not "never
use strcpy"; it is "never use strcpy when you cannot prove the
destination is large enough."
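
A bounded copy with snprintf might look like the sketch below. The helper name copy_bounded is an assumption made for this example; the snprintf behaviour it relies on (always terminating the output when the size is non-zero, and returning the length the full result would have had) is standard.

```c
#include <stdio.h>

/* Hypothetical helper: copy src into dst, which has room for cap chars.
   dst is always null-terminated when cap > 0.
   Returns 1 if the whole string fit, 0 if it had to be cut short. */
static int copy_bounded(char *dst, size_t cap, const char *src)
{
    /* snprintf writes at most cap - 1 characters plus the '\0' and
       returns the number of characters the full result needed. */
    int needed = snprintf(dst, cap, "%s", src);

    return needed >= 0 && (size_t)needed < cap;
}
```

With a 4-byte destination, copying "hi" succeeds, while copying "hello" stores "hel" and returns 0, so the caller can tell that the text was truncated.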
Characters: ctype.h

    #include <ctype.h>

    isalpha('A')    /* non-zero (true) */
    isdigit('3')    /* non-zero (true) */
    isspace(' ')    /* non-zero (true) */
    isspace('\t')   /* non-zero (true): tab is also whitespace */
    isspace('\n')   /* non-zero (true): so is newline */
    tolower('A')    /* 'a' */
    toupper('a')    /* 'A' */

These functions take an int (a character value) and return an int.
They are the building blocks for parsing text character by character.
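
Walking a string one character at a time with these functions can be sketched as follows. The helpers count_digits and upcase are hypothetical names. The (unsigned char) casts follow the rule explained with words.c further down this page.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: count the decimal digits in s. */
static size_t count_digits(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++) {
        if (isdigit((unsigned char)*s))
            n++;
    }
    return n;
}

/* Hypothetical helper: convert s to upper case in place. */
static void upcase(char *s)
{
    for (; *s != '\0'; s++)
        *s = (char)toupper((unsigned char)*s);
}
```

toupper returns characters that are not lower-case letters unchanged, so digits and punctuation pass through upcase untouched.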
words.c
words.c reads one line from standard input and prints the number of
words and the number of characters it contains. A word is a run of
non-whitespace characters. Transitioning from whitespace to
non-whitespace starts a new word.

    #include <stdio.h>
    #include <string.h>
    #include <ctype.h>

    int main(void)
    {
        char line[1024];
        int words;
        int chars;
        int in_word;
        size_t i;
        size_t len;

        /* No input at all: report an empty line. */
        if (fgets(line, sizeof(line), stdin) == NULL) {
            printf("0 words, 0 characters\n");
            return (0);
        }
        words = 0;
        chars = 0;
        in_word = 0;
        len = strlen(line);
        i = 0;
        while (i < len) {
            /* The newline kept by fgets ends the line and is not counted. */
            if (line[i] == '\n')
                break;
            chars++;
            if (!isspace((unsigned char)line[i])) {
                /* Whitespace to non-whitespace: a new word starts here. */
                if (!in_word) {
                    words++;
                    in_word = 1;
                }
            }
            else
                in_word = 0;
            i++;
        }
        printf("%d %s, %d %s\n", words, words == 1 ? "word" : "words",
            chars, chars == 1 ? "character" : "characters");
        return (0);
    }

The (unsigned char) cast before isspace is a correctness
requirement that will make full sense once you reach types and memory
in the C chapters. For now, treat it as a rule: always cast to
(unsigned char) when passing a char to any ctype.h function.
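
One way to make the rule hard to forget is to wrap the cast in a small function of your own. The sketch below uses a made-up name, is_space_char. The short reason for the cast is that plain char may be a signed type, so some byte values are negative, and the ctype.h functions are only defined for values that fit in an unsigned char (or the special value EOF).

```c
#include <ctype.h>

/* Hypothetical wrapper: takes a plain char, returns 1 or 0. */
static int is_space_char(char c)
{
    /* The cast turns a possibly negative char into a value
       in the range isspace is defined for. */
    return isspace((unsigned char)c) != 0;
}
```

The != 0 comparison also normalises the result: isspace only promises a non-zero value for true, not necessarily 1.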
Test it:

    gcc -Wall -Wextra words.c -o words
    echo "hello world foo" | ./words
    echo "one" | ./words
    echo "" | ./words

Expected:

    3 words, 15 characters
    1 word, 3 characters
    0 words, 0 characters

The next page brings rand(), srand(), and a loop together into
the guessing game, the chapter's final program.