This is the third post on functions that should be avoided. I covered other functions in previous posts.

scanf

int scanf(const char * restrict format, ...);

scanf is yet another common source of confusion for C novices, especially since it often gets introduced early to demonstrate user input. While it is possible to use scanf safely and properly handle input issues, it is usually unnecessarily complicated compared to other means.

Consider a simple program that asks a user for their name and age. A novice who has just seen scanf used might be tempted to write something like the following:

#include <stdio.h>

int main(void) {
    char name[80];
    int age;
    printf("Please enter your name: ");
    fflush(stdout);
    scanf("%s", name);
    printf("Please enter your age: ");
    fflush(stdout);
    scanf("%d", &age);
    printf("Your name is \"%s\" and your age is \"%d\"\n", name, age);
    return 0;
}

Sure enough, when you run the program things seem to work okay:

Please enter your name: Chris
Please enter your age: 100
Your name is "Chris" and your age is "100"

There are a number of problems with the program, some of which can be easily addressed:

  • The first scanf is like gets and cannot prevent a buffer overflow on too long an input.
  • It won’t work as desired if the user decides to enter a full name (i.e. two or more words).
  • It won’t work as desired if the user enters something for an age that doesn’t look like an integer.
  • It has no way of knowing if the integer entered was in range for an int. In fact, if the result cannot be represented by an int, the behavior is undefined [1].
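The second and third points can be reproduced without typing anything by running the same conversions over fixed strings with sscanf. This is only a sketch: first_word and try_int are hypothetical helper names, and the %79s width is added here to keep the sketch itself safe, unlike the bare %s in the program above.

```c
#include <stdio.h>

/* Hypothetical helper: store the first whitespace-delimited word of input
   in out, which must have room for 80 bytes. Returns the sscanf result. */
int first_word(const char *input, char out[80]) {
    /* %s stops at the first whitespace, so "Chris Smith" yields "Chris". */
    return sscanf(input, "%79s", out);
}

/* Hypothetical helper: try to convert the start of input with %d.
   Returns the sscanf result: 1 on success, 0 if no integer was found. */
int try_int(const char *input, int *out) {
    return sscanf(input, "%d", out);
}
```

Calling first_word on "Chris Smith" stores only "Chris", and try_int on "hello" returns 0 without writing to the int.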

We can fix some of these problems reasonably simply:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    char name[80];
    int age;
    printf("Please enter your name: ");
    fflush(stdout);
    if (scanf("%79[^\n]", name) != 1) {
        exit(EXIT_FAILURE);
    }
    printf("Please enter your age: ");
    fflush(stdout);
    if (scanf("%d", &age) != 1) {
        exit(EXIT_FAILURE);
    }
    printf("Your name is \"%s\" and your age is \"%d\"\n", name, age);
    return 0;
}

Now we check that scanf converted each item, the field width prevents the buffer overflow, and the %79[^\n] conversion accepts multiword names. I believe only two issues remain:

  • The behavior is still undefined if the result of the %d conversion is out of range; there is no way around this while using scanf.
  • If a name longer than 79 characters (the most the 80-byte buffer can hold alongside the terminating null) is entered, the rest of the name is left on the standard input stream and parsed as though it were part of the age input.

At this point the only way to recover with scanf is to discard the rest of the line, up to and including the newline. The catch is that whether the newline is still on the stream depends on the format: %79[^\n] always leaves it there, while a format with trailing whitespace consumes it. If you discard up to the next newline after the newline has already been consumed, you throw away the next legitimate line of input (the age).

Therefore, it is best to simply use fgets to read the name line, then fgets followed by strtol to read the age line:

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int readline(char *buf, size_t size, FILE *stream) {
    if (fgets(buf, size, stream) == NULL) {
        return EOF;
    }
    char *n = strchr(buf, '\n');
    if (n) {
        *n = '\0';
        // If you got here, a line was read successfully without truncation.
    } else {
        int c;
        while ( (c = fgetc(stream)) != EOF && c != '\n');
        if (c == EOF) {
            return EOF;
        }
        // If you got here, a line was read successfully but was truncated.
    }
    return 0;
}

int main(void) {
    char namebuf[50], agebuf[50];
    printf("Please enter your name: ");
    fflush(stdout);
    if (readline(namebuf, sizeof namebuf, stdin)) {
        exit(EXIT_FAILURE);
    }
    printf("Please enter your age: ");
    fflush(stdout);
    if (readline(agebuf, sizeof agebuf, stdin)) {
        exit(EXIT_FAILURE);
    }
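    char *endptr;
    errno = 0;
    // Base 10, so that "010" is read as ten rather than as octal 8.
    long age = strtol(agebuf, &endptr, 10);
    // Reject empty input (no digits consumed), trailing junk, and overflow.
    if (endptr == agebuf || *endptr != '\0' || errno == ERANGE) {
        fprintf(stderr, "Not a valid number or out of range.\n");
        exit(EXIT_FAILURE);
    }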
    printf("Your name is \"%s\" and your age is \"%ld\"\n", namebuf, age);
    return 0;
}

Hopefully this code is now reasonably robust. readline detects truncation internally (the else branch), and it could return a distinct value there so that the caller can choose to exit at that point.
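One way to surface truncation to the caller is sketched below. The name readline_checked and the LINE_OK / LINE_TRUNCATED codes are assumptions made for this illustration; the readline above returns only 0 or EOF. The size parameter is an int here because that is what fgets takes.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical result codes; EOF is negative, so it cannot clash with them. */
enum { LINE_OK = 0, LINE_TRUNCATED = 1 };

/* Variant of readline that tells the caller when the line did not fit.
   Returns LINE_OK, LINE_TRUNCATED, or EOF on end of input or error. */
int readline_checked(char *buf, int size, FILE *stream) {
    if (fgets(buf, size, stream) == NULL) {
        return EOF;
    }
    char *n = strchr(buf, '\n');
    if (n) {
        *n = '\0';
        return LINE_OK;
    }
    /* No newline in buf: discard the rest of the over-long line. */
    int c;
    while ((c = fgetc(stream)) != EOF && c != '\n') {
        /* keep discarding */
    }
    if (c == EOF) {
        return EOF;
    }
    return LINE_TRUNCATED;
}
```

A caller could then treat LINE_TRUNCATED as an error and exit, or warn and carry on with the shortened name. Like the original, this sketch reports EOF for a final line that has no trailing newline.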

Furthermore, if you want to ask the user to retype their input when it's invalid (e.g. when the age is not a valid integer), you can easily wrap the above code in loops.
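A minimal sketch of such a retry loop follows. The helper names parse_long and ask_long, the 50-byte buffer, and the choice of base 10 are assumptions made for this example rather than part of the program above.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse all of s as a base-10 long.
   Returns 1 on success, 0 if s is empty, has trailing junk, or overflows. */
int parse_long(const char *s, long *out) {
    char *end;
    errno = 0;
    long v = strtol(s, &end, 10);
    if (end == s || *end != '\0' || errno == ERANGE) {
        return 0;
    }
    *out = v;
    return 1;
}

/* Hypothetical helper: prompt until a line holding a valid integer is read.
   Returns 0 on success, EOF if the input ends first. */
int ask_long(const char *prompt, FILE *in, long *out) {
    char buf[50];
    for (;;) {
        fputs(prompt, stdout);
        fflush(stdout);
        if (fgets(buf, sizeof buf, in) == NULL) {
            return EOF;
        }
        char *nl = strchr(buf, '\n');
        if (nl) {
            *nl = '\0';
            if (parse_long(buf, out)) {
                return 0;
            }
        } else {
            /* Line too long for buf: discard the remainder, then retry. */
            int c;
            while ((c = fgetc(in)) != EOF && c != '\n') {
                /* keep discarding */
            }
            if (c == EOF) {
                return EOF;
            }
        }
        fputs("error: invalid number, try again.\n", stdout);
    }
}
```

From main this would be called as ask_long("Please enter your age: ", stdin, &age). Because every attempt consumes a whole line, a bad entry cannot linger on the stream and affect the next prompt.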

If you are not parsing numbers, using fgets to read a line then sscanf to parse the resulting string is a very reasonable approach. It is scanf that is problematic, not sscanf.
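As an illustration of the fgets-then-sscanf pattern, the sketch below splits a line into a first and last name. The helper names, the 32-byte fields, and the 128-byte line buffer are all assumptions chosen for the example.

```c
#include <stdio.h>

/* Hypothetical helper: split line into two whitespace-separated words.
   first and last must each have room for 32 bytes.
   Returns 1 if both words were found, 0 otherwise. */
int parse_full_name(const char *line, char first[32], char last[32]) {
    /* The widths leave room for the terminating null in each buffer. */
    return sscanf(line, "%31s %31s", first, last) == 2;
}

/* Hypothetical helper: read one line from in and parse it.
   Returns 1 on success, 0 if the line is malformed, EOF at end of input. */
int read_full_name(FILE *in, char first[32], char last[32]) {
    char line[128];
    if (fgets(line, sizeof line, in) == NULL) {
        return EOF;
    }
    return parse_full_name(line, first, last);
}
```

Because sscanf only ever sees the line that fgets already removed from the stream, a failed conversion leaves nothing behind to confuse the next read. This sketch does not check for lines longer than the buffer or for extra words after the second one.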

scanf also has a classic gotcha that leads novices into an infinite loop:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int age, ret;
    do {
        printf("Please enter your age: ");
        fflush(stdout);
        if ( (ret = scanf("%d", &age)) != 1) {
            if (ret == EOF) {
                exit(EXIT_FAILURE);
            }
            printf("error: invalid age, try again.\n");
        }
    } while (ret != 1);

    printf("Your age is \"%d\"\n", age);
    return 0;
}

Here we are being diligent by checking the return value of scanf, but what actually happens when the user types a non-integer such as “hello” is that this is left on the stream, so the next time scanf is called with %d, it immediately returns 0 since it still doesn’t see an integer. The solution is to consume characters until the next newline and try again. However, as shown above, just using fgets is easier.
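For completeness, here is a sketch of the consume-until-newline fix applied to the loop above. discard_line and prompt_age are hypothetical helper names, and fscanf on a FILE * parameter stands in for scanf so that the helper can be driven from any stream.

```c
#include <stdio.h>

/* Hypothetical helper: throw away the rest of the current line.
   Returns 0 once a newline has been consumed, EOF if the stream ends first. */
int discard_line(FILE *stream) {
    int c;
    while ((c = fgetc(stream)) != EOF && c != '\n') {
        /* keep discarding */
    }
    return (c == EOF) ? EOF : 0;
}

/* Hypothetical helper: prompt until fscanf converts an integer.
   Returns 0 on success, EOF if the input ends. Out-of-range input is
   still undefined behavior, as with every scanf-family %d conversion. */
int prompt_age(FILE *in, int *age) {
    for (;;) {
        printf("Please enter your age: ");
        fflush(stdout);
        int ret = fscanf(in, "%d", age);
        if (ret == 1) {
            return 0;
        }
        if (ret == EOF) {
            return EOF;
        }
        printf("error: invalid age, try again.\n");
        /* Remove the offending text so the next attempt sees fresh input. */
        if (discard_line(in) == EOF) {
            return EOF;
        }
    }
}
```

Calling prompt_age(stdin, &age) from main gives the behavior the loop above was aiming for. Note that input such as "42abc" still succeeds and leaves "abc" on the stream, which is one more reason to prefer the fgets approach.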

References

  1. scanf undefined behavior on out of range integer — C11 §7.21.6.2p10