C functions that should be avoided (part 1) - atoi and gets
Most non-beginner C programmers by now know not to use gets(). Indeed, it was removed from the language entirely in C11. However, there are a few other functions that should mostly be avoided, for various reasons.
atoi() and friends
double atof(const char *nptr);
int atoi(const char *nptr);
long int atol(const char *nptr);
long long int atoll(const char *nptr);
It is tempting to use atoi and its counterparts to convert string representations of numbers to the appropriate types. For example:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[]) {
    if (argc > 1) {
        printf("%d\n", atoi(argv[1]));
    }
    return 0;
}
Looks fine, except:
- If you invoke the program with a number that is out of range for int, the behavior is undefined [1].
- If you call atoi on a non-number, such as atoi("potato"), you will get 0, which is indistinguishable from the result for a valid "0".
- atoi("123abc") will return 123 without any indication that there's non-numeric junk remaining.
The solution is to use strtol instead:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[]) {
    if (argc > 1) {
        printf("%ld\n", strtol(argv[1], NULL, 10));
    }
    return 0;
}
This code is better, as we have eliminated the undefined behavior, but it still suffers from being unable to validate input. The second argument to strtol should be used:
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
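int main(int argc, char *argv[]) {
    if (argc < 2) {
        fprintf(stderr, "No command line argument found.\n");
        exit(EXIT_FAILURE);
    }
    char *endptr;
    errno = 0;  // strtol only sets errno on failure, so clear it first
    long result = strtol(argv[1], &endptr, 0);  // base 0 also accepts 0x hex and 0-prefixed octal
    // endptr == argv[1] means no digits were consumed (e.g. an empty string)
    if (errno == ERANGE || endptr == argv[1] || *endptr != '\0') {
        fprintf(stderr, "Not a valid number or out of range.\n");
        exit(EXIT_FAILURE);
    }
    printf("%ld\n", result);
    return 0;
}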
With strtol we’re able to verify the input fully. Inputs of “123foo” or “potato” will be treated as invalid, and out of range results will be handled. If you are expecting to parse the string further after the number, you can use endptr to know where to continue parsing.
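As a rough illustration of continuing to parse from endptr, here is a minimal sketch that reads two numbers separated by an 'x', as in "640x480". The function name parse_dimensions and the "WIDTHxHEIGHT" format are made up for this example; they are not part of any library.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a string of the form "WIDTHxHEIGHT".
 * Returns true and stores both values on success, false otherwise.
 * Note that strtol itself skips leading whitespace and accepts a sign,
 * so this sketch accepts inputs such as " 640x-480" as well.
 */
bool parse_dimensions(const char *s, long *width, long *height) {
    char *end;

    errno = 0;
    long w = strtol(s, &end, 10);
    /* No digits, overflow, or the number is not followed by 'x'. */
    if (end == s || errno == ERANGE || *end != 'x') {
        return false;
    }

    /* Continue parsing right after the separator that end points at. */
    const char *rest = end + 1;
    errno = 0;
    long h = strtol(rest, &end, 10);
    /* The second number must be present and must end the string. */
    if (end == rest || errno == ERANGE || *end != '\0') {
        return false;
    }

    *width = w;
    *height = h;
    return true;
}
```

Calling parse_dimensions("640x480", &w, &h) stores 640 and 480, while inputs such as "640x", "x480" or "640x480px" are rejected.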
Don’t use these | Use these instead |
---|---|
atoi | strtol |
atol | strtol |
atoll | strtoll |
atof | strtod |
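One catch when replacing atoi with strtol is that strtol returns a long, so a value can be valid for long and still not fit in an int. The following is a sketch of one way to wrap this up; parse_int is a hypothetical helper name, not a standard function.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical atoi replacement: converts a decimal string to int.
 * Returns true and stores the value in *out only if the whole string
 * is a number that fits in an int; otherwise returns false and leaves
 * *out untouched.
 */
bool parse_int(const char *s, int *out) {
    char *end;

    errno = 0;
    long value = strtol(s, &end, 10);

    /* Reject empty input, non-numbers, and trailing junk. */
    if (end == s || *end != '\0') {
        return false;
    }
    /* Reject values outside long (ERANGE) or outside int. */
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX) {
        return false;
    }

    *out = (int)value;
    return true;
}
```

The same pattern (clear errno, call the function, check the end pointer and errno) should carry over to strtoll and strtod, although strtod reports range errors with different return values.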
gets()
char *gets(char *s); // from C99, removed in C11.
Almost everyone knows not to use gets(), but I’m including it here for completeness. The problem is that gets() doesn’t know how much room your buffer has:
#include <stdio.h>
int main(void) {
    char buffer[20];
    gets(buffer);
    return 0;
}
Obviously, if you supply 20 or more characters on a line of standard input, you will get a buffer overflow, because gets() also has to store the terminating null character. This is undefined behavior. Thousands of less obvious buffer overflows in real code have led to security exploits. scanf potentially has the same problem, but it can be mitigated:
scanf("%s", buffer);   // Same problem as gets
scanf("%19s", buffer); // Okay for a 20-byte buffer, as long as you understand
                       // how scanf works. Covered in part 3.
The solution is to use fgets instead:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
    char buffer[20];
    if (fgets(buffer, sizeof buffer, stdin) == NULL) {
        fprintf(stderr, "end-of-file or error.\n");
        exit(EXIT_FAILURE);
    }
    char *p;
    if ((p = strchr(buffer, '\n')) == NULL) {
        fprintf(stderr, "line too long.\n");
        exit(EXIT_FAILURE);
    }
    *p = '\0'; // often we don't care about the newline anymore.
    printf("Line supplied: \"%s\"\n", buffer);
    return 0;
}
Here we’ve checked the return value of fgets and made sure we read a whole line.
C11 also provides gets_s as part of its optional bounds-checking extensions (Annex K). I'm not going to cover it here, as the C implementations I use currently don't support it. Furthermore, the recommended practice for gets_s even says [2]:
"The fgets function allows properly-written programs to safely process input lines too long to store in the result array. In general this requires that callers of fgets pay attention to the presence or absence of a new-line character in the result array. Consider using fgets (along with any needed processing based on new-line characters) instead of gets_s."
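To make the "processing based on new-line characters" a bit more concrete, here is a sketch of one possible approach: keep calling fgets and grow the buffer until a newline shows up. The function read_line and its starting capacity of 64 bytes are my own choices for illustration. It does not cap the line length and does not handle embedded null characters.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: read one line of any length from stream.
 * Returns a malloc'd string without the trailing newline (the caller
 * frees it), or NULL on end-of-file with nothing read, on a read
 * error, or if memory runs out.
 */
char *read_line(FILE *stream) {
    size_t cap = 64;
    size_t len = 0;
    char *line = malloc(cap);
    if (line == NULL) {
        return NULL;
    }

    /* Each call appends to what we already have, leaving room for '\0'. */
    while (fgets(line + len, (int)(cap - len), stream) != NULL) {
        len += strlen(line + len);

        if (len > 0 && line[len - 1] == '\n') {
            line[len - 1] = '\0';   /* complete line: drop the newline */
            return line;
        }

        if (len + 1 == cap) {       /* buffer full, no newline yet: grow */
            char *bigger = realloc(line, cap * 2);
            if (bigger == NULL) {
                free(line);
                return NULL;
            }
            line = bigger;
            cap *= 2;
        }
    }

    /* fgets returned NULL: end-of-file or error. */
    if (len == 0 || ferror(stream)) {
        free(line);
        return NULL;
    }
    return line;                    /* last line had no trailing newline */
}
```

A typical caller would loop with something like `while ((line = read_line(stdin)) != NULL) { ...; free(line); }`.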
I cover strncpy in part 2.