ctype(3) - NetBSD Manual Pages

CTYPE(3)                NetBSD Library Functions Manual               CTYPE(3)

ctype -- character classification and mapping functions
Standard C Library (libc, -lc)
     #include <ctype.h>

     int isalpha(int c);
     int isupper(int c);
     int islower(int c);
     int isdigit(int c);
     int isxdigit(int c);
     int isalnum(int c);
     int isspace(int c);
     int ispunct(int c);
     int isprint(int c);
     int isgraph(int c);
     int iscntrl(int c);
     int isblank(int c);
     int toupper(int c);
     int tolower(int c);
The above functions perform character tests and conversions on the integer c.

See the specific manual pages for information about the test or conversion performed by each function.
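As a rough illustration of the tests described above, the following sketch tallies a few character classes in a string. The names struct class_counts and count_classes() are hypothetical and not part of <ctype.h>; the results depend on the current locale, and the counts mentioned below assume the "C" locale.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical tally of a few character classes; not part of <ctype.h>. */
struct class_counts {
	size_t alpha;
	size_t digit;
	size_t space;
	size_t punct;
};

/* Hypothetical helper: classify each character of a NUL-terminated string. */
static struct class_counts
count_classes(const char *s)
{
	struct class_counts n = { 0, 0, 0, 0 };

	for (; *s != '\0'; s++) {
		/* Convert to unsigned char so the argument is in the valid range. */
		unsigned char c = (unsigned char)*s;

		if (isalpha(c))
			n.alpha++;
		else if (isdigit(c))
			n.digit++;
		else if (isspace(c))
			n.space++;
		else if (ispunct(c))
			n.punct++;
	}
	return n;
}
```

For the string "ab 12, c!" in the "C" locale, this should count three alphabetic characters, two digits, two white-space characters, and two punctuation characters.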
To print an upper-case version of a string to stdout, the following code can be used:

     const char *s = "xyz";

     while (*s != '\0') {
             putchar(toupper((unsigned char)*s));
             s++;
     }
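The same cast pattern applies when the mapping functions are used for comparison rather than output. The sketch below is one possible case-insensitive equality check; the helper name ci_equal() is hypothetical, and the comparison follows whatever case mapping the current locale defines for single-byte characters.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical helper: compare two strings, ignoring case. */
static bool
ci_equal(const char *a, const char *b)
{
	while (*a != '\0' && *b != '\0') {
		/* Cast each char to unsigned char before calling tolower(). */
		if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
			return false;
		a++;
		b++;
	}
	/* Equal only if both strings ended at the same position. */
	return *a == *b;
}
```

With this sketch, ci_equal("NetBSD", "netbsd") is expected to be true, while strings of different lengths compare as unequal.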
isalnum(3), isalpha(3), isblank(3), iscntrl(3), isdigit(3), isgraph(3), islower(3), isprint(3), ispunct(3), isspace(3), isupper(3), isxdigit(3), tolower(3), toupper(3), ascii(7)
These functions, with the exception of isblank(), conform to ANSI X3.159-1989 (``ANSI C89''). All described functions, including isblank(), also conform to IEEE Std 1003.1-2001 (``POSIX.1'').
The argument of these functions is of type int, but only a very restricted subset of values are actually valid. The argument must either be the value of the macro EOF (which has a negative value), or must be a non-negative value within the range representable as unsigned char. Passing invalid values leads to undefined behavior.

Values of type int that were returned by getc(3), fgetc(3), and similar functions or macros are already in the correct range, and may be safely passed to these ctype functions without any casts.

Values of type char or signed char must first be cast to unsigned char, to ensure that the values are within the correct range. Casting a negative-valued char or signed char directly to int will produce a negative-valued int, which will be outside the range of allowed values (unless it happens to be equal to EOF, but even that would not give the desired result).

Because the bugs may manifest as silent misbehavior or as crashes only when fed input outside the US-ASCII range, the NetBSD implementation of the ctype functions is designed to elicit a compiler warning for code that passes inputs of type char in order to flag code that may pass negative values at runtime that would lead to undefined behavior:

     #include <ctype.h>
     #include <locale.h>
     #include <stdio.h>

     int
     main(int argc, char **argv)
     {

             if (argc < 2)
                     return 1;
             setlocale(LC_ALL, "");
             printf("%d %d\n", *argv[1], isprint(*argv[1]));
             printf("%d %d\n", (int)(unsigned char)*argv[1],
                 isprint((unsigned char)*argv[1]));
             return 0;
     }

When compiling this program, GCC reports a warning for the line that passes char.
At runtime, you may get nonsense answers for some inputs without the cast -- if you're lucky and it doesn't crash:

     % gcc -Wall -o test test.c
     test.c: In function 'main':
     test.c:12:2: warning: array subscript has type 'char'
     % LC_CTYPE=C ./test $(printf '\270')
     -72 5
     184 0
     % LC_CTYPE=C ./test $(printf '\377')
     -1 0
     255 0
     % LC_CTYPE=fr_FR.ISO8859-1 ./test $(printf '\377')
     -1 0
     255 2

Some implementations of libc, such as glibc as of 2018, attempt to avoid the worst of the undefined behavior by defining the functions to work for all integer inputs representable by either unsigned char or char, and suppress the warning. However, this is not an excuse for avoiding conversion to unsigned char: if EOF coincides with any such value, as it does when it is -1 on platforms with signed char, programs that pass char will still necessarily confuse the classification and mapping of EOF with the classification and mapping of some non-EOF inputs.

NetBSD 10.99                    January 15, 2019                    NetBSD 10.99