Portable character set

The Portable Character Set is a set of 103 characters that, according to the POSIX standard, must be present in every character set a conforming system supports. Compared to ASCII, it lacks some control characters, and it does not prescribe any particular encoded value for its members.[1][2] The Portable Character Set is a superset of the Basic Execution Character Set as defined by ANSI C.[3]
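The count and the comparison with ASCII can be checked with a short Python sketch. It identifies characters by the Unicode code points listed in this article's table, which is only a convenience for illustration, since POSIX itself fixes no encoding. The names `PORTABLE_CODE_POINTS`, `MISSING_FROM_PORTABLE` and `is_portable` are hypothetical and do not come from the standard.

```python
# Assumption: characters are identified by the Unicode code points shown in
# this article's table. POSIX does not mandate these (or any) encoded values.
PORTABLE_CONTROLS = {0x00, 0x07, 0x08, 0x09, 0x0A, 0x0B, 0x0C, 0x0D}
PORTABLE_PRINTABLE = set(range(0x20, 0x7F))  # space through tilde
PORTABLE_CODE_POINTS = PORTABLE_CONTROLS | PORTABLE_PRINTABLE

# ASCII characters that are absent from the Portable Character Set.
ASCII_CODE_POINTS = set(range(0x80))
MISSING_FROM_PORTABLE = sorted(ASCII_CODE_POINTS - PORTABLE_CODE_POINTS)


def is_portable(text: str) -> bool:
    """Return True if every character of `text` is in the portable set."""
    return all(ord(ch) in PORTABLE_CODE_POINTS for ch in text)
```

Under these assumptions the set has 103 members, and the 25 ASCII characters it omits are all control characters. A string such as `"ls -l\n"` passes the check, while one containing ESC (U+001B) or a non-ASCII letter does not.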

name glyph C string Unicode Unicode name
NUL   \0 U+0000 NULL (NUL)
alert   \a U+0007 ALERT (BEL)
backspace   \b U+0008 BACKSPACE (BS)
tab   \t U+0009 CHARACTER TABULATION (HT)
newline   \n U+000A LINE FEED (LF)
vertical-tab   \v U+000B LINE TABULATION (VT)
form-feed   \f U+000C FORM FEED (FF)
carriage-return   \r U+000D CARRIAGE RETURN (CR)
space     U+0020 SPACE
exclamation-mark ! ! U+0021 EXCLAMATION MARK
quotation-mark " \" U+0022 QUOTATION MARK
number-sign # # U+0023 NUMBER SIGN
dollar-sign $ $ U+0024 DOLLAR SIGN
percent-sign % % U+0025 PERCENT SIGN
ampersand & & U+0026 AMPERSAND
apostrophe ' \' U+0027 APOSTROPHE
left-parenthesis ( ( U+0028 LEFT PARENTHESIS
right-parenthesis ) ) U+0029 RIGHT PARENTHESIS
asterisk * * U+002A ASTERISK
plus-sign + + U+002B PLUS SIGN
comma , , U+002C COMMA
hyphen - - U+002D HYPHEN-MINUS
period . . U+002E FULL STOP
slash / / U+002F SOLIDUS
zero 0 0 U+0030 DIGIT ZERO
one 1 1 U+0031 DIGIT ONE
two 2 2 U+0032 DIGIT TWO
three 3 3 U+0033 DIGIT THREE
four 4 4 U+0034 DIGIT FOUR
five 5 5 U+0035 DIGIT FIVE
six 6 6 U+0036 DIGIT SIX
seven 7 7 U+0037 DIGIT SEVEN
eight 8 8 U+0038 DIGIT EIGHT
nine 9 9 U+0039 DIGIT NINE
colon : : U+003A COLON
semicolon ; ; U+003B SEMICOLON
less-than-sign < < U+003C LESS-THAN SIGN
equals-sign = = U+003D EQUALS SIGN
greater-than-sign > > U+003E GREATER-THAN SIGN
question-mark ? ? U+003F QUESTION MARK
commercial-at @ @ U+0040 COMMERCIAL AT
A A A U+0041 LATIN CAPITAL LETTER A
B B B U+0042 LATIN CAPITAL LETTER B
C C C U+0043 LATIN CAPITAL LETTER C
D D D U+0044 LATIN CAPITAL LETTER D
E E E U+0045 LATIN CAPITAL LETTER E
F F F U+0046 LATIN CAPITAL LETTER F
G G G U+0047 LATIN CAPITAL LETTER G
H H H U+0048 LATIN CAPITAL LETTER H
I I I U+0049 LATIN CAPITAL LETTER I
J J J U+004A LATIN CAPITAL LETTER J
K K K U+004B LATIN CAPITAL LETTER K
L L L U+004C LATIN CAPITAL LETTER L
M M M U+004D LATIN CAPITAL LETTER M
N N N U+004E LATIN CAPITAL LETTER N
O O O U+004F LATIN CAPITAL LETTER O
P P P U+0050 LATIN CAPITAL LETTER P
Q Q Q U+0051 LATIN CAPITAL LETTER Q
R R R U+0052 LATIN CAPITAL LETTER R
S S S U+0053 LATIN CAPITAL LETTER S
T T T U+0054 LATIN CAPITAL LETTER T
U U U U+0055 LATIN CAPITAL LETTER U
V V V U+0056 LATIN CAPITAL LETTER V
W W W U+0057 LATIN CAPITAL LETTER W
X X X U+0058 LATIN CAPITAL LETTER X
Y Y Y U+0059 LATIN CAPITAL LETTER Y
Z Z Z U+005A LATIN CAPITAL LETTER Z
left-square-bracket [ [ U+005B LEFT SQUARE BRACKET
backslash \ \\ U+005C REVERSE SOLIDUS
right-square-bracket ] ] U+005D RIGHT SQUARE BRACKET
circumflex ^ ^ U+005E CIRCUMFLEX ACCENT
underscore _ _ U+005F LOW LINE
grave-accent ` ` U+0060 GRAVE ACCENT
a a a U+0061 LATIN SMALL LETTER A
b b b U+0062 LATIN SMALL LETTER B
c c c U+0063 LATIN SMALL LETTER C
d d d U+0064 LATIN SMALL LETTER D
e e e U+0065 LATIN SMALL LETTER E
f f f U+0066 LATIN SMALL LETTER F
g g g U+0067 LATIN SMALL LETTER G
h h h U+0068 LATIN SMALL LETTER H
i i i U+0069 LATIN SMALL LETTER I
j j j U+006A LATIN SMALL LETTER J
k k k U+006B LATIN SMALL LETTER K
l l l U+006C LATIN SMALL LETTER L
m m m U+006D LATIN SMALL LETTER M
n n n U+006E LATIN SMALL LETTER N
o o o U+006F LATIN SMALL LETTER O
p p p U+0070 LATIN SMALL LETTER P
q q q U+0071 LATIN SMALL LETTER Q
r r r U+0072 LATIN SMALL LETTER R
s s s U+0073 LATIN SMALL LETTER S
t t t U+0074 LATIN SMALL LETTER T
u u u U+0075 LATIN SMALL LETTER U
v v v U+0076 LATIN SMALL LETTER V
w w w U+0077 LATIN SMALL LETTER W
x x x U+0078 LATIN SMALL LETTER X
y y y U+0079 LATIN SMALL LETTER Y
z z z U+007A LATIN SMALL LETTER Z
left-brace { { U+007B LEFT CURLY BRACKET
vertical-line | | U+007C VERTICAL LINE
right-brace } } U+007D RIGHT CURLY BRACKET
tilde ~ ~ U+007E TILDE
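The Unicode column above identifies each character; it is not a required encoding. The following Python sketch illustrates this by encoding the same 103 characters under two codecs from Python's standard library: `ascii` and `cp037`, an EBCDIC code page. The choice of `cp037` is an assumption made purely for contrast, and the names `PORTABLE_CHARS` and `encoded_values` are hypothetical.

```python
# The eight control characters from the table, written with the same escapes
# as the "C string" column, followed by space through tilde.
PORTABLE_CHARS = "\0\a\b\t\n\v\f\r" + "".join(chr(cp) for cp in range(0x20, 0x7F))


def encoded_values(codec: str) -> dict:
    """Map each portable character to its byte sequence under `codec`."""
    return {ch: ch.encode(codec) for ch in PORTABLE_CHARS}


ascii_values = encoded_values("ascii")
ebcdic_values = encoded_values("cp037")  # illustrative EBCDIC code page

# Same character, different encoded value.
print(ascii_values["A"].hex(), ebcdic_values["A"].hex())  # 41 c1
```

Both codecs can represent every character in the table, but they assign different byte values, which is why portable programs should not rely on a specific numeric value for any member of the set.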

Character Classes

Characters grouped by their class.[4]

Unicode range Character Class POSIX.1-2017 Standard
U+0000 Control Portable
U+0001 to U+0006 Control Non-Portable
U+0007 to U+0008 Control Portable
U+0009 to U+000D White-space Portable
U+000E to U+001F Control Non-Portable
U+0020 White-space Portable
U+0021 to U+002F Punctuation Portable
U+0030 to U+0039 Digit Portable
U+003A to U+0040 Punctuation Portable
U+0041 to U+005A Uppercase Letter Portable
U+005B to U+0060 Punctuation Portable
U+0061 to U+007A Lowercase Letter Portable
U+007B to U+007E Punctuation Portable
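The ranges above translate directly into a lookup table. The Python sketch below is one possible encoding of it, again keyed by Unicode code point as an assumption for illustration; `CLASS_RANGES` and `classify` are hypothetical names. The row for the remaining C0 control characters covers U+000E through U+001F.

```python
from typing import Optional, Tuple

# (first, last, class name, portable?) with inclusive code point bounds.
CLASS_RANGES = [
    (0x00, 0x00, "Control", True),
    (0x01, 0x06, "Control", False),
    (0x07, 0x08, "Control", True),
    (0x09, 0x0D, "White-space", True),
    (0x0E, 0x1F, "Control", False),
    (0x20, 0x20, "White-space", True),
    (0x21, 0x2F, "Punctuation", True),
    (0x30, 0x39, "Digit", True),
    (0x3A, 0x40, "Punctuation", True),
    (0x41, 0x5A, "Uppercase Letter", True),
    (0x5B, 0x60, "Punctuation", True),
    (0x61, 0x7A, "Lowercase Letter", True),
    (0x7B, 0x7E, "Punctuation", True),
]


def classify(ch: str) -> Optional[Tuple[str, bool]]:
    """Return (class name, is_portable) for `ch`, or None if the table has no row for it."""
    cp = ord(ch)
    for first, last, name, portable in CLASS_RANGES:
        if first <= cp <= last:
            return name, portable
    return None
```

For example, `classify("A")` gives `("Uppercase Letter", True)` and `classify("\x01")` gives `("Control", False)`. Characters outside the table, such as DEL (U+007F) or any non-ASCII character, yield `None`.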

References

  1. "The Open Group Base Specifications Issue 7, 2018 edition". IEEE and The Open Group. 2018. Retrieved 2018-03-21.
  2. "The Open Group Base Specifications Issue 6". IEEE and The Open Group. 2004. Retrieved 2014-08-18.
  3. "Working draft — ISO/IEC 9899:202x, Information technology — Programming languages — C, § 5.2.1" (PDF). International Organization for Standardization. 2018. Retrieved 2020-08-03.
  4. "American National Standard Code for Information Interchange | ANSI X3.4-1977" (PDF). National Institute for Standards. 1977. Archived (PDF) from the original on 2022-10-09. (facsimile, not machine readable)