Character data
1. Character data
2. The computer age
When you use a computer, which part of its design might be considered the oldest?
Hint: It has to do with how the human interacts with the computer.
3. Characters
A knowledge of characters is important in programming.
Programming languages consist of characters.
Text editors accept keystrokes that represent character data or commands.
Text editor macros developed for productivity require a knowledge of character data.
4. In the beginning
In the beginning of the computer age, there were only a few characters needed.
All upper case
Digits
Blank
Some punctuation
Some control characters
Over time, more and more characters were added while maintaining compatibility with previous decisions.
5. Letters and control characters
1 A=65 a=97
2 B=66 b=98
3 C=67 c=99
4 D=68 d=100
5 E=69 e=101
6 F=70 f=102
7 G=71 g=103
8 H=72 h=104
9 I=73 i=105
10 J=74 j=106
11 K=75 k=107
12 L=76 l=108
13 M=77 m=109
14 N=78 n=110
15 O=79 o=111
16 P=80 p=112
17 Q=81 q=113
18 R=82 r=114
19 S=83 s=115
20 T=84 t=116
21 U=85 u=117
22 V=86 v=118
23 W=87 w=119
24 X=88 x=120
25 Y=89 y=121
26 Z=90 z=122
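The table shows two regularities: each lowercase code is the uppercase code plus 32, and the leading number on each row (1 to 26) equals the uppercase code minus 64. Reading that leading number as the code of the matching control character (Ctrl-A = 1, Ctrl-B = 2, and so on) is an assumption based on the section title. A minimal Python sketch under that assumption, with helper names made up for illustration:

```python
def to_lower_code(upper_code: int) -> int:
    """Lowercase code for an uppercase ASCII letter code (sets bit 5, i.e. adds 32)."""
    return upper_code | 0x20


def control_code(letter: str) -> int:
    """Code of Ctrl-<letter>: keep only the low 5 bits of the letter's code."""
    return ord(letter.upper()) & 0x1F


for ch in "AHZ":
    lower = to_lower_code(ord(ch))
    print(ch, ord(ch), chr(lower), lower, control_code(ch))
```

For H this gives 72, then h = 104, then control code 8, which matches row 8 of the table.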
6. ASCII punctuation
20 = 32 = " "
21 = 33 = "!"
22 = 34 = "\""
23 = 35 = "#"
24 = 36 = "$"
25 = 37 = "%"
26 = 38 = "&"
27 = 39 = "'"
28 = 40 = "("
29 = 41 = ")"
2A = 42 = "*"
2B = 43 = "+"
2C = 44 = ","
2D = 45 = "-"
2E = 46 = "."
2F = 47 = "/"
7. ASCII digits
30 = 48 = "0"
31 = 49 = "1"
32 = 50 = "2"
33 = 51 = "3"
34 = 52 = "4"
35 = 53 = "5"
36 = 54 = "6"
37 = 55 = "7"
38 = 56 = "8"
39 = 57 = "9"
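Because the digit characters are consecutive starting at 48 (30h), the numeric value of a digit character is its code minus 48, which is also the low four bits of the code. A small Python sketch of that arithmetic; the function names are illustrative only:

```python
def digit_value(ch: str) -> int:
    """Numeric value of one ASCII digit character, using code - 48."""
    if not "0" <= ch <= "9":
        raise ValueError(f"not an ASCII digit: {ch!r}")
    return ord(ch) - ord("0")


def parse_decimal(text: str) -> int:
    """Build an integer from digit characters, one character at a time."""
    value = 0
    for ch in text:
        value = value * 10 + digit_value(ch)
    return value


print(digit_value("7"), ord("7") & 0x0F, parse_decimal("1981"))
```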
8. More ASCII punctuation
3A = 58 = ":"
3B = 59 = ";"
3C = 60 = "<"
3D = 61 = "="
3E = 62 = ">"
3F = 63 = "?"
40 = 64 = "@"
ASCII uppercase letters start at 41h or 65d
5B = 91 = "["
5C = 92 = "\\"
5D = 93 = "]"
5E = 94 = "^"
5F = 95 = "_"
60 = 96 = "`"
ASCII lowercase letters start at 61h or 97d
7B = 123 = "{"
7C = 124 = "|"
7D = 125 = "}"
7E = 126 = "~"
9. ASCII delete character
ASCII character 127d or 7Fh is the delete character, a control character.
127d = 7Fh = 01111111b
It is also known as the rubout character since, on a paper tape, all the holes could be punched so that the character would be ignored.
On the IBM PC and thereafter, the backspace character, Ctrl-H, 08d = 08h = 00001000b, was used for deleting the previous character.
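The rubout idea can be modeled with a bitwise OR: holes can be added to a tape but not removed, so punching all seven holes over any existing 7-bit code always yields 127. A short Python sketch of that model; `punch_over` is a made-up name for illustration:

```python
DEL = 0x7F        # 127d = 01111111b, all seven holes punched
BACKSPACE = 0x08  # Ctrl-H


def punch_over(existing_code: int) -> int:
    """Punch all seven holes over an existing 7-bit code (holes only accumulate)."""
    return existing_code | DEL


# Whatever was on the tape before, the result is the delete character.
results = {punch_over(code) for code in range(128)}
print(results == {DEL}, format(DEL, "08b"), format(BACKSPACE, "08b"))
```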
10. Extended ASCII
Extended ASCII arrived with the introduction of the IBM PC in 1981.
This provided characters in the range 128d to 255d, 80h to FFh, 10000000b to 11111111b.
Problem: Many vendors provided different characters in this range.
Solution: Code page switching - still causing issues today.
Signature pattern: ASCII characters are recognized and displayed properly, but the extended ASCII characters are not.
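The signature pattern can be reproduced by decoding the same bytes under two different code pages. The Python sketch below uses two codecs from the standard library, `cp437` (the original IBM PC code page) and `latin-1` (ISO 8859-1); the choice of these two, and of the byte E9h, is only for illustration:

```python
raw = bytes([0x41, 0x42, 0xE9])  # "A", "B", and one byte in the extended range

as_cp437 = raw.decode("cp437")     # interpretation under the IBM PC code page
as_latin1 = raw.decode("latin-1")  # interpretation under ISO 8859-1

# The first two characters agree; the third depends on the code page.
print(as_cp437[:2] == as_latin1[:2], as_cp437[2] == as_latin1[2])
print(as_cp437, as_latin1)
```

The ASCII bytes decode identically, while the byte above 127 comes out as a different character in each case.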
11. Unicode
The 2-byte (16-bit) Unicode character representation format, UCS-2, allows 65,536 code values (0 to 65,535) to be represented.
The UTF-8 encoding is the most popular encoding, in part because the most commonly used characters in much text are in the 0 to 127 range and so take only one byte each.
Unicode as a whole allows many more characters to be represented (e.g., for ancient scripts such as Gothic), but the software being used often does not properly support such representations.
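As an illustration of a character that does not fit in 16 bits, the Python sketch below looks at one Gothic letter (U+10330, GOTHIC LETTER AHSA, chosen here only as an example) and compares how many bytes it needs in two encodings:

```python
gothic_ahsa = "\U00010330"  # GOTHIC LETTER AHSA, written as an escape
code_point = ord(gothic_ahsa)

print(hex(code_point), code_point > 0xFFFF)   # beyond the 16-bit range
print(len(gothic_ahsa.encode("utf-16-be")))   # bytes needed in UTF-16
print(len(gothic_ahsa.encode("utf-8")))       # bytes needed in UTF-8
print(len("A".encode("utf-8")))               # an ASCII letter for comparison
```

Software that assumes every character is exactly one 16-bit unit will mishandle such a character, which is one way the support problems mentioned above show up.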
12. UTF-8
The leading bits serve as a state machine for what to expect next.
There is some fault tolerance so that the sequence/state can be picked up after an error.
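One way to picture the state machine is to classify each byte by its leading bits, and to recover from an error by skipping continuation bytes until the next byte that can start a character. A minimal Python sketch, not a full validating decoder; `classify_byte` and `resync` are hypothetical helper names:

```python
def classify_byte(b: int) -> str:
    """Classify a byte by its leading bits."""
    if b < 0x80:
        return "single"        # 0xxxxxxx
    if b < 0xC0:
        return "continuation"  # 10xxxxxx
    if b < 0xE0:
        return "lead-2"        # 110xxxxx
    if b < 0xF0:
        return "lead-3"        # 1110xxxx
    if b < 0xF8:
        return "lead-4"        # 11110xxx
    return "invalid"


def resync(data: bytes, start: int) -> int:
    """Index of the first byte at or after start that is not a continuation byte."""
    i = start
    while i < len(data) and classify_byte(data[i]) == "continuation":
        i += 1
    return i


data = "a\u20acb".encode("utf-8")  # bytes 61 E2 82 AC 62
print([classify_byte(b) for b in data])
print(resync(data, 2))  # starting inside the 3-byte sequence
```

Starting in the middle of the three-byte sequence, `resync` skips the two continuation bytes and lands on the final single-byte character.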
13. One byte
1 byte : 0-127: 7 bits
0 x x x x x x x : 7 bits : 0-127
14. Two bytes
2 bytes: 128-2047: 5 + 6 = 11 bits
1 1 0 x x x x x : 5 bits : 0-31 : 110xxxxx = 192-223
1 0 x x x x x x : 6 bits: 0-63 : 10xxxxxx = 128-191
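The two-byte layout can be worked through by hand: the upper 5 bits of the value go into the first byte and the lower 6 bits into the second. A small Python sketch of that split, using a made-up function name and the value E9h (233) as the example:

```python
def encode_two_bytes(code_point: int) -> bytes:
    """Encode a value in 128-2047 using the 110xxxxx 10xxxxxx layout."""
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("two-byte form covers 128-2047 only")
    lead = 0b11000000 | (code_point >> 6)          # upper 5 bits
    cont = 0b10000000 | (code_point & 0b00111111)  # lower 6 bits
    return bytes([lead, cont])


encoded = encode_two_bytes(0xE9)
print(encoded.hex())  # c3a9
```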
15. Three bytes
3 bytes : 2048-65,535: 4 + 6 + 6 = 16 bits
1 1 1 0 x x x x : 4 bits : 0-15 : 1110xxxx = 224-239
1 0 x x x x x x : 6 bits: 0-63 : 10xxxxxx = 128-191
1 0 x x x x x x : 6 bits: 0-63 : 10xxxxxx = 128-191
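Going the other direction, the three-byte layout can be decoded by masking off the marker bits and joining the 4 + 6 + 6 payload bits. The Python sketch below is a simplified illustration (it checks the marker bits but not every rule a full decoder would enforce), and `decode_three_bytes` is a made-up name:

```python
def decode_three_bytes(data: bytes) -> int:
    """Recover the value from a 1110xxxx 10xxxxxx 10xxxxxx sequence."""
    b0, b1, b2 = data  # raises ValueError unless exactly three bytes
    if b0 >> 4 != 0b1110 or b1 >> 6 != 0b10 or b2 >> 6 != 0b10:
        raise ValueError("not a three-byte UTF-8 sequence")
    return ((b0 & 0x0F) << 12) | ((b1 & 0x3F) << 6) | (b2 & 0x3F)


value = decode_three_bytes(bytes([0xE2, 0x82, 0xAC]))
print(hex(value))  # 0x20ac
```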
16. Four bytes
4 bytes : 65,536-1,114,111: 3 + 6 + 6 + 6 = 21 bits
1 1 1 1 0 x x x : 3 bits : 0-7 : 11110xxx = 240-247
1 0 x x x x x x : 6 bits: 0-63 : 10xxxxxx = 128-191
1 0 x x x x x x : 6 bits: 0-63 : 10xxxxxx = 128-191
1 0 x x x x x x : 6 bits: 0-63 : 10xxxxxx = 128-191
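The four layouts can be combined into one encoder that picks the length from the value's range and fills in the payload bits six at a time from the right. This is a sketch for illustration, checked here against Python's built-in encoder; it does not reject the surrogate range D800h to DFFFh as a strict encoder would, and `utf8_encode` is a made-up name:

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one value (0 to 10FFFFh) using the one- to four-byte layouts."""
    if code_point < 0 or code_point > 0x10FFFF:
        raise ValueError("outside the range 0 to 1,114,111")
    if code_point <= 0x7F:
        return bytes([code_point])   # 0xxxxxxx
    if code_point <= 0x7FF:
        prefix, n_cont = 0xC0, 1     # 110xxxxx + 1 continuation byte
    elif code_point <= 0xFFFF:
        prefix, n_cont = 0xE0, 2     # 1110xxxx + 2 continuation bytes
    else:
        prefix, n_cont = 0xF0, 3     # 11110xxx + 3 continuation bytes
    tail = []
    for _ in range(n_cont):
        tail.append(0x80 | (code_point & 0x3F))  # low 6 bits -> 10xxxxxx
        code_point >>= 6
    return bytes([prefix | code_point] + tail[::-1])


for cp in (0x41, 0xE9, 0x20AC, 0x1F600):
    print(hex(cp), utf8_encode(cp).hex(), utf8_encode(cp) == chr(cp).encode("utf-8"))
```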