STIG AS Level ASCII and Unicode

Last updated 6 months ago
6 questions
2
ASCII and Unicode are examples of _______ _______ which are used as an encoding _______
2
ASCII has _______ bits which can represent _______ characters
1

The character 'A' is represented by the binary value 1000001. What would 'D' be?

1
Extended ASCII uses _______ bits so it extends original ASCII to represent _______ characters. The first _______ characters are the same as original ASCII.
1
Universal character encoding standard is also known as _______ that aims to represent all _______ from all _______ systems.
1

Drag the correct output to the correct character set

ASCII
Extended ASCII
Unicode
ش
A
123
À
ü
𪡊
÷
😀

Introduction to Character Representation

In computing, characters are represented internally as binary data. The specific binary representation depends on the character set being used. In this document, we'll explore three major character sets: ASCII, Extended ASCII, and Unicode.

ASCII (American Standard Code for Information Interchange)

  • ASCII is a 7-bit character encoding standard that represents 128 characters.
  • It includes uppercase and lowercase English letters, digits, punctuation symbols, and control characters.
  • Example: The character 'A' is represented by the binary value 1000001 (decimal 65).
Decimal  Binary   Character
-------  -------  ---------
65       1000001  A
66       1000010  B
67       1000011  C
...      ...      ...
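
The mapping between a character, its decimal code, and its 7-bit binary pattern can be checked with a few lines of Python. This is a minimal sketch, not part of the original notes; the helper name `ascii_row` is hypothetical and chosen only for illustration.

```python
def ascii_row(ch):
    """Return (decimal code, 7-bit binary string) for one ASCII character.

    ascii_row is a hypothetical helper name used only for this sketch.
    """
    code = ord(ch)                      # numeric code of the character
    if code > 127:                      # 7 bits can only hold values 0..127
        raise ValueError("not an ASCII character")
    return code, format(code, "07b")    # zero-padded to 7 binary digits


# Rebuild the rows of the table for 'A', 'B' and 'C'.
for ch in "ABC":
    code, bits = ascii_row(ch)
    print(f"{code:<8} {bits:<8} {ch}")
```

Because consecutive letters have consecutive codes, the binary value of any other letter can be worked out by counting on from 'A'.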

Extended ASCII

  • Extended ASCII is an 8-bit character encoding that extends the original ASCII to represent 256 characters.
  • It includes additional characters like accented letters, symbols, and box-drawing characters.
  • The first 128 characters are the same as standard ASCII.
  • However, the characters in the extended ASCII range (128-255) are not standardized and may vary between different code pages.
  • Example: The character 'ç' (c with cedilla) is represented by the binary value 10000111 (decimal 135) in code page 437 (the original IBM PC character set), but by 11100111 (decimal 231) in the Windows-1252 code page.
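
To see how the same character can take different byte values in different code pages, the sketch below encodes 'ç' with two of Python's built-in codecs, `cp437` and `cp1252`. The helper name `byte_value` is an assumption made for illustration, and the two code pages are just examples of the many 8-bit variants.

```python
import unicodedata


def byte_value(ch, code_page):
    """Return (decimal, 8-bit binary string) of ch in an 8-bit code page.

    byte_value is a hypothetical helper name used only for this sketch.
    """
    encoded = ch.encode(code_page)      # one byte in a single-byte code page
    return encoded[0], format(encoded[0], "08b")


c_cedilla = "\u00e7"                    # the character 'ç'

print(byte_value(c_cedilla, "cp437"))   # IBM PC code page 437
print(byte_value(c_cedilla, "cp1252"))  # Windows-1252

# The same byte value read back under a different code page
# gives a different character.
other = bytes([135]).decode("cp1252")
print(unicodedata.name(other))
```

This is why a file saved under one code page can show the wrong symbols when opened under another: the bytes are unchanged, but the table used to interpret values 128 to 255 is different.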

Unicode

  • Unicode is a universal character encoding standard that aims to represent all characters from all writing systems.
  • It supports over 143,000 characters (143,859 as of Unicode 13.0, with later versions adding more) across multiple languages and scripts, including characters with diacritics like 'á'.
  • Unicode has several encoding forms, including UTF-8, UTF-16, and UTF-32.
  • UTF-8 is the most common encoding on the web. It uses 1 to 4 bytes to represent a character.
  • Example: The character 'á' (a with acute accent) has the code point U+00E1, which UTF-8 encodes as the two bytes 11000011 10100001.
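
The difference between a code point and its encoded bytes can be explored with Python's built-in `str.encode`. This is a minimal sketch: the helper name `utf8_bits` and the choice of sample characters (including the euro sign, which does not appear in these notes) are assumptions made for illustration.

```python
def utf8_bits(ch):
    """Return the UTF-8 bytes of ch as space-separated 8-bit binary strings.

    utf8_bits is a hypothetical helper name used only for this sketch.
    """
    return " ".join(format(b, "08b") for b in ch.encode("utf-8"))


# Sample characters: 'A', 'á', the euro sign, and the grinning-face emoji.
samples = ["A", "\u00e1", "\u20ac", "\U0001F600"]

for ch in samples:
    utf8_len = len(ch.encode("utf-8"))       # 1 to 4 bytes
    utf16_len = len(ch.encode("utf-16-le"))  # 2 or 4 bytes
    utf32_len = len(ch.encode("utf-32-le"))  # always 4 bytes
    print(f"U+{ord(ch):04X}  UTF-8: {utf8_len}  UTF-16: {utf16_len}  "
          f"UTF-32: {utf32_len}  bits: {utf8_bits(ch)}")
```

The code point stays the same whichever encoding form is used; only the number and pattern of bytes change.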

Conclusion

Understanding character representation is crucial for working with text data in computing. While you're not expected to memorize specific character codes, being familiar with ASCII, extended ASCII, and Unicode will help you effectively handle and manipulate character data in various contexts.