1.1.1.3 Strings and Encoding

How does a computer store a sentence?

A computer does not understand characters and symbols directly. Instead, each symbol is assigned a unique numerical value.
The computer stores the numerical values associated with the symbols.
To us a sentence is a sequence of characters, but to the computer it is a sequence of numerical values.
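
As a small illustration of the idea above, the following Python sketch turns a short sentence into its numerical values and back. The variable names (`sentence`, `codes`, `restored`) are chosen only for this example; `ord` and `chr` are Python built-ins.

```python
# A sentence as we see it: a sequence of characters.
sentence = "Hi A"

# The same sentence as a sequence of numerical values.
# ord() returns the number assigned to a single character.
codes = [ord(ch) for ch in sentence]
print(codes)  # [72, 105, 32, 65]

# chr() goes the other way: number -> character.
restored = "".join(chr(n) for n in codes)
print(restored)  # Hi A
```

Note that even the space has its own number (32).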

The major encoding strategies are as follows:


  • ASCII
    • Very simple.
    • Limited number of symbols (128 codes, 7 bits each).
    • For example, 65 for A, 66 for B, and so on.
    • Sample ASCII table is here  
  • Unicode
    • UTF-32
      • Uses 4 bytes for every character/symbol.
      • Wastes space when the text consists mostly of common characters.
    • UTF-16
      • Uses 2 bytes for frequently used characters and 4 bytes for the remaining ones.
      • More economical than UTF-32.
    • UTF-8
      • Uses 1 to 4 bytes per character; the most frequently used characters (the ASCII range) take 1 byte.
      • A bit more difficult to process because the size of each character varies.
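
The size differences listed above can be checked with a short Python sketch. The helper name `encoded_sizes` and the sample characters are assumptions made for this illustration. The `-le` codec variants are used so that Python does not prepend a byte-order mark, which would otherwise inflate the UTF-16 and UTF-32 counts.

```python
# Sample characters, written as escapes so the file itself stays ASCII:
# "A", e-acute (U+00E9), euro sign (U+20AC), an emoji (U+1F600).
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

def encoded_sizes(ch):
    """Return the number of bytes one character occupies in each encoding."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-le")),   # little-endian, no BOM
        "utf-32": len(ch.encode("utf-32-le")),   # little-endian, no BOM
    }

for ch in samples:
    print(hex(ord(ch)), encoded_sizes(ch))

# ASCII covers only codes 0 to 127, so anything beyond that cannot be encoded.
try:
    "\u00e9".encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print(ascii_ok)  # False
```

For these four samples, UTF-8 takes 1, 2, 3 and 4 bytes respectively, UTF-16 takes 2, 2, 2 and 4 bytes, and UTF-32 always takes 4 bytes.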

2.3.2 Essential & Non-essential data