BFOIT - Introduction to Computer Programming

Cryptography - BFOIT Saturday - October 13, 2007

Caesar's Cipher

The idea for this lesson came from the book:
In Code, Sarah Flannery with David Flannery.

Introduction

Caesar's cipher is a simple substitution algorithm where ciphertext characters are substituted for plaintext characters. In Caesar's Cipher, a ciphertext character is just the plaintext character shifted through the alphabet by an offset number.

Example (with shift offset of 5):

ciphertext F G H I J K L M N O P Q R S T U V W X Y Z A B C D E
plaintext A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

So, what do we need to know to write a program that encrypts text in this manner?

ASCII Characters

As I talk about in the very first lesson in BFOIT's curriculum, Programming - What Is It, everything inside a computer is represented as numbers.  For symbols, characters in this case, one standard representation is the ASCII character set.  Here is a sample from it:

ASCII
Character
in
Octal
in
Decimal
ASCII
Character
in
Octal
in
Decimal
ASCII
Character
in
Octal
in
Decimal
ASCII
Character
in
Octal
in
Decimal
space
040
32
0
060
48
A
101
65
a
141
97
!
041
33
1
061
49
B
102
66
b
142
98
(
050
40
2
062
50
C
103
67
c
143
99
)
051
41
3
063
51
D
104
68
d
144
100

Logo Primitives ASCII and CHAR

So, what do we need to encrypt and decrypt characters, to play with them mathematical?

If you try to add a number to a character in Logo, you get an error.  If you type in the instruction 'SUM "A 5', you get ERROR: Can't convert 'A' to a number.

We need to be able to convert a character into a number and then, given some number, convert it into a character.  Luckily, even though Logo will not do this conversion automatically, Logo does have two primitive operators that do exactly what we want, ASCII and CHAR.

jLogo Character Procedures
Name Input(s) Description
ASCII character Outputs the integer (between 0 and 127) that represents the character in the ASCII code.
CHAR number outputs a single character word that is the ASCII code corresponding to the input (an integer between 0 and 127).
    ? println ascii "A
    65
    ? println ascii "Z
    90
    ? println char 67
    C
    ? println char 97
    a
    ?

Now, back to the math stuff we need.

    ? println char sum (ascii "A) 10
    K
    ? println difference (ascii "E) (ascii "A)
    4
    ? println difference (ascii "W) (ascii "A)
    22

So now we know how to convert the letters of the alphabet (characters in Logo) to numbers.  From this point on we will talk about characters and character indices.  A character index will have the association: A=0, B=1, C=2, etc..., X=23, Y=24, and Z=25.  Here is an operator that produces a character index, given an uppercase character as its input.

    ;output the index of the uppercase character
    ;       A=0, B=1, C=2, ...  X=23, Y=24, Z=25
    to chIdx :uppercaseChar
      output difference (ascii :uppercaseChar) (ascii "A)
      end

It is very important that we start with A=0 and not A=1.  Why?  Think about it for a while...  Then continue reading.

Modulo Math

As with the QUOTIENT operator, given one number (the dividend) to be divided by another (the divisor), the modulo operator produces what's left over after the division.  For example, when 5 is divided by 2, the remainder is 1, when 8 is divided by 4, the remainder is 0.  In Logo, the modulo operator is the REMAINDER operator.

    ? println remainder 5 2
    1
    ? println remainder 8 4
    0
    ? println remainder 12345 100
    45
    ?
                        

So why is the modulo operator important in cryptology?

Think about what happens when we try to shift the last characters.  Take a shift amount of 5.  Five characters after A in the alphabet is F.  Five characters after M is R.  But... five characters after Y?  Ah oh...  We have to wrap-around!  Five characters after Y is D where A follows Z.

This is where the modulo operator comes in; it does just what we need it to do.  Given any character index and any shift amount, when they are added together and the result divided by the size of our alphabet, the remainder of this division is a legal character index.

If we supply the remainder operator with a character index as its first input and the alphabet size (26 in our case) as the second input, its output is the encrypted character index, with the wrap around covered!

Think about a very simple case, a shift amount of one (1).  For any character up to, but not including, Z, when you add one to the character's index and divide by twenty six (26) you get the correct encrypted character.  When you have a Z and you add one (1) to its character index (25), you get 26; when this is divided by 26 (the alphabet size), you get zero.  And zero is A, which is the correct answer for Z shifted by one.  So, let's look at the code to encrypt Logo sentences.

Caesar's Cipher Logo Code

    ;number of characters in the encrypted text alphabet
    to alphabetSize
      output difference (sum (ascii "Z) 1) ascii "A
      end

    ;output the index of the uppercase character
    ;       A=0, B=1, C=2, ...  X=23, Y=24, Z=25
    to chIdx :uppercaseChar
      output difference (ascii :uppercaseChar) (ascii "A)
      end

    ;amount (modula alphabetSize) a character is shifted
    to shiftAmount
      output 5
      end

    ;output the uppercase equivalent of :character, if any
    ;       otherwise :character itself
    to toUpperCase :character
      if less? (ascii :character) (ascii "a) [output :character]
      if greater? (ascii :character) (ascii "z) [output :character]
      output char difference (ascii :character) (difference (ascii "a) (ascii "A))
      end

    ;output the encrypted representation of a character
    ;the character is shifted by shiftAmount
    to encryptChar :character
      make "character toUpperCase :character
      if less? (ascii :character) (ascii "A) [output :character]
      if greater? (ascii :character) (ascii "Z) [output :character]
      output char sum (ascii "A) (remainder (sum (chIdx :character) shiftAmount) alphabetSize)
      end

    ;output the encrypted representation of a word
    to encryptWord :wd
      if empty? :wd [output " ]
      output word (encryptChar first :wd) (encryptWord butfirst :wd)
      end

    to encryptSentence :sent
      if empty? :sent [output [] ]
      output sentence (encryptWord first :sent) (encryptSentence butfirst :sent)
      end
                        

Your Turn - Decryption

Extend what I've given you; write code to decrypt a Logo sentence.  Write the procedures decryptChar, decryptWord, and decryptSentence which mirror their encrypt counterparts.

More Advanced Ciphers

So what's the problem with Caesar's Cipher?

The problem is that there are only twenty six alternatives to arriving at a solution.  But, the same technique can be used with a greater number of possible cases.  What if instead of encrypting only uppercase characters, we encrypted uppercase, lowercase, digits, and special symbols?  This increases the number of possible substitutions.

And then there is the possibility of encrypting pairs of characters.  This really expands the possibilities. 

Internet links to checkout:

  1. http://en.wikipedia.org/wiki/Ascii
  2. http://en.wikipedia.org/wiki/Caesar_cipher
  3. http://en.wikipedia.org/wiki/Modulo_operation
  4. http://en.wikipedia.org/wiki/Sarah_Flannery
  5. http://secretcodebreaker.com/caesar.html

Further Reading:

  1. In Code, Sarah Flannery with David Flannery, Workman Publishing, 2001.
  2. The Code Book, Simon Singh, Anchor Books, 1999.

Public Domain Mark
This work (BFOIT: Introduction to Computer Programming, by Guy M. Haas),
identified by Berkeley Foundation for Opportunities in IT (BFOIT),
is free of known copyright restrictions.