Adventures in Cryptography with Python – Caesar Cipher

 

If you have ever been interested in cryptography or started to learn about it then there is a high probability that you would have come across Caesar Cipher. In case you haven’t, here’s how wikipedia explains it

In cryptography, a Caesar cipher, also known as Caesar’s cipher, the shift cipherCaesar’s code or Caesar shift, is one of the simplest and most widely known encryption techniques. It is a type of substitution cipher in which each letter in the plaintext is replaced by a letter some fixed number of positions down the alphabet. For example, with a left shift of 3, D would be replaced by AE would become B, and so on. The method is named after Julius Caesar, who used it in his private correspondence.

In this post and further posts this series we would implement these encryption and decryption in Python.

Note – You would need Python installed on your system to run the code in this post

 

Caesar Cipher Encoder

This part of the algorithm is fairly straight forward. The code below asks the plain text that you like to encrypt using the Caesar Cipher and the number of alphabets that you would like to be shifted.

Then it calculates the number of characters in the input string as this will be used for the number of iterations that need to be done to encrypt everything (one character at a time)

During the iteration it uses the alphabets string as its base for shifting the characters skipping the characters that are not part of this base string to keep the structure of the input intact

Finally it outputs the ciphertext for you to use.

 

alphabets = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

input_str = raw_input("Enter message that you would like to encrypt using caesar cipher: ")
shift = int(raw_input("Enter a shift value:"))

no_of_itr = len(input_str)
output_str = ""

for i in range(no_of_itr):
    current = input_str[i]
    loc = alphabets.find(current)
    if location < 0:
        output_str += input_str[i]
    else:
        newLocation = (location + shift)%26
        output_str += alphabets[newLoc]

print "ciphertext: ", output_str

 

I am assuming that you have either already have python installed on your system or you would do it now. To encode your message using Caesar Cipher save the above code in a .py file and then int the command prompt navigate the folder where the file saved and run the command similar to below:

 

D:\>python caesar_encoder.py
Image showing sample run of Caesar Cipher Encoder

Image showing sample run of Caesar Cipher Encoder

Caesar Cipher Decoder

Just like hiding something is simple and finding it is difficult. Similarly encoding is much simpler than decoding it. Now assume that you encoded your message using Caesar Cipher but then you forgot how many character you actually shifted. In this solution we would go through creating a solution for deciphering the ciphertext for all the possible shifts and then using the Oxford dictionary API to check which of these output actually makes any sense.

The initial section of the code is straightforward. We would ask for the ciphertext and your guess of what the shift could be

alphabets = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

input_str = raw_input("Enter cipher to decode: ")
shift = int(raw_input("Enter your guess of shift: "))

n = len(input_str)
output_str = ""

 

Then we would try to decode the ciphertext using your guess and see how good bad it was

 

for k in range(n):
        c = input_str[k]
        location = alphabets.find(c)
        if location < 0:
            output_str += input_str[k]
        else:
            new_location = (location - shift)%26
            output_str += alphabets[new_location]
data = output_str.split(" ", 20)
base_output_str = output_str
base_score = getScore(data[:20])

 

Hers’s the code for the getScore function. This calculates the scores of the decoded string based on the first 20 words by reaching out to the Oxford dictionary and checking if the decoded string is actually a valid word.

 

# This function gets the meaning score of the potential plain text
def getScore(potential_plain_text):
    current_score = 0.0
    for temp in potential_plain_text:
        #Limiting the match of words with 4 alphabets or more makes it less likely for false positive
        if len(temp) > 3:
            url = base_url + language + query_divider + temp.lower() + prefix_param + limit_param
            r = requests.get(url, headers = {'app_id': app_id, 'app_key': app_key})
            json_data = json.loads(r.text)
            if json_data['metadata']['total'] > 0 and json_data['results'][0]['score'] > 0:
                current_score += json_data['results'][0]['score']*json_data['metadata']['total']
    return current_score

 

Now we loop through all the possible options of character shift and then sort the results based on score

for i in range(26):
    output_str = ""

    for j in range(n):
        c = input_str[j]
        location = alphabets.find(c)
        if location < 0:
            output_str += input_str[j]
        else:
            new_location = (location - i)%26
            output_str += alphabets[new_location]
    data = output_str.split(" ", 20)
    score = int(getScore(data[:20]))
    results.append(Result(score,i,output_str))

results.sort(key=operator.attrgetter('score'))
results.reverse()

print "Plain text from your guess: ", base_output_str, " - with shift: ", shift, " - the score is", base_score, "\n"

print "Below are the potential plain text sorted by the probability of being correct in descending order!"

for t in results:
    print "Potential plain text: ", t.plain_text, " - with shift: ", t.shift, " - the score is", t.score

 

Here’s a sample run

C:\>python caesar_decoder.py
Enter cipher to decode: WKLV LV PB PHVVDJH
Enter your guess of shift: 3

 

And here’s the a sample output

Image showing sample run of Caesar Cipher Decoder

Image showing sample run of Caesar Cipher Decoder

 

The entire source code for this post can be found at https://github.com/abhishuk85/cryptography-plays

Any questions, comments or feedback are most welcome.