Friday, March 24, 2023

Fantastical F-Strings

f-strings are pretty amazing. They're not just for making your print statements easier on the eyes; they can actually do quite a bit more. Let's start off with some basics first.

Given this bit of code, we'll go through the variations of print statement methods to insert variables.

from datetime import date

age = int((date.today() - date(1949, 5, 9)).days / 365.2425)
dob = "May 9th, 1949"
first = "Billy"
last = "Joel"

 

The first is string concatenation which provided plenty of opportunity to fat finger a quotation mark or a plus symbol.

print(last + ", " + first + ", " + str(age) + " years old" + " - Born on: " + dob)

 

Then there is the format() method, which made it easier, but you had to remember to keep your arguments in the right order at the end to fill the placeholders marked with {}.

print("{}, {}, {} years old - Born on: {}".format(last, first, age, dob))

 

Finally, there are f-strings, which make it a whole lot easier on the eyes.

print(f"{last}, {first}, {age} years old - Born on: {dob}")

 

Each of the above three print statements will provide an identical output. 

Output:

Joel, Billy, 73 years old - Born on: May 9th, 1949

 

You can even do multi-line f-strings like this, but the output will look different (it starts with a blank line because the string opens with a newline):

print(f"""
Name: {last}, {first}
Age: {age} years old
Born on: {dob}""")

Output:

Name: Joel, Billy
Age: 73 years old
Born on: May 9th, 1949

 

One really neat trick is to use f-strings to help you debug. By placing an equals symbol after the variable name, it will print out the variable name and its value.

print(f"{last=}, {first=}, {age=}, {dob=}")

Output:

last='Joel', first='Billy', age=73, dob='May 9th, 1949'

 

Let's take another example to show you how f-strings can help you debug.

num1 = 5
num2 = 3
print(f"{((num1 * num2) / num2)=}")

Output:

((num1 * num2) / num2)=5.0

 

Notice how it outputs the formula as well as the answer. One thing I wish it would do better is replace the inner variables with the values e.g.

((5 * 3) / 3)=5.0

So how useful is this really? I see it as more of a shortcut when you have a lot of variables you want to print out or the variable names are long e.g.

print(f'this_is_a_long_variable_name = {this_is_a_long_variable_name}')

can be shortened to:

print(f'{this_is_a_long_variable_name=}') 
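
The '=' trick (available from Python 3.8) can also be combined with a format spec or a conversion. Here's a small sketch; the variables ratio and name are made up purely for illustration.

```python
# Hypothetical values, used only to illustrate the '=' specifier (Python 3.8+).
ratio = 2 / 3
name = "Billy"

# '=' goes right after the expression and before any ':' format spec.
rounded = f"{ratio=:.3f}"

# Spaces around '=' are preserved in the output.
spaced = f"{ratio = :.1%}"

# Without a format spec, '=' shows the repr(); '!s' switches to str().
as_repr = f"{name=}"
as_str = f"{name=!s}"

print(rounded)  # ratio=0.667
print(spaced)   # ratio = 66.7%
print(as_repr)  # name='Billy'
print(as_str)   # name=Billy
```

Note the quotes around 'Billy' in the third line: by default the '=' form uses repr(), which is usually what you want when debugging.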

Let's look at some other great things you can do with f-strings. If you place a ':' after your variable name, you can add various formatting modifiers to control how you want to format the variable. In the example below, we use .2f. The 'f' means to output a float and the '.2' says we want two decimal places.

num = 3.14159
print(f"Num rounded to 2 decimal places = {num:.2f}")

Output:

Num rounded to 2 decimal places = 3.14

 

We can use a similar modifier on strings when we need to truncate the length. The '.15' modifier says to print only the first 15 characters of the string. This might be useful if you are printing output in columns and don't want a long string throwing off your columns.

myString = "Hello, welcome to f-strings"
print(f"{myString:.15}")

Output:

Hello, welcome

 

When you're dealing with large numbers, you can use f-strings to use separators to separate thousands, millions, billions etc.

big_num = 1234567890
print(f"{big_num:_}")
print(f"{big_num:,}")

Output:

1_234_567_890
1,234,567,890

 

What if you want spaces as separators? Well, you can't do it directly, but using the replace() method you can make it happen.

big_num = 1234567890
print(f"{big_num:,}".replace(',', ' '))

Output:

1 234 567 890

 

Further, you can combine modifiers such as the separator and the number of decimal places in a float:

big_float = 7654321.1234567
print(f"{big_float:,.3f}")

Output:

7,654,321.123

 

You can use the '%' modifier to display percentages, and the '.2' in front of it says you want two decimal places, as the code below demonstrates:

numerator = 22
denominator = 53
percentage = numerator / denominator
print(percentage)
print(f"Percentage: {percentage:.2%}")

Output:

0.41509433962264153
Percentage: 41.51%

 

Additionally, you can display scientific notation just as easily:

num = 1234567.7654321
print(f"{num:e}")
print(f"{num:E}")
print(f"{num:.2e}")
print(f"{num:.4E}")

Output:

1.234568e+06
1.234568E+06
1.23e+06
1.2346E+06

 

Pretty cool huh? But wait, there's more. How about a nice easy way to format the datetime.

import datetime
print(f'{datetime.datetime.now():%Y-%m-%d %H:%M:%S}')

Output:

2023-03-23 19:21:26
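
Everything after the ':' is handed to the datetime object's own formatting, so any strftime directive should work there. A quick sketch with a fixed timestamp (borrowed from the sample output above, so the results are repeatable); the variable names are just for illustration.

```python
from datetime import datetime

# A fixed timestamp keeps the results repeatable.
moment = datetime(2023, 3, 23, 19, 21, 26)

iso_like = f"{moment:%Y-%m-%d %H:%M:%S}"
date_only = f"{moment:%d/%m/%Y}"
time_only = f"{moment:%H:%M}"
day_of_year = f"{moment:%j}"

print(iso_like)     # 2023-03-23 19:21:26
print(date_only)    # 23/03/2023
print(time_only)    # 19:21
print(day_of_year)  # 082
```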

 

Let's cover three special modifiers, !s, !r, !a which are string, repr and ascii respectively. In the below example, we want to convert the emoji variable to its ascii unicode and the unicode variable to its string.

emoji = '😝'
unicode = '\U0001f61d'
print(f'{emoji!a}')
print(f'{unicode!s}')

Output:

'\U0001f61d'
😝

 

Now let's quickly look at the !r (repr). For the last two lines I've used the traditional repr() function and the !r to show they are indeed the same.

import datetime
today = datetime.datetime.now()
print(today)
print(repr(today))
print(f"{today!r}")

Output:

2023-03-23 19:23:24.729137
datetime.datetime(2023, 3, 23, 19, 23, 24, 729137)
datetime.datetime(2023, 3, 23, 19, 23, 24, 729137)

 

If you've ever needed to show whether a number was positive or negative, you can use the '+' modifier for that.

numbers = [1,-5,4]
for number in numbers:
    print(f'num: {number:+}')

Output:

num: +1
num: -5
num: +4

 

Now let's look at one of my favorite things with f-strings... Padding. A lot of my output is data which I like to put into columns. f-strings make this pretty easy. Let's break down the example below. The '11' is how many characters wide I want the column to be. The '>', '<', and '^' determine whether I want right justified, left justified or centered text respectively. I also added the * character as padding filler and a line of digits at the top so it's more clear what's happening. You can replace the * with pretty much any character as you'll see later.

# Padding
num = 12345
print("12345678901")
print(f"{num:*>11}")
print(f"{num:*<11}")
print(f"{num:*^11}")

Output:

12345678901
******12345
12345******
***12345***

 

So how can we use this to output data in columns? Glad you asked. Let's take a look at this code. Below, we've defined four variables with increasing character counts. Then we're going to print each variable out on a separate line using the linefeed '\n', and finally we're going to specify a column width of 15 characters. Granted, we're still working with a single column here (we'll get to multiple columns in a bit), but I wanted to show you how you can align text within a column. In the first print we're left justifying, in the second centering, and in the last right justifying.

a = "1"
b = "12"
c = "123"
d = "1234"
print(f"{a:<15}\n{b:<15}\n{c:<15}\n{d:<15}\n")
print(f"{a:^15}\n{b:^15}\n{c:^15}\n{d:^15}\n")
print(f"{a:>15}\n{b:>15}\n{c:>15}\n{d:>15}\n")

Output:

1
12
123
1234

       1
      12
      123
     1234

              1
             12
            123
           1234

 

One problem with the above code is that we've hard coded our column width at 15. We can fix that with a slight modification by adding a variable called cw which will be our column width.

a = "1"
b = "12"
c = "123"
d = "1234"
cw = 15
print(f"{a:<{cw}}\n{b:<{cw}}\n{c:<{cw}}\n{d:<{cw}}\n")
print(f"{a:^{cw}}\n{b:^{cw}}\n{c:^{cw}}\n{d:^{cw}}\n")
print(f"{a:>{cw}}\n{b:>{cw}}\n{c:>{cw}}\n{d:>{cw}}\n")

 

So now, if we want to play with our column widths, we only need to change the cw variable once.
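
Taking that one step further, cw doesn't have to be a constant at all. Here's a minimal sketch that derives it from the data; the items list and the two extra characters of padding are my own assumptions for the example.

```python
# Hypothetical sample data; any list of strings would work the same way.
items = ["1", "12", "123", "1234"]

# Derive the column width from the longest item plus two characters of breathing room.
cw = max(len(item) for item in items) + 2

# Brackets are added only to make the column edges visible.
rows = [f"[{item:>{cw}}]" for item in items]
for row in rows:
    print(row)
```

With this data cw works out to 6, so every row is eight characters wide including the brackets.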

Time to play with multiple columns. Take a look at the code below. We've defined a list of tuples containing the line number, the item purchased, and the price. We iterate through these and use an f-string to print out the line number as a double digit padded with a zero, the item in a left justified 20 character column padded with periods, and then the cost in a right justified 5 character column as a float with two decimal places.

bill = [(1, "Coffee", 2.15), (2, "Sandwich", 11.20), (3, "Juice", 1.45)]
for n, item, cost in bill:
    print(f"{n:02}. {item:.<20} ${cost:>5.2f}")

Output:

01. Coffee.............. $ 2.15
02. Sandwich............ $11.20
03. Juice............... $ 1.45

 

You can see we have our line number preceded by a zero, then our item, some eye appealing periods and then our price nicely right justified. One thing you may notice is that on lines 1 and 3 there is whitespace between the '$' and the price. This may or may not bother you, but let's explore how to keep the '$' with the price. To do this, we're going to nest f-strings. I wouldn't recommend doing this a lot because it can make your f-string, which was designed to be readable, a little less readable.

If we change our print statement to include a nested f-string, then we can move the dollar sign next to the price:

print(f"{n:02}. {item:.<20} {f'${cost:.2f}':>6}")

 

A couple of things to point out. The nested f-string fixes the space between the '$' and the price, but we've had to increase our column width to six to accommodate the fact that the dollar sign is now part of the price (one string). You'll see what I mean if you change the six back to a five in the above print statement and count the characters.

Output:

01. Coffee..............  $2.15
02. Sandwich............ $11.20
03. Juice...............  $1.45
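
If the nesting bothers you, one alternative (my own variation, not a different feature) is to build the price string first and then pad it. A minimal sketch using the same bill data:

```python
bill = [(1, "Coffee", 2.15), (2, "Sandwich", 11.20), (3, "Juice", 1.45)]

lines = []
for n, item, cost in bill:
    # Build the "$x.xx" string first, then right-justify it as a single unit.
    price = f"${cost:.2f}"
    lines.append(f"{n:02}. {item:.<20} {price:>6}")

print("\n".join(lines))
```

It costs one extra line per row, but each f-string stays easy to read.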

 

We are nearing the end of this article, but we definitely can't end this without showing how f-strings can make base conversions a walk in the park.

dec = 245
print(f"{dec:b}") # binary
print(f"{dec:o}") # octal
print(f"{dec:x}") # lower hex
print(f"{dec:X}") # upper hex
print(f"{dec:c}") # character

 

The trick to keeping this easy is that all your conversions using f-strings should be based on your input being an integer. Provided you do this, converting to binary, octal, lower case hex, upper case hex and the matching character is child's play. (Strictly speaking, 245 is outside the 7-bit ASCII range; 'c' gives you the Unicode character for that number.)

Output:

11110101
365
f5
F5
õ

 

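Sadly, the '=' debug trick doesn't help us tell which conversion is which. It does work with a format spec as long as the '=' comes before the colon, as in {dec=:b}, but it only echoes the variable name (dec=11110101), not the base. We'd have to go old school: print(f"Bin: {dec:b}")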

You can also use the modifiers we learned earlier and do something like this:

dec = 87
print(f"{dec:08b}") # pad with zeros to 8 characters
print(f"{dec:_b}") # group binary digits in fours, separated by _
print(f"{dec:03o}") # pad with zeros to 3 characters
print(f"{dec:02x}") # pad with zeros to 2 characters
print(f"{dec:02X}") # pad with zeros to 2 characters
print(f"{dec:#02x}") # add 0x; the width of 2 counts the prefix, so no zero padding happens
print(f"{dec:#02X}") # add 0X; the width of 2 counts the prefix, so no zero padding happens

Output:

01010111
101_0111
127
57
57
0x57
0X57
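
One gotcha with '#': the prefix counts toward the width. A short sketch to show the difference; the value 5 is just an example small enough to need padding.

```python
dec = 87

# With '#', the '0x' / '0b' prefix is part of the total width,
# so a width of 2 is already used up by the prefix alone.
loose = f"{5:#02x}"

# Two hex digits plus the two-character prefix need a width of 4.
padded = f"{5:#04x}"
hex_byte = f"{dec:#04x}"

# Eight binary digits plus the prefix need a width of 10.
bin_byte = f"{dec:#010b}"

print(loose)     # 0x5
print(padded)    # 0x05
print(hex_byte)  # 0x57
print(bin_byte)  # 0b01010111
```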

 

Here's a cute trick to generate an ASCII table.

bases = {
    "d": "dec",
    "b": "bin",
    "c": "chr",
    "x": "hex",
    "X": "HEX",
    "o": "oct",
}

cw = 8

# Header row
for k, v in bases.items():
    print(f"{v:>{cw}}", end=' ')
print()

# One row per value, one column per base
for n in range(48, 97):
    for k, v in bases.items():
        print(f"{n:>{cw}{k}}", end=' ')
    print()

 

Note, there are a number of non-printing characters between 0 and 255 decimal which will mess up the formatting, which is why I limited the range on this. You could easily create a list of the decimal values that produce non-printing characters and skip over them in the outer for loop (the one over n).

Output:

dec bin chr hex HEX oct
48 110000 0 30 30 60
49 110001 1 31 31 61
50 110010 2 32 32 62
51 110011 3 33 33 63
52 110100 4 34 34 64
53 110101 5 35 35 65
54 110110 6 36 36 66
55 110111 7 37 37 67
56 111000 8 38 38 70
57 111001 9 39 39 71
58 111010 : 3a 3A 72
59 111011 ; 3b 3B 73
60 111100 < 3c 3C 74
61 111101 = 3d 3D 75
62 111110 > 3e 3E 76
63 111111 ? 3f 3F 77
64 1000000 @ 40 40 100
65 1000001 A 41 41 101
66 1000010 B 42 42 102
67 1000011 C 43 43 103
68 1000100 D 44 44 104
69 1000101 E 45 45 105
70 1000110 F 46 46 106
71 1000111 G 47 47 107
72 1001000 H 48 48 110
73 1001001 I 49 49 111
74 1001010 J 4a 4A 112
75 1001011 K 4b 4B 113
76 1001100 L 4c 4C 114
77 1001101 M 4d 4D 115
78 1001110 N 4e 4E 116
79 1001111 O 4f 4F 117
80 1010000 P 50 50 120
81 1010001 Q 51 51 121
82 1010010 R 52 52 122
83 1010011 S 53 53 123
84 1010100 T 54 54 124
85 1010101 U 55 55 125
86 1010110 V 56 56 126
87 1010111 W 57 57 127
88 1011000 X 58 58 130
89 1011001 Y 59 59 131
90 1011010 Z 5a 5A 132
91 1011011 [ 5b 5B 133
92 1011100 \ 5c 5C 134
93 1011101 ] 5d 5D 135
94 1011110 ^ 5e 5E 136
95 1011111 _ 5f 5F 137
96 1100000 ` 60 60 140


Ok, that wraps this article. I hope you learned something new. I know I did.

Wednesday, March 15, 2023

Base Conversions in Python

As a CyberSecurity Professional, there have been innumerable occasions where I needed to convert some obfuscated data into other things so that I could understand what some piece of malware, phishing email, or GET/POST request was doing.

Over the years, I've written many Python scripts to do various decoding of data and thought I'd share what I've learned.

You can find the code for the tutorial program below on my GitHub page.

There are many ways to do base conversions, but I've always found it simplest to translate it to decimal first using int() and then do a whole slew of conversions from there.

So let's say we have a piece of data in hex, we'd use the int() function to convert it first to decimal as so:

decimal = int(str_to_conv, 16)

or something in octal:

decimal = int(str_to_conv, 8)

Since we know base 10 is decimal, base 16 is hex, base 8 is octal, and base 2 is binary, using the int() function with what we want to convert and the argument being the base we're converting from makes conversions to decimal simple. So let's look at code for various bases.

# binary
decimal = int(str_to_conv, 2)
# character
decimal = ord(str_to_conv)
# hexadecimal
decimal = int(str_to_conv, 16)
# octal
decimal = int(str_to_conv, 8)
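
Data in the wild often arrives with a prefix like 0x attached. As a small sketch of how int() copes with that (the sample strings are made up):

```python
# int() tolerates the matching literal prefix when the base is given explicitly.
from_hex = int("0xFE", 16)
from_bin = int("0b1011", 2)

# Base 0 tells int() to infer the base from the prefix, as in Python source code.
auto_hex = int("0xFE", 0)
auto_oct = int("0o17", 0)
auto_bin = int("0b1011", 0)

# Underscores between digits and surrounding whitespace are accepted too.
spaced = int(" 1111_0000 ", 2)

print(from_hex, from_bin, auto_hex, auto_oct, auto_bin, spaced)
# 254 11 254 15 11 240
```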

There are a few other base conversions you may be aware of such as base16, base32, base64 and base85. These are a bit different, but we'll also cover these in this article.

Let's set up a little program that takes two inputs. Input one is what we're converting from and input two is the data we want to convert. So when we call the program we'll do it like: base_converter.py hex FE or base_converter.py bin 1011

Let's also make a function called convert() to do all our conversions that returns a dictionary containing all our conversions.

import sys

def convert(conv_type, str_to_conv):
    # Convert input(s) into decimal as a starting place for all encodings
    if conv_type == 'bin':
        decimal = int(str_to_conv, 2)
    elif conv_type == 'bcd':
        decimal = int(str_to_conv)
    elif conv_type == 'chr':
        decimal = ord(str_to_conv)
    elif conv_type == 'dec':
        decimal = int(str_to_conv)
    elif conv_type == 'hex':
        decimal = int(str_to_conv, 16)
    elif conv_type == 'oct':
        decimal = int(str_to_conv, 8)
    # Set up dict to track all conversions
    encodings = {"decimal": decimal}
    return encodings

encodings = convert(sys.argv[1], sys.argv[2])
print(encodings)
  • We're importing sys so that we can pull in the command line arguments. 
  • The convert function takes two arguments, which are our command line arguments.
  • The next section converts our str_to_conv to decimal based on conv_type.
  • Next we're creating our encodings dictionary, right now with just the decimal.

If you run the above code, with base_converter.py hex FE you should see a dict with a single element called decimal like: {'decimal': 254}
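
One thing the if/elif chain doesn't do is complain about an input type it doesn't know; decimal is simply never assigned. As a sketch of one way to tighten that up (the names BASES and to_decimal are my own, and I've left 'bcd' out to keep it short), a lookup table plus an explicit error could look like this:

```python
# Map each numeric input type to the base that int() should use.
# BASES and to_decimal are hypothetical names for this sketch.
BASES = {"bin": 2, "oct": 8, "dec": 10, "hex": 16}


def to_decimal(conv_type, str_to_conv):
    """Return the integer value of str_to_conv, interpreted as conv_type."""
    if conv_type == "chr":
        return ord(str_to_conv)
    if conv_type in BASES:
        return int(str_to_conv, BASES[conv_type])
    # Fail loudly instead of leaving the result undefined.
    raise ValueError(f"unknown conversion type: {conv_type!r}")


print(to_decimal("hex", "FE"))  # 254
```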

Now, let's add code to handle conversions to various bases.

import sys

def convert(conv_type, str_to_conv):
    # Convert input(s) into decimal as a starting place for all encodings
    if conv_type == 'bin':
        decimal = int(str_to_conv, 2)
    elif conv_type == 'bcd':
        decimal = int(str_to_conv)
    elif conv_type == 'chr':
        decimal = ord(str_to_conv)
    elif conv_type == 'dec':
        decimal = int(str_to_conv)
    elif conv_type == 'hex':
        decimal = int(str_to_conv, 16)
    elif conv_type == 'oct':
        decimal = int(str_to_conv, 8)
    # Set up dict to track all conversions
    encodings = {"decimal": decimal}

    # Convert to binary and inverse
    encodings["binary"] = format(encodings["decimal"], '08b')
    encodings["binary_inv"] = ''.join('1' if x == '0' else '0' for x in encodings["binary"])

    # Convert to decimal inverse
    encodings["decimal_inv"] = int(encodings["binary_inv"], 2)

    # Convert to hexadecimal and inverse
    encodings["hex"] = format(encodings["decimal"],'02x').upper()
    encodings["hex_inv"] = format(encodings["decimal_inv"],'02x').upper()

    # Convert to octal and inverse (three digits cover an eight-bit value)
    encodings["octal"] = format(encodings["decimal"],'03o').upper()
    encodings["octal_inv"] = format(encodings["decimal_inv"],'03o').upper()
    return encodings

encodings = convert(sys.argv[1], sys.argv[2])
print(encodings)

We first want to do the binary conversion because inverting a binary string is the easiest way to then convert that back to the decimal inverse which we can use for all other inverse conversions. Let's break down what's happening.

The format() function takes an input, like encodings["decimal"] and the base and format. Let's use the encodings["binary"] line as an example. We're passing it encodings["decimal"], the lower case 'b' is to convert to binary and the '08' part is to format it as an eight bit binary number. For example, let's say we called our program with the following arguments:

base_converter.py bin 1111

This is the binary equivalent of decimal 15, but we want to pad it with enough zeros to make it eight bits, so it becomes 00001111. Likewise, base_converter.py bin 1 would output 00000001 and base_converter.py bin 101 would output 00000101.

Let's look at how we invert the binary conversion next.

''.join('1' if x == '0' else '0' for x in encodings["binary"])

Here we're using a generator expression (a close cousin of the list comprehension) to iterate over each bit in the encodings["binary"] string, and if it's a 1, change it to a 0 and vice-versa. If you're not familiar with list comprehensions, check out this tutorial. The join function is a neat way to take a sequence of strings and put it back together as one string. In this case "".join() is joining the bits back together with no separator. But let's say you wanted to separate each bit with a dash, you'd use "-".join(), or if you wanted to separate them with a space - space, then you could do " - ".join().
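
If you'd rather stay in integer land, XOR-ing with a mask of all ones gives the same inverse. This is just an alternative sketch, and like the string version it assumes the value fits in eight bits:

```python
value = 15  # 00001111 as an eight-bit number

# Approach from the text: flip each character of the binary string.
binary = format(value, "08b")
flipped = "".join("1" if bit == "0" else "0" for bit in binary)

# Alternative: XOR with an all-ones mask of the same width (0xFF for eight bits).
inverse = value ^ 0xFF

print(flipped)                 # 11110000
print(inverse)                 # 240
print(format(inverse, "08b"))  # 11110000
```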

Let's look at the decimal_inv line now. We're using the int() function to take the encodings["binary_inv"] and telling it to convert from base 2 (binary) into an int. This is identical to what we're doing at the top of the function to convert whatever input we received into a decimal as our starting point.

On the encodings["hex"] line we're converting to x (hexadecimal) with a format of 02. So if we called our program with something like base_converter.py dec 2, we will get the output of 02, or with base_converter.py dec 255, we'll get an output of ff. Notice the lower case, which is why we're using the string method .upper() to convert ff to FF. We'll skip over the octal line as it's pretty much the same as the hex line; the .upper() call there is harmless but not strictly needed because octal digits can only be 0-7 (no letters).

Alright, so if you run the code above, you'll see the output contains decimal, binary, binary_inv, decimal_inv, hex, hex_inv, octal and octal_inv. Now let's tackle some more difficult conversions with characters. The conversion itself is simple enough, not much different than what we've already done. The trickiness comes from the fact that there are a number of unprintable characters that will either display nothing, or worse, a line feed or backspace which will mess up your printed output.

Let's just look at the basic code of doing a char conversion and then add code to fix issues.

encodings["char"] = chr(encodings["decimal"])

Simple right? We're using the chr() function to return the character from the ordinal value represented by encodings["decimal"]. Now the messy part. Let's look at an ASCII chart. This one is nice because it shows you all the conversions we're doing here in an easy to read table. Let's assume you call the program with base_convert.py dec 10. Looking at the chart we see that is a line feed (like hitting enter, at least on *nix).

Let's change our code a bit on how we're printing out our encodings and add in the char encoding.

import sys

def convert(conv_type, str_to_conv):
    # Convert input(s) into decimal as a starting place for all encodings
    if conv_type == 'bin':
        decimal = int(str_to_conv, 2)
    elif conv_type == 'bcd':
        decimal = int(str_to_conv)
    elif conv_type == 'chr':
        decimal = ord(str_to_conv)
    elif conv_type == 'dec':
        decimal = int(str_to_conv)
    elif conv_type == 'hex':
        decimal = int(str_to_conv, 16)
    elif conv_type == 'oct':
        decimal = int(str_to_conv, 8)
    # Set up dict to track all conversions
    encodings = {"decimal": decimal}

    # Convert to binary and inverse
    encodings["binary"] = format(encodings["decimal"], '08b')
    encodings["binary_inv"] = ''.join('1' if x == '0' else '0' for x in encodings["binary"])

    # Convert to decimal inverse
    encodings["decimal_inv"] = int(encodings["binary_inv"], 2)

    # Convert to hexadecimal and inverse
    encodings["hex"] = format(encodings["decimal"],'02x').upper()
    encodings["hex_inv"] = format(encodings["decimal_inv"],'02x').upper()

    # Convert to octal and inverse
    encodings["octal"] = format(encodings["decimal"],'03o').upper()
    encodings["octal_inv"] = format(encodings["decimal_inv"],'03o').upper()

    # Convert to ASCII char
    encodings["char"] = chr(encodings["decimal"])

    return encodings

encodings = convert(sys.argv[1], sys.argv[2])

for k, v in encodings.items():
    print(f'{k.upper()}: {v}')

We've added the char encoding at the end of our function, and changed our print to iterate through our dictionary and print each key / value pair with pairs separated on each line. 

Go ahead and run it now with base_convert.py dec 33. Notice on the last line of the output, you have CHAR: !. Now run it with base_convert.py dec 10 and you'll notice that not only is there nothing next to CHAR:, but there's an extra blank line below CHAR:. So how do we fix this? Glad you asked. Let's take a look at the below snippet for the char conversion.

if encodings["decimal"] in range(33, 127) or encodings["decimal"] in range(161,256):
    encodings["char"] = chr(encodings["decimal"])

We're going to use an if statement to determine whether encodings["decimal"] is between decimal 33 (inclusive) and 127 (not inclusive) or that it's between 161 (inclusive) and 256 (not inclusive). Let's look back at the ASCII chart.

We're going to skip converting anything to char that is less than decimal 33 or greater than decimal 126 in the basic ASCII range, and in the extended ASCII range (128-255) we skip anything from 128 to 160 and include anything from 161 to 255. So now that we've excluded a bunch of stuff, it might be nice to show that we couldn't print something on the CHAR: line. How about we use 'xxx' to show we couldn't print that character? We can do that with a simple else statement.

if encodings["decimal"] in range(33, 127) or encodings["decimal"] in range(161,256):
    encodings["char"] = chr(encodings["decimal"])
else:
    encodings["char"] = 'xxx'

Now when we run our program with base_convert.py dec 7, we'll get CHAR: xxx in the output.

Great right? We've removed all the non-printable characters. Or did we? There's decimal 173, which produces an unprintable character called a soft hyphen, and it's kind of in the middle of our range(161, 256). We have a couple of ways to handle this. We can put in another if statement to take care of this one special case, or we can add an and condition to our original if statement alongside the or. Let's do the latter so it looks like this:

# Convert to ASCII char and inverse and replace unprintable chars
if (encodings["decimal"] in range(33, 127) or encodings["decimal"] in range(161,256)) and encodings["decimal"] != 173:
    encodings["char"] = chr(encodings["decimal"])
else:
    encodings["char"] = 'xxx'
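
As an aside, Python strings have an isprintable() method that can stand in for the hand-written ranges. This is a sketch of that alternative, not what the program above uses; the function name and the placeholder parameter are my own. For values 0-255 it should pick out the same characters, provided the space (decimal 32) is excluded explicitly.

```python
def printable_or_placeholder(code_point, placeholder="xxx"):
    """Return the character for code_point, or placeholder if it would not display."""
    char = chr(code_point)
    # str.isprintable() treats a plain space as printable, so exclude it
    # explicitly to match the hand-written ranges in the text.
    if char.isprintable() and char != " ":
        return char
    return placeholder


print(printable_or_placeholder(33))   # !
print(printable_or_placeholder(10))   # xxx
print(printable_or_placeholder(173))  # xxx
```

The upside is that there are no special cases like 173 to remember; the downside is that the rule is less visible to someone reading the code.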

Now that we have the char conversions taken care of, let's tackle Binary Coded Decimal (BCD). To refresh your memory, and mine, BCD encodes each decimal digit (0-9) as its own four-bit binary group. So, for the decimal 5, the BCD would be 0101, and 10 would be 0001 0000.

What has to happen here is this: let's say we run our program with base_converter.py dec 255. Since BCD is a binary representation of each digit, we need to break apart each digit (2, 5, 5), convert each digit to binary, and then put those binary conversions back together separated by a space. We're going to revisit our friends the comprehension and the join() function. Let's take a look at the code.

encodings["bcd"] = " ".join(format(int(x), '04b') for x in str(encodings["decimal"]))
encodings["bcd_inv"] = " ".join(format(int(x), '04b') for x in str(encodings["decimal_inv"]))

Let's break down the encodings["bcd"] line.

We already know that format(int(x), '04b') is going to take x and convert to binary with four places.

We know that " ".join() is going to take whatever is inside and put it back together separated by a space.  

And we're left with the comprehension part (strictly speaking a generator expression): do something for each digit in encodings["decimal"], which we need to convert to a string first because you can't iterate over an int.
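
Going the other way, from BCD text back to a number, isn't covered by the program (its 'bcd' input branch just hands the string to int()). Here's a rough sketch of what a decoder could look like; to_bcd and from_bcd are hypothetical helper names, and the input is assumed to be four-bit groups separated by spaces.

```python
def to_bcd(number):
    """Encode a non-negative integer as space-separated four-bit groups."""
    return " ".join(format(int(digit), "04b") for digit in str(number))


def from_bcd(bcd_text):
    """Decode space-separated four-bit groups back into an integer."""
    digits = []
    for group in bcd_text.split():
        value = int(group, 2)
        # A valid BCD digit is exactly four bits and no larger than 9.
        if len(group) != 4 or value > 9:
            raise ValueError(f"not a valid BCD digit: {group!r}")
        digits.append(str(value))
    return int("".join(digits))


print(to_bcd(255))             # 0010 0101 0101
print(from_bcd("0001 0000"))   # 10
```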

Ok, let's put all the code back together and test it out.

import sys

def convert(conv_type, str_to_conv):
    # Convert input(s) into decimal as a starting place for all encodings
    if conv_type == 'bin':
        decimal = int(str_to_conv, 2)
    elif conv_type == 'bcd':
        decimal = int(str_to_conv)
    elif conv_type == 'chr':
        decimal = ord(str_to_conv)
    elif conv_type == 'dec':
        decimal = int(str_to_conv)
    elif conv_type == 'hex':
        decimal = int(str_to_conv, 16)
    elif conv_type == 'oct':
        decimal = int(str_to_conv, 8)
    # Set up dict to track all conversions
    encodings = {"decimal": decimal}

    # Convert to binary and inverse
    encodings["binary"] = format(encodings["decimal"], '08b')
    encodings["binary_inv"] = ''.join('1' if x == '0' else '0' for x in encodings["binary"])

    # Convert to decimal inverse
    encodings["decimal_inv"] = int(encodings["binary_inv"], 2)

    # Convert to hexadecimal and inverse
    encodings["hex"] = format(encodings["decimal"],'02x').upper()
    encodings["hex_inv"] = format(encodings["decimal_inv"],'02x').upper()

    # Convert to octal and inverse
    encodings["octal"] = format(encodings["decimal"],'03o').upper()
    encodings["octal_inv"] = format(encodings["decimal_inv"],'03o').upper()

    # Convert to ASCII char and inverse and replace unprintable chars
    if (encodings["decimal"] in range(33, 127) or encodings["decimal"] in range(161,256)) and encodings["decimal"] != 173:
        encodings["char"] = chr(encodings["decimal"])
    else:
        encodings["char"] = 'xxx'

    # Convert to BCD
    encodings["bcd"] = " ".join(format(int(x), '04b') for x in str(encodings["decimal"]))
    encodings["bcd_inv"] = " ".join(format(int(x), '04b') for x in str(encodings["decimal_inv"]))

    return encodings

encodings = convert(sys.argv[1], sys.argv[2])

for k, v in encodings.items():
    print(f'{k.upper()}: {v}')


So the only thing we have left now is to do a few other conversions to stuff like base64. But these conversions we want to do on the original input, not the conversions to other stuff. i.e. if you run the program with base_convert.py bin 10110110, we want to do a base64 of the binary, not the decimal we converted it to. Or, maybe you do, but I'll let you work that one out on your own.

To do these conversions, we'll need to import the base64 module. Let's take a look at the new code to do base16, base32, base64 and base85 encoding.

# Base16 conversion of input
encodings["b16"] = base64.b16encode(str_to_conv.encode('utf-8')).decode('utf-8')

# Base32 conversion of input
encodings["b32"] = base64.b32encode(str_to_conv.encode('utf-8')).decode('utf-8')

# Base64 conversion of input
encodings["b64"] = base64.b64encode(str_to_conv.encode('utf-8')).decode('utf-8')

# Base85 conversion of input
encodings["b85"] = base64.a85encode(str_to_conv.encode('utf-8')).decode('utf-8')

Nothing too crazy here: for each encoding we're calling base64.xxxencode depending upon which encoding we want (note that a85encode produces the Ascii85 flavor of Base85; the module also offers b85encode). So what's up with the encode('utf-8') and decode('utf-8') stuff? Well, the base64 module works on bytes objects, so we need to encode our string to bytes for it to process, then decode the result back into a clean, human-readable string. You could choose to do encode('ascii') instead, but UTF-8 can encode any Unicode character, which covers pretty much every character in the world, not just the limited ASCII set.
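
To see the encode/decode dance in both directions, here's a small round-trip sketch. The text variable stands in for str_to_conv, and the decoding half is my addition since the program only encodes.

```python
import base64

text = "FE"  # stand-in for str_to_conv

# str -> bytes -> Base64 bytes -> str
encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")

# Reverse the trip: Base64 str -> original bytes -> str
decoded = base64.b64decode(encoded).decode("utf-8")

# The module offers two Base85 flavors with different alphabets.
ascii85 = base64.a85encode(text.encode("utf-8")).decode("ascii")
base85 = base64.b85encode(text.encode("utf-8")).decode("ascii")

print(encoded)  # RkU=
print(decoded)  # FE
print(ascii85, base85)
```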

Ok, so let's put the whole thing together.

import base64
import sys

def convert(conv_type, str_to_conv):
    # Convert input(s) into decimal as a starting place for all encodings
    if conv_type == 'bin':
        decimal = int(str_to_conv, 2)
    elif conv_type == 'bcd':
        decimal = int(str_to_conv)
    elif conv_type == 'chr':
        decimal = ord(str_to_conv)
    elif conv_type == 'dec':
        decimal = int(str_to_conv)
    elif conv_type == 'hex':
        decimal = int(str_to_conv, 16)
    elif conv_type == 'oct':
        decimal = int(str_to_conv, 8)
    # Set up dict to track all conversions
    encodings = {"decimal": decimal}

    # Convert to binary and inverse
    encodings["binary"] = format(encodings["decimal"], '08b')
    encodings["binary_inv"] = ''.join('1' if x == '0' else '0' for x in encodings["binary"])

    # Convert to decimal inverse
    encodings["decimal_inv"] = int(encodings["binary_inv"], 2)

    # Convert to hexadecimal and inverse
    encodings["hex"] = format(encodings["decimal"],'02x').upper()
    encodings["hex_inv"] = format(encodings["decimal_inv"],'02x').upper()

    # Convert to octal and inverse
    encodings["octal"] = format(encodings["decimal"],'03o').upper()
    encodings["octal_inv"] = format(encodings["decimal_inv"],'03o').upper()

    # Convert to ASCII char and inverse and replace unprintable chars
    if (encodings["decimal"] in range(33, 127) or encodings["decimal"] in range(161,256)) and encodings["decimal"] != 173:
        encodings["char"] = chr(encodings["decimal"])
    else:
        encodings["char"] = 'xxx'

    # Convert to BCD
    encodings["bcd"] = " ".join(format(int(x), '04b') for x in str(encodings["decimal"]))
    encodings["bcd_inv"] = " ".join(format(int(x), '04b') for x in str(encodings["decimal_inv"]))

    # Base16 conversion of input
    encodings["b16"] = base64.b16encode(str_to_conv.encode('utf-8')).decode('utf-8')

    # Base32 conversion of input
    encodings["b32"] = base64.b32encode(str_to_conv.encode('utf-8')).decode('utf-8')

    # Base64 conversion of input
    encodings["b64"] = base64.b64encode(str_to_conv.encode('utf-8')).decode('utf-8')

    # Base85 conversion of input
    encodings["b85"] = base64.a85encode(str_to_conv.encode('utf-8')).decode('utf-8')

    return encodings


encodings = convert(sys.argv[1], sys.argv[2])

for k, v in encodings.items():
    print(f'{k.upper()}: {v}')

And there you have it, your first secret encoder / decoder ring.

If you want to take this further, here are some ideas.

  • Accept multiple inputs, like hex 'FF 07 80' or dec '21 56' or a whole word.
  • Add ROT-13, or even better, let the user input how many characters they want to shift by. Like, rot cattle and it will ask how many to shift by.
  • Create hash values like MD5, SHA1, SHA256. Take a look at the hashlib module.
  • Roman numeral conversions.
  • If you really want to take this to the next level, convert braille, morse code or upside down text. Hint, you'll want to create dictionaries like {'A': '.-'} for morse, and {'G': '⠛'} for braille.
  • And finally, if you want a real challenge convert to and from pig-latin. You're gonna lose some hair, I promise. Dealing with punctuation, legal consonant pairs etc.
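
To get you started on the ROT idea, here's one possible sketch. The function name rot_n and its default shift of 13 are my own choices, and it only touches the 26 ASCII letters.

```python
import string


def rot_n(text, shift=13):
    """Shift ASCII letters by `shift` places, leaving other characters untouched."""
    shift %= 26
    lower = string.ascii_lowercase
    upper = string.ascii_uppercase
    # Build a translation table from each letter to its shifted counterpart.
    table = str.maketrans(
        lower + upper,
        lower[shift:] + lower[:shift] + upper[shift:] + upper[:shift],
    )
    return text.translate(table)


print(rot_n("cattle"))      # pnggyr
print(rot_n("pnggyr"))      # cattle
print(rot_n("Cattle!", 5))  # Hfyyqj!
```

Because the shift is taken modulo 26, passing the negative of a shift undoes it, which gives you the decoder for free.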