MI5 Coding Challenge

MI5 (Military Intelligence, Section 5) is the United Kingdom's domestic counter-intelligence and security agency and is part of its intelligence machinery alongside the Secret Intelligence Service (MI6), Government Communications Headquarters (GCHQ) and Defence Intelligence (DI). The service is directed to protect British parliamentary democracy and economic interests, and counter terrorism and espionage within the UK.

You would think that with this description their recruitment process is very strict, but recently I found this coding challenge and was disappointed by how simple the solution turned out to be. I hope my approach was incorrect and that the real solution to the steganographic challenge is more complicated than what I found. Nonetheless, it is a good "Hello World" exercise if you are into the analysis of data, cryptography, and/or lack-of-data-driven investigation.

iVBORw0KGgoAAAANSUhEUgAAAFwAAACjCAYAAAAHK3mUAAAAAXNSR0IArs4c6QAA
AARnQU1BAACxjwv8YQUAAAAJcEhZcwAADsMAAA7DAcdvqGQAAAT2SURBVHhe7dhR
btw6EERRA9lVlpgdZUEGsgsHMjADjnwlcUQ2WdXSx/l5X7ltYFR8H/9+/fnK5Pff
T2kf9B+3UKA66pjprYPvodjbzz94t4NHohBXL7/hFHt15X16GPLRpJAMqPWIzUqh
YHXUkWYWUrCiw48mxamjDhVNK4Vi3VBXpC6zkEKyoN4W03c4RaqjjlryDx8Kdva9
UihU3TrExeEspNirovu8a/gOp5AMqJVIP3woTB11lKxfmhSsbnelUKQ66lByehZS
bDbU3Sp8h1NIBtRaQ/7h04qONVPVR5NCMqDWaPIrhQ6ljjoerGfhgoKV3bNwsOqP
JsVdFd2n1rSVQiGOqG2P3CykKHXUscVuh1Owk7Q7nDoUNM9Cir1t/8GldziFuHv+
hlOwuzJUxeFHk0IcUdsMp1cKRamjjtHCZiEFu6K+s4btcArJgFr3DDv4bHSsGd6a
hRSSAbVGkf//4XQgddTxIH/wIxSsrPo3nGLVUcdsXT6aFOuOOnsIXSkUkgG11pKc
hRSpjjqI/A6nOCfrHvmDLyjE1Y9ZSMHO1n2zVe1wCsmAWqMNffhQtDrqaCHz0qRY
d9Qp/7SnEEePHvmDb6Eodcu/2/bgRyhYQfUOpyh11DFbt4cPBbujzlbhL00KyYJ6
j1g87d9Bh1GCH00KyYJ6R5JYKXQYddRR4374DNb0G04HUkcdI4V9NCnWHXW+a/hK
oRBH1FZDahZSmDrq2CO7wynOCTUtZA++oBB30gdvRcGzHe5wCsmAWkcY9vChaHXU
0UrmpUnB2Syd8k/7NQpx8vxoUpyTMkrZy0qhkAzKxtm6zEKKVEcdI4TvcIq9sudH
k46lrgxxUbVSKPbq6E41ps5CCslm3Wy3wx8oTt3y77Y9+BEKVnB6pVCkOuoY7bI7
vERNUbrtcArJgnrPCn/4nEXh6qhjTfbgeyjWRehKoWOpo46e7odPZ9RZktzhFOKI
2qwePhSlbt2Q5qVJsYp2VwqFqaMOJU2zkIKdUFO05h1OIVlQb6uhDx+KUkcdLWRe
mhSbUdVKoQOpow4FXWYhBWdD3WcM3eEU4ojaakk8fChKHXXUsHhpUrCr75VCkerW
IS4OZyHF3s7/wafucApxRG1bZB4+JYpSRx1E8uB7KNbJZR8+1DFC+Cyk2Ct7/qTQ
sdSVIS6qfsMpNhvqjtD1o0khGVDrWXYrZQsdStHLR5NCHJVNaqpWCkWpow4FzbOQ
Ym/bf/DwHd6CQtxZ7/AjZaiKw5VCIVlQb7SQWUhx6qgjQvgOpzgn1NRiyMOHQrKg
3j1DDt6CItVRx4P8wfdQrLrDHU6h6qhDRdPDh2LdUFekLi9NCsmAWltJP+2P0JHU
WR98jQLVTFkpdCx11HHG9FlIcU6oaY/EDqeQDKjV+uHTio4UrfmjSSEZUGsPUiuF
wtVRxx7pWUiB7ix3OIW4ePloUtzVlffpYchKoZAMqPWIzSykYHXUkWKHU6yqoR9N
OpY66mhx7/ABykabWUghjix3eDQ6VC9hH00KyYJ6a8muFApVRx1rlrOQYl10/Q2n
46ijjkhhH02Ku5muFApxsfsbTrGuqG+Gtz6aFJIBtUa5d3iBOnqbNgsp2BG17ZHY
4RSSAbVKHLwFhSr7noUUksE6VkHzDqdQddQxSvjDh4IdUdsZQ1+aFJIF9RKLpz0F
qqOOhcXBSxTn4/PrP/yACfdzSOttAAAAbHRFWHRDb21tZW50AEFzIEkgcmVhZCwg
bnVtYmVycyBJIHNlZS4gJ1R3b3VsZCBiZSBhIHNoYW1lIG5vdCB0byBjb3VudCB0
aGlzIGFydCBhbW9uZyB0aGUgZ3JlYXQgdGV4dHMgb2Ygb3VyIHRpbWU3gX+qAAAA
AElFTkSuQmCC

A chunk of seemingly random alphanumeric characters, with some forward slashes and plus signs. From the character set we can assume this is BaseN-encoded data. RFC 4648 defines the Base16, Base32, and Base64 data encodings. For the string foobar, the output of these three encoders is:

BASE64("foobar") = "Zm9vYmFy"
BASE32("foobar") = "MZXW6YTBOI======"
BASE16("foobar") = "666F6F626172"
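These three values can be reproduced with the base64 module from the Python standard library. This is only a quick check of the RFC 4648 examples above; the variable names are mine.

```python
import base64

data = b"foobar"

# Encode the same bytes with the three RFC 4648 alphabets.
encoded = {
    "BASE64": base64.b64encode(data).decode("ascii"),
    "BASE32": base64.b32encode(data).decode("ascii"),
    "BASE16": base64.b16encode(data).decode("ascii"),
}

for name, text in encoded.items():
    print(f'{name}("foobar") = "{text}"')
```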

Trying to decode the text with each of them, we find that it is encoded in Base64, as the other two return either invalid data or an invalid UTF-8 string. Note that a base32 command may not be available in a standard Unix installation, but a package for it exists; for the Base16 decoding I wrote a script in Perl.

$ echo "iVBOR...uQmCC" | base16 -d 1> /dev/null ; echo $? # --> 1
$ echo "iVBOR...uQmCC" | base32 -d 1> /dev/null ; echo $? # --> 1
$ echo "iVBOR...uQmCC" | base64 -d 1> /dev/null ; echo $? # --> 0
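The same trial-and-error check can be sketched in Python, assuming we simply try each strict decoder and keep the ones that do not raise an error. The function name detect_encodings is hypothetical, and the sample is just the first line of the challenge text rather than the whole blob.

```python
import base64
import binascii

def detect_encodings(text):
    """Return the names of the RFC 4648 decoders that accept `text`."""
    raw = "".join(text.split()).encode("ascii")  # drop line breaks and spaces
    decoders = {
        "base16": lambda s: base64.b16decode(s),
        "base32": lambda s: base64.b32decode(s),
        "base64": lambda s: base64.b64decode(s, validate=True),
    }
    accepted = []
    for name, decode in decoders.items():
        try:
            decode(raw)
        except (binascii.Error, ValueError):
            continue  # this decoder rejects the input
        accepted.append(name)
    return accepted

# First line of the challenge text, used here as a short sample.
sample = "iVBORw0KGgoAAAANSUhEUgAAAFwAAACjCAYAAAAHK3mUAAAAAXNSR0IArs4c6QAA"
print(detect_encodings(sample))
```

Keep in mind that a short string can be valid in more than one alphabet (an all-hex string is also valid Base64), so a check like this narrows down the candidates rather than proving the encoding.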

After saving the output of the base64 command, we can use the Unix command file to detect its type, which hints at what data it contains. We discover that the data is a PNG image of 92x163 pixels.

$ echo "iVBOR...uQmCC" | base64 -d 1> output.ext
$ file output.ext
output.ext: PNG image data, 92 x 163, 8-bit/color RGBA, non-interlaced
$ mv output.ext output.png
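What file reports can be double-checked by hand, since a PNG starts with an 8-byte signature followed by an IHDR chunk holding the dimensions. The following is a minimal sketch, not a full PNG parser; read_png_header is a name I made up, and the input is the first 44 characters of the challenge text, which happen to cover exactly the signature and the IHDR chunk.

```python
import base64
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def read_png_header(data):
    """Parse the PNG signature and the IHDR chunk at the start of `data`."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    # Each chunk starts with a 4-byte big-endian length and a 4-byte type.
    length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR" or length != 13:
        raise ValueError("IHDR chunk expected first")
    width, height, depth, color_type, _, _, interlace = struct.unpack(
        ">IIBBBBB", data[16:29]
    )
    return {
        "width": width,
        "height": height,
        "bit_depth": depth,
        "color_type": color_type,  # 6 means RGBA
        "interlace": interlace,    # 0 means non-interlaced
    }

# Signature plus IHDR chunk, taken from the start of the challenge text.
head = base64.b64decode("iVBORw0KGgoAAAANSUhEUgAAAFwAAACjCAYAAAAHK3mU")
print(read_png_header(head))
```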

At this point we know this is a steganographic challenge, which often means there is hidden data in the least significant bits of the image. Sometimes the Unix command strings gives us the solution right away because the hidden data is embedded in the comment section; other times the data is another image appended to the end of the original one, or the header bytes were modified to make the file look like an image.

$ strings output.png
IHDR
sRGB
[...]
ltEXtComment
As I read, numbers I see. 'Twould be a shame not to count this art among the great texts of our time7
IEND
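The stray "l" before tEXt and the "7" after "time" are not part of the comment: strings prints any run of printable bytes, so it also picks up a byte of the chunk length and a byte of the checksum. A cleaner way to read the comment is to walk the PNG chunks. This is a rough sketch that skips CRC validation; iter_chunks, text_comments and make_chunk are hypothetical names, and the demo data is synthetic rather than the challenge image.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def iter_chunks(data):
    """Yield (type, payload) for every chunk in a PNG byte string."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    offset = 8
    while offset + 8 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[offset:offset + 8])
        payload = data[offset + 8:offset + 8 + length]
        yield chunk_type, payload
        offset += 12 + length  # length + type + payload + CRC

def text_comments(data):
    """Return {keyword: text} for all tEXt chunks (Latin-1 encoded)."""
    comments = {}
    for chunk_type, payload in iter_chunks(data):
        if chunk_type == b"tEXt":
            keyword, _, text = payload.partition(b"\x00")
            comments[keyword.decode("latin-1")] = text.decode("latin-1")
    return comments

def make_chunk(chunk_type, payload):
    """Build one chunk; only used to create the synthetic demo below."""
    crc = zlib.crc32(chunk_type + payload) & 0xFFFFFFFF
    return (struct.pack(">I", len(payload)) + chunk_type + payload
            + struct.pack(">I", crc))

demo = (PNG_SIGNATURE
        + make_chunk(b"tEXt", b"Comment\x00hello")
        + make_chunk(b"IEND", b""))
print(text_comments(demo))
```

To try it on the real file, read the bytes with open("output.png", "rb") and pass them to text_comments.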

Is this the solution? What does "As I read, numbers I see" mean? Is this gibberish meant to distract us? If we open the file with an image viewer we see a zebra-like pattern of pink and blue stripes that looks like image noise.

Remember that an image is composed of pixels; maybe if we read it pixel by pixel we can find something. Working with colored images is usually not a good idea, though. PBM is a simpler format we can use to find data hidden in the pixels: we could use Gimp to export the PNG image to PBM in ASCII format, which represents black pixels with the integer one and white pixels with the integer zero. But this is a coding challenge, so let's use code to get the binary output.
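For reference, the ASCII PBM layout is small enough to sketch in a few lines: a P1 magic number, the width and height, and then the bits. The helper to_ascii_pbm and the tiny bitmap are made up for illustration and are not part of the challenge.

```python
def to_ascii_pbm(rows):
    """Serialize rows of 0/1 integers as a plain (P1) PBM document."""
    height = len(rows)
    width = len(rows[0]) if rows else 0
    lines = ["P1", f"{width} {height}"]
    for row in rows:
        # 1 is a black pixel, 0 is a white pixel.
        lines.append(" ".join(str(bit) for bit in row))
    return "\n".join(lines) + "\n"

bitmap = [
    [1, 0, 1],
    [0, 1, 0],
]
print(to_ascii_pbm(bitmap), end="")
```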

First, let's convert the image to black and white using ImageMagick:

$ convert output.png -threshold 50% threshold.png

Now let's use Python to read the image pixel by pixel. Instead of treating X and Y as separate coordinates, we will assume that the data is one long single-line image, which means reading each row from left to right along the X axis and moving through the rows from top to bottom.

#!/usr/bin/env python
from PIL import Image

image = Image.open("threshold.png")
picture = image.load()  # pixel access object indexed as picture[x, y]

# Walk the rows top to bottom and each row left to right.
for y in range(image.size[1]):
    for x in range(image.size[0]):
        print(picture[x, y])

Let's go back to the first hint, "As I read, numbers I see"; there must be some meaning to this. We already have a pixel-by-pixel reader and we are seeing numbers, but there is no solution yet, so let's think about it. We have an image with seemingly random black pixels on a white canvas (or the other way around if you prefer), but are they really random? Could the position of the pixels mean something? Let's modify the script to count how many consecutive pixels share the same color and print each count as a Unicode character, and see what happens:

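#!/usr/bin/env python3
from PIL import Image

solution = ""
image = Image.open("threshold.png")
picture = image.load()
# Start with an empty run of the first pixel's color so that the
# first run is not over-counted by one.
number = 0
color = picture[0, 0]

for y in range(image.size[1]):
    for x in range(image.size[0]):
        pixel = picture[x, y]
        if pixel == color:
            number += 1  # same color: the current run grows
        else:
            # Color changed: emit the finished run length as a character.
            solution += chr(number)  # chr replaces Python 2's unichr
            number = 1
        color = pixel

# The final run is not emitted because no color change follows it.
print(solution)

Voilà! We have a hexadecimal string, let's decode it:

$ python3 solution.py | tr -d '-' | xxd -r -p
Congratulations, you solved the puzzle! Why don't
you apply to join our team? mi5.gov.uk/careers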
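The whole trick boils down to run-length decoding: every stripe's length is the ASCII code of one hexadecimal digit. Here is a self-contained sketch of that idea on synthetic data, so it runs without the image or PIL. The names run_lengths, decode_runs and encode_runs are mine, and unlike the script above this version also counts the final run.

```python
from itertools import groupby

def run_lengths(pixels):
    """Return the length of each run of equal consecutive values."""
    return [sum(1 for _ in group) for _, group in groupby(pixels)]

def decode_runs(pixels):
    """Map each run length to a character, then decode the hex string."""
    hex_text = "".join(chr(n) for n in run_lengths(pixels))
    return bytes.fromhex(hex_text)

def encode_runs(message):
    """Inverse operation, only used here to build demo pixels."""
    pixels, color = [], 0
    for digit in message.hex():
        # One stripe per hex digit, as long as the digit's ASCII code.
        pixels.extend([color] * ord(digit))
        color = 255 - color  # alternate between black and white
    return pixels

demo = encode_runs(b"Hi")
print(run_lengths(demo))  # [52, 56, 54, 57]
print(decode_runs(demo))  # b'Hi'
```

The run lengths 52, 56, 54 and 57 are the ASCII codes of the characters 4, 8, 6 and 9, and the hex string 4869 decodes to "Hi".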