CTF Team at the University of British Columbia

[ACTF 2022] FFSK

26 Jun 2022 by - Angus


This is the challenge entitled “FFSK” from Azure Assassin Alliance CTF 2022.

The given description is as follows:

I’ve bought the second commercial modem for computers in a big city of the UK.

激情澎湃的球迷迷恋这个地方。遇上球赛季,酒吧里的热情、呐喊、啤酒、摇滚,足球 让这个城市充满活力和希望。


阳光明媚的日子,开始出发,北京时间00:50 开始起飞,一个梦的距离,就可以到达荷 兰阿姆斯特丹,短暂停留之后,然后转机飞往英国

南航的飞机配置完备,全程可以充电,还有wifi,影视屏有面前最新的电影。睡睡醒醒 ,在飞机上觅到一部《北京爱情故事》,让我在三万英尺的空中哭的稀里哗啦。

A modem.wav file is provided. As the name suggests, it’s a WAV file that sounds like modem communication, and presumably is a signal of some sort.

During the competition, it was solved by exactly one team - us!

First off, some meta musing stuff

If you’re boring enough (like me) to have done a bunch of escape rooms, puzzle hunts, and the like, you’ll eventually get an idea of what a “good challenge” is. In my opinion, the important things:

  1. You shouldn’t have to guess (that much).
  2. You should know when you’re making progress.
  3. Red herrings are generally bad.
  4. If you use something to make progress, you shouldn’t have to use it again.

Usually, this isn’t too relevant for a CTF - you only need to figure out one exploit or two to get a flag - but this challenge gave the impression that there’d be many more than two steps required for a solution. So it’s useful to think of this like an escape room, where we have an inventory of items, and each item will serve exactly one purpose to make progress in the challenge.

Building an inventory

I can read maybe a couple Chinese characters, but thankfully [insert search engine of choice] does a much better job. Eventually I found a blog post with identical text, and it described a trip to Manchester, which happens to be a big city in the UK.

Searching for “the second commercial modem for computers” leads to the Wikipedia page for the Bell 103 modem. This excerpt will be relevant:

The Bell 103 modem used audio frequency-shift keying to encode data. Different pairs of audio frequencies were used by each station:

  • The originating station used a mark tone of 1,270 Hz and a space tone of 1,070 Hz.
  • The answering station used a mark tone of 2,225 Hz and a space tone of 2,025 Hz.

And the challenge itself is named “FFSK”, which evidently stands for fast frequency-shift keying. There aren’t too many details on the “fast” part, but the other words certainly line up.

Thus, we’ve used up the challenge name and description, and in return, we’ve got a fairly good hunch that the given WAV file uses FSK to encode data, and it’s using the frequencies used by the Bell 103. And something to do with Manchester?

To convince ourselves of this, we can write copy Stack Overflow code to compute the power spectrum of the given signal:


Those four peaks correspond to 1070, 1270, 2025, and 2225 Hz - the frequencies used by the Bell 103. So we’re on the right track!

Standing on the shoulders of amateur radio operators

Here’s a fantastic program which will be incredibly useful throughout this challenge.

It even comes with built-in presets for the Bell 103, so we run it and get…

$ minimodem --rx --file modem.wav 300
### CARRIER 300 @ 1250.0 Hz ###
### NOCARRIER ndata=955 confidence=1.863 ampl=0.140 bps=298.50 (0.5% slow) ###

[further results snipped as they're much the same...]

Hmm. It looks like it confidently decoded a bunch of garbage. The output also shows that it interpreted the signal using 1250 Hz, though - examining the minimodem source shows that it uses the 1270/1070 Hz pairing by default, so the program must have shifted the frequency slightly to better match the data. We haven’t used the other pairing of 2225/2025 Hz though. What does that give?

$ minimodem --rx -M 2225 -S 2025 --file modem.wav 300
### CARRIER 300 @ 2250.0 Hz ###
�����:HIT_Hammin'@d$ddPdddddddPddddPP(28).CCode; C'nW��; Why do you use such a
slow method with a high Bit Error Ratio for communication? It took me a lot of
effort to correct bit-flips, making sure the communication was less
TRANSFORMATION! Fortunately, we can now communicate properly on another channel
while enjoying a vacation in this BG CITY--I mean, IEEE 80r.3.....Wait, what is
the new protocol? Guess by yourself!
### NOCARRIER ndata=503 confidence=2.049 ampl=0.148 bps=300.02 (0.0% fast) ###

This looks way better! The start of the message is kind of garbled (and unfortunately, ignoring it will prove problematic later on…), but everything else is parseable. Another reference is made to a “big city”, and combined with the “IEEE 80r.3” leads to Manchester code, specifically the IEEE 802.3 convention.

OK, so let’s do a quick inventory check:

We don’t know what the last thing is, so the other thing seems good to work on now that we can guess that the initial decoding was garbage because it required an additional Manchester-decoding step.

Reversing the Manchester code is simple enough - a falling edge (1 followed by 0) is a 0, and a rising edge (0 followed by 1) is a 1. Notably, there should never be three consecutive 0s or 1s in the signal.

Oh say can you C

Fortunately (unfortunately?), minimodem is written in C and is fairly easily to build from source, so I spent a few hours trying to decipher and add to the code. I ended up realizing that instead of attempting to be clever and join bits together, it was a lot easier to read the entire signal in and calculate each bit in one pass, and ran the following snippet:

// in minimodem.c:
// make the buffer large enough to read the entire signal in one go
samplebuf_size = 40000000;

// in fsk.c:
// code to manually read all bits
for (int i = 0; i < 107280; i++) {
    memcpy(fskp->fftin, samples+ (i*bit_nsamples), bit_nsamples * sizeof(float));
    float mag_mark  = band_mag(fskp->fftout, fskp->b_mark,  magscalar);
    float mag_space = band_mag(fskp->fftout, fskp->b_space, magscalar);
    // mark==1, space==0
    debug_log("%d", (mag_mark>mag_space));
// no need to do anything else in this run

This gave a bit string that miraculously had no instances of three 0s or 1s in a row, so Manchester decoding worked and gave a string of 107280/2 = 53640 bits.

In retrospect, all this code does is break up the initial signal of 17164800 samples into 107280 sections, and determine whether each section looked more like a 1270 Hz (mark) signal, or a 1070 Hz (space) signal. It would’ve been much simpler to write a Python script, but…too late.

Quick maths break

There are a lot of magic numbers here, but everything actually divides very nicely, so props to the challenge author for making it so.

The initial signal contains 17164800 samples of a single audio channel at 48000 Hz. The Bell 103 uses a baud rate of 300, which means that 160 samples should be used to decode each bit. Thus, there are 17164800/160 = 107280 bits to be decoded, which is conveniently an even number so Manchester decoding works with no problems.

As stated previously, a Good Challenge should minimize guessing and give good indicators of progress, so when things seem to work out in a coincidence, it’s likely very intentional.

OK, now what?

At this point, I tried feeding the 53640-bit string back into minimodem’s decoding algorithm with little success. Thankfully, I only ended up wasting a few hours (only), because the organizers released a few hints:

  1. 所有人都认为,吃鸡蛋前,原始的方法是打破鸡蛋较大的一端。可是当今皇帝的祖父 时候吃鸡蛋,一次按古法打鸡蛋时碰巧将一个手指弄破了,因此他的父亲,当时的皇帝, 就下了一道敕令,命令全体臣民吃鸡蛋时打破鸡蛋较小的一端,违令者重罚。 老百姓们 对这项命令极为反感。历史告诉我们,由此曾发生过六次叛乱,其中一个皇帝送了命,另 一个丢了王位…关于这一争端,曾出版过几百本大部著作,不过大端派的书一直是受禁的 ,法律也规定该派的任何人不得做官。 ——乔纳森·斯威夫特,《格列佛游记》
  2. Hamming code block size: 20bits
  3. Bell 103

The third hint is useless, since we’ve already used that information. The first hint is a Gulliver’s Travels quote that mentions “Big Endian”, which is the only remotely relevant term in the quote. And the second hint…let’s go back to our inventory:

Imagine a very loud slap - that was me bringing my palm and forehead together after realizing the “garbled” part of the first signal decoding was probably not actually meant to be garbled. Playing around with some of the minimodem arguments (namely -c and -l) gave a very slightly different, but very much more useful message:

�����rHINT_Hamming@ddddPdddddddPdddPdPP(20).ECCode; Content: Why do you use such
a slow method with a high Bit Error Ratio for communication? It took me a lot of
effort to correct bit-flips, making sure the communication was less
TRANSFORMATIONS! Fortunately, we can now communicate properly on another channel
while enjoying a vacation in this BIG CITY--I mean, IEEE 802.3.....Wait, what is
the new protocol? Guess by yourself!

The message itself is cleaned up a bit, and notably “TRANSFORMATION” in the original decoding became plural form, but the beginning of the message now clearly references Hamming code, giving a mapping of which bits are used for parity checking and which bits are used for data.

Hamming it up

The math again works out perfectly: we can split 53640 bits perfectly into blocks of 20 bits.

Conveniently, the Wikipedia article uses 20-bit blocks as an example, so the procedure I followed here was:

  1. Read the Wikipedia article.
  2. Give up.
  3. Copy the first search engine result for Hamming code implementation.

This surprisingly worked out, albeit with some missteps where I thought that the “big endian” clue was supposed to be used here. Once I got the code working, a couple clues that I was on the right path revealed themselves:

  1. Every block of 20 bits had a single bit error.
  2. And all of the error indices were between 1 and 20, which isn’t guaranteed since 5 parity bits could indicate an error between indices 1 and 32.

Applying the error correction and removing the parity bits results in a new string of 40320 bits. We just have this and the “Big Endian” hint left in our inventory…

Fixed-width fonts FTW

This step happened mostly by accident. I printed out the resulting string from the previous step in a Python shell, and here’s what part of it looked like:


It’s much less obvious without being able to scroll up and down, but:

and this pattern repeats with a period of 10 columns. This was incredibly lucky, as my terminal happened to be 190 columns wide.

If this hadn’t happened, I might’ve lucked out by resizing my terminal window and noticing the identical columns. Failing that, I (hopefully) would’ve realized that there was another thing left in the inventory: the actual FSK decoding of the signal that we’d left behind many steps ago. The convention for a single ASCII character is 1 stop bit, followed by 8 ASCII character bits, followed by 1 stop bit. The MSB should always be 0, and the start and stop bits happen to be identical, even though they don’t have to be, so this all indicates that we’re looking at padded ASCII characters.

This is also where the “Big Endian” hint finally reveals its use - kind of: it got me careful to check endianness of the ASCII characters, so I had to reverse the bit order to parse them correctly. Which means that they were little-endian, but in any case they decoded to sane-looking characters. (Also, yet again, the string of 40320 bits evenly divided into blocks of 10 bits.)

The final steps

The sane-looking characters in question were:


👀 👀 👀, as the saying goes.

Pasting all this into a browser gave a QR code:


There was a QR code challenge in this CTF that had only one solve, but thankfully this code could be scanned and contained the text ACTF{wow_h0w_IEEE_U_r}, which is the end of our journey.