For the past eight months or so, I’ve been working sporadically on a side project I call Poet. Poet is basically a tool for hackers that’s useful for post-exploitation, that is, after you’ve initially exploited and gotten access to a computer you’re not supposed to have access to. It essentially acts as a backdoor you can install on a system to help you maintain access once you’ve gotten your foot in the door.
As a disclaimer, I am building Poet purely for my own education and learning experience. The code is freely available because I think it might be useful to others interested in learning about this sort of thing. Use it responsibly.
I’ve learned a lot during the process of building this tool and I thought it would be cool to write a blog post (possibly more to come) documenting that process.
The initial motivation for this project came from an experience I had participating in the 2014 Northeast Collegiate Cyber Defense Competition. In short, the competition requires a team of students to protect a small business IT infrastructure from a red team of hackers. Usually the red team is really good and completely owns you at some point or other throughout the competition, and at the end they tell you what they did and give you tips on how to improve. In particular, the red team told us that there was pre-installed “beaconing” malware on many of our systems from the start of the competition that would “phone home” to a command and control (C2) server every once in a while to get commands and tasks to execute on the target system. This idea was pretty interesting to me, and a basic implementation didn’t actually seem too hard to write, so I decided to give it a try, even though at this point, I had no experience with network programming.
The first version of Poet was drastically different from the current form and was just about as simple and primitive as it gets for something like this. In this version, the client program (executed on target) would repeatedly attempt to connect to a socket (port 80 by default) on the server (attacker’s C2 server) at a specified interval. If the connection failed (server wasn’t running), the client would sleep, otherwise it would execute a command sent from the server, sending back the stdout of the command. The server simply maintained a queue of commands to execute and would one by one pop them off the queue and send them to the client, printing out the stdout when it came back. This was a great exercise to learn the basics of socket programming, but of course wasn’t very useful at all, for a number of reasons. First, ideally the client’s interval is very large so as to minimize network use and remain stealthy but that puts a hard limit on the rate at which commands can be executed. This system was also very inflexible because there was no way to reorder or edit commands in the queue, since the “user interface” was just a server script that was run with the commands to execute as arguments. Overall, a good start, but there was definitely much work ahead to actually make this a semi-realistic tool.
The second version of Poet involved a pretty substantial redesign, although one thing that stayed the same was the high-level client/server beaconing dynamic. This is far superior to having the client listen on the target’s end because in any sort of “real” scenario, the target will likely be behind a firewall that rejects incoming packets on arbitrary ports. The beaconing model lets the tool bypass most standard firewalls that aren’t specifically targeting it, because outbound port 80 traffic is almost certainly allowed. I later changed the default port to 443 because it’s just as likely to be allowed out and because it could potentially avoid packet inspection, since traffic on 443 is usually encrypted.
Building on this model, I brainstormed a couple of other ideas to add on top of v0.1. Instead of executing a single command for every ping, what about sending over multiple commands? What about a pastebin/gist URL to a script that the client would download and execute? These would help solve the rate limit problem because an arbitrary number of commands, instead of one, could be executed for each ping. What about the user interface? What about creating an actual web user interface for managing the command queue that the client was pulling from each ping? This would help solve the flexibility problem.
While these would be relatively simple to add to v0.1, if I asked myself, “If I were a hacker, would I want to use this tool?” the answer would be “No way!” because I would only be able to interact with my target system via discrete scripts, and I would have to wait the ideally large time interval between pings to get any sort of feedback on my actions.
This made it obvious that I would have to move from a design where each ping from the client was an opportunity for the user to run x actions on the target, to a design where each ping from the client was an opportunity for the user to interactively control the client for an unlimited time, and perform actions on the target with continuous feedback. With this in mind, I opted to use a shell as the user interface on the server side since it seemed simpler to implement and I was more familiar with implementing a shell versus something like a web interface (which would likely have to have a shell built into it anyway for executing commands). The server design would be similar to that of v0.1 in that the server would only be running when the user wanted to control the client, and the client would use the inability to connect to the server as a sign to “go to sleep” for another interval. Another server design I thought of was an always-on model where the server would always answer the client’s ping with some kind of binary state value. That would work equally well, but it isn’t strictly necessary because the state can be inferred as described above.
In designing the actual protocol the client and server use to communicate, I decided to use HTTP to mildly obfuscate the client’s initial check for whether the server is running. The client’s ping consists of a GET request for /style.css on the server [1]. Of course, the server isn’t a real web server, but it temporarily masquerades as one for the purposes of the initial handshake: it sends back a hardcoded HTTP response containing some random CSS file and launches the control shell for the user. At this point, the protocol used is as simple as it gets: the size of the following data, then the data itself. This being my first time doing socket programming, my implementation was a little weird and reserved the first five bytes of the data sent over the wire for the ASCII decimal representation of the size, but hey, it worked.
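As a rough illustration of that size-then-data framing, here is a minimal sketch with a five-byte ASCII decimal prefix. The names `frame` and `unframe` are hypothetical, and the zero-padding is an assumption, since the post only says that five bytes were reserved for the size.

```python
PREFIX_LEN = 5  # five bytes reserved for the ASCII decimal size


def frame(data: bytes) -> bytes:
    """Prepend a fixed-width ASCII decimal length to the payload."""
    if len(data) >= 10 ** PREFIX_LEN:
        raise ValueError("payload too large for a five-digit length prefix")
    # Zero-padding is an assumption; the padding scheme is not specified.
    return str(len(data)).zfill(PREFIX_LEN).encode("ascii") + data


def unframe(buf: bytes) -> tuple:
    """Split one framed message off the front of buf; return (payload, rest)."""
    if len(buf) < PREFIX_LEN:
        raise ValueError("incomplete length prefix")
    size = int(buf[:PREFIX_LEN].decode("ascii"))
    payload = buf[PREFIX_LEN:PREFIX_LEN + size]
    if len(payload) < size:
        raise ValueError("incomplete message")
    return payload, buf[PREFIX_LEN + size:]
```

One consequence of this layout is that five decimal digits cap a single message at 99,999 bytes.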
The majority of the work left for v0.2 was essentially deciding on the features that the control shell would have and implementing the “userland utilities” or commands you could run at the shell. The commands I thought of and implemented were:
exec: This was the first command I wrote. It executes one or more commands on the target, sending the stdout of all of them back in one big chunk of text. I later added a flag that would save the big chunk to a file in the archive directory. Useful for stuff like grabbing process dumps.
recon: Basically like exec, but the commands are pre-selected and tailored towards “reconnaissance” purposes.
shell: Launches an actual remote shell on the target (inside the original control shell). Was implemented really crudely in this version, with the execution backend on the client simply being a single call into Python’s subprocess library.
exfil: Exfiltrates files and saves them to the archive directory. Pretty standard. The current implementation is pretty crude and loads the entire file into memory, rather than paging the data somehow.
selfdestruct: Exit the client and delete script on disk. Without this, the user would have to do something weird (nay, treasonous?) like killing the client’s process from its own remote shell to completely turn off the client.
dlexec: Download an executable from the internet and execute it. Also pretty standard, useful for upgrading or installing additional tools on the target.
exit: Pretty self explanatory, this tells the client that the server’s done for now and that the client can now go back to sleep and begin pinging again in one time interval.
This work resulted in a decently functional prototype that could feasibly be used for post-exploitation.
Version 0.3 was a pretty arbitrary decision, but mostly involved significant refactoring of the backend code, with a couple of new user-facing features. One notable change was the refactoring of the entire codebase from imperative, C-style programming to object-oriented style, which gave the code much better structure. The communications protocol was also slightly refactored to be more standard by reserving the first four bytes for the binary data size value, which simultaneously conserved bytes sent over the wire and increased the maximum data that could be sent in one message between client and server. An additional shell command I implemented was chint, standing for “change interval”, which lets the server change the client’s ping delay interval after the client’s been started.
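To make the four-byte binary size prefix concrete, here is a minimal sketch using Python’s struct module. The function names are hypothetical, and the unsigned, network-byte-order encoding is an assumption; the post only says the first four bytes hold the binary size.

```python
import socket
import struct

# Assumed encoding: 4-byte unsigned integer in network (big-endian) byte order.
HEADER = struct.Struct("!I")


def send_msg(sock: socket.socket, data: bytes) -> None:
    """Send the payload preceded by its four-byte size."""
    sock.sendall(HEADER.pack(len(data)) + data)


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, since recv() may return fewer than requested."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_msg(sock: socket.socket) -> bytes:
    """Read the size header, then read that many payload bytes."""
    (size,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, size)
```

Under these assumptions the header shrinks from five bytes to four, while the largest single message grows from 99,999 bytes to 2**32 - 1 bytes.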
All that’s great, but the most significant set of improvements in my opinion were related to fleshing out the remote shell feature. While it “worked” to a decent degree for most standard commands there were two main problems with it that kept it from being a “real” shell. For reference, here’s what the code looked like for executing commands in v0.2.
[Code listing not preserved.]
For those that aren’t as familiar with Python’s subprocess library, this executes an arbitrary command line (cmd), sending the stderr to the stdout, and returns any stdout of the command.
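The original listing is not reproduced here, so the following is only a sketch of the behavior just described, not the actual v0.2 code. It assumes `subprocess.run` with the shell enabled, and `run_command` is a hypothetical name.

```python
import subprocess


def run_command(cmd: str) -> bytes:
    """Run cmd through the shell and return its combined stdout and stderr."""
    result = subprocess.run(
        cmd,
        shell=True,                # interpret cmd as a shell command line
        stdout=subprocess.PIPE,    # capture stdout
        stderr=subprocess.STDOUT,  # fold stderr into the same stream
    )
    return result.stdout
```

Note that a call like this only returns once the command has finished, so nothing can be forwarded while the command is still running.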
The first problem was that the shell output was not continuous. When executing a command like ls -R /, which typically results in lots of scrolling output in a normal terminal, my remote shell would instead block on the server end while the client executed the command to completion and sent over the entire stdout as one big piece. I solved this problem by adapting the code to continuously poll the stdout file descriptor for new lines of output, sending those over individually so that the server would get each line of output as soon as it was available.
The second problem was that if certain commands like ping were executed in the shell, the client would effectively become unusable because ping (when executed without the -c parameter) is usually ended by being sent an INT signal (SIGINT), typically by hitting Ctrl-C on the keyboard. The problem is, the client side had no mechanism to receive signals and send them to the running process, so it would be eternally running this unending process and the user would totally lose control of the target. To solve this problem, I needed a way for the client to simultaneously execute the requested command and listen for messages from the server, presumably telling the client to end the running process. To do this, I learned to use the select() function, which is an easy way for an application to multiplex data streams (in this case, the stdout of the running process and the socket connection to the server) and process their data without requiring concurrency at the application level.
The resulting code from these two fixes is below. Select takes in multiple file descriptors (File objects in Python) and returns which ones are readable (in this example). After it returns, I can check which file descriptors it returned, and proceed accordingly. In the expected case where we can read from the process’s stdout file descriptor, we get a line of stdout from the process, forwarding it to the server immediately. In the exceptional case where we can read from the socket, we receive the message, making sure it contains the proper keyword to end the process (‘shellterm’) and terminating the process if it does.
[Code listing not preserved.]
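Since the original listing is not reproduced here, the following is a hedged sketch of the select()-based approach described above rather than the actual v0.3 code. It makes a few assumptions: `stream_command` is a hypothetical name, the command is given as an argument list instead of a shell string, and output is forwarded in raw chunks with `os.read` rather than line by line, which sidesteps buffering issues when mixing select() with file objects. Calling select() on a pipe works on Unix-like systems but not on Windows.

```python
import os
import select
import socket
import subprocess

TERM_KEYWORD = b"shellterm"  # stop keyword mentioned in the post


def stream_command(args: list, sock: socket.socket) -> int:
    """Run args, forward output to sock as it appears, and stop on request."""
    proc = subprocess.Popen(
        args, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
    )
    out_fd = proc.stdout.fileno()
    while True:
        # Block until the process has output or the peer has sent something.
        readable, _, _ = select.select([out_fd, sock], [], [])
        if sock in readable:
            msg = sock.recv(4096)
            # An empty read means the peer disconnected; treat it as a stop.
            if not msg or TERM_KEYWORD in msg:
                proc.terminate()
                break
        if out_fd in readable:
            chunk = os.read(out_fd, 4096)
            if not chunk:  # EOF: the process closed its stdout
                break
            sock.sendall(chunk)
    proc.stdout.close()
    return proc.wait()
```

The loop needs no threads: a single select() call waits on both streams, and whichever becomes readable first decides what happens next.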
That’s all for v0.3. Again, the v0.3 decision was pretty arbitrary and I ultimately chose to ship it because all the other features I had lined up at the time were more labor intensive/experimental and I really wanted to get my fancy new remote shell into master :D.
At this point, I’m pretty satisfied with the state of the project, but as always, there’s more work to be done. Here are some future features/ideas/improvements that may or may not ever get implemented:
- crypto: If anything, I’d say the lack of encrypted communications is the one thing keeping this from being a really usable tool. Right now, communications are essentially sent in the clear, although they are base64 encoded for the slightest amount of obfuscation. Ideally I’d use Python’s ssl library, probably doing something like hardcoding a server public key into the client. A solution that wouldn’t be as much work, but would be only slightly more secure than the current cleartext communications, would be a basic xor cipher. That would be pretty easy to write, and would force an analyst to retrieve the key from memory or from the initial exchange over the network, depending on how I chose to implement it.
- protocol improvement: This shouldn’t actually be too hard to implement. Currently, the data section of a Poet message is typically some type of keyword, a space, then any relevant data. For example, to start a shell, the server sends over “shell”; to get recon data, the server sends “recon”; for an exec command, the server sends over “exec” followed by the commands to execute. Instead of using these string keywords, it would be possible to move them into the protocol as a single byte after the data size and have some sort of lookup table mapping each key to the appropriate action.
- interval fuzzing: Instead of having a strict, predictable delay time interval for client pings, I could implement some sort of fuzzing so that the delay time is slightly variable for further obfuscation purposes.
and last, but not least…
- botnet(?!): Now that I more or less have the infrastructure down for controlling a single client, it would be pretty cool to fork the project and adapt it for a more distributed design with multiple clients connecting to the server, all receiving commands to execute.
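Returning to the protocol improvement idea from the list above, here is one way the single-byte keyword could be laid out. This is a sketch of an unimplemented idea, so everything in it is an assumption: the names `Op`, `encode`, and `decode`, the opcode values, and the network byte order.

```python
import struct
from enum import IntEnum


class Op(IntEnum):
    """Lookup table from opcode byte to action; values are arbitrary."""
    SHELL = 1
    RECON = 2
    EXEC = 3


# Assumed layout: 4-byte payload size, then a 1-byte opcode, then the payload.
HEADER = struct.Struct("!IB")


def encode(op: Op, payload: bytes = b"") -> bytes:
    """Build a message: size, opcode byte, payload."""
    return HEADER.pack(len(payload), op) + payload


def decode(message: bytes) -> tuple:
    """Return (Op, payload); an unknown opcode raises ValueError."""
    size, raw_op = HEADER.unpack_from(message)
    payload = message[HEADER.size:HEADER.size + size]
    return Op(raw_op), payload
```

With a layout like this, a receiver could dispatch through a dictionary keyed on `Op` instead of comparing strings, and unknown opcodes are rejected at decode time.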
Again, all the code for this project is available on GitHub. Hopefully this was interesting/helpful for some people, and as always, thanks for reading!
[1] Writing this post and thinking about this again actually helped me discover a bug in the server: it would terminate if it happened to receive a non-client HTTP request while waiting for the client. This would enable a third party that wanted to mess with the Poet user to spam the Poet user’s machine with HTTP requests (assuming they knew the proper port to send to) at any interval smaller than the Poet interval, and effectively DoS Poet.