Freedom Hosting 2: Forums

In a previous installment, I gave an introduction to the Freedom Hosting 2 database dumps. The main thrust of that article was identifying groups of sites that could be analyzed together, rather than trying to analyze each one of the nearly 11,000 sites individually. One of the most common uses of FH2 sites is forum hosting, which is the topic of this post.

Overview

Several results really surprised me! For example, a couple of forums have only a handful of legitimate posts and yet tens of thousands of spam posts. The spammers didn't seem to care that they were posting on boards that nobody was actually looking at.

wordcloud for spam
Word cloud for an FH2 forum that was overrun with spam.

The even bigger surprise is that I found at least one site that appears to be distributing stolen personal information. I generally believe that fear of the dark web preying on the average consumer is overwrought. Commercials like this one make me cringe so hard I could pull a muscle:

Shadowy figure sitting in the dark, face illuminated solely by a computer monitor, set in front of an ambiguously tech-ish background? Check, check, and check.

And yet... and yet! One of the sites contained in the dump is a Portuguese carding site that appears to have posted consumers' stolen personal information.

Word cloud for an FH2 forum that traffics in stolen identities.

I cannot confirm if this data is real, but it certainly seems plausible given the street names, first names, birth years, and zip codes that are contained in the database dump.

Some parts of the analysis were less surprising. I found the typically expected dark web topics like hacking and meta-discussion of Tor itself. On a more depressing note, the data indicates that Freedom Hosting 2 trafficked in a large volume of child exploitation content. I will dig into this and more in the analysis below.

Analysis

Since the word "forum" has several meanings, I am going to define it narrowly in this article to refer to sites that allow for posting messages in topical threads where other site members can read and reply. Forums are typically subdivided into smaller "boards" that each have a name and a theme in order to organize all of the conversations. This definition of "forum" should be contrasted with "chat room", where all of the messages are published in one long stream.

In the first article, I pointed out that there were 10,992 active sites on FH2 at the time that it was hacked. Out of those, only 797 sites had any database contents. Of these, I estimate that 159 are running some type of forum software.

To begin analyzing the forums, I ran SQL queries on all of the suspected forums to pull out information such as the forum's name, the number of posts, the number of accounts, etc. I also pulled samples of usernames, post subjects, etc. in order to get a sense for what the site is: what languages are used? What topics are discussed?
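A per-forum summary of this kind could be sketched roughly as follows. This is only an illustration, not the actual queries used for the analysis: it uses Python's built-in sqlite3 module with an in-memory database standing in for one restored dump, and the table names (phpbb_posts, phpbb_users, smf_messages, smf_members, posts, users) are assumed platform defaults that a real site may have changed with a custom prefix.

```python
import sqlite3

# Assumed default table names per platform; real installs may use a custom prefix.
PLATFORM_TABLES = {
    "phpbb": {"posts": "phpbb_posts", "users": "phpbb_users"},
    "smf": {"posts": "smf_messages", "users": "smf_members"},
    "punbb": {"posts": "posts", "users": "users"},
}


def summarize_forum(conn, platform):
    """Return the number of posts and accounts in one forum database."""
    counts = {}
    for label, table in PLATFORM_TABLES[platform].items():
        # Table names come from the fixed mapping above, never from user input.
        row = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()
        counts[label] = row[0]
    return counts


# Hypothetical stand-in for one restored phpBB database.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE phpbb_posts (post_id INTEGER, post_subject TEXT)")
demo.execute("CREATE TABLE phpbb_users (user_id INTEGER, username TEXT)")
demo.execute("INSERT INTO phpbb_posts VALUES (1, 'Welcome to phpBB 3')")
demo.execute("INSERT INTO phpbb_users VALUES (1, 'Admin')")

print(summarize_forum(demo, "phpbb"))
```

Keeping the platform differences in one mapping means the same function can be run against every suspected forum, with the results collected into a spreadsheet.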

spreadsheet of forum data
My forum analysis spreadsheet.

Dozens of sites have either zero or one post, many with titles like, "Welcome to phpBB 3". These are the messages created automatically when the forum software is first installed. This suggests many of these sites were set up and then never used. It also suggests that many of these sites don't warrant a deep analysis. The following chart shows the distribution of posts across all of these sites.

posts on each forum site

The horizontal axis shows each site, and each bar represents the number of posts on that site. The three largest sites are much bigger than the rest of the sites. In fact, only 22 sites have more than 100 posts each, and the bars for most of the sites are so small that they are not even visible. The rest of this post will focus on the 22 sites with more than 100 posts each.
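The cutoff described above amounts to a simple filter over the per-site post counts. Here is a minimal sketch, assuming the counts have already been collected into a dictionary; the site names and numbers below are made-up placeholders, not values from the dump.

```python
def sites_above(post_counts, threshold=100):
    """Return (site, posts) pairs with more than `threshold` posts, largest first."""
    busy = [(site, n) for site, n in post_counts.items() if n > threshold]
    return sorted(busy, key=lambda pair: pair[1], reverse=True)


# Hypothetical post counts keyed by placeholder site names.
example_counts = {"site_a": 5000, "site_b": 1, "site_c": 250, "site_d": 0}

print(sites_above(example_counts))
```

Sites with zero or one post (the untouched default installs) drop out automatically, which leaves only the forums that are worth a closer look.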

I'll start with a quick skim of the top 5 sites:

Type    Onion               Name                      Posts
smf     kav3udmxn34tke34    N/A                       37,160
punbb   s7yzinqketc6k6ke    Это сделал милашка ^_^    25,791
phpbb   tmoxh4kr5xfnvxun    (blank)                   22,714
phpbb   uudhz333oblcbsru    • TREND FORUM •           4,878
smf     hforum53umdxo7b3    N/A                       3,810

The names of these sites are not very informative. Simple Machines Forum ("smf") doesn't include the site name in the database table. (The name is stored in a configuration file, and as I mentioned in the previous post, the leaked data includes only databases, not files.) For reasons that elude me, the site tmoxh4kr5xfnvxun appears to have simply left its name blank. The Russian name translates (according to Google) to "It did cutie ^_^", to which my only reaction is ¯\_(ツ)_/¯.

The next chart shows the number of posts across all sites grouped by platform.

posts by platform

Although phpBB is one of the dominant forum platforms on the light web, Simple Machines Forum and PunBB are very popular in this dataset. SMF and PunBB are both focused on being simple to install and operate, which may reflect FH2 site owners being less technically oriented and favoring simpler software. These numbers might also reflect a perception of a lack of safety in phpBB, which has a slew of security vulnerabilities that are undesirable when trying to maintain anonymity on the dark web.

Another explanation is that—as we'll see later—the most popular SMF and PunBB sites are overwhelmed with spam. If phpBB is less susceptible to spam posts, then spam would artificially inflate the apparent share of the SMF and PunBB sites.

Next, I reviewed a sample of topics and subject lines from the forums to identify the primary language and topic of each site.

sites by language

This chart shows the primary languages for the sites, where each site is weighted by number of posts. Note that this is not counting the number of individual posts in each language; instead, I've determined the primary language of the site and then weighted it by the number of posts on that site. So if a site uses multiple languages, all of its posts are counted toward its primary language, and its other languages are underrepresented in this chart.
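To make the weighting concrete, here is a small sketch of how such a post-weighted share could be computed. The function name and the sample figures are hypothetical and chosen only for illustration; they are not taken from the dump.

```python
from collections import defaultdict


def weighted_share(sites):
    """Map each primary language to its fraction of all posts.

    `sites` is a list of (primary_language, post_count) pairs, one per site.
    Every post on a site is credited to that site's primary language.
    """
    totals = defaultdict(int)
    for primary_language, posts in sites:
        totals[primary_language] += posts
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {lang: n / grand_total for lang, n in totals.items()}


# Hypothetical sites: two English forums and one Russian forum.
example_sites = [("English", 600), ("Russian", 300), ("English", 100)]
shares = weighted_share(example_sites)

for lang, share in shares.items():
    print(f"{lang}: {share:.0%}")
```

Under this scheme a single very large site can dominate the chart, which is worth keeping in mind when reading the results.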

Unsurprisingly, English is the dominant language.

sites by topic

This chart shows the sites categorized by topic (and weighted by posts in the same manner as the previous chart). These topics are a bit vague, of course, since each site will have various topics being discussed within, but I tried to capture the main focus of the site as best as I could determine from sampling random messages.

Ignoring spam for a moment, the biggest category is child exploitation (CE). These sites exchange pictures and videos depicting child victims of sexual abuse. Due to the sensitive nature of this topic, I won't post any further details, but taking the message text at face value, these sites are trafficking in extremely illegal and reprehensible content. This finding calls back to one of the claims I discussed in my first post in the series; the hacker who leaked this data justified his/her actions by claiming that Freedom Hosting 2 was providing the infrastructure for massive amounts of child exploitation. This analysis leads me to believe that claim is true.

The remaining four categories are a bit interrelated and sometimes difficult to tease apart. For example, "hacking" forums may describe techniques that are useful for illegally obtaining somebody's credit card numbers, i.e. "carding". The "markets" might sell hacking and carding tools. But even if we add all four of these categories together, it is still far less volume than the child exploitation category.

The final category is "spam", meaning that the site itself is so flooded with spam messages that I could not determine any other topic. As best as I can tell, some of these sites were overwhelmed with spam almost from the minute they were set up. Here are the first 20 posts from the Russian forum It did cutie ^_^ in chronological order.

Username         Subject
Admin            Test topic
maka             тесты тесты
iuriwilewib      Heart iritis; recommended secondarily occult reconstruction.
KellyMcCou       Im happy I finally signed up
uimirueqabiyo    Two entirely overlying high-dose pneumonia, duplex.
ephexoxus        Shocked narrows abduction, coughs, scarring; hypoglycaemics.
viijawimow       Mostly survival, measure, to: leprosy snugly.
imuceawife       Watch cognitions, ventilate axilla ligaments.
tigujazove       R; x-ray brightest focuses jaw.
upexoacitusac    The increasing levitra medicine slow-growing specialists herniae.
yimusyaji        Allergic cerebello-pontine pessary self-harm pan-intestinal commences.
agalamaehuh      They nocturia, thinning mellitus canadian pharmacy timings edges.
inofileledepr    Dull, canteen, progeny incidentally, your elliptical dressing.
ocegowujeru      Both femur: trapdoor granulocytic, hepatocytes, testing.
onbakaket        Ethical non-union, numerous history, tumour.
ytiyopan         After diabetic, lifestyles viagra lactate, sulfate transplantation.
exokipaqut       Urinary coexistent contraction, neuropathies saponification.
aadorenusah      Magendie tretinoin cream putatively levitra grasping elevators units.
ahocamjal        What plate interfascicular feeble 107.
enasahawjidi     The numbness, distinct ammonium, obturator; periumbilical proximally.

You can see the first post is by the administrator and is just a test post. The second post is by a user named maka and says "test test" in Russian. Nearly all of the rest of the posts have apparently random names like yimusyaji and text that appears to be generated by an AI. As far as I can tell, this site has never had any real users posting on it.

Conclusions

Thanks for making it this far! As always, hit us up on Twitter with your comments and questions. Keep an eye out for future articles in this series (you can subscribe with RSS!) where I will dig into the chat rooms and some odds and ends.