Possible major advance in spammer techniques: Bayesian classifiers

On September 26, 2015, I saw the first pair of examples of what appeared to be much smarter SMTP spam. Both the envelope 'From ' sender and the internal 'From: ' sender were credibly forged to impersonate two personal friends, Michael Siladi and Alison Stern. That wasn't new: Forging of the envelope sender has been a well-tested art since the infamous revenge-spam attack against Joe Doll in 1997 that gave the world the term 'Joe-job'.[1]

What was new was the personalised tailoring of some of the body text _and_, most especially, the use of recipients in the To: and Cc: headers who were among Michael and Alison's frequent contact addresses -- other people in the science fiction convention-running community and private mailing lists for convention-running.

Not that it matters, but the injection point of those mails, back in September, was IP address 212.40.185.205 in Germany, with the prior-hop Received header (before the one for the German mail provider) claiming that it had originated at an ISP POP in Bogota, Colombia. Both Michael and Alison are in Mountain View, California.

Back then in September, I sent Michael and Alison a detailed header analysis, pointing out the probable significance of the highly personalised recipient list: I inferred that the spammers had not only harvested detailed traffic information from malware on the MS-Windows box of someone in Michael & Alison's social circle, but were also now using traffic analysis -- turning loose Bayesian classifier software on harvested data concerning who corresponds with whom -- to programmatically compose _more-credible_ spam targeted at the forged sender's known associates, with some message-text contents likewise personalised to the sender.

Today, another blast of forged mail arrived on about six diverse mailing lists for science-fiction convention-running plus the 'basfa' discussion mailing list of the Bay Area Science Fiction Society -- purporting to be from Michael Siladi, as before. Each of the targeted mailing lists duly transmitted the forgeries to all recipients. The targeted mailing lists + other CC'd/To'd recipients were picked from ones Michael corresponds with. The phrase 'artshow15' in the body text is the name of a private mailing list operated for the 2015 BayCon, a local science fiction convention in the San Francisco Bay Area of which Michael is convention chair.

I have posted full data on the BASFA copy of the forgery, plus my personal analysis, here:

http://linuxmafia.com/pipermail/conspire/2015-November/008205.html
http://linuxmafia.com/pipermail/conspire/2015-November/008206.html

Notice my point that Michael's ISP, Netcom, is still in 2015 failing to publish any MX-authentication data (SPF, DKIM, or variants thereof) in its DNS, so it's no wonder that forgeries of Michael's address could not be detected.

In my second post, I concluded:

  I expect a lot of mailing lists will soon have forged-mail spam problems -- not a problem until now. This is a wake-up call.

Anyone else seeing this? Other thoughts?

[1] See the 'Joe-job' entries on http://linuxmafia.com/kb/Mail/ , if you don't know this story. (I was among the many recipients of the flamebait attempt to lure anti-spam people to attack Joe Doll, probably because I was a regular poster to net.admin.net-abuse.email at the time.)
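The kind of header analysis mentioned above (walking the Received chain back towards the injection point) can be sketched roughly as follows. This is only an illustration: the sample message, its hostnames, and its addresses are made-up documentation placeholders, not the real forged mail, and the helper name received_hops is hypothetical. Received headers are prepended by each relay, so the topmost one is the most recent hop; anything below the last relay you trust may itself be forged.

```python
import re
from email import message_from_string

# Hypothetical sample message; hosts and IPs are documentation placeholders.
RAW = """\
Received: from mx.example.org (mx.example.org [203.0.113.7])
    by lists.example.net with ESMTP; Tue, 3 Nov 2015 23:00:02 -0800
Received: from pop.example.com (unknown [198.51.100.23])
    by mx.example.org with SMTP; Wed, 4 Nov 2015 07:59:58 +0100
From: Forged Sender <someone@example.com>
To: list@example.net
Subject: test

body
"""

# Relays conventionally record the connecting IPv4 address in square brackets.
IP_IN_BRACKETS = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def received_hops(raw_message):
    """Return the bracketed IPv4 address of each Received header, newest first.

    A hop without a bracketed IPv4 address is reported as None.
    """
    msg = message_from_string(raw_message)
    hops = []
    for header in msg.get_all("Received", []):
        match = IP_IN_BRACKETS.search(header)
        hops.append(match.group(1) if match else None)
    return hops


if __name__ == "__main__":
    for depth, ip in enumerate(received_hops(RAW)):
        print(depth, ip)
```

The last entry in the list is the earliest claimed hop, which is only as trustworthy as the relay that recorded it.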

On Tue, Nov 03, 2015 at 11:14:18PM -0800, Rick Moen wrote:
> On September 26, 2015, I saw the first pair of examples of what appeared to be much smarter SMTP spam. Both the envelope 'From ' sender and the internal 'From: ' sender were credibly forged to impersonate two personal friends, Michael Siladi and Alison Stern. That wasn't new: Forging of the envelope sender has been a well-tested art since the infamous revenge-spam attack against Joe Doll in 1997 that gave the world the term 'Joe-job'.[1]

are they spamming to the list or directly to list subscribers?

spamming a list and forging a sender-address trawled from the list archives (or via a spammer subscribing and archiving the list) has long been a spammer practice.

ditto with sending to addresses known to be subscribed to a list, with forged from address also known to be subscribed.

craig

--
craig sanders <cas@taz.net.au>

Quoting Craig Sanders (cas@taz.net.au):
> are they spamming to the list or directly to list subscribers?

Both to the mailing list and to individual addresses who are established correspondents with the forged sender. (This is obvious to me because I run in very much the same circles.) I retained only one of the ~6 forged mails sent out purporting to be from Michael Siladi today (the BASFA one), but many of the mailing lists (unlike BASFA's) have no public archives, and some of the Cc/To co-recipients were probably not subscribers, either.

> spamming a list and forging a sender-address trawled from the list archives (or via a spammer subscribing and archiving the list) has long been a spammer practice.
>
> ditto with sending to addresses known to be subscribed to a list, with forged from address also known to be subscribed.

All of these things are individually old, though forging the envelope sender too hasn't been the general rule. What's new, it appears to me, is the intelligent use of traffic analysis in composition of the payload and set of recipients. I'm seeing a greatly more focussed targeting of credible correspondents only and inclusion of body-text snippets actually characteristic of the forged sender.

(I'm really _not_ new to this. ;-> )

Let me elaborate on my surmise: The Never Say Anything people in Fort Meade, their various Five Eyes co-conspirators in Australia, Canada, Enn-Zed, and the UK, and an increasing tribe of corporate bandits such as Palantir Technologies have lately made it fashionable to set loose Bayesian classifier software on large traffic data sets, looking for exploitable patterns.

Operators of botnets vacuum up huge datasets all the time, about malware-infected MS-Windows users' associates and the mutual communication back and forth. It was only a matter of time before botnet-using criminal enterprises started doing the NSA thing on their datasets and using traffic analysis to programmatically craft much-smarter spam. I think that day has recently come.

And I think that MTAs that service mailing lists are soon going to need to be _really_ diligent about validating posters' domains' MX IPs. Which, in turn, is going to require domain owners to get serious about consistently providing authentication data.

My domain does. Michael Siladi's large, established ISP, Netcom, still doesn't. Just a data point. Make of it what you will.
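To make the validation point concrete, here is a minimal sketch of the check an MTA could apply when a domain does publish SPF data. It is not a full SPF evaluator: it assumes the domain's TXT records have already been fetched (no DNS lookup is done here), it looks only at ip4: and ip6: mechanisms, and it ignores include:, a, mx, redirect= and non-default qualifiers, all of which a real implementation must handle. The function names and the sample records are hypothetical; the addresses are documentation ranges.

```python
import ipaddress


def find_spf(txt_records):
    """Return the first TXT string that is an SPF policy, or None."""
    for record in txt_records:
        lowered = record.lower()
        if lowered == "v=spf1" or lowered.startswith("v=spf1 "):
            return record
    return None


def ip_allowed(spf_record, client_ip):
    """Check client_ip against the ip4:/ip6: mechanisms of an SPF record.

    Simplification (assumption of this sketch): every other mechanism,
    and any term with a qualifier other than the default "+", is skipped.
    """
    ip = ipaddress.ip_address(client_ip)
    for term in spf_record.split()[1:]:
        term = term.lstrip("+")
        key, _, value = term.partition(":")
        if key.lower() in ("ip4", "ip6") and value:
            network = ipaddress.ip_network(value, strict=False)
            if ip.version == network.version and ip in network:
                return True
    return False


if __name__ == "__main__":
    # Hypothetical TXT records for a domain that publishes SPF.
    published = [
        "some-unrelated-verification-token",
        "v=spf1 ip4:192.0.2.0/24 ip6:2001:db8::/32 -all",
    ]
    spf = find_spf(published)
    print(ip_allowed(spf, "192.0.2.55"))   # inside the listed ip4 range
    print(ip_allowed(spf, "203.0.113.9"))  # not listed anywhere

    # A domain publishing no SPF record gives the receiver nothing to check.
    print(find_spf(["some-unrelated-verification-token"]))
```

When find_spf returns None, as it would for a domain that publishes no such record, the receiving MTA has no basis for rejecting a forged sender on these grounds.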