Parse squid logfiles for CryptoLocker URLs

I'm trying to find a simple way to parse squid logfiles looking for CryptoLocker (http://en.wikipedia.org/wiki/CryptoLocker) URLs. The proxy in question denies these anyway because the current version of CryptoLocker doesn't authenticate and this proxy requires authentication, so right now it's a useful trigger to notice an infection after the fact but before it has downloaded enough to start infecting user files.

The URLs in question are <something>.net/com/biz/etc, and some examples of the something are:

qoemswifeitgetscytkircyfq
diqkbihifambsnvbylvtdcyyd
tlfmwcyfikzcuqoqgpzdpz

so they are random strings of varying length. The challenge is to find a way to identify them without an excessive amount of CPU time (e.g. not dictionary lookups).

Any suggestions?

Thanks

James
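One cheap check that avoids dictionary lookups is to measure the longest run of consecutive consonants in the hostname label, since pronounceable names rarely have long runs. The following is only a rough sketch, not a tested detector: the function names, the minimum length of 10 and the run threshold of 5 are assumptions chosen to fit the three sample names above, and real English words with long consonant clusters can still produce false positives.

```python
import re

# Assumption: 'y' is treated as a consonant, which is a simplification.
CONSONANT_RUN = re.compile(r"[bcdfghjklmnpqrstvwxyz]+")


def longest_consonant_run(label: str) -> int:
    """Return the length of the longest run of consecutive consonants."""
    runs = CONSONANT_RUN.findall(label.lower())
    return max((len(run) for run in runs), default=0)


def looks_random(label: str, min_length: int = 10, min_run: int = 5) -> bool:
    """Flag long labels that contain an unusually long consonant run.

    min_length and min_run are hypothetical thresholds, not measured values.
    """
    return len(label) >= min_length and longest_consonant_run(label) >= min_run


if __name__ == "__main__":
    samples = [
        "qoemswifeitgetscytkircyfq",
        "diqkbihifambsnvbylvtdcyyd",
        "tlfmwcyfikzcuqoqgpzdpz",
        "stackoverflow",
    ]
    for name in samples:
        print(name, longest_consonant_run(name), looks_random(name))
```

The check is a single regex pass per hostname, so its cost is small compared with reading the log itself.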

Taking advantage of the fact that the requests are DENIED, and that the URL is http://<name>.<tld>/ with no further path, this gets relatively few false positives:

zgrep DENIED /var/log/squid/access.log-201404* | egrep 'http://[^.]{10,}\.(com|biz|net)\/ '

but obviously it still hits on a few legitimate but long URLs. Given that it gets a tiny handful of hits for a non-infected computer, but hundreds and hundreds for an infected computer, it should be relatively easy to sift the results a bit and come up with something.

Further suggestions appreciated though!

Thanks

James
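The "sift the results a bit" step could be a per-client count with a cutoff, since an infected machine produces hundreds of hits and a clean one only a handful. This is a minimal sketch under several assumptions: the log is in squid's native access.log format (client address in the third field, result code in the fourth, URL in the seventh), the threshold of 50 is an arbitrary guess, and the regex is slightly tightened from the egrep one so the name cannot contain a slash. The names `suspicious_clients` and `read_logs` are made up for illustration.

```python
import glob
import gzip
import re
from collections import Counter

# Bare http://<name>.<tld>/ with a name of 10+ characters and no further path.
URL_RE = re.compile(r"^http://[^./]{10,}\.(?:com|biz|net)/$")


def suspicious_clients(lines, threshold=50):
    """Count DENIED requests for long bare hostnames per client address.

    Returns only the clients whose count reaches the (assumed) threshold.
    """
    hits = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # not a native-format access.log line
        client, result, url = fields[2], fields[3], fields[6]
        if "DENIED" in result and URL_RE.match(url):
            hits[client] += 1
    return {client: n for client, n in hits.items() if n >= threshold}


def read_logs(pattern):
    """Yield lines from every file matching pattern, gzip-compressed or plain."""
    for path in sorted(glob.glob(pattern)):
        opener = gzip.open if path.endswith(".gz") else open
        with opener(path, "rt", errors="replace") as fh:
            yield from fh


if __name__ == "__main__":
    found = suspicious_clients(read_logs("/var/log/squid/access.log-201404*"))
    for client, count in sorted(found.items()):
        print(client, count)
```

A handful of long but legitimate URLs per client stays well below the cutoff, so they drop out without needing a whitelist.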

James Harper wrote:
I'm trying to find a simple way to parse squid logfiles looking for CryptoLocker (http://en.wikipedia.org/wiki/CryptoLocker) URLs. The proxy in question denies these anyway because the current version of CryptoLocker doesn't authenticate and this proxy requires authentication, so right now it's a useful trigger to notice an infection after the fact but before it has downloaded enough to start infecting user files.
To ask a stupid question: I take it that this is a legitimate 'luv-main' subject because the parsing is being done on a Linux machine? I notice from the above URL: "CryptoLocker is a ransomware trojan which targets computers running Microsoft Windows[1] and first surfaced in September 2013 ..."; there is no suggestion that Linux machines are affected?

thanks

Rohan McLeod

James Harper <james.harper@bendigoit.com.au> writes:
I'm trying to find a simple way to parse squid logfiles looking for CryptoLocker (http://en.wikipedia.org/wiki/CryptoLocker) URLs. The proxy in question denies these anyway because the current version of CryptoLocker doesn't authenticate and this proxy requires authentication, so right now it's a useful trigger to notice an infection after the fact but before it has downloaded enough to start infecting user files.
The URLs in question are <something>.net/com/biz/etc, and some examples of the something are: qoemswifeitgetscytkircyfq diqkbihifambsnvbylvtdcyyd tlfmwcyfikzcuqoqgpzdpz
so they are random strings of varying length. The challenge is to find a way to identify them without an excessive amount of CPU time (e.g. not dictionary lookups).
Taking advantage of the fact that the requests are DENIED, and that the url is http://<name>.<tld>/ with no further path, this gets relatively few false positives:
zgrep DENIED /var/log/squid/access.log-201404* | egrep 'http://[^.]{10,}\.(com|biz|net)\/ '
but obviously it still hits on a few legitimate but long URLs. Given that it gets a tiny handful of hits for a non-infected computer, but hundreds and hundreds for an infected computer, it should be relatively easy to sift the results a bit and come up with something.
Further suggestions appreciated though!
logcheck has magic to remember the inode & offset of the last scan; if the inode hasn't changed, it starts from where it left off (otherwise from 0).

Or you could just use logcheck -- add your DENIED.*\.(com|biz|net)/ regexp to its "security alerts" list of regexps.
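For anyone scripting this by hand instead, the inode-and-offset trick is easy to imitate. The sketch below is not logcheck's actual code (as far as I know logcheck delegates this to its logtail helper); it only illustrates the idea described above. The function name `read_new_lines`, the JSON state file and its layout are assumptions, and it does not guard against reading a half-written last line.

```python
import json
import os


def read_new_lines(log_path, state_path):
    """Return lines appended to log_path since the previous call.

    The inode and byte offset of the last scan are kept in state_path
    (a hypothetical JSON file). If the inode changed (log rotated) or the
    file shrank (log truncated), reading restarts from offset 0.
    """
    try:
        with open(state_path) as fh:
            state = json.load(fh)
    except (FileNotFoundError, ValueError):
        state = {}  # first run, or unreadable state: start from scratch

    st = os.stat(log_path)
    offset = state.get("offset", 0) if state.get("inode") == st.st_ino else 0
    if offset > st.st_size:
        offset = 0  # same inode but smaller file: it was truncated

    with open(log_path, "rb") as log:
        log.seek(offset)
        data = log.read()
        new_offset = log.tell()

    # Remember where this scan stopped for the next run.
    with open(state_path, "w") as fh:
        json.dump({"inode": st.st_ino, "offset": new_offset}, fh)

    return data.decode("utf-8", errors="replace").splitlines()
```

Run from cron, each call then only sees the log lines written since the previous run, so the regexp never rescans old entries.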

trentbuck@gmail.com (Trent W. Buck) writes:
logcheck has magic to remember the inode & offset of the last scan; if the inode hasn't changed, it starts from where it left off (otherwise from 0).
Or you could just use logcheck -- add your DENIED.*\.(com|biz|net)/ regexp to its "security alerts" list of regexps.
Oh, but squid doesn't log via syslog(3) by default. So you'd need to tell logcheck to also read squid/access.log and to whitelist "expected" lines from that.