Thursday, 8 March 2012

Detecting Spoofed Emails with SIFT's pffexport and some Perl scripting


One likely issue facing today's forensicator is the sheer number of emails people keep in their Inboxes.
These numbers can grow at a phenomenal rate, especially if the user subscribes to multiple mailing lists.
Thinking from an Incident Response perspective - given a bunch of emails in the Inbox, how can we perform a quick check for any spoofed emails (i.e. forged "From" fields)?

Rob Lee (unsure if it was SANS' Rob Lee :o) recently suggested using pffexport for one of my previous posts dealing with email analysis. Like readpst, pffexport is installed on SANS SIFT and can be used to extract emails from MS Outlook ".pst" files. Unlike readpst however, pffexport will also automatically extract any attachments - no more pesky base64 decoding!

You can launch it using something like:
"pffexport /mnt/caseX/Documents\ and\ Settings/UserX/Local\ Settings/Application\ Data/Microsoft/Outlook/outlook.pst -t userx"

The results will be stored in the current directory under the "userx.export" sub-directory. The "-t userx" argument tells pffexport what base name to use for the export folder (it appends ".export" to it).
If you don't specify the "-t userx" argument, pffexport will use the default "outlook.pst.export" folder name.

Under "userx.export" we should now have a bunch of sub directories - for the retrieved emails go to the "userx.export/Top of Personal Folders" sub folder. There you will find separate folders for "Inbox", "Sent Items", "Deleted Items" etc. And under these folders you will find that each email message gets its own numbered folder. pffexport also separates out the mail header information, attachments and body text into their own respective files.
For our purposes, under each "Inbox" / "Deleted Items" message folder we can find an "InternetHeaders.txt" file which captures that particular email's header info.
For example, "outlook.pst.export/Top of Personal Folders/Inbox/Message00007/InternetHeaders.txt" might look something like:

Return-Path: <badguy@badboys.com>
X-Original-To: victim@spoofme.com
Delivered-To: x2789967@spunkymail-mx5.g.dreamhost.com
Received: from rv-out-0304.google.com (rv-out-0304.google.com [209.85.198.210])
    by spunkymail-mx5.g.dreamhost.com (Postfix) with ESMTP id 10B7A41CB9
    for <victim@spoofme.com>; Sun,  6 Jul 2008 00:58:25 -0700 (PDT)
Received: by rv-out-0304.google.com with SMTP id b20so2037328rvf.23
        for <victim@spoofme.com>; Sun, 06 Jul 2008 00:58:24 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=google.com; s=beta;
        h=domainkey-signature:received:message-id:date:from:to:subject
         :mime-version:content-type:content-transfer-encoding;
        bh=2JhxMj72b75tltAvNhxeDuk6ECfLUR1gUXtJ7Hx6b60=;
        b=TT6rYh6A2JkKPxXT3aJbHpCsGbUiWjLxfROkZe2kKhBzd+4bwR1QfAvakfjIj/YXt9
         oHGqQz1qaVe6n2gCVsVA==
DomainKey-Signature: a=rsa-sha1; c=nofws;
        d=google.com; s=beta;
        h=message-id:date:from:to:subject:mime-version:content-type
         :content-transfer-encoding;
        b=r3HhXow0DdCIy9QnmfByeJ+D57+/rOQ7zAmtqb4HOclOdURK8AIOFOw1vOiyC1Hnso
         fi5Id7j4XDRIDMBLwHRg==
Received: by 10.114.108.8 with SMTP id g8mr1855176wac.28.1215331104741;
        Sun, 06 Jul 2008 00:58:24 -0700 (PDT)
Message-ID: <2638157.101571215331104745.JavaMail.ins-frontend@google.com>
Date: Sun, 6 Jul 2008 00:58:24 -0700 (PDT)
From: goodguy@goodguys.com
To: victim@spoofme.com
Subject: Google Email Verification
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
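
To make the two fields of interest concrete, here is a minimal sketch (not the script presented later in this post) that pulls the "Return-Path" and "From" addresses out of header text using core Perl only. The header_address helper and its deliberately simple address pattern are assumptions made for illustration; it is not a full RFC 5322 parser.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the bare (lower-cased) address from the first
# header field named $name, or undef if the field or an address is missing.
# $headers is the full text of one InternetHeaders.txt file.
sub header_address
{
    my ($headers, $name) = @_;

    # Unfold continuation lines (a line break followed by a space or tab)
    (my $unfolded = $headers) =~ s/\r?\n[ \t]+/ /g;

    foreach my $line (split(/\r?\n/, $unfolded))
    {
        # Anchor on the field name so "Received: from ..." or a Subject
        # containing "From" cannot be mistaken for the From field
        next unless $line =~ /^\Q$name\E:\s*(.*)$/i;
        my $value = $1;
        # Deliberately simple address pattern (an assumption, not RFC 5322)
        return lc($1) if $value =~ /([^\s<>"(),;:]+\@[^\s<>"(),;:]+)/;
        return undef;
    }
    return undef;
}

# A shortened copy of the example headers shown above
my $sample = join("\n",
    'Return-Path: <badguy@badboys.com>',
    'Received: from rv-out-0304.google.com (rv-out-0304.google.com [209.85.198.210])',
    '    by spunkymail-mx5.g.dreamhost.com (Postfix) with ESMTP id 10B7A41CB9',
    'From: goodguy@goodguys.com',
    'Subject: Google Email Verification',
    '');

printf("Return-Path = %s\n", header_address($sample, 'Return-Path'));
printf("From        = %s\n", header_address($sample, 'From'));
```

Anchoring the match at the start of the line matters because several other header lines can contain the word "From" somewhere in their text.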


FYI For this exercise I will be using a heavily modified copy of the pffexport extracted results from an M57 Jean outlook.pst. I have placed this modified copy in "/home/sansforensics/mod-outlook.pst.export".
I have deleted a shed load of "Inbox" emails (for ease of testing) and then manually edited a couple of "InternetHeaders.txt" files to reflect a spoof attack, i.e. I have modified the "Return-Path" (badguy@badboys.com) so it does not match the "From" field (goodguy@goodguys.com).

In real life, the "From" field can be manipulated by the submitting party when interfacing with the recipient's email server. It is not usually verified.
The "Return-Path" field is (usually) set by the recipient's email server at the time of reception, based on the envelope sender given in the SMTP "MAIL FROM" command, and consequently it should reflect the actual sending party.
For more information on the "Return-Path" and "From" fields see this programmer's forum.

Coding

So knowing all this, it's now time to write a Perl script to parse all of the "InternetHeaders.txt" files under the "Inbox" and "Deleted Items" message folders and then check that the "From" fields match the "Return-Path" fields.
If we get a mismatch, we will print out both a warning and the "From" / "Return-Path" address fields. We will also add the capability to exclude user specified email addresses (to reduce the resultant data set).
After processing all relevant emails, we will also print out the total number of suspect emails.
I've (inelegantly) named the script "pffexport-spoofcheck.pl" and saved it under "/usr/local/bin/". I've also made it world executable by typing "chmod a+x /usr/local/bin/pffexport-spoofcheck.pl".



Here's the code:


# START CODE

#!/usr/bin/perl -w

# Perl script to take the output of pffexport and check the "From" email address fields against the "Return-Path" in InternetHeaders.txt
# from each Inbox/Deleted Items email

use strict;

use Getopt::Long;
use File::Find;
use Email::Address;

my $version = "pffexport-spoofcheck.pl v2012.03.05";
my $help = ''; # help flag
my @directories; # input directories from -dir flag (must use absolute paths)
my @exfilter; # exclude addresses in form user@domain
my $mismatchcount = 0;

GetOptions('help|h' => \$help,
    'x=s@' => \@exfilter,
    'dir=s@' => \@directories);

if ($help||@directories == 0)
{
    print("\n$version\n");
    print("Perl script to process the output of pffexport and check for spoofed emails.\n");
    print("\nUsage: pffexport-spoofcheck.pl [-h|help] [-dir dirname] [-x user\@domain]\n");
    print("-h|help ........... Help (print this information). Does not run anything else.\n");
    print("-dir dirname ...... Directory containing exported pffexport Inbox/Deleted Items files.\n");
    print("-x user\@domain .... Exclude this email address from processing.\n");
    print("\nExamples:\n");
    print("pffexport-spoofcheck.pl -dir \"/cases/outlook.pst.export/Top of Personal Folders/Inbox/\" -x goodguy\@goodguys.com\n\n");
    print("pffexport-spoofcheck.pl -dir \"/cases/outlook.pst.export/Top of Personal Folders/Deleted Items/\" -x goodguy\@goodguys.com\n\n");
    print("pffexport-spoofcheck.pl -dir \"/cases/outlook.pst.export/Top of Personal Folders/\" -x goodguy\@goodguys.com\n\n");
    exit;
}

# Setup Email::Address exclusion filter array
my @exEmailAddresses;
foreach my $email (@exfilter)
{
    # Email::Address->new takes (phrase, address, comment), so pass the address as the 2nd argument
    push(@exEmailAddresses, Email::Address->new(undef, $email));
}

# Main processing loop
print("\nRunning $version\n");

# Recursively process folders specified using the -dir flag
# Note: Will NOT follow symbolic links to files
find(\&ProcessDir, @directories);

print ("\nFound $mismatchcount mismatched emails\n");

# Gets called for each file/folder under user specified dir
sub ProcessDir
{
    # $File::Find::dir is the current directory name,
    # $_ is the current filename within that directory
    # $File::Find::name is the complete pathname to the file.
    my $filename = $File::Find::name; # should contain absolute path eg /cases/outlook.pst.export/Top of Personal Folders/Inbox/
    my @retaddresses;
    my @fromaddresses;
    my $fromuser;
    my $fromhost;
    my $retuser;
    my $rethost;

    # InternetHeaders.txt files from "Inbox" or "Deleted Items"
    if ( ($filename =~ /InternetHeaders\.txt$/) && (($filename =~ /Inbox/)||($filename =~ /Deleted Items/)) )
    {
        # Open the file for reading
        if (open(IHFILE, "<", $filename))
        {
            # Read each line of the file
            while (<IHFILE>) # Assigns each line in turn to $_
            {
                if ($_ =~ /^Return-Path:/i) # anchored so only the actual header field matches
                {           
                    # If it contains a "Return-Path" string try to extract email address
                    @retaddresses = Email::Address->parse($_); 
                }
                elsif ($_ =~ /^From:/i) # anchored so eg a Subject line containing "From" cannot overwrite the address
                {           
                    # If it contains a From string try to extract email address
                    @fromaddresses = Email::Address->parse($_);
                }
            }
            if (@retaddresses > 0)
            {
                $retuser = $retaddresses[0]->user;
                $rethost = $retaddresses[0]->host;
            }
            else
            {
                close IHFILE;
                return; # Bail out - there should be at least 1 retaddress field
            }
            if (@fromaddresses > 0)
            {
                $fromuser = $fromaddresses[0]->user;
                $fromhost = $fromaddresses[0]->host;
            }
            else
            {
                close IHFILE;
                return; # Bail out - there should be at least 1 fromaddress field
            }

            # Check if retaddresses or fromaddresses are supposed to be filtered out/ignored
            if (@exEmailAddresses > 0)
            {
                foreach my $email (@exEmailAddresses)
                {
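                    # \Q...\E quotes regex metacharacters (eg "." or "+") in the extracted strings
                    if ( (($email =~ /\Q$retuser\E/)&&($email =~ /\Q$rethost\E/))
                        || (($email =~ /\Q$fromuser\E/)&&($email =~ /\Q$fromhost\E/)) )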
                    {
                        close IHFILE;
                        return; # don't print anything or spoofcheck, this email is being filtered out
                    }
                }
            }   

            # Normal case (no filtering) - Compare the Return-Path to the From path and print out if there's a discrepancy
            if (($retuser ne $fromuser)||($rethost ne $fromhost))
            {
                print ("\n*** Mismatched From and Return-Path addresses in $filename ***\n");
                print ("From user = $fromuser, From host = $fromhost\n");
                print ("Return-Path user = $retuser, Return-path host = $rethost\n");
                $mismatchcount++;
            }
            close IHFILE;
        }
        else
        {
            print("Unable to open $filename for analysis\n");
        }
    }
}

# END CODE


Code Summary

The first few sections are similar to those in "exif2map.pl" - there's a GetOptions to handle user specified arguments and a Help screen printout check. One thing to note is the inclusion of the "use Email::Address" Perl module. This pre-existing module makes it easier to look for/handle email addresses. We can install it by typing:
 "sudo cpan Email::Address".
After the help section, we set up an array list (exEmailAddresses) of any user specified email addresses which are to be excluded from processing.
Next we call the File::Find "find" function to call the ProcessDir subroutine for each file/directory it finds under the user specified directory.
After all directories have been processed, we then print the "mismatchcount" and exit.
The ProcessDir subroutine first checks if the filename path (eg "/cases/outlook.pst.export/Top of Personal Folders/Inbox/Message00001/InternetHeaders.txt") contains both "InternetHeaders.txt" AND either "Inbox" or "Deleted Items". If it does, it opens the "InternetHeaders.txt" file, reads each line and tries to extract "Return-Path" and "From" email addresses (using the Email::Address->parse method). If it's a valid "InternetHeaders.txt" file, there should be at least 1 of each email address. If either one is missing, we bail out of the ProcessDir function (close the "InternetHeaders.txt" file and call return) and the next file/directory will have ProcessDir called against it.
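
The directory walk described above can be tried on its own with a small sketch like the following. It uses only the core File::Find module; the find_header_files name and the default path are made up for illustration and are not part of the script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: return a sorted list of InternetHeaders.txt paths
# found under @dirs whose path includes an "Inbox" or "Deleted Items" folder.
sub find_header_files
{
    my @dirs = @_;
    my @found;
    find(sub {
            # Inside the callback, $_ is the bare file name and
            # $File::Find::name is the path including the start directory
            return unless $_ eq 'InternetHeaders.txt';
            return unless $File::Find::name =~ m{(?:^|/)(?:Inbox|Deleted Items)/};
            push(@found, $File::Find::name);
        }, @dirs);
    my @sorted = sort @found;
    return @sorted;
}

# Assumed default path, used for illustration only
my $top = shift(@ARGV) || '/cases/outlook.pst.export/Top of Personal Folders';
if (-d $top)
{
    print("$_\n") foreach find_header_files($top);
}
else
{
    print("No such directory: $top\n");
}
```

Collecting the paths first and processing them afterwards is a design choice for this sketch; the script itself does its processing inside the File::Find callback.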
Next, we check to see if the user specified any exclusion filters. If so, we compare the extracted "From" and "Return-Path" email addresses against each excluded address and bail out of ProcessDir if either one matches.
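
As an alternative illustration of the exclusion step, the sketch below keeps the excluded addresses in a hash and does an exact, case-insensitive lookup. This is an assumption-laden variant rather than what the script does (the script pattern-matches against Email::Address objects); build_exclusions and is_excluded are hypothetical names, and the bounce address in the sample data is made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build a lookup table of excluded addresses, lower-cased so that
# "GoodGuy@GoodGuys.com" and "goodguy@goodguys.com" are treated alike
sub build_exclusions
{
    my %excluded = map { (lc($_) => 1) } @_;
    return \%excluded;
}

# Returns 1 if either address of a message is on the exclusion list, else 0
sub is_excluded
{
    my ($excluded, $from, $return_path) = @_;
    return ($excluded->{lc($from)} || $excluded->{lc($return_path)}) ? 1 : 0;
}

my $exclusions = build_exclusions('googlealerts-noreply@google.com');
foreach my $pair (['googlealerts-noreply@google.com', 'bounce@alerts.bounces.google.com'],
                  ['goodguy@goodguys.com', 'badguy@badboys.com'])
{
    my ($from, $return_path) = @$pair;
    printf("%-35s %s\n", $from,
        is_excluded($exclusions, $from, $return_path) ? 'skipped' : 'checked');
}
```

An exact lookup avoids accidental partial matches, at the cost of not catching per-message bounce addresses unless the "From" address is the one being excluded.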
The last major part occurs if we have a "From" and "Return-Path" and they are not being filtered out.
We compare the "From" user and the "From" domain against the "Return-Path" user and domain. If there's a mismatch, we print out our message plus the various email fields and increment our "mismatchcount" counter.
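
The comparison itself can be sketched in isolation as below. Note one deliberate difference from the script, which is an assumption on my part: the host part is compared case-insensitively (domain names are not case sensitive), while the user part is still compared exactly. The split_address and is_mismatch names are hypothetical, and the third sample pair is made up to show the case handling.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split "user@host" at the last "@"; returns (user, host) or an empty list
sub split_address
{
    my ($address) = @_;
    my $at = rindex($address, '@');
    return () if $at < 1 || $at == length($address) - 1;
    return (substr($address, 0, $at), substr($address, $at + 1));
}

# Returns 1 if the addresses differ, 0 if they match, undef if either is malformed
sub is_mismatch
{
    my ($from, $return_path) = @_;
    my ($fromuser, $fromhost) = split_address($from);
    my ($retuser, $rethost)   = split_address($return_path);
    return undef unless defined $fromhost && defined $rethost;
    return (($fromuser ne $retuser) || (lc($fromhost) ne lc($rethost))) ? 1 : 0;
}

foreach my $pair (['goodguy@goodguys.com', 'badguy@badboys.com'],
                  ['admin@associatedcontent.com', 'webadmin@associatedcontent.com'],
                  ['allsongs@n.npr.org', 'allsongs@N.NPR.ORG'])
{
    my ($from, $return_path) = @$pair;
    printf("%s vs %s: %s\n", $from, $return_path,
        is_mismatch($from, $return_path) ? 'MISMATCH' : 'match');
}
```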

Testing

For this testing scenario, I have conjured up 2 spoofed emails in the "Inbox" and left a bunch of googlealerts (and other mailing list emails) in both the "Inbox" and "Deleted Items".
We will now verify that we can detect these 2 spoofed emails and also reduce the data set with our filter parameter (-x).
Additionally, we will point the script to the "Inbox", "Deleted Items" and finally the "Top of Personal Folders" directories and see that it processes all relevant emails.

Processing the "Inbox" without filters

Typing "pffexport-spoofcheck.pl  -dir /home/sansforensics/mod-outlook.pst.export/Top\ of\ Personal\ Folders/Inbox/" results in the following output:

Running pffexport-spoofcheck.pl v2012.03.05

*** Mismatched From and Return-Path addresses in /home/sansforensics/mod-outlook.pst.export/Top of Personal Folders/Inbox/Message00045/InternetHeaders.txt ***
From user = googlealerts-noreply, From host = google.com
Return-Path user = 3M25zSBQKBF0BJJBG95G9MON-IJM9KGTBJJBG9.7JHE95IHac.6DU, Return-path host = alerts.bounces.google.com
.
.
.
*** Mismatched From and Return-Path addresses in /home/sansforensics/mod-outlook.pst.export/Top of Personal Folders/Inbox/Message00044/InternetHeaders.txt ***
From user = googlealerts-noreply, From host = google.com
Return-Path user = 3BGRzSBQKBCQGOOGLEALERTS-NOREPLYGOOGLE.COMJEANMfh.BIZ, Return-path host = alerts.bounces.google.com

Found 37 mismatched emails
sansforensics@SIFT-Workstation:~$


Note: I have edited out a bunch of output entries to save space. As you can see, there are a lot of mismatched emails besides the 2 we created. This is because mailing lists typically have different "Return-Path" and "From" fields. The "Return-Path" is usually set to a separate address that collects bounce (delivery failure) messages. To reduce these output results we will set a filter in the next test.
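
When the unfiltered output is this noisy, one way to pick candidate "-x" filters is to tally the mismatches by "From" address and look at the most frequent senders first. The following is only a sketch of that idea, assuming the mismatched address pairs have already been collected into a list; tally_by_from is a hypothetical helper and is not part of the script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count mismatches per From address (lower-cased), most frequent first.
# Each argument is an array reference of the form [from, return_path].
# Returns a list of [address, count] pairs.
sub tally_by_from
{
    my @pairs = @_;
    my %count;
    $count{lc($_->[0])}++ foreach @pairs;
    return map { [$_, $count{$_}] }
           sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count;
}

# Sample pairs taken from the output shown in this post
my @mismatches = (
    ['googlealerts-noreply@google.com',
     '3M25zSBQKBF0BJJBG95G9MON-IJM9KGTBJJBG9.7JHE95IHac.6DU@alerts.bounces.google.com'],
    ['googlealerts-noreply@google.com',
     '3BGRzSBQKBCQGOOGLEALERTS-NOREPLYGOOGLE.COMJEANMfh.BIZ@alerts.bounces.google.com'],
    ['goodguy@goodguys.com', 'badguy@badboys.com'],
);

printf("%3d  %s\n", $_->[1], $_->[0]) foreach tally_by_from(@mismatches);
```

Senders that appear only once or twice are the ones worth a closer look; the high-count senders are more likely to be mailing lists, although that still needs to be checked by hand before excluding them.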

Processing the "Inbox" with filter on "googlealerts-noreply@google.com"

Typing "pffexport-spoofcheck.pl  -dir /home/sansforensics/mod-outlook.pst.export/Top\ of\ Personal\ Folders/Inbox/ -x googlealerts-noreply@google.com" results in the following output:

Running pffexport-spoofcheck.pl v2012.03.05

*** Mismatched From and Return-Path addresses in /home/sansforensics/mod-outlook.pst.export/Top of Personal Folders/Inbox/Message00007/InternetHeaders.txt ***
From user = goodguy, From host = goodguys.com
Return-Path user = badguy, Return-path host = badboys.com

*** Mismatched From and Return-Path addresses in /home/sansforensics/mod-outlook.pst.export/Top of Personal Folders/Inbox/Message00026/InternetHeaders.txt ***
From user = admin, From host = associatedcontent.com
Return-Path user = webadmin, Return-path host = associatedcontent.com

*** Mismatched From and Return-Path addresses in /home/sansforensics/mod-outlook.pst.export/Top of Personal Folders/Inbox/Message00037/InternetHeaders.txt ***
From user = allsongs, From host = n.npr.org
Return-Path user = newsletters, Return-path host = n.npr.org

*** Mismatched From and Return-Path addresses in /home/sansforensics/mod-outlook.pst.export/Top of Personal Folders/Inbox/Message00008/InternetHeaders.txt ***
From user = accounts, From host = goodguys.com
Return-Path user = badguy, Return-path host = badguys.com

Found 4 mismatched emails
sansforensics@SIFT-Workstation:~$


Much better! We can actually see our 2 x "badguy" spoof attempts, and we can go further by also filtering out "admin@associatedcontent.com" and "allsongs@n.npr.org".

Processing the "Inbox" with multiple filters

Typing "pffexport-spoofcheck.pl  -dir /home/sansforensics/mod-outlook.pst.export/Top\ of\ Personal\ Folders/Inbox/ -x googlealerts-noreply@google.com -x admin@associatedcontent.com -x allsongs@n.npr.org" results in the following output:

Running pffexport-spoofcheck.pl v2012.03.05

*** Mismatched From and Return-Path addresses in /home/sansforensics/mod-outlook.pst.export/Top of Personal Folders/Inbox/Message00007/InternetHeaders.txt ***
From user = goodguy, From host = goodguys.com
Return-Path user = badguy, Return-path host = badboys.com

*** Mismatched From and Return-Path addresses in /home/sansforensics/mod-outlook.pst.export/Top of Personal Folders/Inbox/Message00008/InternetHeaders.txt ***
From user = accounts, From host = goodguys.com
Return-Path user = badguy, Return-path host = badguys.com

Found 2 mismatched emails
sansforensics@SIFT-Workstation:~$


Aha! We have narrowed down our 50 odd emails to our 2 suspected spoof emails!
Now let's try processing the "Deleted Items" (there shouldn't be any spoofs)...



Processing the "Deleted Items" without filters

Typing "pffexport-spoofcheck.pl  -dir /home/sansforensics/mod-outlook.pst.export/Top\ of\ Personal\ Folders/Deleted\ Items/" results in the following output:

Running pffexport-spoofcheck.pl v2012.03.05

*** Mismatched From and Return-Path addresses in /home/sansforensics/mod-outlook.pst.export/Top of Personal Folders/Deleted Items/Message00001/InternetHeaders.txt ***
From user = googlealerts-noreply, From host = google.com
Return-Path user = 3HER-SBQKBCcJRRJOHDOHUWV-QRUHSObJRRJOH.FRPMHDQPik.ELc, Return-path host = alerts.bounces.google.com

*** Mismatched From and Return-Path addresses in /home/sansforensics/mod-outlook.pst.export/Top of Personal Folders/Deleted Items/Message00002/InternetHeaders.txt ***
From user = googlealerts-noreply, From host = google.com
Return-Path user = 3zwuASBQKBKMJRRJOHDOHUWV-QRUHSObJRRJOH.FRPMHDQPik.ELc, Return-path host = alerts.bounces.google.com

Found 2 mismatched emails


So we can see that there are just 2 mismatched google alert emails in the "Deleted Items".
Now we will try specifying the "Top of Personal Folders" directory (unfiltered) and see if it picks up the 37 mismatched "Inbox" emails PLUS the 2 "Deleted Items" mismatches = 39 total.

Processing the "Top of Personal Folders" without filters


Typing "pffexport-spoofcheck.pl  -dir /home/sansforensics/mod-outlook.pst.export/Top\ of\ Personal\ Folders/" results in the following output:

Running pffexport-spoofcheck.pl v2012.03.05

*** Mismatched From and Return-Path addresses in /home/sansforensics/mod-outlook.pst.export/Top of Personal Folders/Inbox/Message00045/InternetHeaders.txt ***
From user = googlealerts-noreply, From host = google.com
Return-Path user = 3M25zSBQKBF0BJJBG95G9MON-IJM9KGTBJJBG9.7JHE95IHac.6DU, Return-path host = alerts.bounces.google.com
.
.
.
*** Mismatched From and Return-Path addresses in /home/sansforensics/mod-outlook.pst.export/Top of Personal Folders/Deleted Items/Message00002/InternetHeaders.txt ***
From user = googlealerts-noreply, From host = google.com
Return-Path user = 3zwuASBQKBKMJRRJOHDOHUWV-QRUHSObJRRJOH.FRPMHDQPik.ELc, Return-path host = alerts.bounces.google.com

Found 39 mismatched emails

OK we found all 39 mismatches from specifying the "Top of Personal Folders". Now we will filter out the googlealerts by typing:

"pffexport-spoofcheck.pl  -dir /home/sansforensics/mod-outlook.pst.export/Top\ of\ Personal\ Folders/ -x googlealerts-noreply@google.com" which results in the following output:

Running pffexport-spoofcheck.pl v2012.03.05

*** Mismatched From and Return-Path addresses in /home/sansforensics/mod-outlook.pst.export/Top of Personal Folders/Inbox/Message00007/InternetHeaders.txt ***
From user = goodguy, From host = goodguys.com
Return-Path user = badguy, Return-path host = badboys.com

*** Mismatched From and Return-Path addresses in /home/sansforensics/mod-outlook.pst.export/Top of Personal Folders/Inbox/Message00026/InternetHeaders.txt ***
From user = admin, From host = associatedcontent.com
Return-Path user = webadmin, Return-path host = associatedcontent.com

*** Mismatched From and Return-Path addresses in /home/sansforensics/mod-outlook.pst.export/Top of Personal Folders/Inbox/Message00037/InternetHeaders.txt ***
From user = allsongs, From host = n.npr.org
Return-Path user = newsletters, Return-path host = n.npr.org

*** Mismatched From and Return-Path addresses in /home/sansforensics/mod-outlook.pst.export/Top of Personal Folders/Inbox/Message00008/InternetHeaders.txt ***
From user = accounts, From host = goodguys.com
Return-Path user = badguy, Return-path host = badguys.com

Found 4 mismatched emails
sansforensics@SIFT-Workstation:~$


Which shows that we can point "pffexport-spoofcheck.pl" at the "Top of Personal Folders" and it will process (and filter) both "Inbox" and "Deleted Items" emails. WHEW!

Miscellaneous Note
By typing something like:
 
grep -in "goodguy\@goodguys\.com" /home/sansforensics/mod-outlook.pst.export/Top\ of\ Personal\ Folders/Inbox/Message*/InternetHeaders.txt
we can search a bunch of emails for a particular email address. The results will look something like:
/home/sansforensics/mod-outlook.pst.export/Top of Personal Folders/Inbox/Message00007/InternetHeaders.txt:26:From: goodguy@goodguys.com

So that's all for today folks.
I'm not sure how useful the script will be in real life but as a Perl programming exercise I think it was interesting enough (for me anyway). Once again, any comments/suggestions will be appreciated.

3 comments:

  1. Very useful and practical post! Thank you very much for writing this!

  2. ...and now my whole original post is gone. :(

    "Not all spam/spoofed email follows this pattern though, so don't just discard the rest without checking them.
    Here are two example headers from messages I recently received (slightly cleaned up):"
    (headers- see pastebin link)
    "Just so you know, all of these fields can be spoofed, so you can't trust any of them.
    Rob Jones' "Internet Forensics" is an excellent book for learning more about header analysis, among other subjects.
    Nice post by the way- keep 'm coming. :)"

    Replies
    1. Hi Anon,

      Yeah, I knew this script wouldn't handle all types of spoofing - just one particular type. But it did give me a chance to code some Perl.

      Thanks for commenting, the encouragement and the book reference!
