By now you have most likely started to generate some hash databases of your own
using hashdog, and it is time to start putting them to use. In this blog post I
will describe how I usually categorize my hash databases and how I use them. You
might want to do things differently based on the type of forensic investigations
you are involved in or the type of environment you are supporting.
Common hash categories
You are probably already using the Reference Data Set (RDS) from the National
Software Reference Library (NSRL) as one of your hash databases. These databases
have been created by NIST in a controlled environment and contain hashes from
applications and operating systems, mostly generated from files still on their
original media. These files are known to be good, and the databases are usually
put in a hash category called ‘KnownGood’. This category contains hash databases
of files that are known to be benign, files that you are not interested in
investigating further.
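As a rough illustration of what a ‘KnownGood’ lookup amounts to, here is a minimal Python sketch. It assumes the hash database has already been loaded into a plain set of uppercase SHA-1 hex digests; the function names `sha1_of_file` and `is_known_good` are hypothetical and are not part of hashdog or any forensic tool.

```python
import hashlib


def sha1_of_file(path, chunk_size=65536):
    """Return the SHA-1 of a file as an uppercase hex string."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def is_known_good(path, known_good_hashes):
    """Check a file against a set of known-good digests.

    known_good_hashes is assumed to be a set of uppercase SHA-1 hex
    strings, loaded beforehand from whatever database format you use.
    """
    return sha1_of_file(path) in known_good_hashes
```

A file for which `is_known_good` returns `True` can be dropped from the review list; everything else stays in scope.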
You might also have hash databases that contain hashes of malicious files that
you want to search for. Those databases are part of a hash category known as the
‘KnownBad’ category. As with the ‘KnownGood’ category, you do not really want to
spend your time analyzing any files that match this category. If you get a match
for a hash in any of the databases you have in this category, chances are high
that the malware has already been analyzed by multiple organizations before you
came across it. Your time is better spent trying to figure out how the malware
got onto the system you are investigating, and whether other artifacts such as
registry keys and temporary files are consistent with the previous analyses.
After all, the file could just have been placed on your system to throw you off
and keep you from finding the real anomaly.
Extending the KnownGood and KnownBad categories
When I started to write hashdog and was generating hash databases of my own, it
did not seem right to put some of my databases in the ‘KnownGood’ category. When
I was creating databases from files downloaded directly from the vendor or
extracted from a verified ISO image, I put the hash databases I generated in the
‘KnownGood’ category. However, when I was generating hashes from standard OS
build images and files listed in application shares, I had no really good way of
guaranteeing that the files were absolutely free from malware. After thinking a
lot and discussing it with Glenn, I came up with a solution that works for the
kind of forensic investigations I am mostly involved in: looking for anomalies
in a system that could indicate that the system has been compromised.
Instead of using just the two hash categories mentioned above, I decided to use
a third and a fourth category, calling them the ‘KnownUsed’ and ‘KnownForbidden’
categories. The ‘KnownUsed’ category contains databases of hashes generated from
files that are actively being used by the organization I am investigating. Any
hits I get on hashes from databases in this category are treated differently
than matches from one of my ‘KnownGood’ databases. For instance, if a file has a
hash that is part of any of the ‘KnownGood’ databases, that file is discarded
immediately without any further analysis. If I get a hit for a hash in one of
the databases in the ‘KnownUsed’ category, the file is not completely discarded,
but I will not pay much attention to it, at least not during my initial
analysis. The argument for this is that if I am looking for an anomaly, it is
highly unlikely that the anomaly is a file that I have previously generated a
hash for and included in my ‘KnownUsed’ database.
The fourth category, which I call ‘KnownForbidden’, contains databases of hashes
created from files that are part of applications not allowed within the
organization. Common hashes to put in this category are generated from files
belonging to non-corporate encryption software such as Truecrypt, penetration
testing software like Metasploit, and privacy and cleaning tools like CCleaner.
These are applications that are not malicious in themselves but could indicate
malicious use if they are found on a system I am investigating. As with the
‘KnownBad’ category, I want to be alerted if any of these files are detected,
but I do not want to analyze the files themselves. To sum things up, these are
the categories I use and the way I treat any matches I get for files in the hash
databases.
- KnownGood - Discard any files from further analysis.
- KnownBad - Alert on any matches but do not analyze the file.
- KnownUsed - Put these files aside for later analysis.
- KnownForbidden - Alert on any matches but do not analyze the file.
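The list above can be sketched as a small lookup table in Python. This is only an illustration under stated assumptions: each category is represented as a set of uppercase hex digests, the names `Action`, `CATEGORY_ACTIONS`, `PRECEDENCE` and `triage` are hypothetical, and the order in which categories are checked when a hash appears in more than one of them is my own choice, not something the workflow above prescribes.

```python
from enum import Enum


class Action(Enum):
    DISCARD = "discard from further analysis"
    ALERT = "alert, but do not analyze the file"
    DEFER = "put aside for later analysis"
    ANALYZE = "unknown file, analyze"


# Category-to-action mapping taken from the list above.
CATEGORY_ACTIONS = {
    "KnownGood": Action.DISCARD,
    "KnownBad": Action.ALERT,
    "KnownUsed": Action.DEFER,
    "KnownForbidden": Action.ALERT,
}

# Assumption: alerting categories are checked first, so a hash that
# appears in several databases is never silently discarded.
PRECEDENCE = ("KnownBad", "KnownForbidden", "KnownGood", "KnownUsed")


def triage(file_hash, databases):
    """Return (category, action) for a file hash.

    databases maps a category name to a set of uppercase hex digests.
    A hash that matches no category is returned as (None, Action.ANALYZE).
    """
    normalized = file_hash.upper()
    for category in PRECEDENCE:
        if normalized in databases.get(category, set()):
            return category, CATEGORY_ACTIONS[category]
    return None, Action.ANALYZE
```

Anything that comes back with `Action.ANALYZE` is a file that none of the databases recognize.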
By breaking up the hash categories this way, it is easier for me to focus on the
files that I have not seen before: the unknown files whose functionality is not
known.