Dataset Comparison

I have two datasets. Both sets contain files in a variety of formats, but all have Windows-type file extensions.

There are files common to both sets, but there are also files unique to each set. I'd like to be able to identify which files are common, and also which are unique.

What's the best way to go about this?

I had thought that hashing all the files would be a start, and then I could compare the hash values. Or is there a smarter way of doing this?
 
It really depends on the accuracy required. Hashing is the best way to ensure a match is actually a match, but if file size and name are sufficient, then doing a DIR /s /N dump and importing it into Excel is a quick and dirty way to compare them.
 
LOL :)

The file names aren't the same (although the file contents are) because they've been exported from an eDiscovery package. It's going to have to be the hashing method.
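
For anyone landing here later, a rough sketch of the hashing approach in Python. It is only an illustration: SHA-256 is an arbitrary choice of hash, and the function names (hash_file, index_by_hash, compare_datasets) and folder paths are made up for the example. It also assumes the export left the file contents byte-for-byte identical; if the eDiscovery package touched the contents in any way, the hashes will not match.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def hash_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_by_hash(root):
    """Map each content hash to the list of files under root with that content."""
    index = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            index[hash_file(path)].append(path)
    return index


def compare_datasets(root_a, root_b):
    """Return (common, only_a, only_b), each a dict keyed by content hash."""
    a, b = index_by_hash(root_a), index_by_hash(root_b)
    # File names are ignored entirely; only the content hashes are compared.
    common = {h: (a[h], b[h]) for h in a.keys() & b.keys()}
    only_a = {h: a[h] for h in a.keys() - b.keys()}
    only_b = {h: b[h] for h in b.keys() - a.keys()}
    return common, only_a, only_b
```

Calling compare_datasets(r"D:\set_a", r"D:\set_b") (hypothetical paths) gives three dictionaries: hashes present in both sets with the matching file paths on each side, and the hashes found in only one set. Because each hash maps to a list of paths, duplicates within a single set show up as well.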
 