Best free program that recognises duplicate files on my PC?

Hi, I'm looking for a free program that will show me any duplicate files on my PC. It'll need to recognise stuff like file size, name, details and so on.

I have a feeling I have a lot of the same music in different places!

Thanks.
 
I have just modified a script I use for finding duplicates. It is a PowerShell script, so you will need to install PowerShell first if you aren't on Windows 7, which already includes it. You can grab it from Microsoft here.

Once it is installed, or if you have Windows 7, launch a PowerShell console as an Administrator [Start > Programs > Accessories > Windows PowerShell, then right-click Windows PowerShell and choose Run as administrator].

Type in the following then press enter:
Code:
Set-ExecutionPolicy RemoteSigned
Type Y and press enter at the prompt.

The above tells PowerShell to run scripts written on this machine without a signature, while scripts downloaded from the internet must be signed by a trusted publisher. You need only do it once.

OK, now create a new file (I will assume the name get-duplicates.ps1) and paste the code at the bottom of this post into it. Save the file somewhere sensible.
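
If you want to check what the policy is before or after changing it, something like the following should do. This is only a sketch: it assumes PowerShell 2.0 or later, where the -List and -Scope parameters are available. The last line is an alternative to the command above that changes the policy for your own account only, so it should not need an Administrator console.

```powershell
# Show the effective policy for the current session.
Get-ExecutionPolicy

# Show the policy set at each scope (assumes PowerShell 2.0 or later).
Get-ExecutionPolicy -List

# Alternative: set the policy for the current user only.
Set-ExecutionPolicy RemoteSigned -Scope CurrentUser
```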

Back in a PowerShell console type the following to run the script:
Code:
C:\path\to\script\get-duplicates.ps1 C:\whatever
You will need to enter the full path to the get-duplicates.ps1 file [or you can drag and drop the script file into an open PowerShell window] and change C:\whatever to the directory you wish to search for duplicates.
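
If either path contains spaces it has to be quoted, and a quoted script path needs the call operator (&) in front of it, otherwise PowerShell just echoes the string back. A minimal example, where both paths are made-up placeholders:

```powershell
# Placeholder paths: substitute the real script location and music folder.
& "C:\path with spaces\get-duplicates.ps1" "D:\My Music"
```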

The script will only look for music files. Edit line 3 of the script (the $file_types list) to add more extensions as necessary. The script finds duplicates in two passes. First it collects all files with exactly the same byte count, which is a good indication that two files are the same, but not foolproof. It then opens these files and MD5 hashes the contents. Files whose MD5s match can be treated as duplicates for all practical purposes.

Duplicates are written to the console and also to a file called dupes.txt, which will be created in the same folder as the script. Be warned, if you intend to use this on your entire HDD, it can take a long time.
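
To see the size-then-hash idea on its own, here is a small self-contained sketch. It is not the script from this post: it assumes PowerShell 4.0 or later (for the built-in Get-FileHash cmdlet and the -File switch), and the temporary folder and file names are invented for the demonstration. All three files have the same size, but only two have the same content.

```powershell
# Sketch only: assumes PowerShell 4.0 or later, where Get-FileHash is built in.
# The folder and file names below are made up for the demonstration.
$demo = Join-Path ([System.IO.Path]::GetTempPath()) ("dupe-demo-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $demo | Out-Null

Set-Content -Path (Join-Path $demo "a.txt") -Value "same content"
Set-Content -Path (Join-Path $demo "b.txt") -Value "same content"   # true duplicate of a.txt
Set-Content -Path (Join-Path $demo "c.txt") -Value "diff content"   # same size, different bytes

# Stage 1: keep only files that share a byte count with at least one other file.
$sameSize = Get-ChildItem -Path $demo -File |
    Group-Object -Property Length |
    Where-Object { $_.Count -gt 1 } |
    ForEach-Object { $_.Group }

# Stage 2: hash the survivors and keep groups whose MD5 digests match.
$duplicateGroups = @($sameSize |
    Get-FileHash -Algorithm MD5 |
    Group-Object -Property Hash |
    Where-Object { $_.Count -gt 1 })

# Print the full path of every file in each duplicate group.
foreach ($group in $duplicateGroups) {
    $group.Group | ForEach-Object { $_.Path }
}

# Clean up the temporary folder.
Remove-Item -Path $demo -Recurse -Force
```

The size check is cheap because it only reads directory information, so the expensive hashing is limited to files that could plausibly be duplicates.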

Have fun! :)

Code:
param ([string] $Path = (Get-Location))

$file_types = @("*.mp3", "*.ogg", "*.wav", "*.flac", "*.mp4", "*.wma")

function Get-MD5([System.IO.FileInfo] $file = $(throw 'Usage: Get-MD5 [System.IO.FileInfo]'))
{
    # This Get-MD5 function sourced from:
    # http://blogs.msdn.com/powershell/archive/2006/04/25/583225.aspx
    $stream = $null;
    $cryptoServiceProvider = [System.Security.Cryptography.MD5CryptoServiceProvider];
    $hashAlgorithm = new-object $cryptoServiceProvider
    $stream = $file.OpenRead();
    $hashByteArray = $hashAlgorithm.ComputeHash($stream);
    $stream.Close();

    ## We have to be sure that we close the file stream if any exceptions are thrown.
    trap
    {
        if ($stream -ne $null) { $stream.Close(); }
        break;
    }

    return [string]$hashByteArray;
}

function Get-Duplicates([string]$Path)
{

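    # @( ) forces an array, so an empty result means zero loop iterations
    # (PowerShell 2.0 runs foreach once over $null). Folders are skipped.
    $fileGroups = @(Get-ChildItem $Path -Recurse -Include $file_types `
    | Where-Object { -not $_.PSIsContainer -and $_.Length -gt 0 } `
    | Group-Object Length `
    | Where-Object { $_.Count -gt 1 });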

    foreach ($fileGroup in $fileGroups)
    {
        foreach ($file in $fileGroup.Group)
        {
            Add-Member NoteProperty ContentHash (Get-MD5 $file) -InputObject $file;
        }

        $fileGroup.Group `
        | Group-Object ContentHash `
        | Where-Object { $_.Count -gt 1 };
    }
}


# @( ) keeps $dupes an array even when nothing was found.
$dupes = @(Get-Duplicates $Path)

$outfile = (Split-Path -Parent $MyInvocation.MyCommand.Definition) + "\dupes.txt"
Set-Content $outfile (Get-Date -f "dd-MM-yyyy HH:mm")

foreach($dupe in $dupes)
{
    foreach($file in $dupe.Group)
    {
        Write-Host $file.FullName
        Add-Content $outfile $file.FullName
    }
    
    Write-Host
    Add-Content $outfile "`n"
}
 
How can I better trap/fix this error?

You cannot call a method on a null-valued expression.
At D:\d_desktop\powershell\duplicates.ps1:12 char:29
+ $stream = $file.OpenRead <<<< ();
+ CategoryInfo : InvalidOperation: (OpenRead:String) [], ParentContainsErrorRecordException
+ FullyQualifiedErrorId : InvokeMethodOnNull

Code:
function Get-MD5([System.IO.FileInfo] $file = $(throw 'Usage: Get-MD5 [System.IO.FileInfo]'))
{
    # This Get-MD5 function sourced from:
    # http://blogs.msdn.com/powershell/archive/2006/04/25/583225.aspx
    $stream = $null;
    $cryptoServiceProvider = [System.Security.Cryptography.MD5CryptoServiceProvider];
    $hashAlgorithm = new-object $cryptoServiceProvider
    $stream = $file.OpenRead();  # <<<< line 12, where the error is raised
    $hashByteArray = $hashAlgorithm.ComputeHash($stream);
    $stream.Close();
...
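
The message says $file itself is null when OpenRead() is called. One possible cause, offered as a guess rather than a diagnosis: under PowerShell 2.0 a foreach loop over $null still runs once, so if no same-size files are found the function can be handed a null file. Whatever the cause, a guarded variant along these lines would turn the failure into a warning. It is a sketch, not the original author's function: it assumes PowerShell 2.0 or later for try/catch/finally, and returning $null for unreadable files is a choice made here for illustration.

```powershell
function Get-MD5([System.IO.FileInfo] $file)
{
    # Guard: skip null input instead of failing on $file.OpenRead().
    if ($file -eq $null)
    {
        Write-Warning "Get-MD5 was called with no file; skipping."
        return $null
    }

    $stream = $null
    $hashAlgorithm = New-Object System.Security.Cryptography.MD5CryptoServiceProvider
    try
    {
        $stream = $file.OpenRead()
        # Same string form as the original: the hash bytes joined by spaces.
        return [string] $hashAlgorithm.ComputeHash($stream)
    }
    catch
    {
        # Locked or unreadable files end up here; report and carry on.
        Write-Warning ("Could not hash " + $file.FullName + ": " + $_.Exception.Message)
        return $null
    }
    finally
    {
        # Runs on success and on failure, so the file handle is always released.
        if ($stream -ne $null) { $stream.Close() }
    }
}
```

With this version the calling code would also need to skip any file whose hash comes back as $null, otherwise several unreadable files could be grouped together as if they matched.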
 
DoubleKiller is my preferred one. It needs a bit more manual setup and checking, but I find it does a better job than any of the alternatives, as you can set it to find files that are similar.
 