Add-In Review: DupeCleaner

Fri, Dec 5, 2008 | Jim Clark

A lot of WHS users, myself included, use their WHS machine as a repository for their digital data and use it to serve that data to other devices throughout the home.  I back this data up to other places, and I hope that others do so, too.  One of the housekeeping chores that results is dealing with duplicated data in the Shares folder.  As the number of source/client machines copying data in from the home network increases, so does the potential for duplication.

DupeCleaner is a utility that will search your Shares folder, sniff out duplicated files, and provide you the ability to delete the duplicated files.
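The author's comment on this post mentions hashing files, so comparing content hashes is one plausible way to run this kind of search.  The following is a minimal Python sketch of that general technique, not DupeCleaner's actual code.  The function names `file_digest` and `find_duplicates` and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks.

    SHA-256 is an assumption; the add-in's real hash function is not documented here.
    """
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Group every file under root by content hash.

    Returns a list of groups; each group is a sorted list of two or more
    paths whose contents hash to the same digest.
    """
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            by_digest[file_digest(path)].append(path)
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Because a sketch like this reads every byte of every file, it would be CPU and disk intensive on a large share, which fits the behavior described in this review.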

DupeCleaner installs like most other add-ins.  Simply open the WHS console Settings window, click on the Add-ins tab and install the program from the Available list.

As usual, the console will close.  Upon reopening the console, a new tab entry, DupeCleaner, will be found.

It is a very simple add-in with 3 options.  I believe that the first 2 options, Check and Clean Selected, are self-explanatory.  To make sure I had a duplicate, I copied a file from one folder to another.  I clicked on the Check icon to let DupeCleaner do its thing.  I waited.  And waited.  And waited some more.  I could tell that DupeCleaner was doing something, as I could see the HD lights in my backplane putting on a light show.  But it never returned any duplicate files.  Over the course of the next few days, I tried this again with the same results.  According to the author, Brent Friedman, DupeCleaner is quite CPU and HD intensive, and he suggested that I exclude some folders to see what would happen.  As I was running BOINC on my WHS at 75% CPU usage, this may have contributed to the slow response of DupeCleaner.

Which leads me to the 3rd option, Settings.  As can be seen in the following picture, you can exclude certain folders and files from the search process.  I excluded all the folders except the one I knew I had a duplicate in and tried again.

Within a very short time, I had a list of duplicated files.  I was actually quite relieved to learn that this add-in works!
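Excluding folders helps because the scan never has to read the files inside them.  As a rough illustration of the idea, here is a Python sketch of a directory walk that skips excluded folders.  It is not how DupeCleaner implements its Settings option; the function name `walk_files` and the `excluded_dirs` parameter are hypothetical.

```python
import os


def walk_files(root, excluded_dirs=()):
    """Yield file paths under root, skipping excluded folders.

    A folder is skipped at any depth if its name matches an entry in
    excluded_dirs (case-insensitive). Both names are hypothetical.
    """
    excluded = {name.lower() for name in excluded_dirs}
    for dirpath, dirnames, filenames in os.walk(root):
        # Pruning dirnames in place stops os.walk from descending into them.
        dirnames[:] = [d for d in dirnames if d.lower() not in excluded]
        for name in filenames:
            yield os.path.join(dirpath, name)
```

With every folder but one excluded, only that folder's files are ever opened, which would explain the much faster result.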

From this list, you can highlight one or more sets of duplicates.  Once they are highlighted, simply click on the Clean Selected icon, which will bring you to the following window.

From here, simply highlight the files you want to delete and click on the Clean Selected Files button at the bottom of the window.  I chose the preceding example to illustrate that DupeCleaner is not perfect at finding duplicated files.  All the files in that window have the same extension, but have different overall file names.

So what are my final thoughts on this add-in?  It is a very nice concept that has lots of potential.  The user interface is quite simple and intuitive.  The search code needs some serious tweaking, though.  I have only 800MB of total disk space in use, which is quite small compared to many WHS setups.
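In his comment on this post, the author says he plans to limit the need for hashing all files.  One common way to do that, shown here purely as an assumption about what such an optimization could look like, is to group files by size first and hash only the files that share a size with at least one other file.  The names `sha256_of` and `find_duplicates_size_first` are hypothetical.

```python
import hashlib
import os
from collections import defaultdict


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates_size_first(paths):
    """Return (duplicate_groups, number_of_files_hashed).

    Files with a unique size cannot have a duplicate, so they are never
    hashed. This is an illustrative sketch, not DupeCleaner's code.
    """
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)

    groups = []
    hashed = 0
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # unique size: skip the expensive hash
        by_digest = defaultdict(list)
        for path in same_size:
            by_digest[sha256_of(path)].append(path)
            hashed += 1
        groups.extend(sorted(g) for g in by_digest.values() if len(g) > 1)
    return groups, hashed
```

Reading a file's size only touches file system metadata, so on a share where most files differ in size, most of the disk reads would be avoided.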

In the end, it is a worthy add-in, as long as one uses the Settings option to exclude folders.  Hopefully, Brent can optimize his search code in a future release.

Author: Brent Friedman
Version Reviewed: 0.0.0.2 Beta
Release: 26Apr2008

This post was written by:

Jim Clark - who has written 85 posts on We Got Served.

Hello. I’m from the heartland of the U.S. Lots of corn and beans, although Iowa is a lot more than just farmland. It also has a few computer enthusiasts (no, not me!). I’ve been around PCs since I got my 1st PC XT a loooong time ago. WGS is one of the first sites I found centered around WHS. And the best. Every once in a while, I do get away from the KB and enjoy time with my wife and our 4 kids. And I do have a day job.

2 Responses to “Add-In Review: DupeCleaner”

  1. Kent Says:

    Due to the Single-instance storage technology in WHS, is this even necessary? I was under the impression that WHS does not store duplicate information twice, but just adds a pointer to the existing data.

    Or, does the Single-instance storage only apply to PC backups and not shares?

  2. Brent Says:

    @Kent:
    This is for when you have files that are the same in your shares. This has nothing to do with how Drive Extender’s Duplication works. This has to do with the same files in more than one location, duplicate files. Windows Home Server does not prevent this from happening, and neither does DupeCleaner. However, you can search for existing duplicates and clean them using DupeCleaner. In a future version I have been planning for a while, I will add an option to replace all but one duplicate with a pointer to the real file. This will cut the storage size down and not mess with the reason behind non-accidental duplicates. I also plan on decreasing the time it takes to search through data, limiting the need for hashing all files.
