Is your hard drive filled with backups of holiday pictures or downloaded images? How do you find and delete similar files if they have different formats, resolutions or rotations?
DuMP3 (derived from "Duplicate MP3") is a Java program that finds duplicate or similar files.
It finds them by calculating a fingerprint from the image, audio or text data of each file and then comparing the fingerprints. It does not compare filenames or ID3 tags, although plugin classes could be written to do so. Calculated fingerprints can be stored in a MySQL database so that they do not have to be calculated again.
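To make the idea of comparing fingerprints rather than raw bytes concrete, here is a minimal sketch for images. It is not DuMP3's actual algorithm, which this page does not describe; the class name `FingerprintSketch`, the 8x8 grid and the 64-bit layout are all assumptions chosen for illustration. It also assumes the image has already been decoded to grayscale values (0 to 255) and is at least 8x8 pixels.

```java
// Hypothetical illustration only: an average-brightness fingerprint.
// This is NOT taken from the DuMP3 source code.
class FingerprintSketch {

    // Reduce a grayscale image to 64 bits: average the pixels in each cell
    // of an 8x8 grid, then set one bit for every cell that is brighter
    // than the mean of all 64 cells. Assumes width and height are >= 8.
    static long fingerprint(int[][] gray) {
        int h = gray.length;
        int w = gray[0].length;
        double[] cell = new double[64];
        double total = 0;
        for (int cy = 0; cy < 8; cy++) {
            for (int cx = 0; cx < 8; cx++) {
                int y0 = cy * h / 8, y1 = (cy + 1) * h / 8;
                int x0 = cx * w / 8, x1 = (cx + 1) * w / 8;
                long sum = 0;
                for (int y = y0; y < y1; y++) {
                    for (int x = x0; x < x1; x++) {
                        sum += gray[y][x];
                    }
                }
                cell[cy * 8 + cx] = (double) sum / ((y1 - y0) * (x1 - x0));
                total += cell[cy * 8 + cx];
            }
        }
        double mean = total / 64;
        long bits = 0;
        for (int i = 0; i < 64; i++) {
            if (cell[i] > mean) {
                bits |= 1L << i;
            }
        }
        return bits;
    }

    // Number of differing bits between two fingerprints (Hamming distance).
    // A small distance suggests similar images; 0 means identical fingerprints.
    static int distance(long a, long b) {
        return Long.bitCount(a ^ b);
    }
}
```

Because the grid is relative to the image size, the same picture at two resolutions tends to produce the same or a nearby fingerprint, which is the property that makes this kind of comparison useful for similar files. Handling rotation, audio or text would need a different fingerprint and is not covered by this sketch.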
As a bonus, DuMP3 marks files that cannot be read or decoded correctly as corrupt or as having a signature mismatch.
DuMP3 can find files that are not exact duplicates:
Some valid pictures are marked as corrupt because
Some very large pictures (around 4000x4000 pixels) can cause DuMP3 to run out of memory and crash.
Sometimes GIF files contain JPEG data or vice versa. This usually occurs in images downloaded from websites. These pictures will be marked as having a signature mismatch, but fingerprinting will still be attempted.
DuMP3 is subject to some Java limitations as well as some limitations in the libraries I have chosen. The F.A.Q. covers most of them.
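The signature mismatch mentioned above (a GIF file that actually contains JPEG data, or vice versa) can be detected by comparing the file extension with the first bytes of the file. The following is a minimal sketch of that check, not DuMP3's own code; the class name `SignatureCheck` and its method names are hypothetical. It relies only on the well-known file signatures: JPEG data starts with the bytes FF D8 FF, and GIF data starts with the ASCII text "GIF87a" or "GIF89a".

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical illustration only: NOT taken from the DuMP3 source code.
class SignatureCheck {

    // Guess the format from the leading "magic" bytes of the file content.
    static String sniff(byte[] head) {
        if (head.length >= 3
                && (head[0] & 0xFF) == 0xFF
                && (head[1] & 0xFF) == 0xD8
                && (head[2] & 0xFF) == 0xFF) {
            return "jpeg";
        }
        if (head.length >= 6) {
            String magic = new String(head, 0, 6, StandardCharsets.US_ASCII);
            if (magic.equals("GIF87a") || magic.equals("GIF89a")) {
                return "gif";
            }
        }
        return "unknown";
    }

    // Guess the format the file claims to be from its extension.
    static String fromExtension(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".jpg") || lower.endsWith(".jpeg")) {
            return "jpeg";
        }
        if (lower.endsWith(".gif")) {
            return "gif";
        }
        return "unknown";
    }

    // True when both guesses are known and they disagree.
    static boolean mismatch(String fileName, byte[] head) {
        String declared = fromExtension(fileName);
        String actual = sniff(head);
        return !declared.equals("unknown")
                && !actual.equals("unknown")
                && !declared.equals(actual);
    }
}
```

A file flagged this way is not necessarily unreadable: as noted above, fingerprinting is still attempted, so a check like this only needs to report the mismatch and then hand the data to the decoder that matches the sniffed format.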