Project Description
Utility for identifying near-duplicate audio and image files.
A small utility that helps identify picture and music files that are nearly identical but not byte-for-byte copies of each other.
For audio files (MP3s), it calculates a hash of the audio content and ignores the contents of the ID3 tags, so duplicates can be identified even when a song's title has been changed.
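The tag-independent audio hash described above can be sketched roughly as follows. This is a minimal Python illustration, not the utility's own code: it assumes the tags are a leading ID3v2 block and a trailing 128-byte ID3v1 block, and it uses SHA-256 as a stand-in because the hash algorithm is not stated here. The names `strip_id3` and `audio_hash` are hypothetical.

```python
import hashlib


def strip_id3(data: bytes) -> bytes:
    """Return the file bytes without a leading ID3v2 tag or trailing ID3v1 tag."""
    start, end = 0, len(data)

    # ID3v2 header: "ID3", 2 version bytes, 1 flags byte, 4-byte synchsafe size.
    if len(data) >= 10 and data[:3] == b"ID3":
        size = 0
        for b in data[6:10]:
            size = (size << 7) | (b & 0x7F)  # only 7 bits of each byte are used
        start = 10 + size
        if data[5] & 0x10:  # footer flag: a 10-byte footer follows the tag
            start += 10

    # ID3v1 tag: the final 128 bytes, starting with "TAG".
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128

    return data[start:end]


def audio_hash(data: bytes) -> str:
    """Hash only the audio payload (SHA-256 is an assumption, see above)."""
    return hashlib.sha256(strip_id3(data)).hexdigest()
```

Two files whose `audio_hash` values match carry the same audio stream even if their titles or other tag fields differ. In practice the input would come from something like `open(path, "rb").read()`.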
For image files, it calculates a full content hash as well as a perceptual hash (currently limited to pHash) that can be used to identify similar, but not identical, files.
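To illustrate the two image hashes, here is a hedged Python sketch. It is not the utility's implementation: it shows one common DCT-based pHash variant (32x32 grayscale input, 8x8 low-frequency block, median threshold), and those sizes, the SHA-256 content hash, and the names `exact_hash`, `phash`, and `hamming` are all assumptions. Decoding and downscaling a real image to a 32x32 grayscale matrix is left out.

```python
import hashlib
import math
from statistics import median

K = 8  # side of the low-frequency block kept, giving a 64-bit hash (assumed)


def exact_hash(data: bytes) -> str:
    """Full content hash: matches only byte-for-byte identical files."""
    return hashlib.sha256(data).hexdigest()


def dct_1d(v):
    """Unnormalized DCT-II of a sequence of numbers."""
    n = len(v)
    return [
        sum(v[x] * math.cos(math.pi * (2 * x + 1) * u / (2 * n)) for x in range(n))
        for u in range(n)
    ]


def phash(gray) -> int:
    """Perceptual hash of a square grayscale matrix (list of rows)."""
    rows = [dct_1d(r) for r in gray]        # DCT along each row
    cols = [dct_1d(c) for c in zip(*rows)]  # then along each column
    # Keep only the K x K lowest-frequency coefficients.
    low = [cols[u][v] for v in range(K) for u in range(K)]
    med = median(low)
    bits = 0
    for c in low:
        bits = (bits << 1) | (c > med)      # one bit per coefficient
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")
```

Identical `exact_hash` values mean the files are the same bytes. A small `hamming` distance between two `phash` values suggests the images look alike even though their bytes differ; the cutoff for "small" is a tuning choice.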

The Yeti library recently included here comes from http://www.codeproject.com/Articles/6826/C-Windows-Media-Format-SDK-Translation.

Last edited Jan 26, 2013 at 7:41 PM by Lonchik, version 4