Data Storage OS X Ubuntu Windows

Ask Slashdot: What's a Good Tool To Detect Corrupted Files?

Volanin writes "Currently I use a triple boot system on my Macbook, including MacOS Lion, Windows 7, and Ubuntu Precise (on which I spend the great majority of my time). To share files between these systems, I have created a huge HFS+ home partition (the MacOS native format, which can also be read in Linux, and in Windows with Paragon HFS). But last week, while working on Ubuntu, my battery ran out and the computer suddenly powered off. When I powered it on again, the filesystem integrity was OK (after a scandisk by MacOS), but a lot of my files' contents were silently corrupted (and my last backup was from August...). Mostly, these files are JPGs, MP3s, and MPG/MOV videos, with a few PDFs scattered around. I want to get rid of the corrupted files, since they waste space uselessly, but the only way I have to check for corruption is opening them up one by one. Is there a good set of tools to verify the integrity by filetype, so I can detect (and delete) my bad files?"
  • by denis-The-menace ( 471988 ) on Monday May 07, 2012 @03:27PM (#39918567)

    2000-2001 MAF-Soft http://www.maf-soft.de/ [maf-soft.de]
    The version I have is v1.0.3.102

    It can scan single MP3s and entire folder structures for defects, and logs everything if you wish. It will give you a percentage of how good the file is.

    Depending on the damage, you may be able to fix headers and chop off corrupted tag info with something like MP3Pro Trim v1.80.exe.

  • For JPEGs (Score:5, Informative)

    by Jethro ( 14165 ) on Monday May 07, 2012 @03:30PM (#39918603) Homepage

    You can run jpeginfo -c. I have a script that runs against a directory and makes a list for when I do data recovery for all my friends who don't listen when I tell them their 10 year old laptop may be dying soon.

  • by Volanin ( 935080 ) on Monday May 07, 2012 @03:34PM (#39918661)

    Author here:

    > Last backup August.
    Yes, that was silly of me.

    > Thinks there is a way to detect generic file corruption
    There is no way to detect generic file corruption. But there is a way to detect corruption in specific file types. For example, I already found mp3val, which can scan all my MP3s and check file integrity, and even fix a few kinds of corruption (such as mismatched bytes in the header and sound chunks). Maybe with the right set of tools, I might also detect (or even fix) my corrupted pictures, movies and books as well.

  • by Anonymous Coward on Monday May 07, 2012 @03:36PM (#39918677)

    Tech Tool Pro, over on the Mac side, has a "File Structures" check which looks at a lot of different structured file types to make sure that their internal format is valid.

  • by Bonteaux-le-Kun ( 1360207 ) on Monday May 07, 2012 @03:40PM (#39918739)
    You can just run mencoder or ffmpeg on all the MP3 and MOV files (with a small shell script, probably involving 'find' or similar). Just tell it to write the output to /dev/null; that should go through the files as fast as they can be read from disk and abort with an error on those that are broken. For the JPGs, you could try something similar with ImageMagick's 'convert': converting them to any other format with the output sent to /dev/null also has to read the whole file content, and aborts if a file is broken (one should hope). Those converters are really fast, especially ffmpeg, so that should complete in a reasonable time.
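
A minimal sketch of the decode-to-nowhere idea above, assuming ffmpeg is installed. The function names and the DECODE_CMD override are hypothetical, added only for illustration. As far as I know, ffmpeg does not always exit non-zero on a damaged stream, so this version also treats any error text as a failure.

```shell
# Decode one file and throw the result away; print only the decoder's error text.
# DECODE_CMD is a hypothetical hook for swapping in another decoder (e.g. a wrapper around mencoder).
decode_errors() {
    if [ -n "${DECODE_CMD:-}" ]; then
        "$DECODE_CMD" "$1" 2>&1
    else
        # -nostdin keeps ffmpeg from swallowing the file list piped into the loop below.
        ffmpeg -v error -nostdin -i "$1" -f null - 2>&1
    fi
}

# Print the name of every media file under directory $1 that fails to decode cleanly.
# File names containing newlines are not handled.
scan_media() {
    find "$1" -type f \( -iname '*.mp3' -o -iname '*.mov' -o -iname '*.mpg' \) |
    while IFS= read -r f; do
        if ! errors=$(decode_errors "$f") || [ -n "$errors" ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

Something like `scan_media ~/Music > suspects.txt` would then give a list to review by hand before deleting anything.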
  • by quarkscat ( 697644 ) on Monday May 07, 2012 @03:43PM (#39918767)

    Not the BSOD.
    If the OP had used open source "tripwire" on known-good files in each filesystem on his Macbook, and saved the resultant data output to a USB thumbdrive formatted with FAT32, the OP would have had a good chance of determining all corrupted files. In this case, an ounce of prevention would have prevented several pounds of "cure".

    Check out http://tripwire.org./ [tripwire.org.]
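
The baseline idea above does not strictly need tripwire. A rough sketch with plain checksums, assuming either GNU `sha256sum` or the Perl `shasum` tool is available; the function names are made up for illustration, and this is a stand-in for tripwire, not how tripwire itself works.

```shell
# Use whichever SHA-256 tool is installed (sha256sum on most Linux systems, shasum on OS X).
sha256() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$@"
    else
        shasum -a 256 "$@"
    fi
}

# Record a checksum for every regular file under directory $1 in manifest file $2.
# Keep the manifest outside the scanned tree (e.g. on a USB stick), made while the files are known good.
make_manifest() {
    find "$1" -type f | while IFS= read -r f; do sha256 "$f"; done > "$2"
}

# Print the files whose current contents no longer match manifest $1 (changed or missing).
# File names containing newlines or backslashes are not handled.
changed_files() {
    sha256 -c "$1" 2>/dev/null | grep -v ': OK$' | sed 's/: FAILED.*$//'
}
```

Run `make_manifest` after each backup and `changed_files` after a crash. Note that files edited on purpose since the manifest was made will also show up as changed.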

  • Re:file(1) (Score:4, Informative)

    by Volanin ( 935080 ) on Monday May 07, 2012 @03:44PM (#39918785)

    Author here:

    At first I thought this idea wouldn't work. As some people have already written here, the 'file' command sometimes just checks for a few bytes. But since it is so easy to implement, why not give it a try? And indeed, for videos it worked quite well. Some of the corrupted MOV files were detected simply as 'data file' or even 'MPEG sequence' and were promptly deleted! Thank you for the idea.
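
A sketch of how that check could be scripted, assuming a file(1) that supports `-b --mime-type`. The expected MIME strings and function names are assumptions; compare them with what your version prints for known-good files before trusting the result, since file(1) only looks at a few bytes and some valid files (MP3s without tags, for instance) may be reported as generic data.

```shell
# Map a file name to the MIME type we expect file(1) to report (assumed values).
# Prints nothing for extensions we do not check.
expected_type() {
    case "$1" in
        *.[jJ][pP][gG]|*.[jJ][pP][eE][gG]) echo image/jpeg ;;
        *.[mM][pP]3)                       echo audio/mpeg ;;
        *.[mM][oO][vV])                    echo video/quicktime ;;
        *.[mM][pP][gG])                    echo video/mpeg ;;
        *.[pP][dD][fF])                    echo application/pdf ;;
    esac
}

# Print "path: detected-type" for each file under $1 whose content does not match its extension.
suspicious_by_type() {
    find "$1" -type f | while IFS= read -r f; do
        want=$(expected_type "$f")
        [ -n "$want" ] || continue
        got=$(file -b --mime-type "$f")
        if [ "$got" != "$want" ]; then
            printf '%s: %s\n' "$f" "$got"
        fi
    done
}
```

This only produces a list of suspects. A file that passes can still be damaged further in, so it complements a full decode rather than replacing it.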

  • Re:md5sum (Score:3, Informative)

    by subtr4ct ( 962683 ) on Monday May 07, 2012 @03:46PM (#39918807)
    This type of approach is automated in a python script here [micropipes.com].
  • by ncw ( 59013 ) on Monday May 07, 2012 @04:18PM (#39919237) Homepage

    That is a good thought, and photorec does an excellent job of finding pictures and videos by searching through your sectors - definitely worth a try.

    http://www.cgsecurity.org/wiki/PhotoRec_Step_By_Step [cgsecurity.org]

  • by rrohbeck ( 944847 ) on Monday May 07, 2012 @04:28PM (#39919381)

    Very few filesystems keep checksums - only btrfs and zfs come to my mind.
    With defective hardware (RAM issues in main memory and disk or controller caches are fun) you can have silent corruption that goes on for a long time. Also bits on disks rot but those should give you a CRC or ECC error.

  • Re:right filesystem (Score:5, Informative)

    by d3vi1 ( 710592 ) on Monday May 07, 2012 @04:36PM (#39919481)

    Two aspects to your problem:

    1) Recovering from the current situation

    If you didn't make ANY changes to the filesystem after it was corrupted, you still have a chance with software like DiskWarrior or Stellar Phoenix. Never work on the original corrupted filesystem unless you have copies of it. So grab a second drive, connect it over USB, and copy the filesystem to it using hdiutil or dd. Once you do that, use DiskWarrior or Stellar Phoenix on one of the copies, while keeping the other one intact. Always have an intact copy of the original FS. You might be successful trying multiple methods, so KEEP AN INTACT COPY.

    2) Avoiding it in the future
    NTFS is good at surviving a crash if and only if the crash occurs in Windows. Paragon NTFS for Mac/Linux and NTFS-3G don't use journaling to its full extent (for both metadata and data). So, if you get a crash while in Mac OS X or Linux, chances are that you get data corruption.

    Same goes for HFS+. While Mac OS X uses journaling on HFS+, Linux doesn't. It's read-only in Linux if it has journaling. Furthermore, the journaling is metadata only in HFS+.

    Now we get to the last journaled filesystem available to all 3 OSs: EXT3. It's the same crap as above.

    Because of the three points above, I have a conclusion: what you're looking for (ZFS) hasn't been invented on any of the OSs that you're using.
    Thus, I have a simple recommendation:
    Use ZFS in a VMware machine exported via CIFS/WebDAV/NFS/AFP to Linux, Windows or Mac OS X. A small FreeNAS VM with 256MB of RAM can run in VMware Player and Workstation on Windows/Linux and Fusion on OS X.

    ZFS uses checksumming on the filesystem blocks, which lets you know of the silent corruptions. Furthermore, by design, it will be able to roll-back any incomplete filesystem transactions. I've had my arse saved by ZFS more times than I care to remember. The most difficult thing for my home storage system is to find external disk arrays that give me direct access to all the disks (not their RAID crap). A proper home storage system is RAIDZ2 (basically RAID6) + Hot Spare.

    Another way is to have a simple, TimeMachine-like backup solution on at least one of your operating systems. But even that doesn't catch silent data corruptions, let alone warn you. As such, we get back to: ZFS.
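
The copy-first advice in point 1 can be sketched with dd and cmp. This is only an illustration: the function name and the paths in the example are placeholders, and on a real system the source would be the unmounted partition's device node.

```shell
# Copy $1 to $2 in 1 MiB blocks, then confirm the copy is byte-for-byte identical.
# Returns non-zero if the copy fails or the two differ.
image_and_verify() {
    dd if="$1" of="$2" bs=1048576 2>/dev/null || return 1
    cmp -s "$1" "$2"
}

# Example with placeholder paths (reading a device node usually needs root):
# image_and_verify /dev/sdX2 /mnt/usb/home-partition.img && echo "image verified"
```

Double-check the source and destination before running anything like this, because dd will overwrite whatever `of=` points to without asking. Run recovery tools against the image or a second copy of it, never against the only original.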

  • Re:Your eyes (Score:5, Informative)

    by Score Whore ( 32328 ) on Monday May 07, 2012 @06:36PM (#39921079)

    Well, jpeg files have a structure that will generate detectable errors if it's damaged. So simply opening them with something as simple as djpeg from the IJG and piping the output to /dev/null should give you a pretty good start on damaged images. Something like this perhaps:

    find . -name "*jpg" -o -name "*jpeg" -o -name "*JPG" -o -name "*JPEG" | while IFS= read -r filename; do if djpeg "$filename" > /dev/null 2>&1; then :; else echo "$filename is toast"; fi; done

    You could probably do something similar with mpg123 and mplayer for .mp3 and movies.

  • Re:Your eyes (Score:5, Informative)

    by Zaiff Urgulbunger ( 591514 ) on Monday May 07, 2012 @09:07PM (#39922727)
    Might be better using the "identify" command of ImageMagick. The man page says:

    The identify program is a member of the ImageMagick(1) suite of tools. It describes the format and characteristics of one or more image files. It also reports if an image is incomplete or corrupt.
