Ask Slashdot: What's a Good Tool To Detect Corrupted Files?

Volanin writes "Currently I use a triple boot system on my Macbook, including MacOS Lion, Windows 7, and Ubuntu Precise (on which I spend the great majority of my time). To share files between these systems, I have created a huge HFS+ home partition (the MacOS native format, which can also be read in Linux, and in Windows with Paragon HFS). But last week, while working on Ubuntu, my battery ran out and the computer suddenly powered off. When I powered it on again, the filesystem integrity was OK (after a scandisk by MacOS), but a lot of my files' contents were silently corrupted (and my last backup was from August...). Mostly, these files are JPGs, MP3s, and MPG/MOV videos, with a few PDFs scattered around. I want to get rid of the corrupted files, since they waste space uselessly, but the only way I have to check for corruption is opening them up one by one. Is there a good set of tools to verify the integrity by filetype, so I can detect (and delete) my bad files?"
This discussion has been archived. No new comments can be posted.


  • by Anonymous Coward on Monday May 07, 2012 @03:22PM (#39918497)

    Try running "file" from a command line on a few files you know to be corrupt. If the file command tells you the same, you could run a quick bash script to loop through the files and spit out the names of the bad ones. This is all assuming you know what you are doing with shell scripting.
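    The loop the commenter describes might look like the following minimal sketch. It assumes a `file` implementation that supports `-b --mime-type` (GNU file does; the extension list and the SUSPECT label are just examples, not part of any standard tool):

    ```shell
    #!/bin/sh
    # Flag files whose detected MIME type disagrees with their extension.
    # A truncated or zeroed-out JPG often shows up as "data" or text
    # instead of image/jpeg. This only catches damage near the header.
    find . -type f \( -name '*.jpg' -o -name '*.mp3' \) -print |
    while IFS= read -r f; do
        mime=$(file -b --mime-type "$f")
        case "$f:$mime" in
            *.jpg:image/jpeg|*.mp3:audio/mpeg) ;;  # type matches extension
            *) printf 'SUSPECT: %s (%s)\n' "$f" "$mime" ;;
        esac
    done
    ```

    Note the caveat raised further down the thread: this only catches corruption that lands in the header bytes, since `file` only inspects the start of each file.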

  • No easy answer (Score:2, Insightful)

    by gstrickler ( 920733 ) on Monday May 07, 2012 @03:26PM (#39918557)

    1. Compare to backup, files that match are ok.
    2. AppleScript option others mentioned may help reduce it further.
    3. Backup regularly, and verify your backup procedure.
    4. Anything else will cost you consulting rates.

  • Re:AppleScript (Score:4, Insightful)

    by dgatwood ( 11270 ) on Monday May 07, 2012 @03:31PM (#39918613) Homepage Journal

    But the open usually won't fail. Unless the error is within the header bytes of a movie or image, the media will open, but it will appear wrong. Worse, there is no reliable way to detect this corruption, because media file formats generally do not contain any sort of checksum. At best, you could write a script that looks for truncation (not enough bytes to complete a full macroblock), or a tool that computes the difference between adjacent pixels across macroblock boundaries and flags any picture with an obvious high-energy transition at the boundary. Even that cannot tell you whether the image is corrupt or simply compressed at a low quality setting with lots of blocking artifacts.

    The short answer, however, is "no". Such corruption can't usually be detected programmatically.

  • by Anonymous Coward on Monday May 07, 2012 @03:40PM (#39918727)

    Consider the possibility that the backup already contains corrupted files. I once had defective RAM where only one bit flipped occasionally. The machine was quite stable, so the defect went undetected and over a couple of months it silently corrupted hundreds of files. Unless he finds out what caused the crash, he can't be sure that the backup is alright.

  • by ncw ( 59013 ) on Monday May 07, 2012 @03:42PM (#39918753) Homepage

    I'd be asking myself why lots of files became corrupted from one dodgy file system event. Assuming HFS works like file systems I'm more familiar with, it will allocate sequential blocks for files wherever it can. This means that a random filesystem splat is really unlikely to corrupt loads and loads of files. You might expect a file system corruption to cause a load of files to go missing (if a directory entry is corrupted) or corrupt a few files, but not put random errors into loads of files.

    I'd check to see whether files I was writing now get corrupted too. It might be dodgy disk or RAM in your computer.

    The above might be complete paranoia, but I'm a paranoid person when it comes to my data, and silent corruption is the absolute worst form of corruption.

    For next time, store MD5SUM files so you can see what gets corrupted and what doesn't (that is what I do for my digital picture and video archive).
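    For reference, the checksum-archive habit the commenter describes can be as simple as the sketch below. It assumes GNU coreutils `md5sum` (on Mac OS X, `md5` or `shasum` would need a different invocation), and `~/Pictures` is only an example path:

    ```shell
    # Record a checksum for every file once, while they are known-good.
    find ~/Pictures -type f -print0 | xargs -0 md5sum > ~/pictures.md5

    # After a crash, disk swap, or transfer: re-verify everything.
    # --quiet prints nothing for files that still match, so any output
    # (or a nonzero exit status) means something changed.
    md5sum --check --quiet ~/pictures.md5
    ```

    Unlike guessing from file headers, this catches a single flipped bit anywhere in a file, which is exactly the silent-corruption case described in this thread.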

  • by Calos ( 2281322 ) on Monday May 07, 2012 @03:43PM (#39918761)

    Well...

    My first suspicion would be that the filesystem is messed up, not the actual files. Unless s/he had a lot of pending writes to all of these files, there is no reason that something should have actually overwritten or garbled them when the power shut down. Much more likely was an impending or in-progress write to the filesystem's tables, which has affected where it thinks all the files' pieces are stored. And if that is the case, date modified and size may be irrelevant because those are going to be reported by the filesystem.

    Aside from trying to read the data back sector by sector and reassemble the files, however, I don't know that there's a remedy.

  • by Anonymous Coward on Monday May 07, 2012 @03:56PM (#39918959)
    Let me ask a stupid question since I've never run a battery out on a machine running Ubuntu. Why did this happen? Running OSX or Windows, the machine would have hibernated safely before the battery ran out. Does Ubuntu not do this and it just dies? Or is this something you configured to act this way? If it is default behavior in Ubuntu it is something they ought to fix.
