Best Format For OS X and Linux HDD?
dogmatixpsych writes "I work in a neuroimaging laboratory. We mainly use OS X but we have computers running Linux and we have colleagues using Linux. Some of the work we do with Magnetic Resonance Images produces files that are upwards of 80GB. Due to HIPAA constraints, IT differences between departments, and the size of files we create, storage on local and portable media is the best option for transporting images between laboratories. What disk file system do Slashdot readers recommend for our external HDDs so that we can readily read and write to them using OS X and Linux? My default is to use HFS+ without journaling but I'm looking to see if there are better suggestions that are reliable, fast, and allow read/write access in OS X and Linux."
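For what it's worth, the HFS+-without-journaling route looks something like this. A hedged sketch only: the device paths and volume name below are placeholders, so check `diskutil list` (OS X) and `lsblk` (Linux) against your actual hardware first.

```shell
# On OS X: wipe the external as journaled HFS+, then turn journaling
# off so Linux can mount it read/write. /dev/disk2 and the volume
# name MRIDATA are placeholders.
diskutil eraseDisk JHFS+ MRIDATA /dev/disk2
diskutil disableJournal /Volumes/MRIDATA

# On Linux: the in-kernel hfsplus driver mounts non-journaled
# volumes read/write (the hfsprogs package supplies fsck.hfsplus
# and mkfs.hfsplus if you ever need to repair or reformat there).
sudo mount -t hfsplus /dev/sdb1 /mnt/mridata
```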
HIPAA Constraints? (Score:5, Interesting)
By "HIPAA Constraints" I assume you mean the Privacy Rule. I would think that rule would prevent you from using sneakernet to transmit files, unless you're encrypting your portable disks, and it doesn't sound like you are.
Fun reading:
http://www.computerworld.com/s/article/9141172/Health_Net_says_1.5M_medical_records_lost_in_data_breach [computerworld.com]
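If it helps, a bare-bones way to encrypt each file before it leaves the building is an openssl round trip. This is a sketch with invented file names and a placeholder passphrase; the `-pbkdf2` flag needs OpenSSL 1.1.1 or newer, and for whole-disk protection a LUKS volume (Linux) or encrypted .dmg (OS X) would be the more thorough option.

```shell
# Stand-in for a real image file (name invented for illustration):
printf 'fake MRI data' > scan_0001.nii

# Encrypt with a passphrase before the file travels on a portable disk:
openssl enc -aes-256-cbc -pbkdf2 -salt \
    -in scan_0001.nii -out scan_0001.nii.enc -pass pass:changeme

# On the receiving side, decrypt with the same passphrase and verify:
openssl enc -d -aes-256-cbc -pbkdf2 \
    -in scan_0001.nii.enc -out scan_0001.decrypted -pass pass:changeme
cmp scan_0001.nii scan_0001.decrypted && echo "round trip OK"
```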
NTFS (Score:4, Interesting)
There is NTFS-3G for Linux and Mac OS X [sourceforge.net]
There is also an EXT2 Fuse FS (for Mac OS), and probably many other options.
Having said that, I have never had a problem with Linux's HFS+ write support.
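For reference, the two mounts look something like this on Linux. Device and mount point are placeholders; note that the kernel hfsplus driver will only mount a volume read/write if journaling is off.

```shell
# NTFS read/write on Linux via the ntfs-3g FUSE driver:
sudo mount -t ntfs-3g /dev/sdb1 /mnt/portable

# HFS+ read/write on Linux -- works only for non-journaled volumes;
# journaled HFS+ falls back to read-only:
sudo mount -t hfsplus -o rw /dev/sdb1 /mnt/portable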
NFS over SSH (Score:3, Interesting)
Just tunnel NFS over SSH. I can't imagine sneakernetting files around the office ever being secure. If you need to encrypt the data at rest, then either encrypt on the client or leverage an encrypted filesystem or a Decru-type appliance.
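Roughly, an NFSv4-over-SSH tunnel is just a port forward, since v4 runs over a single TCP port (2049). Sketch only: the hostname, export path, and local port 3049 below are invented.

```shell
# Forward a local port to the server's NFS port over SSH
# (-f: background, -N: no remote command):
ssh -f -N -L 3049:localhost:2049 user@fileserver.example.org

# Mount against the local end of the tunnel:
sudo mount -t nfs4 -o port=3049 localhost:/export/mri /mnt/mri
```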
Re:HIPAA Constraints? (Score:3, Interesting)
Maybe instead of using a portable disk, they could whip up a nettop running Linux and transfer files over the gigabit ethernet...
Then they could do transfers via Samba or rsync+ssh, and the nettop could transparently take care of encrypting the underlying FS, whatever that may be.
Performance wouldn't be great... maybe 20MB/s instead of 60MB/s for an eSATA drive, and they'd have to work out a consistent network port / IP across all the sites it travels to. But it might confer some advantages.
Along similar lines, they could put in a small file server at each site, and rotate a removable disk drive between all of the file servers. That way they'd just have a drop box that they could always push files to throughout the day, and let the couriers just grab what's there and deliver.
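A push to a drop box like that is one rsync line. The host and paths here are invented for illustration:

```shell
# Push the day's processed files to the drop-box machine over SSH;
# -a preserves permissions/timestamps, --partial lets interrupted
# transfers of huge files resume instead of restarting:
rsync -av --partial --progress /data/processed/ \
    lab@dropbox-nettop.example.org:/srv/incoming/
```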
Re:HIPAA Constraints? (Score:3, Interesting)
Anyway, we only rarely will need to use the sneakernet but need the option. HIPAA is only a minor issue. The biggest issue comes in dealing with multiple IT departments and setting up network access to our materials. Plus our images are so large that for these processed files (not the originals) we are opting for local storage instead of storage managed by our IT staff (who are wonderful but not cheap; we just purchased 4TB of local storage for 1/4 the cost of 1TB from IT).
Re:HIPAA Constraints? (Score:3, Interesting)
Yeah, then it sounds like you're pretty much doing the best you can under the circumstances... I was just trying to think out of the box a bit and turn your filesystem compatibility problem into a file server compatibility problem, since cross-platform compatibility is a much bigger deal in the latter scenario.
One last consideration you might want to benchmark is storing your data in an archive or image file, like a zip or tgz or, more likely, a dmg... that way you could probably get transparent compression as well, which might work well on your datasets. If your disks have limited performance over USB or whatever and you have beefy CPUs, that may even increase your overall throughput when making copies to/from fast local storage. Plus, some of the archive formats may help preserve filesystem features/metadata that would otherwise get clobbered by copying through a compatible intermediary FS.
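The tgz variant of that idea is trivially scriptable; directory and file names below are invented, and on the Mac side an encrypted .dmg built with hdiutil would serve the same purpose:

```shell
# Stand-in dataset (names invented for illustration):
mkdir -p subject_042
printf 'fake volume data' > subject_042/run1.nii

# Compress once into a single archive that any Unix can read,
# preserving ownership/permissions inside the tarball:
tar czf subject_042.tgz subject_042

# Verify the listing before the disk goes out the door:
tar tzf subject_042.tgz
```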
Been There, Fixed That (Score:4, Interesting)
We had almost exactly the same problem. Our fMRI work was done at the University of Virginia on a Linux machine. Naturally you don't want to tie up a $1500/hour data collection machine doing analysis, so our data was transferred immediately to a multiboot machine at the Neurological Institute. No patient data was included at this point, so no HIPAA problems. The receiving box ran Linux initially, since the analysis programs from NIH (primarily AFNI) were Linux-based. Patient data got added here, so HIPAA became an issue.

The machine had multiple hard drive bays, all of them removable, plug-and-play drives made from a kit that provided slide-in rails and a locking mechanism; otherwise they were common commercial drives. Externals would have been easier, but the guy who devised this had a rilly rilly good reason. I remember it was good, but not what it was.

Anyway, the machine could boot other OSes, prep the drives, then go back to the native Linux and transfer/translate the data. Once transferred, the drive was removed, packaged, and FedEx'd to the other analysis sites at Virginia Tech, NIH, and U.Va Wise. We were strictly experimental, no direct medical treatment, so time was not an issue.

With OS X being *nix, there aren't a lot of reasons to go with one over the other except for convenience, depending on what your data collection and analysis are running under. Unless yours run fine under OS X, I'd say stick with HFS+, and of course moderate that according to whether you have to share out the data and what those people are running. I wouldn't bother with supporting Windows, as it continually finds new problems to have with large files. One comparison test showed no difference in analysis results, but they did have problems with Windows choking on the data files, and their test files were only 1.5 GB. Ref: J Med Dent Sci. 2004 Sep;51(3):147-54. Comparison of fMRI data analysis by SPM99 on different operating systems. PMID: 15597820. My experience agreed with their results.
As I said, we had little call for Macs, so we didn't run enough of that to give a good test of whether it had the same kind of problems. Bottom line: we used what we needed according to where the data was going and what they needed it to be, but for our own use it made no sense to transfer it out of the filesystem that collection and analysis used, HFS+. The system met with the approval of the biophysicist we worked with at U.Va, and he had been a grad student under Peter Fox when the latter developed SPM. OH YEAH: the good reason. If anyone else wanted to work with us, they didn't have to dig too deeply into techie stuff, either hardware or software. We could send them a removable-drive kit to install, and send them a drive with bootable Linux, AFNI, and data, all plug and play. If that might be useful to you (using externals instead of removables doesn't matter here), that's probably another vote for HFS+.
Re:ext2 works. ntfs works. (Score:1, Interesting)
"Works for me" != "works for everyone". Especially if your usage patterns don't happen to trigger the bugs...
Also, you do not need a license to use NTFS. ntfs-3g is a free, open-source implementation based on reverse engineering. It's been around for a long time and Microsoft hasn't shut it down. As far as I know, Apple's NTFS driver is also based on reverse engineering. Writing to NTFS is just notoriously hard to implement; it's a complicated file system.