Cloud Data Storage Open Source Privacy Your Rights Online

Ask Slashdot: Building a Personal FOSS Cloud? 189

An anonymous reader writes "Cloud-based personal data management is pretty cool... if you don't mind entrusting the entirety of your personal data to a gigantic corporation. Apart from the risk of their doing unseemly things with your data, its security is also entirely in their unreliable hands. So, is it possible to build my own personal data repository where, for example, I can store my contacts and calendars to sync to multiple devices? This could be hosted on any third-party hosting service, assuming that all of my data was encrypted at the data level. So even if the host wanted to look at my data, all they'd see is 1s and 0s. What are the options for the tinfoil-hat-wearing FOSS folks that want to participate in the cloud age?"
This discussion has been archived. No new comments can be posted.

  • Thanks for sharing (Score:5, Insightful)

    by Anonymous Coward on Saturday July 14, 2012 @01:46AM (#40646517)

    So even if the host wanted to look at my data, all they'd see is 1s and 0s.

    That was the dumbest thing I read all day.

  • by BagOBones ( 574735 ) on Saturday July 14, 2012 @01:50AM (#40646533)

    http://owncloud.org/ [owncloud.org]

    - Calendar
    - Contacts
    - dropbox like storage

    • by SomePgmr ( 2021234 ) on Saturday July 14, 2012 @02:09AM (#40646595) Homepage

      Or use any of the usual storage services that provide a client to maintain a sync'd mount point, and just secure the contents. Jungledisk will do this for you for Amazon or Rackspace backed storage. Google Drive, Dropbox, etc. can be used with your own encryption mechanism.

      For bonus redundancy, sync the local cache to an external USB drive so you don't get caught with your pants down if one of those services botches your remote store.

    • It looks like this could work well with a Synology [synology.com] configured for disk redundancy, plus a home router with VPN (dd-wrt [dd-wrt.com] or tomato [tomatousb.org]).
    • Re: (Score:2, Informative)

      by Bob9113 ( 14996 )

      http://owncloud.org/

      It's pretty cool, but right on the first page it pulls code from googleapis.com. Hit the front page and you send a request with the referrer URL to one of the biggest stalkers. Maybe it's still good, maybe it's not hard to redirect that js link to your own machine, but it just seems like they've missed the fundamental point of not giving your data away.

      • by jcreus ( 2547928 ) on Saturday July 14, 2012 @03:22AM (#40646809)
        It's open source! You either: a) send them a bug report, or b) download it, and change the code to whatever you want.
      • by Anonymous Coward on Saturday July 14, 2012 @03:35AM (#40646835)

        The actual ownCloud application that you setup on your server doesn't have a reference to googleapis. I just checked on my installation.

        For those wondering, the project website links to the jQuery library hosted on Google's server so they don't have to host it themselves.

      • by datajack ( 17285 )

        That's on their site. The one where you download the software from. The point of his question was how to store data on your own site.

        Download and install owncloud, and there's no sign of googleapis.

  • cloud vs server (Score:5, Interesting)

    by O('_')O_Bush ( 1162487 ) on Saturday July 14, 2012 @01:52AM (#40646545)
    At what point does this involve a cloud? Renting a server(providing ftp, for example) is easy, and doesn't require anything from the "cloud age".

    Also, building a server or buying one secondhand is cheap, if you want to DIY.
    • Re:cloud vs server (Score:5, Interesting)

      by ganjadude ( 952775 ) on Saturday July 14, 2012 @02:12AM (#40646607) Homepage
      I was wondering the same. When I first read the headline, I had visions of friends setting up partitions on each other's hard drives and doing cloud storage between mom, dad, sister, brother, grandparents, and friends.

      Redundancy for family photos, for instance, on all the family computers. Obviously private storage as well. All of the computers going down at once in multiple locations is highly unlikely.
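A rough back-of-the-envelope check on that claim, as a sketch only: if each machine independently has probability p of losing its copy in a given year, the chance of losing every copy is p to the power of the number of copies. The 5% figure below and the independence assumption are purely illustrative, not measured; correlated failures (the same malware, the same house) would make the real number worse.

```python
def prob_all_copies_lost(p_single: float, copies: int) -> float:
    """Probability that every copy is lost, assuming independent failures."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be a probability between 0 and 1")
    if copies < 1:
        raise ValueError("need at least one copy")
    return p_single ** copies


if __name__ == "__main__":
    # Hypothetical 5% yearly loss rate per machine, copies on 1 to 5 family PCs.
    for n in range(1, 6):
        print(n, prob_all_copies_lost(0.05, n))
```

Each extra household cuts the combined risk by another factor of p, which is why a handful of cheap, geographically separate copies can compete with one carefully maintained server.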
      • You might want to try tahoe-lafs [tahoe-lafs.org] if you want to share stuff with a fair number of people without giving them default access to the content.

    • I've actually been confused for a while now how the "cloud" is different from just having servers on the internet.

      Is it just the synchronization that makes it the "cloud"?

      • Re:cloud vs server (Score:5, Informative)

        by AuMatar ( 183847 ) on Saturday July 14, 2012 @06:07AM (#40647157)

        Servers are web 1.0. Cloud is web 3.0. Much buzzier and hipper.

      • Abstraction and provisioning. With the cloud, you don't need to worry about where the server really is physically and can alter the configuration at very short notice (It'll be a virtual machine). The 'cloud' term comes from the network diagram use of a cloud to represent internet connectivity: The server is out there on the internet, somewhere, and you don't need to care where. The cloud service operator handles that. So they can juggle workloads around for peak efficiency and thus minimum cost, or let you
        • I'm still not really seeing the difference. I rent a VPS that I rsync stuff to. I don't care where it is physically, it has a domain name and I can reach it wherever it may be, even if it gets relocated somewhere.

          • Then you are just using the cloud, from before someone started calling it 'the cloud.' As I said, the cloud isn't a technology: It's a business model based on not just selling virtual servers but managing them on large resource pools too.
  • by Anonymous Coward on Saturday July 14, 2012 @01:52AM (#40646547)

    You can write "The Cloud" on it with a Sharpie if you absolutely must.

  • We're working on it (Score:5, Informative)

    by wurp ( 51446 ) on Saturday July 14, 2012 @01:55AM (#40646555) Homepage

    https://github.com/wurp/Friendly-Backup [github.com]

    It works now, with some bugs. The first targeted usecase is distributed backup.

    However, it can store arbitrary read-only content-addressed data as well as signed labels that point to a particular piece of CBA data to emulate mutable data.

    I have a whole slew of plans beyond backup for it, but backup seemed like the thing everyone needs and would most like to have for free on a federated data store.
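For readers unfamiliar with the idea, here is a minimal sketch of content-addressed storage with signed labels. It is not Friendly-Backup's code or design: the class name, the in-memory dictionaries, and the use of an HMAC as a stand-in for a real public-key signature are all assumptions made for illustration.

```python
import hashlib
import hmac


class ContentStore:
    """Toy in-memory content-addressed store with signed mutable labels."""

    def __init__(self, label_key: bytes) -> None:
        self._blobs = {}   # address (hex digest) -> immutable bytes
        self._labels = {}  # label name -> (address, signature tag)
        self._label_key = label_key

    def put(self, data: bytes) -> str:
        # The address is derived from the content, so a blob can never change.
        address = hashlib.sha256(data).hexdigest()
        self._blobs[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blobs[address]
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored blob does not match its address")
        return data

    def _sign(self, name: str, address: str) -> str:
        # HMAC stands in for a signature here; a real system would likely
        # use public-key signatures so that others can verify labels.
        message = f"{name}\0{address}".encode()
        return hmac.new(self._label_key, message, hashlib.sha256).hexdigest()

    def set_label(self, name: str, address: str) -> None:
        # "Mutating" data means re-pointing the label at a new blob.
        self._labels[name] = (address, self._sign(name, address))

    def resolve(self, name: str) -> bytes:
        address, tag = self._labels[name]
        if not hmac.compare_digest(tag, self._sign(name, address)):
            raise ValueError("label signature check failed")
        return self.get(address)
```

Storing a new version creates a new blob and moves the label; the old blob stays retrievable by its address, which is what makes this layout attractive for backups.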

    • It was my understanding that the original meaning of "the cloud" was file storage using distributed computing. And the whole concept was bastardized when corporations caught on to the popularity of the word and started applying it to offsite storage in a single location. Seriously, the real cloud would be some modified version of BitTorrent, in which lots of people have encrypted pieces of your files and you can access the whole file as long as you are connected to enough of these people to get every piece.
    • :(

      I wish Linus would take a few weeks off to write a distributed backup system, but he just uses public FTP servers...

      Of course, there're several projects that use git as a backend, like http://www.kickstarter.com/projects/joeyh/git-annex-assistant-like-dropbox-but-with-your-own [kickstarter.com] (already funded; he's also a Debian Developer).

      Since git isn't a backup system, using it as one isn't as efficient as it could be, but it is powerful. Joey's project is an exciting potential Dropbox replacement. He knows what he's

  • I don't get it. (Score:5, Insightful)

    by Nyder ( 754090 ) on Saturday July 14, 2012 @02:00AM (#40646565) Journal

    OMFG, the cloud. I got to have or do the cloud. Magic Ponies in the cloud!!!!

    Seriously, wtf do you really need the cloud for? Is it going to magically sync all your different data together so you can access it all the time?

    No, seriously, do you think it's going to sync all your data so you can use it and access it anywhere?

    No, it's not. Sure, you can access your data anywhere, but duder, we've been doing that for a couple of decades now, way to join the late train.
    Unfortunately, the various corporations don't want to agree to standards, so having docs/apps/whatever working with everything isn't in the "rape as much money as we can" business plan. So nothing is going to change.

    Now let's look at the Megaupload thingy. That was cloud storage, file lockers. It's not around now, is it? That is what happens to clouds, the winds blow them away. The wind? Oh ya, in this case, that's the good old USA Government, working for their Pimps, the Music/Movie Industry. You think that can't happen to any "cloud" servers? Think again. OMG, Terrorist used that server, Child porn was on that server, boom! Your data, which has nothing to do with those 2 things, is gone also. Hope you made a backup. Oh, wait, the cloud was magically supposed to back it up for you?

    Cloud has been around for awhile, but we called it what it was, the internet.

     

    • by thegarbz ( 1787294 ) on Saturday July 14, 2012 @02:33AM (#40646649)

      You want the above? That's easy. Access to email from anywhere, access to my contacts and my calendar, how about access to all my files? Yep got that. Though it doesn't have a fancy name like "cloud". If I were into marketing I'd call it a cloud, but right now I'll stick to calling it an "internet facing linux machine"

      Yeah it's not as exciting, but it does everything the so called cloud has done and it has done it for many years before this mythical cloud has existed. My phone sees the same set of files and emails as my home desktop PC, and there's a web interface to access all the above too.

      Seriously just google "Linux Groupware" and maybe "Linux Web Fileserver" and you'll have everything that the cloud has.

      • It is a fun reaction when I show folks how my netbook storage expands from a measly 250GB to 6.2TB when I'm connected to the Internet.

        SSHFS has been around for years!

        • While I agree on the overuse of the cloud meme, things like sshfs still don't offer encryption of the data at the server, so that someone with physical access can view all the data; it only encrypts the transfer.

          Is there anything that can hold an encrypted data store that is only decrypted by authorized clients on a local basis? Ideally it would be something that gives different layers of access so you can navigate a directory tree only with authorization, and only download files you need.

          The only thing I

          • This is something I have been pondering... I haven't found it out in the wild, so I am tempted to build it myself. Perhaps we are looking for the same thing.

            What i essentially need is an encrypted disk image (crypto-loop based?) that can be read and written to without ever "mounting". When ssh authentication has passed, the SSH key is used as a "pass thru" into the disk image. But it is never mounted or exposed to the host OS, only mounted "virtually" to the userspace process that handles the ssh connection.

            I u

            • Isn't that basically FUSE? Not mounting is one thing, but FUSE mounts per-user, anyway.

              But if you don't trust the kernel to restrict permissions, how can you trust it for anything? You can't not expose something to the kernel--it's the kernel. So encfs may suit this use case. If you need it as a single file, you could archive the directory when it's not in use.

              Perhaps something like a GPG-encrypted tar.bz2 file that's decrypted to a tmpfs would work. But for a stream-like format, rather than requiring

              • Thank you for your insight. I will keep it in mind.

                My problem is that I do not trust root. If your data is on a KVM/XEN VPS somewhere (even with whole disk encryption), root on the host machine still has access if your VM is running. You simply don't know how well that host machine is secured. Someone gains root access to the host, everyone on there is screwed.

                Upon further thinking this through, it would seem (on the surface) I would have to build an SSH server and emulate the FS calls through to an archiva

    • Re:I don't get it. (Score:5, Insightful)

      by The Mighty Buzzard ( 878441 ) on Saturday July 14, 2012 @04:00AM (#40646899)

      Oh give the guy a break. This is exactly the situation the "the cloud" buzzword was created for: people who are scared of the phrase "file server". There is absolutely nothing new about "the cloud" in any way but it's a nice fluffy word that people are comfortable with and it's acceptable to not have any idea what it actually is. I'd change the hostname of my home server to thecloud just for wiseassery's sake if it wouldn't hose my Trek shipname naming scheme.

      • This is exactly the situation the "the cloud" buzzword was created for: people who are scared of the phrase "file server".

        No, they're two completely separate concepts, but you're right in that it seems to be acceptable for people to talk about "the cloud" without understanding what it actually refers to.
    • by nurb432 ( 527695 )

      Is it going to magically sync all your different data together so you can access it all the time?

      Mine does. But then again, i run my own services at home, and do a regular sync to an off-site data store for backups, that i also own. ( in another state, just in case )

      None are reliant on a 'free' storage provider like megaupload or some other such unpredictable system.

    • OMFG, the cloud. I got to have or do the cloud. Magic Ponies in the cloud!!!!

      I'm seriously tempted to post this on the wall at work.

  • by siddesu ( 698447 ) on Saturday July 14, 2012 @02:04AM (#40646577)
    What's in the cloud that is better?
    • A pretty n00b friendly web interface.

      • by siddesu ( 698447 )
        Better than the unlimited number of SSH clients available? I wish I could see this amazing cloud interface.
    • Re: (Score:2, Insightful)

      Hardware redundancy is the big one. Your server runs as an abstracted VM in a management framework (XenServer, VMWare, etc.) that allows it to be instantly migrated to another machine with no interruptions/downtime if there are problems with the physical hardware it's running on. If you'd been running a real server instead of a cloud-based VM, you'd be down until that server could be fixed.
  • Freedombox (Score:5, Informative)

    by Qubit ( 100461 ) on Saturday July 14, 2012 @02:10AM (#40646603) Homepage Journal

    slashdot ate my last comment, so just check out the link [debian.org]

    • Re: (Score:2, Informative)

      by Anonymous Coward

      For more context: http://archive.org/details/EbenMoglen-FreedomInTheCloud2010

    • Don't know why it's modded "Informative". The link posted has a lot about vision, and freedom but nothing about what functions the freedom box is meant to provide.
  • by swell ( 195815 ) <jabberwock@poetic.com> on Saturday July 14, 2012 @02:24AM (#40646629)

    the safest storage is your own high speed server quality RAID 7 write-only drive

  • SparkleShare (Score:5, Informative)

    by SpzToid ( 869795 ) on Saturday July 14, 2012 @02:49AM (#40646709)

    Try the free open-source SparkleShare software and roll your own cloud 100%. That would trump any cloud provider option if this is your concern, since all the disks and PCs are under your ownership and control.

    SparkleShare is essentially a DropBox clone in terms of a GUI, which extends to recovering older versions with a right-click. It looks like DropBox, and it works like DropBox too. But it is just a scripted GIT environment. In fact, if you already have a GIT repo hosted on a server (or service) somewhere, SparkleShare is easily configured to work with it. Here's how you start from scratch, assuming you already have SSH keys shared with the server:

    At the server, create a new, empty GIT repository:
    git init --bare NEWREPOSITORY.git
    At the workstation:

    Normally, you might use something like the following commands to work with GIT. (These are not necessary if you use SparkleShare.)

    git clone ssh://user@example.com:port/home/user/NEWREPOSITORY.git
    cd NEWREPOSITORY
    The SparkleShare config:

    Add Hosted Project...

    Address:

    ssh://user@example.com:port

    Remote Path: /home/user/NEWREPOSITORY.git
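To make the mapping between the clone URL and the two dialog fields explicit, here is a small sketch that splits an ssh:// URL into an "Address" and a "Remote Path". The function name and the port 2222 in the usage line are hypothetical; this only illustrates the split described above and is not part of SparkleShare.

```python
from urllib.parse import urlsplit


def sparkleshare_fields(clone_url: str) -> dict:
    """Split an ssh:// Git URL into the Address and Remote Path fields."""
    parts = urlsplit(clone_url)
    if parts.scheme != "ssh" or not parts.netloc or not parts.path:
        raise ValueError("expected a URL like ssh://user@host:port/path/to/repo.git")
    return {
        # Everything up to (not including) the path goes in "Address".
        "address": f"{parts.scheme}://{parts.netloc}",
        # The absolute path of the bare repository goes in "Remote Path".
        "remote_path": parts.path,
    }


if __name__ == "__main__":
    # Hypothetical host and port, matching the placeholders used above.
    print(sparkleshare_fields("ssh://user@example.com:2222/home/user/NEWREPOSITORY.git"))
```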

    This document explains how to add a layer of encryption (which also works to secure services like DropBox, btw): https://github.com/hbons/SparkleShare/wiki/Encrypting-your-files-before-transfer [github.com]

    • by kotku ( 249450 )

      Sparkleshare is nice but not ready for production. The second bug report in the issue tracker has the developers/users sharing their code by Dropbox, of all things. Ironic.

    • by devent ( 1627873 )

      Git is not designed to handle big binary data. Since Git is creating SHA hashes for each file, with a file of 500MB or more it will take more time, and it will also use up all the RAM to calculate the hash. In addition, the size of the repository will skyrocket if you put in revisions of big binary files, since you can't easily delete old files in a git repository.

      Git is good for text documents and source code. But since even the Odt documents are binary blobs (the xml data is compressed with zip), you can't use git e

      • by Bert64 ( 520050 )

        There is a variant of ODT which is a flat uncompressed XML file... That works well with git, also there is a plugin for libreoffice which saves your documents directly into a git (or subversion, cvs etc) repository (which i believe stores the data as dirs rather than zipfiles)...

      • What about using git-annex to store large binaries?

      • by jgrahn ( 181062 )

        Git is not designed to handle big binary data. Since Git is creating SHA hashs for each file, with a file 500MB and more it will take more time, also it will use up all the RAM to calculate the hash.

        Git uses way too much RAM for some reason, but this is not it. Correctly done, it only takes kilobytes to calculate any hash, regardless of the size of the hashed data.
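The point about hashing in constant memory can be sketched as follows. This is an illustration rather than Git's own code: it reads a stream in fixed-size chunks and produces the same SHA-1 name Git gives a blob (a "blob <size>" header, a NUL byte, then the content). The function name and chunk size are arbitrary choices, and the sketch says nothing about where Git's memory actually goes.

```python
import hashlib
import io


def git_blob_sha1(stream, size: int, chunk_size: int = 64 * 1024) -> str:
    """Hash a stream the way Git names a blob, reading one chunk at a time."""
    digest = hashlib.sha1()
    # Git hashes a short header followed by the raw content.
    digest.update(b"blob %d\0" % size)
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # Only one chunk is held in memory at any moment.
        digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    payload = b"x" * 1_000_000
    print(git_blob_sha1(io.BytesIO(payload), len(payload)))
```

Memory use is bounded by `chunk_size` regardless of whether the input is 500 bytes or 500MB; only the running hash state is carried between chunks.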

        [...] Git is good for text documents and source code. But since even the Odt documents are binary blobs (the xml data is compressed with zip), you can't use git efficient with open document text files or other documents like Excel, Spreadsheets, etc.

        My solution to that is to avoid file formats and tools which are hostile to version control. We knew decades ago that MS Office was a mistake; why repeat it?

      • git-annex assistant (Score:4, Informative)

        by gottabeme ( 590848 ) on Saturday July 14, 2012 @07:31PM (#40651953)

        This is what we are all waiting for, and it's already been funded! Just a matter of time until Joey finishes it: http://www.kickstarter.com/projects/joeyh/git-annex-assistant-like-dropbox-but-with-your-own [kickstarter.com]

    • I was attempting to install this, when I went with ownCloud instead. The reason? SparkleShare doesn't have Windows sync clients that work on XP. That's a deal killer. Many small businesses have only XP machines. Yeah, it may be time to let it die, but what good is a sync client that only works on half of your PCs? It doesn't matter if it is time to move on from XP; what does matter is that lots of people haven't.

      ownCloud, however, was smart enough to make a Windows sync-client that works with XP.

      So far, ownC

  • Real Cloud (Score:4, Informative)

    by PiSkyHi ( 1049584 ) on Saturday July 14, 2012 @02:53AM (#40646719)
    I did misread this. When I think cloud computing, I am coming from a CS point of view, which is that cloud computing is the term used to describe the efforts to make scalability of software as a service ubiquitous. Basically, the cloud is not a bunch of servers, it is the infrastructure that provides scalable services to an application layer like the web. Amazon pretty much built the best cloud and others are following their lead. So, I have been looking at OpenStack [openstack.org]
    If anyone actually thinks this question is in any way relevant, please let me know if there are other resources.
    • Thanks, I'll add that to the list of definitions of "cloud" that I have heard from computer scientists. "Cloud computing" is an undefined term; at this point, people use it to mean whatever they want. Scalable infrastructure, computation as a utility, storing files on a server, whatever, it's all cloud at this point.
      • I agree that it is a buzzword that is thrown around, but when I compare someone wanting a personal cloud so they can share infrastructure across their personal devices, basically to their own VPS somewhere - syncing their contacts, to the gradual advent of transcending hardware constraints to services by automating the sharded location of data and virtual compute nodes to scale tasks out as if the hardware is just a temporary location for some info passing through it. Servers used to have names, now, in the
        • Servers used to have names, now, in the cloud they are scalable service template instances

          That is only one of literally dozens of uses of the word "cloud." Deploying virtual machines on clusters or grids is great, I agree -- but calling it "cloud" is about as useful as calling it "a thing."

          It's not a buzzword, it's quite a complicated thing that has arisen from the abundance of hardware as a unit and the requirement that none of it be solely relied upon just to provide services

          So now it is redundancy? Virtualization?

          People who work with cloud infrastructure

          Everyone works with "cloud" infrastructure today, because it is trendy...

          This is why it's a cloud. It's not a buzzword after all.

          In other words, because you think "cloud" should mean the thing you are using it to mean, it must not be a buzzword.

          • Everyone works with cloud infrastructure because it's trendy? Don't try and learn anything, it could be painful. Keep it trendy, that way you don't have to pay attention.
  • pull it out of your phone and plop it into another device to import? If you're gonna put all this retarded effort into the "cloud", why not just set up VNC and log into your computer at home and grab the contacts? You know, something that's been available for over a decade.

  • by Anonymous Coward

    *snort* 27 posts so far and no one seems to really have addressed the poster's real question. (Instead, all I've read is basic suggestions like a file share, VNC/SSH, or OpenStack; all of which seem to ignore the main point: "is it possible to build my own personal data repository, where for example, I can store my contacts and calendars to sync to multiple devices?")

    I've been looking for something like this for a while now, actually. From my research, I think the best way to solve this problem is to set

    • by jbolden ( 176878 ) on Saturday July 14, 2012 @04:08AM (#40646921) Homepage

      I tried something like this last year using Linuxy solutions. For a midsized setup (30k users in groups ranging from about 30-500). For personal though I'm not sure it doesn't make more sense to just treat calendar and disk storage as two totally distinct problems and thus simplify the solution. Pick any of a dozen different internet calendar / scheduling services and do storage by itself.

      But if you want to know the lay of the land as far as groupware:

      1) I didn't go with Zimbra because at the time they were focused heavily on the rack server space and their longer term direction scared me. The cost per user was high for the commercial version and I did want commercial version features.

      2) Scalix was really good 4-5 years ago. But is essentially now unmaintained. If you can live with broken compatibility and FireFox 3 for less than 10 users it is free. It has a very advanced calendar and an easy to use but powerful administration system. Really nice but I'd have a hard time going with a product that is now essentially dead.

      3) OX (http://www.open-xchange.com/home.html) has what you are looking for. But understand that, for whatever reason, the app is not written MVC: GUI code is completely intermixed with functionality. It is effectively not much more changeable than a closed source program. They were working on this and by 2014 or so that likely will be fixed.

      There were some others I experimented with if this is the sort of information you are looking for.

    • At least I told you when I mentioned Openstack, I was talking about cloud services and not the failure of all mobile vendors to implement SyncML.
    • by Bert64 ( 520050 )

      I looked at funambol, but didn't like the idea of having to install a client on each device.

      However, I do something similar with Zarafa...

      Their old web ui was pretty ugly, but the new one is much improved...
      It supports caldav (which many desktop clients and ios devices support by default).
      It also supports activesync through the z-push plugin, which ios/android/webos/etc all support by default, and which will sync mail/contacts/calendars.
      And there's another plugin i recently installed to get carddav support,
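For context on what a CalDAV client and server actually exchange: calendar entries travel as iCalendar (RFC 5545) text. The sketch below builds one minimal event. The function name, UID, and PRODID string are made up for illustration, text escaping is deliberately not handled, and nothing here is specific to Zarafa or z-push.

```python
from datetime import datetime, timezone


def make_vevent(uid: str, summary: str, start: datetime, end: datetime) -> str:
    """Build a minimal iCalendar document containing one event.

    Expects timezone-aware datetimes; summary must not contain commas,
    semicolons, or newlines because this sketch does no escaping.
    """
    def stamp(moment: datetime) -> str:
        # iCalendar UTC timestamps look like 20120714T090000Z.
        return moment.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")

    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//personal-cloud-sketch//EN",  # hypothetical product id
        "BEGIN:VEVENT",
        f"UID:{uid}",
        f"DTSTAMP:{stamp(datetime.now(timezone.utc))}",
        f"DTSTART:{stamp(start)}",
        f"DTEND:{stamp(end)}",
        f"SUMMARY:{summary}",
        "END:VEVENT",
        "END:VCALENDAR",
    ]
    # The format uses CRLF line endings.
    return "\r\n".join(lines) + "\r\n"
```

A client would typically upload text like this to a calendar collection on the server, and every other device syncing that collection would then fetch the same event.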

  • It runs Linux, you can ssh into it and install or compile whatever you want, it comes in up to 4 gigs, and I think they just got a dual-drive one. Use the built-in internet access to the Twonky server or install some free cloud software.
  • Grab an old box, stick some hard drives in it with some sort of RAID, encrypt the partitions and use rsync or similar for backing things up. Want extra redundancy? Use a USB drive or buy a cheap old tape drive off ebay.

    Forward SSH to it and you have "Cloud Storage". This really isn't a new concept.
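To show the kind of decision a tool like rsync makes by default, here is a local-only sketch that copies a file when it is missing at the destination or when its size or modification time differs. It is a simplification for illustration, not a replacement for rsync: there is no delta transfer, no deletion of stale files, and no network transport, and the function name is made up.

```python
import os
import shutil


def sync_tree(source: str, destination: str) -> list:
    """Copy files that are missing or differ in size/mtime; return what was copied."""
    copied = []
    for root, _dirs, files in os.walk(source):
        rel_root = os.path.relpath(root, source)
        target_root = os.path.normpath(os.path.join(destination, rel_root))
        os.makedirs(target_root, exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(target_root, name)
            src_stat = os.stat(src)
            if os.path.exists(dst):
                dst_stat = os.stat(dst)
                unchanged = (
                    src_stat.st_size == dst_stat.st_size
                    and int(src_stat.st_mtime) == int(dst_stat.st_mtime)
                )
                if unchanged:
                    continue  # quick check passed, skip the copy
            shutil.copy2(src, dst)  # copy2 preserves the modification time
            copied.append(os.path.normpath(os.path.join(rel_root, name)))
    return sorted(copied)
```

Running it twice in a row copies everything the first time and nothing the second, which is the property that makes nightly backups over a slow home uplink practical.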

  • What is the cost of a roll-your-own cloud solution? Most discussions about the cloud miss out on the most important element, which is the cost. People use Google because it is essentially free, and gives you very decent reliability. I know you can make your own home server super reliable, but in aggregate, if 1 million people were running their own servers, compared to 1 million on google, I would bet that the 1 million on Google's cloud would do better on uptime in aggregate. The cost of trying to get to G

  • Run your own fail safe data repository. Companies have been doing this for ages and it isn't that hard nor expensive to implement it at a smaller scale for your own needs. No cloud needed;-)

    Just use rsync, and something similar to mysqldump and mysql replication along with 2-4 linux nodes ideally hosted on different network/providers. You can host the nodes in VMs connected to regular consumer grade DSL or cable modem connections. You could make peering agreements with friends and relatives, I host your nod

  • most of us nerds have been doing cloud computing with our own *NIX on x86 boxes for years, running home servers with lamp + SSH.

    then there is this pogo plug thingy which does the same thing but for newbs who don't want to do the setup, and for cheap.
  • Doesn't this question get asked here like every other week now?

  • by jon3k ( 691256 ) on Saturday July 14, 2012 @10:04AM (#40648165)
    "Personal Cloud" is a misnomer, at best.

    Using the wikipedia definition:

    "Cloud computing is the delivery of computing and storage capacity as a service to a community of end-recipients."

    The whole point of a cloud is to abstract a massive underlying infrastructure to deliver some type of computing service (PaaS, IaaS, SaaS, etc. ad nauseam) to a large group of users and to be able to scale that infrastructure seamlessly. A "personal cloud" is an oxymoron.

  • In a word ... Citadel. [citadel.org] (Disclaimer: I am a developer on this project, and yes, I'm flogging it here.) Contacts, calendars, notes, documents, email, etc etc. One single installation without a zillion dependencies.
  • If you're more of the DIY type, like myself, I'd suggest building your own from scratch. Remus [wikidot.com] is an excellent choice for a high-availability environment. Admittedly, it's still a relatively young project, but as of Xen 4.2 (currently the unstable branch), it's been largely stable and easy to work with. You can even use DRBD as the storage backend (currently it's using a modified DRBD with a new "protocol D" synchronization method, but prot D is going to be integrated into the main DRBD branch as of DRBD
