
Subversion as Automatic Software Upgrade Service?

angel'o'sphere asks: "I'm working on a contract where the customer wants an automated, Internet-based check-for-updates, update and install system. So far we've considered a Subversion based solution. The numbers are: a typical upgrade is about 10MB in size. Usually it's about 30 to 50 new files (which have an average size of about 200kB) and 2 database files (which can be anywhere from 500MB to 2GB) that change regularly. Upgrades are released about every 3 months, and this will probably become more frequent as the system matures. The big files are the problem as we estimate about 100-300 changes in every file. The total user base is currently 2000 users, creeping up to probably 5000 over the next year, and might finally end up at some 30,000 users. Any suggestions from the crowd about setting up a meaningful test environment? How about calculating the estimated throughput of our server farm? Does anyone know of projects that have tried something similar using an RCS or a configuration management system?"
"We want to support as many concurrent users as possible (bandwidth is not an issue). We use an Apache front end as a load balancer and as many Subversion servers as necessary on the backend. My largest worry, from my calculations, is disk access on the Subversion server. We could not run meaningful tests, because a typical PC kills itself if you try to run more than 4 or 5 parallel Subversion clients doing an upgrade (due to insanely high disk IO, and high seek times)."
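
As a rough starting point for the throughput question, the figures in the post can be turned into a back-of-envelope estimate. This is only a sketch: the 10MB upgrade size and the user counts come from the question, while the 24-hour rollout window is an assumed value chosen purely for illustration.

```python
# Back-of-envelope load estimate. Values marked ASSUMED are not from the post.

UPGRADE_MB = 10                        # typical upgrade size, from the question
USER_COUNTS = (2_000, 5_000, 30_000)   # current, next year, eventual user base
ROLLOUT_HOURS = 24                     # ASSUMED: window in which all clients upgrade


def total_transfer_gb(users: int, upgrade_mb: float = UPGRADE_MB) -> float:
    """Data served if every user downloads one upgrade (decimal GB)."""
    return users * upgrade_mb / 1000


def sustained_mb_per_s(users: int, hours: float = ROLLOUT_HOURS) -> float:
    """Average server output needed to finish the rollout within the window."""
    return users * UPGRADE_MB / (hours * 3600)


if __name__ == "__main__":
    for users in USER_COUNTS:
        print(f"{users:>6} users: {total_transfer_gb(users):6.1f} GB total, "
              f"{sustained_mb_per_s(users):5.2f} MB/s averaged over {ROLLOUT_HOURS} h")
```

An average like this says nothing about bursts or about disk seeks per request, which is the worry raised in the question, so it is best read as a lower bound to test against rather than a capacity plan.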
  • Rsync? (Score:4, Informative)

    by Karora ( 214807 ) on Friday September 16, 2005 @05:45PM (#13580519) Homepage

    Wouldn't Rsync be better for what you want? Why do you need to be able to choose different versions to fetch?

    If the files contain parts that are constant along with parts that vary, then rsync will in many cases transfer only the changed parts of the file. With Subversion that won't apply for binary files, but rsync will still recognise partial matches even on those.
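
The partial-transfer idea in the comment above can be illustrated with a toy block-matching delta. This is a sketch in the spirit of rsync, not rsync's actual algorithm: rsync pairs a cheap rolling checksum with a stronger hash, whereas this version simply hashes every window from scratch. The block size and function names are made up for the example.

```python
import hashlib

BLOCK = 4  # tiny block size for illustration; real tools use far larger blocks


def block_index(old: bytes, block: int = BLOCK) -> dict:
    """Map the hash of each fixed-size block of `old` to its offset."""
    index = {}
    for off in range(0, len(old) - block + 1, block):
        index.setdefault(hashlib.md5(old[off:off + block]).digest(), off)
    return index


def make_delta(old: bytes, new: bytes, block: int = BLOCK) -> list:
    """Describe `new` as ("copy", offset) and ("data", bytes) instructions."""
    index = block_index(old, block)
    delta, literal, pos = [], bytearray(), 0
    while pos < len(new):
        window = new[pos:pos + block]
        off = index.get(hashlib.md5(window).digest()) if len(window) == block else None
        if off is None:
            literal.append(new[pos])     # no match: this byte must be sent
            pos += 1
        else:
            if literal:                  # flush pending literal bytes first
                delta.append(("data", bytes(literal)))
                literal = bytearray()
            delta.append(("copy", off))  # receiver already holds this block
            pos += block
    if literal:
        delta.append(("data", bytes(literal)))
    return delta


def apply_delta(old: bytes, delta: list, block: int = BLOCK) -> bytes:
    """Rebuild the new file from the old file plus the delta."""
    out = bytearray()
    for kind, value in delta:
        out += old[value:value + block] if kind == "copy" else value
    return bytes(out)
```

With `old = b"AAAABBBBCCCCDDDD"` and a single byte inserted after the first block, only that one byte ends up in a `"data"` instruction; everything else is a `"copy"` of a block the receiver already has.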

  • times two (Score:3, Informative)

    by Lord Bitman ( 95493 ) on Friday September 16, 2005 @05:48PM (#13580551)
    remember that svn always uses more than double the actual space required to hold the files for a "working copy". For "one-way" updates, svn is _NOT_ the answer.
  • Agreed, rsync rocks (Score:2, Informative)

    by Anonymous Coward on Friday September 16, 2005 @06:27PM (#13580869)
    I have several apps like this. One is deployed to more than a dozen locations around the country, each having roughly 5000 users. It's a mod_perl app on BSD.

    My general routine: I have a "development server", and a staging farm (set up exactly like one of the customer's locations, right down to the network hardware). After changes are made and unit-tested, the changes are pushed to the staging servers using rsync. When all the various remaining tests pass, the software is pushed out to a customer's location (if they need to review the changes), or out to all locations.

    Note that I use rsync to PUSH changes on a regular schedule. The apps do not ever "phone home".

    My rsync script basically copies only what the app needs at run-time, skipping unit tests, Photoshop files, data, and all that stuff. It depends on an SSH key (which exists only on two machines and has a passphrase, so a key agent is required). It has a "fan-out" setting which allows up to N machines to be done in parallel.

    Also, my app is completely relocatable and cross-platform. I can check it out in any directory on any Mac, BSD, or Linux box and get to work. I can then push my changes directly from that development area to the staging server if needed. I use CVS and Darcs but that's not important, except to note that the rsync script needs to skip those "CVS" or "_darcs" files.

    Works great, very powerful. Of course I am leaving out details like choosing CVS tags, database schema migration, restarting/upgrading/installing daemons (hint, if you don't use daemontools, your apps will never be reliable), handling 3rd-party open source packages, pulling in changes that were made on the customer's machine (in an emergency for instance) etc., etc. But rsync is the core of it.
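
A push script along the lines described in this comment might look roughly like the following. It is a minimal sketch, not the commenter's script: the host names, paths, and exclude patterns are hypothetical placeholders, and it assumes `rsync` and an SSH agent holding the key are already available on the pushing machine.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Everything below is hypothetical: hosts, paths and the exclude list are
# placeholders, not the commenter's real setup.
EXCLUDES = ["CVS", "_darcs", "*.psd", "t/"]
HOSTS = ["staging1.example.com", "staging2.example.com"]


def rsync_command(src: str, host: str, dest: str) -> list:
    """Build an rsync-over-SSH command that pushes `src` to `host:dest`."""
    cmd = ["rsync", "-az", "-e", "ssh"]
    cmd += [f"--exclude={pattern}" for pattern in EXCLUDES]
    return cmd + [src, f"{host}:{dest}"]


def push_all(src: str, dest: str, hosts=HOSTS, fan_out: int = 4) -> dict:
    """Push to every host, at most `fan_out` at a time; return exit codes."""
    def push(host):
        return host, subprocess.run(rsync_command(src, host, dest)).returncode

    with ThreadPoolExecutor(max_workers=fan_out) as pool:
        return dict(pool.map(push, hosts))
```

Calling `push_all("app/", "/srv/app/", fan_out=2)` would run at most two transfers at once and return a mapping of host to rsync exit code, which makes failed hosts easy to retry.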
  • Disk Accesses (Score:2, Informative)

    by Anonymous Coward on Friday September 16, 2005 @07:13PM (#13581219)
    My largest worry, from my calculations, is disk access on the Subversion server.

    Put enough RAM in your server, and the changed portion will likely fit in cache. If that's not an option, use RAID to speed up disk accesses.

    Others have mentioned rsync. You might also consider xdelta.
  • Re:rsync (Score:4, Informative)

    by commanderfoxtrot ( 115784 ) on Friday September 16, 2005 @08:10PM (#13581548) Homepage
    Subversion uses binary diffs in a similar way to rsync. The original poster pointed out bandwidth was not an issue- therefore any bandwidth advantages rsync gives (and yes, there are plenty) are meaningless.

    Subversion gives excellent control (tags anyone?) of binary installations. We use it for things way beyond the usual source code storage.

    I have also found disk IO is the main killer. I would suggest looking into caching. The Subversion client sends straightforward HTTP commands to the server. I have a custom PostgreSQL backend which does some caching; in his place, I would set up a Squid to cache some basic data fetches. Obviously, you need to be careful not to cache old data, but that's not hard.

    So yes, Subversion is excellent for this, and with a little thought, the heavy disk IO can be reduced. Cache, cache, cache.
  • perhaps (Score:3, Informative)

    by /dev/trash ( 182850 ) on Friday September 16, 2005 @08:47PM (#13581710) Homepage Journal
    rdiff-backup
  • cfengine (Score:1, Informative)

    by Anonymous Coward on Saturday September 17, 2005 @09:25AM (#13584083)
    First of all, it's obvious you are not using enough RAM on the servers. Get 8 GB. Don't do the balancing with Apache. If you are using Linux, resort to IPVS instead. For the large database files you'll want to use rsync. After the transfer, though, most likely you'll still need to perform the actual update. That's where cfengine comes in. You set it up to run rsync every N hours, then perform operations (restarting programs, cleaning up, whatever) when there's new data. You can also use it to restart dead instances of your application, etc.
  • First of all, thanks for so many replies!

    First I'd like to clarify a bit; my original question was probably not clear enough!

    The clients of the system are customers. They have Windows PCs, as the software runs on Windows. On the server side we need to be able to authenticate every client, as there are several region- and user-level restrictions on who may access which file.

    You can assume there are simply 5 to 10 user levels, where a user on level 10 may access everything and a user on level 5 only a subset.

    So far SVN looks good:

    * authentication via the Apache front end, probably via an LDAP server

    * structuring the "download area" into directories with content appropriate to each user level
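
The level-per-directory idea above could be modelled as follows. This is only a sketch under assumed names: the directory names and their minimum levels are invented for illustration, and region restrictions are left out. In the setup described, the actual enforcement would sit in the Apache front end rather than in application code.

```python
# Hypothetical layout: each top-level directory of the download area is
# tagged with the minimum user level allowed to read it.
MIN_LEVEL = {
    "base": 1,
    "regional": 5,
    "full": 10,
}


def may_access(user_level: int, path: str) -> bool:
    """True if `user_level` meets the minimum level of the path's top directory."""
    top = path.strip("/").split("/", 1)[0]
    required = MIN_LEVEL.get(top)
    # Unknown directories are denied by default.
    return required is not None and user_level >= required


def visible_dirs(user_level: int) -> list:
    """Directories a user of this level would be allowed to update from."""
    return sorted(d for d, lvl in MIN_LEVEL.items() if user_level >= lvl)
```

Here a level 10 user sees every directory, while a level 5 user sees only `base` and `regional`, matching the "level 10 may access everything, level 5 only a subset" description.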

    Regarding rsync:

    * first of all, I did not know about it :D

    * my first investigation indicates several drawbacks

    It seems not to run on Windows (without Cygwin); users need to be Unix/Linux users on the server; and building a distribution seems "more complicated" than making a tag/version with SVN.

    Please consider: from the point of view of the service provider, the system is just the same as hosting a huge pile of source code. The starting distribution probably has 3000 files and is about 2.5 GB in size.

    The users need to have the ability to fall back to an earlier revision in case of errors during distribution.

    Users need to be able to upgrade to the latest HEAD (there is only one main trunk anyway).

    Regarding the performance of SVN: yes, we know we need to put a lot of RAM into the servers. But it seems we can't get rid of the disk IO, as SVN does not cache requests (in this case all clients always want the same release to upgrade to, and most of the time they have either the previous release or the one before it installed).

    However: alternatives to SVN are very welcome! I only wanted to make clear why we considered SVN in the first place.

    angel'o'sphere
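
Because nearly all clients ask for the same target release from one of two starting releases, the caching concern in this comment can be illustrated with a generic memoization sketch. This is not a Subversion feature: `build_upgrade` is a hypothetical stand-in for whatever disk-heavy step produces an upgrade bundle, and the revision numbers are invented.

```python
from functools import lru_cache


def build_upgrade(from_rev: int, to_rev: int) -> bytes:
    """Stand-in for the expensive, disk-heavy step (hypothetical)."""
    return f"upgrade {from_rev}->{to_rev}".encode()


@lru_cache(maxsize=32)
def cached_upgrade(from_rev: int, to_rev: int) -> bytes:
    """Build each (from_rev, to_rev) bundle once, then serve it from memory."""
    return build_upgrade(from_rev, to_rev)


if __name__ == "__main__":
    # 1500 simulated requests, but only two distinct revision pairs.
    for _ in range(1000):
        cached_upgrade(41, 42)
    for _ in range(500):
        cached_upgrade(40, 42)
    print(cached_upgrade.cache_info())
```

With only two distinct revision pairs in play, the expensive step runs twice no matter how many clients connect; the same reasoning applies if the cache lives in an HTTP proxy in front of the servers instead of in process memory.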
  • by eklitzke ( 873155 ) on Monday September 19, 2005 @03:06AM (#13594072) Homepage
    You may be interested in the Unison project. More info can be found here: http://www.cis.upenn.edu/~bcpierce/unison/ [upenn.edu]
