Linux Software

Swap Performance in Linux (62 comments)

GizmoDuck writes "I'm working in a computational chemistry lab, and we find ourselves using memory and CPU hogs like Amber and Gaussian. The CPU hogging isn't a problem, thanks to Condor, but when submitting one of the jobs that request (and pretty much require) all the physical RAM in the machines, Linux promptly starts swapping so hard that the mouse pointer in X stops moving, NFS and NIS halt, and things don't get back to normal for five minutes. I've tried toying a bit with the settings in /proc/sys/vm/kswapd to no avail. I've done some poking around on the 'net looking for answers. Faster disks and swap partitions at the beginning of the drive aren't really an option at this point. I haven't found a good solution yet. I was wondering if the /. community has any input on how to keep the system from locking during periods of necessarily high swap activity?"

  • Have you actually tried running without a swap partition? I don't know if it makes a difference here, but I think it's worth a try if you've got enough RAM in your machine!
    • If you have enough RAM, it won't even use the swap. So if it is using the swap and you take the swap away, you'll likely just run out of memory.

  • by Anonymous Coward
    2.4.x

    I still have no idea why Linus used 2.4 as a development tree. Go back to 2.2.x, no swapping problems going on there.

    By the way, does anyone know the command to flush the swap partition?
  • preempt ? (Score:4, Informative)

    by raulmazda ( 87 ) <adam@laz[ ]org ['ur.' in gap]> on Tuesday March 12, 2002 @04:45PM (#3151877)

    Maybe try out the preemptible kernel patch?

    My personal experience is that it has helped my workstation's interactive performance noticeably for big ass c++ compiles and periods of lots of disk activity (big apt-get dist-upgrades). Thankfully, I'm no longer doing the big ass c++ compiles, so it's not as big of an issue as it used to be :)

  • by PaulBu ( 473180 ) on Tuesday March 12, 2002 @04:46PM (#3151888) Homepage
    It should improve interactive performance (i.e., your mouse will start moving again :) ) when load is high. Also, running your background process nice'ed will be helpful.

    You might also consider the crazy idea of putting a swap file on NFS -- you'll get (if your network is decent) almost the same bandwidth as you get when accessing an (older) disk, but much higher latency (this will put your background process at a disadvantage compared to your interactive processes).

    Hope this helps.

    Paul B.
    • You might also consider the crazy idea of putting a swap file on NFS -- you'll get (if your network is decent) almost the same bandwidth as you get when accessing an (older) disk, but much higher latency (this will put your background process at a disadvantage compared to your interactive processes).

      A neat idea, but wouldn't that just migrate the problem to the NFS host? I'm too lazy to try it myself.

      • A neat idea, but wouldn't that just migrate the problem to the NFS host? I'm too lazy to try it myself.

        Sure it would, but:

        interactive performance on the NFS server might not be that important

        it might have faster disks

        and, finally, the swap hog program will slow down due to network latency, creating less load on NFS server than it would on the workstation.

        Paul B.

    • That sounds nice until you realize it's your X server and xterms that'll get swapped out as soon as you leave your workstation for a moment. When you have to wait for *those* to come in over the network, you'll be crying for local swap once again.

      Network swap is really only a useful option for diskless workstations.

      --Joe
  • by Skapare ( 16644 ) on Tuesday March 12, 2002 @04:48PM (#3151910) Homepage

    If your program(s) push Linux to the point where it actually runs out of available RAM faster than it can free it up, then "all hell breaks loose". It has to swap something out, and just about every program is eligible to be swapped out. That includes GPM (if you are on a virtual console) or X (if you are in X Windows). You need to account for all of these things to determine your RAM needs. Add up the memory usage of all your active programs, plus the buffer demands they have doing disk I/O, plus the kernel, and you need that much RAM. If the program is doing a LOT of disk/file writes, you can expect the buffer demands to be the majority of this, too (because the kernel believes that what you just wrote you might soon want to read back, so it tries to keep lots of it in RAM even if that means swapping out GPM and X).
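
A minimal sketch of the accounting described above, assuming a Linux `/proc/meminfo` in the usual `Name: value kB` layout. The function names `parse_meminfo` and `memory_summary` are illustrative and not from the thread, and older kernels may lay the file out differently.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        # Keep only "Name: <number> ..." lines; skip headers and blanks.
        if sep and fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def memory_summary(info):
    """Split RAM use into application memory and kernel buffers/page cache."""
    cache = info.get("Buffers", 0) + info.get("Cached", 0)
    return {
        "apps_kb": info["MemTotal"] - info["MemFree"] - cache,
        "cache_kb": cache,
        "swap_used_kb": info["SwapTotal"] - info["SwapFree"],
    }
```

Feeding it the contents of `/proc/meminfo` before and during a big job gives a rough idea of how much of the pressure comes from the programs themselves and how much from disk buffers.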

  • by OneFix ( 18661 )
    You didn't actually say, but I would assume that you are not using EIDE or have a relatively high amount of RAM (512M+) in these systems. Otherwise, have you recompiled the kernel to have only what you need? Compressing the swap partition probably wouldn't give any more performance (as it would be wasted on CPU time).

    What are the system specs?
  • FreeBSD (Score:3, Interesting)

    by paul.dunne ( 5922 ) on Tuesday March 12, 2002 @04:54PM (#3151973)
    Is switching to FreeBSD an option? The virtual memory management there is much better than in Linux under stress.
    • Re:FreeBSD (Score:4, Interesting)

      by Aaaaaargh! ( 466118 ) on Tuesday March 12, 2002 @05:59PM (#3152455) Homepage
      Is switching to FreeBSD an option? The virtual memory management there is much better than in Linux under stress.

      I'd have to agree. The author should look into using FreeBSD. A GIS project I'm currently working on allocates 3GB of RAM at startup. Until we get the rest of the funding for our SunFire solution [sun.com], we're using what we have available, which is (was, actually: we've replaced the OS with FreeBSD) a P4 Linux box with 2GB of RAM, a 9GB SCSI drive for swap partition and a 36GB SCSI drive for everything else.

      I'm not a Linux expert, but the techs in the department are. After a few weeks of their tinkering, it did pretty much the same thing as you're experiencing. I have a small development system at home (P3, 1GB RAM, 4GB SCSI swap, 40GB IDE for all else) running FreeBSD. Installed the software, and it runs like a charm. X works beautifully, Apache still serves up pages (of course, it doesn't get much traffic at home) and the program never chokes the system. Granted, with only a gig of real memory, it spends a fair amount of time accessing the disk (about 30 seconds every 2 minutes), and it steals almost all the cycles from dnetc [distributed.net]!
  • by haplo21112 ( 184264 ) <haplo@ep[ ]na.com ['ith' in gap]> on Tuesday March 12, 2002 @04:58PM (#3152008) Homepage
    The best way to handle this (or at least the best way I handled a similar situation) is to combine Robert Love's preempt patch and Ingo's scheduler.
    They significantly improve interactive performance under high load and keep the system from running away with itself. If you're feeling really adventuresome, you could also throw in Rik's rmap VM... I have done very little testing with it, but I hear a lot of reports that it helps.
    They are all available in the authors' respective directories on Kernel.org [kernel.org]: riel, rml, mingo
    • In particular you might try patch-2.4.18-pre9-mjc2.bz2 [kernel.org], which includes the O(1) scheduler, preempt, and Rik's rmap VM (among other things), and has been working solidly for a number of people. At least, it is worth testing out to see if it helps any.

      To build it, get the linux-2.4.17.tar.gz [kernel.org] kernel, patch [kernel.org] it to linux-2.4.18-pre9, then patch again with patch-2.4.18-pre9-mjc2. Then build and use the kernel. Check recent (i.e. 2002) kernel archives to read discussion of this and other related patches, if desired.

  • You're out of luck (Score:5, Interesting)

    by afay ( 301708 ) on Tuesday March 12, 2002 @05:07PM (#3152087)

    Unfortunately, you're out of luck. The current linux VM (in later 2.4 series) is fine for low to medium load systems but falls apart on high load systems. The previous VM (early 2.4 series) is a good design but isn't really ready for production.

    I would suggest buying more RAM (it's cheap) if you aren't already maxed at 4 gigs (x86). Alternatively switch to FreeBSD which has a very stable efficient VM. Any source should recompile without too much trouble and it can run linux binaries at almost full speed!

    • Would seem to be broken too; the VM refuses to work with less than 8MB of physical RAM.
    • The VM in the early 2.4 kernels would grossly lock up when it was out of memory. I was told this was due to the fact that the design assumed you had at least as much swap space as RAM. It could not handle the case of (memory need > swap) even though (memory need < swap + ram). I have several systems which have lots of ram and no swap at all, and they would die quickly. And it wasn't because I was overusing memory with the processes. This would happen even if the ram got used up when writing data to a file larger than ram space. The later 2.4 VM fixed that. Hopefully when Rik's VM is cleaned up, it should solve the problem with lack of (or small) swap.

  • First of all, I would recommend trying the preemptible kernel patch and even the low-latency patch. It seems like an obvious enough suggestion, but some will tell you that these patches should not be used in servers where throughput is important, and that is correct... in some cases. It has been shown, however, that in most cases the preemptible patch increases performance and throughput. I have not heard of any such testing on the low-latency patch, as I am new to it.

    In my testing, these two patches have been a big help, especially on my P166 system with 48MB RAM.

    Also, you say "faster drives" and repartitioning are not feasible ATM, but how about multiple small drives? As shown in this howto [linuxdoc.org], the linux kernel has support for striping data to swap disks, just by specifying multiple swap entries in fstab.

    Then again, if you're not on SCSI, trying to stripe to the swap drives won't be much help anyway, as RAID over IDE for _speed_ usually is just crap.

    That last suggestion may not be for you, but definitely try the two patches. It should also be noted that preempt is a compile-time option, and there is also a compile-time option to control the low-latency patch through /proc/sys/kernel/lowlatency. An additional patch may be required to allow these two to work together, but I am unable to locate it currently.
    • Raid over IDE performance is crap unless it's Raid-0 (striping) whereupon the more disks you can throw at it (up to the controller's physical limits) the faster it goes. You could have a screamin' swap partition setup on a separate Raid-0 (no mirroring 'cause it's swap only).
      ('Course swap is how many millions of times slower than RAM...? :)
      And if you want that performance with redundancy then do Raid-0/1 with separate controllers for each 0-set (you want physical redundancy anyway) and your performance hit from mirroring actually won't be too bad.

      My $.031475

  • just a shot in the dark here, but can you just give a lower priority to those applications in order to keep the workstation usable while doing this work?

    I can't recall the command line option off the top of my head but I know using Gtop, you right click the app, and pick renice, then set it to 1 instead of 0.

    • Done already. Running the apps at +5 or even +10 doesn't seem to do much. I even risked my cojones and nice'd kswapd and bdflush, which did nothing.
    • That probably isn't going to help matters. Each process needs a certain amount of memory space. No matter how nice it is, it will still require that much memory. The only thing nice will do is maybe cause a small change in how often the programs get data swapped in and out. It would be very small, since the data is going to have to be swapped in/out eventually anyway.

      BTW, the command is "nice", or "renice" if it's already running. Pretty tough to figure out.
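
For anyone launching jobs from a script rather than a shell, here is a minimal sketch of the same idea in Python. The helper name `run_niced` is made up for illustration. As the replies above note, a higher nice value only changes CPU scheduling and does not reduce how much memory the job needs.

```python
import os
import subprocess


def run_niced(argv, increment=10, **popen_kwargs):
    """Start argv with its nice value raised by `increment`; return the Popen."""
    return subprocess.Popen(
        argv,
        # Runs in the child between fork and exec, so only the job is affected.
        preexec_fn=lambda: os.nice(increment),
        **popen_kwargs,
    )
```

Calling `run_niced(["some_job", "input.dat"])` (a hypothetical command) is roughly what `nice -n 10 some_job input.dat` does from the shell.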
  • Are you running a recent kernel? It's got a lot better in the newer 2.4 series. We replaced the original kernel in RedHat 7.2 with the errata kernel, and it is much better!
  • Linux 2.4.x VM (Score:3, Insightful)

    by Trevelyan ( 535381 ) on Tuesday March 12, 2002 @05:20PM (#3152181)

    Did you miss all the 2.4 Linux VM Stories?

    I suggest building and installing the latest kernel with the aa VM (the default VM since 2.4.10). If you still have VM (swap) problems, then go get the latest rmap VM patch and try that.

    The kernel VM (virtual memory subsystem) is what manages memory and swap, btw.

    And if you did miss all the VM stories, a summary:
    At the start of 2.4 a fancy new VM was put into action, using something known as reverse mapping. This was very clever, but it wasn't quite ready and there were teething troubles. Then suddenly (2.4.10) Linus switched to a VM similar to that of 2.3 (with some updates and a few features from the previous 2.4 VM). This started a big fight, which caused concern (for instance, that it might split the Linux community).

    Which is better I don't know; some swear by one, others swear by the other. But unless you're using the RH 2.4.9 kernel, I would not recommend a pre-2.4.10 kernel.

    However, you may need to experiment to see which is best: the VM now in 2.4 (to stay) or rmap. You should try both and see.

    Steps:
    Install 2.4.[17,18,19] [kernel.org]
    Try it.
    If it fails, try the rmap patch [surriel.com]

  • Shut down X? (Score:5, Interesting)

    by swillden ( 191260 ) <shawn-ds@willden.org> on Tuesday March 12, 2002 @05:41PM (#3152312) Journal
    Do you really have to be using the machine while it's calculating? If not, what about shutting down X and any other memory-hogging system components? Unlike on Windows you do have the option of turning off that expensive GUI.
  • What I'd like to see is something along the lines of some kind of LRU which gently starts swapping data back into memory from swap when memory becomes free. There's nothing like having VMWare sitting in swap since you stopped using it an hour ago to do some other work and then jumping back and having to wait the 5-10 seconds of heavy disk activity to resume work there.

    As for those saying "don't use swap at all" -- that's crazy talk. I'd rather have an app or two go to swap instead of being outright killed by the VMM when it needs an extra meg or so. If I'm not mistaken Linux tends to pick the big memory eaters to dump to swap over the little guys so if you start a compile... there goes VMWare... or your IM client... or Konqueror... lots of fun. :-)

    • That's because if the process is sleeping, Linux keeps it in swap so it can use your memory for things that _do_ help you, _right_now_: Disk cache.

      You may disagree with the sentiment, but you get benefits from it all the time. All those gettys that you're not using? Init? Portmap? Pump? devfsd? lpd? atd? cron? xinit? Those are swapped out, and they won't be wasting your RAM until you want to use them (and then, it's not "wasting").

      But yes, for extremely large programs, it'd be swell if Linux (or anything) could predict you'll want something before you actually want it.
  • ... since I always get flamed or moderated down when I say things like this on Slashdot, but what you're looking for is FreeBSD [freebsd.org] (with Linux emulation if you need to run closed-source stuff).
  • by leastsquares ( 39359 ) on Tuesday March 12, 2002 @06:37PM (#3152728) Homepage
    Why don't you submit this query to the computational chemistry mailing list (see CCL [ccl.org])?

    Those people may be able to give you some sensible suggestions, especially with respect to those particular pieces of software.

    I believe that you can restrict the amount of memory that Gaussian uses via its keywords. When it requires more, it will handle the dumping of data to disk itself. Read the manual - I haven't used Gaussian since g94 was the current version, so I can't remember.

    How big is your AMBER simulation? I think I would run a smaller system... or even better... buy some more RAM given that it is dirt cheap nowadays.

    AMBER's memory use is a bit heavy - you may have better luck with another MD package. Maybe NAMD? (Although I'd still vote for the "buy more RAM" option)
  • by Spoing ( 152917 ) on Tuesday March 12, 2002 @07:07PM (#3152985) Homepage
    I'll second the other comments already made. In addition, sometimes the simplest ideas are the most valuable, though I'll assume you can't just drop in more RAM.

    With that as a given, if your app needs all available memory, run top and lsmod to see what's using your memory and remove everything you don't need (usually by deleting the links to those processes in the /etc/???/rc5.d directory).

    If you can't remove it, scale it down. For example /etc/inittab lists off the different virtual terminals that appear when you press ctrl-alt and a function key. If you never use this feature, try reducing this down to 1 or 2 terminals. Leave some behind just in case you need them later. To do this, just comment the higher numbered lines that look like this;

    6:2345:respawn:/sbin/mingetty tty6

    (NOTE: Removing these lines might not make any difference -- it all depends on the distribution.)

    As for X (assuming you need it and are using XFree), try removing any Load lines in the modules section that you don't need and scaling down the display size, background images, and color depth. Another big area of savings is changing the window manager. FVWM usually is installed, and while it is ugly it is also fairly lightweight when compared to KDE, Gnome, and other popular full-featured WMs.

    While these steps alone won't eliminate the speed problems -- the other comments might solve that -- the time you spend waiting might be cut way down.

    • Why not run this simulation on a dedicated system, then ssh in and let X forwarding handle the display? No X running on the simulating machine, more RAM for the app, and no unresponsive swapping issues.

      Of course, it's a work-around; this doesn't actually *solve* your problem, but you'll have to talk to Linus and the rest of the kernel hackers to get a real solution.
  • I'm one of those people who don't use swap space. My rationale is this: it's sped up my computer a lot. I have a relatively slow computer (350) and I forked out $35 for another 256 megs of RAM; now I have 384 and I feel it's the best thing I've done. Sure, I don't use my entire RAM on one program, but for general activity I find I only end up using about 128 megs of RAM (lots of programs working), and I get a lot of hard drive cache space, so those MP3s keep rolling in. If for a few dollars I can be a lot more comfortable with my system's performance, why not? I have a 60 gig hard drive, but I bought it to store data, not as temporary memory.
  • If you are already maxed out on RAM, then "nice" is your friend. You can try to squeeze out as much performance as you want, but if you don't have enough RAM, or more RAM is not an option, then you just have to deal. Also, make your swap partition bigger if it's filling up too quickly. If 8 people are running Gaussian and Amber, then obviously you need more than the "recommended" swap, even with large amounts of memory. 4 gigs of memory seems like a lot until you start doing heavy shit... that's where you get a nice cheap 40-60-100 gig drive and make the whole thing primarily for swap. Problem solved. You might also want to write yourself a daemon that nices based on order, from -20 to 19. First in gets the highest priority; subsequent processes get a lower priority and get reniced to a higher priority as the first process finishes. This way, if it's only a dual system, even though Linux is pretty good with multi-processor support, you get even more efficient scaling. If worst comes to worst, lobby for a couple of blades or Netras or something and stack 'em.
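
A rough sketch of the "nice by arrival order" daemon idea, reduced to its core. The names `plan_nice_levels` and `apply_nice_levels` are hypothetical, and the default range starts at 0 because only root may assign negative nice values or lower the nice value of a running process.

```python
import os


def plan_nice_levels(pids, best=0, worst=19):
    """Map PIDs (oldest job first) to nice values spread from best to worst."""
    if not pids:
        return {}
    if len(pids) == 1:
        return {pids[0]: best}
    step = (worst - best) / (len(pids) - 1)
    return {pid: best + round(i * step) for i, pid in enumerate(pids)}


def apply_nice_levels(plan):
    """Renice each process; promoting a job back up needs root privileges."""
    for pid, level in plan.items():
        os.setpriority(os.PRIO_PROCESS, pid, level)
```

A daemon along these lines would recompute the plan whenever a job starts or finishes and call `apply_nice_levels` again, so the oldest surviving job always ends up with the best priority.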
  • A Better Way (Score:1, Interesting)

    by Anonymous Coward

    /etc/security/limits.conf

    I use this method. I specify default values for nice levels, amount of CPU time, amount of memory, etc.

    This is a much better way. I will set up accounts with these restrictions. That way processes are running at a nice level of e.g. 5. X will be running at level 0 by default. This ensures that you can always get back into X even if the app (daemon) goes nutty with e.g. a memory leak.

    No messing with command lines etc. The defaults have already been set.
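
The limits.conf approach applies at login through PAM. As a per-launch analogue, here is a minimal sketch using Python's `resource` module to cap a job's address space and CPU time before it starts. The helper names are made up, and this is an assumption-laden stand-in for the poster's setup, not a reproduction of it.

```python
import resource
import subprocess


def _lower_soft_limit(which, value):
    """Set the soft limit to `value`, never exceeding the current hard limit."""
    _soft, hard = resource.getrlimit(which)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(which, (value, hard))


def run_limited(argv, max_bytes, cpu_seconds, **popen_kwargs):
    """Start argv with an address-space cap (bytes) and a CPU-time cap (seconds)."""

    def limit_child():
        # Runs in the child only, after fork and before exec.
        _lower_soft_limit(resource.RLIMIT_AS, max_bytes)
        _lower_soft_limit(resource.RLIMIT_CPU, cpu_seconds)

    return subprocess.Popen(argv, preexec_fn=limit_child, **popen_kwargs)
```

With a cap in place, a leaking job fails its own allocations instead of pushing the rest of the machine into swap, which is the effect the post is after.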

  • unmask interrupts (Score:3, Interesting)

    by Wills ( 242929 ) on Tuesday March 12, 2002 @08:44PM (#3153630)

    You could try hdparm -u 1 which unmasks interrupts when the disk interrupt service routine is active. This often allows your mouse to continue moving even if the disk is busy dealing with swap. It's not perfect but it helps a lot. As others have suggested, also try the preemptible kernel patch but keep backups!

  • 1) I can't seem to get on the CCL list. I couldn't find automated instructions and when I sent an e-mail to chemistry-request, nothing happened.

    2) We're already nice-ing things up the yin yang and using the 2.4.18 kernel with pre-empt patch with no noticeable results.

    3) The machines must stay useable as they are also analysis and server machines in addition to computational boxes.

    4) Machines are dual P3 1400s. Unfortunately, disks are EIDE and RAM is 256MB in the process of being upped to a gig. However, this doesn't change the fact that we'll be running some calculations that will use all of that.

    5) We're not so anxious to buy 4GB of RAM for each machine until we're sure what kind of Beowulf cluster we're constructing and hence how much of our money goes to it.
  • FreeBSD (Score:2, Interesting)

    by DiSKiLLeR ( 17651 )
    The memory manager in Linux has lots of problems (as previous posters have pointed out).

    Have you tried FreeBSD? Apart from being a better OS all round, the 4.x series has a brand new revamped VM subsystem that handles high memory loads very efficiently. I never have a problem with swapping on any of my machines (which range from 32mb, 64mb, to 512mb ram machines).

    This isn't a troll. Sometimes a certain OS isn't the best solution for a job, and a different OS should be used. I use Linux for GUI/X type things, FreeBSD for heavily loaded servers (since it handles much better), and even Windows 2000/XP for other things. If those programs you use are linux binaries, FreeBSD can easily run them. If you have source, all the better. Recompile with all the specific optimizations for your hardware. (-O3, -mcpu=pentiumpro, -march=pentiumpro, etc)

    D.
  • Is the kernel tuned to match the hardware correctly? For example, if running with IDE drives, have you run hdparm to optimise the UDMA transfer mode and other IDE parameters? Is the kernel optimally compiled for your processor? Depending upon your distribution and hardware details you may need to change the kernel for maximum performance.
  • Simple solution to your problem. Put your swap partition on a RAM disk!

    Performance problem solved :)
  • Why don't you switch to a Windows based OS. At least that way when your PC's lock up for 5 minutes nobody will notice.
  • IMHO, your box is underspecced (RAM, IDE hard disks) for the job you are doing.
    Of course, you can try reading /usr/src/linux[name]/Documentation/sysctl/vm.txt for some tunable /proc parameters (e.g. /proc/sys/vm/overcommit_memory should be 0 (zero)).

    Since you are using IDE disks, 'man hdparm' is your friend.
    Check your kernel config for DMA support for your mobo chipset.

    Daniel Robbins (from Gentoo Linux) has written an interesting article, "Maximum swappage": http://www-106.ibm.com/developerworks/library/swaptip2.html

    Linux allows you to parallelize swap, just like a RAID 0 stripe:

    /etc/fstab:
    /dev/hda2 none swap sw,pri=1 0 0
    /dev/hdb2 none swap sw,pri=3 0 0
    /dev/hdc2 none swap sw,pri=3 0 0

    E.g.: spread your swap over two disks with equal priority. That way you should, in theory, double read/write speed for the swap. Some further gains could be had if the swap partitions were moved off the disks that the OS and apps write to.
    But read the article.
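
To check which swap areas an fstab would actually stripe across, here is a small sketch that groups swap entries by their `pri=` option. The kernel fills higher-priority areas first and allocates round-robin among areas of equal priority, so only devices in the same group are used in parallel. The function name `swap_priority_groups` is illustrative.

```python
from collections import defaultdict


def swap_priority_groups(fstab_text):
    """Return [(priority, [devices])] for swap entries, highest priority first.

    Entries without an explicit pri= option are grouped under None, since
    the kernel then assigns their (lower) priorities itself.
    """
    groups = defaultdict(list)
    for line in fstab_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split columns
        if len(fields) < 4 or fields[2] != "swap":
            continue
        priority = None
        for option in fields[3].split(","):
            if option.startswith("pri="):
                priority = int(option[4:])
        groups[priority].append(fields[0])
    explicit = sorted((p for p in groups if p is not None), reverse=True)
    ordered = [(p, groups[p]) for p in explicit]
    if None in groups:
        ordered.append((None, groups[None]))
    return ordered
```

Run against the three fstab lines in the post, it reports /dev/hdb2 and /dev/hdc2 together at priority 3 (the striped pair) and /dev/hda2 alone at priority 1, used only once the other two are full.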

  • I know little to nothing about FreeBSD memory management, so I can't comment there... but IMHO Irix was a breeze to tune for both CPU and memory intensive applications. The machines I worked on did HUGE finite element analysis (dyna3d, hypermesh, etc.) and after tinkering a little with kernel parameters (low/high water marks, etc.) I was able to squeeze a lot more performance out of the boxes.
  • Your system is thrashing. The folks who claim that not running X or nice'ing non-essential processes will fix this are just plain wrong. The small amount of memory freed up will not help you, and nice'ing other processes will not help when the system is spending > 90% of its CPU time swapping or looking for swap victims. What you need is more RAM, boatloads of it. Max out your system, and if that still isn't enough then get systems that can take more RAM. Upgrading to FreeBSD can help if you are close to having enough RAM and just need a little better efficiency to get you by. Another advantage of FreeBSD over Linux is that generally when Linux starts thrashing it never comes back; it thrashes itself into oblivion until you reboot, while FreeBSD recovers after the memory hogs finally finish running.
