
What Do You Need To Watch For In A Linux SMP System?

Posted by Cliff
from the two-procs-is-almost-always-better-than-one dept.
thefin asks: "My research group has finally received funding (~$200K) for a single SMP box. I've looked over the offerings from Sun, IBM and Compaq. I was wondering what others think of the SMP offerings and, in particular, opinions of Linux as an SMP OS. What should we know before purchasing such a machine? What should we look for and what should we avoid? We will be using the box for large, individual-based ecological modeling efforts. These models are CPU intensive, and make heavy use of inter-node communications. In particular, inter-node communication can be a serious bottleneck in our models."
This discussion has been archived. No new comments can be posted.

  • by milgram (104453) on Wednesday January 10, 2001 @03:04AM (#517323)
    Why are you buying a single box? Is it necessary? You can find a lot of information about distributed computing at the aggregate [aggregate.org]. You could use the money to build a computer as fast, and have computers to use for other things when the computer is obsolete (6 months). :)
  • by Aunt Mable (301965) on Wednesday January 10, 2001 @03:50AM (#517324) Homepage
    Firstly, imagine a beowulf cluster of these.

    Secondly, if internode communication is as high as you say and you can't parallelise your code/algorithms, you made a good choice with SMP (as opposed to beowulf). Avoid x86, go for Alpha. Avoid Linux or *BSD, go for the native OS, which scales to multiple CPUs much better.

    If you can parallelise your code the obvious thang is beowulf and cheap x86 boxes.

    -- Eat your greens or I'll hit you!

  • by Aunt Mable (301965) on Wednesday January 10, 2001 @03:53AM (#517325) Homepage
    To avoid the troll calling: Linux and *BSD are good on x86 but in SMP they don't compare to Alpha's official OS.

    -- Eat your greens or I'll hit you!

  • but it isn't for massively parallel boxes. The kernel doesn't scale well, and it's not optimized for the boxes. You would be much better off buying your {alpha,sun,ibm,sgi} and running {tru64?,solaris,aix,irix}.

    If you mean to consider a Beowulf or MOSIX cluster of Linux machines, then Linux is a decent choice, but a cluster does not function the same as a shared memory supercomputer, and you will have to write programs in a different way to be able to use the clustering. This might be a problem, or it might not be.

    /*
    *Not a Sermon, Just a Thought
    */
  • Above all, use the 2.4 kernel whenever possible if you do SMP. The 2.4 kernel shows noticeable improvement over 2.2 even on a 2-processor machine, and while optimized for 8 or fewer processors, has been reported to run on 64 processors (though I can't remember just who did this; it might have been Linuxcare, but I really doubt it). It doesn't have NUMA yet, though this will probably be the next major change to Linux the way things are looking now. At any rate, just stick to 2.4 if you do use SMP. It scales far beyond its predecessor, to an extreme. Just review the list of changes and you'll see what I mean. (Gawd, I'd love to see some 'Mindcraft' benchmarks now!)

    Also, check your drivers for SMP compatibility, some are better than others and have reputations for it. On the extremely rare occasions that you can get a manufacturer-supplied driver, you should be ok if it's hardware meant for big boxes like this.

    Lastly, don't forget clusters. (warning: a long rant on clustering follows, if you aren't interested, this is the time to hit 'Back')

    Linux is probably the ideal solution for clusters because of its efficiency and reliability, and the fact that it hoses just about everything that is benchmarked against it on lower-end (i.e. two processors or fewer) machines. If you have a lot of clients on the network that perform mundane tasks (i.e. word processing, most spreadsheets, etc. that don't fully utilize system resources), utilizing idle cycles à la SETI@Home, combined with a network renovation to 100TX or better (not very expensive these days, but you'd want to eliminate the use of hubs for any machines involved, since bandwidth is at a premium), could yield a lot of horsepower that, at the very least, could supplement your big box (especially in the off hours, provided the computers are left on).

    The money invested in this may give you more bang for the buck than buying new equipment, and if you want to cluster smaller machines instead of buying one big box, you can use the remaining money to buy a set of dedicated machines, string them together on gigabit fiber-optic or something, and then bridge them to the rest of the network. x86 machines are great for clustering. PPC is a little overpriced for the power it gives you in many standard configurations (especially compared to Athlon processors if you do floating-point-intensive stuff) unless you use Apple's AltiVec, where PPC tends to kick butt. As for SPARC, Alpha, MIPS, and anything I forgot to list here, you're on your own since I've never fooled with any of these. If you want to cluster, I'd stick to whatever the rest of the network uses so you only have to code for one processor.
  • Whether to choose SMP or a cluster all boils down to how you can parallelize your simulation. Heavy inter-node dependencies do not rule out either one.

    Three things are critical when parallelizing work:
    • Raw CPU power (per dollar) - here x86 is not too bad, and neither are Alphas. Sparcs are less powerful here (but much better in bus throughput).
    • Throughput - how much data has to be moved to/from the node for each iteration.
    • Latency - how long a single iteration runs on a node, and how long the node has to wait for an answer from the controlling server.


    SMP machines usually have better throughput and latency than raw CPU power - whereas clusters have more CPU power and weaker throughput and even worse latency.
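
    One rough way to picture this trade-off is to model a single iteration as compute time, plus one fixed latency, plus the time to move the data. The sketch below is only an illustration: the additive model, the function name `iteration_time`, and every number in it are made-up placeholders, not measurements of any real SMP box or cluster.

```python
def iteration_time(compute_s, latency_s, bytes_moved, throughput_bytes_per_s):
    """Estimated wall-clock seconds for one iteration on one node.

    Simple additive model (an assumption): compute, then wait one
    fixed latency, then move the data at the given throughput.
    """
    return compute_s + latency_s + bytes_moved / throughput_bytes_per_s


# Hypothetical placeholder figures, chosen only to show the shape of the
# trade-off: the "cluster" node computes faster but communicates slower.
smp = iteration_time(compute_s=0.010, latency_s=1e-6,
                     bytes_moved=1_000_000, throughput_bytes_per_s=500e6)
cluster = iteration_time(compute_s=0.005, latency_s=100e-6,
                         bytes_moved=1_000_000, throughput_bytes_per_s=10e6)

print(f"SMP-like node:     {smp * 1000:.3f} ms per iteration")
print(f"cluster-like node: {cluster * 1000:.3f} ms per iteration")
```

    With your own measured numbers plugged in, the same arithmetic gives a first guess at which term dominates for your model.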

    The major advantage of a cluster is that you can re-use it in multiple configurations (tree, (hyper)cube, pipeline, flat neighbourhood) - whichever meets your needs best.

    As for cheap clustering: maybe you have enough clients/workstations that can be (mis)used as a cluster during nighttime (and NICEd down during working hours) for a first test run?

    It all boils down to how well you can parallelize your simulation. If you cannot, multiple processors - in either configuration - won't help you.
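
    The point about parallelization is usually put in numbers with Amdahl's law, which the comment above does not name but which describes the same limit: if only a fraction p of the run time can be spread over N processors, the speedup is at most 1 / ((1 - p) + p / N). A minimal sketch, with the function name and the sample fractions chosen purely for illustration:

```python
def amdahl_speedup(parallel_fraction, processors):
    """Upper bound on speedup from Amdahl's law.

    parallel_fraction: share of the run time that can be parallelized (0..1).
    processors: number of CPUs or nodes working on the parallel part.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


# Illustrative fractions only; measure your own code to get a real value.
for fraction in (0.0, 0.5, 0.95):
    for cpus in (2, 8, 64):
        print(f"parallel={fraction:.2f} cpus={cpus:2d} "
              f"speedup<={amdahl_speedup(fraction, cpus):.2f}")
```

    A fraction of 0.0 gives a bound of 1.0 no matter how many CPUs are added, which is the "won't help you" case.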
  • I have two smp boxes, both are dual PII/III's.
    So far I've tried Linux, FreeBSD, BeOS, and Win2k on them. I usually use Linux, I like it, and it runs well, for the most part, under SMP. There are a few issues though with Linux SMP. For starters, before the 2.4 kernel FreeBSD was much faster. I'm not sure if the 2.4 kernel is faster than FreeBSD, I haven't run numbers yet.

    From my experience I'd say, if you're going to use Linux, make sure the drivers for your hardware all work under SMP. I've had some SCSI cards, and some UltraDMA 66 IDE cards, whose drivers are terribly unstable under SMP. (You might have to read the code for the drivers and look at the comments.)

    Of course EVERYTHING I've done with SMP has been on Intel hardware, and while I'm sure you can pick up 200K SMP Xeon boxes, they're probably not the most bang for the buck. If you really want to run Linux SMP on a machine of this size, you'll probably have to go Sun, SGI or Alpha. Linux is pretty good on all three platforms there, but I'd stay away from RedHat on anything but Intel hardware (ok, me personally, even on Intel).

    So maybe the steps you should take will be,
    1 Decide if you need 32 or 64 bit machines.
    2 Pick up a dual proc workstation with similar processors, to develop and test on. Let's say you think you want an E450, so pick up a used dual SPARCstation from one of the many used dealers that cater to ISPs.
    3 Make SURE all the drivers and code work well on it.
    4 Buy the HUGE machine.
  • While I love Linux a lot (cf. my sig), and have a quite stable SMP machine currently running 2.4.0 at home, I do have some reservations about recommending it for the kind of setup under discussion here. Maybe I'm just overconservative, but a premature move to Linux would close a lot of doors. Keep in mind that in a professional environment everything counts: it doesn't matter if the Linux machine gives very good performance (especially for the buck), if the sysadmins end up having to deal with some obscure NFS problem (it's just an example, OK?) every other day.

    On the other hand, I use HP-UX machinery at work that is very similar to what is called for here, and can assure you that:

    • Except maybe for really floating point dominated work, nowadays PC processors are quite a bit faster than current HP processors when compared on a 1-1 basis.
    • Not all industrial strength unix SMP machines scale very well. The expensive ones do, but not all of them.
    • One of our HP-UX machines goes down every so many days due to SMP locking/racing problems. It's been doing that for months now, and none of many OS patches that we have installed has solved it, despite claiming to address problems that look exactly like what we're seeing. Seems to be a rather buggy OS (at least on that specific type of hardware, our other HP SMP machines don't have this problem), industrial strength or not.

    --

  • by Anonymous Coward
    I think it was in the December Linux Magazine in the UK. But Compaq have built a NUMA system containing 4 cubes of 8 733MHz Alpha CPUs. I believe that they have got Linux running on this beast with a LOT (64 or 256 GB) of RAM... OK, it isn't SMP, but some companies are making huge headway into the kernel, and with 2.4 SMP is a dreamy thing. Sorry for being anon, I am using a mate's AOL account.
  • Get an E450 (or two). These are wonderful boxes, and Linux is well supported. I have to admit I would personally run Solaris in nearly any conceivable situation on this hardware though...
  • by MemRaven (39601) <kirk AT kirkwylie DOT com> on Wednesday January 10, 2001 @07:48AM (#517333)
    And ignore Linux. This seems to be a free-from-zealots conversation, so let me point out that for $200k, you're buying a LOT of CPU. Linux won't scale to the levels of CPU that you're getting. The use of signal-level threading and other kernel details limits linux on a really big SMP box.

    SGI's systems are really designed for this. The NUMA architecture is a way to mix what you're doing (i.e. lots of cpus, lots of memory) with the ability to make some things "closer" to others. If you're able to think of your app at least slightly NUMA like, it works well.

    Otherwise, let me recommend the Alpha boxes. Tru64 UNIX is a phenomenal operating system, and it scales beyond your imagining. It really is that good.

    The other thing to think about is what kind of CPU usage you are doing. Assuming that you're using floating point computation, you need to immediately discount the Intel architecture. It's STILL hobbled by the terrible FPU that it's had for years, which is why the Athlon kicks it around on this stuff. Alpha and MIPS have the best FPUs implemented (SPARC is okay, but definitely not as good as the Alpha).

  • What does your app need? Here are the questions I'd look at first, and a few thoughts.

    First, is it FPU or integer intensive? If integer, Intel can make sense. If FPU, can you use SIMD (MMX, 3DNow!, AltiVec, etc)? If you can, Intel can still work, but if not, RISC is the only choice.

    Second, how much memory bus does it really need? OK, a LAN will not cut it, but could you live with everything (CPU-CPU or CPU-memory) going over one 100MHz 64-bit bus (OK, I think the 8-way boxes have two buses, 4 CPUs on each)? If so, an 8-way Intel box may be the right choice. Test to see if you need 2MB cache; if everything is good with 1MB or 512k, you can save some money. I don't know how the RISC boxes are with regards to bus, but they are better by a huge margin than Intel. If you are going to wait a bit, the AMD 8-way boxes may be available; you'd get stunning integer performance, OK FPU, and good memory bus performance, since it is the same bus the Alpha uses.
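
    For a feel of what "one 100MHz 64-bit bus" means, here is the back-of-the-envelope arithmetic as a sketch. It assumes one transfer per clock and no protocol overhead, so these are theoretical peaks rather than anything a real box will sustain; the function name is just for illustration.

```python
def peak_bus_bandwidth(clock_hz, width_bits, transfers_per_clock=1):
    """Theoretical peak bytes per second for a simple shared parallel bus.

    Assumes every clock carries a full-width transfer (no arbitration or
    protocol overhead), which real buses never quite achieve.
    """
    return clock_hz * (width_bits // 8) * transfers_per_clock


bus = peak_bus_bandwidth(clock_hz=100_000_000, width_bits=64)

print(f"peak for one bus: {bus / 1e6:.0f} MB/s")
print(f"share per CPU, 8 CPUs on one bus: {bus / 8 / 1e6:.0f} MB/s")
print(f"share per CPU, 4 CPUs on each of two buses: {bus / 4 / 1e6:.0f} MB/s")
```

    If your model needs more than its per-CPU share of that peak, the shared bus is the bottleneck before the CPUs are.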

    Disk I/O, how much? Are you loading from disk and just chewing on it for a few days? If so, disk I/O doesn't much matter, but if it does, Linux may not be the best choice.

    I'm guessing that network io does not matter.

    For this kind of task, I bet that Linux will work just fine (splitting the multi-threaded or multiple-process app over multiple CPUs). I'm guessing that if a big Intel box is not suitable, a big Alpha box is the best choice.

    How much ram does the box need? If memory speed is the limiting factor, RISC usually has better memory designs.

    $200k is not much money, really. I'm guessing it will buy you a fully tricked out 8 way Xeon Intel box, or somewhat less RISC. Of course, if you get more $, you can add more CPUs to your box, but not true for intel.

    Hmmm, spend $100,000 on an intel box now, and $100,000 when you can get an 8 way AMD Athlon? Might be best of all worlds for your money.

    Better questions people!
  • Strongly consider Alpha for your SMP box. I have never used it but I hear it is better designed. What I'm going to say is only relevant for i86 platforms from here on in.

    Skip the Celeron SMP machines. You have the money, you can afford better SMPness. Look at the Intel Xeon CPUs, which are the best for SMP at the moment. You cannot use P4s in an SMP configuration but VERY SOON, you will be able to use AMD Athlons to provide SMP. These will likely be your best option, I would think, for i86 platforms. The AMD SMP bus seems better designed than Intel's, though perhaps AMD won't allow more than two CPUs at once?

    Make sure if you use Linux, you are using the latest stable 2.4 kernel. The difference on an SMP machine compared to a 2.2 kernel is astounding.

    You are going to have some hard questions to answer about memory. DDR SDRAM is almost always the clear winner over RDRAM but perhaps things work differently in an SMP environment, I'm not sure. RDRAM, of course, is Intel's favourite while DDR SDRAM will be what Athlon SMP machines can use.

    Anyway, I hope this helps. Remember, though, that you have the money to spend. Do some more research. Alpha architecture may well be the far superior platform for you, though it will cost lots more money.

  • The most significant and compelling reason I can think of to not go with x86 architecture is that every PC MP setup besides AMD 760 MP (which isn't out yet, and will only support two processors to begin with anyway) has a truly pathetic bus architecture. If you're doing something where bandwidth makes a difference, which you would seem to be if you're getting a single SMP box rather than contemplating clustering, the bus bandwidth is going to figure in, too.

    As you probably know, in an Intel-based solution, all CPUs sit on the same bus to the so-called "memory masters [anandtech.com]":

    A memory master is anything that accesses the main system memory. Your CPU, your AGP card, and your PCI devices (actually the PCI bus itself) are all memory masters.

    Meanwhile, the AMD CPU bus architecture is more closely based on the Alpha EV6 bus, and each CPU has its own bus to sit upon.

    Since you can't go with an AMD 760MP solution, you should get a real live unix box: something slick from Compaq/DEC, Sun, IBM, or even HP (though I don't recommend it). If you're looking for raw power and the ability to handle large numbers quickly, Alpha is probably your daddy.

    Side note: Apologies to Busta for the title of my comment. I couldn't resist.

  • A couple of things:

    • What sorts of applications are you running? I've seen a MUCH bigger difference for multithreaded applications (on the order of 25% or more)
    • Athlons would be nice if there was a chipset which supported more than 2 of them (or even if THAT was available). Considering that we're talking specifically about SMP, Athlons are irrelevant.
  • For that kind of money, you can afford to get a big box and an industrial strength unix like HP-UX, Solaris, or AIX. Linux is great if you don't have a lot of money and aren't going to need lots of processors and lots of RAM, but I don't think it's a good fit for what you're talking about.
  • Have you looked at the price of Alphas lately? A decent 4 CPU Compaq Alpha is about all you can buy for $200K with RAID. 8 or 16 CPU Alphas go for well over $700K.
  • To avoid the troll calling: Linux and *BSD are good on x86 but in SMP they don't compare to Alpha's official OS.

    On my measurements they do compare. It's about a 5% hit for Linux vs. Tru64, and another 5% hit for the Compaq Tru64 compiler being better integrated into the OS than the Compaq Linux/Alpha compilers. So that's about 10% slower overall, which isn't bad at all... It's true that gcc/egcs and its math library are pretty lame on Alpha compared to Compaq compilers, but the Compaq compilers are free-ish so it's not a problem.

    This is running MPI codes on 4-way Alpha/ev67/667 es40's with 8G mem and 2.2.16 Linux - so 2.4 Linux would be better - leaving only about a 5% difference between Linux and Tru64. Most of that is in page colouring (people guess) which Linux doesn't support.
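
    The "5% plus 5% is about 10%" estimate can be checked with a couple of lines. This sketch assumes the two hits are independent and compound multiplicatively, which is my assumption rather than something stated in the measurements; the function name is illustrative.

```python
def combined_slowdown(*hits):
    """Combine fractional performance hits multiplicatively.

    Each hit is the fraction of throughput lost, e.g. 0.05 for 5%.
    Returns the overall fraction of throughput lost, assuming the
    hits are independent of each other.
    """
    remaining = 1.0
    for hit in hits:
        remaining *= 1.0 - hit
    return 1.0 - remaining


overall = combined_slowdown(0.05, 0.05)
print(f"OS hit + compiler hit combined: {overall:.2%}")
```

    Compounded this way the total comes out slightly under the simple sum, so "about 10%" is a fair round figure.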

    The big thing that the Linux/Alpha version of the Compaq compilers doesn't support is OpenMP style SMP programming - so you'd have to do this by hand with threads or MPI or buy a different compiler.

    My advice is to go with SMP Athlons as these are pretty much a match for Alpha in Floating Point these days - primarily 'cos Alpha MHz's haven't grown much for years. Athlons are also way cheaper so you can buy many more of them :-)

  • More interesting, though, is that this doesn't hold up past 4-way boxen.

    I know that the REALLY big Unisys/Compaq boxen (32-way P-III Xeons) use what's known as CMP, and I think that it's the basis in a scaled down form for most 8-way intel boxen. The acronym is for Cellular Multi-Processing, and the basic idea is that you have NUMA without calling it NUMA. Each "cell" is some number of processors (up to 4), some number of DIMMs, and a connector to a crossbar which connects everything.

    I know that this is a very common architecture for cheaper SMP type boxes. Crossbar complexity grows much faster than the number of things attached to it (roughly with the square), so you take some buses and do a crossbar between THOSE, rather than just a huge crossbar. And it works as long as you're not able to max out any particular bus.
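
    To see why grouping things onto buses first keeps the crossbar small, here is a sketch that counts crosspoints. It assumes the usual textbook figure of one crosspoint per pair of ports (N x N) and cells of up to 4, as described above; the function names are hypothetical and no real CMP or SGI design is being modelled.

```python
def full_crossbar_crosspoints(ports):
    """Crosspoints in one big ports-by-ports crossbar (textbook count)."""
    return ports * ports


def celled_crosspoints(ports, cell_size=4):
    """Crosspoints when ports are first grouped onto shared buses
    ("cells") of cell_size, and only the cells are joined by a crossbar."""
    cells = -(-ports // cell_size)  # ceiling division
    return cells * cells


for n in (4, 8, 16, 32):
    print(f"{n:2d} ports: full crossbar {full_crossbar_crosspoints(n):4d}, "
          f"cells of 4 {celled_crosspoints(n):3d}")
```

    The saving comes at the price named above: everything inside a cell shares one bus, so it only works while no single bus is saturated.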

    The core difference between something like CMP and something like the SGI NUMA boxen is that the SGI boxen have crossbars within each unit. So if you have a processing unit of 4 CPUs and 4 DIMMs, there's a crossbar between each of those, and a whole separate crossbar connecting each processing unit.

    Connecting more than 4 of anything on a bus is worthless. I can max out the bus on a dual-P-III pretty easily. Give me two more and you'll start to see declining performance as everything waits for memory to be copied.

  • Forget about Linux bias, what about UNIX bias?
    Of course Alpha's official OS is OpenVMS! VMS Clustering puts all others to shame IMHO.

    Has anyone else heard of Compaq's "Galaxy" technology? This must be the best-kept secret around. Basically a Galaxy is a set of OS instances running on a single SMP box, with resources able to be redeployed from one instance to another _on_the_fly_!
    Sysadmin: "hmm, that database server is chugging a bit, might just steal a CPU from the webserver for a bit"...*clickety-click*..."that's better."

    /Perthling
  • While Linux SMP has made leaps and bounds recently, it is still a newcomer compared to the established UNIX vendors when it comes to big SMP boxen. Linux will eventually do this too, but for now the established vendors are the safest bet.

    Of course, I'd be very keen to boot 2.4 on our 16 CPU machine and see how well it performs in comparison. :)

    Xix.
  • What sorts of applications are you running? I've seen a MUCH bigger difference for multithreaded applications (on the order of 25% or more)

    I'm running C + MPI scientific codes. The MPI (lam mostly) calls map through to shared mem operations. My codes aren't super-heavy on the communications though - maybe 5% or 10% of the time is in comms in a mix of many small operations and some big bandwidth limited ops - so yeah - it's possible that an app which used GB/s of bandwidth between CPUs would hit more limitations in the Linux SMP implementation compared to Tru64.

    As always it's a good idea to try your specific app and different OS's before buying the machine.

  • If you're going to end up doing a lot of number crunching, you should really look into some form of multiple-CPU Alpha. The Alphas absolutely fly when it comes to floating point calculations. By the sounds of it, you would almost definitely benefit from having some form of Alpha solution. You will be better served by Compaq's Tru64 UNIX than by Linux; I'm pretty sure that any Alpha with 16+ CPUs would definitely laugh at Linux.

    .brad


