How Many Boxes In A Decent Beowulf Cluster?

Posted by Cliff
from the how-many-MS-managers-does-it-take-to-screw-in-a-lightbulb dept.
Rick the Red asks: "I'd like to build a mini-supercomputer for our High School (and perhaps another for the Junior High). Given that cost is an issue, but given that effective, hands-on demonstration of the principles of parallel processing and clustered computing is the primary goal, how many nodes would it take to make a reasonable demonstration Beowulf cluster? Two? Eight? Sixteen? Has anyone else here done this, and if so are there resources (lesson plans would be nice) on the net for teachers? Also, has anyone networked a room full of computers such that one minute they're individual PCs available for the usual classroom stuff and the next minute they're a Beowulf cluster? If so, how did you do it? And for the final, far out question: Has anyone ever used VMware to create a Beowulf simulator, clustering virtual PCs running on one physical box?"
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward
    What?! VMWare for Beowulf testing?! Are you insane?! You'd need quite a bit of resources just to run three virtual machines, let alone an entire cluster! And even then, it's just a single box - which will most likely be swapping like mad.
  • Let's not forget the cost of shipping or local tax...
  • /.
    The inestimable Donald Becker & droogies have formed a company called SCYLD (if you don't get it, all I can say is "we whupped Grendel, we whupped Grendel's mom, how big a deal can one dragon be?") to produce just what you are asking for.
    Use any search engine to find SCYLD BEOWULF, or just click on over to Scyld.Com... I bought six CDs for 3 bucks apiece.
    It's nice to see an Open Source company that sells software for fair value (as opposed to overpricing OSS CDs in an attempt to recover development costs overnight, which generally results in piss-poor sales, right Red Hat?) so I bought more than I needed. I haven't regretted it.
    --Charlie
  • Take a look at http://www.phy.duke.edu/brahma/ which has lots of good information about Beowulf in general and hybrid COW/Beowulf specifically.
  • Scyld "made" Beowulf 2, i.e. the next-generation Beowulf. There are places that sell the bootable CD, but they charge an arm and a leg for shipping.

    Email me at loraksussr@hotmail.com and I will make an ISO of it and put it up (as long as you keep me posted about the project; I want to do the same thing eventually).

    You put the CD in and hit the space bar if the machine is to be the "host"; if not, it boots into "slave" mode.

    Yay.

    I have a shotgun, a shovel and 30 acres behind the barn.

  • I thought I remembered a story like this not too long ago. Here's the link [newsforge.com].
  • What do you call a Beowulf cluster of Beowulf clusters anyway? I'm sure there's some whiz-bang, ain't-it-cool catchy name.
    "Me Ted"
  • Use MOSIX [mosix.org]. It is a single Linux kernel patch (for 2.2.18 or 2.2.17) which allows clustering of normal applications. Beowulf apps have to be specially written to take advantage of the cluster; with MOSIX, any process or thread can be scheduled on another cluster member.

  • This is intended as (bad) humor and not as infringement on those fine assholes at MasterCard. "...Here's what my system consisted of: $72 Motherboard (Intel D810EMO SB 128 Sound, AGP) $42 CPU (Celeron 500) $18 RAM (64MB SDRAM DIMM) $100 HDD (IBM 40GB) $34 FlexATX Case $9 CD-ROM (12x Generic)..." A truly free Linux-based OS: Priceless.
  • Take a look at the Linux Terminal Server Project (http://ltsp.sourceforge.net/ [sourceforge.net]). I've created bootable CDs that go over the Ethernet to boot Linux. Shouldn't be too hard to adapt this so each node that gets booted becomes part of the cluster.
  • Using the existing machines (assuming they're running Linux or another sufficiently useful operating system) is probably the best use of resources. If you're going to be putting big burdens on the network, it's probably worth putting a bridge between you and the rest of the school, just so you're not all on the same collision domain. Ethernet-based Beowulf clusters are much happier with switching hubs in any event, especially as the number of nodes increases. You could get a couple of 16-port switches for less than a single node would cost, and the system would perform much better without burdening the rest of the network.

    If you wanted to build an actual dedicated cluster, I've been looking at hardware lately, and one could build a 4-node cluster for around $2150. Feel free to cut lots of corners on HD (you don't need much space), video card (cheapest you can find, try at ham swaps), don't bother with monitors, don't even bother with CD-ROM (you can install over the network from a boot floppy). You certainly don't need sound cards. You can skip mice completely, and you only need keyboards for booting (depending on your BIOS). A KVM switch (and one monitor) may make it easier to bring all of them up and troubleshoot. The single monitor may make it easy for the students to see the action, but you may just prefer to do remote display. If you're going to spend more than that, look into low-end SMP hardware. They're less than twice the price of single-CPU nodes ($3200 for a 4-node dual-CPU cluster), and you can more than double your performance (depending on the application), since the network is typically the biggest bottleneck. Then again, this may go against the lesson of "not needing special hardware".
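
    To put those two price points side by side, here is a minimal Python sketch of the per-CPU arithmetic. The function name is made up for illustration, and the only inputs are the $2150 and $3200 estimates quoted above. It says nothing about real-world performance, which depends on the application and on the network.

```python
def cost_per_cpu(total_cost, nodes, cpus_per_node):
    """Return the cluster price divided by the total number of CPUs."""
    return total_cost / (nodes * cpus_per_node)


# Estimates quoted in the comment above (2001 prices, in US dollars).
single = cost_per_cpu(2150, nodes=4, cpus_per_node=1)
dual = cost_per_cpu(3200, nodes=4, cpus_per_node=2)

print(f"single-CPU nodes: ${single:.2f} per CPU")
print(f"dual-CPU nodes:   ${dual:.2f} per CPU")
```

    The dual-CPU build costs more in total but less per processor, which is the trade-off described above.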

    Especially for Jr. High students, it's probably necessary to run something more interesting than heat transfer and fluid dynamics simulations. I'd recommend checking out PVMPOV [www.luga.de]. Then again, I may be biased. ;-)

    If the point is to demonstrate the usefulness of clusters to students, you'd be cutting yourself off at the knees by using VMWare, since I don't think your performance would improve by adding more virtual machines (unless you have them each set up to use a small fraction of the resources of the host machine). It may be useful as a development environment, but only so long as you're checking correctness. For profiling and general study of performance characteristics, you need the real thing.

  • http://beowulf.alignment.net/ [alignment.net]

    Me and a few friends built a Beowulf out of 10 486/66 and P100 machines the school had laying around. There were more than that, but they all had some broken part...we managed to build these 13 machines out of the pieces, though.

    We tested the speed with PVMPov. The scene took 12:20 to render on my K6-2/400. The cluster took just over 7 minutes.
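
    A quick way to read those two timings is as a ratio. The Python sketch below assumes "just over 7 minutes" means roughly 7:00, so the real figure is slightly lower. Since the K6-2/400 was not one of the cluster nodes, this is only a wall-clock comparison, not a parallel-efficiency measurement.

```python
def to_seconds(minutes, seconds=0):
    """Convert a minutes:seconds timing to seconds."""
    return minutes * 60 + seconds


single_box = to_seconds(12, 20)  # 12:20 on the K6-2/400
cluster = to_seconds(7)          # assumption: "just over 7 minutes" taken as 7:00

ratio = single_box / cluster
print(f"the cluster finished about {ratio:.2f}x faster")
```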

    Full details are on the page, reply here or email me if you want more info =)

    --

  • imagine a beo.....Ohh. Err. nevermind.

    --
    Spelling by m-w.com [m-w.com].

  • Damn, you beat me to it:)

    As for the number of boxes, I'd say as many as money will allow, but the thing is, unless I'm mistaken, it's hard to judge without knowing the relative specs. I mean a cluster of say P300's will be far superior to a cluster of 486/33's... Of course, given the price old computers go for these days you could probably easily sling together some cheap boxes and build quite a big cluster.

    Good luck.

    ---

  • The idea of dual-purpose machines is a good one. Using PCs both as workstations and as nodes of a cluster not only shows that clustering doesn't really require any special hardware, it also makes it easier to justify the money for somewhat decent machines. There's no real difference between clustering dedicated nodes and making a "Cluster of Workstations"; instead of headless boxes on a shelf, you're just using complete PCs (monitors, keyboards, etc.). It's easy enough to set the PCs up for dual-booting, say between Linux and some other (*cough*) operating system, and since recent versions of Red Hat (and others) now support clustering right out of the box, most of the work has already been done for you. As for how many boxes you need, you can prove the concept with as few as two or three; adding more later on is no problem at all. You will want good networking between the machines: definitely 100Mbit switched or better. Point your favorite search engine at the phrase "Cluster of Workstations" and you're bound to find plenty of examples. Good luck, and don't forget to post your results somewhere!
  • I have been researching this possibility for my school as well. So far, the best solution I see for quickly setting up a Beowulf cluster in a high school computer lab, where the computers must still be usable under Windows 98, would be to create a class set of bootable Beowulf CDs.

    Is there a project already in the works for a CD-bootable Beowulf distribution? Ideally one that can read FAT32 or, even better, NTFS. Pop in the CD, reboot, and bam, you have a Beowulf node, using a folder on the Windows partition to store data.

    If anyone knows of something like this that is available, please give me a holler. It would be a great help!

    -Windchill2001

    The One, The Only, The Cold...
  • by martyb (196687) on Saturday March 10, 2001 @01:42PM (#372558)

    Others have posted good ideas on the HOW of setting up a Beowulf cluster. I'll leave that to those who are better qualified to comment on that facet.

    What I would advocate is coming up with some task(s) that are interesting to the students and that would benefit from a cluster. I can think of none better than cryptography.

    Start off with the students solving simple single-letter substitutions, manually. Tedious, time-consuming, boring.

    Now make it a competition. Who can solve a cryptogram the quickest? Have some token award for the winner. (Consider each student competing individually, or having the class split up into teams.) I kind of like the team concept... I can just imagine a bunch of teens getting excited and screaming out ideas and answers as they go!

    Maybe have each team come up with an encoded message that the other team needs to try and break. Go through a series of, say, 5 or 7 "competitions". Now you've set up the real-world competition of what cryptography is all about.

    NOW, it's time to bring in computers and show how brute-force efforts can help. Basic concepts are to have a program that takes the name of a file containing the cyphertext and a key and spits out a result. For example:

    decrypt101 --cypher_file foo.txt --key ZABCDEFGHIJKLMNOPQRSTUVWXY | spell | wc -w
    decrypt101 --cypher_file foo.txt --key YZABCDEFGHIJKLMNOPQRSTUVWX | spell | wc -w

    Each command tries one key against the message and sends the candidate plain text to stdout. That output is piped through a spell checker, which lists the words it doesn't recognize, and then wc -w counts them, so the key that leaves the fewest misspelled words is the most likely one. Granted, this is not terribly great performance-wise, but it has the benefit of being understandable. (I hope!)
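
    For anyone who wants to try the idea without writing decrypt101 first, here is a minimal Python sketch of the same loop. Everything in it is an assumption made for illustration: the function names, the tiny word list standing in for spell, and the convention that the key gives the plaintext letter for each ciphertext letter A through Z. It only tries the 26 rotated alphabets and expects text without punctuation.

```python
import string

ALPHABET = string.ascii_uppercase

# Tiny stand-in for a real dictionary; `spell` would consult a full word list.
WORDS = {"MEET", "ME", "AFTER", "CLASS", "IN", "THE", "COMPUTER", "LAB"}


def decrypt(ciphertext, key):
    """Replace each ciphertext letter A..Z with the key letter at the same position."""
    table = str.maketrans(ALPHABET, key)
    return ciphertext.upper().translate(table)


def misspelled_count(text):
    """Count words that are not in WORDS, playing the role of `spell | wc -w`."""
    return sum(1 for word in text.split() if word not in WORDS)


def rotation_keys():
    """Yield the 26 rotated alphabets, e.g. ZABC...XY and then YZAB...WX."""
    for shift in range(26):
        yield ALPHABET[26 - shift:] + ALPHABET[:26 - shift]


def best_key(ciphertext):
    """Brute force: try every rotation and keep the key with the fewest misspellings."""
    return min(rotation_keys(), key=lambda k: misspelled_count(decrypt(ciphertext, k)))


message = "NFFU NF BGUFS DMBTT"
key = best_key(message)
print(key, "->", decrypt(message, key))
```

    Swapping WORDS for a real dictionary file and rotation_keys for a bigger key generator is where the run time starts to climb.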

    Then lead them to more and more complex cyphers. Shortly, they will discover that brute force on a single box is "not fast enough". Simple, use a faster, single box. (This is a simple way to justify getting a super speedy desktop for your own use each year. <grin>) Finally, you'll all reach a point where it is apparent that one box isn't enough... what if there were a way to get all of the boxes to work together? Hmmmm? Bingo! THIS is the time to introduce clusters!
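
    The "get all of the boxes to work together" step comes down to giving each box its own slice of the keys and keeping the best answer that comes back. Here is a rough Python sketch of that bookkeeping. The names are hypothetical, keys are represented as plain integers, and the chunks are processed one after another in a single process as a stand-in for real nodes; an actual cluster would ship each chunk to a different machine with something like PVM or MPI.

```python
def split_range(total_keys, nodes):
    """Split key indices 0..total_keys-1 into one contiguous (start, stop) chunk per node."""
    base, extra = divmod(total_keys, nodes)
    chunks, start = [], 0
    for node in range(nodes):
        size = base + (1 if node < extra else 0)
        chunks.append((start, start + size))
        start += size
    return chunks


def search_chunk(start, stop, score):
    """What one node would do: try every key in its chunk, return (best_score, best_key)."""
    return min((score(key), key) for key in range(start, stop))


def cluster_search(total_keys, nodes, score):
    """Hand each node a chunk, then keep the best result reported back."""
    # Run sequentially here; on a real cluster each call would run on its own machine.
    results = [
        search_chunk(start, stop, score)
        for start, stop in split_range(total_keys, nodes)
        if start < stop  # skip empty chunks when there are more nodes than keys
    ]
    return min(results)


# Toy scoring function (hypothetical): lower is better, and key 17 is the "right" one.
print(split_range(26, 4))
print(cluster_search(26, 4, lambda key: abs(key - 17)))
```

    Because no chunk depends on any other, doubling the number of boxes roughly halves the keys each one has to try, which is why brute-force key search makes such a clean first cluster demo.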

    Sure, it's going to take some work on your part, but I can think of no better way to facilitate the introduction of so many valuable real-world concepts at the same time:

    • Statistics - On "average" how long did it take to decode the XYZ cypher?
    • Security - No longer just a vague concept, they'd get to see what governments do to keep sensitive information secure and hidden from enemies. And what individuals would have to do to protect themselves, too.
    • Teamwork - Can be FUN if you are excited about what you are doing!
    • Computers - Practical insights into what they can do easily, and what is difficult.

    Okay, I could go on, but I've got errands to run. I hope this gives you some ideas, and I certainly wish you the best on this project. But, and maybe most importantly, HAVE FUN! Nothing hooked me on computing more than seeing others ENJOYING what they were doing with computers! (And just think of what YOU could do with a supercomputer available at your fingertips! ;)

  • by ClubPetey (324486) on Saturday March 10, 2001 @01:55AM (#372559)
    I'm still on the quest for the perfect MP3 player. One of my attempts was a "cheap" computer running win98 (connected to the TV) with a special shell. While I canned the idea, I discovered in the process that you can build computers VERY cheaply just out of "normal" parts.

    Here's what my system consisted of:
    $72 Motherboard (Intel D810EMO SB 128 Sound, AGP)
    $42 CPU (Celeron 500)
    $18 RAM (64MB SDRAM DIMM)
    $100 HDD (IBM 40GB)
    $34 FlexATX Case
    $9 CD-ROM (12x Generic)
    $275 Total for system

    Ok, the HDD is excessive for you, and this system doesn't have a monitor, but as you can see you can build a computer VERY CHEAPLY through normal mail order.

    As for how many, I'm not sure of your budget, but the IDEAL way to demonstrate this would be one for each student. That way you could have a particularly complex task run by all students individually, and then again as a cluster. If you are not set on Beowulf specifically, I have some software I wrote for Win98 that does clustering (in the distributed.net sense). It was made as a demo for one of my previous jobs, but you're welcome to use it for educational purposes. One nice thing: the source code shows that cluster coding is not that much more complicated.


    --
    He had come like a thief in the night,
