Microsoft

How Well Does Windows Cluster? 665

cascadefx asks: "I work for a mid-sized mid-western university. One of our departments has started up a small Beowulf cluster research project that it hopes to grow over time. At the moment, the thing is incredibly weak... but it is running on old hardware and is basically used for dog and pony shows to get more funding and hopefully donations of higher-end systems. It runs Linux and works; it's just nothing to write home about. Here's the problem: my understanding is that an MS rep asked what it would take to get them to switch to a Microsoft cluster. Is this possible? Are there MS clusters that do what Beowulf clusters are capable of? I thought MS clusters were for load balancing, not computation... which is the hoped-for goal of this project. Can the Slashdot crowd offer some advice? If there are MS clusters, comparisons of the capabilities would be welcome." One only has to go as far as Microsoft's site to see its current attempt at clustering, but what is the real story? Have any of you had a chance to pit a Linux Beowulf cluster against one from Microsoft? How did they compare?
  • Licensing (Score:3, Informative)

    by CodeMonky ( 10675 ) on Thursday February 21, 2002 @01:34PM (#3045600) Homepage
    Licensing would seem to be the first thing that comes to mind.
    Software costs for 100 Linux machines are close to nil.
    Software costs for 100 Windows machines probably won't be.
    Granted, I haven't read the licensing on the MS Clustering link, but if it's like anything else, you'll need a license of some kind for every machine.

  • Take a look at (Score:5, Informative)

    by wiredog ( 43288 ) on Thursday February 21, 2002 @01:36PM (#3045627) Journal
    Windows Clusters [windowsclusters.org].
  • Re:Licensing (Score:5, Informative)

    by CodeMonky ( 10675 ) on Thursday February 21, 2002 @01:37PM (#3045635) Homepage
    Followup:
    From reading the MS site, it looks like licensing is based on the EULA of the software being used, so if you are using win2kpro you have to have a copy of win2kpro for each machine, etc.
  • by GreyPoopon ( 411036 ) <gpoopon@gmaOOOil.com minus threevowels> on Thursday February 21, 2002 @01:38PM (#3045641)
    The Windows clustering was purely load balancing.

    From Microsoft's site: "The Computational Clustering Technical Preview (CCTP) toolkit is used for creating and evaluating computational clusters built on the Windows® 2000 operating system."

    Obviously, they are now attempting to compete with projects like Beowulf. It's probably all part of the M$ aggressive stance on Linux (and other competitors). The real question is: has anybody downloaded this kit and played with it? It's just a technology preview, so how mature is it compared to Beowulf and other clustering technologies?

  • Here's the deal: (Score:5, Informative)

    by Null_Packet ( 15946 ) <nullpacket@doscher. n e t> on Thursday February 21, 2002 @01:39PM (#3045652)
    MSCS (Microsoft Cluster Service) is designed for load balancing and fault tolerance, whereas Beowulf clusters (AFAIK) are more about distributing processing load for performance gains (massive threading). MSCS works quite well, especially on Fibre Channel and brand-name hardware such as Dells and Compaqs.

    Simply put, it works well (though cost is often an issue, given the price of enterprise hardware), but it is not the same clustering you see with the Unices. E-mail me at my account if you have more specific questions.

    My intent is not to start or participate in a flame war, but the term "clustering" simply implies different things on different OSes.
  • by merlin_jim ( 302773 ) <.James.McCracken. .at. .stratapult.com.> on Thursday February 21, 2002 @01:42PM (#3045685)
    Hello,

    We run a MS cluster here. VERY big app... so big, I am loathe to name figures, because that would identify to MS just who is talking here...

    But, we use MS clustering for our web app. Our setup is that we have a database server with 4 procs, and a growing array of web servers with 1 proc each, all of which use disk space on a SAN. W2K clustering manages the load balancing as well as allocating disk space out of the SAN to virtual partitions as needed. The original poster is correct; MS clustering is for load balancing, not computation. I have seen many times that Microsoft sales reps don't have a clue about what they're trying to sell; they're just told from on high to replace Linux with Microsoft wherever they can. I think this is clearly a case of that.

    My advice? Ask the sales rep to demonstrate how MS clustering will solve a common comp-sci problem with more MIPS than each box alone has. Point out that you're not running a web server or any such service on these boxes, but that they're for raw computation. Even better, see if he'll let you talk to a technician about how W2K clustering can meet your 'unique' (at least to MS) needs.

    Now, for everyone else... Don't get me wrong. W2K clustering is a great technology for building highly performant, highly reliable, highly scalable applications quickly and easily. But it scales in the direction of millions of users, not millions of computations.
  • by DeMorganLaw ( 543089 ) on Thursday February 21, 2002 @01:42PM (#3045689)
    Clustering for Windows requires Windows 2000 Advanced Server, and a great deal of patching. And with old hardware, you are out of luck trying to run Windows 2000 Advanced Server.


    Distributed computing for Windows has been around for a while though, Seti@home has been doing it for years.
  • Stability issues (Score:5, Informative)

    by The Panther! ( 448321 ) <panther&austin,rr,com> on Thursday February 21, 2002 @01:45PM (#3045726) Homepage
    At my last job, we had a COW (Cluster of Workstations) running all sorts of operating systems. Except Windows. Why? Because they won't run in a production environment for more than a few days without freezing or crashing, and the system administrators refused to babysit them. With Windows 2000, I've had my home machine run for upwards of 28 days without a reboot, but only if all the video drivers are stable and the machine is not doing too much at any given point (say, burning CDs while watching movies and keeping my net connection above 200k/s). But heaven help you if a driver freezes. There's no way to reset them. Your hardware will play into your decision as much as the operating system, I believe, due to stable driver support.

    In terms of performance, Windows kernels have pretty good latency compared to 2.2.x Linux kernels, so running a full-screen DOS app might give very good performance, but there's a lot of overhead munching into your RAM, which is likely to be an expensive premium on older hardware.

    Lastly, with Windows, I've never heard of doing channel bonding for ethernet (3 100TX cards ~= 1 gigabit), nor diskless booting that I know of. These can be really necessary for large clusters to keep maintenance down and performance up without buying higher end equipment.

  • First Hand Info (Score:4, Informative)

    by GeckoX ( 259575 ) on Thursday February 21, 2002 @01:51PM (#3045769)
    We researched MS Clustering very extensively. We're already an MS shop and even still it was cost prohibitive.

    Notes from experience:

    1) Clustering with Windows requires one of the following OS setups: Win2K Server WITH MS Application Center, OR Win2k Advanced Server. (Similarly with the XP platform)

    2) OS licenses therefore will run between $1000-2000 _per machine_!

    3) If you need Application Center, which you likely will, you're talking (if I remember correctly) about another $1000 per machine.

    4) Of course MS is just getting into this so don't expect it to be easy, well documented or stable.

    Finishing Notes:

    Obviously, Linux would be mucho cheaper

    Easiest, and still cheaper than MS would be the Plug-n-Play Mac solution!
  • by crimoid ( 27373 ) on Thursday February 21, 2002 @01:56PM (#3045823)
    Apparently you (and most everyone else) didn't take the time to even look at the link provided. Microsoft DOES have computational clustering, not just "traditional" clustering.

    MS Computational Clustering [microsoft.com]
  • by Oestergaard ( 3005 ) on Thursday February 21, 2002 @01:59PM (#3045851) Homepage

    For a computational cluster, the OS itself shouldn't really matter. What matters is, do you have the tools you need, and does the environment allow you to work with the cluster in a flexible way.

    For a typical computational cluster, what determines the performance is the quality of your application. Only if you pick an OS with some extremely poor basic functionality (like horribly slow networking) will the OS have an impact on performance.

    People optimize how their application is parallelized (eg. how well it scales to more nodes). The OS doesn't matter in this regard. They optimize how well the simple computational routines perform (like, optimizing an equation solver for the current CPU architecture) - again, the OS doesn't matter.

    So, in this light, you might as well run your cluster on Windows instead of Linux, or MacOS, or even DOS with a TCP/IP stack (if you don't need more than 640K ;)

    However, there's a lot more to cluster computing than just pressing "start". You need to look at how your software performs. You need to debug software on multiple nodes concurrently. You need to do all kinds of things that requires, that your environment and your tools will allow you to work on any node of the cluster, flexibly, as if that node was the box under your desk.

    And this is why people don't run MS clusters. Windows does not have proper tools for software development (*real* software development, like Fortran and C - VBScript hasn't really made its way into anything resembling high performance (and God forbid it ever should)).

    Furthermore, you cannot work with 10 windows boxes concurrently, like they were all sitting under your desk. Yes, I know terminal services exist, and they're nice if you're a system administrator, but they are *far* from being usable to run debuggers and tracing tools on a larger number of nodes, interactively and concurrently.

    Last but not least, there are no proper debugging and tracing tools for Windows. Yes, they have a debugger, and third-party vendors have debuggers too. But anyone who's been through the drill on Linux (using strace, wc -l /proc/[pid]/maps, ...), and needed the same flexibility on Windows, knows that there is a world of difference between what vendors can put in a GUI and what you can do when you have a system that was built for developers, by developers.

    So sure - for a dog & pony show, Windows will perform similarly to any other networked OS as far as computational clusters go. But for real-world use? No, you need tools to work.
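
    For concreteness, this is the kind of per-node code such a cluster actually runs - the thing you need to be able to compile, launch, and debug on every box at once. It is only a minimal sketch against the plain MPI C API, assuming an MPICH-style installation where you compile with mpicc and launch with mpirun (details vary by install):

        /* hello_nodes.c - minimal sketch: each MPI rank reports which node it
         * landed on. Assumes an MPI implementation such as MPICH; compile with
         * mpicc and launch with something like "mpirun -np 8 ./hello_nodes".
         */
        #include <stdio.h>
        #include <mpi.h>

        int main(int argc, char **argv)
        {
            int rank, size, namelen;
            char node[MPI_MAX_PROCESSOR_NAME];

            MPI_Init(&argc, &argv);                 /* start the MPI runtime */
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id */
            MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */
            MPI_Get_processor_name(node, &namelen); /* which box we are on */

            printf("rank %d of %d running on %s\n", rank, size, node);

            MPI_Finalize();
            return 0;
        }

    The same source builds on Linux or NT given an MPI implementation for each; the difference the parent comment is pointing at is everything around it - launching it on N nodes, watching it, and poking at it with strace-style tools when one rank misbehaves.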

  • channel bonding (Score:3, Informative)

    by No-op ( 19111 ) on Thursday February 21, 2002 @02:05PM (#3045894)
    pretty much all of the Intel server cards as well as several of the desktop cards support channel bonding. all compaq server NICs support this as well, and it works great.

    however, I would take issue with your assertion that 3 100mbit cards are roughly equal to a gigabit card. while it's true that something like 4 100mbit cards will give you close to the real performance of a gigabit card when used on a low end PC, there is much to be gained by using actual gigabit (use of giant frames, better latencies, etc.)

    if you're going to build a cluster, and you actually have a budget, you're going to buy decent yet cheap server boxes. these will most likely include 64bit PCI slots, and there lies your motivation for gigabit. the performance there is unparalleled when using a real wirespeed switch, without using faster technologies of a proprietary nature.

    my 2 cents.
  • Re:Point? (Score:2, Informative)

    by linzeal ( 197905 ) on Thursday February 21, 2002 @02:05PM (#3045895) Journal
    Are you only going to run the cluster for 30 days? Those are trial versions of the operating system, which is easily the most expensive part of the equation as you scale.
  • Re:Point? (Score:1, Informative)

    by Anonymous Coward on Thursday February 21, 2002 @02:07PM (#3045919)
    Did you notice the part of your post where you typed "evaluation version" and again "evaluation version" and later "Trial Version"????

    So unless you can finish your project in what - 30 days, 60 days??? then I guess this would be useless licensing.... Some projects actually take longer to finish one job than what your

    FREE LICENCES
    FREE DEVELOPMENT TOOLS

    last... SO STOP YELLING D#MB F#CK!
  • MS HPC vs. Beowulf (Score:2, Informative)

    by sjvn ( 11568 ) <(moc.1anv) (ta) (nvjs)> on Thursday February 21, 2002 @02:12PM (#3045971) Homepage
    Been there. Done that.

    MS clustering is for load balancing and stability. As such, it does a reasonable job. Beowulf is for high-end scientific computing and does a reasonable job.

    To really do clustering well, you don't want either. You want AIX or Solaris, but you probably can't afford them. But, Linux clustering in load balancing style is developing quickly with IBM, TurboLinux, VMWare and Intel doing many interesting things. Beowulf is still cheaper.

    In any case, though, for your situation there's only one solution, and that's Beowulf. Besides the matter of affording MS's licensing fees, you mentioned that you're running on older equipment. I sincerely doubt those servers could run W2K Server standalone, much less clustered.

    Steven

  • by Paladine97 ( 467512 ) on Thursday February 21, 2002 @02:14PM (#3045990) Homepage
    He doesn't mean MP3, he means MPEG video! Everybody knows it doesn't take long to encode MP3. It takes hours to compress videos. So clustering would be worthwhile, since you could have it done in a much shorter time.
  • not fun (Score:2, Informative)

    by teknogeek0 ( 264455 ) on Thursday February 21, 2002 @02:17PM (#3046009)
    Yeah, I was helping a grad student the other day setting up a Windows cluster.. it's not fun.. not terribly hard, but some of the requirements, and that stupid-ass HCL they have, make things a bitch. I would think it would be more likely used for load balancing than anything else; sure, you could use it for computational things, but the overhead Windows needs for the OS means you would get less CPU power than if the system were a Beowulf one.
  • by spongman ( 182339 ) on Thursday February 21, 2002 @02:19PM (#3046029)
    Microsoft has a few types of clustering:
    1. Failover clustering. This is an OS service that servers like SQL Server and Exchange plug into that allows Active/Passive or Active/Active clustering over a shared SCSI/Fibre bus. In theory you could write your app to use this service but I think it would be overkill.
    2. Network Load Balancing. This is just a software version of the standard kinds of NLB found in cisco boxes.
    3. Component Load Balancing. This is the most suitable. It's provided by Application Center and it allows you to deploy COM+ objects on a cluster of machines and have the calls distributed according to the load on those machines. You can control the threading and lifetime of the objects and view the status of the machines pretty easily using the Application Center MMC plugin (or SNMP, I believe). You'd have to wrap the computational part of your application into one or more COM objects. Once you've done that then you can create and call those objects in the cluster as if it were one machine - the clustering is transparent to the client application. I played around with AC a bit when it was in beta for a project that I was working on. We didn't go with it in the end because the design of our application ended up not requiring it (we just went with hardware load balancing), but it seemed like pretty cool technology - if you're into the whole COM thing. It has a really cool rolling deployment feature where you can redeploy your components (and/or IIS application if you have one) to your cluster incrementally while it's still running.
    Here are some links to docs on MS's site:

    Introducing Windows 2000 Clustering Technologies [microsoft.com]
    Application Center home page [microsoft.com]
    Component Load Balancing [microsoft.com]

  • by Anonymous Coward on Thursday February 21, 2002 @02:20PM (#3046033)

    Back when Beowulf started there were a few universities that had dual-boot systems. I think they ran Linux half the day, Windows the other. Windows NT and above have had implementations of MPI and PVM that have worked quite well for a while. MPICH and PVM are both freely available; the problem is that these systems do not integrate well with Visual Studio. You end up needing cygwin/gcc, which is basically Unix on Windows.


    Most clustering software today is either open source or based on open source and made for the Unix environment. So besides the obvious license and stability questions others have brought up Windows has limited tools and libraries.


    ALSO:

    • There are problems with remote administration, being tied to graphical interfaces for such simple nodes. Wasted efficiency by running the GUI.
    • Most computational scientists are used to the Unix platform not Windows.
    • Windows tools have a tendency to require upgrading of themselves, other tools, or the OS. Most Linux tools are pretty interoperable version to version.
    • Hardware such as myrinet works on both, but common Linux features like channel bonding are hard to do on Windows.
    • When you look at the big HPC systems all the way to the little ones you will see them all happily running Linux

    Finally, Microsoft has a VERY limited knowledge base for this application of Windows, and few HPC people know much about Windows either.


    As you can see from the above, Linux on HPC is basically able to take those same horrible excuses for running Microsoft on the desktop and shove them right back down their throats.

    Yes, Windows can do HPC, but why would you want to?

  • by stereoroid ( 234317 ) on Thursday February 21, 2002 @02:20PM (#3046035) Homepage Journal

    A few points:

    • It's only available with Advanced Server, which means extra cost.
    • Nearly all applications & services (daemons) will be running on one node at a time. If they are set up correctly under Cluster Administrator, they still run on one node at a time, except that they can fail over.
    • A Cluster Group is the unit that runs on one node at a time and fails over, so it will contain applications and the resources those applications need.
    • During a failover, resources in a cluster group are taken offline by order of dependency (unless the node crashed!), and brought back online also by dependency. So, if an application depends on a disk, the application goes offline before the disk, but the disk comes online before the application (logical).
    • Multiple groups run on multiple servers at any time, so if you spread them out, machines aren't sitting idle.

    You can set up any application or service to cluster & fail over if required, as long as:

    • It stores all its live working data on shared storage,
    • You correctly place it in a logical cluster group that includes the resources your app needs, and specify those dependencies (e.g. my app needs to use the disk and IP address in Cluster Group X, so it must be in Cluster Group X), and
    • You can specify what Registry keys (if any) need to migrate between nodes.

    Active/Active mode is more complicated, meaning instances of an application running on different nodes, all accessing the same data on disk. Only certain applications can do this successfully, e.g. Oracle, which does so by using a custom file system and effectively bypassing the Windows Cluster Service. Windows & most apps will normally throw a fit if there are clashing file requests from multiple nodes, since Windows caches file tables in memory and can thus lose track of the real situation on disk (bad news). I've seen it BSOD in such cases.

  • by merlin_jim ( 302773 ) <.James.McCracken. .at. .stratapult.com.> on Thursday February 21, 2002 @02:28PM (#3046092)
    I must now put on the traditional monkey hat of shame, for the naysayers are quite correct. There are TWO microsoft products called clustering. One is used by Windows 2000 Advanced Server to do load balancing, and is, in fact, split into two parts, the first called Clustering, the second Network Load Balancing... see this page [microsoft.com], which includes the statement "Both [of the Windows 2000 Advanced Server] Clustering technologies are backwards compatible with their Windows NT Server 4.0 predecessors". The other is High Performance Clustering (HPC), in its current form called Computational Clustering Technical Preview (CCTP), which I am certain has nothing to do with the previous Clustering technology... I doubt it was available for Windows NT 4.0, among other things (thus the Technical Preview status).

    Notes for any and all interested in this; it's a technical preview, which any other company would call a pre-Beta or an Alpha release. The only way anyone sane would use this in a production system would be as an Early Adoption Partner...
  • by Anonymous Coward on Thursday February 21, 2002 @02:30PM (#3046100)
    Tell the Microsoft sales rep that you are using Linux because that's where many of the advances in clustering technology are being developed [globus.org]. In fact, they recently switched from using Windows as the basis of their development to using Linux, and one of their primary sponsors is Microsoft. Since Linux is clearly Microsoft's first choice for a clustering platform, yours should be too. After all, no one ever got fired for doing what Microsoft told them to!
  • by jspaleta ( 136955 ) on Thursday February 21, 2002 @02:31PM (#3046108) Homepage
    About that Mac solution....
    Yellow Dog Linux sells a cute little piece of hardware designed for clustering around PPC. Very cute... maybe the best balance of cost-effective and easy in terms of clustering that I've seen.

    http://www.terrasoftsolutions.com/products/briQ/hpc.shtml

    -jef
  • by sgoggin ( 121149 ) on Thursday February 21, 2002 @02:33PM (#3046125)

    MS Application Center is not bad. It is web-focused, but with web services a lot of things can be web systems.

    Good Features.

    1. Easy to use GUI
    2. Definition and replication of application file, database and IIS settings.
    3. Collects problem and performance data from all the applications into a main console.

    Problems

    1. Focused on the Microsoft way of doing business; for example, easy ASP replication, harder JSP replication.
    2. A little buggy. Sometimes loses the internal replication password, there's no way of dumping the application setup, and the built-in failover technology does not handle more than a class C.
    3. Sort of expensive: $3,000 per CPU.
    4. Windows is still not as reliable as Linux/Unix systems.

    It has advantages over the Linux systems I have seen, with the GUI and the aggregation of performance data. The GUI is useful because you cannot delegate tasks to junior staff if they do not understand the system. I have 60 web servers, many running Linux and some W2K, and a few have been trying App Center for a while now; I need performance information to know when to add and hopefully remove machines. I have run mainframes, but there are fewer software problems with popular systems like Intel Linux and Windows machines, so the zSeries and big Suns I have used in the past are not the magic bullet.

    Seán

  • by kinkie ( 15482 ) on Thursday February 21, 2002 @02:33PM (#3046127) Homepage
    Sorry for the incomplete post. I'll continue here.

    I used to have some WLBS (Windows Load Balancing Services) systems (NT4's idea of load balancing cluster).
    They worked, more or less, most of the time (about 4 reboots/day on average I think). The problem was, the thing was IMPOSSIBLE to debug and troubleshoot, for the simple reason that it was impossible to know where the problem was. WLBS did terrible layer 2 trickery to route requests around, and as a result it didn't work well with anything more complex than a hub.
    Luckily it's now gone and not missed.

    Disclaimer: the opinions expressed here are of course my own and do not necessarily reflect those of any organization.
  • by BWJones ( 18351 ) on Thursday February 21, 2002 @02:38PM (#3046173) Homepage Journal
    Try looking at Pooch from Dean Dauger. http://www.daugerresearch.com/pooch/whatis.html

    This would allow you to use the Macs (OSX UNIXY goodness too) individually as personal workstations (for writing, graphics, computation, surfing the web) while at the same time using them in clusters for compute intensive work. This makes for a doubly productive machine and one that is much cheaper as more work can be accomplished with it than simply using it as a dedicated node.

    Mac clusters are easy-peasy to set up (even junior high students are doing it), as the one-page instructions and AppleScript-ability should indicate. They're also pretty damn fast, given the built-in Gigabit Ethernet on G4s and AltiVec (if taken advantage of, as in Apple's version of BLAST).

    Finally, the other item of interest: you can use any Mac you have, G3s and G4s of any model and speed, since you don't have to balance everything the way you do on typical clusters where all of your hardware has to be exactly alike. Your cluster can even include the iMac on the secretary's desk!
  • Re:No command line (Score:1, Informative)

    by Anonymous Coward on Thursday February 21, 2002 @02:49PM (#3046291)
    Another person who will criticise Windows without enough knowledge...

    Windows does have a powerful command line (no, not command.com); it's called Windows Script Host (WSH) and it runs practically any scripting language (though typically you stick with VBScript).

    In combination with WMI (which is Windows' implementation of WBEM), you can do far more than with ksh, and it is relatively easy to use (once you know what to do).

    Cheers.

  • by Brynath ( 522699 ) <Brynath@gmail.com> on Thursday February 21, 2002 @02:54PM (#3046332)
    Well, get IBM to hook you up with Linux; they don't have all those Linux commercials for nothing.
  • by djtack ( 545324 ) on Thursday February 21, 2002 @02:55PM (#3046352)
    My advice? Ask the sales rep to demonstrate how MS clustering will solve a common comp-sci problem

    This is a great idea. ScaLAPACK benchmarks are a popular choice. Also, think about what you are really getting for your money (license fees). I work with a modest Beowulf (~50 CPUs) using Linux, and I have no doubt that it would be technically possible to use Windows... but you would spend a lot of time installing kludgy ports of Unix tools: Cygnus wintools, PBS, rsh, Perl, etc. At the very least, the two most popular message-passing libraries (MPI and PVM) both rely on rsh.

    All the tools that make a Beowulf what it is are free software, there is really NO added value by running them on Windows.
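
    To make the "common comp-sci problem" suggestion concrete, here is a toy of the sort of computation such a demo exercises - not ScaLAPACK, just numerical integration of 4/(1+x^2) to estimate pi, split across however many nodes the MPI launcher hands out. A hedged sketch against the portable MPI C API; the interval count is arbitrary:

        /* pi_mpi.c - toy computational-cluster job: estimate pi by midpoint
         * integration, with the partial sums combined by MPI_Reduce.
         * Illustrative only; compile with mpicc, run with mpirun.
         */
        #include <stdio.h>
        #include <mpi.h>

        int main(int argc, char **argv)
        {
            const long n = 100000000L;        /* number of integration intervals */
            const double h = 1.0 / (double)n;
            double local = 0.0, pi = 0.0;
            int rank, size;
            long i;

            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &size);

            /* each rank handles every size-th interval */
            for (i = rank; i < n; i += size) {
                double x = h * ((double)i + 0.5);
                local += 4.0 / (1.0 + x * x);
            }
            local *= h;

            /* collect the partial sums on rank 0 */
            MPI_Reduce(&local, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

            if (rank == 0)
                printf("pi ~= %.12f using %d process(es)\n", pi, size);

            MPI_Finalize();
            return 0;
        }

    Whatever OS is underneath, the work is the loop, and MPI_Reduce is the only point where the cluster shows up; the free Unix-side tooling is what makes building, launching, and scheduling this across fifty boxes painless.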
  • MSCS (Score:2, Informative)

    by hutzut ( 560777 ) on Thursday February 21, 2002 @03:00PM (#3046391)
    In my previous company I had the dubious pleasure of setting up MSCS. It was a two-node active-passive cluster. The two nodes were identical and shared a fibre-channel disk array. Here are the specs:

    Quad 500 MHz Pentium Xeons

    1 GB memory

    6 disk array, three logical mirrored drives.

    MS Windows NT Server 4.0 Enterprise Edition

    MS SQL Server 7.0 Enterprise Edition.

    It should be noted that you must have two NT Server licenses and two SQL Server licenses. If you want to do an active-active cluster, it requires four licenses. The Enterprise Editions of these software packages were much more expensive than their standard counterparts. You cannot use standard editions for clustering.

    Installing cluster services was very easy. The Cluster Manager app was OK outside of the occasional hangs. (Although the manager app hung, the operations were completed, such as failover, failback.)

    In order to do active-active clustering you must have two shared storage devices; the active node will only be able to access the shared storage it "owns".

    SQL Server installed all right if you followed the MS White Paper exactly. I don't know why, but installation order was important; if you didn't follow it it didn't work.

    Applying service packs was extremely painful. The instructions were straightforward but did not work. MS provided us with a program that backed out the SP snafu, which worked somewhat. If it weren't for google we'd have been dead.

    MS support is useless IMO. No contracts just pay-by-incident. Have a credit card handy before you do any upgrades of any kind. You will most likely need it.

    As long as the cluster was just doing SQL server, it worked great. Failover was seamless. Given the proper hardware, Windows behaved well. Make sure that you only attempt this with certified hardware. Very important.

    Once we started adding third party reporting software things started to go bad. Adding it to the cluster services was remarkably easy. However, even though the servers had quad procs and a good amount of memory, simultaneous report requests ground the system to a halt. SQL Server behaved well, around 25% of CPU at most even in heavy load. The reports (JRE) would take up over 50% of the CPU in light load. Very bogus IMO.

    A lot of third-party apps do not support MS Clustering. Lots of tweaking is needed to get them to work.

    If I were to do it again, I think I would not have used MSCS, but instead have two distinct systems that had some kind of data replication software.

    This configuration is also limited to a two-node cluster. Although you can run an active-active cluster, the instances of SQL Server would be separate. The data storage areas cannot be shared between the two nodes.

    Although I prefer UNIX I try not to be an MS bigot. It does certain things well. I hope that clustering has improved with w2k.

  • by fitten ( 521191 ) on Thursday February 21, 2002 @03:26PM (#3046614)
    A company out there provides commercial MPI libraries for a variety of operating systems, Windows included

    http://www.mpi-softtech.com/

    A couple years back, I was one of the MPI writers/maintainers for a number of platforms (worked with MPICH). As far as performance went with real applications (as well as synthetic benchmarks), Windows and Linux on the same hardware were pretty much the same. Typically computational problems were faster on Windows though because of the better compiler support at the time. Communications performance benchmarks were interesting.

    Also, at the URL above you can find cluster management software (batch scheduling and stuff).
  • by chavo valdez ( 206049 ) on Thursday February 21, 2002 @03:28PM (#3046635) Homepage
    I don't run explorer.exe on my windows box at all. I use an open source 32-bit shell called Litestep. It is infinitely configurable and themeable. There are tons of themes to download, or you can dive right in and edit the rc files yourself. You can make it look like any Linux WM or desktop environment. I love desktop-click popup menus, which is one of the countless modules available. The main litestep.net site is down right now, but checkout Shellfront [shellfront.org] for info and links on Litestep and a few other replacement shells for windows. If you know Win32 programming, grab the source and dive in, the dev team is in a bit of disarray at the moment.
    chavo
  • Re:Licensing (Score:4, Informative)

    by maitas ( 98290 ) on Thursday February 21, 2002 @03:32PM (#3046665) Homepage
    For raw MPP numeric processing, W2K is too damn slow. You can boot Linux in 4 MB of RAM and less than 64 MB of disk, then just load the libraries you need and nothing else, and you will have a pretty decent system. Try thinning W2K down and you will have a huge problem there. You can use Sun's Grid Engine for Linux (http://www.sun.com/software/gridware/gridengine_project.html) and, best of all, it's open source!
    In the end, it all comes down to your software: if you develop a highly scalable, almost share-nothing algorithm, Linux clustering is the way to go. For failover on Linux you have the HA Linux project; once more, open source!
  • Re:Point? (Score:5, Informative)

    by Red Avenger ( 197064 ) on Thursday February 21, 2002 @03:33PM (#3046677)
    Microsoft answers your question here [microsoft.com].

    "Q. How does a Windows-based supercluster compare with one running UNIX or Linux?

    A. In short, there's very little substantive difference, but owners of existing UNIX-based solutions will face changes that will cause them some work and discomfort (less for users than for their current administrators and support staff). These are offset in part by lower costs of ownership (technical skills required), breadth of applications and support tools, vendor support options, and commonality with the constantly improving desktop environment.

    From a hardware perspective, there's very little difference seen by the application. In the past, UNIX-based hardware environments had better floating-point performance, but that's been offset in the last few years by Moore's Law curves for large-volume products that have advanced faster than specialty products have, as well as the price and support cost differentials between these vendors' products.

    From a software perspective, Windows is a markedly different environment, designed with priorities set by a much different market segment than traditional science and engineering. Windows NT® and now Windows 2000 were designed to meet the needs of those ISVs building products for businesses that are unable or unwilling to dedicate their best people to support their infrastructure (versus focusing on building solutions for their business mission), as well as the needs of a hardware community that required continuous integration of new devices and components."

  • Re:MSCS (Score:2, Informative)

    by daveman_1 ( 62809 ) on Thursday February 21, 2002 @03:41PM (#3046738) Homepage
    The process for a two node cluster set up in Win2k is nearly identical to what you have described. Although I think the licensing is a bit different than when you built that cluster... Lowest version you can buy to install a cluster is Win2k Advanced and you also need the enterprise version of SQL 2k to make it work. When you are looking at a failover type setup (active-passive), you are still looking at a fairly sizeable chunk of change to put this together. If you are frugal, the hardware is gonna cost you $30,000. For licensing, you are looking at very least an additional $30,000, likely more. I was appalled to see how much this was actually going to cost.

    And yes, you have to follow many sets of conflicting directions to a friggin' "T" or else it won't work. And do yourself a favor and firewall the hell out of the boxes. Installing service packs on such a mission critical set up just doesn't appeal to me for some reason.

    By the way, you say:
    "...If I were to do it again, I think I would not have used MSCS, but instead have two distinct systems that had some kind of data replication software."

    I have no idea how you intend to accomplish this with a database, utilizing a MS solution, but I'd certainly like to hear about it!
  • by Nickodemus ( 529872 ) on Thursday February 21, 2002 @03:44PM (#3046760)
    Microsoft Cluster Service is designed for one thing: high availability (little or no downtime / load balancing). Beowulf clustering is designed for one thing: parallel processing (data analysis / number crunching). They are two different types of clustering. The debate on cost is a waste of time: Linux is as capable of high-availability clustering as Microsoft is, at little cost. With Microsoft you have to buy a license of Advanced Server for each cluster node and then have licenses for each application as well. For cluster-aware Microsoft apps that means Enterprise editions. Advanced Server costs in the $4000 range. SQL 2000 Enterprise Edition costs in the range of $11,000 per node. If you are backending a website with a SQL cluster, just for SQL you are looking at around $20,000 per processor. If you are looking for a cluster to be online 24x7, then you go with Microsoft (and pay the additional money for support). If you are looking to predict weather patterns, analyse ocean currents, or predict the lottery, use Red Hat and Beowulf (and pay the additional money for support).
  • BSODs are an issue (Score:2, Informative)

    by Sxooter ( 29722 ) on Thursday February 21, 2002 @03:51PM (#3046807)
    BSODs are an issue for Windows and not Unix/Linux primarily because in Unix, video and device drivers don't run in the inner ring of the kernel, and can't bring the whole box to its knees because of a minor bug in a driver or a hardware failure in (what should be) a secondary I/O device.

    Windows NT 3.51 had the video drivers (and most other drivers as well) in the outer ring of the kernel where they couldn't down the whole machine, just certain services. I've seen NT 3.51 boxes with horribly buggy video drivers just keep right on running when the video would lock up. NT 4.0 and above aren't the same.

    The decision to move the drivers into the inner ring of the kernel is why BSODs are a Windows issue. Blaming the user for not setting up his box just right doesn't solve the real issue, poor OS design.

    A real OS (unix/linux/OS390/VMS/even NT3.51) doesn't have these problems.
  • Shameless plug (Score:3, Informative)

    by Leimy ( 6717 ) on Thursday February 21, 2002 @04:12PM (#3046935)
    Since you asked come visit MPI Software Technology Inc. [mpi-softtech.com]

    We have been very successful in Windows clustering efforts and offer a professional MPI implementation for Windows platforms. Give us a shot; I am sure we could set up an evaluation of some sort.

    That said, we have the following self-kudos:

    CORNELL THEORY CENTER'S VELOCITY CLUSTER MAKES THE TOP 500 LIST (June 16, 2000)
    "Our relationship with MPI Software Technology, Inc. has been extremely valuable," says Cornell Theory Center associate director for systems Dave Lifka. "Good job scheduling, resource management, and reliable MPI are the primary pieces of any high performance computing environment. MSTI has made the extra effort to make sure MPI/Pro and Cluster CoNTroller are ready for a production quality environment. The utilization and stability the AC3 systems is directly related to the quality of their software."

    World's Largest NT Cluster Goes Live (August 25, 1999)
    The Advanced Cluster Consortium (AC3), which includes Cornell University, Intel, Microsoft, Dell, Giganet, and MPI Software Technology, Inc., announced on August 12, 1999, that it had completed the installation of a 256-processor high-performance computer cluster using Windows NT 4.0. AC3's cluster bests a University of Illinois 192-processor NT cluster, which Windows NT Magazine covered in June 1999.

    As you can see we've been at it a while! :)
  • by JamesGreenhalgh ( 181365 ) on Thursday February 21, 2002 @04:35PM (#3047130)
    Having seen first hand how poorly the following setup ran, I'd say steer clear of Microsoft until they admit that reboots are not normal:

    2 x HP NetServers, both dual PII Xeon, 1 GB RAM, and a small RAID shelf with 8 x 9 GB disks. Both NT4 installs with the correct patch levels.

    One machine ran oracle, the other IIS, these were clustered so that one would take over the task of the other, should there be a problem.

    Problems:
    1) Crashing (daily at least)
    2) Slow (astonishingly poor, disk defrags once a week helped this)
    3) Sometimes one host would freeze, and the other wouldn't actually notice
    4) Often a shutdown of one node would move the services across, but upon rejoining the cluster - the node with both services would refuse to give one back.
    5) Often, IIS would stop talking, and neither node would actually realise.

    The attempted solutions:

    1) Replaced CPUs, memory, disks, eventually nodes
    2) Reinstalled clustering software, eventually total clean installs of operating system and applications
    3) Support from Microsoft, and Oracle, and HP who made the (certified) kit. Oracle+HP both pointed the finger at the OS, Microsoft simply failed to help, when we got any response from them at all.
    4) (this helped) I used one of the spare HP9000 servers to monitor them remotely by trying test transactions - it alerted people when they fucked up.

    I think the above says it all really. Standard software on correct hardware - it just didn't work properly. Microsoft can stick their clustering "technologies" where the sun don't shine.
  • Re:channel bonding (Score:1, Informative)

    by Anonymous Coward on Thursday February 21, 2002 @04:41PM (#3047173)
    Actually, the frames on Gigabit Ethernet are extremely small. Not the usual 46-1518 bytes of data. I can't remember the actual data length, but it is indeed possible to have giant frames; the usefulness is limited, though, since GbE is designed for small packets.
  • Windows HPC Cluster (Score:2, Informative)

    by lifka ( 560834 ) on Thursday February 21, 2002 @04:45PM (#3047214)
    Yes, you can build a Windows Beowulf cluster. They work very well. Thomas Sterling just came out with a book describing how: "Beowulf Cluster Computing with Windows." It's a Scientific and Engineering Computing Series book from MIT Press.
  • by ADRA ( 37398 ) on Thursday February 21, 2002 @05:41PM (#3047650)
    Erm, that is not exactly true.

    Yes, Microsoft puts the drivers right in the kernel, but other OS's end up with similar results anyway.

    For example, Linux does have kernel-level drivers (DRI) for most common graphics cards nowadays, simply to increase performance and to allow for features user space cannot provide. Plus, there is a growing wave of framebuffer kernel drivers which give user space a blank framebuffer to work with, so that programs like X, gnomefb, and kembedded don't have to worry about writing to every hardware platform, just the abstract one that the kernel produces - pretty much the same as what Windows does.

    This may never become standard across all Unixes, but anything that needs performance out of graphics, or any non-core peripheral, will either go into the kernel or get special hardware.

    Done.

  • by tdelaney ( 458893 ) on Thursday February 21, 2002 @05:51PM (#3047732)
    Of course it's possible. It's not *nice*, but it works.

    For the record, I maintain a headless NT 4.0 web/database server at work for one of my projects (requires disabling the mouse driver to avoid error messages at startup) controlled via PC Anywhere and a headless Win98SE machine at home as my internet gateway (running SyGate NAT 3.0 and SyGate Personal Firewall) controlled via VNC.

    Why NT 4.0? Mandated at the time (the main servers are in the US maintained by an external group - we're in Australia with the admin server for the same system).

    Why Win98SE? I tried various Linux and BSD distributions on the machine, and couldn't get any to work - Pentium 60 MHz, SCSI, plain IDE (not ATAPI, so I had to install an I/O card to get a CD-ROM to work), old Intel Ethernet cards, etc. I've configured it to reboot every night so I don't have stability problems ... ;) It's not fast, but since all it does is pass packets through (cable modem) and block incoming packets, it doesn't need to be.
  • MS parallel tools (Score:5, Informative)

    by ajv ( 4061 ) on Thursday February 21, 2002 @08:02PM (#3048446) Homepage
    Getting the wrong tools out of the way first: Beowulf is an architecture for massively parallel computation, so we can eliminate two of the best-known HA tools. Microsoft Cluster Service is two- or four-node high availability, similar to HA Linux's efforts. NLBS is a software form of a hardware load balancer, similar to Cisco Local Directors and only really good for web farms. So what does MS provide that does similar stuff to Beowulf?

    COM+ and Queueing Components. AppCenter.

    The way it works is this. You write a COM+ component that is transactionally queuing aware. Each component takes a work unit in, processes it, and then sends the result of the transaction to the queueing components for reassembly or re-issue (if a node fails to submit a result, for example, good for checkpointing).

    You can use normal Windows 2000 Professional boxes for the worker bees, and use a few Windows 2000 Server boxes to co-ordinate the issuing of jobs and control, and munging the result sets coming back in.

    If you need to submit a wide variety of jobs, the COM+ components will obviously be changing regularly, so it'd be a good idea to go to AppCenter so that you can treat a bunch of machines as a single whole. This allows you to upgrade or deploy an app to literally thousands of machines in a few mouse clicks and a few seconds. AppCenter also has pretty good resource management, something that might be necessary if multiple jobs are running at the same time.

    The cool thing is the development environment is really friendly and you can make COM+ components pretty easily and test them locally (for the n=1 case) before deploying to the farm.

    There are also specialist message-passing libraries for the Win32 platform, such as PVM or MPI (WMPI). These have the benefit of reusing the knowledge and APIs that users might already be familiar with - one of the biggest things when a place converts from one supercomputer to another is rejigging and reoptimizing the code for the new architecture.
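
    The work-unit flow described above - hand a unit out, process it, send the result back for reassembly or re-issue - isn't tied to COM+. Since no COM+ interfaces are quoted here, this sketch expresses the same self-scheduling master/worker shape against plain MPI (which the parent also mentions via WMPI); process_unit() and the constants are placeholders, not anything from an actual product:

        /* workunits.c - hedged sketch of a co-ordinator handing work units to
         * worker-bee nodes and collecting results, using plain MPI rather than
         * COM+ queued components. Assumes at least two MPI processes.
         */
        #include <stdio.h>
        #include <mpi.h>

        #define NUM_UNITS 64
        #define TAG_WORK  1        /* message carries a work unit */
        #define TAG_STOP  2        /* no more work; worker should exit */
        #define TAG_DONE  3        /* message carries a result */

        static double process_unit(double unit)
        {
            return unit * unit;    /* stand-in for the real computation */
        }

        int main(int argc, char **argv)
        {
            int rank, size;
            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &size);

            if (rank == 0) {                       /* co-ordinator */
                int next = 0, done = 0, w;
                double unit = 0.0, result;
                MPI_Status st;

                /* prime every worker with one unit */
                for (w = 1; w < size && next < NUM_UNITS; w++, next++) {
                    unit = (double)next;
                    MPI_Send(&unit, 1, MPI_DOUBLE, w, TAG_WORK, MPI_COMM_WORLD);
                }
                for (; w < size; w++)              /* surplus workers: no work */
                    MPI_Send(&unit, 1, MPI_DOUBLE, w, TAG_STOP, MPI_COMM_WORLD);

                /* collect results; hand the next unit to whoever finished */
                while (done < next) {
                    MPI_Recv(&result, 1, MPI_DOUBLE, MPI_ANY_SOURCE, TAG_DONE,
                             MPI_COMM_WORLD, &st);
                    done++;                        /* reassemble/log result here */
                    if (next < NUM_UNITS) {
                        unit = (double)next++;
                        MPI_Send(&unit, 1, MPI_DOUBLE, st.MPI_SOURCE, TAG_WORK,
                                 MPI_COMM_WORLD);
                    } else {
                        MPI_Send(&unit, 1, MPI_DOUBLE, st.MPI_SOURCE, TAG_STOP,
                                 MPI_COMM_WORLD);
                    }
                }
                printf("collected %d results\n", done);
            } else {                               /* worker bee */
                for (;;) {
                    double unit, result;
                    MPI_Status st;
                    MPI_Recv(&unit, 1, MPI_DOUBLE, 0, MPI_ANY_TAG,
                             MPI_COMM_WORLD, &st);
                    if (st.MPI_TAG == TAG_STOP)
                        break;
                    result = process_unit(unit);
                    MPI_Send(&result, 1, MPI_DOUBLE, 0, TAG_DONE, MPI_COMM_WORLD);
                }
            }

            MPI_Finalize();
            return 0;
        }

    The checkpointing and re-issue behaviour the queued components give you would live in the co-ordinator branch, where results are collected and unfinished units could be resent.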
