Follow Slashdot blog updates by subscribing to our blog RSS feed

Ask Slashdot: Best Bang-for-the-Buck HPC Solution? 150

Posted by Soulskill on Saturday July 18, 2015 @05:03PM from the 14,000-raspberry-pis dept.

An anonymous reader writes: We are looking into procuring a FEA/CFD machine for our small company. While I know workstations well, the multi-socket rack cluster solutions are foreign to me. On one end of the spectrum, there are companies like HP and Cray that offer impressive setups for millions of dollars (out of our league). On the other end, there are quad-socket mobos from Supermicro and Intel, for 8-18 core CPUs that cost thousands of dollars apiece.

Where do we go from here? Is it even reasonable to order $50k worth of components and put together our own high-performance, reasonably-priced blade cluster? Or is this folly, best left to experts? Who are these experts if we need them?

And what is the better choice here? 16-core Opterons at 2.6 GHz, 8-core Xeons at 3.4 GHz? Are power and thermals limiting factors here? (A full rack cupboard would consume something like 25 kW, it seems?) There seems to be precious little straightforward information about this on the net.

This discussion has been archived. No new comments can be posted.

Ask Slashdot: Best Bang-for-the-Buck HPC Solution?

Load All Comments

Search 150 Comments Log In/Create an Account

Comments Filter:

Look for other users of the S/W for advice (Score:5, Insightful)

by peterjt ( 50113 ) writes: on Saturday July 18, 2015 @05:13PM (#50136949)

Why not start with looking at what S/W you plan to run, and then see what advice is available from them (and from other users) as to what H/W they would recommend.

Share
twitter facebook
- Re:Look for other users of the S/W for advice (Score:5, Insightful)
  
  by JamesTRexx ( 675890 ) writes: on Saturday July 18, 2015 @05:22PM (#50136977) Journal
  
  Precisely this. Do not look at the hardware for hardware's sake, look at the needs to run the software as best as you can. Does it benefit from parallelism? Throw tons of Opteron cores at it. Does it benefit from speed? Get the fastest Intels. Can it do everything in RAM? Stuff the servers with it, etc. etc.. Also, if it is built to scale, start with one or two servers, then see what kind of load it causes and base the next nodes you add on that data. You might even want to consider starting off with a virtual environment for portability to other hardware or cloud providers.
  
  Parent Share
  twitter facebook
  - Re: (Score:1)
    
    by Tough Love ( 215404 ) writes:
    
    Seems like mainly a way of avoiding the real question. It's pretty obvious what software the OP wants: PC server stuff. Any ideas, or did you just intend to hijack the thread?
    - Re: (Score:1)
      
      by Anonymous Coward writes:
      
      Spoken like someone who has no idea what the fuck they are talking about. Building for the solution is the right thing to do since the OP was so vague. They need to figure that out before even looking at the hardware. They don't know they are doing and shouldn't be handling this project at all. If anything, find someone who does and pay them before fucking up a purchase and looking more the moron than they already do by having to come to Ask Slashdot for advice.
      - Re:Look for other users of the S/W for advice (Score:4, Interesting)
        
        by Anonymous Coward writes: on Saturday July 18, 2015 @06:37PM (#50137227)
        
        this this this!!!!
        For example the work I do with a HPC would need a monster DB able to handle millions of inserts a day. Which needs bottom rack intel video chips but monster data interconnects (think 40gb per sec and up). But someone doing oil topographical analysis or making a movie may want top of the line quadra nvidia cards and lots of memory and minimal disk space.
        A HPC runs the gamut of what is out there.
        For about 50k I am sure you could build something from HP or Cisco that is in the 100-200 cpu range. But what are you going to do with it? What sorts of network interconnects are you looking at what sort of storage do you need? If you need say 500k sustained IOPs per second 50k will not cut it (start thinking in the 400-1million range).
        That just gets you the hardware. Do you need a particular bit of software? What is that going to cost and ongoing cost? For example something like splunk can costs several hundred k per month in the right environment.
        Without the specs of what you are doing I would be randomly guessing what you need.
        My advice? Start with a prototype of bottom run 'crap' 'costco special' hardware. Work your way up and decide what you need. Perhaps hire someone who knows how to plug this all together. Having done this a few times it can be a challenge just to manage 5000 bits of hardware all showing up one day and getting it all put together. Finding a location and power sources can actually be a challenge. Depending on how big it is you may not be able to plug it into your buildings mains. I suggest a high level design then work your way down to lower designs. But most of all HIRE SOMEONE who knows this stuff. There are thousands of people out there that need a job that can do EXACTLY this sort of thing.
        
        Parent Share
        twitter facebook
        
        Re: (Score:2)
        
        by Gazzonyx ( 982402 ) writes:
        
        Agreed. I'd like to see the look on their face when they realize they have enough power for racks of servers but not nearly enough cooling. We've actually got a staging area at work where all of the power comes into the building, but it's only got enough cooling for a couple of racks (and not dense racks, at that) full time and then a couple of racks meant for short tests. Running everything full for any lengthy amount of time probably wouldn't fly even in the Minnesota winter with the receiving garage d
      - Re: (Score:2)
        
        by Thumper_SVX ( 239525 ) writes:
        
        I wish you hadn't posted AC, and I wish I had mod points!
        This is the right answer. The workloads aren't clear because we don't know what OP is trying to accomplish with this setup. Is he building an HPC cluster to do engineering analysis, or is he building it because he has convinced his management that it's cool? If the latter... well, he'll be looking for a new job after he builds it and it does nothing to help his customer (his employer).
        Start with the application. What are its workload characteristics?
        
        Re: (Score:2)
        
        by postbigbang ( 761081 ) writes:
        
        You're right.
        On on plane of the graph is CPU family. The next one is speed and cache management. IO periodicity considerations is another vector. Which freaking OS, and if it's scaled up/out or is static through its lifecycle.
        Containers? One fat app? What does what talk to what, via hypervisor, container-hosting, or linear OS? How much network, how often, and with what concurrency to which apps/VMs/containers, etc? Quiet or aperiodic duty cycle? Transaction processing? Must be highly parallel/available? Tal
        
        Re: (Score:2)
        
        by Tough Love ( 215404 ) writes:
        
        Is this a software question looking for hardware, or a systems question looking for efficiency, budget constraint, or just sexy buzzwords?
        The first two and not the third, quite obviously. OS is easy: Linux. I vote -1 for containers. One fat app sounds like a bad idea, except certain hot spots, but that's not what the OP asked, is it?
        You're getting warm with the speed and cache management. Now add a cost axis and you're addressing the original question. I hope.
        
        Re: (Score:2)
        
        by postbigbang ( 761081 ) writes:
        
        There's really insufficient info lent to what the app is. Canned? Scale? Lifecycle? Only HPC. Connects to what. Needs what IO.
        There are fat apps that may be more than sufficient for what's needed without VM/container/walls overhead of any kind.
        Many variables are unstated and someone asking for the behavioral characteristics of processor families that Intel makes vs big hulking hardware generic platforms.
        I can see wanting to use scale up/out ideas, but this is far too nebulous to call this a nail, as in the
        
        Re: (Score:2)
        
        by Tough Love ( 215404 ) writes:
        
        Wrong. The OP stated fluid dynamics. Read again please.
        
        Re: (Score:2)
        
        by Tough Love ( 215404 ) writes:
        
        I wish you hadn't posted AC, and I wish I had mod points!
        This is the right answer.
        
        No it isn't. You conveniently forgot that OP clearly stated "FEA/CFD", narrowing down the appliction area and available solutions more than sufficiently. Perhaps you focused on jumping on the loudmouth wagon instead of actually considering the question that was asked?
        Could have been a great thread if not for you guys.
        
        Re: (Score:1)
        
        by nikkipolya ( 718326 ) writes:
        
        You conveniently forgot that OP clearly stated "FEA/CFD", narrowing down the appliction area and available solutions more than sufficiently
        I worked for a major CAD/CAM/FEA software maker and they have only recently added distributed computing capability to their software and it is still very limited in its capabilities. They still do not use GPU capabilities beyond rendering. While I know from my erstwhile colleagues who are now working for another major FEA/CFD maker, and their software has cluster computing capabilities for many years now, their software utilizes GPU too. So I agree with the its important to understand the capabilities/requi
    - Re: Look for other users of the S/W for advice (Score:2)
      
      by kenh ( 9056 ) writes:
      
      Seems like mainly a way of avoiding the real question. It's pretty obvious what software the OP wants: PC server stuff. Any ideas, or did you just intend to hijack the thread?
      What? You don't build a High Perfomance Computer (HPC) to run 'PC Server Stuff'...
      But let's say he does, he wants to build a monster PC Server to run 'PC Server Stuff', wouldn't a file server be different from a VM Host, a database server different from a compute server? And what determines how you build up the server? The software you
      - Re: (Score:2)
        
        by Tough Love ( 215404 ) writes:
        
        PC server stuff = OS that runs on a PC. Apps that run on a PC. Middleware that runs on a PC. Sad that I had to spell it out.
    - Ah - the "why is the sky blue" question (Score:2)
      
      by dbIII ( 701233 ) writes:
      
      Seems like mainly a way of avoiding the real question.
      A good analogy of what is going on here is if the question was "why is the sky blue" and you have answered "dust" while the other has mentioned rayleigh scattering and a variety of other factors.
      
      While "get some servers" is correct it's not exactly a useful answer is it? The above poster is right, for some stuff you want speed and for others you want as many cores as you can afford and don't give a shit about the speed, and without knowing what the subm
      - Re: (Score:2)
        
        by Tough Love ( 215404 ) writes:
        
        A good analogy of what is going on here is if the question was "why is the sky blue" and you have answered "dust" while the other has mentioned rayleigh scattering and a variety of other factors.
        Your analogy fell down and can't get up. The OP asked about cost effective hardware for fluid dynamics. You wrote some poetry that doesn't even rhyme.
        
        Re: (Score:2)
        
        by dbIII ( 701233 ) writes:
        
        With respect, the analogy was a polite way to point out that your "get some servers" was missing the point of the question entirely, which was about what type of "servers", and without knowing what sort of methods are being used to solve the fluid dynamics problems then it's hard to work out what sort of machines - eg. a few fast cores, lots of machines with fast cores or lots of slow cores in machines with a lot of memory that it can get to a lot more quickly than lots of machines with fast cores. Simple
        
        Re: (Score:2)
        
        by Tough Love ( 215404 ) writes:
        
        ..the analogy was a polite way to point out that your "get some servers" was missing the point of the question entirely, which was about what type of "servers"...
        You don't need to tell anybody what the question was, it was plainly stated, e.g., "Is it even reasonable to order $50k worth of components and put together our own high-performance, reasonably-priced blade cluster?"
        Maybe try answering that instead of twisting more. Not sure why you're putting in so much energy trying to find reasons to be irrelevant. For example: "there's no such thing as a commodity blade."
        
        Re: (Score:2)
        
        by dbIII ( 701233 ) writes:
        
        Perhaps you should try reading the part of my post starting with "without knowing what sort of methods are being used to solve the fluid dynamics problems" and you'll get some idea about what everyone else is discussing and why "just use a computer" is not an answer of any value.
        
        Re: (Score:2)
        
        by dbIII ( 701233 ) writes:
        
        For example: "there's no such thing as a commodity blade."
        Who are you quoting and what does it have to do with my post or even the topic?
  - Re: (Score:2)
    
    by rwa2 ( 4391 ) * writes:
    
    Yes, look at software requirements first. FEA and CFD software can be extremely hardware specific. Cant they make use of powerful GPGPUs? Most server chassis will have great CPU/RAM but crap in the way of PCIe slots and especially GPU power plugs. What OS will the SW need to run? HP doesn't even certify "consumer grade" OSes on much of their rackmount lineup, and if you use Windows Server 20XX you often can't get the latest certified GPU drivers on the Server OSes, so you may well lose product support
  - Re: (Score:2, Insightful)
    
    by Anonymous Coward writes:
    
    I will third this. I will also state that I was directly involved in building a home grown cluster that was highly ranked in the Top500 List a little over a decade ago.
    You MUST begin with needs analysis and that goes WAY beyond just looking at research domains, in this case FEA and CFD. You have to know what software you want to run. You must also research and find out if there are alternatives to what software you currently run (or are initially planning to run) that may have modern competitors that run mo
    - Re: (Score:2)
      
      by mlts ( 1038732 ) writes:
      
      I will add another voice into this list in agreement. The problem is that what is needed is so vague.
      There is just no way to recommend hardware. Do you need a lord-king-God-Almighty interconnect backbone switch between all nodes so they can push 40 gigs/sec between each other? A blade/enclosure is a must. Do you need I/O performance above all else, or CPU performance? It might be cheaper to buy a ton of 1U ProLiant G7s with HBAs[1] and 10GigE cards.
      Oracle RAC? Again, need a hefty SAN connection, perh
  - Re: (Score:3)
    
    by sumdumass ( 711423 ) writes:
    
    Just wanted to add, don't stop at the recommendations the software suggest.
    I had a client who decided to go with the hardware recommendations provided by the software vendor against my objections. Six months after we were up and running, the software which was the entire point of the ordeal released an update that slowed everything down enormously. Turns out, their "recommended" hardware specs were slightly better than their minimum specs on the new version of the software and the server had also been purpo
- Re: (Score:1)
  
  by Anonymous Coward writes:
  
  Sure you do business with CDW+G, talk to your rep, have him put some consultants on the phone with you. Over year head? OPT in for professional services...
  not a CDWG employee, but have used their consulting arm in the past with good results.
  - Re: (Score:2)
    
    by KGIII ( 973947 ) writes:
    
    I made use of their services quite some years ago. This is a seconding for them. They became our go-to for hardware and hardware recommendations even while we were mostly a Sun shop. The reps were knowledgeable and polite. The service was top-notch. The after-sale support was surprisingly good. I have been out of the loop for about eight years as I am now retired but I keep my ear to the ground a little bit and have not heard anything that would make me inclined to believe they have changed.
- Re: (Score:2)
  
  by ShanghaiBill ( 739463 ) writes:
  
  Why not start with looking at what S/W you plan to run, and then see what advice is available from them (and from other users) as to what H/W they would recommend.
  Bingo. The correct HW depends entirely on the SW. Depending on the SW, the GPU likely matters more than the CPU.
  You should also consider just renting the HW on demand from AWS. Unless you are going to run your rig 24/7 (you won't), it is likely that AWS will be cheaper, and you won't be stuck with outdated HW in a few years. If you need results quickly, just spin up additional instances.
- Re: (Score:2)
  
  by LWATCDR ( 28044 ) writes:
  
  Exactly. You have a specific task and probably specific software for that task. If the software supports CUDA then you might want to spend money on Tesla cards over CPUs. Does it use Open CL? Then you might want to look at AMD GPU compute cards.
  Do you need a large memory space?
  Do you need a lot of threads or just a few really fast ones.
  If you have 50k for the system then I suggest you spend a little of it on someone that really knows this subject.
  It may make more sense to just use Amazon E2C.
Supercomputers are very workload specific (Score:2)

by Michael Tiemann ( 3136525 ) writes:

You mention you are interested in CFD. Intel Phi processors have been known to do well here: http://www.cfd-online.com/Foru... [cfd-online.com] . In that linked story, a single Intel Phi processor beats a 1024 core cluster. Moreover, Thinkmate is literally giving away Intel Phi processors: http://www.thinkmate.com/syste... [thinkmate.com] . But not all workloads fit the Phi, so you really need to do some benchmarking before you buy.
- - Re: (Score:1)
    
    by Michael Tiemann ( 3136525 ) writes:
    
    May be I am wrong, by I will try compare results. There is some data
    http://www.hector.ac.uk/cse/di... [hector.ac.uk] and from topic starter
    Xeon Phi for 50 time steps
    grid size - 90^3 - 175^3
    best time - 200s - 1500 s
    Hectors 4 core of AMD 2.8GHz dual-core Opteron 5 time steps
    grid size - 100^3 - 200^3
    time - 795s - 8800 s
    Hectors 1024 core of AMD 2.8GHz dual-core Opteron 40 time steps
    grid size - 200^3
    time - 1490 s
    So, single Xeon Phi card for OpenFOAM is compatible with 1024 core cluster (for this benchmark)
    - Re: (Score:2)
      
      by Cytotoxic ( 245301 ) writes:
      
      I don't work in this area, so I wouldn't know..... but are the different grid sizes significantly different? I would assume that going from 175^3 to 200^3 could be a major jump - the sort of thing that imposes big costs for handling exponentially increasing amounts of data.
Get some quotes (Score:3)

by hawkeyeMI ( 412577 ) writes: <brock@brocELIOTktice.com minus poet> on Saturday July 18, 2015 @05:18PM (#50136963) Homepage

Disclosure: I have worked for Penguin Computing in the past, though I currently have only a customer relationship with them (we use their Penguin on Demand service). I strongly recommend you talk to a few of the HPC vendors out there about your needs and get a few quotes. Obviously Penguin is one I recommend, I'm not sure who else is still in the business, I think at least one of the major ones I've gotten a quote from in the past went under. Just do a little googling. They are probably familiar with your applications and can get you a turnkey solution that's well-suited for your application.

Share
twitter facebook
- Re: (Score:2)
  
  by Tough Love ( 215404 ) writes:
  
  And what OP wants most probably would be as commodity as possible. What about cheapo meta shelving with minitowers carrying dual socket server boards with mid range 8 core machines? Density low enough to cool with built in fans and ambient air.
  - Re: (Score:2)
    
    by hawkeyeMI ( 412577 ) writes:
    
    Plenty of HPC folks out there selling rebranded supermicro gear, including Penguin, with a variety of cluster management systems on them, open source and proprietary. That's pretty commodity.
- Re: (Score:2)
  
  by wezelboy ( 521844 ) writes:
  
  The cloud is the last place you want to do CFD.
Why not rent the time? (Score:4, Informative)

by plopez ( 54068 ) writes: on Saturday July 18, 2015 @05:24PM (#50136983) Journal

You haven't said anything about your application. Do you run it continuously? Sporadically? Will the machine be sitting idle much of the time? Do you have the staff to support it? What about networking and storage? Do you have the ability to rapidly move and store data as the actual computing is only part of the story.
It may make sense to rent the time due to lower storage and maint. costs than to actually buy and maintain the infrastructure.

Share
twitter facebook
Haswell-EP Xeons (Score:3)

by coats ( 1068 ) writes: on Saturday July 18, 2015 @05:24PM (#50136985) Homepage

I would go with Haswell-EP Xeons -- probably 2697v3 (14 cores @ 2.6-3.6): a two-socket motherboard gives you 28 physical cores per board, for prices in the $12K range. Just one of these is quite a powerful system. If you can get by with a 2-node system, then 10GE interconnect is good enough (AND MUCH CHEAPER); for more nodes, you will need Infiniband (since 10GE does not scale well). The 4-node/IB cluster will be on the order of $60K, and will offer more performance than a $160K solution of a couple of years ago.
These will offer far better performance than the Opteron solution.
Can you compile your own application? If so, use the Intel compilers, and make sure you compile targeting the Haswell instruction set (-O3 -Xhost -march=corei7-avx2 -mtune=corei7-avx2 if I recall correctly): the full AVX2 Haswell instruction set is rather more powerful for your app than the predecessor "AVX" SandyBridge/IvyBridge instruction set, which is far more powerful than the previous Nehalem/Westmere SSE4.2 instruction-set, which is somewhat more powerful than a simple "-O3". If you can't compile on your own, try to make sure the vendor's executables target AVX2; the right compile-flags will double your performance over "-O3"...

Share
twitter facebook
- Re: (Score:2)
  
  by Tough Love ( 215404 ) writes:
  
  ...If you can get by with a 2-node system, then 10GE interconnect is good enough (AND MUCH CHEAPER); for more nodes, you will need Infiniband (since 10GE does not scale well)...
  Useful commentary for the most part, but this bit is just wrong. Nobody needs Infiniband. If you think you need Infiniband then get some RDMAoE instead. Save yourself some money and some grief.
  - - Re: (Score:2)
      
      by coats ( 1068 ) writes:
      
      You've never run CFD,have you? ...and don't know its message-patterns, either. That's what the OP said he was doing (though he didn't say whether he was running someone else's "canned" code, or was compiling his own). Unless it's specifically compiled for Haswell (which is unlikely,but sad), the "canned" code will not take proper advantage of the Haswell AVX2 instruction-set.
- - Re: (Score:2)
    
    by jlehtira ( 655619 ) writes:
    
    The original poster did say it's for FEA/CFD. You may not know what this means, but that doesn't mean there's no basic information. Incidentally, the suggestion for 28 core Xeon nodes is exactly what we got for weather prediction this year (and CFD would be similar - need for huge number crunching power, with lots of communication between threads every timestep). The only obvious alternative would be a GPU-based solution, but to my knowledge most existing FEA & CFD codes don't use GPUs.
Amazon AWS (Score:5, Interesting)

by Cyberax ( 705495 ) writes: on Saturday July 18, 2015 @05:30PM (#50137001)

Unless you need to transfer A LOT of data from your cluster, Amazon AWS will probably be cheaper than dedicated hardware. Especially if you can use spot instances (that are 5-10 times cheaper than the regular Amazon EC2 instances).

Share
twitter facebook
- Re: (Score:3)
  
  by kimanaw ( 795600 ) writes:
  
  This. AWS has a GPU tier (kinda pricey, but probably cheaper than standing up an equivalent on your own). I'm guessing your FEA/CFD will probably need GPUs. $50K will rent a lot of GPU time. Not sure how available the spot instances for them are.
  otoh, if you're looking to use regular CPUs, Azure has an infiniband tier that may be a better interconnect for HPC purposes than AWS's 10 Gbps VPC's.
- Re: (Score:1)
  
  by Sebo ( 94385 ) writes:
  
  As suggested, take a look at http://aws.amazon.com/hpc/getting-started/. If nothing else, it gives you an option that's faster to implement, reduces CAPEX, and doesn't leave you with physical infrastructure that is immediately depreciating and becoming "obsolete".
" There seems to be precious little..." (Score:2)

by Glasswire ( 302197 ) writes:

"...straightforward information about this on the net." Because it's not straightforward. If you are have enough grasp on your requirements to understand the apps you want to use and you are using commercial CAE / CFD codes, your ISVs should be able to give you some guidance about what typical customers are running (how many cores over how many nodes configured with how much ram and storage with what kind of cluster interconnect and MPI message passing etc) for workloads similar in size to yours. If you'
See what you can do with leasing or cloud (Score:3)

by garyisabusyguy ( 732330 ) writes: on Saturday July 18, 2015 @05:32PM (#50137009)

There are plenty of costs beyond the actual computer, including power, power conditioning, battery backup, heat removal, etc... that make up most of the cost.
If you still decide to build your own hardware, then pay close attention to
1. Compatibility with your chosen software, i.e. the best system in the world is worthless if it does not run the software that you want. You may be building your own software, then you will still need to consider OS, compiler, libraries, etc
2. Ability of the operating system to provide enough resources to your software, in the 'good old days' Windows only provided a limited amount of RAM to processes, even in today's world Windows system swap aggressively and may not give you the RAM performance that you may see in the Enterprise *nixes
3. Internal bus structure of the system you choose, The biggest growth in PC hardware has been the internal bus width and speed. Look around, but for cost's sake, you will probably be using a variety of PCIe from Intel. You will probably also see better integration with the PCIe bus with Intel chips. If you are using GPU accelerators, that is a whole 'nother kettle of fish that will affect your other decisions above and below
4. Methods provided for disk access, used to be the Fibre-Channel was the King, but times have changed with iSCSI making inroads, and local disk architecture provides the greatest bang for the buck with SATA starting to edge out SCSI. If you go the SAN or iSAN routes, it will have additional costs for rackspace, power and cooling.
5. Disk system that you choose, most people would suggest butt-loads of local SSD, after RAM, solid state drives will probably be your highest costs
Just my two bits, plus I completely ignored tape system vs spinning-disk hard drives for backup, which would add more rack space, power supply and cooling to anything that you try and put together. Try and put together a realist estimate for purchasing and supporting your hardware for a couple of years and compare it to cloud cost for similar resources

Share
twitter facebook
small cluster: performance/price metric (Score:5, Interesting)

by lkcl ( 517947 ) writes: <lkcl@lkcl.net> on Saturday July 18, 2015 @05:35PM (#50137019) Homepage

i did this before, on a very small scale, for GBP 1,000 about 10 years ago. sales teams kept offering me 2ghz dual-core machines at GBP 300 each and i had to tell them this:
"look, i have a budget of 1,000 GBP. you're offering me a 2ghz system for 300. so i can only buy 3 machines, right? so that's a total of 6 ghz of computing power. on the other hand, if i buy this GBP 125 machine which has only a 1ghz processor, i can get 8 of those, which gives a total of 8 ghz of computing power. so _why_ would i want FASTER?"
so i bought qty 8 of motherboard, CPU, 128mb RAM, low-cost case containing a PSU already, and accidentally included a 3com network card because i didn't realise that the built-in ethernet on the motherboard could do PXE boot..... but still, all-in that was 125 GBP and each one took 15 minutes to assemble so it was no big deal. got myself 8ghz of raw computing power, which was the best that i could get for the money that i had.
and that's the question that you have to ask yourself. what's the highest performance / price metric that can be achieved?
the highly specific problem that i was endeavouring to parallelise was a very small memory footprint non-I/O-bound task: running the NIST.gov Statistical Test Suite. i booted all 8 machines off of my laptop, over PXE boot with an NFS read-only root filesystem. had to wait 30 seconds between each because my 800mhz P3 laptop with 256mb of RAM reaaallly couldn't cope with 8 machines hammering it... not over a 100mbit/sec link, anyway.
once started, i wrote a script that ssh'd into each and left them running the STS for a day at a time. very little actual data was generated: a report.
but the issue that you're solving may involve huge amounts of disk I/O, it may involve huge amounts of inter-connectivity (inter-dependence between the parallel tasks). you may even have to use a GPU (OpenCL) if it's that computationally expensive... ... and that's where anyone's advice really ends, because unless you know exactly what it is you need to do - in real, concrete terms of I/O per second, GFLOPs/sec, GMACs/sec, inter-communication/sec, you really can't and shouldn't even remotely consider spending any money.
so please consider writing a spreadsheet, based on the performance/price metric, extending it to the domain(s) that you're interested in optimising. then the answer about what to buy should be fairly self-evident.
oh and don't forget to include the power budget (and cooling) because i think it will shock the hell out of you. remember you need to include the maximum specs, not the "average" or "scenario design power".

Share
twitter facebook
You are going to need more detail. (Score:2)

by fuzzyfuzzyfungus ( 1223518 ) writes:

Your options really depend on what sort of 'high performance' you have in mind. When it comes to performance per core, Xeons typically crush Opterons; but the pricing reflects that, especially if you need the 4-8 socket support and RAS features. If what you need is large amounts of RAM with the lowest possible spending on the system around it, Opterons have tepid performance per core; but are likely to be the cheapest option that still supports ECC, more than one socket, buffered DIMMs, and any other niceti
What can the software you will use do? (Score:2)

by Kjella ( 173770 ) writes:

So you're using workstations now, my first order of business would be to figure out how you'd work together on a server or cluster. Does your software and workflow actually support that or will it just be like a super high end workstation. Once you've got that done, you can start working on what is it your workload actually needs. How many nodes, CPU, RAM, network and so on. In general if your software scales well, more and less powerful nodes will do the job cheaper. Quad-core systems are expensive and sho
Real HPC next to your desk (Score:2)

by deadline ( 14171 ) writes:

Take a look at Limulus systems from Basement Supercomputing: http://www.basement-supercompu... [basement-s...puting.com]
These are fast low power/noise/heat systems with a fully installed open source HPC software stack.
Good DIY Option (Score:1)

by Zorlon ( 181163 ) writes:

I've used supermicro in the past. Very cost effective if you like to assemble your own systems. I have no financial relationship. This is just an unsolicited testimonial. http://www.supermicro.com/prod... [supermicro.com]
Raspberry Pi cluster (Score:2)

by ArcadeMan ( 2766669 ) writes:

Because Soulskill recommends it. Fourteen thousand Raspberry Pi, in fact.
Core considerations (Score:2)

by whoever57 ( 658626 ) writes:

And what is the better choice here? 16-core Opterons at 2.6 GHz, 8-core Xeons at 3.4 GHz? Are power and thermals limiting factors here? (A full rack cupboard would consume something like 25 kW, it seems?) There seems to be precious little straightforward information about this on the net.
There is another factor to consider. If you ever license software that is priced using a per-core model (for example LSF), you will find a great advantage in going with the Intel solution.
- Re: (Score:2)
  
  by Cassini2 ( 956052 ) writes:
  
  Per core or Per CPU software pricing can dominate the cost calculation. We have a CFD application, and we were considering boosting the hardware. One look at the software costs discouraged us.
  A costly complication is that 3-D CFD (or FEA), is an O(n^3) problem. Doubling the mesh density means 2^3=8 times the CPU time. An increase of 10 times in the mesh density requires 10^3=1000 times the CPU time. If you are pushing the extreme, small changes in the mesh density have significant cost impacts.
  It mak
What CFD software, Interconnect, Storage.... (Score:1)

by jason.stover ( 602933 ) writes:

Really... it depends. What software are you using (starccm+? openfoam? custom + mpi?)? Planning on InfiniBand interconnect (if so QDR? FDR? ... DDR I guess?)? 10GbE? What about storage for the cluster file system? Lustre? NFS? How much sustained IOPs are you expecting to need? How much RAM per Core / or overall RAM is required for your application? Without more info it's impossible to give a good answer. -J PS. Not advertising, but I do actually work for a company that sells CPU time on HPC clusters. And with
*LOTS* of info on the net (Score:3)

by gavron ( 1300111 ) writes: on Saturday July 18, 2015 @06:41PM (#50137243)

The problem is that you don't know what you're looking for so you're not asking the right questions.
- Power is a factor. You mention 25KW. Wrong units. You should look for KVA. You'll never know what the wattage is until you know the power factor (PF) and you won't know that until you populate the device with spindles and fans (which have a different PF than CPUs, GPUs, PSUs,) and then run it under load and measure.
- 25KVA is a medium rack. 35-50KVA is a dense rack. How many racks you choose to have is up to you, but the "25" number is not a good random one to shoot for. If you search for "30KVA" and "High density rack" you'll get an idea of what servers do populate such things.
- You won't be running anything of this magnitude at your deskside, unless you are in Alaska or Siberia and have no other source of heat. Also most businesses don't like running 4 30A 3-phase 208VAC to employees' desksides. Just sayin'... And again, if you're not Alaska or Siberia with an open door and window, you won't move enough air through your office to cool that beast. (Air mass is directly related to cooling, and unless you're doing dielectric-immersion cooling, the sheer amount of air requires massive fans and lots of space.)
- Two other responses said "See what your software vendor says." Software is abstracted by compilers. The real question is "how much CPU, GPU, DISK, or other IO does it do" and plan for that. That will also change the PF and the KW and the heat load.
There's a reason nobody builds deskside compute servers with today's technology. Density, power, and cooling.
Keywords to google: KVA PF KW, high density rack server, PUE (PUE is the inverse of PF and is applied to an entire data center which includes cooling.)
Other places to look: look up abstracts for talks at Data Center World.

Share
twitter facebook
- GPU (Score:2)
  
  by witherstaff ( 713820 ) writes:
  
  If you're considering GPU look through any bitcoin mining forum. Setting up a reliable GPU farm is a nice tech challenge if you really want to grow your own. A few years ago I ran out of power at 50 GPUs, they're hogs. Heat is a whole other problem as the previous posted made note of. For a business use you may want to just use AWS or GPU appliances.
- Re: (Score:2)
  
  by Thumper_SVX ( 239525 ) writes:
  
  There's a reason nobody builds deskside compute servers with today's technology. Density, power, and cooling.
  And the fact that a deskside system is highly unlikely to be utilized 100% of the time... probably more like 10% of the time. In that case it's more cost-effective to farm it out to a bigger cluster in a server room, or run it in some AWS/Azure nodes for the time it needs and then shut it down.
  The fact that there are many more high performance computing resources available relatively cheaply is as good a reason as any not to do deskside compute on a large scale.
- Re: (Score:2)
  
  by pnutjam ( 523990 ) writes:
  
  I currently have three HPC clusters I manage. Those things definitely move some air. If I open the mesh rack door, without holding it, the air flow will slam it into me pretty hard. It's just a light door, but it's full of holes and the air still pushes it hard.
Easy (Score:2)

by nospam007 ( 722110 ) * writes:

"Or is this folly, best left to experts? "
Since you don't seem to be or have experts nor money, it's obviously folly.
project destined for disaster (Score:2)

by bloodhawk ( 813939 ) writes:

I regularly work with large corp's and governments building large HPC, mainframe replacements, large clusters and you appear to be falling into the same trap a lot of them do. As others have said, it isn't the hardware, anyone that tells you what is best based on your summary doesn't have a clue as it is all about the SOFTWARE. I recently watched an organisation spend the best part of a million dollars on high core count machines only to then find the app doesn't perform well in parallel and in fact scales
- Re: (Score:2)
  
  by pnutjam ( 523990 ) writes:
  
  There are so many components to manage, working with individual vendors, who will just blame each other, is a nightmare. You definitely need to have a comprehensive vendor who has the clout to manage this. HPC clusters that are used, require constant attention.
Red Barn (Score:2)

by Theovon ( 109752 ) writes:

Got two Ivy Bridge dual-socket 12-core Xeon boxes a couple of years ago. I called up Red Barn. They helped me figure out what hardware would give me more bang for my buck (two dual-socket Ivy Bridge blades got me more cores than one Sandy Bridge with four sockets), built it up for me, installed the OS, and delivered it. Smooth as butter. IIRC, the whole deal cost me around $24000, for one compute/server node and one compute node. For $50K, if prices have scaled similarly to Haswell Xeons by now, you'd
You need to gather more info (Score:2)

by excelsior_gr ( 969383 ) writes:

You need to look into the problem at hand more closely! The software plays a very important role. Perhaps it can benefit more from a GPU cluster rather than a CPU cluster? Can it benefit from the instruction set of the latest Xeons or will the older (and now cheaper) generation suffice? CFD simulations are quite memory-hungry, so 3 GB per core is pretty standard. Also, you need to make sure that the cores can talk to the RAM efficiently, so definitely pick a CPU with 4 memory channels. After 6 cores per cpu
GPUs? (Score:2)

by Chewbacon ( 797801 ) writes:

Was just reading about a 25 GPU cluster for brute forcing passwords. You can use them for supercomputing too. You could probably homebrew one with used equipment and save some cash. Anyway, here is some inspiration: http://arstechnica.com/securit... [arstechnica.com]
If you go Supermicro (Score:1)

by Khyber ( 864651 ) writes:

They will build the system for you. In fact, they require that they build the system for you in order for you to get ANY warranty service.
It's worth it. For just a bit over $100K you can have a 192 thread 1TB RAM 12GPU quad-cluster loaded with 24 SSDs.
- Re: (Score:1)
  
  by Khyber ( 864651 ) writes:
  
  For reference: A system from them I had made.
  http://i.imgur.com/d4gPjNM.png [imgur.com]
  And actually that's just over $91K
  So even less than what I was saying initially.
  - Re: (Score:1)
    
    by Khyber ( 864651 ) writes:
    
    Also, go with Xeons. AMD has been lagging behind and HBM isn't going to help lose the gap very much.
Obligatory car analogy (Score:1)

by wheelbarrio ( 1784594 ) writes:

You have asked "What is the best car I could buy? Also, should I build it myself or get one from the showroom?"
As many other posts here suggest, the first question is kind of meaningless without knowing what you want to do with said car. Is it for trips around town? To carry 7 kids? A lean mean street-fightin' machine?
As for the second question, if your budget is $50k, then I suggest neither. You cannot (should not try to) build a general-purpose HPC solution and its infrastructure for that kind of
Useful links (Score:1)

by LStewart2 ( 4189605 ) writes:

You might look at http://limulus.basement-superc... [basement-s...puting.com] for a concrete example of the sort of system you are talking about. Also, http://www.beowulf.org/ [beowulf.org] is the home of the community of people who build compute clusters from any old hardware and run open source software on it.
Re: (Score:2)

by account_deleted ( 4530225 ) writes:

Comment removed based on user account deletion
Yet another voice of experience (Score:2)

by ihtoit ( 3393327 ) writes:

Find a consultant and talk to him about precisely what software you want to run, so he can work out as future-proof a solution as possible for you.
There is no way I can recommend hardware without this basic information. Do you want to run a Quake server? Video streaming? Virtualisation? HD encoding? CGI rendering? Studio or OB relay broadcasting? Teleconferencing? CRM? Wiki/bulletin board/IRC/SL? Multi-point data aggregation? Physics simulations? All have specific and wildly different hardware requirements
- - Re: (Score:2)
    
    by ihtoit ( 3393327 ) writes:
    
    what, finite element analysis and computational fluid dynamics? Finite difference methods, finite element methods, finite volume methods, polynomial fitting, spectral methods, boundary element methods, iterated function systems...? Which? All? They each have different requirements, and you're talking about Big Iron to deal with them all - not something you're ordering from Dell using their web configurator.
Silicon Mechanics! (Score:2)

by Ydna ( 32354 ) * writes:

http://www.siliconmechanics.co... [siliconmechanics.com]
They take all that commodity hardware and figured out how to make nice high-density systems for you. Stop trying to figure out how to do it yourself. They've done the hard work and (in my personal opinion) have absolutely outstanding support (before and after the sale). I'm just a happy customer.
- Re: (Score:2)
  
  by iggymanz ( 596061 ) writes:
  
  indeed the smart solution is to pay for services, buying anything will just mean solution will be totally obsolete in 18 months
Well ... it depends ... (Score:2)

by golodh ( 893453 ) writes:

Sorry, but it really does. The right answer depends on you and your application. And previous posts to this effect are right: start with the software that solves your problem.
You see, if you (the one who posted the question) were a numerical mathematician or a computational physicist and looking for adequate performance in a research setting at rock-bottom cost, I'd say:have a look at GPU's (see e.g. here http://www.nvidia.com/object/c... [nvidia.com] ) and e.g. the Navier Stokes solver from Stanford U. (see here: htt [stanford.edu]
Re: (Score:2)

by account_deleted ( 4530225 ) writes:

Comment removed based on user account deletion
Depends on a lot of things (Score:2)

by dbIII ( 701233 ) writes:

Depends on a lot of things - if you have something massively parallel to do then core per $ is going to matter and speed is nowhere near as important. For other things speed is going to matter so you'll probably end up with a mix.
Memory is bound to be something that will drive the design. Do you want a LOT of shared memory (which means a few huge machines) or can you parcel it out to nodes on a much cheaper cluster? A network is NEVER fast enough for some things when the alternative is a big pool of memo
What about the interconnect? (Score:2)

by Terje Mathisen ( 128806 ) writes:

In pretty much every HPC cluster I've seen or been personally involved with (mostly oil/seismic processing or crash simulations), the type of CPU is only one of the cost drivers!
Typically you end up spending about as much on fast interconnects as you do on motherboards/cpus/ram etc. The main exception to this rule is when you have an embarrassingly parallelizable workload, with small memory footprint and no need for cross-system communication, i.e. like a Monte Carlo simulation or password cracking.
For oil
The machines may need to be heterogeneous (Score:2)

by Marrow ( 195242 ) writes:

You may need a (screamin) front-end machine that splits up the work and hands it off to multiple multi-core machines. These multi-core machines may only be available at lower clock-speeds.
Dont just "look at your application". But look at what parts of your application are subject to parallelism and what parts must stay single threaded. You may need a special single-thread machine that can keep the other ones fed.
Here is what I did... (Score:2)

by Taz1672 ( 69910 ) writes:

My company decided it wanted a new FEA machine. They decided to stay with the existing software company, so I called up the company and explained the situation and asked for the department that provided pre-sales support, specifically hardware recommendations. Turns out they had a strong bench of people ready to help with that and detailed Known Good configurations for each major hardware company. We simply decided looked at the software licensing costs, the hardware costs and how long our average scenario
- Re: (Score:1)
  
  by Taz1672 ( 69910 ) writes:
  
  Forgot to mention that if you use the software vendors presales support team and one of their Known Good configurations support will be a LOT easier to get from them.
Microway have been doing this a long time (Score:1)

by Tesseractic ( 1890790 ) writes:

You could do a lot worse than to buy one of these:
http://www.microway.com/produc... [microway.com]
Microway have been supplying high quality, high performance systems for decades and
they should have figured out how to do it right by now. If I had the spare money, it's what
I would choose.
For your CFD software, you might consider the open source OpenFOAM system.
Be careful with the memory subsystem to select DIMMs that will run at the maximum
rate of the system. Quad-rank DIMMs typically run slower. This may mean you can't
u
My Experience (Score:1)

by psgodwin ( 4192491 ) writes:

I was once tasked with the same scenario when I was an Engineering IT Manager for an aerospace startup. I would first ask the definition of "Bang for the Buck". Are we referring to Max TFLOPs/Hardware cost, HPC Utilization/Total cost of ownership, Value to business (Time to market, minimizing prototyping and tooling costs, material optimization) / Total cost of ownership, or something else entirely? The best bang for the buck is usually to use a managed cloud HPC provider but based on your post it sounds
- Re: (Score:3, Funny)
  
  by lenart ( 582259 ) writes:
  
  Save your money and use it to move somewhere without Fag Marriage.
  You can marry a cigarette where you live?
  - Re: (Score:2)
    
    by funwithBSD ( 245349 ) writes:
    
    Or a bundle of sticks.
  - - Re: (Score:1)
      
      by KGIII ( 973947 ) writes:
      
      ...leave this HPC shit to people that know how to cover there ass.
      You know, this HPC stuff really does not require the gay. That is optional.
- Re: (Score:1)
  
  by AlabamaCajun ( 2710177 ) writes:
  
  Time to bump an AC for a good solution.
  At 100 to 150 US Dollars the parallella provides 18 cores per Raspberry pi sized board.
  Run them with the deadhead distro and use a small Linux PC for the head.
  Speed is impressive for the cost.
- Re: (Score:2)
  
  by pnutjam ( 523990 ) writes:
  
  There is some truth in this, but it will really hurt you in ongoing support and any upgrade paths. Although, most HPC's should be run in a silo'ed environment and replaced rather then upgraded piecemeal.
- Re: (Score:2)
  
  by pnutjam ( 523990 ) writes:
  
  Yeah, this way he can burn through his $50k in a month or two.

There may be more comments in this discussion. Without JavaScript enabled, you might want to turn on Classic Discussion System in your preferences instead.

Look for other users of the S/W for advice (Score:5, Insightful)

Re:Look for other users of the S/W for advice (Score:5, Insightful)

Re: (Score:1)

Re: (Score:1)

Re:Look for other users of the S/W for advice (Score:4, Interesting)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:1)

Re: Look for other users of the S/W for advice (Score:2)

Re: (Score:2)

Ah - the "why is the sky blue" question (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2, Insightful)

Re: (Score:2)

Re: (Score:3)

Re: (Score:1)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Supercomputers are very workload specific (Score:2)

Re: (Score:1)

Re: (Score:2)

Get some quotes (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Why not rent the time? (Score:4, Informative)

Haswell-EP Xeons (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Amazon AWS (Score:5, Interesting)

Re: (Score:3)

Re: (Score:1)

" There seems to be precious little..." (Score:2)

See what you can do with leasing or cloud (Score:3)

small cluster: performance/price metric (Score:5, Interesting)

You are going to need more detail. (Score:2)

What can the software you will use do? (Score:2)

Real HPC next to your desk (Score:2)

Good DIY Option (Score:1)

Raspberry Pi cluster (Score:2)

Core considerations (Score:2)

Re: (Score:2)

What CFD software, Interconnect, Storage.... (Score:1)

*LOTS* of info on the net (Score:3)

GPU (Score:2)

Re: (Score:2)

Re: (Score:2)

Easy (Score:2)

project destined for disaster (Score:2)

Re: (Score:2)

Red Barn (Score:2)

You need to gather more info (Score:2)

GPUs? (Score:2)

If you go Supermicro (Score:1)

Re: (Score:1)

Re: (Score:1)

Obligatory car analogy (Score:1)

Useful links (Score:1)

Re: (Score:2)

Yet another voice of experience (Score:2)

Re: (Score:2)

Silicon Mechanics! (Score:2)

Re: (Score:2)

Well ... it depends ... (Score:2)

Re: (Score:2)

Depends on a lot of things (Score:2)

What about the interconnect? (Score:2)

The machines may need to be heterogeneous (Score:2)

LOTS of info on the net (Score:3)