<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Head in the clouds</title>
	<atom:link href="http://www.politigenomics.com/2010/01/head-in-the-clouds.html/feed" rel="self" type="application/rss+xml" />
	<link>http://www.politigenomics.com/2010/01/head-in-the-clouds.html</link>
	<description>Politics, Information Technology, and Genomics</description>
	<lastBuildDate>Fri, 06 Aug 2010 21:21:45 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
	<item>
		<title>By: Clive G. Brown</title>
		<link>http://www.politigenomics.com/2010/01/head-in-the-clouds.html/comment-page-1#comment-15755</link>
		<dc:creator>Clive G. Brown</dc:creator>
		<pubDate>Sun, 17 Jan 2010 19:23:04 +0000</pubDate>
		<guid isPermaLink="false">http://www.politigenomics.com/?p=1819#comment-15755</guid>
		<description>Well. I&#039;m a skeptic - and proud!</description>
		<content:encoded><![CDATA[<p>Well. I&#8217;m a skeptic &#8211; and proud!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Bob Carpenter</title>
		<link>http://www.politigenomics.com/2010/01/head-in-the-clouds.html/comment-page-1#comment-15726</link>
		<dc:creator>Bob Carpenter</dc:creator>
		<pubDate>Wed, 13 Jan 2010 00:15:52 +0000</pubDate>
		<guid isPermaLink="false">http://www.politigenomics.com/?p=1819#comment-15726</guid>
		<description>If you can get away with a single desktop workstation for analyses, by all means go for it.  That&#039;s exactly what we do at work.  I&#039;d recommend a spare workstation, as you don&#039;t want your desktop computing getting jammed by jobs hammering away multi-threaded.  

Admittedly, if you can do all your sysadmin work yourself, and you have time to spare, it&#039;s a very different issue than for those of us whose time is already overbooked.

Clusters just up the ante.  My wife has to run jobs that take hours across dozens of compute nodes, and she often runs and reruns lots of them at the same time for different projects and analyses.  The work just won&#039;t fit on a desktop workstation.

I believe part of the argument in this post involves the same quantitative mistake people make to get themselves deep into debt: one more purchase can&#039;t hurt, can it?   

If it&#039;s really no trouble for sysadmins to manage &quot;just one more machine&quot; or &quot;one more core&quot;, then why ever have more than one sysadmin?  In fact, if a 12-year-old can do it, why have a sysadmin at all?

Have you ever had Dell out for repairs or sat in their phone queue?  It&#039;s not fun, even if they do come out pretty quickly.  Our Dells broke down all the time when we were using them for speech processing and telephony at SpeechWorks.  It&#039;s at least half a day&#039;s lost work if you need to wait for the machine to be repaired.

You may not be paying heating/cooling costs for your machines or even space or maybe even sysadmins (different institutions do this differently), and that can seriously change a decision point.  CPUs draw at least 25 watts these days, and cheaper quad cores draw around 80 watts or more per CPU. There&#039;s also memory, disks, heating and cooling the server room, etc.  It adds up.  So much so that your power circuits may need to be upgraded.  Again, if it&#039;s just a single workstation and you have a free, high-amp, relatively clean power circuit, no problem.  If not, at least get a UPS!

Also, there&#039;s the problem of adding the machine that breaks the camel&#039;s back in terms of heating/cooling.  This is what happened to me back in the late 1990s at Bell Labs.  We bought a bunch of SGI rack computers, mostly for speech and vision statistical processing.  Lucent was charging us a fortune for space on the Murray Hill campus (more than Manhattan in the dot-com boom), so our cheapskate lab director decided to put the new machines in the same room with the old machines.  The problem was that the cooling system couldn&#039;t keep up, temps in the machine room quickly soared over 110 degrees, and disks started failing.  So the whole thing got shut down for months while they rethought where they could put the new machines and how the lab&#039;s budget could pay for it.

That&#039;s also what just happened to my wife&#039;s cluster at NYU (which went down again this weekend).  They added new machines on the &quot;what can a few more hurt?&quot; principle without adding new cooling beyond the building&#039;s own.  Not surprisingly, they overheat all the time and get shut down.  

Typically, you create a single image on EC2 and it is shared across nodes -- you don&#039;t literally have to manage all the nodes in a job.  At least for what I&#039;ve seen people do, they run something like Hadoop in a prepackaged configuration, so there&#039;s no configuring the OS at all.

I&#039;m not sure if the cluster software for which sequencers are set up is available prepackaged on EC2 or other clouds, so that could be a huge hassle with EC2. Unless, perhaps, you have that wily 12-year-old on hand!</description>
		<content:encoded><![CDATA[<p>If you can get away with a single desktop workstation for analyses, by all means go for it.  That&#8217;s exactly what we do at work.  I&#8217;d recommend a spare workstation, as you don&#8217;t want your desktop computing getting jammed by jobs hammering away multi-threaded.  </p>
<p>Admittedly, if you can do all your sysadmin work yourself, and you have time to spare, it&#8217;s a very different issue than for those of us whose time is already overbooked.</p>
<p>Clusters just up the ante.  My wife has to run jobs that take hours across dozens of compute nodes, and she often runs and reruns lots of them at the same time for different projects and analyses.  The work just won&#8217;t fit on a desktop workstation.</p>
<p>I believe part of the argument in this post involves the same quantitative mistake people make to get themselves deep into debt: one more purchase can&#8217;t hurt, can it?   </p>
<p>If it&#8217;s really no trouble for sysadmins to manage &#8220;just one more machine&#8221; or &#8220;one more core&#8221;, then why ever have more than one sysadmin?  In fact, if a 12-year-old can do it, why have a sysadmin at all?</p>
<p>Have you ever had Dell out for repairs or sat in their phone queue?  It&#8217;s not fun, even if they do come out pretty quickly.  Our Dells broke down all the time when we were using them for speech processing and telephony at SpeechWorks.  It&#8217;s at least half a day&#8217;s lost work if you need to wait for the machine to be repaired.</p>
<p>You may not be paying heating/cooling costs for your machines or even space or maybe even sysadmins (different institutions do this differently), and that can seriously change a decision point.  CPUs draw at least 25 watts these days, and cheaper quad cores draw around 80 watts or more per CPU. There&#8217;s also memory, disks, heating and cooling the server room, etc.  It adds up.  So much so that your power circuits may need to be upgraded.  Again, if it&#8217;s just a single workstation and you have a free, high-amp, relatively clean power circuit, no problem.  If not, at least get a UPS!</p>
<p>Also, there&#8217;s the problem of adding the machine that breaks the camel&#8217;s back in terms of heating/cooling.  This is what happened to me back in the late 1990s at Bell Labs.  We bought a bunch of SGI rack computers, mostly for speech and vision statistical processing.  Lucent was charging us a fortune for space on the Murray Hill campus (more than Manhattan in the dot-com boom), so our cheapskate lab director decided to put the new machines in the same room with the old machines.  The problem was that the cooling system couldn&#8217;t keep up, temps in the machine room quickly soared over 110 degrees, and disks started failing.  So the whole thing got shut down for months while they rethought where they could put the new machines and how the lab&#8217;s budget could pay for it.</p>
<p>That&#8217;s also what just happened to my wife&#8217;s cluster at NYU (which went down again this weekend).  They added new machines on the &#8220;what can a few more hurt?&#8221; principle without adding new cooling beyond the building&#8217;s own.  Not surprisingly, they overheat all the time and get shut down.  </p>
<p>Typically, you create a single image on EC2 and it is shared across nodes &#8212; you don&#8217;t literally have to manage all the nodes in a job.  At least for what I&#8217;ve seen people do, they run something like Hadoop in a prepackaged configuration, so there&#8217;s no configuring the OS at all.</p>
<p>I&#8217;m not sure if the cluster software for which sequencers are set up is available prepackaged on EC2 or other clouds, so that could be a huge hassle with EC2. Unless, perhaps, you have that wily 12-year-old on hand!</p>
]]></content:encoded>
	</item>
</channel>
</rss>
