PolITiGenomics

Politics, Information Technology, and Genomics

Ides of March

March 15th, 2010

During the 30 minutes or so between boarding a plane and being free to use approved electronic devices, you can get a little reading done. On my last couple of trips I had a stack of somewhat dated Newsweek magazines to pore through. Fortunately, there were several good articles, which I now pass along for your consideration (and which I type up on my computer during the period I am free to use approved electronic devices with the wireless features disabled). First, a couple of articles on the topic that has been on everyone’s mind now that the Republicans are no longer “responsible” for the growing debt: Fareed Zakaria’s Defusing the Debt Bomb talks about several concrete measures that can be taken to reduce the debt, and The Real Greek Tragedy talks about why it is important to do that. Bringing a dose of reality to the debt issue is We the Problem, which talks about why the US Congress will not enact any of the needed changes (the author only gets it half right by blaming the people; lobbyists are part of the equation too). Shifting topics to the “partisan gridlock” in Washington DC, Ezra Klein’s Stay Out Of It, Mr. President discusses how the mere act of the President, any President, supporting some legislative agenda tees it up for the opposition party to, well, oppose it. This opposition occurs even when there is not much substantive difference between the two parties’ stances on the issue or when significant proposals of the opposition party have been included in the bill (giving credence to Mr. Klein’s thesis is the fact that Republicans no longer support their proposals from the 1993 health care debate that are in the current bill). The actual distance between Republicans and Democrats on issues is discussed in How the GOP Sees It. Finally, Google’s Orwell Moment discusses Google’s flubbed rollout of Google Buzz and why things like that should concern you. I like the Newsweek article because it actually uses Orwell’s name in an appropriate reference to 1984.
Most references to 1984 use terms like “Big Brother” in a pejorative way, e.g., “another example of Big Brother watching you.” But what is most powerful about 1984 is that people saw the hyper-surveilling, truth-manipulating government not as an intrusive presence in their lives, but as a comforting one. The vast majority of people saw the government as something that brought benefits (peace and stability) and were more than happy to trade some small, meaningless rights for these benefits. What rights are you willing to trade for the benefits of social networking?


What the Crisis Nursery does

March 11th, 2010

Below is a nice interview with DiAnne Mueller, CEO of the St. Louis Crisis Nursery, talking about what the Crisis Nursery does and how you can help.


Gathering cloud at XGen

March 10th, 2010

If you are going to be at XGen next week and you are interested in cloud computing and its application to bioinformatics, be sure to stop and participate in the Cloud Computing in Bioinformatics discussion I will be “facilitating” on Wednesday morning (March 17). My talk is at 3:05 p.m. PT on Tuesday and I will be chairing the first session on Monday (if my plane is on time and the taxi is fast enough).


New data center approved

March 10th, 2010

The Genome Center recently received word that its grant proposal for a data center was approved (St. Louis Business Journal). The $14.3 million grant is funded by the National Center for Research Resources, and the money comes from ARRA. The grant, along with about $8 million from Washington University, will allow us to essentially duplicate our current data center capacity. We took possession of our current data center in May 2008 and it is already 80-90% full, so this new data center will greatly help us keep pace with all of the exciting new projects we are undertaking.


Me, in podcast form

February 24th, 2010

I recently did an interview in advance of my talk at the XGen Congress next month in San Diego. The interview is about 14 minutes and discusses our work at The Genome Center in general and more specifically the software and IT infrastructure we have created to enable the analysis of the massive amounts of sequence data we generate. The interview is available to download as part of the XGen Congress podcast series.


The Pac’s out of the bag

February 23rd, 2010

Most of you have probably already seen this, but Pacific Biosciences announced the institutions that will be getting their first ten prototype instruments (Bio-IT World, GenomeWeb, MarketWatch). The Genome Center is among the institutions that will be getting one. It looks like PacBio will indeed be the first third-generation sequencing company with instruments out in the wild. Don’t get too excited, though. It’s probable that these third-generation instruments will be a lot like the first batch of second-generation instruments: it will take a while before they are ready for production sequencing, reliably producing good quality data. We’ll find out more from all the sequencing instrument companies in the coming days at AGBT.


Next-Generation Sequencing Informatics Update

February 19th, 2010

I updated the Next-Generation Sequencing Informatics table a few weeks ago but forgot to mention it on the blog. The main update was the 50G configuration of the Illumina GA IIx. Also, the Sides & Associates blog linked to my table and referred to it as a “somewhat dated comparison of next-generation sequencing platforms.” Just to clarify, this table represents average throughput for production systems; not vendor claims about throughput, not future vaporware (and Alejandro Gutierrez corrected his description in the post once I pointed this out). As new systems come online and further improvements are made to existing platforms, the table will be updated.


Puff piece

February 16th, 2010

Why should one be skeptical of all the information touting the wonders of cloud computing? This older, in-depth piece by Gartner, Hype Cycle for Cloud Computing, 2009, lays out the reasons pretty well. But one need not spend that much time reading about it. You can simply read this much shorter piece by Jason Stowe: Is the Future Of High-Performance Computing For Life Sciences Cloudy? Reading that story, one can only get the impression that the cloud is some panacea where all computational problems are solved. In fact, the picture is so rosy that one may become suspicious. So suspicious that one may read the About the Author section at the bottom of the piece and see that Mr. Stowe happens to be CEO of a company selling cloud computing services.

Jason Stowe is the founder and CEO of Cycle Computing, a provider of high-performance computing (HPC) and open source technology in the cloud. A seasoned entrepreneur and experienced technologist, Jason attended Carnegie Mellon and Cornell Universities.

No wonder he makes cloud computing sound so attractive. No mention of the IT expertise needed to get up and running on the cloud. No mention of the software engineering needed to ensure your programs run efficiently on the cloud. It may not be apparent from his article, but a program that runs well on one or ten computers does not necessarily run well on hundreds of computers. In fact, he implies the exact opposite.

For compute clusters as a service, the math is different: Having 40 processors work for 100 hours costs the same as having 1,000 processors run for 4 hours.

It may cost the same under that scenario, but not everything scales linearly. In fact, most things don’t, and that less-than-linear scaling actually ends up making it cost more to get a shorter turnaround. This fact was clearly evident in the Crossbow paper, where it cost $52 to complete the analysis in 6.5 hours but $84 to finish it in under 3 hours (Table 4). The article fails to mention this, a marvel given that the lack of good, scalable bioinformatics tools that run well in highly parallel environments is perhaps the largest impediment to the adoption of cloud computing in bioinformatics. Of course, I am sure he will gladly sell you consulting services that will get you up and running on the cloud. In short, this looks like a shill.
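To put rough numbers on the scaling point, here is a minimal sketch that uses Amdahl’s law as a stand-in model for less-than-linear scaling. The model choice, the function names, the 0.1% serial fraction, and the $0.10 per processor-hour price are all assumptions made for illustration; none of them come from the Crossbow paper or from Mr. Stowe’s article. Only the 40-processor and 1,000-processor configurations come from the quote above.

```python
def speedup(n_procs, serial_fraction):
    """Amdahl's law: speedup on n_procs when serial_fraction of the
    work cannot be parallelized (0.0 means perfectly linear scaling)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_procs)


def wall_hours(single_proc_hours, n_procs, serial_fraction):
    """Elapsed hours for a job that would take single_proc_hours on one processor."""
    return single_proc_hours / speedup(n_procs, serial_fraction)


def cost(single_proc_hours, n_procs, serial_fraction, price_per_proc_hour):
    """You pay for every processor for the whole elapsed time."""
    hours = wall_hours(single_proc_hours, n_procs, serial_fraction)
    return n_procs * hours * price_per_proc_hour


if __name__ == "__main__":
    WORK = 4000.0   # processor-hours: 40 x 100 = 1,000 x 4 in the quoted scenario
    PRICE = 0.10    # hypothetical dollars per processor-hour

    # Compare perfect scaling with a hypothetical 0.1% serial portion.
    for serial in (0.0, 0.001):
        for procs in (40, 1000):
            hours = wall_hours(WORK, procs, serial)
            dollars = cost(WORK, procs, serial, PRICE)
            print(f"serial={serial:.3f} procs={procs:4d} "
                  f"hours={hours:7.2f} cost=${dollars:8.2f}")
```

With a serial fraction of zero, the two configurations cost the same, as the quote claims. Under this model, even a tiny serial portion makes the 1,000-processor run both slower than the promised 4 hours and noticeably more expensive than the 40-processor run, which is the same direction as the Crossbow figures cited above.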

Unfortunately, omitting information is not the only problem with many of the stories about cloud computing; many also contain misinformation. For example, the story Gathering clouds and a sequencing storm in Nature Biotechnology mentions the software engineering challenges but erroneously states

…bioinformaticians might not be willing to spend the time to familiarize themselves with hadoop, the open source program needed to process large data sets on a cloud

What?!? You do not have to develop tools using Hadoop. Sure, it is a nice platform that provides fault-tolerant parallelism, but it is by no means required by any cloud provider that I know of (not even Google, whose MapReduce framework provided the model for Hadoop!), nor is it the only way to achieve parallel processing (far from it). Amazon EC2 just provides you with a virtual machine with a basic operating system installed on it and remote access. You can do whatever you want with it after that. Google and Microsoft do require that you develop your code in their cloud framework, but you do not have to use Hadoop. For information on what you do have to do to run jobs on the major cloud providers, check out this article by Udayan Banerjee, Cloud Economics — Amazon, Microsoft, Google Compared, and each provider’s web site: Amazon AWS, Google App Engine, and Microsoft Windows Azure.

(How many bad cloud puns can I work into post titles? Stay tuned.)


In case you missed second grade

February 15th, 2010

Speaking of global climate change and snowstorms, NPR has a story this morning about how a lot of snow in Washington, DC does not contradict the theory of global climate change. For those who missed second grade, the piece contains this information.

A storm is part of what scientists classify as weather. Weather is largely influenced by local conditions and changes week to week. It’s fickle — fraught with wild ups and downs.

Climate is the long-term trend of atmospheric conditions across large regions, even the whole planet. Changes in climate are slow and measured in decades, not weeks.

Judging from the comments on the story, it seems some are not swayed by facts and logic. I am sure their objections are based on sound scientific inquiry and not politically motivated.


Seeing double

February 12th, 2010

It seems there is a shortage of news satire ideas. Two days ago, The Daily Show and The Colbert Report each had similar pieces on global climate change.

[Video: The Colbert Report, “We’re Off to See the Blizzard” (www.colbertnation.com)]

Rachel Maddow also had similar sentiments.

(For the slow learners out there, climate and weather are not the same thing. You were supposed to have learned this in second grade.)

Then, again, last night there were similar similarities (yes, that’s intentional) between The Daily Show and The Colbert Report’s reports on the response of Republicans (and Admiral Ackbar) to President Obama’s invitation to participate in a televised bipartisan summit on health care reform.

[Video: The Colbert Report, “The Word – Political Suicide” (www.colbertnation.com)]

Well, it’s funny anyway.