<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>PolITiGenomics &#187; FLOSS</title>
	<atom:link href="http://www.politigenomics.com/tag/floss/feed" rel="self" type="application/rss+xml" />
	<link>http://www.politigenomics.com</link>
	<description>Politics, Information Technology, and Genomics</description>
	<lastBuildDate>Thu, 21 Apr 2011 17:49:06 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Bioinformatics and cloud computing</title>
		<link>http://www.politigenomics.com/2009/11/bioinformatics-and-cloud-computing.html</link>
		<comments>http://www.politigenomics.com/2009/11/bioinformatics-and-cloud-computing.html#comments</comments>
		<pubDate>Tue, 24 Nov 2009 19:54:22 +0000</pubDate>
		<dc:creator>dd</dc:creator>
				<category><![CDATA[genomics]]></category>
		<category><![CDATA[IT]]></category>
		<category><![CDATA[cloud]]></category>
		<category><![CDATA[compute]]></category>
		<category><![CDATA[FLOSS]]></category>
		<category><![CDATA[Illumina]]></category>
		<category><![CDATA[informatics]]></category>
		<category><![CDATA[software]]></category>

		<guid isPermaLink="false">http://www.politigenomics.com/?p=1728</guid>
		<description><![CDATA[From the Using clouds for parallel computations in systems biology workshop at the recent SC09 conference (Informatics Iron writeup) to last month&#8217;s Genome Informatics meeting, everyone in bioinformatics is talking about cloud computing these days. Last week Steven Salzberg&#8217;s group published a paper on their Crossbow tool entitled Searching for SNPs with cloud computing (Cloudera [...]]]></description>
			<content:encoded><![CDATA[<p>From the <a href="http://www.mcs.anl.gov/events/workshops/sc09-sysbio/index.php">Using clouds for parallel computations in systems biology</a> workshop at the recent <a href="http://sc09.supercomputing.org/">SC09 conference</a> (<a href="http://www.genomeweb.com/blog/cloud-bio-computing-sc09">Informatics Iron writeup</a>) to last month&#8217;s <a href="http://www.genomeweb.com/informatics/genome-informatics-speakers-say-second-gen-sequencing-makes-giddy-times-bioinfor">Genome Informatics meeting</a>, everyone in bioinformatics is talking about cloud computing these days. Last week <a href="http://genome.fieldofscience.com/">Steven Salzberg</a>&#8217;s <a href="http://www.cbcb.umd.edu/~salzberg/">group</a> published a paper on their Crossbow tool entitled <a href="http://genomebiology.com/2009/10/11/R134">Searching for SNPs with cloud computing</a> (<a href="http://www.cloudera.com/blog/2009/10/15/analyzing-human-genomes-with-hadoop/">Cloudera blog post on Crossbow</a>). In the paper the authors describe how they were able to analyze the human sequence data <a href="http://www.nature.com/nature/journal/v456/n7218/abs/nature07484.html">published last year by BGI</a> using <a href="http://aws.amazon.com/ec2/">Amazon EC2</a>.  Specifically, they have developed an alignment (<a href="http://bowtie-bio.sourceforge.net/index.shtml">bowtie</a>) and SNP detection (<a href="http://soap.genomics.org.cn/soapsnp.html">SoapSNP</a>) pipeline that is executed in parallel across a cluster using the <a href="http://hadoop.apache.org/">Hadoop</a> framework (a <a href="http://fsf.org/">free software</a> implementation of <a href="http://labs.google.com/papers/mapreduce.html">Google&#8217;s MapReduce</a> framework).  Using a 40-node, 320-core EC2 cluster, they were able to analyze 38&times; coverage sequence data in about three hours. 
The whole analysis, including data transfer and storage on <a href="http://aws.amazon.com/s3/">Amazon S3</a>, cost about $125. You can find a more detailed cost breakdown and comparison on Gary Stiehr&#8217;s <a href="http://hpcinfo.com/2009/11/22/benchmarking-the-cloud-for-genomics/">HPCInfo post</a> and more detail on the SNP detection on Dan Koboldt&#8217;s <a href="http://www.massgenomics.org/2009/11/crossbow-ngs-informatics-in-the-cloud.html">Mass Genomics post</a>.</p>
<p>For analyzing a single genome, you really can&#8217;t beat that price.  Of course, at the rate next-generation sequencing instruments are generating data, most people are not going to want to analyze just one genome. So the question becomes, what is the break-even point? That is, how many genomes do you have to sequence to make buying compute resources cheaper than renting them from Amazon? We currently estimate that the fully loaded (node, chassis, rack, networking, etc.) cost of a single computational core is about $500. Thus, purchasing 320 cores would cost you about $160,000.  It&#8217;s going to take a lot of genomes (1,280) to hit that break-even point. But do you really need to analyze a genome in three hours? With the current per-run throughput of a single Illumina GA IIx, it would take about four ten-day runs (40 days) to generate 38&times; coverage of a human genome. After each run, you could align the sequence data from that run. Each lane of data would take 8-12 core&middot;hours to align, so a whole run&#8217;s (eight lanes&#8217;) worth of data would take about 80 core&middot;hours. Therefore, even if you had just one core, you could align all the data before the next run completed. The consensus calling and variant detection portions of the pipeline typically take a handful of core&middot;hours and therefore do not change the economics; they too can be completed before the first run of the next genome is completed. Thus, with a $500 investment in computational resources, you can more than keep pace with the Illumina instrument. Note that I am completely excluding the cost of storage, as that will be needed for the data and results regardless of where the computation is done. Of course, you probably wouldn&#8217;t buy just <em>one</em> core. 
Checking over at the <a href="http://www.dell.com/us/en/highered/df.aspx?refid=df&#038;s=hied&#038;cs=RC956904&#038;~ck=mn">Dell Higher Education web site</a>, you can get a Quad Core Precision T3500n with 4 GiB of RAM (more RAM per core than the <a href="http://aws.amazon.com/ec2/#instance">Amazon EC2 Extra Large Instance</a> used in the paper) and 750 GB local storage capacity (about the same storage per core as the Extra Large Instance) for $1700. You would need less than one core&#8217;s worth (25%) of that workstation&#8217;s capacity dedicated to alignment of and variant detection on data from a single Illumina GA IIx (thanks to <a href="http://en.wikipedia.org/wiki/Burrows-Wheeler_transform">Burrows-Wheeler Transform</a> aligners like bowtie and <a href="http://bio-bwa.sourceforge.net/">bwa</a>). Using the single-core numbers, the break-even point for purchase versus cloud is less than five whole genomes. Using the entire cost of the Dell workstation (even though you require less than 25% of its computational capacity), the break-even point is about 14 genomes. It would take about 1.5 years (about half the expected life of IT hardware) at current throughput to sequence 14 genomes with a single Illumina GA IIx. At data rates expected in January 2010, it would take less than a year to break even.</p>
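<p>The break-even arithmetic above can be sketched in a few lines of Python. This is only an illustration that uses the figures quoted in this post ($125 per genome in the cloud, $500 per core, a $1700 workstation, four ten-day runs per genome, and about 80 core&middot;hours of alignment per run); the function and constant names are made up for this example.</p>
<pre>
```python
import math

# Figures quoted in the post; all costs in USD.
CLOUD_COST_PER_GENOME = 125.0   # EC2 plus S3 cost reported for one genome
COST_PER_CORE = 500.0           # fully loaded estimate for one purchased core
WORKSTATION_COST = 1700.0       # quad-core Dell Precision T3500n
CLUSTER_CORES = 320             # size of the EC2 cluster used in the paper

RUN_DAYS = 10                   # length of one Illumina GA IIx run
RUNS_PER_GENOME = 4             # runs needed for 38x coverage
ALIGN_CORE_HOURS_PER_RUN = 80   # eight lanes at roughly 8-12 core-hours each


def break_even_genomes(purchase_cost, cloud_cost=CLOUD_COST_PER_GENOME):
    """Smallest whole number of genomes at which buying costs no more than renting."""
    return math.ceil(purchase_cost / cloud_cost)


def days_to_sequence(genomes):
    """Instrument days needed to sequence that many genomes on one GA IIx."""
    return genomes * RUNS_PER_GENOME * RUN_DAYS


def core_utilization():
    """Fraction of a single core kept busy aligning one instrument's output."""
    return ALIGN_CORE_HOURS_PER_RUN / (RUN_DAYS * 24)


if __name__ == "__main__":
    print("320-core cluster:", break_even_genomes(CLUSTER_CORES * COST_PER_CORE))
    print("single core:", break_even_genomes(COST_PER_CORE))
    print("workstation:", break_even_genomes(WORKSTATION_COST))
    print("days for 14 genomes:", days_to_sequence(14))
    print("core utilization:", round(core_utilization(), 2))
```
</pre>
<p>With these inputs the sketch gives 1280 genomes for the 320-core cluster, 4 for a single core, and 14 for the workstation, and 14 genomes correspond to 560 instrument days, or roughly 1.5 years. Alignment keeps a single core busy about a third of the time.</p>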
<p>These numbers indicate that unless you are just sequencing a few genomes, you are probably better off purchasing a (possibly single node) cluster. With the proliferation of sequencing applications and publications in the last couple of years, not many researchers will fall into the &#8220;few genomes&#8221; bin. Our experience has been that the more sequencing data people get, the more they want. Another way to look at this is that the entire computational hardware cost for analysis (<$1700) is less than 1% of the sequencing instrument cost; or the computational cost to analyze a whole genome (<$500) is less than 1% of the total data generation costs (reagents, flow cells, instrument depreciation, technician time, etc.). None of this is to say that there is no place for cloud and other distributed computing frameworks in bioinformatics, but that's the topic of a future post.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.politigenomics.com/2009/11/bioinformatics-and-cloud-computing.html/feed</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Download sequence data fast</title>
		<link>http://www.politigenomics.com/2009/07/download-sequence-data-fast.html</link>
		<comments>http://www.politigenomics.com/2009/07/download-sequence-data-fast.html#comments</comments>
		<pubDate>Thu, 16 Jul 2009 21:22:47 +0000</pubDate>
		<dc:creator>dd</dc:creator>
				<category><![CDATA[genomics]]></category>
		<category><![CDATA[IT]]></category>
		<category><![CDATA[FLOSS]]></category>
		<category><![CDATA[informatics]]></category>

		<guid isPermaLink="false">http://www.politigenomics.com/?p=1363</guid>
		<description><![CDATA[More and more data are being submitted to the NCBI Short Read Archive (SRA). So you may ask yourself, &#8220;How am I going to download all that data?&#8221; Well, as luck would have it, you can download it using the same high-speed network protocol that we use to upload it, Aspera. You can download the [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.asperasoft.com/download/sw/connect/AsperaConnect"><img alt="" src="http://www.asperasoft.com/images/pic_th/icon_connect.png" title="Aspera Connect" class="alignright" width="76" height="70" /></a></p>
<p>More and more data are being submitted to the <a href="http://www.ncbi.nlm.nih.gov/Traces/sra/sra.cgi">NCBI Short Read Archive (SRA)</a>. So you may ask yourself, &#8220;How am I going to download all that data?&#8221; Well, as luck would have it, you can download it using the same high-speed network protocol that we use to upload it, <a href="http://www.asperasoft.com/">Aspera</a>. You can <a href="http://www.asperasoft.com/download/sw/connect/AsperaConnect">download the Aspera Connect browser plugin</a> (it is offered at no cost, but sadly is not <a href="http://www.gnu.org/philosophy/free-sw.html">free</a>), install it, and then begin downloading data at near line speed in no time. Of course, if your line speed is not so hot, Aspera cannot help you much.</p>
<p>In a previous post, I mentioned some of the <a href="http://www.politigenomics.com/2008/05/n-genomes.html">difficulties in using the Aspera scp client</a>. The president of Aspera, Michelle Munson, posted a <a href="http://www.politigenomics.com/2008/05/n-genomes.html?comment-9839">good retort to my musings</a>, which I reproduce below for ease of viewing. Basically, to avoid problems, don&#8217;t allow Aspera scp to transfer data faster than your system can provide it. If you do so, I can report that Aspera scp behaves quite reliably (and still speedily). Well, except for the time NCBI overrode <em>on the server side</em> the bandwidth limit we set <em>on the client side</em>, increasing it beyond what the back-end disk systems were happy with. After contacting NCBI, we were told they wouldn&#8217;t do that any more.</p>
<p>Here is Ms. Munson&#8217;s comment.</p>
<blockquote><p>Hello all,</p>
<p>On the use of Aspera Scp, the stalling behavior described is a result of artificially induced heavy packet loss for the FASP protocol, usually due to setting a target transfer rate that significantly exceeds the throughput to the storage system on the receiver side. The other cause is bandwidth shaping/artificial dropping of UDP traffic along the transmission path.</p>
<p>The Aspera transfer logs (routed to syslog on Unix systems) have detailed statistics that we can interpret for you which will indicate the root cause.</p>
<p>Assuming that the receiver side I/O throughput is overdriven, you can verify this for yourselves by running a 3rd party disk benchmarking utility such as bonnie++. Use bonnie to measure the write throughput for blocks of 64K and 1 MB (Aspera software uses a configurable block size, 64K by default).</p>
<p>Once you know the disk throughput bottleneck, you can either set a target rate that does not exceed, or better yet, as of our 2.2 release (available as of April 2009) you can configure on the storage rate control option, which will automatically adapt the transmission rate to the storage throughput. This is much like network congestion control extended to the storage systems (a patent-pending innovation by our company).</p>
<p>If you have any questions or problems on the above, be glad to help over here at Aspera. You can reach us at <a href="mailto:support@asperasoft.com">support@asperasoft.com</a> or email me directly, <a href="mailto:michelle@asperasoft.com">michelle@asperasoft.com</a>.</p>
<p>Thank you,<br />Michelle Munson<br />President, Aspera, Inc.</p>
</blockquote>
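<p>If you do not have bonnie++ handy, a rough stand-in for the measurement suggested above might look like the following Python sketch. It times sequential writes at the two block sizes mentioned in the comment (64K and 1 MB) and derives a conservative target rate from the slower result. It does not call any Aspera software; the function names, the 8 MiB test size, and the 80% headroom factor are all assumptions chosen for illustration, and a real benchmark would write far more data than this.</p>
<pre>
```python
import os
import tempfile
import time


def write_throughput_mb_per_s(directory, block_size, total_bytes=8 * 1024 * 1024):
    """Time sequential writes of fixed-size blocks and return megabytes per second."""
    block = b"\0" * block_size
    blocks = max(1, total_bytes // block_size)
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as handle:
            for _ in range(blocks):
                handle.write(block)
            handle.flush()
            # Force the data to disk so the page cache does not flatter the result.
            os.fsync(handle.fileno())
        elapsed = max(time.perf_counter() - start, 1e-9)
    finally:
        os.remove(path)
    return (blocks * block_size) / elapsed / 1e6


def suggested_target_rate_mbps(directory, headroom=0.8):
    """Slowest measured write rate, in megabits per second, scaled by a safety margin."""
    slowest = min(
        write_throughput_mb_per_s(directory, size)
        for size in (64 * 1024, 1024 * 1024)
    )
    return slowest * 8 * headroom


if __name__ == "__main__":
    rate = suggested_target_rate_mbps(tempfile.gettempdir())
    print("suggested target rate (Mbps):", round(rate))
```
</pre>
<p>Point the <code>directory</code> argument at the file system that will receive the downloads, since that is the storage whose throughput limits the transfer.</p>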
]]></content:encoded>
			<wfw:commentRss>http://www.politigenomics.com/2009/07/download-sequence-data-fast.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Cancer Genomics Tools</title>
		<link>http://www.politigenomics.com/2009/07/cancer-genomics-tools.html</link>
		<comments>http://www.politigenomics.com/2009/07/cancer-genomics-tools.html#comments</comments>
		<pubDate>Wed, 15 Jul 2009 19:44:18 +0000</pubDate>
		<dc:creator>dd</dc:creator>
				<category><![CDATA[genomics]]></category>
		<category><![CDATA[FLOSS]]></category>
		<category><![CDATA[health]]></category>
		<category><![CDATA[informatics]]></category>
		<category><![CDATA[science]]></category>
		<category><![CDATA[wustl]]></category>

		<guid isPermaLink="false">http://www.politigenomics.com/?p=1340</guid>
		<description><![CDATA[The folks in our Medical Genomics group have compiled a list of bioinformatics tools that are useful in the field of cancer genomics. The tools, mostly focused on next/second-generation sequence data, are organized by their purposes (sequence alignment, variant detection, annotation/pathway, significance analysis, group significance analysis, gene expression clustering, clinical correlation, cross-tumor analysis, and graphical/visualization). [...]]]></description>
			<content:encoded><![CDATA[<p>The folks in our Medical Genomics group have compiled a list of <a href="http://genome.wustl.edu/tools/cancer-genomics">bioinformatics tools that are useful in the field of cancer genomics</a>. The tools, mostly focused on next/second-generation sequence data, are organized by their purposes (sequence alignment, variant detection, annotation/pathway, significance analysis, group significance analysis, gene expression clustering, clinical correlation, cross-tumor analysis, and graphical/visualization). A lot of the tools were developed at <a href="http://genome.wustl.edu/">The Genome Center</a>, but they have also included tools from many different investigators that we find useful. Many, but unfortunately not all, of the tools are free/open source software.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.politigenomics.com/2009/07/cancer-genomics-tools.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>OSCON lead up</title>
		<link>http://www.politigenomics.com/2009/07/oscon-lead-up.html</link>
		<comments>http://www.politigenomics.com/2009/07/oscon-lead-up.html#comments</comments>
		<pubDate>Mon, 13 Jul 2009 17:49:14 +0000</pubDate>
		<dc:creator>dd</dc:creator>
				<category><![CDATA[genomics]]></category>
		<category><![CDATA[IT]]></category>
		<category><![CDATA[FLOSS]]></category>
		<category><![CDATA[health]]></category>
		<category><![CDATA[informatics]]></category>
		<category><![CDATA[OSCON]]></category>
		<category><![CDATA[science]]></category>
		<category><![CDATA[software]]></category>
		<category><![CDATA[wustl]]></category>

		<guid isPermaLink="false">http://www.politigenomics.com/?p=1323</guid>
		<description><![CDATA[Last week I did an interview with James Turner at O&#8217;Reilly about my upcoming talk at OSCON. It turns out that James is a bit of a genomics nut and therefore had a lot of insightful questions about the current state of genomics and health. Hopefully my responses were as good as his questions. You [...]]]></description>
			<content:encoded><![CDATA[<p>Last week I did an interview with <a href="http://radar.oreilly.com/jamest/">James Turner</a> at <a href="http://oreilly.com/">O&#8217;Reilly</a> about my <a href="http://www.politigenomics.com/2009/04/oscon-2009.html">upcoming talk at OSCON</a>. It turns out that James is a bit of a genomics nut and therefore had a lot of insightful questions about the current state of genomics and health. Hopefully my responses were as good as his questions. You can judge for yourself by listening to the interview or reading the transcript: <a href="http://radar.oreilly.com/2009/07/sequencing-a-genome-a-week.html">Sequencing a Genome a Week</a>.</p>
<p><strong>Update:</strong> The story has been posted on <a href="http://science.slashdot.org/story/09/07/13/2129229/Sequencing-a-Human-Genome-In-a-Week">Slashdot</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.politigenomics.com/2009/07/oscon-lead-up.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Illumina cluster needs</title>
		<link>http://www.politigenomics.com/2009/06/illumina-cluster-needs.html</link>
		<comments>http://www.politigenomics.com/2009/06/illumina-cluster-needs.html#comments</comments>
		<pubDate>Thu, 18 Jun 2009 16:13:36 +0000</pubDate>
		<dc:creator>dd</dc:creator>
				<category><![CDATA[genomics]]></category>
		<category><![CDATA[IT]]></category>
		<category><![CDATA[compute]]></category>
		<category><![CDATA[FLOSS]]></category>
		<category><![CDATA[Illumina]]></category>
		<category><![CDATA[LSF]]></category>
		<category><![CDATA[software]]></category>

		<guid isPermaLink="false">http://www.politigenomics.com/?p=1254</guid>
		<description><![CDATA[There is an interesting thread over at the Solexa Google Group about the IT infrastructure needed to support an Illumina Genome Analyzer (GA). The discussion focuses mostly on the cluster and, to a lesser extent, the storage and network required to operate the instrument and generate sequence data (primary analysis). At The Genome Center, we [...]]]></description>
			<content:encoded><![CDATA[<p>There is an interesting thread over at the <a href="http://groups.google.com/group/solexa?hl=en">Solexa Google Group</a> about the <a href="http://groups.google.com/group/solexa/browse_thread/thread/38ff88dcf5f5df17?hl=en">IT infrastructure needed to support an Illumina Genome Analyzer (GA)</a>. The discussion focuses mostly on the cluster and, to a lesser extent, the storage and network required to operate the instrument and generate sequence data (primary analysis). At <a href="http://genome.wustl.edu/">The Genome Center</a>, we use Platform LSF HPC as our batch scheduler and currently use <a href="http://www.politigenomics.com/2008/03/illumina-genome-analyzer-pipeline-and.html">lsgmake-gap</a> to execute the GAPipeline (the Illumina primary analysis software) in parallel on our cluster. However, GAPipeline is developed and tested by Illumina on a cluster managed by <a href="http://www.sun.com/software/sge/">Sun Grid Engine (SGE)</a>, which is <a href="http://gridengine.sunsource.net/">free/open source software</a>. This situation creates some headaches for us because as the internals of GAPipeline change, we need to <a href="http://www.politigenomics.com/2009/02/lsgmake-gap-for-gapipeline-13.html">regularly update lsgmake-gap</a> so that GAPipeline will continue to run properly on our cluster. Several years ago when we migrated to LSF, the driving force for the selection of LSF was that it was the only batch scheduler that could handle scheduling 50,000+ jobs at a time (a regular occurrence on our cluster). Fortunately, users may no longer have to choose between scalability and ease of use when running GAPipeline as part of their larger computational needs. Chris Dagdigian, who writes the <a href="http://gridengine.info/">gridengine.info blog</a>, had this information about the current capabilities of SGE.</p>
<blockquote><ol>
<li>SGE 6.2 design goal includes supporting a single array job with 500,000 tasks and hundreds of thousands of concurrent jobs</li>
<li>People have been running hundreds of thousands of SGE jobs per week since the SGE 5.3 days many years ago</li>
<li>I personally know of several sites pushing hundreds of thousands of heavy SGE jobs per week through their systems right now</li>
<li>SGE 6.2 runs a 62,000 core cluster in Texas (RANGER) and has been for some time</li>
</ol>
<p>&#8220;tens of thousands of jobs&#8221; is actually pretty easy with Grid Engine and has been for some time, scaling issues encountered in this range have more to do with bad spooling decisions, filesystem design and occasionally an overwhelmed qmaster host. The developers have worked quite a bit this year to improve threading performance, reduce memory footprints and remove things like external RSH methods that consumed system resources like filehandles and TCP ports etc.</p>
<p>This is especially evident in the SGE 6.2  and 6.2u1 release series where speed and scaling were specifically addressed as part of the design effort (6.2u3 and 6.3 will introduce new features). This is the reason why the <em>SGE scheduler is now a thread within the qmaster</em> &#8211; one of the more obvious user-visible changes made recently. (emphasis mine &#8211; dd)</p>
<p>There are many reasons why one would chose between LSF vs SGE (I have used both for years now) but scaling is not one of the significant selection factors. Features, price, APIs and quality of documentation are far more important along with community adoption/support.</p>
</blockquote>
<p>I would guess breaking out the scheduler into its own thread is a major factor in SGE&#8217;s ability to manage so many jobs. This was the major deficiency of SGE and other batch schedulers we tested at the time. Several systems designed their schedulers to automatically run through the list of jobs at certain intervals. With a lot of jobs in the queue, the scheduler would not finish its previous traversal before the new one was scheduled to start. Depending on the design implementation, this meant either that the original scheduling pass was killed and the scheduler never processed some jobs, or that scheduler threads kept spawning until the resources were exhausted on the master node (that&#8217;s bad).</p>
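<p>To make the overlapping-traversal problem concrete, here is a toy Python simulation. It is not SGE or LSF code; the function name, the tick-based clock, and the &#8220;skip if busy&#8221; guard are all hypothetical, and the guard is just one common mitigation rather than a description of what any particular scheduler does. The pass duration is also held constant here, whereas on a real master node each extra pass would slow the others down.</p>
<pre>
```python
def simulate(interval, pass_duration, ticks, skip_if_busy):
    """Return (passes_started, peak_concurrent_passes) over a number of clock ticks.

    A scheduling pass is due every interval ticks and needs pass_duration
    ticks to walk the job queue.
    """
    running = []  # finish times of passes still in flight
    started = 0
    peak = 0
    for now in range(ticks):
        # Retire passes that have finished by this tick.
        running = [end for end in running if end > now]
        due = now % interval == 0
        if due and not (skip_if_busy and running):
            running.append(now + pass_duration)
            started += 1
        peak = max(peak, len(running))
    return started, peak


if __name__ == "__main__":
    # A pass takes 35 ticks but a new one is due every 10 ticks.
    print("always spawn:", simulate(10, 35, 100, skip_if_busy=False))
    print("skip if busy:", simulate(10, 35, 100, skip_if_busy=True))
```
</pre>
<p>With a pass that takes 35 ticks and an interval of 10, the &#8220;always spawn&#8221; policy starts 10 passes with up to 4 running at once, while the guarded policy starts only 3 passes and never overlaps them.</p>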
<p>(A couple of asides here: since GAPipeline is built on Makefiles, another option that came up in the thread was parallel execution across an LSF cluster using <a href="http://distmake.sourceforge.net/pmwiki/pmwiki.php">distmake</a>. Also, because of <a href="http://hpcinfo.com/">our interest</a> in <a href="http://www.opensciencegrid.org/">grid computing</a>, we are currently investigating replacing LSF with <a href="http://www.cs.wisc.edu/condor/">Condor</a>.)</p>
<p>Of course, with the rollout of SCS2.4 with RTA (real-time analysis), most of the primary analysis is now done on the instrument control computer. Thus, all of this talk about the requirements to produce sequence from the machine is much less important. Only one stage of the pipeline, the alignment and reporting (called GERALD), is now run off the instrument computer. The most computationally intensive part of this stage is the alignment (ELAND and its post-processing), and it can only be made parallel on a per-lane basis, i.e., eight ways.</p>
<p>Of course, there is also the specter of the requirements for sequence analysis at Illumina GA IIx scale, but that&#8217;s a whole other post&hellip;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.politigenomics.com/2009/06/illumina-cluster-needs.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Learning opportunities</title>
		<link>http://www.politigenomics.com/2009/06/learning-opportunities.html</link>
		<comments>http://www.politigenomics.com/2009/06/learning-opportunities.html#comments</comments>
		<pubDate>Wed, 17 Jun 2009 21:28:07 +0000</pubDate>
		<dc:creator>dd</dc:creator>
				<category><![CDATA[IT]]></category>
		<category><![CDATA[FLOSS]]></category>
		<category><![CDATA[software]]></category>

		<guid isPermaLink="false">http://www.politigenomics.com/?p=1252</guid>
		<description><![CDATA[These links came to my attention this past weekend and I thought they might be of use to some of the readers here. First, you can access all course materials, even lectures, for the CS61A: Structure and Interpretation of Computer Programs course at UC Berkeley. The course comes highly recommended. Second, Melissa Kahney has aggregated [...]]]></description>
			<content:encoded><![CDATA[<p>These links came to my attention this past weekend and I thought they might be of use to some of the readers here. First, you can access all course materials, even lectures, for the <a href="http://inst.eecs.berkeley.edu/~cs61a/sp08/">CS61A: Structure and Interpretation of Computer Programs</a> course at UC Berkeley. The course comes highly recommended. Second, Melissa Kahney has aggregated links for a bunch of <a href="http://educhoices.org/articles/Useful_Tutorials_on_Linux_and_UNIX_for_Beginners_and_Experts_Alike.html">UNIX and GNU/Linux tutorials</a> grouped by topic and target audience (beginner and expert).</p>
]]></content:encoded>
			<wfw:commentRss>http://www.politigenomics.com/2009/06/learning-opportunities.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Great Expectations</title>
		<link>http://www.politigenomics.com/2009/06/great-expectations.html</link>
		<comments>http://www.politigenomics.com/2009/06/great-expectations.html#comments</comments>
		<pubDate>Mon, 15 Jun 2009 14:15:38 +0000</pubDate>
		<dc:creator>dd</dc:creator>
				<category><![CDATA[genomics]]></category>
		<category><![CDATA[IT]]></category>
		<category><![CDATA[FLOSS]]></category>
		<category><![CDATA[health]]></category>
		<category><![CDATA[OSCON]]></category>
		<category><![CDATA[science]]></category>
		<category><![CDATA[software]]></category>
		<category><![CDATA[wustl]]></category>

		<guid isPermaLink="false">http://www.politigenomics.com/?p=1246</guid>
		<description><![CDATA[A colleague of mine at The Genome Center pointed me to this O&#8217;Reilly Radar blog post about the talks at OSCON 2009 that Allison Randal, one of the organizers, considers highlights. Very kindly, she mentions my talk, The Freedom to Cure Cancer. I have a rough outline of the talk clanging around in my head. [...]]]></description>
			<content:encoded><![CDATA[<p>A colleague of mine at <a href="http://genome.wustl.edu/">The Genome Center</a> pointed me to this <a href="http://radar.oreilly.com/2009/06/oscon-2009-highlights-and-earl.html">O&#8217;Reilly Radar blog post about the talks at OSCON 2009 that Allison Randal, one of the organizers, considers highlights</a>. Very kindly, she mentions my talk, <a href="http://en.oreilly.com/oscon2009/public/schedule/detail/7985">The Freedom to Cure Cancer</a>. I have a rough outline of the talk clanging around in my head. Having it take shape on a slide deck is going to take some work (and a lot of time on Google image search). Hopefully, the talk will live up to the hype.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.politigenomics.com/2009/06/great-expectations.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Video of St. Louis Perl Mongers talk</title>
		<link>http://www.politigenomics.com/2009/06/video-of-st-louis-perl-mongers-talk.html</link>
		<comments>http://www.politigenomics.com/2009/06/video-of-st-louis-perl-mongers-talk.html#comments</comments>
		<pubDate>Mon, 08 Jun 2009 14:29:41 +0000</pubDate>
		<dc:creator>dd</dc:creator>
				<category><![CDATA[genomics]]></category>
		<category><![CDATA[IT]]></category>
		<category><![CDATA[FLOSS]]></category>
		<category><![CDATA[health]]></category>
		<category><![CDATA[informatics]]></category>
		<category><![CDATA[OSCON]]></category>
		<category><![CDATA[software]]></category>
		<category><![CDATA[wustl]]></category>

		<guid isPermaLink="false">http://www.politigenomics.com/?p=1192</guid>
		<description><![CDATA[Those fine folks at St. Louis Perl Mongers have posted videos of my talk this past February. Looks like I am going to have to come up with a whole new talk for OSCON.]]></description>
			<content:encoded><![CDATA[<p>Those fine folks at <a href="http://stlouisperlmongers.blogspot.com/">St. Louis Perl Mongers</a> have posted <a href="http://stlouisperlmongers.blogspot.com/2009/06/february-meeting-video.html">videos</a> of <a href="http://www.politigenomics.com/2009/02/st-louis-perl-mongers.html">my talk this past February</a>. Looks like I am going to have to come up with a whole new talk for <a href="http://www.politigenomics.com/2009/04/oscon-2009.html">OSCON</a>.</p>
<div class="widevideo"><embed src="http://blip.tv/play/AfrGdgA" type="application/x-shockwave-flash" width="500" height="297" allowscriptaccess="always" allowfullscreen="true"></embed></div>
]]></content:encoded>
			<wfw:commentRss>http://www.politigenomics.com/2009/06/video-of-st-louis-perl-mongers-talk.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>OSCON Friends and Family discount</title>
		<link>http://www.politigenomics.com/2009/06/oscon-friends-and-family-discount.html</link>
		<comments>http://www.politigenomics.com/2009/06/oscon-friends-and-family-discount.html#comments</comments>
		<pubDate>Wed, 03 Jun 2009 05:54:56 +0000</pubDate>
		<dc:creator>dd</dc:creator>
				<category><![CDATA[genomics]]></category>
		<category><![CDATA[IT]]></category>
		<category><![CDATA[FLOSS]]></category>
		<category><![CDATA[informatics]]></category>
		<category><![CDATA[OSCON]]></category>
		<category><![CDATA[science]]></category>
		<category><![CDATA[software]]></category>
		<category><![CDATA[wustl]]></category>

		<guid isPermaLink="false">http://www.politigenomics.com/?p=1148</guid>
		<description><![CDATA[Do you want to attend OSCON this year but just realized you missed the early registration deadline? Have no fear, through a friends and family discount, you can get the early registration discount (20%) through June 23. Hello David, We are extending early registration until June 23, 2009 for OSCON. A savings of $250 off [...]]]></description>
			<content:encoded><![CDATA[<p>Do you want to attend <a href="http://en.oreilly.com/oscon2009">OSCON</a> this year but just realized you missed the early registration deadline? Have no fear, through a friends and family discount, you can get the early registration discount (20%) through June 23.</p>
<blockquote><p>
Hello David,</p>
<p>We are extending early registration until June 23, 2009 for OSCON. A savings of $250 off standard registration. Because you have been accepted as a speaker for this year&#8217;s show, we would like to extend an additional savings for your friends, family and co-workers. If you have a blog, newsletter or website, please feel free to use and distribute this 20% off code: os09fos.</p>
<p>Act now while early registration is still in effect.
</p></blockquote>
<p>So head on over to the <a href="https://en.oreilly.com/oscon2009/public/register">OSCON registration site</a> and enter the discount code <code>os09fos</code>. I hope to see you there!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.politigenomics.com/2009/06/oscon-friends-and-family-discount.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>UR so beautiful to me</title>
		<link>http://www.politigenomics.com/2009/05/ur-so-beautiful-to-me.html</link>
		<comments>http://www.politigenomics.com/2009/05/ur-so-beautiful-to-me.html#comments</comments>
		<pubDate>Mon, 01 Jun 2009 03:40:29 +0000</pubDate>
		<dc:creator>dd</dc:creator>
				<category><![CDATA[genomics]]></category>
		<category><![CDATA[IT]]></category>
		<category><![CDATA[FLOSS]]></category>
		<category><![CDATA[software]]></category>
		<category><![CDATA[wustl]]></category>

		<guid isPermaLink="false">http://www.politigenomics.com/?p=1125</guid>
		<description><![CDATA[Thanks to the hard work of a whole host of developers at The Genome Center, especially Scott Smith and Tony Brummett, UR, the class framework and object-relational mapping (ORM) layer we developed and use at The Genome Center, has been released on CPAN (actually, I am a little late announcing this, Scott has already uploaded [...]]]></description>
			<content:encoded><![CDATA[<p>Thanks to the hard work of a whole host of developers at <a href="http://genome.wustl.edu/">The Genome Center</a>, especially Scott Smith and Tony Brummett, <a href="http://search.cpan.org/~sakoht/UR-0.7/lib/UR.pm">UR</a>, the <a href="http://everything2.com/title/class%2520framework">class framework</a> and <a href="http://en.wikipedia.org/wiki/Object-relational_mapping">object-relational mapping (ORM)</a> layer we developed and use at The Genome Center, has been <a href="http://search.cpan.org/search?query=UR&#038;mode=module">released on CPAN</a> (actually, I am a little late announcing this; Scott has already uploaded version 0.3). UR is the foundation of the <a href="http://www.politigenomics.com/2009/05/biology-of-genomes-poster.html">analysis pipelines I presented at the Biology of Genomes meeting at Cold Spring Harbor</a> and the software I will be presenting at <a href="http://www.politigenomics.com/2009/04/oscon-2009.html">OSCON</a> in July. As the list of modules indicates, it is quite a piece of software. It is able to interface with multiple databases simultaneously, handling cross-database joins and transactions while caching information in memory for impressive performance enhancements. If you need something more scalable and powerful than the ORMs available, i.e., you need an enterprise ORM, you should have a look. For an introduction to using UR in your environment, see the <a href="http://search.cpan.org/~sakoht/UR-0.7/lib/UR/Manual.pod">fine manual</a>. Just like <a href="http://www.perl.org/">Perl</a>, UR is released under a dual license, Artistic and <a href="http://www.gnu.org/copyleft/gpl.html">GPL</a>; you choose which one to use.</p>
<p><strong>Update:</strong> Updated links to latest released version (0.6).</p>
]]></content:encoded>
			<wfw:commentRss>http://www.politigenomics.com/2009/05/ur-so-beautiful-to-me.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

