<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>PolITiGenomics &#187; CSHL</title>
	<atom:link href="http://www.politigenomics.com/tag/cshl/feed" rel="self" type="application/rss+xml" />
	<link>http://www.politigenomics.com</link>
	<description>Politics, Information Technology, and Genomics</description>
	<lastBuildDate>Thu, 21 Apr 2011 17:49:06 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Double standard</title>
		<link>http://www.politigenomics.com/2009/06/double-standard.html</link>
		<comments>http://www.politigenomics.com/2009/06/double-standard.html#comments</comments>
		<pubDate>Fri, 05 Jun 2009 21:25:45 +0000</pubDate>
		<dc:creator>dd</dc:creator>
				<category><![CDATA[genomics]]></category>
		<category><![CDATA[blog]]></category>
		<category><![CDATA[CSHL]]></category>
		<category><![CDATA[freedom]]></category>
		<category><![CDATA[privacy]]></category>
		<category><![CDATA[science]]></category>

		<guid isPermaLink="false">http://www.politigenomics.com/?p=1161</guid>
		<description><![CDATA[Since the Biology of Genomes meeting in early May, a tempest has been brewing. It is only in this last week that this tempest has gathered enough strength that it could no longer be contained by those who have chosen to stir it up. The esteemed Daniel MacArthur blogged and tweeted from the conference. This [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://scienceblogs.com/geneticfuture/"><img src="http://www.politigenomics.com/wp-content/uploads/2009/06/genetic-future.png" alt="Genetic Future" title="Genetic Future" width="296" height="86" class="alignright size-full wp-image-1170" /></a></p>
<p>Since the <a href="http://meetings.cshl.edu/meetings/genome09.shtml">Biology of Genomes</a> meeting in early May, a tempest has been brewing. It is only in this last week that this tempest has gathered enough strength that it could no longer be contained by those who have chosen to stir it up. The esteemed Daniel MacArthur <a href="http://scienceblogs.com/geneticfuture/">blogged</a> and <a href="http://twitter.com/dgmacarthur">tweeted</a> from the conference. This apparently caught the attention of the conference organizers and <a href="http://www.genomeweb.com/">GenomeWeb</a>. As journalists, the folks at GenomeWeb are required to follow CSHL&#8217;s media rules which require that journalists get the permission of a speaker before publishing any information from her talk. GenomeWeb saw a double standard when comparing what Daniel was allowed to do and what they were allowed to do. They then contacted CSHL. The initial write-up of the gathering storm in <a href="http://blogs.sciencemag.org/scienceinsider/2009/06/cold-spring-har.html">Science Insider</a> characterized this contact as <em>complaining</em>. GenomeWeb characterized it as <a href="http://scienceblogs.com/geneticfuture/2009/06/on_the_challenges_of_conferenc.php#comment-1680264">asking CSHL for <em>clarification of their policy</em></a> (in a comment on a <a href="http://scienceblogs.com/geneticfuture/2009/06/on_the_challenges_of_conferenc.php">response posted by Daniel in his blog, Genetic Future</a>). 
Of course this attempt to, in effect, censor has only served to bring more attention to Daniel&#8217;s blog (the so-called <a href="http://en.wikipedia.org/wiki/Streisand_effect">Streisand effect</a>), and has resulted in a number of responses from other bloggers like <a href="http://www.fejes.ca/2009/06/rights-of-science-blogging.html">Anthony Fejes</a>, <a href="http://scienceblogs.com/drugmonkey/2009/06/secret_science_again.php">DrugMonkey</a>, and even <a href="http://www.genomeweb.com/blog/conferences-blogging-and-media">GenomeWeb&#8217;s Daily Scan</a>, comments (some quite passionate) on the Science Insider story, Daniel&#8217;s response, and <a href="http://friendfeed.com/sciphu/4bf7c857/on-challenges-of-conference-blogging-genetic">FriendFeed</a>, as well as a couple well-reasoned pieces on where the policy should head from here by <a href="http://scienceblogs.com/notrocketscience/2009/04/on_science_blogging_and_mainstream_science_writing.php">Ed Yong</a> and <a href="http://2020science.org/2009/06/03/to-tweet-or-not-to-tweet/">Andrew Maynard</a>. Daniel himself provides a nice <a href="http://scienceblogs.com/geneticfuture/2009/06/social_media_and_scientific_co.php">summary of it all in a follow-up post</a>. With all that sound and fury, there is not much to add on the subject other than to say I suppose I am lucky that the 500 or so emails I had to pore through each night after the meeting ended at 10:30 or 11 p.m. prevented me from posting any commentary during the meeting (well, the emails plus the fact that I knew Daniel would do a better job than me).</p>
<p>Taking a step back, there is a larger double standard at play here than the distinction between professional journalists and peddlers of new media. Many of the conclusions about whether CSHL is right to restrict any type of journalist focus on the type of conference and the expectations that type of conference creates in the minds of the presenters. At a private, invitation-only conference, no publishing. At a breaking-results conference like Biology of Genomes, get permission. At an open conference, anything goes. So then one might ask: why aren&#8217;t all conferences open? The whole notion of presenting something at a conference that operates under some understanding of respecting others&#8217; unpublished work is a bit ridiculous (this point has been made by others, along with the fact that Biology of Genomes is over-subscribed every year; getting people in the door is not a problem). But I am not even going to debate that point. The more interesting question is: why aren&#8217;t all <em>data and research</em> released rapidly and made freely available? Since the <a href="http://www.sanger.ac.uk/HGP/policy-forum.shtml">Bermuda Principles</a> were agreed to in 1996, all genome sequencing centers have submitted their data, from raw sequence data to finished sequence to assemblies to annotation, to public repositories as quickly after generation as possible. These principles were reinforced by the <a href="http://www.genome.gov/10506537">Fort Lauderdale agreement</a> in 2003, which added a provision that protected the production centers&#8217; right to first publication. But as we have seen recently, that provision of the <a href="http://genomebiology.com/2009/10/4/105">Fort Lauderdale agreement is not always enforced</a>. As sequencing has moved into medical applications, the sequencing centers have taken great pains to release human sequence data in a responsible manner, but still rapidly. 
What&#8217;s more, they now also release the detected variants fully annotated and correlated with phenotypic information in protected-access databases available to any researcher. As data that require more and more analysis and significant human curation are made rapidly available well before publication, the production centers become ever more vulnerable to getting &#8220;scooped&#8221; on their hard-won findings.</p>
<p>As Church and Hillier properly conclude in the above-referenced article:</p>
<blockquote><p>Sequence data are now easier to produce, but decisions about <em>timelines for data release, publication, and ownership</em> and standards for assembly comparison and quality assessment, as well as the tools for managing and displaying these data, need considerable attention in order to best serve the entire community. (Emphasis mine)</p></blockquote>
<p> This conclusion begets many questions. If the rapid release described in the Bermuda Principles still holds true, why does it only apply to large-scale sequencing centers? Many researchers are generating more sequence in a month than the Human Genome Project was able to produce in a year. As they continue to be allowed to perform pre-publication (as opposed to post-generation) data submission, why are they not being held to the same standard as the large-scale sequencing centers?</p>
<p>Stepping back further, does dumping all of those data, literally terabytes and terabytes, into public nucleotide repositories like the SRA and ERA as soon as they are generated still make sense? Who has the bandwidth to download and use them all? Mainly the centers that are submitting them. For human data, a single instrument run contains enough data to identify an individual. Should there not be at least some provisions in place to allow data generators to properly assess and quality control their data?</p>
<p>The human reference has been published (with a recent update to <a href="http://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/human/index.shtml">GRCh37</a>). The blueprint exists. Thus, many of the reasons underlying the conclusions of the Bermuda Principles are no longer applicable. So should those open access principles be applied more widely to other areas of biology and science at large or should they no longer apply to sequence data from a genome for which a reference exists? It is time to rethink the current policies and begin to apply them to <strong>all</strong> sequence generators. And people are doing <a href="http://www.sciencemag.org/cgi/content/full/324/5930/1000-b">just that</a>. The double standard must end.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.politigenomics.com/2009/06/double-standard.html/feed</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>Biology of Genomes Poster</title>
		<link>http://www.politigenomics.com/2009/05/biology-of-genomes-poster.html</link>
		<comments>http://www.politigenomics.com/2009/05/biology-of-genomes-poster.html#comments</comments>
		<pubDate>Wed, 13 May 2009 16:55:59 +0000</pubDate>
		<dc:creator>dd</dc:creator>
				<category><![CDATA[genomics]]></category>
		<category><![CDATA[IT]]></category>
		<category><![CDATA[CSHL]]></category>
		<category><![CDATA[informatics]]></category>
		<category><![CDATA[science]]></category>
		<category><![CDATA[software]]></category>
		<category><![CDATA[wustl]]></category>

		<guid isPermaLink="false">http://www.politigenomics.com/?p=1086</guid>
		<description><![CDATA[Several people have asked for an electronic copy of my poster, Maximizing Utility of Genome Sequence Data (pdf) (posted on the Internet Archive). As is hopefully clear from the poster, in addition to high-throughput sequencing, we now have high-throughput sequence analysis. After listening to Lynda Chin&#8217;s talk on the first evening of the conference, which [...]]]></description>
			<content:encoded><![CDATA[<div style="margin-left: 33px;"><a href="http://ia331438.us.archive.org/1/items/MaximizingUtilityOfGenomeSequenceData/ddooling-cshl-bog-2009.pdf"><img alt="CSHL Biology of Genomes poster" src="http://ia341229.us.archive.org/1/items/MaximizingUtilityOfGenomeSequenceData/ddooling-cshl-bog-2009.png" title="Maximizing Utility of Genome Sequence Data" width="400" height="400" /></a></div>
<p>Several people have asked for an electronic copy of my poster, <a href="http://www.archive.org/download/MaximizingUtilityOfGenomeSequenceData/ddooling-cshl-bog-2009.pdf">Maximizing Utility of Genome Sequence Data (pdf)</a> (posted on the <a href="http://www.archive.org/details/MaximizingUtilityOfGenomeSequenceData">Internet Archive</a>). As is hopefully clear from the poster, in addition to high-throughput sequencing, we now have high-throughput sequence analysis. After listening to <a href="http://genomic.dfci.harvard.edu/">Lynda Chin</a>&#8217;s talk on the first evening of the conference, which described the arduous process of translating a <em>single</em> putative <a href="http://www.nature.com/nature/journal/v458/n7239/box/nature07943_BX1.html">cancer driver mutation</a> to its function in the cell, one can&#8217;t help but feel we are just kicking the can down the road here. The alleviation of one bottleneck just creates another. This was the case with the PC, where after CPUs became faster and faster, other components, e.g., memory, network, and disk I/O, became bottlenecks. This has also been the case with high-throughput production sequencing. You buy more sequencers, you need more disk, then you need more CPUs to analyze all the data, and then you need to upgrade your network to move all the data around. Now in genomics, we have a situation where we are able to generate lots of data and lots of variants which may play a role in cancer. How will we be able to determine the function of all these variants?  What technologies are on the horizon that will enable high-throughput functional genomics?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.politigenomics.com/2009/05/biology-of-genomes-poster.html/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>CSHL Biology of Genomes 2009</title>
		<link>http://www.politigenomics.com/2009/04/cshl-biology-of-genomes-2009.html</link>
		<comments>http://www.politigenomics.com/2009/04/cshl-biology-of-genomes-2009.html#comments</comments>
		<pubDate>Mon, 27 Apr 2009 21:50:46 +0000</pubDate>
		<dc:creator>dd</dc:creator>
				<category><![CDATA[genomics]]></category>
		<category><![CDATA[IT]]></category>
		<category><![CDATA[1000 Genomes]]></category>
		<category><![CDATA[compute]]></category>
		<category><![CDATA[CSHL]]></category>
		<category><![CDATA[informatics]]></category>
		<category><![CDATA[science]]></category>
		<category><![CDATA[software]]></category>
		<category><![CDATA[wustl]]></category>

		<guid isPermaLink="false">http://www.politigenomics.com/?p=1059</guid>
		<description><![CDATA[I should have posted this earlier, but things have been pretty busy. In any event, I will be presenting a poster next week at the Biology of Genomes meeting at Cold Spring Harbor. The poster is entitled &#8220;Maximizing utility of genome sequence data&#8221;. Here is the abstract. Advances in DNA sequencing technologies over the past [...]]]></description>
			<content:encoded><![CDATA[<p>I should have posted this earlier, but things have been pretty busy. In any event, I will be presenting a poster next week at the <a href="http://meetings.cshl.edu/meetings/genome09.shtml">Biology of Genomes</a> meeting at <a href="http://www.cshl.edu/">Cold Spring Harbor</a>. The poster is entitled &#8220;Maximizing utility of genome sequence data&#8221;. Here is the abstract.</p>
<blockquote><p>Advances in DNA sequencing technologies over the past few years have led to data generation and processing rates that far outpace Moore&#8217;s Law and storage capacity improvements.  As a result, there will come a time when one will no longer be able to “throw more money” at the problems presented by DNA sequencing, i.e., researchers will not be able to keep pace with data generation by purchasing more and more storage and computational nodes.  Proposed sequencing platform improvements and the rapid rate of adoption of these technologies by labs large and small will only hasten the time when the old solutions will no longer apply.  The history of freely shared sequence data through the NCBI and EBI Trace Archives transforms the very difficult problem of massive sequence data generation into a problem of data generation and data sharing on a scale heretofore unimaginable.  Over the last year, several organizations, e.g., MGED, NCI, Illumina, 1000 Genomes DCC, and NHGRI, have convened meetings to discuss the problems presented by the massive amounts of data generated by next-generation sequencing technologies.  As prologue, brief overviews of these meetings will be presented along with approaches to dealing with massive data generation rates from other disciplines, e.g., high energy physics and high-resolution medical imaging.  The Genome Center at Washington University in St. Louis, due to its large-scale sequencing operation and whole-genome analysis capabilities, experiences the difficulties presented by massively-parallel sequencing platforms acutely.  To address the many challenges presented by the scale of data generation and requisite analysis, we have developed a multidisciplinary approach involving experts in biology, genomics, bioinformatics, computer science, information technology, and engineering.  
The resulting approach involves many techniques including intelligent compression and data reduction, data aging, archiving, parallelization, fault-tolerant workflows, scalable software frameworks, and multivariate/multi-genome visualization and comparison, which leverage and extend our laboratory information management system.  This approach and its application to the sequencing and analysis of cancer samples will be presented.</p></blockquote>
<p> It&#8217;s a lot to cover in 4 ft &times; 4 ft, but I&#8217;ll do my best. If you are going to be at Cold Spring Harbor, stop by and say hello.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.politigenomics.com/2009/04/cshl-biology-of-genomes-2009.html/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Short Read Archive</title>
		<link>http://www.politigenomics.com/2008/05/short-read-archive.html</link>
		<comments>http://www.politigenomics.com/2008/05/short-read-archive.html#comments</comments>
		<pubDate>Thu, 29 May 2008 18:37:01 +0000</pubDate>
		<dc:creator>dd</dc:creator>
				<category><![CDATA[genomics]]></category>
		<category><![CDATA[IT]]></category>
		<category><![CDATA[CSHL]]></category>
		<category><![CDATA[informatics]]></category>

		<guid isPermaLink="false">http://www.politigenomics.com/?p=86</guid>
		<description><![CDATA[While at Cold Spring Harbor, I attended a presentation by Valex about their format for storing data in the NCBI Short Read Archive (SRA) (I mentioned the SRA in a previous post). Rather than storing the data in the current standard for transferring massively-parallel sequencing data, SRF, they have designed a new format that builds [...]]]></description>
			<content:encoded><![CDATA[<p>While at Cold Spring Harbor, I attended a presentation by <a href="http://www.valexllc.com/">Valex</a> about their format for storing data in the <a href="http://www.ncbi.nlm.nih.gov/Traces/sra/sra.cgi">NCBI Short Read Archive (SRA)</a> (I mentioned the SRA in a <a href="http://www.politigenomics.com/2008/05/n-genomes.html">previous post</a>). Rather than storing the data in the current standard for transferring massively-parallel sequencing data, <a href="http://srf.sourceforge.net/">SRF</a>, they have designed a new format that builds on their learnings from the current <a href="http://www.ncbi.nlm.nih.gov/Traces/trace.cgi?">trace archive</a>, but tailors it to the unprecedented amount of data associated with massively-parallel sequence data (Valex are NCBI contractors who developed and maintain the trace archive). The format is a database, with a few twists. First, the database storage utilizes the file system directly rather than storing its data structured within large files. Second, the database is <a href="http://en.wikipedia.org/wiki/Column-oriented_DBMS">column oriented</a> rather than row oriented. The result is that each SRA submission is a directory on the file system and each data type is stored in a directory within the submission directory.</p>
<p>Column-oriented database architecture is not a new idea, but it does seem well suited for the SRA.  In traditional relational databases, a table has several columns of related items. For example, a &#8220;person&#8221; table might have a column for &#8220;first_name&#8221;, a column for &#8220;last_name&#8221;, and a column to hold a unique identifier (id), typically an integer. When you add an entry to the database, you add a row in the table. In the person example, you might add &#8220;Jane&#8221; in the first_name column, &#8220;Doe&#8221; in the last_name column, and a unique number in the id column. In a column-oriented database, you basically translate each column into its own two-column table, one column for the data and one for the unique identifier. So in the person example you would still have two &#8220;columns&#8221;, but each would store the appropriate name (first or last) and the id associated with that name. This architecture performs very well when you only retrieve one type of data at a time. Other advantages include that each column is fully indexed, each column compresses well (since each column contains only one type of data), and adding new data types (columns) is easy. One disadvantage is that every time you want to retrieve more than one data type, a &#8220;join&#8221; is required. So, for example, if you want to retrieve the sequence base calls and quality values for some number of reads, you will need to look up each read separately in the base call column and the base quality column. All in all, it seems the advantages outweigh the disadvantages, especially considering the likely use cases for the data in the SRA.</p>
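<p>To make the person example concrete, here is a minimal sketch in Python. It is purely illustrative and says nothing about how the SRA itself is implemented: the sample rows and the helper names <code>to_columns</code> and <code>join</code> are made up for this example. Each column becomes its own mapping from id to value, and retrieving more than one data type requires a join on the id.</p>

```python
# Row-oriented storage: one record per person (hypothetical sample data).
rows = [
    {"id": 1, "first_name": "Jane", "last_name": "Doe"},
    {"id": 2, "first_name": "John", "last_name": "Smith"},
]


def to_columns(rows, key="id"):
    """Split row records into one {id: value} mapping per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            if name == key:
                continue
            columns.setdefault(name, {})[row[key]] = value
    return columns


def join(columns, names, ids):
    """Rebuild partial rows by looking up each id in each requested column."""
    return [{"id": i, **{n: columns[n][i] for n in names}} for i in ids]


columns = to_columns(rows)

# Reading a single data type touches only one column.
first_names = list(columns["first_name"].values())

# Reading two data types requires a join across both columns.
rebuilt = join(columns, ["first_name", "last_name"], [1])
print(first_names)
print(rebuilt)
```

<p>Reading only first names never touches the last_name column, which is the case where this layout shines. The call to <code>join</code> shows the extra lookup per column that the layout costs when several data types are needed together.</p>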
<p>The best part about their design is that they are planning to release the source code used to create and maintain the SRA freely (when we spoke, they had not settled on a free/open-source license). If they actually follow through with this (they said it would take a few months to remove NCBI-specific code), it would be of great benefit to researchers working with this data, as the SRA format is likely more efficient for analyzing the data while SRF is more efficient for transferring it. If they don&#8217;t follow through, someone else will probably fill the gap, as the column-oriented architecture seems to be the right idea and its implementation need not be difficult.</p>
<p>You can find out more detailed information about the SRA format from <a href="http://www.valexllc.com/">Valex&#8217;s web site</a> which includes a link to <a href="http://www.valexllc.com/d/Short_Read_Archive_Format.pdf">documentation in PDF format</a> and the <a href="http://www.valexllc.com/d/Short_Read_Archive_solutions.pps">presentation I saw in PowerPoint format</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.politigenomics.com/2008/05/short-read-archive.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>N Genomes</title>
		<link>http://www.politigenomics.com/2008/05/n-genomes.html</link>
		<comments>http://www.politigenomics.com/2008/05/n-genomes.html#comments</comments>
		<pubDate>Fri, 09 May 2008 20:21:23 +0000</pubDate>
		<dc:creator>dd</dc:creator>
				<category><![CDATA[genomics]]></category>
		<category><![CDATA[1000 Genomes]]></category>
		<category><![CDATA[454]]></category>
		<category><![CDATA[CSHL]]></category>
		<category><![CDATA[health]]></category>
		<category><![CDATA[Illumina]]></category>
		<category><![CDATA[informatics]]></category>
		<category><![CDATA[IT]]></category>
		<category><![CDATA[science]]></category>
		<category><![CDATA[SOLiD]]></category>
		<category><![CDATA[storage]]></category>

		<guid isPermaLink="false">http://www.politigenomics.com/?p=74</guid>
		<description><![CDATA[Earlier this week there were several meetings about the 1000 Genomes Project at Cold Spring Harbor Labs. The first meeting Monday morning was about data flow and data repositories. NCBI&#8217;s Short Read Archive (SRA) and the equivalent at EBI (which should be ready in a month or two) will house all the data. The pilot [...]]]></description>
			<content:encoded><![CDATA[<p>Earlier this week there were several meetings about the <a href="http://www.1000genomes.org/">1000 Genomes Project</a> at <a href="http://www.cshl.edu/">Cold Spring Harbor Labs</a>.  The first meeting Monday morning was about data flow and data repositories.  NCBI&#8217;s <a href="http://www.ncbi.nlm.nih.gov/Traces/sra/sra.cgi">Short Read Archive (SRA)</a> and the equivalent at <a href="http://www.ebi.ac.uk/">EBI</a> (which should be ready in a month or two) will house all the data.  The <a href="http://www.politigenomics.com/2008/03/1000-genomes.html">pilot projects</a> for the 1000 Genomes Project started less than two months ago and have already generated as much sequence data as half of the entire <a href="http://www.ncbi.nlm.nih.gov/Traces/trace.cgi">trace archive</a> (which contains the sequence data for all publicly funded genome projects over the last 10 years).    In other words, this project is going to generate a <span style="font-weight:bold;">lot</span> of sequence data (not to mention all the data generated by analysis of the sequence).  Paul Flicek from EBI estimates the pilot projects alone will generate about 1 PB (1,000,000 GB) of sequence data.  Moving that much data from site to site will be a challenge.  Normal solutions, e.g., FTP, rsync, and shipping hard drives, can&#8217;t seem to keep up with the data generation rates.  NCBI, EBI, and the sequencing centers are testing a high-speed data transfer solution called <a href="http://www.asperasoft.com/products/scp/index.html">Aspera scp</a>.  It has impressive transfer rates, but seems to stall after a while for no discernible reason.  We&#8217;ll see if we can get it to work reliably over the coming weeks.</p>
<p>After the data flow meeting was a meeting of the 1000 Genomes Steering Committee.  The day and a half that ensued was filled with a lot of lively discussion.  When all was said and done, one thing was clear: there are a lot of questions that need to be answered.  The analysis group presented convincing results from simulations that indicated 2× coverage in a large number of individual genomes (Pilot 1) is probably not sufficient to detect the rare variants the project is going after (present in 1-2% of the population).  The simulations indicated that the power of the study to detect such variants (at a constant cost, i.e., constant total amount of sequence generated) would be greatly enhanced by sequencing half as many people at 4× coverage.  There was no firm decision on how to change the pilot (if at all), but going forward it is likely that some of the individuals in Pilot 1 will be sequenced up to 4× or even 8×.  Thus, while the project may be named 1000 Genomes, exactly how many genomes we are going to sequence is yet to be determined.</p>
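<p>The constant-cost trade-off can be sketched with a little arithmetic. The snippet below is only an illustration, not the analysis group&#8217;s simulation. The budget of 1000 genomes at 2x is a hypothetical figure, and the detection model (reads carrying one allele of a heterozygous site follow a Poisson distribution with mean depth / 2, and at least two such reads are needed) is a simplifying assumption made for this example.</p>

```python
import math


def genomes_at_fixed_budget(total_fold_coverage, depth):
    """Number of genomes that fit in a fixed sequencing budget at a given depth."""
    return total_fold_coverage // depth


def prob_allele_seen(depth, min_reads=2):
    """Chance that at least min_reads reads carry one allele of a heterozygous site.

    Assumption: reads per allele are Poisson distributed with mean depth / 2.
    """
    mean = depth / 2
    p_fewer = sum(
        math.exp(-mean) * mean**k / math.factorial(k) for k in range(min_reads)
    )
    return 1 - p_fewer


budget = 1000 * 2  # hypothetical budget: 1000 genomes at 2x coverage
for depth in (2, 4, 8):
    n = genomes_at_fixed_budget(budget, depth)
    print(depth, n, round(prob_allele_seen(depth), 3))
```

<p>Under these assumptions, doubling the depth halves the number of genomes that fit in the budget while raising the per-genome chance of seeing a variant allele, which is the tension the Steering Committee was weighing.</p>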
<p>Another issue that arose was the rapid development of the massively parallel sequencing technologies.  These platforms (454 FLX, Illumina Genome Analyzer, and AB SOLiD) increase their throughput, improve their data quality, improve analysis software, etc. several times each year.  Such dynamic platforms make the development of tools to analyze their data, e.g., align the data to a reference genome and detect variants, very difficult.  The right platforms and tools today may not be the best next month or next year when the main project gets underway.  This brings two major needs to the fore.  First, experimental design will not end when the project starts.  The experiment will need to be adjusted as capabilities and capacities change.  Second, we will not only have to continually develop and refine tools throughout the project, we will need to develop frameworks to continually test and compare the tools that are available.  It&#8217;s always fun to hit a moving target.</p>
<p>The meeting also discussed the ethical, legal, and social implications (ELSI) of the project.  This discussion largely focused on which populations to sample for the project.  Should we deepen our knowledge of individuals of Central European, African, and East Asian ancestry to aid in methodology development?  Or should we broaden our knowledge of overall human variation by including fewer individuals from a larger number of populations?  To be determined&hellip;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.politigenomics.com/2008/05/n-genomes.html/feed</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
	</channel>
</rss>

