Re: [DIYbio] Re: large dna extraction

Forget about getting a closed assembly of the genome - you don't need it. The gold standard for de novo sequence assembly is typically around 30x coverage, so you need about thirty times more sequence data than the genome size. But even at that coverage, you usually still get multiple contigs with unsequenced gaps between them. "Back in the old days" (i.e., 4-5 years ago), sequencing centers would put in a lot of manpower to close those gaps using primer walking and techniques like that. These days, hardly anyone bothers to spend that much effort on a measly bacterial genome. And who cares if you miss a handful of genes in some gaps between contigs, if you can't even attach any functional annotation to 30-50% of genes anyway? Heck, even the human genome is not 100% finished yet, but nobody loses any sleep over that either.
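
To put numbers on the 30x rule of thumb, here is a back-of-the-envelope sketch in Python. The 5 Mb genome size and 150 bp read length are made-up example values, not numbers from this thread, and `reads_needed` is just an illustrative helper.

```python
import math

def reads_needed(genome_size_bp, coverage, read_length_bp):
    """Number of reads needed to reach a target average coverage."""
    total_bases = genome_size_bp * coverage
    return math.ceil(total_bases / read_length_bp)

# Hypothetical example: 5 Mb bacterial genome, 30x coverage, 150 bp reads.
genome_size = 5_000_000
print(genome_size * 30)                    # total bases needed: 150000000
print(reads_needed(genome_size, 30, 150))  # 1000000
```

This is average coverage only; real coverage is uneven, which is part of why gaps remain even at 30x.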

Depending on the size of the plasmid and its copy number per cell, you may or may not get a closed assembly for the plasmid. If the plasmid is in multiple contigs, it's usually not too hard to recognize them as plasmid contigs because of the presence of plasmid-specific genes.

The only time when I would imagine you might want to separate the genomic and plasmid DNA and sequence them separately would be when you're dealing with a high copy number plasmid, and you don't want to waste your $ on sequencing the plasmid at 10,000x coverage. With Illumina technology being so damn cheap, I'm not sure if anyone worries over wasting a few reads on high copy number plasmids though. 
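
A rough sketch of how much sequencing a high copy number plasmid eats up, assuming reads are sampled in proportion to the amount of DNA present. The chromosome size, plasmid size, and copy number below are hypothetical example values, and both function names are made up for illustration.

```python
def plasmid_read_fraction(chrom_bp, plasmid_bp, copy_number):
    """Fraction of all reads expected to come from the plasmid."""
    plasmid_total = plasmid_bp * copy_number
    return plasmid_total / (chrom_bp + plasmid_total)

def plasmid_coverage(chrom_coverage, copy_number):
    """Expected plasmid coverage when the chromosome is at chrom_coverage."""
    return chrom_coverage * copy_number

# Hypothetical example: 5 Mb chromosome, 5 kb plasmid at 300 copies per cell.
frac = plasmid_read_fraction(5_000_000, 5_000, 300)
print(round(frac, 3))             # 0.231
print(plasmid_coverage(30, 300))  # 9000
```

So in this made-up case roughly a quarter of the reads go to the plasmid, which is annoying but hardly a budget breaker.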

All of this gets even more fun when you're doing metagenomics. After assembly, you might wind up with thousands of contigs, with coverage ranging from single digits to 1000x or more. And then you need to figure out which of those contigs belong together and make up a microbial genome. We can routinely reconstruct bacterial genomes that are 95% or more complete from the most abundant organisms in a mixed community, by looking for phylogenetic marker genes, and clustering together contigs with similar coverage levels and tetranucleotide frequencies etc.
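
For a feel of what "similar coverage levels and tetranucleotide frequencies" means in practice, here is a minimal Python sketch of the two signals. It is a toy illustration, not our actual pipeline: the function names and the choice of Euclidean distance and log2 coverage ratio are my assumptions for the example.

```python
from itertools import product
import math

# All 256 possible tetranucleotides, each mapped to a position in the profile.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]
INDEX = {k: i for i, k in enumerate(KMERS)}

def tetra_freqs(seq):
    """Normalized counts of all 256 tetranucleotides in a contig."""
    counts = [0] * 256
    seq = seq.upper()
    for i in range(len(seq) - 3):
        idx = INDEX.get(seq[i:i + 4])
        if idx is not None:  # skip windows containing N or other ambiguity codes
            counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts

def profile_distance(p, q):
    """Euclidean distance between two tetranucleotide profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

def coverage_log_ratio(cov_a, cov_b):
    """Absolute log2 ratio of two contig coverages (0 means identical)."""
    return abs(math.log2(cov_a / cov_b))

# Toy contigs: far too short for real use, just to show the calls.
a = tetra_freqs("ACGTACGTTTGACCAGTACGATCG")
b = tetra_freqs("ACGTACGATTGACCAGTACGTTCG")
print(profile_distance(a, b))
print(coverage_log_ratio(40, 10))  # 2.0
```

Contigs from the same genome should score low on both measures. Real binning tools do a lot more than this, such as merging each tetranucleotide with its reverse complement and requiring a minimum contig length before trusting the profile.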

Patrik

On Thursday, March 6, 2014 4:08:38 PM UTC-8, Nathan McCorkle wrote:
On Thu, Mar 6, 2014 at 3:46 PM, Jeswin <phill...@gmail.com> wrote:
> On Wed, Mar 5, 2014 at 4:24 PM, qetzal <qet...@yahoo.com> wrote:
>> ...
>> Unfortunately, I don't know how many total Mb of sequence data you'd need to
>> reasonably expect to get complete assembly. It might well be more than
>> you're able to do, depending on your budget & resources.

That's what the metagenomics folks do, I believe there are some
caveats to getting nice circles (exome maybe?)... I think Patrik would
be the expert on this.
--
-Nathan

--
-- You received this message because you are subscribed to the Google Groups DIYbio group. To post to this group, send email to diybio@googlegroups.com. To unsubscribe from this group, send email to diybio+unsubscribe@googlegroups.com. For more options, visit this group at https://groups.google.com/d/forum/diybio?hl=en
Learn more at www.diybio.org
