[DIYbio] Synthetic promoters

Just posting this because I was surprised that the company could not synthesize this sequence. The sequence in question is below. It is a promoter, the 1 kb upstream of the histone H1 gene of the ascomycete fungus Neurospora crassa, which I wanted to modify in vivo after reintroducing it into N. crassa upstream of a GFP reporter. I sent it to Operon and they could not synthesize it because of some repetitive/simple-sequence regions (the regions they flagged are also pasted below). They tried for a month.

This surprised me because I would think promoters are one DNA element that a lot of people would want synthesized, so that they can hook them up to a gene of interest. A lot of people work with simple microbes (E. coli, yeast, fungi), since they are easier to manipulate genetically and have relatively compact promoters, and all promoters, by their nature, are going to have some structure.

I will likely send it to somebody else to try (Epoch has been mentioned in this group) to see if they can do it. Does anybody have experience getting promoters synthesized? It did not really occur to me that this would be an issue; maybe the company just sucks at this. Anybody doing larger synthetic projects has to have run into this issue and overcome it. Part of the problem may be that they thought I was getting a gene synthesized. I didn't think to mention it was a promoter, but that should not matter, as I was not asking them to optimize it in any way.

Sequence:
>UPSTREAM1000_histone H1
TGAAGAAGAGAGATAAATAGATAAATAGAAGATGATGGGATGGGCGAGATACCGACTGAA
TCTGAGAGATGGGATGCGGGATGGATGGATTGATGCCCTGCGGCTTCGATTGTGCCAACC
CAGCCAGCCAGCCAGCCATGCCGACCGACCGACCGATATGCGACATGCCATGCCACCTCA
CACACACAGGCACACTGATAACCTGCGAGTCGACGAGGAGAGGTGACGGGCGGCAGAACA
TGCTTCTTTTGGGGCCAGTGAATGATGCCGTGTCCCACCTTGGATCATCCAATCTGTCCG
GACCAGACTCCATCTGGAATGGACATCCATCGGCATCCGCACTCCCCTGGACCCCAATCG
GTTTCTAAAAAGAGGAGAAACGGAAGAGGAAAAGGGAAAGGGAAAAAAAAAAAGAAGAAC
AAGTGGGATCGATGGGACATGGGACAGCACCACTGCATCTCCAGCCGAGTCCATGGAAAC
GGGAAGACAAGGGGAGGGGGGGGGAGAGGGAGAGGGAGGAGGGGGAAGGGGAAGGAGAAG
GGGATGGGGATGGCGACCGAGAGGATAGGTACCTACTGTAGGGACGGGAAATCTCATCGA
CAACCACACAACGAAGCATCGATGCTCTCGAGGTCTCTTCCCCTTCCTTCATGAGACAAG
CGAAAAGGAAAAGGTCCGGAGCCCCAGCTTCCACATCGTGTTGACATGGAACGAGGGAAC
AGGAATCGGGGCCACTGGCCGGCTTCTTTCGTTCTTTCAGCGTGTGTTAGTGGGGTGCAC
GGGCCACATATCCCCGGGAAATGGGCTGGGGGTAGCGGCTTCCAGGAGGTCACAGAGGCC
CCCCCCCCCAGGTCGCAGGGGGAGACGGGAGGTCCGTCGGGGCAGGGGCAGGGAAGAATC
AGCGAAATCACTCGGTCGCGCCAGGAGACCCCGCCTCCGTATATAAACACCCAATCTTCC
CCCCTCGAGCGCGACTGAGCCCACCCATCCTCCTCTCGTC
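
For anyone who wants to sanity-check a sequence like this before sending it to a vendor, the length and overall GC content are easy to compute locally. Below is a minimal Python sketch. The function names `read_fasta` and `gc_fraction` are made up for illustration, and it assumes a single-record FASTA containing only A/C/G/T.

```python
def read_fasta(text):
    """Parse a single-record FASTA string into (header, sequence)."""
    header, chunks = "", []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]  # drop the leading '>'
        else:
            chunks.append(line.upper())
    return header, "".join(chunks)


def gc_fraction(seq):
    """Fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    # Toy record only; paste the real FASTA text here to check it.
    header, seq = read_fasta(">demo\nACGT\nggcc\n")
    print(header, len(seq), f"{100 * gc_fraction(seq):.2f}% GC")
```

To check the promoter itself, pass the FASTA text above (header line plus sequence lines) to `read_fasta` in place of the toy record.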


Dear Thomas Randall,

Your gene sequence exhibited the following properties:

1. A codon adaptation index (CAI) of 0.75 (typically should be between 1.0–0.8)
2. GC content of approximately 58.61% (typically should be between 30–70%)
3. Number of CpG = 59
4. Percentage of low frequency codons based on an E.coli host organism is 21% (this is decent)
5. Direct repeats = 8
6. Negative cis-acting elements = 1

All genes submitted to Eurofins MWG Operon for sequencing usually undergo optimization prior to synthesis, so I think your problem is more inherent in the design of your gene. I have asked for a more in depth explanation from the lab on your order failure. Once I get that information I will be better equipped to provide alternative options for your design.

Pertaining to good bioinformatics gene optimization and design tools, here are a few links:

https://www.dna20.com/genedesigner2/    (DNA 2.0 Gene Design Software Beta version – requires an account)
http://www.geneinfinity.org/sp/sp_motif.html#patterns  (Repository of free online servers for performing various sequence screens and comparisons)
http://www.bitgene.com/index.shtml (open source basic gene analysis and synthesis tool)
http://www.geneius.de/GENEius/Security_login.action (Eurofins MWG Operon gene optimization free online tool)

A few days later:


Dear Thomas Randall,

Our lab technicians were able to pull up more information pertaining to your gene (please see below). From the query, the gene sequence didn't seem to have that many regions of complexity to cause the failure observed, so it's difficult to pinpoint exactly how to improve upon the sequence. It is possible the presence of folding motifs or the GC rich portion interfered with the correct assembly of the gene (we were unable to obtain a clone with the correct gene sequence). Also, there were two ORFs for this gene sequence, so that could have factored in to the ambiguity of the clones.

I hope this helps.

Result

Sequence Length: 1014 bps
GC-Content: 57.99%

Direct Repeats

Direct Repeat 1
  1. Position: 128
  2. Position: 132
  Length: 15
  Mismatches: 0
  ccagccagccagcca
  ccagccagccagcca

Direct Repeat 2
  1. Position: 149
  2. Position: 153
  Length: 12
  Mismatches: 0
  ccgaccgaccga
  ccgaccgaccga

Inverted Repeats

Inverted Repeat 1
  1. Position: 623
  2. Position: 623
  Length: 12
  Mismatches: 0
  agcatcgatgct
  agcatcgatgct

GC-Rich Subsequences

Sequence 1
  Position: 800
  GC-Content: 75%
  ccccgggaaatgggctgggggtagcggcttccaggaggtcacagaggcccccccccccaggtcgcagggggagacgggaggtccgtcggggcaggggcag

Homopolymers

Homopolymer 1
  Position: 411
  Length: 11
  aaaaaaaaaaa

Homopolymer 2
  Position: 504
  Length: 9
  ggggggggg

Homopolymer 3
  Position: 847
  Length: 11
  ccccccccccc
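
The kinds of features flagged in this report (homopolymer runs, short tandem direct repeats, palindromic inverted repeats, GC-rich windows) can be screened for with a few lines of Python before ordering. This is only a rough sketch and certainly not the vendor's algorithm: the function names and the default thresholds are assumptions picked to roughly resemble the report, it expects plain A/C/G/T input, and all positions it returns are 1-based.

```python
import re

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def homopolymers(seq, min_len=8):
    """Return (start, base, length) for single-base runs of >= min_len."""
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start() + 1, m.group(1), len(m.group(0)))
            for m in pattern.finditer(seq.upper())]


def tandem_repeats(seq, unit_len=4, min_copies=3):
    """Return (start, unit, copies) for back-to-back repeats of a unit."""
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        if len(set(unit)) > 1:  # skip units that are just a homopolymer
            hits.append((m.start() + 1, unit, len(m.group(0)) // unit_len))
    return hits


def palindromes(seq, length=12):
    """Return starts of windows equal to their own reverse complement."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - length + 1):
        window = seq[i:i + length]
        if window == window.translate(_COMPLEMENT)[::-1]:
            hits.append(i + 1)
    return hits


def gc_rich_windows(seq, window=100, threshold=0.75):
    """Return (start, gc_fraction) for windows at or above threshold."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - window + 1):
        chunk = seq[i:i + window]
        gc = (chunk.count("G") + chunk.count("C")) / window
        if gc >= threshold:
            hits.append((i + 1, gc))
    return hits


if __name__ == "__main__":
    # Toy input; replace with the real sequence string to screen it.
    demo = "TT" + "CCAG" * 4 + "A" * 9 + "AGCATCGATGCT"
    print(homopolymers(demo))
    print(tandem_repeats(demo))
    print(palindromes(demo))
```

A regex with a backreference is enough for tandem repeats because the repeats in the report overlap themselves (positions 128/132 and 149/153); dispersed direct repeats would need a different approach, such as indexing k-mers.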



 
 
