I tried to add some kind of progress indicator, but every version made
the conversion script about 10x slower. So I just put in simple print
warnings, because that's the best I could do. One thing I have been
told is to use os.path, because it has an actual basename function. I
also added a prompt for the user to check that there is enough room on
the HDD, because it seems they don't delete files or move them to an
archive. I compared the outputs of the original and my modified script
and didn't find any differences.
Can someone check my modified script to make sure I didn't miss
something? Anything I can do better? This is the best I can do with my
abilities. It's for use with Python 2.7 (through Anaconda on Windows).
So far I have only tested it on my Linux machine; hopefully it works on
Windows when I get back to work next week.
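The yes/no prompt could also be backed by an actual measurement. The following is only a minimal sketch: `free_bytes` and `enough_space` are hypothetical helper names, it assumes the output is written to the current directory, the Windows branch calls the Win32 function GetDiskFreeSpaceExW through ctypes and has not been tested here, and the 50 GB default is simply the figure from the script's warning.

```python
import ctypes
import os

GB = 1024 ** 3


def free_bytes(path="."):
    """Return the bytes available to the current user on the drive holding path."""
    if hasattr(os, "statvfs"):
        # Linux / macOS
        st = os.statvfs(path)
        return st.f_bavail * st.f_frsize
    # Assumption: no os.statvfs means Windows, so ask kernel32 directly.
    free = ctypes.c_ulonglong(0)
    ctypes.windll.kernel32.GetDiskFreeSpaceExW(
        ctypes.c_wchar_p(path), ctypes.byref(free), None, None)
    return free.value


def enough_space(path=".", needed=50 * GB):
    """True if at least `needed` bytes are free on the drive holding path."""
    return free_bytes(path) >= needed
```

A script could call `enough_space()` first and only fall back to asking the user when the check fails.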
=======================CODE============================
#!/usr/bin/env python
# Takes a single FASTQ file and converts it to a .fasta file
# (writing the matching .qual file is currently disabled).
import sys
import os.path
from Bio import SeqIO

# Disk space check: ask the user to confirm before starting
print "Warning! Please check that there is at least 50GB of Free Disk Space per file conversion."
DiskCheck = raw_input('Is there enough space? (yes or no) ')

if DiskCheck == 'yes':
    if len(sys.argv) == 1:
        print "Please specify a single .fastq file to convert."
        sys.exit()

    filetoload = sys.argv[1]

    # BETTER WAY: Chop the extension to get names for output files
    basename, extension = os.path.splitext(os.path.basename(filetoload))

    print "\nWorking on", basename
    print "Don't close this window."
    SeqIO.convert(filetoload, "fastq", basename + ".fasta", "fasta")

    # QUAL file creation disabled
    #SeqIO.convert(filetoload, "fastq", basename + ".qual", "qual")
    print "\nDone converting", basename, "to FASTA format."
elif DiskCheck == 'no':
    print "\nMake room on disk, then run script again"
    sys.exit()
else:
    # Any other answer used to end the script without a message
    print "\nPlease answer yes or no, then run script again"
    sys.exit()
=======================CODE============================
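On the progress indicator: one cheap approach is to count records and print only every so often, since writing to the screen for every record is slow. This is a sketch, not a tested drop-in; `with_progress` is a hypothetical helper and the default of 100000 records between updates is an arbitrary choice.

```python
import sys


def with_progress(records, every=100000, out=sys.stderr):
    """Yield records unchanged, reporting the running count every `every` records."""
    count = 0
    for record in records:
        count += 1
        if count % every == 0:
            # "\r" returns to the start of the line so the count overwrites itself
            out.write("\r%d records converted" % count)
            out.flush()
        yield record
    out.write("\r%d records converted, done.\n" % count)
```

With Biopython this would wrap the parser, for example `SeqIO.write(with_progress(SeqIO.parse(filetoload, "fastq")), basename + ".fasta", "fasta")`. Note that SeqIO.convert can use format-specific shortcuts that the parse/write route does not, so some slowdown may remain even with infrequent printing.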
On Thu, Mar 26, 2015 at 1:37 PM, Gavin Scott <gavin@learn.bio> wrote:
> Not to distract from your Python explorations, but at some point you
> might find it interesting to explore what you can do with the free
> Galaxy bioinformatics workflow service:
>
> http://galaxyproject.org/
>
> Going through their tutorial is worthwhile and enlightening.
>
> G.
>
> On Thu, Mar 26, 2015 at 11:46 AM, Jeswin <phillyj101@gmail.com> wrote:
>> Hi all,
>> Once again, thanks for helping me with my python issue. Let me just
>> point out that my programming skills are very low (script kiddie at
>> best) and I don't have much time nowadays to sit down and really learn
>> python. I know the basics.
>>
>> Anyway, my boss wanted to convert FASTQ to FASTA. The only way is
>> through a script, and I settled on Python. I found a script online
>> that works:
>> ================================================
>> #!/usr/bin/env python
>>
>> #Takes a single FASTQ file and splits to .fasta + .qual files
>> import sys
>> from Bio import SeqIO
>>
>> if len(sys.argv) == 1:
>>     print "Please specify a single .fastq file to convert."
>>     sys.exit()
>>
>> filetoload = sys.argv[1]
>> basename = filetoload
>>
>> #Chop the extension to get names for output files
>> if basename.find(".") != -1:
>>     basename = '.'.join(basename.split(".")[:-1])
>>
>> SeqIO.convert(filetoload, "fastq", basename + ".fasta", "fasta")
>> SeqIO.convert(filetoload, "fastq", basename + ".qual", "qual")
>> ================================================
>>
>> I'm thinking about adding 2 features to it for the convenience of my
>> colleagues. I know you all don't like leading people step by step, so
>> I'm fine if you all can point me in the right direction (simplest and
>> fastest solutions).
>>
>> [1] I would like to add something that shows progress {bar, text,
>> etc.} of the conversion so that people on the computer know it's
>> working and not frozen. Maybe read the size of the output file every
>> 30 seconds (first the FASTA file, then a QUAL file)?
>>
>> [2] I am not sure if the script can process more than one file
>> (sequentially) in the argument. I just ran "fastq_to_fasta.py
>> file1.fastq". I am wondering if I can do: "fastq_to_fasta.py
>> file1.fastq file2.fastq"? Basically, I am not sure if python can do
>> that?
>>
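On [2]: Python can do this, because sys.argv holds every command-line argument, so the script can loop over sys.argv[1:]. A minimal sketch, assuming each output goes to the current directory under the input's base name; `output_name` and `plan_conversions` are hypothetical helpers, and the Biopython call is left as a comment so the sketch runs without it.

```python
import os.path
import sys


def output_name(fastq_path, new_ext=".fasta"):
    """Swap the extension of fastq_path for new_ext, dropping any directory part."""
    base, _ext = os.path.splitext(os.path.basename(fastq_path))
    return base + new_ext


def plan_conversions(args):
    """Pair every input path given on the command line with its output name."""
    return [(path, output_name(path)) for path in args]


if __name__ == "__main__":
    # sys.argv[0] is the script itself; everything after it is an input file
    for src, dst in plan_conversions(sys.argv[1:]):
        sys.stdout.write("%s -> %s\n" % (src, dst))
        # SeqIO.convert(src, "fastq", dst, "fasta")  # the actual conversion
```

Run as "fastq_to_fasta.py file1.fastq file2.fastq", this would handle the files one after the other.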
>> BTW, I got the script from:
>> http://nebc.nerc.ac.uk/nebc_website_frozen/nebc.nerc.ac.uk//tools/code-corner/scripts/sequence-formatting-and-other-text-manipulation
>>
>> Thanks
>>
>> --
>> In necessariis unitas, in dubiis libertas, in omnibus caritas.
>> -Marco Antonio Dominis
>>
>> --
>> -- You received this message because you are subscribed to the Google Groups DIYbio group. To post to this group, send email to diybio@googlegroups.com. To unsubscribe from this group, send email to diybio+unsubscribe@googlegroups.com. For more options, visit this group at https://groups.google.com/d/forum/diybio?hl=en
>> Learn more at www.diybio.org
>> ---
>> You received this message because you are subscribed to the Google Groups "DIYbio" group.
>> To unsubscribe from this group and stop receiving emails from it, send an email to diybio+unsubscribe@googlegroups.com.
>> To post to this group, send email to diybio@googlegroups.com.
>> Visit this group at http://groups.google.com/group/diybio.
>> To view this discussion on the web visit https://groups.google.com/d/msgid/diybio/CAAhF0RKiRhWGYQ%2B%2Bu06xZ3mqTf9eF4UD0Oh2w2aOd0cOXQAWDw%40mail.gmail.com.
>> For more options, visit https://groups.google.com/d/optout.
>
--
In necessariis unitas, in dubiis libertas, in omnibus caritas.
-Marco Antonio Dominis
Re: [DIYbio] modifying the FASTQ>FASTA script
11:07 AM |