shell-script-pt

Formatting genetic nomenclature and sequences


From: Vinão
Subject: Formatting genetic nomenclature and sequences
Date: Mon, 27 Jun 2005 16:30:27 -0300
User-agent: Mozilla Thunderbird 1.0.2-1.3.3 (X11/20050513)

Folks,

I have some input files in this format:
>a10a10p1 CHROMAT_FILE: a10a10p1 PHD_FILE: a10a10p1.phd.1 CHEM: term DYE: big TIME: Mon Jun 27 12:08:02 2005
TTAATGTGGGCGATTCTAXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXGCAGTGAATTGAAATGTTTT
TTTTTAACAATCAAATATTATTTGGAATGTTGTCACTCCTTACTAAACAA
CTAATTGGAACACATTCACCTCAAGCTAAGACACTTGGTTGGTTAGAAAA
ATATCTGATTACACGACTACAAGATTGCTCACTAAAAAAAGTGTTACGTC
TTATAACATACAAGATGAAGTTCAATCTTCTAGGTTGCTAAAGAACATAA
TAATGCTCTATCAACGAGTTATGCTTGCAATCTTAAGATTTCAACTTACA
ATTGTCCATAACTTTGTAAGTTCGTCGGCTTTCTTCTTCAAAGATATATT
TTATATAGAGAGGGGTGAAAAAAAGAGTTGTGGCAATCCTTTGATGGTTG
TTTGAAACAAAGAGAAGTCTTTTTTTATTGTATTTGTTCTTATAAATGTT
GAGGATTTTATTGAATTCTCTCCATGTAACACTTAAATACATACTTCTAT
TTGAAGGAATGTTTTGAAGGAACGTTCTGATACTTAACATATGCTATCAA
GATTACGTTTAACTCCTTTTTACTTAATAGAGTAAAAG
>a10p3 CHROMAT_FILE: a10p3 PHD_FILE: a10p3.phd.1 CHEM: term DYE: big TIME: Mon Jun 27 12:13:47 2005
CTCAACGTCGGNCTCACGAXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
ACTTTCTGGTTTACTCAACAACTTAAAGAAACAGAAACAATAAAAACAAA
TAAATACCCAATTGATCACTCTCTCCCTCTCACACACACACACTAGTCGA
GTCATACATACGCACAGAATCACTTGGCGAATAAACCACTCCCAACGAGA
ATCGATCAAGATGAAATTAAAACTGAAGTCAATAAAACTAAAGTGTGTCT
ATCATGTGACTGAAAATAAATGGAAAAACAATAGAAGACATATTTAGTGA
AAATTCGGAGAAAAAACAAGCATGTGGCTGAAAAAAATGTTGCAAGAACA
ACGAAAAACGTTGCAACAACAATGAAAAATCTTGTTAGAGAGAAGGCATA
CCCTGAGATGAGAAGACACCGAAAAAGCGAACGGAACAGGGAAATGAAAA
TTCTTCACTTCTAAGGGTTGAGAAAAAAATAAAAGGCACATTCAATCATT
GACACGTGCATGTTCTTAACCGTTACATATGTCGTTCATGTTTTTCATTG
GCCAATTCTTTTATTTTAATTTAAATATTAAATATACAATCTATAATATT
CATGAATAATATACAGACAGACAATTAATCACCAATTTATCATACTTTTA
ATAGTTTATTTATATTTTTCATTAAATATAAGTGTTATATTTTTCACTAA
AATGTTATTTNCATCAAATAAAAAAAATTATTTTATATAGAAATCCGAAT
CAAAATAATATCATGTGATTTGATCTAATTGTCCTCCC

But I can only run them if I manage to convert them to the format below, removing everything on each header line from the double space after the names (a10a10p1 and a10p3) onward:

>a10a10p1
TTAATGTGGGCGATTCTAXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXGCAGTGAATTGAAATGTTTT
TTTTTAACAATCAAATATTATTTGGAATGTTGTCACTCCTTACTAAACAA
CTAATTGGAACACATTCACCTCAAGCTAAGACACTTGGTTGGTTAGAAAA
ATATCTGATTACACGACTACAAGATTGCTCACTAAAAAAAGTGTTACGTC
TTATAACATACAAGATGAAGTTCAATCTTCTAGGTTGCTAAAGAACATAA
TAATGCTCTATCAACGAGTTATGCTTGCAATCTTAAGATTTCAACTTACA
ATTGTCCATAACTTTGTAAGTTCGTCGGCTTTCTTCTTCAAAGATATATT
TTATATAGAGAGGGGTGAAAAAAAGAGTTGTGGCAATCCTTTGATGGTTG
TTTGAAACAAAGAGAAGTCTTTTTTTATTGTATTTGTTCTTATAAATGTT
GAGGATTTTATTGAATTCTCTCCATGTAACACTTAAATACATACTTCTAT
TTGAAGGAATGTTTTGAAGGAACGTTCTGATACTTAACATATGCTATCAA
GATTACGTTTAACTCCTTTTTACTTAATAGAGTAAAAG
>a10p3
CTCAACGTCGGNCTCACGAXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
ACTTTCTGGTTTACTCAACAACTTAAAGAAACAGAAACAATAAAAACAAA
TAAATACCCAATTGATCACTCTCTCCCTCTCACACACACACACTAGTCGA
GTCATACATACGCACAGAATCACTTGGCGAATAAACCACTCCCAACGAGA
ATCGATCAAGATGAAATTAAAACTGAAGTCAATAAAACTAAAGTGTGTCT
ATCATGTGACTGAAAATAAATGGAAAAACAATAGAAGACATATTTAGTGA
AAATTCGGAGAAAAAACAAGCATGTGGCTGAAAAAAATGTTGCAAGAACA
ACGAAAAACGTTGCAACAACAATGAAAAATCTTGTTAGAGAGAAGGCATA
CCCTGAGATGAGAAGACACCGAAAAAGCGAACGGAACAGGGAAATGAAAA
TTCTTCACTTCTAAGGGTTGAGAAAAAAATAAAAGGCACATTCAATCATT
GACACGTGCATGTTCTTAACCGTTACATATGTCGTTCATGTTTTTCATTG
GCCAATTCTTTTATTTTAATTTAAATATTAAATATACAATCTATAATATT
CATGAATAATATACAGACAGACAATTAATCACCAATTTATCATACTTTTA
ATAGTTTATTTATATTTTTCATTAAATATAAGTGTTATATTTTTCACTAA
AATGTTATTTNCATCAAATAAAAAAAATTATTTTATATAGAAATCCGAAT
CAAAATAATATCATGTGATTTGATCTAATTGTCCTCCC

Does anyone have an idea for a script that does this?

Thanks,
Vinicius.
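
One possible approach, sketched under the assumption that every header line starts with ">" and that the sequence name itself never contains whitespace, is to have sed cut each header at its first whitespace character and pass sequence lines through untouched. The function name strip_fasta_headers is made up for illustration, as are the file names in the usage note.

```shell
#!/bin/sh
# Hypothetical helper: keep only the sequence name on FASTA-style header lines.
# Assumes headers start with '>' and the name contains no spaces or tabs.
strip_fasta_headers() {
    # On lines beginning with '>', delete everything from the first
    # whitespace character to the end of the line. All other lines
    # (the sequence data) are printed unchanged.
    # With no file arguments, sed reads standard input.
    sed '/^>/ s/[[:space:]].*$//' "$@"
}
```

It could be used as `strip_fasta_headers input.fasta > output.fasta`. Because the pattern matches any whitespace, it should behave the same whether the name is followed by one space, two spaces, or a tab.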

