Formatting nomenclature and genetic sequences
From: Vinão
Subject: Formatting nomenclature and genetic sequences
Date: Mon, 27 Jun 2005 16:30:27 -0300
User-agent: Mozilla Thunderbird 1.0.2-1.3.3 (X11/20050513)
Hi all,
I have some input files in this format:
>a10a10p1 CHROMAT_FILE: a10a10p1 PHD_FILE: a10a10p1.phd.1 CHEM: term DYE: big TIME: Mon Jun 27 12:08:02 2005
TTAATGTGGGCGATTCTAXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXGCAGTGAATTGAAATGTTTT
TTTTTAACAATCAAATATTATTTGGAATGTTGTCACTCCTTACTAAACAA
CTAATTGGAACACATTCACCTCAAGCTAAGACACTTGGTTGGTTAGAAAA
ATATCTGATTACACGACTACAAGATTGCTCACTAAAAAAAGTGTTACGTC
TTATAACATACAAGATGAAGTTCAATCTTCTAGGTTGCTAAAGAACATAA
TAATGCTCTATCAACGAGTTATGCTTGCAATCTTAAGATTTCAACTTACA
ATTGTCCATAACTTTGTAAGTTCGTCGGCTTTCTTCTTCAAAGATATATT
TTATATAGAGAGGGGTGAAAAAAAGAGTTGTGGCAATCCTTTGATGGTTG
TTTGAAACAAAGAGAAGTCTTTTTTTATTGTATTTGTTCTTATAAATGTT
GAGGATTTTATTGAATTCTCTCCATGTAACACTTAAATACATACTTCTAT
TTGAAGGAATGTTTTGAAGGAACGTTCTGATACTTAACATATGCTATCAA
GATTACGTTTAACTCCTTTTTACTTAATAGAGTAAAAG
>a10p3 CHROMAT_FILE: a10p3 PHD_FILE: a10p3.phd.1 CHEM: term DYE: big TIME: Mon Jun 27 12:13:47 2005
CTCAACGTCGGNCTCACGAXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
ACTTTCTGGTTTACTCAACAACTTAAAGAAACAGAAACAATAAAAACAAA
TAAATACCCAATTGATCACTCTCTCCCTCTCACACACACACACTAGTCGA
GTCATACATACGCACAGAATCACTTGGCGAATAAACCACTCCCAACGAGA
ATCGATCAAGATGAAATTAAAACTGAAGTCAATAAAACTAAAGTGTGTCT
ATCATGTGACTGAAAATAAATGGAAAAACAATAGAAGACATATTTAGTGA
AAATTCGGAGAAAAAACAAGCATGTGGCTGAAAAAAATGTTGCAAGAACA
ACGAAAAACGTTGCAACAACAATGAAAAATCTTGTTAGAGAGAAGGCATA
CCCTGAGATGAGAAGACACCGAAAAAGCGAACGGAACAGGGAAATGAAAA
TTCTTCACTTCTAAGGGTTGAGAAAAAAATAAAAGGCACATTCAATCATT
GACACGTGCATGTTCTTAACCGTTACATATGTCGTTCATGTTTTTCATTG
GCCAATTCTTTTATTTTAATTTAAATATTAAATATACAATCTATAATATT
CATGAATAATATACAGACAGACAATTAATCACCAATTTATCATACTTTTA
ATAGTTTATTTATATTTTTCATTAAATATAAGTGTTATATTTTTCACTAA
AATGTTATTTNCATCAAATAAAAAAAATTATTTTATATAGAAATCCGAAT
CAAAATAATATCATGTGATTTGATCTAATTGTCCTCCC
But I can only run them if I manage to convert them to this format (removing
everything that follows the double space after the names a10a10p1 and a10p3):
>a10a10p1
TTAATGTGGGCGATTCTAXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXGCAGTGAATTGAAATGTTTT
TTTTTAACAATCAAATATTATTTGGAATGTTGTCACTCCTTACTAAACAA
CTAATTGGAACACATTCACCTCAAGCTAAGACACTTGGTTGGTTAGAAAA
ATATCTGATTACACGACTACAAGATTGCTCACTAAAAAAAGTGTTACGTC
TTATAACATACAAGATGAAGTTCAATCTTCTAGGTTGCTAAAGAACATAA
TAATGCTCTATCAACGAGTTATGCTTGCAATCTTAAGATTTCAACTTACA
ATTGTCCATAACTTTGTAAGTTCGTCGGCTTTCTTCTTCAAAGATATATT
TTATATAGAGAGGGGTGAAAAAAAGAGTTGTGGCAATCCTTTGATGGTTG
TTTGAAACAAAGAGAAGTCTTTTTTTATTGTATTTGTTCTTATAAATGTT
GAGGATTTTATTGAATTCTCTCCATGTAACACTTAAATACATACTTCTAT
TTGAAGGAATGTTTTGAAGGAACGTTCTGATACTTAACATATGCTATCAA
GATTACGTTTAACTCCTTTTTACTTAATAGAGTAAAAG
>a10p3
CTCAACGTCGGNCTCACGAXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
ACTTTCTGGTTTACTCAACAACTTAAAGAAACAGAAACAATAAAAACAAA
TAAATACCCAATTGATCACTCTCTCCCTCTCACACACACACACTAGTCGA
GTCATACATACGCACAGAATCACTTGGCGAATAAACCACTCCCAACGAGA
ATCGATCAAGATGAAATTAAAACTGAAGTCAATAAAACTAAAGTGTGTCT
ATCATGTGACTGAAAATAAATGGAAAAACAATAGAAGACATATTTAGTGA
AAATTCGGAGAAAAAACAAGCATGTGGCTGAAAAAAATGTTGCAAGAACA
ACGAAAAACGTTGCAACAACAATGAAAAATCTTGTTAGAGAGAAGGCATA
CCCTGAGATGAGAAGACACCGAAAAAGCGAACGGAACAGGGAAATGAAAA
TTCTTCACTTCTAAGGGTTGAGAAAAAAATAAAAGGCACATTCAATCATT
GACACGTGCATGTTCTTAACCGTTACATATGTCGTTCATGTTTTTCATTG
GCCAATTCTTTTATTTTAATTTAAATATTAAATATACAATCTATAATATT
CATGAATAATATACAGACAGACAATTAATCACCAATTTATCATACTTTTA
ATAGTTTATTTATATTTTTCATTAAATATAAGTGTTATATTTTTCACTAA
AATGTTATTTNCATCAAATAAAAAAAATTATTTTATATAGAAATCCGAAT
CAAAATAATATCATGTGATTTGATCTAATTGTCCTCCC
Does anyone have an idea for a script that does this?
Thanks,
Vinicius.
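
One possible way to do the header trimming described above, as a minimal sketch in Python rather than a definitive answer. It assumes that in the real files each header is a single line starting with ">" (the line break inside the headers quoted in this message is assumed to come from the mail client), and that the sequence name never contains whitespace. The function name strip_fasta_headers and the file names in the usage line are hypothetical.

```python
import sys


def strip_fasta_headers(lines):
    """Yield each line unchanged, except header lines (starting with '>'),
    which are cut down to their first whitespace-delimited word."""
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # Splitting on any run of whitespace works whether the name is
            # followed by a single space, a double space, or a tab.
            yield line.split(None, 1)[0]
        else:
            # Sequence lines pass through untouched.
            yield line


# Assumed command-line form: script.py INPUT OUTPUT
if __name__ == "__main__" and len(sys.argv) == 3:
    src, dst = sys.argv[1], sys.argv[2]
    with open(src) as fin, open(dst, "w") as fout:
        for out in strip_fasta_headers(fin):
            fout.write(out + "\n")
```

Saved as, say, strip_headers.py, it could be run as "python strip_headers.py input.fasta output.fasta". Writing to a separate output file leaves the original data untouched in case the assumptions above do not hold for some file.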