Author: chhuang (Don't reinvent the wheel)
Board: perl
Title: Re: [Question] The program I asked about before now has a new condition~~
Time: Tue Oct 24 14:46:42 2006
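: The above is the question I asked earlier~~
: that is, splitting the file at every record that ends with "//" and writing each one out
: But that produces far too many files; a single gbvrl1.seq yields more than 70,000 records
: If I write everything out first and then grep for the files I want, it takes too long
: So I'd like to write out a record only when its "ORGANISM" field mentions a name such as "Enterovirus"
: That would save a lot of time
: Is there a better way to do this?~~THX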
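The filtering described in the quoted question can be sketched as a single streaming pass, so that unwanted records are never written to disk. This is only a minimal sketch, written in Python rather than Perl, and not anyone's actual script. The function names are hypothetical, and it assumes the usual GenBank flat-file layout: each record ends with a line containing only `//`, and the indented lineage lines that follow `ORGANISM` belong to the same field.

```python
def iter_records(handle):
    """Yield one record (as a string) for each line that is just "//"."""
    lines = []
    for line in handle:
        lines.append(line)
        if line.rstrip() == "//":
            yield "".join(lines)
            lines = []


def organism_field(record):
    """Return the ORGANISM line plus its indented continuation (lineage) lines."""
    field = []
    inside = False
    for line in record.splitlines():
        if line.lstrip().startswith("ORGANISM"):
            inside = True
            field.append(line.split("ORGANISM", 1)[1].strip())
        elif inside:
            if line.startswith(" "):
                field.append(line.strip())  # lineage continuation line
            else:
                break  # next top-level keyword or "//" ends the field
    return " ".join(field)


def filter_by_organism(handle, keyword):
    """Yield only the records whose ORGANISM field mentions keyword."""
    wanted = keyword.lower()
    for record in iter_records(handle):
        if wanted in organism_field(record).lower():
            yield record
```

Calling `filter_by_organism(open("gbvrl1.seq"), "Enterovirus")` would then yield only the matching records, which can be written out one by one. In Perl the same loop can be built by setting the input record separator `$/` to `"//\n"`.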
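I remember I already replied to you last time...
When it comes to parsing sequence files...
I think loading the data into a database is the best approach...
But speaking from two years in a bioinformatics graduate institute... and nearly a year at a bioinformatics company...
you can write a power script that queries NCBI directly...
and returns the query results in FASTA format...
That is what really saves time... and it is the more efficient approach!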
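For readers wondering what "query NCBI and get FASTA back" might look like, here is a minimal sketch in Python (not Perl, and not the script mentioned in this post) built on NCBI's public E-utilities endpoints `esearch.fcgi` and `efetch.fcgi`. The function names, the `nucleotide` database choice, the `retmax` default, and the `Enterovirus[Organism]` search term are assumptions made for illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen
import xml.etree.ElementTree as ET

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(term, retmax=20):
    """Build an ESearch URL that returns matching record IDs as XML."""
    query = urlencode({"db": "nucleotide", "term": term, "retmax": retmax})
    return EUTILS + "/esearch.fcgi?" + query


def parse_ids(esearch_xml):
    """Extract the <Id> values from an ESearch XML response."""
    root = ET.fromstring(esearch_xml)
    return [elem.text for elem in root.findall("./IdList/Id")]


def efetch_fasta_url(ids):
    """Build an EFetch URL that returns the given records as FASTA text."""
    query = urlencode({
        "db": "nucleotide",
        "id": ",".join(ids),
        "rettype": "fasta",
        "retmode": "text",
    })
    return EUTILS + "/efetch.fcgi?" + query


def fetch_fasta(term, retmax=20):
    """Search, then fetch FASTA for the hits (makes two network requests)."""
    with urlopen(esearch_url(term, retmax)) as resp:
        ids = parse_ids(resp.read())
    if not ids:
        return ""
    with urlopen(efetch_fasta_url(ids)) as resp:
        return resp.read().decode("utf-8")
```

Something like `fetch_fasta("Enterovirus[Organism]", retmax=50)` would return FASTA text for up to 50 hits, with no need to download and split the whole gbvrl1.seq first.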
Below is part of the output retrieved by the program I wrote (nucleotide sequences removed)...
If you're interested, give me a call at 3345678... :P
>gi|116256796|gb|DQ993173.1| Human coxsackievirus A16 isolate 0249-06 VP1
>gi|9626677|ref|NC_001472.1| Human enterovirus B, complete genome
>gi|1839281|gb|S79977.1| swine vesicular disease virus SVDV-specific sequence
>gi|73533657|gb|DQ167421.1| Human coxsackievirus B3 isolate p19 5' UTR
>gi|73533656|gb|DQ167420.1| Human coxsackievirus B3 isolate p18 5' UTR
>gi|73533655|gb|DQ167419.1| Human coxsackievirus B4 isolate p16 5' UTR
>gi|73533654|gb|DQ167418.1| Human coxsackievirus B4 isolate p14 5' UTR
>gi|73533653|gb|DQ167417.1| Human coxsackievirus B4 isolate p12 5' UTR
>gi|73533652|gb|DQ167416.1| Human coxsackievirus B3 isolate p10 5' UTR
>gi|73533651|gb|DQ167415.1| Human coxsackievirus B6 isolate p9 5' UTR
>gi|73533650|gb|DQ167414.1| Human poliovirus 3 isolate p8 5' UTR
>gi|73533649|gb|DQ167413.1| Human echovirus 30 isolate p7 5' UTR
>gi|73533648|gb|DQ167412.1| Human coxsackievirus B3 isolate p6 5' UTR
>gi|115499492|gb|DQ984529.1| Human coxsackievirus A16 isolate HME-310 5' UTR
>gi|61608320|gb|AY843312.1| Enterovirus 86 strain BAN99-10356, partial genome
>gi|61608318|gb|AY843311.1| Enterovirus 82 strain OMA98-10391, partial genome
>gi|61608316|gb|AY843310.1| Enterovirus 80 strain OMA98-10388, partial genome
>gi|61608314|gb|AY843309.1| Enterovirus 79 strain USA/CA82-10385, partial
>gi|61608311|gb|AY843308.1| Enterovirus 95 strain CIV03-10361, complete genome
>gi|61608308|gb|AY843307.1| Enterovirus 94 strain BAN99-10355, complete genome
>gi|61608305|gb|AY843306.1| Enterovirus 88 strain BAN01-10398, complete genome
>gi|61608302|gb|AY843305.1| Enterovirus 87 strain BAN01-10396, complete genome
>gi|61608299|gb|AY843304.1| Enterovirus 86 strain BAN00-10354, complete genome
>gi|61608296|gb|AY843303.1| Enterovirus 85 strain BAN00-10353, complete genome
>gi|61608293|gb|AY843302.1| Enterovirus 84 strain USA/TX97-10394, complete
>gi|61608290|gb|AY843301.1| Enterovirus 83 strain USA/CA76-10392, complete
>gi|61608286|gb|AY843300.1| Enterovirus 82 strain USA/CA64-10390, complete
>gi|61608282|gb|AY843299.1| Enterovirus 81 strain USA/CA68-10389, complete
>gi|61608279|gb|AY843298.1| Enterovirus 80 strain USA/CA67-10387, complete
>gi|61608274|gb|AY843297.1| Enterovirus 79 strain USA/CA79-10384, complete
>gi|12408699|ref|NC_002058.3| Poliovirus, complete genome
>gi|115500010|dbj|AB275852.1| Human coxsackievirus A14 gene for polyprotein,
>gi|115500008|dbj|AB275851.1| Human coxsackievirus A14 gene for polyprotein,
>gi|115500006|dbj|AB275850.1| Human coxsackievirus A14 gene for polyprotein,
>gi|115500005|dbj|AB275849.1| Human coxsackievirus A14 gene, similar to
>gi|115500003|dbj|AB275848.1| Human coxsackievirus A14 gene for polyprotein,
>gi|115430550|emb|AM084225.1| Human poliovirus 2 RNA for polyprotein,
>gi|115430548|emb|AM084224.1| Human poliovirus 2 RNA for polyprotein,
>gi|115430546|emb|AM084223.1| Human poliovirus 2 RNA for polyprotein,
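Header lines like the ones above follow the pattern `>gi|number|database|accession| description`, so they are easy to split into fields before loading them into a database. A minimal sketch in Python (not Perl; the function name and the returned keys are hypothetical, and it only handles headers of this exact `gi|` style):

```python
def parse_defline(line):
    """Split a '>gi|...|db|accession| description' header into its fields."""
    if not line.startswith(">gi|"):
        raise ValueError("not a gi-style FASTA header: " + line)
    # Drop the leading '>' and split on the first four '|' only,
    # so any '|' inside the description is left alone.
    _, gi, db, accession, description = line[1:].rstrip("\n").split("|", 4)
    return {
        "gi": gi,
        "db": db,
        "accession": accession,
        "description": description.strip(),
    }
```

For example, the `NC_001472.1` header above would give `db` as `ref` and `description` as `Human enterovirus B, complete genome`.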
--
I'm the bottle guy~ I'm hard to understand!
http://blog.yam.com/chhuang
--
※ Posted from: 批踢踢实业坊 (ptt.cc)
◆ From: 61.30.74.102
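1F: push akillerbear: Thanks for the reply~~ Why am I not doing it the way you suggested? 10/24 19:38
2F: → akillerbear: Because my boss wants me to build a database for him~~~ using this kind of download-and-split 10/24 19:39
3F: → akillerbear: workflow~~~~ that's why I have to do it this way~~~~^^a 10/24 19:40
4F: → akillerbear: As for the power script you mentioned earlier~~ I'll give that direction a try~~ 10/24 19:40
5F: → akillerbear: So when would be a good time to reach you?~~ Thanks 10/24 19:41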