Myostatin, whose gene is highly conserved among breeds, is a negative regulator of muscle growth. The 3′ coding regions of wild boar and crossbred pig myostatin were cloned by RT-PCR and sequenced. The nucleotide sequences of wild boar and crossbred pig were 100% identical in this region, and neither differed from the pig myostatin gene in GenBank. This indicates that the gene sequence in this region has not changed during evolution.
Developing efficient gene prediction algorithms is one of the fundamental efforts in genomics. In genomic signal processing, the basic step in identifying protein coding regions in DNA sequences relies on the period-3 property exhibited by nucleotides in exons. Several approaches based on signal processing tools and numerical representations have been applied to this problem in search of more accurate predictions. This paper presents a new indicator sequence based on the amino acid sequence, called the amino acid indicator sequence. It is derived from the DNA string and is used with existing signal-processing-based time-domain and frequency-domain methods to predict coding regions within the billions-of-bases-long DNA sequences of eukaryotic cells, reducing the computational load by one-third. Each triplet of bases, called a codon, instructs the cell machinery to synthesize an amino acid. The codon sequence therefore uniquely identifies an amino acid sequence, which defines a protein, so the protein coding region is characterized by the codons in the amino acid sequence. This property is used to detect period-3 regions from the amino acid sequence. Physico-chemical properties of amino acids are used for the numerical representation. Several accuracy measures (exonic peaks, discriminating factor, sensitivity, specificity, miss rate, wrong rate and approximate correlation) are used to demonstrate the efficacy of the proposed predictor. The method is validated on various organisms using the standard data sets HMR195, Burset and Guigo, and KEGG. The simulation results show that the proposed method is an effective approach for protein coding region prediction.
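The period-3 property mentioned in this abstract can be illustrated with a minimal sketch. It assumes the classic four binary base indicator sequences (one each for A, C, G and T), not the amino acid indicator sequence the paper proposes, and the function names `period3_power` and `sliding_period3` are hypothetical helpers introduced only for illustration.

```python
import cmath


def period3_power(seq):
    """Normalized spectral power at frequency 2*pi/3, summed over the four
    binary base indicator sequences (assumed representation, not the
    paper's amino acid indicator)."""
    n = len(seq)
    total = 0.0
    for base in "ACGT":
        acc = 0j
        for i, ch in enumerate(seq.upper()):
            if ch == base:
                # DFT coefficient at k = N/3 of the indicator for this base
                acc += cmath.exp(-2j * cmath.pi * i / 3)
        total += abs(acc) ** 2
    return total / n


def sliding_period3(seq, window=60, step=3):
    """Period-3 power in successive windows; peaks suggest candidate exons."""
    return [
        (start, period3_power(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    print(period3_power("ATG" * 40))   # strongly period-3 toy sequence
    print(period3_power("ACGT" * 30))  # period-4 toy sequence, no period-3 peak
```

On real data the window length and the decision threshold would have to be tuned; the values above are arbitrary choices for the toy input.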
The complete genomic sequence of the foot-and-mouth disease virus (FMDV) Chinese strain OH/CHA/99 was determined. The 8040 nt sequence and the deduced amino acid sequence were compared with published FMDV sequences. The results showed that OH/CHA/99 shared higher sequence homology with OTYTW/97, indicating a close genetic relationship, but lower sequence identity with the O1/Kaufbeuren/66 strain. In addition, large deletions were observed in the 3A coding region of OH/CHA/99. The poly(A) tail of OH/CHA/99 was shown to contain at least 56 adenine residues.
Accurate identification of protein-coding regions (exons) in DNA sequences has been a challenging task in bioinformatics. In particular, coding regions have a 3-base periodicity, which forms the basis of all exon identification methods. Many signal processing tools and techniques have been applied successfully to the identification task, but further improvement is still needed. In this paper, we introduce a new model-independent time-frequency filtering technique based on the S-transform for accurate identification of coding regions. The S-transform is a powerful linear time-frequency representation that is useful for filtering in the time-frequency domain. The potential of the proposed technique has been assessed through a simulation study, and the results have been compared with those of existing methods using standard data sets. The comparative study demonstrates that the proposed method outperforms its counterparts in identifying coding regions.
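To make the time-frequency idea concrete, the following sketch evaluates a single "voice" of the discrete S-transform at the period-3 frequency (1/3 cycle per base) directly from its time-domain definition, using a Gaussian window whose width scales with 1/frequency. It is only an illustration under stated assumptions: the input is a single binary indicator sequence, the filtering step described in the abstract is not reproduced, and `s_transform_voice` is a hypothetical name.

```python
import cmath
import math


def s_transform_voice(x, freq=1.0 / 3.0):
    """Magnitude of the S-transform of x at one frequency (cycles/sample).

    Direct O(N^2) evaluation of
        S(tau, f) = sum_t x[t] * |f|/sqrt(2*pi)
                    * exp(-(tau - t)^2 * f^2 / 2) * exp(-2j*pi*f*t)
    """
    scale = abs(freq) / math.sqrt(2.0 * math.pi)
    out = []
    for tau in range(len(x)):
        acc = 0j
        for t, value in enumerate(x):
            if value:
                # Gaussian window centred on tau, standard deviation 1/|f|
                window = scale * math.exp(-((tau - t) ** 2) * freq ** 2 / 2.0)
                acc += value * window * cmath.exp(-2j * math.pi * freq * t)
        out.append(abs(acc))
    return out


if __name__ == "__main__":
    # Toy indicator: a period-3 stretch flanked by empty regions
    x = [0] * 60 + [1, 0, 0] * 20 + [0] * 60
    mag = s_transform_voice(x)
    print(mag[10], mag[90])  # outside vs. inside the period-3 stretch
```

The direct double loop is quadratic in the sequence length; an FFT-based formulation would be the usual choice for long sequences.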
A class of multistage filters, namely the real narrowband bandpass filter (RNBPF), has previously been used for identification of protein coding regions. This filter passes the frequency component at 2π/3 along with its conjugate, and the conjugate frequency component may degrade the identification accuracy. To improve accuracy, two types of multistage filters are proposed in this paper. The first is a complex narrowband bandpass filter (CNBPF), which suppresses the conjugate frequency component; this in turn reduces the background noise present in the deoxyribonucleic acid (DNA) spectrum and improves identification accuracy. The second is obtained by cascading the RNBPF with a moving average filter (RNBPFMA). Because the moving average filter smooths out rapid variations in the DNA spectrum, RNBPFMA also improves identification accuracy. The computational complexity of RNBPFMA is lower than that of CNBPF. The RNBPF and the proposed multistage filters are compared with the previously reported short-time discrete Fourier transform (ST-DFT) method in terms of computational complexity, and the multistage filters are found to reduce the computational load considerably relative to ST-DFT. The identification accuracy of the proposed CNBPF and RNBPFMA methods is compared with that of the existing anti-notch filter and RNBPF methods. The results show that the proposed methods outperform the existing methods in identification accuracy on benchmark data sets.
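The two ideas in this abstract, passing 2π/3 without its conjugate and smoothing the result with a moving average, can be sketched with a much simpler stand-in. The code below assumes a single-pole complex resonator centred at +2π/3, which is not the multistage CNBPF or RNBPFMA design of the paper; `complex_resonator`, `moving_average` and the pole radius `r` are hypothetical choices made for illustration.

```python
import cmath


def complex_resonator(x, r=0.9, omega0=2 * cmath.pi / 3):
    """One-pole complex bandpass filter y[n] = r*e^{j*omega0}*y[n-1] + (1-r)*x[n].

    Its pole sits only at +omega0, so the conjugate component at -omega0
    is attenuated instead of passed. Returns the output power |y[n]|^2.
    """
    pole = r * cmath.exp(1j * omega0)
    y = 0j
    power = []
    for sample in x:
        y = pole * y + (1 - r) * sample
        power.append(abs(y) ** 2)
    return power


def moving_average(values, width=9):
    """Causal moving average that smooths rapid variations in the power."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= width:
            running -= values[i - width]
        out.append(running / min(i + 1, width))
    return out


if __name__ == "__main__":
    periodic = complex_resonator([1, 0, 0] * 40)  # period-3 indicator
    constant = complex_resonator([1] * 120)       # no period-3 content
    print(moving_average(periodic)[-1], moving_average(constant)[-1])
```

A pole radius closer to 1 narrows the passband at the cost of a longer transient, which is the same accuracy versus resolution trade-off that any narrowband design faces.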