Antiviral software systems (AVSs) have difficulty identifying polymorphic variants of viruses when they lack explicit signatures for those variants. Alignment-based techniques from bioinformatics may provide a novel way to generate signatures from consensuses found in polymorphic variant code. We demonstrate how multiple sequence alignment supplemented with gap penalties leads to viral code signatures that generalize successfully to previously known polymorphic variants of the JS.Cassandra virus and to previously unknown polymorphic variants of the W32.CTX/W32.Cholera and W32.Kitti viruses. The implication is that future smart AVSs may be able to generate effective signatures automatically from actual viral code by varying gap penalties to cover both known and unknown polymorphic variants.
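To illustrate how an alignment with a gap penalty can yield a consensus signature, here is a minimal sketch. It is not the authors' pipeline: it uses pairwise Needleman-Wunsch alignment rather than multiple sequence alignment, the token sequences are toy data rather than real viral code, and the names `align`, `consensus_signature`, `GAP`, and `WILDCARD` as well as the scoring values are assumptions made for this example.

```python
GAP = "-"        # hypothetical marker for an inserted gap
WILDCARD = "*"   # hypothetical marker for a non-conserved position


def align(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align two token lists (Needleman-Wunsch, linear gap penalty)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append(GAP)
            i -= 1
        else:
            out_a.append(GAP)
            out_b.append(b[j - 1])
            j -= 1
    return out_a[::-1], out_b[::-1]


def consensus_signature(aligned_a, aligned_b):
    """Keep tokens on which both aligned variants agree; wildcard the rest."""
    return [
        x if x == y and x != GAP else WILDCARD
        for x, y in zip(aligned_a, aligned_b)
    ]


# Toy "variants": the second one has a junk instruction inserted.
variant_1 = ["push", "mov", "xor", "call"]
variant_2 = ["push", "nop", "mov", "xor", "call"]

aligned_1, aligned_2 = align(variant_1, variant_2)
signature = consensus_signature(aligned_1, aligned_2)
print(signature)  # ['push', '*', 'mov', 'xor', 'call']
```

In this sketch the `gap` argument plays the role of the gap penalty mentioned in the abstract: a milder penalty lets the alignment open gaps more readily around inserted code, which changes which positions survive into the consensus.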
Antiviral software systems (AVSs) have difficulty detecting polymorphic variants of viruses when they lack specific signatures for those variants. Previous alignment-based approaches to automatic signature extraction have shown how signatures can be generated from consensuses found in polymorphic variant code. Such sequence alignment approaches required variable-length viral code to be extended, through gap insertions, into much longer equal-length code so that signatures could be extracted by data mining of consensuses. This paper uses non-nested generalized exemplars (NNge) in an attempt to further improve the automatic detection of polymorphic variants. Its main contribution is a comparison of a variable-length data mining technique applied to viral source code with the previously used equal-length data mining technique obtained through sequence alignment. The comparison was carried out in three experiments (Experiments I-III). Although Experiments I and II generated unique and effective syntactic signatures, Experiment III generated the most effective signatures, with an average detection rate of over 93%. The implication is that future syntactic-based smart AVSs may be able to generate effective signatures automatically from malware code by adopting data mining and alignment techniques to cover both known and unknown polymorphic variants, without the need for semantic (run-time) analysis.