Funding: Funded by the Directorate of Research and Community Service, Directorate General of Research and Development, Ministry of Higher Education, Science and Technology, in accordance with the Implementation Contract for the Operational Assistance Program for State Universities, Research Program Number: 109/C3/DT.05.00/PL/2025.
Abstract: Sudden wildfires cause significant global ecological damage. While satellite imagery has advanced early fire detection and mitigation, image-based systems face limitations including high false-alarm rates, visual obstructions, and substantial computational demands, especially in complex forest terrain. To address these challenges, this study proposes a novel forest fire detection model based on audio classification and machine learning. We developed an audio pipeline using real-world environmental sound recordings. Sounds were converted into Mel-spectrograms and classified with a convolutional neural network (CNN), capturing distinctive fire acoustic signatures (e.g., crackling, roaring) that are minimally affected by visual or weather conditions. Internet of Things (IoT) sound sensors were crucial for gathering complex environmental parameters to optimize feature extraction. The CNN achieved high performance in stratified 5-fold cross-validation (92.4% ± 1.6% accuracy, 91.2% ± 1.8% F1-score) and on held-out test data (94.93% accuracy, 93.04% F1-score), with 98.44% precision and 88.32% recall, demonstrating reliability across environmental conditions. These results indicate that the audio-based approach not only improves detection reliability but also markedly reduces computational overhead compared with traditional image-based methods. The findings suggest that acoustic sensing combined with machine learning offers a powerful, low-cost, and efficient solution for real-time forest fire monitoring in complex, dynamic environments.
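A minimal sketch of the pipeline this abstract describes, assuming a librosa front end and a small PyTorch CNN; the sample rate, Mel resolution, and every layer size are illustrative assumptions, not the authors' reported architecture:

```python
# Waveform -> log-Mel-spectrogram -> small CNN (fire / no-fire).
# Hypothetical sketch: all hyperparameters below are assumptions.
import librosa
import numpy as np
import torch
import torch.nn as nn

def to_mel_spectrogram(path: str, sr: int = 22050, n_mels: int = 128) -> torch.Tensor:
    """Load one audio clip and return a log-scaled Mel-spectrogram tensor."""
    y, sr = librosa.load(path, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    mel_db = librosa.power_to_db(mel, ref=np.max)          # log scale
    return torch.from_numpy(mel_db).float().unsqueeze(0)   # (1, n_mels, frames)

fire_cnn = nn.Sequential(                                   # illustrative CNN head
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 2),                                       # logits: [no-fire, fire]
)

# Usage: logits = fire_cnn(to_mel_spectrogram("clip.wav").unsqueeze(0))
```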
Funding: Supported in part by the National Natural Science Foundation of China (61876011), the National Key Research and Development Program of China (2022YFB4703700), the Key Research and Development Program 2020 of Guangzhou (202007050002), and the Key-Area Research and Development Program of Guangdong Province (2020B090921003).
Abstract: Recently, there have been several attempts to apply Transformers to 3D point cloud classification. To reduce computation, most existing methods focus on local spatial attention, but they ignore point content and fail to establish relationships between distant but relevant points. To overcome this limitation of local spatial attention, we propose a point content-based Transformer architecture, called PointConT for short. It exploits the locality of points in the feature space (content-based): sampled points with similar features are clustered into the same class, and self-attention is computed within each class, enabling an effective trade-off between capturing long-range dependencies and computational complexity. We further introduce an inception feature aggregator for point cloud classification, which uses parallel structures to aggregate high-frequency and low-frequency information in separate branches. Extensive experiments show that our PointConT model achieves remarkable performance on point cloud shape classification. In particular, our method achieves 90.3% Top-1 accuracy on the hardest setting of ScanObjectNN. Source code of this paper is available at https://github.com/yahuiliu99/PointConT.
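A minimal sketch of the content-based attention idea, assuming PyTorch: points are grouped by feature similarity with a single k-means-style assignment step, and self-attention runs independently inside each group. The cluster count, head count, one-step assignment, and module-per-call construction are simplifying assumptions; PointConT's actual implementation differs in detail.

```python
# Content-based local attention: cluster points by FEATURE similarity
# (not spatial position), then run self-attention within each cluster.
import torch
import torch.nn as nn

def content_based_attention(feats: torch.Tensor, num_clusters: int = 4,
                            num_heads: int = 4) -> torch.Tensor:
    """feats: (N, C) features of one point cloud; C must be divisible by num_heads."""
    n, c = feats.shape
    centroids = feats[torch.randperm(n)[:num_clusters]]      # random initial centroids
    assign = torch.cdist(feats, centroids).argmin(dim=1)     # nearest-centroid label
    attn = nn.MultiheadAttention(c, num_heads, batch_first=True)
    out = feats.clone()
    for k in range(num_clusters):
        idx = (assign == k).nonzero(as_tuple=True)[0]
        if idx.numel() == 0:
            continue
        group = feats[idx].unsqueeze(0)                      # (1, Nk, C)
        attended, _ = attn(group, group, group)              # attention inside cluster
        out[idx] = attended.squeeze(0)
    return out

# Usage: content_based_attention(torch.randn(256, 64))
```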
Funding: Supported by the Science Item of the National Power Company (No. SPKJ016-071).
Abstract: A schema for content-based analysis of broadcast news video is presented. First, we separate commercials from news using audiovisual features. Then, we automatically organize news programs into a content hierarchy at various levels of abstraction through effective integration of the video, audio, and text data available in the programs. Based on these news video structure and content analysis technologies, a TV news video library is generated, from which users can retrieve specific news stories according to their needs.
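A minimal sketch of the commercial/news separation step under stated assumptions: the three per-shot audiovisual features are synthetic placeholders, and the decision-tree classifier is purely illustrative, since the abstract does not specify a classifier here.

```python
# Shot-level commercial/news separation from audiovisual features.
# Everything here is a placeholder, not the paper's actual method.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
# Columns: [mean audio energy, shot length (s), frame-difference activity]
X_train = rng.normal(size=(200, 3))
y_train = rng.integers(0, 2, size=200)          # 1 = commercial, 0 = news

clf = DecisionTreeClassifier(max_depth=3).fit(X_train, y_train)
print(clf.predict(rng.normal(size=(5, 3))))     # predicted label per unseen shot
```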
Abstract: AIM: To support probe-based confocal laser endomicroscopy (pCLE) diagnosis by designing software for the automated classification of colonic polyps. METHODS: Intravenous fluorescein pCLE imaging of colorectal lesions was performed on patients undergoing screening and surveillance colonoscopies, followed by polypectomies. All resected specimens were reviewed by a reference gastrointestinal pathologist blinded to pCLE information. Histopathology was used as the criterion standard for the differentiation between neoplastic and non-neoplastic lesions. The pCLE video sequences recorded for each polyp were analyzed off-line by two expert endoscopists who were blinded to the endoscopic characteristics and histopathology. These pCLE videos, along with their histopathology diagnoses, were used to train the automated classification software, a content-based image retrieval technique followed by k-nearest-neighbor classification. The performance of the off-line diagnosis of pCLE videos established by the two expert endoscopists was compared with that of the automated pCLE software classification. All evaluations were performed using leave-one-patient-out cross-validation to avoid bias. RESULTS: A total of 135 colorectal lesions were imaged in 71 patients. Based on histopathology, 93 of these 135 lesions were neoplastic and 42 were non-neoplastic. The study found no statistically significant difference between the performance of the automated pCLE software classification (accuracy 89.6%, sensitivity 92.5%, specificity 83.3%, using leave-one-patient-out cross-validation) and the performance of the off-line diagnosis of pCLE videos established by the two expert endoscopists (accuracy 89.6%, sensitivity 91.4%, specificity 85.7%). There was very low power (< 6%) to detect the observed differences. The 95% confidence intervals for equivalence testing were -0.073 to 0.073 for accuracy, -0.068 to 0.089 for sensitivity, and -0.18 to 0.13 for specificity. The classification software proposed in this study is not a "black box" but an informative tool based on the query-by-example model that produces, as intermediate results, visually similar annotated videos that are directly interpretable by the endoscopist. CONCLUSION: The proposed software for automated classification of pCLE videos of colonic polyps achieves high performance, comparable to that of off-line diagnosis of pCLE videos established by expert endoscopists.
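A minimal sketch of the evaluation protocol, assuming scikit-learn: k-nearest-neighbor classification of per-video signatures, validated with leave-one-patient-out cross-validation so no patient contributes to both training and test folds. The feature matrix, labels, patient IDs, and k = 5 are synthetic placeholders; the paper's retrieval-based video signatures are abstracted away.

```python
# k-NN over per-video signatures with leave-one-patient-out validation.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import LeaveOneGroupOut, cross_val_predict
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(135, 64))             # one signature per pCLE video (placeholder)
y = rng.integers(0, 2, size=135)           # 1 = neoplastic, 0 = non-neoplastic
patients = rng.integers(0, 71, size=135)   # patient ID per video

knn = KNeighborsClassifier(n_neighbors=5)
# Each fold holds out every video from one patient.
pred = cross_val_predict(knn, X, y, groups=patients, cv=LeaveOneGroupOut())
print(f"leave-one-patient-out accuracy: {accuracy_score(y, pred):.3f}")
```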
Abstract: This paper presents a novel, efficient semantic image classification algorithm for high-level feature indexing of a high-dimensional image database. Experiments show that the algorithm performs well: the training and test sets contain 7,537 and 5,000 images, respectively. Based on this approach, another ground-truth set of 12,000 images is built and divided into three classes (city, landscape, and person); the overall classification accuracy is 88.92%. Meanwhile, some preliminary results are presented for image understanding based on semantic image classification and low-level features. The ground truth for the experiments is built from images in the Corel database, photographs, and several well-known face databases.
Abstract: Purpose – Current practices in data classification and retrieval have seen a surge in the use of multimedia content. Identifying desired information in huge image databases involves increasing complexity in designing an efficient feature extraction process. Conventional approaches to image classification with text-based image annotation face assorted limitations due to erroneous interpretation of vocabulary and the huge time consumption of manual annotation. Content-based image recognition has emerged as an alternative to combat these limitations. However, exploring the rich feature content of an image with a single technique is less likely to extract meaningful signatures than multi-technique feature extraction. Therefore, the purpose of this paper is to explore the possibilities of enhanced content-based image recognition by fusing classification decisions obtained with diverse feature extraction techniques. Design/methodology/approach – Three novel feature extraction techniques are introduced and tested with four different classifiers individually. The four classifiers used for performance testing were the K-nearest neighbor (KNN) classifier, the RIDOR classifier, an artificial neural network classifier, and a support vector machine classifier. Thereafter, classification decisions obtained using the KNN classifier for the different feature extraction techniques were integrated by Z-score normalization and feature scaling to create a fusion-based framework for image recognition. This was followed by the introduction of a fusion-based retrieval model to validate retrieval performance with classified queries. Earlier works on content-based image identification have adopted fusion-based approaches; however, to the best of the authors' knowledge, fusion-based query classification is addressed for the first time as a precursor of retrieval in this work. Findings – The proposed fusion techniques outperform state-of-the-art techniques in classification and retrieval performance. Four public data sets, namely the Wang, Oliva and Torralba (OT-scene), Corel, and Caltech data sets, comprising 22,615 images in total, are used for evaluation. Originality/value – To the best of the authors' knowledge, fusion-based query classification is addressed for the first time as a precursor of retrieval in this work. The novel idea of exploring rich image features by fusing multiple feature extraction techniques also encourages further research on dimensionality reduction of feature vectors for enhanced classification results.
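A minimal sketch of the decision-fusion step described in the design section, under stated assumptions: each technique's per-class scores are Z-score normalized, min-max scaled to a common range, and averaged, with the highest fused score deciding the query class. The three score vectors are synthetic placeholders for the KNN outputs of the three feature-extraction techniques.

```python
# Decision fusion by Z-score normalization plus feature scaling.
import numpy as np

def fuse_scores(score_sets: list) -> np.ndarray:
    """Each array holds one technique's per-class scores for a query."""
    fused = np.zeros_like(score_sets[0], dtype=float)
    for s in score_sets:
        z = (s - s.mean()) / (s.std() + 1e-12)                 # Z-score normalization
        fused += (z - z.min()) / (z.max() - z.min() + 1e-12)   # scale to [0, 1]
    return fused / len(score_sets)

technique_scores = [np.array([0.2, 0.9, 0.4]),     # technique 1 per-class scores
                    np.array([12.0, 30.0, 8.0]),   # technique 2 (different range)
                    np.array([0.55, 0.60, 0.50])]  # technique 3
print("fused class:", int(fuse_scores(technique_scores).argmax()))
```

The normalization step is what makes the fusion meaningful: without it, the technique whose raw scores span the largest range (technique 2 above) would dominate the average.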