Abstract: To address the problems that building-polygon simplification methods in map generalization rely on hand-crafted rules, offer a low degree of automation, and struggle to reuse existing simplification results, this paper proposes a building-polygon simplification model based on the Transformer mechanism. The model first maps building polygons into a bounded grid space and expresses their coordinate strings as grid sequences, thereby obtaining token sequences of the polygons before and after simplification and constructing paired simplification samples. A Transformer-based model is then built on these samples: its masked self-attention mechanism learns the dependencies between point sequences and generates the new simplified polygon point by point, accomplishing the simplification. During training, the model uses the structured sample data together with a cross-entropy loss function that ignores specific indices to improve simplification quality. The experiments comprise a main experiment and a generalization test. The main experiment, based on a 1:2000 building dataset of Los Angeles, encodes polygons at three grid sizes (0.2, 0.3, and 0.5 mm) and performs simplification to target scales of 1:5000 and 1:10000. The results show that the model performs best at a grid size of 0.3 mm, with the simplification results on the validation set agreeing with manual annotations at a rate above 92.0%; a generalization test on building-polygon data from parts of Beijing confirms the model's transferability. A comparison with an LSTM model of similar parameter scale shows that the LSTM fails to converge effectively or to produce usable results. This work confirms the potential of the Transformer for spatial geometric sequence tasks and its ability to reuse existing simplification samples, offering a route to intelligent building-polygon simplification with practical engineering value.
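The grid-encoding step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names (`encode_polygon`, `pad_sequence`), the grid size, and the token-id scheme are all assumptions made for demonstration.

```python
# Hypothetical sketch of grid tokenization: quantize polygon vertices into a
# fixed grid_size x grid_size grid and emit one token id per vertex, with id 0
# reserved as padding so a loss function can ignore that index.
PAD_ID = 0  # reserved token, ignored by the training loss

def encode_polygon(vertices, bbox, grid_size=64):
    """Quantize (x, y) vertices inside bbox into grid-cell token ids.

    Token id = row * grid_size + col + 1 (ids start at 1; 0 stays padding).
    """
    min_x, min_y, max_x, max_y = bbox
    span_x = (max_x - min_x) or 1.0
    span_y = (max_y - min_y) or 1.0
    tokens = []
    for x, y in vertices:
        col = min(int((x - min_x) / span_x * grid_size), grid_size - 1)
        row = min(int((y - min_y) / span_y * grid_size), grid_size - 1)
        tokens.append(row * grid_size + col + 1)
    return tokens

def pad_sequence(tokens, length):
    """Right-pad a token sequence with PAD_ID so batched samples align."""
    return tokens + [PAD_ID] * (length - len(tokens))
```

Pairing such token sequences for the original and the manually simplified polygon yields the sample pairs the abstract refers to; a cross-entropy loss would then be configured to skip `PAD_ID` positions.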
Abstract: The language barrier is the biggest obstacle for users watching foreign-language videos. Because of it, videos cannot gain popularity across borders, and their viewership is limited to a single language and culture. The easiest way to solve this problem is to add subtitles in the viewer's language. However, the current subtitling system lacks incentives, a secure transaction environment, and a trusting relationship between video creators and subtitle makers. In response, a tokenized subtitling crowdsourcing system (TSCS) based on blockchain and smart-contract technologies is proposed. In the proposed system, the source files for the subtitles are stored on the InterPlanetary File System (IPFS). Based on the ERC-721 standard, the returned address and the subtitling-related information are minted into a non-fungible token (NFT). At the same time, depending on the expected revenue from video view counts, the video token (VT), based on the ERC-777 standard and endorsed by the video platform, is used as the payment token. The TSCS has two payment strategies: one-time and dividend. Through this settlement mechanism, the subtitle maker's revenue is guaranteed by the code invariance and rule certainty of smart-contract deployment. In addition, introducing an incentive mechanism for viewers to audit subtitles enables community autonomy, thus increasing the applicability of subtitles and the activity of users.
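The two payment strategies can be sketched in plain code. In the actual TSCS this logic would live in Solidity smart contracts; the functions, the per-view rate, and the cap below are illustrative assumptions, not the paper's contract interface.

```python
# Toy sketch of the two TSCS settlement strategies paid in VT:
# one-time (fixed price) and dividend (share of view revenue, capped).

def one_time_payment(price_vt):
    """One-time strategy: the full agreed price in VT, paid once."""
    return price_vt

def dividend_payment(view_count, rate_per_view, cap_vt):
    """Dividend strategy: per-view revenue share, capped at cap_vt."""
    return min(view_count * rate_per_view, cap_vt)
```

A subtitle maker choosing the dividend strategy trades a guaranteed lump sum for upside tied to the video's view counts, which is why platform endorsement of the VT matters.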
Funding: Supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) [RS-2021-II211341, Artificial Intelligence Graduate School Program (Chung-Ang University)], and by the Chung-Ang University Graduate Research Scholarship in 2024.
Abstract: Legal case classification involves the categorization of legal documents into predefined categories, which facilitates legal information retrieval and case management. However, real-world legal datasets often suffer from class imbalance due to the uneven distribution of case types across legal domains. This leads to biased model performance: high accuracy for overrepresented categories and underperformance for minority classes. To address this issue, we propose a data augmentation method that selectively masks unimportant terms within a document while preserving key terms from the perspective of the legal domain. This approach enhances data diversity and improves the generalization capability of conventional models. Our experiments demonstrate consistent improvements from the proposed augmentation strategy in accuracy and F1 score across all models, validating its effectiveness in legal case classification.
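The selective-masking idea above can be sketched in a few lines. This is a hedged illustration: the key-term set, mask rate, and `[MASK]` placeholder are assumptions, and the paper's method of identifying unimportant terms is not reproduced here.

```python
# Minimal sketch of selective masking for augmentation: replace a fraction of
# tokens with [MASK], but never mask tokens in a protected key-term set.
import random

def selective_mask(tokens, key_terms, mask_rate=0.3, seed=0):
    """Mask non-key tokens with probability mask_rate; key terms survive."""
    rng = random.Random(seed)  # seeded for reproducible augmentation
    out = []
    for tok in tokens:
        if tok not in key_terms and rng.random() < mask_rate:
            out.append("[MASK]")
        else:
            out.append(tok)
    return out
```

Each call with a different seed yields a distinct augmented copy of the same document, which is how the approach increases data diversity for minority classes.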
Funding: Supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2025R195), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: With the rapid growth of online news, fake news detection has become one of the most important paradigms of modern research. Traditional detection techniques generally struggle with contextual understanding, sequential dependencies, and/or data imbalance, which makes distinguishing genuine from fabricated news a challenging task. To address this problem, we propose a novel hybrid architecture, T5-SA-LSTM, which synergistically integrates the T5 Transformer, for semantically rich contextual embeddings, with a Self-Attention-enhanced (SA) Long Short-Term Memory (LSTM) network. The LSTM is trained with the Adam optimizer, which provides faster and more stable convergence than Stochastic Gradient Descent (SGD) and Root Mean Square Propagation (RMSProp). The WELFake and FakeNewsPrediction datasets are used, which consist of labeled news articles containing fake and real samples. Tokenization and the Synthetic Minority Over-sampling Technique (SMOTE) are applied in preprocessing to ensure linguistic normalization and to address class imbalance. The Self-Attention (SA) mechanism enables the model to highlight critical words and phrases, thereby enhancing predictive accuracy. The proposed model is evaluated using accuracy, precision, recall (sensitivity), and F1-score. It achieved 99% accuracy on the WELFake dataset and 96.5% on the FakeNewsPrediction dataset, outperforming competitive schemes such as T5-SA-LSTM (RMSProp), T5-SA-LSTM (SGD), and other models.
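The SMOTE step mentioned above rests on a simple idea: a synthetic minority sample is an interpolation between a minority point and one of its neighbors. The toy below interpolates toward the nearest neighbor only and uses a fixed interpolation weight; real SMOTE picks a random neighbor among k and a random weight, so treat this as an assumption-laden sketch.

```python
# Toy SMOTE-style oversampling: synthesize a point on the segment between a
# minority sample and its nearest other minority sample.

def smote_sample(minority, idx, alpha=0.5):
    """Interpolate minority[idx] a fraction alpha toward its nearest neighbor."""
    base = minority[idx]
    # nearest neighbor by squared Euclidean distance (excluding the point itself)
    neighbor = min(
        (p for j, p in enumerate(minority) if j != idx),
        key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
    )
    return tuple(a + alpha * (b - a) for a, b in zip(base, neighbor))
```

Repeating this over minority samples grows the minority class without duplicating points verbatim, which is what lets the classifier see a balanced training signal.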
Funding: Supported by the National Key Research and Development Program of China (Project No. 2022YFC3320800) and the National Natural Science Foundation of China (Project No. 72571210).
Abstract: Non-fungible tokens (NFTs) have become highly sought-after assets in recent years, exhibiting potential for profitability and hedging. The large and lucrative NFT market has attracted both practitioners and researchers to develop NFT price-prediction models. However, the extant models have weaknesses in model comprehensiveness and operational convenience. To address these research gaps, we propose a multimodal end-to-end interpretable deep learning (MEID) framework for NFT investment. Our model integrates visual features, textual descriptions, transaction indicators, and historical price time series by leveraging convolutional neural networks (CNNs), adopts integrated gradients (IG) to improve interpretability, and includes a built-in financial evaluation mechanism that generates not only the predicted price category but also a recommended purchase level. The experimental results demonstrate that the proposed MEID framework performs well across the evaluation metrics. It could help investors identify market opportunities and help NFT trading platforms design smart investment tools and increase transaction volume.
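The integrated-gradients attribution used above has a compact definition: the attribution for an input is the input-minus-baseline difference times the average gradient along the straight path from baseline to input. The one-dimensional toy below approximates that path integral with a midpoint Riemann sum; the paper applies IG to a CNN, so this is only a sketch of the underlying formula.

```python
# 1-D integrated gradients: IG(x) = (x - b) * mean of grad_f on the path b -> x.
# For well-behaved f this satisfies completeness: IG(x) ~= f(x) - f(b).

def integrated_gradients(grad_f, x, baseline=0.0, steps=1000):
    """Approximate the IG path integral with a midpoint Riemann sum."""
    total = 0.0
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th sub-interval
        total += grad_f(baseline + alpha * (x - baseline))
    return (x - baseline) * total / steps
```

For f(x) = x^2 with baseline 0, the attribution at x = 3 is 9, which equals f(3) - f(0), illustrating the completeness property that makes IG attributions interpretable as a decomposition of the prediction.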
Funding: Supported by the National Key Research and Development Program of China (No. 2022YFF0610003), the BUPT Excellent Ph.D. Students Foundation (No. CX2022218), and the Fund of Central University Basic Research Projects (No. 2023ZCTH11).
Abstract: Data trading is a crucial means of unlocking the value of Internet of Things (IoT) data. However, IoT data differs from traditional material goods in its intangible and replicable nature. This difference leads to ambiguous data rights, confusing pricing, and challenges in matching. Additionally, centralized IoT data trading platforms pose risks such as privacy leakage. To address these issues, we propose a profit-driven distributed trading mechanism for IoT data. First, a blockchain-based trading architecture for IoT data, leveraging the transparent and tamper-proof features of blockchain technology, is proposed to establish trust between data owners and data requesters. Second, an IoT data registration method covering both rights confirmation and pricing is designed. The rights-confirmation method uses non-fungible tokens to record ownership and authenticate IoT data. For pricing, we develop an IoT data value assessment index system and introduce a pricing model that combines the sparrow search algorithm with a back-propagation neural network. Finally, an IoT data matching method is designed based on the Stackelberg game: a game model involving multiple data owners and requesters is established, and a hierarchical optimization method determines the optimal purchase strategy. The security of the mechanism is analyzed, and the performance of both the pricing and matching methods is evaluated. Experiments demonstrate that both methods outperform traditional approaches in error rates and profit maximization.
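The Stackelberg matching step above follows a leader-follower pattern that a toy sketch can make concrete: the data owner (leader) posts a price, each requester (follower) best-responds with a purchase quantity, and the leader picks the price maximizing its profit given those responses. The linear demand form, valuations, and price grid below are illustrative assumptions, not the paper's utility model.

```python
# Toy Stackelberg pricing: leader searches a price grid; followers respond
# with a demand that shrinks as price approaches their valuation.

def follower_quantity(price, valuation, max_q=10):
    """Follower's best response: buy nothing at or above valuation,
    otherwise a quantity proportional to the remaining surplus."""
    if price >= valuation:
        return 0
    return min(max_q, int((valuation - price) * max_q / valuation))

def leader_best_price(valuations, prices, cost=0.0):
    """Leader picks the grid price maximizing total profit over followers."""
    def profit(p):
        return sum((p - cost) * follower_quantity(p, v) for v in valuations)
    return max(prices, key=profit)
```

The hierarchical structure is visible in the code: the leader's objective is evaluated only after substituting each follower's best response, which is exactly the bilevel optimization the abstract describes.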
Abstract: Drone photography is an essential building block of intelligent transportation, enabling wide-ranging monitoring, precise positioning, and rapid transmission. However, the high computational cost of transformer-based object detection methods hinders real-time result transmission in drone target detection applications. Therefore, we propose a mask adaptive transformer (MAT) tailored for such scenarios. Specifically, we introduce a structure that supports collaborative token sparsification within support windows, enhancing fault tolerance and reducing computational overhead. This structure comprises two modules: a binary mask strategy and adaptive window self-attention (A-WSA). The binary mask strategy focuses attention on significant objects across complex scenes. The A-WSA mechanism applies self-attention to the selected objects in a way that balances performance against computational cost while isolating contextual leakage. Extensive experiments on the challenging CARPK and VisDrone datasets demonstrate the effectiveness and superiority of the proposed method: it achieves a 1.25% mean average precision (mAP@0.5) improvement over the car detector based on You Only Look Once version 5 (CD-YOLOv5) on the CARPK dataset and a 3.75% average precision (AP@0.5) improvement over the cascaded zoom-in detector (CZ Det) on the VisDrone dataset.
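The binary-mask sparsification idea can be sketched simply: score each token's saliency, keep only those above a threshold, and run attention on the shorter sequence. The scores and threshold below stand in for the learned mask described in the abstract; this is an illustration of the mechanism, not the MAT implementation.

```python
# Minimal token sparsification: threshold per-token saliency scores into a
# binary mask and drop the masked-out tokens before attention.

def sparsify_tokens(tokens, scores, threshold=0.5):
    """Return (kept_tokens, binary_mask) after thresholding saliency scores."""
    mask = [1 if s >= threshold else 0 for s in scores]
    kept = [t for t, m in zip(tokens, mask) if m]
    return kept, mask
```

Since self-attention cost grows quadratically with sequence length, halving the kept tokens roughly quarters the attention cost, which is the source of the computational savings the abstract claims.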
Abstract: In the metaverse, digital assets are essential to define identity, shape the virtual environment, and facilitate economic transactions. This study introduces a novel feature to the metaverse by capturing a fundamental aspect of individuals, their conversations, and transforming them into digital assets. It utilizes natural language processing and machine learning methods to extract key sentences from user conversations and match them with emojis that reflect their sentiments. The selected sentence, which encapsulates the essence of the user's statements, is then transformed into digital art through a generative visual model. This artwork is minted as a non-fungible token, becoming a valuable digital asset within the blockchain ecosystem that is well suited to integration into metaverse applications. Our aim is to manage personality traits as digital assets to foster individual uniqueness, enrich user experiences, and enable more personalized services and interactions with both like-minded users and non-player characters, thereby enhancing the overall user journey.
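The extract-and-match stage of the pipeline can be sketched with a crude sentiment lexicon: score each sentence, pick the most polarized one as the key sentence, and map its polarity to an emoji. The lexicon, scoring rule, and emoji table are all illustrative stand-ins for the NLP and sentiment models the study actually uses.

```python
# Toy key-sentence extraction: score sentences by summed lexicon polarity,
# select the most polarized sentence, and attach a polarity-matched emoji.
LEXICON = {"love": 2, "great": 1, "happy": 1, "sad": -1, "hate": -2}

def sentence_score(sentence):
    """Sum lexicon polarity over the lowercase tokens of a sentence."""
    return sum(LEXICON.get(w, 0) for w in sentence.lower().split())

def key_sentence_with_emoji(sentences):
    """Pick the most polarized sentence and pair it with a matching emoji."""
    best = max(sentences, key=lambda s: abs(sentence_score(s)))
    score = sentence_score(best)
    emoji = "😀" if score > 0 else ("😢" if score < 0 else "😐")
    return best, emoji
```

The selected sentence-emoji pair would then feed the generative visual model, and the resulting artwork would be minted as the NFT the abstract describes.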