Funding: National Natural Science Foundation of China (No. 61972081); Fundamental Research Funds for the Central Universities, China (No. 2232023Y-01); Natural Science Foundation of Shanghai, China (No. 22ZR1400200).
Abstract: At present, convolutional neural networks (CNNs) and transformers surpass humans in many tasks (such as face recognition and object classification), but they do not work well at identifying fibers in textile surface images. Hence, this paper proposes an architecture named FiberCT, which takes advantage of the feature extraction capability of CNNs and the long-range modeling capability of transformer decoders to adaptively extract multiple types of fiber features. Firstly, the convolution module extracts fiber features from the input textile surface images. Secondly, these features are sent into the transformer decoder module, where label embeddings are compared with the features of each type of fiber through multi-head cross-attention and the desired features are pooled adaptively. Finally, an asymmetric loss further refines the extracted fiber representations. Experiments show that FiberCT extracts the representations of various types of fibers more effectively and achieves higher fiber identification accuracy than state-of-the-art multi-label classification approaches.
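The two less familiar steps in the abstract, label embeddings pooling features through cross-attention and an asymmetric loss over the multi-label outputs, can be illustrated with a minimal sketch. It is not the FiberCT implementation: it uses a single attention head with no learned projections or stacked decoder layers, and the loss follows the form commonly used for multi-label classification, which the abstract does not spell out. The function names, the toy prediction head, and the hyperparameters `gamma_pos`, `gamma_neg`, and `margin` are all assumptions made for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_query_pool(label_embeddings, features):
    """Single-head cross-attention with label embeddings as queries.

    label_embeddings: K vectors of size d, one per fiber type (hypothetical).
    features: N vectors of size d, e.g. flattened CNN feature-map positions.
    Returns K pooled vectors of size d, one per label.
    """
    d = len(features[0])
    pooled = []
    for query in label_embeddings:
        # Scaled dot-product score between this label and every position.
        scores = [
            sum(q * f for q, f in zip(query, feat)) / math.sqrt(d)
            for feat in features
        ]
        weights = softmax(scores)
        # Attention-weighted average of the features for this label.
        pooled.append(
            [sum(w * feat[j] for w, feat in zip(weights, features)) for j in range(d)]
        )
    return pooled


def asymmetric_loss(probs, targets, gamma_pos=0.0, gamma_neg=4.0, margin=0.05, eps=1e-8):
    """Asymmetric loss averaged over labels (assumed form, not taken from the paper)."""
    total = 0.0
    for p, y in zip(probs, targets):
        if y == 1:
            # Positive labels: focal-style term with a small focusing exponent.
            total -= (1.0 - p) ** gamma_pos * math.log(max(p, eps))
        else:
            # Negative labels: shift the probability down so easy negatives
            # contribute nothing, then apply a stronger focusing exponent.
            p_shifted = max(p - margin, 0.0)
            total -= p_shifted ** gamma_neg * math.log(max(1.0 - p_shifted, eps))
    return total / len(probs)


# Toy data: 6 spatial positions, 3 fiber labels, feature size 4.
random.seed(0)
features = [[random.gauss(0, 1) for _ in range(4)] for _ in range(6)]
label_embeddings = [[random.gauss(0, 1) for _ in range(4)] for _ in range(3)]

pooled = label_query_pool(label_embeddings, features)
# Toy per-label head (an assumption): squash the mean of each pooled vector.
probs = [1.0 / (1.0 + math.exp(-sum(v) / len(v))) for v in pooled]
loss = asymmetric_loss(probs, targets=[1, 0, 1])
print(len(pooled), len(pooled[0]), loss >= 0.0)
```

Because each label has its own query, each fiber type receives its own attention weights over the image positions, which is one way to read "pooled adaptively". The asymmetric exponents let the many absent fiber types in an image contribute less to the loss than the few present ones.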