Sign language is used as a communication medium in the fields of trade, defence, and deaf-mute communities worldwide. Over the last few decades, research in the domain of sign language translation has grown and become more challenging. This necessitates the development of a Sign Language Translation System (SLTS) to provide effective communication in different research domains. In this paper, a novel Hybrid Adaptive Gaussian Thresholding with Otsu Algorithm (Hybrid-AO) for image segmentation is proposed for the translation of alphabet-level Indian Sign Language (ISLTS) with a 5-layer Convolution Neural Network (CNN). The focus of this paper is to analyze various image segmentation methods (Canny Edge Detection, Simple Thresholding, and Hybrid-AO), pooling approaches (Max, Average, and Global Average Pooling), and activation functions (ReLU, Leaky ReLU, and ELU). The 5-layer CNN with Max pooling, the Leaky ReLU activation function, and Hybrid-AO (5MXLR-HAO) outperformed the other frameworks. An open-access dataset of ISL alphabets with approx. 31K images of 26 classes has been used to train and test the model. The proposed framework has been developed for translating alphabet-level Indian Sign Language into text. It attains 98.95% training accuracy, 98.05% validation accuracy, 0.0721 training loss, and 0.1021 validation loss, and the performance of the proposed system outperforms other existing systems.
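The Otsu component of the Hybrid-AO segmentation picks a global threshold by maximizing between-class variance. A minimal pure-NumPy sketch of that step follows; the abstract does not specify how Hybrid-AO fuses this with adaptive Gaussian thresholding, so only the Otsu step is shown, and the toy image is illustrative.

```python
import numpy as np

def otsu_threshold(gray):
    """Global threshold maximizing between-class variance (Otsu's method)."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    p = hist.astype(float) / hist.sum()
    bins = np.arange(256)
    omega = np.cumsum(p)          # probability mass of the "background" class at each t
    mu = np.cumsum(p * bins)      # cumulative mean intensity
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / denom
    sigma_b[denom == 0] = 0.0     # ignore degenerate thresholds (all-one-class splits)
    return int(np.argmax(sigma_b))

# Bimodal toy "image": dark background (10) and a bright hand region (200).
img = np.array([[10] * 5 + [200] * 5] * 4, dtype=np.uint8)
t = otsu_threshold(img)
mask = img > t                    # binary segmentation mask
```

The threshold lands between the two intensity modes, so `mask` isolates the bright region.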
Sign languages are mainly expressed by human actions, such as arm, hand, and finger motions. Thus a skeleton, which reflects human pose information, can provide an important cue for distinguishing signs (i.e., human actions), and can be used for sign language translation (SLT), which aims to translate sign language into spoken language. However, recent neural networks typically focus on extracting local-area or full-frame features while ignoring informative skeleton features. Therefore, this paper proposes a novel skeleton-aware neural network, SANet, for vision-based SLT. Specifically, to introduce the skeleton modality, we design a self-contained branch for skeleton extraction. To efficiently guide feature extraction from videos with skeletons, we concatenate the skeleton channel and the RGB channels of each frame for feature extraction. To distinguish the importance of clips (i.e., segmented short videos), we construct a skeleton-based graph convolutional network (GCN) for feature scaling, i.e., giving an importance weight to each clip. Finally, to generate spoken language from features, we provide an end-to-end method and a two-stage method for SLT. Besides, based on SANet, we provide an SLT solution on the smartphone to facilitate communication between hearing-impaired people and hearing people. Extensive experiments on three public datasets and case studies in real scenarios demonstrate the effectiveness of our method, which outperforms existing methods.
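The skeleton-channel fusion described above amounts to stacking a pose heatmap as an extra input channel alongside RGB. A sketch of that fusion (the frame size and the single-channel heatmap representation are illustrative assumptions, not SANet's exact dimensions):

```python
import numpy as np

# Illustrative shapes: an RGB frame and a skeleton heatmap at the same resolution.
frame = np.random.rand(224, 224, 3).astype(np.float32)   # H x W x RGB
skeleton = np.random.rand(224, 224).astype(np.float32)   # H x W pose heatmap

# Concatenate the skeleton as a fourth channel to guide feature extraction.
fused = np.concatenate([frame, skeleton[..., None]], axis=-1)
print(fused.shape)  # (224, 224, 4)
```

The downstream video feature extractor would then accept 4-channel inputs instead of 3-channel RGB.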
A sign language dataset is essential for sign language recognition and translation (SLRT). Current public sign language datasets are small and lack diversity, which does not meet the practical application requirements for SLRT. However, making a large-scale and diverse sign language dataset is difficult, as sign language data on the Internet is scarce. In making a large-scale and diverse sign language dataset, some sign language data is not up to quality standards. This paper proposes a two-information-streams transformer (TIST) model to judge whether the quality of sign language data is qualified. To verify that TIST effectively improves sign language recognition (SLR), we make two datasets, a screened dataset and an unscreened dataset. In this experiment, this paper uses visual alignment constraint (VAC) as the baseline model. The experimental results show that the screened dataset achieves a better word error rate (WER) than the unscreened dataset.
Purpose – According to the Indian Sign Language Research and Training Centre (ISLRTC), India has approximately 300 certified human interpreters to help people with hearing loss. This paper aims to address the issue of Indian Sign Language (ISL) sentence recognition and translation into semantically equivalent English text in a signer-independent mode.
Design/methodology/approach – This study presents an approach that translates ISL sentences into English text using the MobileNetV2 model and Neural Machine Translation (NMT). The authors have created an ISL corpus from the Brown corpus using ISL grammar rules to perform machine translation. The authors' approach converts ISL videos of the newly created dataset into ISL gloss sequences using the MobileNetV2 model, and the recognized ISL gloss sequence is then fed to a machine translation module that generates an English sentence for each ISL sentence.
Findings – As per the experimental results, the pretrained MobileNetV2 model proved the best-suited model for the recognition of ISL sentences, and NMT provided better results than Statistical Machine Translation (SMT) for converting ISL text into English text. The automatic and human evaluation of the proposed approach yielded accuracies of 83.3% and 86.1%, respectively.
Research limitations/implications – The neural machine translation systems occasionally produced translations with repetitions of other translated words, strange translations when the total number of words per sentence is increased, and one or more unexpected terms that had no relation to the source text. The most common type of error is the mistranslation of places, numbers and dates. Although this has little effect on the overall structure of the translated sentence, it indicates that the embedding learned for these few words could be improved.
Originality/value – Sign language recognition and translation is a crucial step toward improving communication between the deaf and the rest of society. Because of the shortage of human interpreters, an alternative approach is desired to help people achieve smooth communication with the Deaf. To motivate research in this field, the authors generated an ISL corpus of 13,720 sentences and a video dataset of 47,880 ISL videos. As there is no public dataset available for ISL videos incorporating signs released by ISLRTC, the authors created a new video dataset and ISL corpus.
This article proposes that the translation violence of the colonizing process—in terms of the confrontation between oral languages and ones with written grammars—coupled with the institutionalization of a given language as the national one, contains both a memory of the silencing of indigenous languages and cultures, and of transculturation. The latter, in turn, manifests itself, among other ways, through books that in different ways deal with the relationship between the West and descendants of the first-nation peoples of South America (especially in Circum-Roraima, the triple-border region between Brazil, Venezuela, and Guyana) and those of African descent (particularly in discussions about the use of the colonizer's language or the indigenous languages by writers from former colonies).
Interlingual communication is actually intercultural communication. Language is part of the culture of a people and the chief way by which the members of a society communicate. Based on these points, this article gives examples to show that there exist in translation both semantic correspondence and semantic zero between different languages and different cultures. When translating, we should try to make up for the semantic zero caused by culture.
Translation for language teaching is different from general translation, which is characterized by Faithfulness, Expressiveness, and Elegance. The differences lie in vocabulary, structure, and discourse. An extreme emphasis on translation skills makes it hard for English learners to acquire certain language elements. This paper analyzes the three levels of translation for language teaching from the perspective of Skopos Theory, aiming to draw the attention of translation teachers to students' needs in learning language elements through translation.
Lexicalized reordering models are very important components of phrase-based translation systems. By examining the reordering relationships between adjacent phrases, conventional methods learn these models from the word-aligned bilingual corpus while ignoring the effect of the number of adjacent bilingual phrases. In this paper, we propose a method that takes the number of adjacent phrases into account for better estimation of reordering models. Instead of just checking whether there is one phrase adjacent to a given phrase, our method first uses a compact structure named the reordering graph to represent all phrase segmentations of a parallel sentence; the effect of the adjacent phrase number can then be quantified in a forward-backward fashion and finally incorporated into the estimation of reordering models. Experimental results on the NIST Chinese-English and WMT French-Spanish data sets show that our approach significantly outperforms the baseline method.
The performance of a machine translation system heavily depends on the quantity and quality of the bilingual language resource. However, obtaining a parallel corpus that is large-scale and of high quality is a very difficult task, especially for low-resource language pairs such as Chinese-Vietnamese. Fortunately, multilingual user-generated content (UGC), such as bilingual movie subtitles, gives us access to automatic construction of a parallel corpus. Although the amount of UGC parallel corpora can be considerable, the original corpus is not suitable for statistical machine translation (SMT) systems. The corpus may contain translation errors, sentence mismatching, free translations, etc. To improve the quality of the bilingual corpus for SMT systems, three filtering methods are proposed: sentence length difference, the semantics of sentence pairs, and machine learning. Experiments are conducted on the Chinese-to-Vietnamese translation corpus. Experimental results demonstrate that all three methods effectively improve the corpus quality, and the machine translation performance (BLEU score) can be improved by 1.32 points.
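The first of the three filters, sentence length difference, can be sketched as a simple token-count ratio check. The 2.0 ratio threshold and the toy subtitle pairs below are illustrative assumptions, not values from the paper:

```python
def length_ratio_ok(src: str, tgt: str, max_ratio: float = 2.0) -> bool:
    """Keep a sentence pair only if the token counts are within max_ratio of each other."""
    n_src, n_tgt = len(src.split()), len(tgt.split())
    if n_src == 0 or n_tgt == 0:
        return False
    return max(n_src, n_tgt) / min(n_src, n_tgt) <= max_ratio

# A well-matched subtitle pair passes; a badly mismatched one is filtered out.
pairs = [
    ("I will see you tomorrow", "ngay mai toi se gap ban"),
    ("Yes", "mot cau dich dai va hoan toan sai lech so voi ban goc"),
]
kept = [p for p in pairs if length_ratio_ok(*p)]
print(len(kept))  # 1
```

In practice this cheap filter runs first, leaving the more expensive semantic and machine-learning filters to the surviving pairs.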
This study presents a novel approach to automatically translating Arabic Sign Language (ATSL) into spoken Arabic. The proposed solution utilizes a deep learning-based classification approach and the transfer learning technique to retrain 12 image recognition models. The image-based translation method maps sign language gestures to corresponding letters or words using distance measures and classification as a machine learning technique. The results show that the proposed model is more accurate and faster than traditional image-based models in classifying Arabic-language signs, with a translation accuracy of 93.7%. This research makes a significant contribution to the field of ATSL. It offers a practical solution for improving communication for individuals with special needs, such as the deaf and mute community. This work demonstrates the potential of deep learning techniques in translating sign language into natural language and highlights the importance of ATSL in facilitating communication for individuals with disabilities.
An algorithm is proposed for registering images related by translation, rotation, and scale based on angular and radial difference functions. In the frequency domain, the spatial translation parameters are computed via the phase correlation method. The magnitudes of the images are represented on a log-polar grid, and the angular and radial difference functions are defined and applied to measure shifts in both the angular and radial dimensions for rotation and scale estimation. Experimental results show that this method achieves the same accuracy level as the classic fast Fourier transform (FFT) based method, with invariance to illumination change and lower computation costs.
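The phase correlation step used to recover the spatial translation can be sketched in NumPy as follows; this is a minimal version assuming an integer, circular shift, not the paper's full implementation:

```python
import numpy as np

def phase_correlation_shift(a, b):
    """Estimate (dy, dx) such that b == np.roll(a, (dy, dx), axis=(0, 1))."""
    Fa = np.fft.fft2(a)
    Fb = np.fft.fft2(b)
    cross = np.conj(Fa) * Fb
    cross /= np.abs(cross) + 1e-12      # normalized cross-power spectrum
    corr = np.fft.ifft2(cross).real     # impulse at the translation offset
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    return int(dy), int(dx)

rng = np.random.default_rng(0)
img = rng.random((64, 64))
shifted = np.roll(img, (5, 7), axis=(0, 1))
print(phase_correlation_shift(img, shifted))  # (5, 7)
```

Normalizing the cross-power spectrum to unit magnitude is what makes the peak sharp and largely insensitive to illumination changes.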
Funding: supported by the Jiangsu Provincial Key Research and Development Program under Grant No. BE2020001-4, the National Natural Science Foundation of China under Grant Nos. 62172208, 62272216, and 61832008, and by the Collaborative Innovation Center of Novel Software Technology and Industrialization.
Funding: supported by the National Language Commission research project on sign language data specifications for artificial intelligence applications and test standards for language service translation systems (No. ZDI145-70).
Funding: supported by the National Natural Science Foundation of China (No. 61303082) and the Research Fund for the Doctoral Program of Higher Education of China (No. 20120121120046).
Funding: supported by the National Basic Research Program of China (973 Program) (2013CB329303) and the National Natural Science Foundation of China (61502035).
Funding: supported by the Astronautics Technique Creation Project "vision-based spacecraft high-accuracy real-time position and attitude measurement method research".