Samsung Electronics’ Global Research & Development (R&D) Centers continue to blaze a trail in the field of artificial intelligence (AI). Following several global AI awards and industry recognition for Samsung researchers around the globe, researchers in Poland and China recently won a set of highly prestigious global AI challenges.
Spearheading Speech Translation Research
Samsung R&D Institute Poland and Samsung R&D Institute China-Beijing competed with some of the world’s top universities and research labs to win first place in two separate challenges at the International Workshop on Spoken Language Translation (IWSLT), one of the world’s longest-running workshops on automatic language translation. This year, IWSLT was held jointly with the Association for Computational Linguistics conference (ACL), a premier conference in the field of computational linguistics, covering a broad spectrum of research areas concerned with computational approaches to natural language.
For the Offline Speech Translation task, which assesses the translation of TED talks from English to German, Samsung R&D Institute Poland won first place for the second time on the strength of its in-house audio-to-text translation research. This award marks the fourth consecutive year that teams from Samsung R&D Institute Poland have taken first prize in IWSLT challenges, including previous years’ text translation tasks.
This year’s Offline Speech Translation task allowed participants to submit systems based on either the traditional speech translation pipeline, composed of an automatic speech recognition (ASR) component followed by a machine translation (MT) component, or an End-to-End (E2E) system. Samsung R&D Institute Poland’s entry is an E2E system: a single encoder-decoder deep neural network capable of producing both English and German text.
In computational linguistics, E2E systems are harnessed to address the common problem of error accumulation: in a traditional pipeline, an error in the speech recognition stage can lead to a nonsensical translation. Until now, however, research over the past three years had consistently shown traditional pipeline systems outperforming E2E speech translation systems. The Samsung team’s system not only placed first in the E2E category, but also outscored all traditional pipeline entrants, a remarkable achievement that puts Samsung R&D Institute Poland at the forefront of speech translation research.
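The error-accumulation problem described above can be illustrated with a toy sketch. This is not Samsung’s system; every “model” below is a hypothetical dictionary lookup standing in for a real ASR, MT, or E2E network, chosen only to show how a single recognition error propagates through a pipeline while an E2E model avoids the intermediate transcript entirely.

```python
# Toy illustration of error accumulation in a pipeline speech translation
# system versus a direct E2E mapping. All lexicons are hypothetical.

def toy_asr(audio_frames):
    """Pretend ASR: maps acoustic tokens to English words, with one
    systematic error -- frame 'a2' (really 'ship') comes out as 'sheep'."""
    lexicon = {"a1": "the", "a2": "sheep", "a3": "sails"}
    return [lexicon[f] for f in audio_frames]

def toy_mt(english_words):
    """Pretend MT: word-by-word English-to-German dictionary lookup."""
    dictionary = {"the": "das", "sheep": "Schaf", "ship": "Schiff", "sails": "segelt"}
    return [dictionary[w] for w in english_words]

def pipeline_translate(audio_frames):
    # The stage-1 error ('sheep' instead of 'ship') propagates into stage 2,
    # yielding a nonsensical translation.
    return toy_mt(toy_asr(audio_frames))

def e2e_translate(audio_frames):
    """Pretend E2E model: maps acoustics directly to German, so there is
    no intermediate transcript for errors to accumulate through."""
    direct = {"a1": "das", "a2": "Schiff", "a3": "segelt"}
    return [direct[f] for f in audio_frames]

audio = ["a1", "a2", "a3"]  # acoustic frames for "the ship sails"
print(pipeline_translate(audio))  # → ['das', 'Schaf', 'segelt'] (nonsense)
print(e2e_translate(audio))       # → ['das', 'Schiff', 'segelt'] (intended)
```

In a real E2E system the direct mapping is of course learned by an encoder-decoder network rather than looked up, but the structural advantage is the same: there is no intermediate text output whose errors the second stage must inherit.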
Innovative Approaches in the Field of Computational Linguistics AI
Samsung R&D Institute China-Beijing took part in a second challenge, the Open Domain Translation task evaluating Japanese-to-Chinese translation capability, ultimately taking first place. The main goals of this task were the promotion of research into translation between Asian languages, the exploitation of noisy parallel web corpora for machine translation and the thoughtful handling of data provenance.
Samsung R&D Institute China-Beijing submitted a system based on the Transformer model architecture, adopting relative position attention. The team focused on improving the Transformer baseline system through elaborate data preprocessing and achieved significant improvements. The team also experimented with shared and separate word embeddings and compared different token granularities, approaching the process at a sub-word level with methods including Byte Pair Encoding (BPE) and SentencePiece. Large-scale back-translation of monolingual corpora was used to further improve Neural Machine Translation (NMT) performance.
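To give a sense of the sub-word tokenization mentioned above, here is a minimal sketch of how BPE learns its merge operations: starting from characters, it repeatedly merges the most frequent adjacent symbol pair. The corpus counts below are hypothetical, and production systems use optimized libraries rather than this illustration.

```python
# Minimal sketch of BPE merge learning on a tiny, hypothetical corpus.
from collections import Counter

def get_pair_counts(vocab):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(pair, vocab):
    """Rewrite every word, fusing each occurrence of `pair` into one symbol."""
    new_vocab = {}
    for word, freq in vocab.items():
        symbols = word.split()
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        new_vocab[" ".join(out)] = freq
    return new_vocab

# Words as space-separated character sequences, with made-up frequencies.
vocab = {"l o w": 5, "l o w e r": 2, "n e w e s t": 6, "w i d e s t": 3}
merges = []
for _ in range(3):  # learn three merge operations
    best = max(get_pair_counts(vocab), key=get_pair_counts(vocab).get)
    vocab = merge_pair(best, vocab)
    merges.append(best)
print(merges)  # → [('e', 's'), ('es', 't'), ('l', 'o')]
```

The learned merges become the sub-word vocabulary: frequent fragments like “est” get their own token, while rare words are still representable as smaller pieces. SentencePiece applies the same idea directly to raw text without pre-tokenization.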
Achievements in AI Audio Signal Interpretation
In addition to its first-place finish in the IWSLT challenge, Samsung R&D Institute Poland was also recognized as one of the leading teams at the Detection and Classification of Acoustic Scenes and Events (DCASE) 2020 challenge, held by the Institute of Electrical and Electronics Engineers (IEEE), which aims to use state-of-the-art AI technology to understand and interpret audio signals.
Engineers from Samsung R&D Institute Poland, who have previous experience in Acoustic Scene Understanding and Sound Source Localization tasks (having placed first in two tasks in 2019), focused on Task 2: Unsupervised Detection of Anomalous Sounds for Machine Condition Monitoring. The goal of this task was to identify whether the sound emitted from a target machine was normal or anomalous. The main challenge was detecting unknown anomalous sounds when only normal sound samples are provided as training data. The engineers placed second out of 40 teams.
Envisaging the Future of Computer Vision and Pattern Recognition
In June, Samsung R&D Institute China-Beijing also participated in three challenges hosted by the 2020 Conference on Computer Vision and Pattern Recognition (CVPR 2020): the Embodied AI Challenge, the VizWiz-Captions Challenge and the VATEX Video Captioning Challenge. The team claimed second place in these challenges.
The Embodied AI Challenge aimed to enable robots to understand human commands and perform the correct actions within a virtual environment. The VizWiz-Captions Challenge involved predicting an accurate caption for an image taken by a visually impaired person, while the VATEX Video Captioning Challenge benchmarked progress towards models that can describe videos in multiple languages, including English and Chinese.