OLR 2018 Challenge

OLR 2018 is the third event in the Oriental Language Recognition (OLR) challenge series, an international event organized by Tsinghua University and Speechocean with the aim of boosting the development of language recognition techniques for oriental languages. Inspired by the brilliant success of OLR 2017 (31 teams from 6 countries/regions attended, with very promising performances submitted), we are calling for participants in OLR 2018.


Oriental languages involve interesting specialties. The OLR challenge series aims at boosting language recognition technology for oriental languages. The new challenge in 2018 follows the same theme, but sets up three more challenging tasks:

 

Short-utterance identification task: This is a closed-set identification task, which means the language of each utterance is among the 10 known target languages. The utterances are as short as 1 second.

Confusing-language identification task: This task identifies the language of utterances from 3 highly confusing languages (Cantonese, Korean and Mandarin).

Open-set recognition task: In this task, the test utterance may be in none of the 10 target languages.
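The difference between the closed-set and open-set settings can be illustrated with a minimal sketch. It is not the challenge's evaluation procedure or baseline; the function names, the scores, and the threshold value are all hypothetical and chosen only for illustration. In the closed-set case the best-scoring target language is always returned, while in the open-set case an utterance is rejected when no target language scores high enough.

```python
from typing import Dict, Optional


def identify_closed_set(scores: Dict[str, float]) -> str:
    """Closed-set: always return the best-scoring target language."""
    return max(scores, key=scores.get)


def recognize_open_set(scores: Dict[str, float], threshold: float) -> Optional[str]:
    """Open-set: return the best target language, or None when its
    score is below the (hypothetical) rejection threshold."""
    best = identify_closed_set(scores)
    return best if scores[best] >= threshold else None


# Hypothetical per-language scores for two utterances (not challenge data).
in_set = {"Cantonese": 0.10, "Korean": 0.85, "Mandarin": 0.05}
out_of_set = {"Cantonese": 0.30, "Korean": 0.38, "Mandarin": 0.32}

print(identify_closed_set(in_set))
print(recognize_open_set(in_set, threshold=0.5))
print(recognize_open_set(out_of_set, threshold=0.5))
```

In practice the scores would come from a language recognition system, and the rejection rule would be tuned on development data rather than fixed by hand.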


We will publish the results in a special session of APSIPA ASC 2018.


See more details on the academic homepage.

Important Dates

2018/04/10 --- Registration Open


2018/05/01 --- Training/Dev. Data Release


2018/09/01 --- Registration Close


2018/10/08 --- Test Data Release


2018/10/15 --- Result Submission


2018/11/01 --- Performance Notification


2018/11/15 --- Result Announcement

Data

The challenge is based on two multilingual databases: AP16-OL7, which was designed for the OLR challenge 2016, and AP17-OL3, which was designed for the OLR challenge 2017.

 

AP16-OL7 is provided by Speechocean, and AP17-OL3 is provided by Tsinghua University, Northwest Minzu University and Xinjiang University, under the M2ASR project supported by NSFC.


* The features of AP16-OL7 include:


-- Mobile channel 

-- 7 languages in total 

-- 71 hours of speech signals in total 

-- Transcriptions and lexica are provided

-- The data profile is here 

-- The license for the data is here 


* The features of AP17-OL3 include:


-- Mobile channel 

-- 3 languages in total 

-- Tibetan provided by Prof. Guanyu Li @ Northwest Minzu University

-- Uyghur and Kazak provided by Prof. Askar Hamdulla @ Xinjiang University

-- 35 hours of speech signals in total 

-- Transcriptions and lexica are provided 

-- The data profile is here

-- The license for the data is here




Need-to-Know

Evaluation plan


Refer to the scripts and papers listed below.

 

 

Evaluation tools


The Kaldi-based baseline scripts are here.

 

 

Participants from both academia and industry are welcome.

 

 

Publications based on the data provided by the challenge should cite the following papers:


Dong Wang, Lantian Li, Difei Tang, Qing Chen: AP16-OL7: A Multilingual Database for Oriental Languages and a Language Recognition Baseline, APSIPA ASC 2016.


Zhiyuan Tang, Dong Wang, Yixiang Chen, Qing Chen: AP17-OLR Challenge: Data, Plan, and Baseline, submitted to APSIPA ASC 2017.


Zhiyuan Tang, Dong Wang, Qing Chen: AP18-OLR Challenge: Three Tasks and Their Baselines, submitted to APSIPA ASC 2018.


 

 

Registration procedure


If you intend to participate in the challenge, or if you have any questions, comments or suggestions about the challenge, please send an email to the organizers:


--Prof. Dong Wang (wangdong99@mails.tsinghua.edu.cn)

--Dr. Zhiyuan Tang (tangzhiyuan12@mails.ucas.ac.cn)

--Ms. Qing Chen (chenqing@speechocean.com)

 

 

Committees



Dong Wang, Tsinghua University




Zhiyuan Tang, Tsinghua University 



Qing Chen, Speechocean


