nlpub/russe

The First International Workshop on Russian Semantic Similarity Evaluation (RUSSE)

Motivation

A similarity measure is a numerical measure of the degree to which two objects are alike. Usually, it quantifies similarity with a scalar in the range [0; 1] or [0; ∞). A semantic similarity measure is a specific similarity measure designed to quantify the semantic relatedness of lexical units (e.g. nouns and multiword expressions). It yields high values for pairs of words in a semantic relation (synonyms, hyponyms, associations or co-hyponyms) and zero values for all other pairs.
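As a rough illustration of the definition above, the following minimal sketch computes a similarity score in [0; 1] from word vectors using cosine similarity. This is only one of many possible measures and is not prescribed by RUSSE. The vectors, the `VECTORS` table and the `similarity` function are hypothetical and exist purely for illustration.

```python
import math

# Hypothetical toy vectors, invented for illustration only.
# A real system would derive such representations from a corpus or a lexical resource.
VECTORS = {
    "кошка": [0.9, 0.8, 0.1],       # "cat"
    "собака": [0.8, 0.9, 0.2],      # "dog"
    "автомобиль": [0.1, 0.0, 0.9],  # "car"
}


def cosine(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity(word1, word2):
    """Return a score in [0; 1]; unknown words are treated as unrelated (0.0)."""
    if word1 not in VECTORS or word2 not in VECTORS:
        return 0.0
    # Clip to [0; 1] so that negative cosines and rounding noise stay in range.
    return min(1.0, max(0.0, cosine(VECTORS[word1], VECTORS[word2])))


if __name__ == "__main__":
    for pair in [("кошка", "собака"), ("кошка", "автомобиль")]:
        print(pair, round(similarity(*pair), 3))
```

With these toy vectors, the co-hyponyms "кошка" and "собака" receive a higher score than the unrelated pair "кошка" and "автомобиль", which is the behaviour the definition asks for.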

Semantic similarity measures have proved useful in text processing applications, including text similarity, query expansion, question answering and word sense disambiguation. Such measures are practical because of the gap between the lexical surface of a text and its meaning. Indeed, the same concept is often represented by different terms. Furthermore, these measures can be useful in linguistic and philological studies.

The study of semantic similarity measures is an actively developing field of computational linguistics. Many methods have been proposed and tested during the last 20 years. Recently, with the advent of neural network language models yielding state-of-the-art results on the semantic similarity task, interest in this field has increased even more. Many authors have tried to perform exhaustive comparisons of semantic similarity measures and have developed a whole range of benchmarks and evaluation datasets.

Contribution

Unfortunately, most approaches to semantic similarity have been implemented and evaluated on only a handful of European languages, mostly English. While some Russian researchers have sporadically tried to adopt several methods developed for English, these efforts were mostly made in the context of specific applications, without any proper evaluation. To the best of our knowledge, no systematic investigation of semantic similarity measures for the Russian language has ever been performed.

Expected Results

The goal of RUSSE is to fill this gap and to conduct an evaluation campaign of the key currently available methods. The RUSSE competition will perform a systematic comparison and evaluation of baseline and the most recent approaches to semantic similarity in the context of the Russian language. This will let us identify specific features of the semantic similarity phenomenon in the Russian language. The event will be organized in the form of a competition of systems that calculate similarity between words.

Contacts

Further details, including the task rationale, schedule and datasets, can be found on the RUSSE website: http://russe.nlpub.ru/. Participants will be invited to submit a paper describing their system to the Dialogue-2015 conference.