NLP

Case Study of Offensive Language Classification from Online Tweets

We analyze different approaches to classifying online tweets as offensive or non-offensive in the presence of emoticons. We also try to enhance the tokenization process so that it better handles the sparsity introduced by non-textual tokens in the corpus.
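As an illustration of what emoticon-aware tokenization can look like, here is a minimal sketch in Python. It is not the tokenizer used in the study: the regular expressions, the `<user>` and `<url>` placeholder tokens, and the function name `tokenize` are all assumptions made for this example. The idea is to keep each emoticon or emoji as a single token instead of letting it be split into punctuation, and to collapse mentions and URLs into shared placeholders to reduce vocabulary sparsity.

```python
import re

# Hypothetical patterns for illustration only; none of them is exhaustive.
TOKEN_RE = re.compile(
    r"(?P<url>https?://\S+)"
    r"|(?P<mention>@\w+)"
    r"|(?P<hashtag>#\w+)"
    # A small set of Western-style emoticons such as :) ;-( :D =P
    r"|(?P<emoticon>[:;=][\-']?[\)\(\]\[DPp/\\|])"
    # Emoji from a few common Unicode blocks
    "|(?P<emoji>[\U0001F300-\U0001FAFF\u2600-\u27BF])"
    r"|(?P<word>\w+(?:'\w+)?)"
)


def tokenize(tweet):
    """Split a tweet into tokens, keeping emoticons and emoji intact."""
    tokens = []
    for match in TOKEN_RE.finditer(tweet):
        kind, text = match.lastgroup, match.group()
        if kind == "url":
            tokens.append("<url>")      # all URLs share one token
        elif kind == "mention":
            tokens.append("<user>")     # all @mentions share one token
        elif kind in ("emoticon", "emoji"):
            tokens.append(text)         # case matters for emoticons (:D vs :d)
        else:
            tokens.append(text.lower())  # words and hashtags are lowercased
    return tokens
```

For example, `tokenize("@bob you are SO dumb :(")` yields `["<user>", "you", "are", "so", "dumb", ":("]`, so the emoticon survives as one feature that a downstream classifier can use.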

Enhancing clinical information retrieval on health forums using neural ranking models

Health forums have become a rich source of information for people with medical conditions, who discuss treatments, doctors' opinions, and side effects of complex drugs while also sharing personal medical background in a community question-answering framework. We develop a neural search engine on top of such forums by exploring state-of-the-art neural ranking models. We first write a set of heuristic functions that maximize the relevancy scores on a labelled dataset by training a Snorkel classifier that classifies a given query-document pair as relevant or irrelevant. These functions are then extended to classify the unlabelled set of query-document pairs, followed by re-ranking using neural re-rankers.

Score Me if You Can

Study on Robustness of Automated Essay Scoring Systems to Out-of-domain and Adversarial Inputs