Here we analyze different approaches to online tweet classification into offensive and non-offensive categories in the presence of emoticons. We enhance the tokenization process to better handle the sparsity introduced by non-textual content such as emoticons.
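To illustrate the tokenization idea, here is a minimal sketch of an emoticon-aware tokenizer. It is not the tokenizer used in the project: the emoticon pattern covers only a small, assumed set of Western-style emoticons, and the names `EMOTICON`, `TOKEN_RE`, and `tokenize` are hypothetical.

```python
import re

# Assumed emoticon shape: eyes, optional nose, mouth, e.g. :) ;-) :D :( =P
EMOTICON = r"[:;=]-?[\)\(\]\[DPp|]"

# Try emoticons first, then mentions, hashtags, words (with an optional
# apostrophe part), and finally any single leftover punctuation mark.
TOKEN_RE = re.compile(EMOTICON + r"|@\w+|#\w+|\w+(?:'\w+)?|[^\w\s]")

def tokenize(tweet):
    """Split a tweet into tokens, keeping emoticons, mentions and hashtags whole."""
    return TOKEN_RE.findall(tweet)

print(tokenize("u r dumb :-( lol :D"))
```

Keeping each emoticon as a single token, rather than letting it fall apart into punctuation marks, gives a downstream classifier one feature per emoticon instead of several uninformative ones.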
This is an unsupervised-learning-based clustering web tool that has been used to cluster microstructures in steel based on physical properties extracted using classical image processing algorithms. Input files can be uploaded in .xlsx or .csv format. The tool supports five different clustering algorithms in combination with dimensionality reduction.
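The upload-then-cluster workflow can be sketched as follows. This is an illustration under stated assumptions, not the tool's code: plain k-means stands in for whichever of the five algorithms is chosen, the dimensionality reduction step is omitted, and the column names and values in `SAMPLE_CSV` are made up.

```python
import csv
import io
import math

def load_features(csv_text):
    """Parse CSV text with a header row into a list of float feature vectors."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [[float(v) for v in row.values()] for row in reader]

def standardize(rows):
    """Scale each column to zero mean and unit variance so no property dominates."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]

def kmeans(rows, k, iters=50):
    """Lloyd's algorithm, initialised with the first k rows for determinism."""
    centers = [list(r) for r in rows[:k]]
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assignment step: each row joins its nearest center.
        labels = [min(range(k), key=lambda j: math.dist(r, centers[j])) for r in rows]
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [r for r, l in zip(rows, labels) if l == j]
            if members:
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels

# Hypothetical microstructure properties; real column names will differ.
SAMPLE_CSV = "area,aspect_ratio\n1.0,1.1\n1.2,0.9\n9.8,4.0\n10.1,4.2\n"
labels = kmeans(standardize(load_features(SAMPLE_CSV)), k=2)
print(labels)
```

Standardizing before clustering matters here because physical properties measured in different units would otherwise contribute unequally to the distance.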
Health forums have become a rich source of information for people with medical conditions, who discuss treatments, doctors' opinions, and side effects of complex drugs while also sharing personal medical background in a community question-answering framework. We develop a neural search engine on top of such health forums by exploring state-of-the-art neural ranking models. We first write a set of heuristic functions that maximize the relevancy scores on a labelled dataset, and use them to train a Snorkel classifier that classifies a given query-document pair as relevant or irrelevant. These functions are then extended to classify the unlabelled set of query-document pairs, followed by re-ranking using neural re-rankers.
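The heuristic-function step can be pictured with the sketch below. It is a simplified stand-in, not the project's pipeline: the three labeling functions and their thresholds are hypothetical, and a plain majority vote replaces the label model that Snorkel would train over the functions' outputs.

```python
RELEVANT, IRRELEVANT, ABSTAIN = 1, 0, -1

def lf_term_overlap(query, doc):
    """Vote relevant when at least half of the query terms appear in the document."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    if not q:
        return ABSTAIN
    return RELEVANT if len(q & d) / len(q) >= 0.5 else ABSTAIN

def lf_no_overlap(query, doc):
    """Vote irrelevant when the query and the document share no terms."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return IRRELEVANT if not (q & d) else ABSTAIN

def lf_too_short(query, doc):
    """Vote irrelevant for near-empty posts (assumed threshold: under 3 words)."""
    return IRRELEVANT if len(doc.split()) < 3 else ABSTAIN

LABELING_FUNCTIONS = [lf_term_overlap, lf_no_overlap, lf_too_short]

def weak_label(query, doc):
    """Combine the non-abstaining votes by majority; ties and no votes abstain."""
    votes = [lf(query, doc) for lf in LABELING_FUNCTIONS]
    pos, neg = votes.count(RELEVANT), votes.count(IRRELEVANT)
    if pos == neg:
        return ABSTAIN
    return RELEVANT if pos > neg else IRRELEVANT

print(weak_label("metformin side effects",
                 "I had nausea as side effects of metformin for weeks"))
```

Pairs labelled this way can serve as training data for a relevance classifier, and pairs on which every function abstains are simply left unlabelled.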
We implement k-nearest neighbors, a Gaussian mixture model, a multi-class SVM, a convolutional neural network, and a convolutional recurrent neural network to classify the following four genres: Dark-Forest, Hi-Tech, Full-On, and Goa. We further extract 30 temporal features from individual frames using a Long Short-Term Memory based autoencoder and augment them with the frame-level audio features, which is a novel contribution of this work.
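As a rough illustration of the feature augmentation and the simplest of the listed classifiers, the sketch below concatenates frame-level features with temporal features and classifies with k-nearest neighbors. All numbers are toy values, the vectors are far shorter than the 30 temporal features described above, and the helper names `augment` and `knn_predict` are hypothetical.

```python
import math
from collections import Counter

def augment(frame_feats, temporal_feats):
    """Concatenate frame-level audio features with learned temporal features."""
    return list(frame_feats) + list(temporal_feats)

def knn_predict(train, query, k=3):
    """train holds (vector, genre) pairs; return the majority genre of the k nearest."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Toy training set: two frame-level features plus one temporal feature each.
train = [
    (augment([0.10, 0.20], [0.00]), "Goa"),
    (augment([0.15, 0.25], [0.10]), "Goa"),
    (augment([0.20, 0.10], [0.05]), "Goa"),
    (augment([0.90, 0.80], [1.00]), "Hi-Tech"),
    (augment([0.85, 0.90], [0.95]), "Hi-Tech"),
    (augment([0.95, 0.85], [0.90]), "Hi-Tech"),
]

print(knn_predict(train, augment([0.88, 0.82], [0.97])))
```

In practice the temporal part of each vector would come from the autoencoder's encoding of the frame sequence rather than from hand-written values.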