AUTHOR
Rohan Kumar Yadav
Interpretable Architectures and Algorithms for Natural Language Processing
Paper V is excluded from the dissertation due to copyright. This thesis has two parts. First, we introduce human-level interpretable models for NLP tasks using the Tsetlin Machine (TM). Second, we present an interpretable model using DNNs. The first part covers several TM architectures for various NLP tasks, along with an analysis of their robustness. We use these models to propose logic-based text classification. We start with basic Word Sense Disambiguation (WSD), where we employ the TM and design a novel interpretation technique based on the frequency of words in the clauses. We then tackle a new problem in NLP, aspect-based text classification, using novel feature engineering for the TM. Since…
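To illustrate the frequency-based interpretation technique mentioned above, the following Python sketch scores words by how often they occur across the clauses of a trained TM. The clause representation and all names are simplified, hypothetical stand-ins, not the thesis implementation.

    from collections import Counter

    # Each clause is a conjunction of literals: ("word", True) requires the
    # word to be present, ("word", False) requires its absence (negation).
    clauses_for_sense = [
        [("bank", True), ("river", True)],
        [("bank", True), ("money", False)],
    ]

    def word_importance(clauses):
        """Score each word by its frequency across the clauses voting for a
        sense -- the core of the frequency-based interpretation idea."""
        counts = Counter(word for clause in clauses for word, _ in clause)
        total = sum(counts.values())
        return {word: count / total for word, count in counts.items()}

    print(word_importance(clauses_for_sense))
    # {'bank': 0.5, 'river': 0.25, 'money': 0.25}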
Enhancing Attention’s Explanation Using Interpretable Tsetlin Machine
Explainability is one of the key factors in Natural Language Processing (NLP), especially for legal documents, medical diagnoses, and clinical text. The attention mechanism has recently been a popular choice for providing such explainability, as it estimates the relative importance of input units. Recent research has revealed, however, that such mechanisms tend to assign importance to irrelevant input units. This is because language representation layers are initialized with pre-trained word embeddings that are not context-dependent. Such a lack of context-dependent knowledge in the initial layer makes it difficult for the model to concentrate on the important aspects of the input. Usually, th…
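For readers unfamiliar with how attention yields such importance estimates, here is a minimal NumPy sketch of scaled dot-product attention weights; it uses random vectors rather than pre-trained embeddings and is an illustration only, not the model from the paper.

    import numpy as np

    def attention_weights(query, keys):
        """Scaled dot-product attention: one weight per input token,
        summing to 1, commonly read off as a token-importance explanation."""
        scores = keys @ query / np.sqrt(query.shape[-1])
        exp = np.exp(scores - scores.max())  # numerically stable softmax
        return exp / exp.sum()

    rng = np.random.default_rng(0)
    tokens = rng.normal(size=(5, 16))  # 5 token vectors, dimension 16
    print(attention_weights(tokens.mean(axis=0), tokens))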
Indoor Space Classification Using Cascaded LSTM
Indoor space classification is an important part of localization that helps in precise location extraction, which has been extensively utilized in industrial and domestic domains. There are various approaches that employ Bluetooth Low Energy (BLE), Wi-Fi, magnetic fields, object detecti…
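As a rough illustration of a cascaded LSTM classifier over signal sequences, the Keras sketch below stacks two LSTM stages; the input shape, layer sizes, and the reading of "cascaded" as stacked recurrent stages are assumptions, not the paper's exact architecture.

    import tensorflow as tf

    NUM_BEACONS, SEQ_LEN, NUM_ROOMS = 8, 20, 4  # hypothetical dimensions

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(SEQ_LEN, NUM_BEACONS)),     # BLE RSSI time series
        tf.keras.layers.LSTM(64, return_sequences=True),  # first stage
        tf.keras.layers.LSTM(32),                         # cascaded second stage
        tf.keras.layers.Dense(NUM_ROOMS, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.summary()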
Positionless aspect based sentiment analysis using attention mechanism
Aspect-based sentiment analysis (ABSA) aims at identifying fine-grained polarity of opinion associated with a given aspect word. Several existing articles demonstrated promising ABSA accuracy using positional embedding to show the relationship between an aspect word and its context. In most cases, the positional embedding depends on the distance between the aspect word and the remaining words in the context, known as the position index sequence. However, these techniques usually employ both complex preprocessing approaches with additional trainable positional embedding and complex architectures to obtain the state-of-the-art performance. In this paper, we simplify preprocessing by …
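The position index sequence the abstract refers to can be computed in a few lines of Python; the tokenization below is naive and multi-word aspects are ignored for the sake of illustration.

    def position_index_sequence(tokens, aspect_idx):
        """Distance of every context word from the aspect word."""
        return [abs(i - aspect_idx) for i in range(len(tokens))]

    tokens = "the battery life is great but the screen is dim".split()
    print(position_index_sequence(tokens, tokens.index("battery")))
    # [1, 0, 1, 2, 3, 4, 5, 6, 7, 8]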
Interpretability in Word Sense Disambiguation using Tsetlin Machine
Massively Parallel and Asynchronous Tsetlin Machine Architecture Supporting Almost Constant-Time Scaling
Using logical clauses to represent patterns, Tsetlin Machines (TMs) have recently obtained competitive performance in terms of accuracy, memory footprint, energy, and learning speed on several benchmarks. Each TM clause votes for or against a particular class, with classification resolved by a majority vote. While the evaluation of clauses is fast, being based on binary operators, the voting makes it necessary to synchronize the clause evaluation, impeding parallelization. In this paper, we propose a novel scheme for desynchronizing the evaluation of clauses, eliminating the voting bottleneck. In brief, every clause runs in its own thread for massive native parallelism. For each training e…
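The clause voting this abstract builds on can be sketched in a few lines; the single-threaded Python below shows the majority vote over binary clause outputs, while the paper's contribution, running every clause in its own thread, is omitted here. The clauses and input values are made up.

    def clause_output(clause, x):
        """Evaluate a conjunction of (feature_index, expected_bit) literals."""
        return all(x[i] == bit for i, bit in clause)

    def classify(clauses, x):
        """Sum the +1/-1 votes of matching clauses; the sign picks the class."""
        votes = sum(pol for clause, pol in clauses if clause_output(clause, x))
        return int(votes >= 0)

    x = [1, 0, 1, 1]                                  # binarized input
    clauses = [([(0, 1), (1, 0)], +1), ([(2, 1)], +1), ([(3, 1)], -1)]
    print(classify(clauses, x))                       # 1 (votes sum to +1)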
Robust Interpretable Text Classification against Spurious Correlations Using AND-rules with Negation
State-of-the-art natural language processing models have raised the bar for excellent performance on a variety of tasks in recent years. However, concerns are rising over their sensitivity to distribution biases that reside in the training and testing data. This issue severely degrades the performance of such models when they are exposed to out-of-distribution and counterfactual data. The root cause seems to be that many machine learning models are prone to learning shortcuts, modelling simple correlations rather than more fundamental and general relationships. As a result, such text classifiers tend to perform poorly when a human makes minor modifications to the data, which raises questi…
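To make the AND-rules-with-negation idea concrete, here is a minimal Python sketch over a bag-of-words; the rule is invented for illustration, not learned, but it shows how a negated literal can block a spurious match.

    def matches(rule, words):
        """rule = (must_contain, must_not_contain), both sets of words."""
        must, must_not = rule
        return must <= words and not (must_not & words)

    # "excellent AND NOT not AND NOT never"
    rule = ({"excellent"}, {"not", "never"})
    print(matches(rule, set("the film was excellent".split())))      # True
    print(matches(rule, set("the film was not excellent".split())))  # False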
A Lite Romanian BERT: ALR-BERT
Large-scale pre-trained language representations and their promising performance in various downstream applications have become an area of interest in the field of natural language processing (NLP). There has been huge interest in further increasing models' size in order to outperform the best previously obtained performances. However, at some point, increasing a model's parameters may lead to reaching its saturation point due to the limited capacity of GPUs/TPUs. In addition, such models are mostly available either in English or in a shared multilingual structure. Hence, in this paper, we propose a lite BERT trained on a large corpus solely in the Romanian language, which we cal…