NLP Research Assistant

  • Developed a web-based interface for Natural Language Processing (NLP) analysis of text corpora that reveals gender biases visually and interactively.
  • Built the web application with Flask. The user can supply a corpus as plain text, a URL, or a .txt file. Two NLP algorithms then run on it, the Bias Score Calculation algorithm and the Sentence Parsing algorithm, both based on word embeddings. The user can view the bias score associated with each token and with specific sentence structures. Interactive pivot tables, bar graphs, word clouds, and PCA and t-SNE plots let the user explore the results and extract information.
  • The user can also enter a natural language query, which is parsed and answered with a data frame and a bar graph. A debias feature lets the user discard the most extreme parts of the corpus and retrieve a less biased file.
  • Supervisor: Dr Marcus Tomalin
  • GitHub Repo
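
The entry above does not spell out how the bias score is computed. As a rough illustration only, one common embedding-based formulation takes the cosine similarity between a token's vector and a "he minus she" direction. The sketch below assumes that formulation; the function names and the toy three-dimensional vectors are hypothetical and are not taken from the project.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def bias_score(word_vec, male_vec, female_vec):
    """Assumed scoring rule: projection onto the male-minus-female direction.

    Positive values lean towards the male anchor, negative towards the female.
    """
    direction = [m - f for m, f in zip(male_vec, female_vec)]
    return cosine(word_vec, direction)

# Toy embeddings invented for illustration; a real system would load trained vectors.
TOY_EMBEDDINGS = {
    "he": [1.0, 0.0, 0.2],
    "she": [-1.0, 0.0, 0.2],
    "engineer": [0.6, 0.8, 0.1],
    "nurse": [-0.7, 0.7, 0.1],
    "table": [0.0, 1.0, 0.3],
}

def score_tokens(tokens, embeddings=TOY_EMBEDDINGS):
    """Return {token: bias score} for every token that has an embedding."""
    male, female = embeddings["he"], embeddings["she"]
    return {
        token: bias_score(embeddings[token], male, female)
        for token in tokens
        if token in embeddings
    }

if __name__ == "__main__":
    for token, score in score_tokens(["engineer", "nurse", "table"]).items():
        print(f"{token}: {score:+.3f}")
```

With these toy vectors, "engineer" scores positive, "nurse" negative, and "table" zero, which is the kind of per-token signal the interface could plot as a bar graph or word cloud.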