Project Description

In 2018, Google proposed the revolutionary BERT model for deep-learning-based natural language processing. Building on the Transformer architecture, BERT achieved state-of-the-art results on many artificial intelligence tasks. A key ingredient of BERT's training paradigm is that roughly 15% of the observed words in any given sentence must be masked for BERT to learn well in a self-supervised way: if we mask too many words, the lack of context creates an obstacle to learning; if we mask too few, training becomes inefficient. The question we ask in this project is: what about human language learning? When a human learns a foreign language through Duolingo-style training, what is the optimal mask rate?
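To make the masking idea concrete, the minimal Python sketch below (our own illustration, not BERT's actual preprocessing code) masks a configurable fraction of tokens in a sentence. BERT's full scheme additionally replaces some selected tokens with random words or leaves them unchanged, which is omitted here; the function name and parameters are our own.

```python
import random

def mask_tokens(tokens, mask_rate=0.15, mask_token="[MASK]", seed=None):
    """Randomly replace a fraction of tokens with a mask token.

    A simplified sketch of the masking step in masked language modeling:
    each masked position becomes a prediction target for the model.
    """
    rng = random.Random(seed)
    masked = list(tokens)
    labels = [None] * len(tokens)  # original tokens at masked positions
    for i, tok in enumerate(tokens):
        if rng.random() < mask_rate:
            labels[i] = tok
            masked[i] = mask_token
    return masked, labels

# Example: mask ~15% of a toy sentence.
sentence = "the quick brown fox jumps over the lazy dog".split()
masked, labels = mask_tokens(sentence, mask_rate=0.15, seed=0)
print(masked)
print(labels)
```

Varying `mask_rate` in a setup like this is the machine-learning analogue of the question we ask about human learners.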

Team Members

  • Aditya Sharma
  • Matthew Ho
  • Nga Ngo
  • Justin Chang

Professor and Mentors

  • Prof. William Wang
  • Michael Saxon

Meeting Time

  • Research mentor meeting
    • 3PM - 4PM on Fridays
  • ERSP mentor meeting
    • 5PM - 5:30PM on Mondays
  • ERSP meeting with central mentors (Zoom for the first two weeks, in person after that)
    • Chinmay: Mondays 5PM - 5:30PM
    • Diba: TBD

Links to Proposals and Presentation

  • Proposal (first draft): link
  • Proposal (after peer review): link
  • Final proposal (after instructor feedback): link
  • Final presentation: link

Individual Logs

Peer Review

Project Documentation and Resources