Kaggle hosts an OkCupid dataset with ~60K rows. It’s a great opportunity to learn about transformers, and that’s exactly what my group did.

We trained & fine-tuned models on OkCupid data to find the best “matches” for a user based on their input. One of our teammates even made a Flask app!

I led a lot of the EDA on this project, and dating app data is always…interesting! That led me to experiment with a “toxicity” filter using our transformer model, which worked surprisingly well.

View on GitHub.
