Dog breed identification

Used transfer learning to identify the breed of a dog, or find the dog that looks like you

  • Tools used: Python, OpenCV, AWS SageMaker
  • Category: Classification, Deep learning
  • Year: Apr 2020

Affinity propagation clustering package

Built a Python package that implements affinity propagation from scratch. Affinity propagation is a clustering technique based on message exchange between data points, and it does not require pre-specifying the number of clusters.

  • Tools used: Python, TestPyPi
  • Category: Clustering, Teamwork, Software engineering
  • Year: Apr 2020
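
To illustrate the message exchange, here is a minimal pure-Python sketch of the standard affinity propagation updates (responsibilities and availabilities). It is not the package's own code: the function name `affinity_propagation`, the damping value, the fixed iteration count, and the toy data are all assumptions made for illustration.

```python
from statistics import median


def affinity_propagation(S, damping=0.5, iterations=200):
    """Cluster from an n x n similarity matrix S (list of lists).

    S[k][k] holds the "preference" of point k: higher values make k
    more likely to become an exemplar. Hypothetical helper, not a
    published API.
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)

    for _ in range(iterations):
        # r(i, k) = s(i, k) - max over k' != k of (a(i, k') + s(i, k'))
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            for k in range(n):
                best_other = max(scores[j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - best_other)

        # a(i, k) = min(0, r(k, k) + sum of positive r(i', k), i' not in {i, k})
        # a(k, k) = sum of positive r(i', k) over i' != k
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            others = sum(pos) - pos[k]
            for i in range(n):
                if i == k:
                    new = others
                else:
                    new = min(0.0, R[k][k] + others - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new

    # A point is an exemplar when its self-responsibility plus
    # self-availability is positive.
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    labels = []
    for i in range(n):
        if not exemplars:
            labels.append(-1)  # no exemplar emerged
        elif i in exemplars:
            labels.append(i)
        else:
            labels.append(max(exemplars, key=lambda e: S[i][e]))
    return exemplars, labels


# Toy 1-D data (assumed): similarity is the negative squared distance,
# and the preference is set to the median off-diagonal similarity.
points = [0.0, 0.1, 0.2, 10.0, 10.1, 10.2]
n = len(points)
S = [[-(a - b) ** 2 for b in points] for a in points]
preference = median(S[i][k] for i in range(n) for k in range(n) if i != k)
for k in range(n):
    S[k][k] = preference

exemplars, labels = affinity_propagation(S)
print(exemplars, labels)
```

The number of clusters is never passed in; it emerges from the preference values on the diagonal of `S`.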

Plagiarism detector

Used NLP features such as containment for feature engineering and a Naive Bayes classifier to detect plagiarism.

  • Tools used: Python, AWS SageMaker, AWS Lambda, and Amazon API Gateway
  • Category: Classification, Machine learning deployment
  • Year: Apr 2020
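
As a rough illustration of the containment feature, the sketch below treats containment as the fraction of n-grams in an answer text that also occur in the source text. The exact definition and tokenization used in the project are not stated here, so the whitespace tokenizer and the names `ngram_counts` and `containment` are assumptions.

```python
from collections import Counter


def ngram_counts(text, n):
    """Count word n-grams using a simple lowercase whitespace tokenizer."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def containment(answer_text, source_text, n=1):
    """Share of the answer's n-grams that also appear in the source."""
    answer = ngram_counts(answer_text, n)
    source = ngram_counts(source_text, n)
    total = sum(answer.values())
    if total == 0:
        return 0.0  # answer too short to form any n-gram
    # Counter intersection keeps the minimum count of each shared n-gram.
    shared = sum((answer & source).values())
    return shared / total


source = "the quick brown fox jumps over the lazy dog"
answer = "the quick brown cat"
unigram_score = containment(answer, source, n=1)  # 3 of 4 words shared
bigram_score = containment(answer, source, n=2)   # 2 of 3 bigrams shared
print(unigram_score, bigram_score)
```

Scores for several values of n could then serve as input features for a classifier such as Naive Bayes.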

Machine learning interpretation app

An app that automatically interprets black-box, high-performance machine learning models.

  • Tools used: Python (tree ensembles, Streamlit, SHAP, ELI5, Altair), Heroku, Docker
  • Category: ML interpretability
  • Year: Mar 2020

Predict patient survival 24 hrs after admission to the ICU

Evaluated different boosting algorithms on a dataset with features drawn from various vitals and lab test results. Achieved an AUC of 0.905.

  • Tools used: Python (CatBoost, XGBoost, LightGBM, SHAP)
  • Category: Supervised Learning
  • Year: Feb 2020

Flask web app for text summarization

Deployed a Flask app with continuous delivery that performs extractive summarization on long text or lists of web articles supplied by the user.

  • Tools used: Python (Flask, Gensim, Transformers), GCP (App Engine, Cloud Build)
  • Category: NLP, App development, CI/CD
  • Year: Jan 2020

Epidemic meets social media

Identified the different concerns of pro- and anti-vaccine groups surrounding #flu and #flushot on Twitter, and visualized the interactions between the groups and the influencers within them.

  • Tools used: R (rtweet, tidyverse, ggraph)
  • Category: Exploratory Analysis, NLP, Network Analysis
  • Year: 2018

Customer segmentation

Modeled the user segments of a credit company through K-means clustering and t-SNE. Made recommendations on different targeting strategies.

  • Tools used: R (K-means, t-SNE), Tableau
  • Category: Unsupervised learning
  • Year: 2018
Photo credit: Unsplash