Time Series Forecasting - Unit Sales of Walmart Retail Goods

  • Forecast the unit sales of Walmart retail goods for the next 28 days
  • Performed exploratory data analysis, data munging and feature engineering to transform raw data
  • Trained random forest and XGBoost models per department for computational convenience and future scalability
  • Split the end-to-end pipeline into several independent logical steps
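
A minimal sketch of the per-department training pattern, under stated assumptions: the lag choices (7 and 28 days), the function names and the department keys are hypothetical, and an ordinary least-squares fit stands in for the random forest and XGBoost models used in the project so that the example needs only numpy.

```python
import numpy as np


def make_lag_features(series, lags):
    """Build a design matrix whose columns are the series shifted by each lag."""
    max_lag = max(lags)
    # Row t of X holds series[t - lag] for every lag; y holds series[t].
    X = np.column_stack(
        [series[max_lag - lag: len(series) - lag] for lag in lags]
    )
    y = series[max_lag:]
    return X, y


def fit_per_department(sales_by_dept, lags=(7, 28)):
    """Fit one independent model per department (least squares as a stand-in)."""
    models = {}
    for dept, series in sales_by_dept.items():
        X, y = make_lag_features(np.asarray(series, dtype=float), lags)
        X_bias = np.column_stack([X, np.ones(len(X))])  # add an intercept column
        coef, *_ = np.linalg.lstsq(X_bias, y, rcond=None)
        models[dept] = coef
    return models
```

Because each department is fitted in its own loop iteration, the departments can later be distributed across workers without changing the per-department code.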

Heartbeat Classification

  • NSERC-funded project in collaboration with Dapasoft
  • Classify ECG heartbeat signals into predefined categories based on heartbeat abnormality by transforming time series to images
  • Achieved 5% higher accuracy (98.7%) than baseline for the MIT-BIH dataset and 1% higher accuracy for the PTB dataset
  • Approaches used: Gramian Angular Field, Markov Transition Field, Recurrence Plot, Intermediate Fusion
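
As an illustration of the time-series-to-image idea, the sketch below computes a Gramian Angular Field with numpy. It follows the standard definition (rescale to [-1, 1], take the arccosine, then form pairwise angle sums or differences); the function name and the `method` parameter are illustrative and not taken from the project code.

```python
import numpy as np


def gramian_angular_field(series, method="summation"):
    """Encode a 1-D series as a 2-D image of pairwise angular relations."""
    x = np.asarray(series, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    # Rescale to [-1, 1] so that arccos is defined everywhere.
    x_scaled = np.clip(2.0 * (x - lo) / (hi - lo) - 1.0, -1.0, 1.0)
    phi = np.arccos(x_scaled)  # polar-coordinate angle of each sample
    if method == "summation":
        return np.cos(phi[:, None] + phi[None, :])
    if method == "difference":
        return np.sin(phi[:, None] - phi[None, :])
    raise ValueError("method must be 'summation' or 'difference'")
```

A series of length n yields an n x n image, which can then be passed to an image classifier such as a convolutional network.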

Breast Cancer Confidence Metric Using Bayesian Neural Networks

  • Developed an evaluation metric that measures the uncertainty of mammogram predictions, used to tune the confidence level of benign/malignant classification
  • Modular network architecture consists of a convolutional feature generator and a Bayesian neural network, reducing trainable parameters from 11 million to 164,000
  • Raising the confidence level leaves fewer images classified, given a reasonable posterior sample size, but yields higher accuracy (96%) than the baseline (80%)
  • Implemented with PyTorch and Pyro
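
The trade-off between confidence and coverage can be sketched as follows. This is only an assumption about how such a rule might look: it uses the standard deviation of posterior predictive samples as a hypothetical uncertainty measure, which is not necessarily the metric developed in the project, and the function and parameter names are illustrative.

```python
import numpy as np


def classify_with_confidence(posterior_probs, max_std):
    """Label images only when the posterior samples agree closely enough.

    posterior_probs: array of shape (n_posterior_samples, n_images) holding the
    predicted malignant probability from each posterior draw (assumed input).
    Returns (labels, confident), where labels are 1 (malignant), 0 (benign)
    or -1 (left unclassified because the uncertainty exceeds max_std).
    """
    probs = np.asarray(posterior_probs, dtype=float)
    mean = probs.mean(axis=0)
    std = probs.std(axis=0)  # hypothetical uncertainty measure
    confident = std <= max_std
    labels = np.where(mean >= 0.5, 1, 0)
    labels = np.where(confident, labels, -1)
    return labels, confident
```

Lowering `max_std` demands more agreement between posterior samples, so fewer images receive a label.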

Data Lake with Spark

  • ETL pipeline for a Data Lake hosted on AWS S3
  • Extracted data from S3 and processed it using Spark
  • Loaded the results back into S3 as a set of fact and dimension tables, partitioned and stored in Parquet format, to find insights about the songs users listen to
  • Deployed this Spark process on an AWS EMR cluster
  • Implemented with PySpark, AWS S3, AWS EMR

Parallelization of End-to-End Ensemble Training for Text Classification

  • Demonstrates performance gain of parallelism
  • An ensemble learning algorithm is used for the training and evaluation of text classification. Three shallow models (Logistic Regression, Naive Bayes and Random Forest) and three deep neural models (LSTMs with 3 different dropout rates) are trained, and the training and evaluation of these models are executed in parallel. The preprocessing of the data is also parallelized.
  • Implemented with joblib, numpy, pandas, scikit-learn and Keras
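
The pattern of training ensemble members concurrently and combining them by vote can be sketched as below. Two substitutions are made to keep the example dependency-free, and both are assumptions rather than the project's code: the standard library's `concurrent.futures` replaces joblib, and a toy keyword-count classifier (hypothetical, with a `min_len` hyperparameter) replaces the shallow and LSTM models.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def train_keyword_model(texts, labels, min_len):
    """Toy stand-in trainer: counts words per class, ignoring short words."""
    pos, neg = Counter(), Counter()
    for text, label in zip(texts, labels):
        words = [w for w in text.lower().split() if len(w) >= min_len]
        (pos if label == 1 else neg).update(words)

    def predict(text):
        words = [w for w in text.lower().split() if len(w) >= min_len]
        score = sum(pos[w] - neg[w] for w in words)
        return 1 if score > 0 else 0

    return predict


def train_ensemble_parallel(texts, labels, min_lens=(1, 3, 5), max_workers=3):
    """Submit one training job per ensemble member and wait for all of them."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(train_keyword_model, texts, labels, m) for m in min_lens
        ]
        return [f.result() for f in futures]


def ensemble_predict(models, text):
    """Combine member predictions by majority vote."""
    votes = Counter(model(text) for model in models)
    return votes.most_common(1)[0][0]
```

Threads are used here only because the toy models are closures; for CPU-bound training of real models, a process-based backend is usually the better fit.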

Debiasing Word Embeddings - Implementation of a Paper

  • Performed hard gender debiasing on pre-trained GloVe embeddings. For this project, we chose the 50-dimensional version of GloVe, which is trained on Wikipedia 2014 and Gigaword 5 and has a vocabulary of 400,000 words.
  • The method consists of neutralizing non-gendered words and equalizing gender word pairs, so that any non-gendered/neutral word is at equal distance from both words of a gender pair such as she-he.
  • After plotting the extreme she-he occupations, we find that all occupations are at equal distance from the she and he axes, and that gender-specific words have moved closer to their respective gender axis.
  • The debiasing algorithm shows promising results in reducing occupational stereotypes.
  • Implemented with numpy, pandas, scikit-learn and seaborn
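
The neutralize and equalize steps described above can be sketched with numpy as follows. The sketch assumes unit-normalized word vectors and a single gender direction such as the difference between the she and he vectors; the function names are illustrative, and the `abs` inside the square root is an added numerical guard rather than part of the method as described.

```python
import numpy as np


def project(vec, direction):
    """Component of vec that lies along direction."""
    return (vec @ direction) / (direction @ direction) * direction


def neutralize(word_vec, gender_dir):
    """Remove the gender component from a word that should be gender-neutral."""
    return word_vec - project(word_vec, gender_dir)


def equalize(vec_a, vec_b, gender_dir):
    """Make a gender pair differ only along the gender direction, symmetrically."""
    mu = (vec_a + vec_b) / 2.0
    mu_bias = project(mu, gender_dir)
    mu_orth = mu - mu_bias  # shared, gender-free part of the pair
    # abs() is a guard (assumption) for inputs that are not exactly unit length.
    scale = np.sqrt(abs(1.0 - mu_orth @ mu_orth))

    def corrected(vec):
        bias_part = project(vec, gender_dir) - mu_bias
        return scale * bias_part / np.linalg.norm(bias_part) + mu_orth

    return corrected(vec_a), corrected(vec_b)
```

After both steps, a neutralized word has no component along the gender direction, so its distance to each word of an equalized pair is the same.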

This website

  • Built with Jekyll, HTML, CSS


Price and Promotion Manager (Rubikloud)

  • Developed and refactored client configurations, customizable tables, resizable and sortable columns, filters for tables, row select and CSV export utility
  • Built with JavaScript, React, Redux, D3.js, Lodash, HTML, CSS

FlipGive Fundraising web app, Schwan's Cares Fundraising and Indigo Adopt A School

  • Developed upload and automatic change of shopping marketing banners from the admin UI, team challenges and leaderboard UI, top fundraising stories
  • Implemented mobile-first approach; built campaign activity, supporters features, HTML email templates, multiple Schwan's Cares UI redesigns
  • Built with Ruby on Rails, JavaScript, jQuery, HTML, CSS, Bootstrap

Our Homes (NoBul Media)

Web Statistics