Uplift Modeling for Post-Enrollment Customer Spend in Rewards Program
Applied the X-Learner algorithm, a metalearning technique for causal inference, as implemented in Uber's CausalML package. The X-Learner is particularly effective at estimating heterogeneous treatment effects. The objective was to assess the causal impact of enrolling in a rewards program on customer spending: using data on customer transactions, we identify how enrollment in the rewards program affects post-enrollment spend.
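A minimal sketch of how this might be wired up with CausalML is below; the file path, column names (`enrolled`, `post_spend`), and feature set are hypothetical stand-ins for the actual transaction data.

```python
import pandas as pd
from xgboost import XGBRegressor
from causalml.inference.meta import BaseXRegressor

# Hypothetical schema: customer features, a binary enrollment flag,
# and post-enrollment spend as the outcome.
df = pd.read_csv("transactions.csv")  # placeholder path
X = df[["tenure_months", "avg_monthly_spend", "num_visits"]].values
treatment = df["enrolled"].values  # 1 = enrolled in rewards program, 0 = not
y = df["post_spend"].values

# X-Learner with gradient-boosted trees as the base learner; CausalML
# estimates propensity scores internally when none are supplied.
x_learner = BaseXRegressor(learner=XGBRegressor(random_state=42))

# fit_predict returns the estimated conditional average treatment
# effect (CATE) for each customer, i.e. the uplift in spend.
cate = x_learner.fit_predict(X=X, treatment=treatment, y=y)

# Average treatment effect with a confidence interval.
ate, ate_lb, ate_ub = x_learner.estimate_ate(X=X, treatment=treatment, y=y)
print(f"ATE: {ate[0]:.2f} (CI: {ate_lb[0]:.2f}, {ate_ub[0]:.2f})")
```

The per-customer CATE estimates are what make the X-Learner useful here: they can be ranked to find the segments for whom enrollment drives the largest uplift.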
Stock Price Prediction with RNNs
Trained and tested multiple TensorFlow time-series models on the S&P index's historical closing prices. The data, obtained from Yahoo Finance, covers five years of daily closing prices. Because prices in the test period are generally higher than those in the training period, I applied a MinMaxScaler to the data and then processed it into sequences with a 5-day window and a 1-day step.
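A sketch of the scaling and windowing steps, assuming the closing prices are loaded as a 1-D array; the file path and split ratio are illustrative. Fitting the scaler on the training split only avoids leakage, so the higher test-period prices simply map above 1 rather than influencing the fit.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

WINDOW = 5  # days of history per sample
prices = np.loadtxt("sp_close.csv")  # placeholder path, 1-D array of closes

# Chronological split; fit the scaler on the training portion only.
split = int(len(prices) * 0.8)
scaler = MinMaxScaler()
train_scaled = scaler.fit_transform(prices[:split].reshape(-1, 1)).ravel()
test_scaled = scaler.transform(prices[split:].reshape(-1, 1)).ravel()

def make_sequences(series, window=WINDOW):
    """Slide a `window`-day window with a 1-day step; the value
    immediately after each window is the prediction target."""
    X, y = [], []
    for i in range(len(series) - window):
        X.append(series[i : i + window])
        y.append(series[i + window])
    return np.array(X)[..., np.newaxis], np.array(y)  # (samples, window, 1)

X_train, y_train = make_sequences(train_scaled)
X_test, y_test = make_sequences(test_scaled)
```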
Forecasting Rideshare Demand with Transformers
Applied a Transformer, built with Keras layers, to forecast daily rail ridership. The approach is based on the Transformer architecture, which uses the attention mechanism to learn long-range dependencies in the data; this makes the model well suited to time-series tasks, where it can potentially outperform RNNs. The transformer encoder consists of stacked encoder layers, each containing a self-attention mechanism and a feed-forward neural network. Self-attention allows the model to attend to different parts of the input sequence, capturing dependencies and patterns across time. The input sequence for the encoder includes the historical ridership data along with additional features, with each time step represented as a vector of feature values. Positional encoding is added to the input sequence to give the model information about the temporal order of the data.
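A minimal sketch of such an encoder stack in tf.keras, using sinusoidal positional encoding and `MultiHeadAttention`; the sequence length, feature count, and layer sizes are illustrative rather than the actual configuration.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

def positional_encoding(seq_len, d_model):
    """Standard sinusoidal positional encoding, added to the inputs
    so the model knows the temporal order of the sequence."""
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model)[None, :]
    angles = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    angles[:, 0::2] = np.sin(angles[:, 0::2])
    angles[:, 1::2] = np.cos(angles[:, 1::2])
    return tf.constant(angles[None, ...], dtype=tf.float32)

def encoder_block(x, num_heads=4, key_dim=32, ff_dim=64, dropout=0.1):
    """Self-attention followed by a position-wise feed-forward network,
    each wrapped in a residual connection with layer normalization."""
    attn = layers.MultiHeadAttention(num_heads=num_heads, key_dim=key_dim)(x, x)
    x = layers.LayerNormalization()(x + layers.Dropout(dropout)(attn))
    ff = layers.Dense(ff_dim, activation="relu")(x)
    ff = layers.Dense(x.shape[-1])(ff)
    return layers.LayerNormalization()(x + layers.Dropout(dropout)(ff))

SEQ_LEN, N_FEATURES, D_MODEL = 30, 8, 64  # illustrative sizes
inputs = layers.Input(shape=(SEQ_LEN, N_FEATURES))
x = layers.Dense(D_MODEL)(inputs)              # project features to model dim
x = x + positional_encoding(SEQ_LEN, D_MODEL)  # inject temporal order
for _ in range(2):                             # stacked encoder layers
    x = encoder_block(x)
x = layers.GlobalAveragePooling1D()(x)
outputs = layers.Dense(1)(x)                   # next-day ridership forecast
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="mse")
```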
Natural Language Processing with Disaster Tweets
Built an RNN model in TensorFlow to classify whether tweets are about a real disaster. The data is provided in tabular format and requires substantial pre-processing before modeling: different normalization methods are applied to each column depending on its type, and the tweet text has to be tokenized for the embedding layer.
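A sketch of the text branch of such a model, using `TextVectorization` to prepare tokens for the embedding layer; the vocabulary size, layer widths, and two toy examples are illustrative stand-ins for the real dataset.

```python
import tensorflow as tf
from tensorflow.keras import layers

MAX_TOKENS, SEQ_LEN = 10_000, 40  # illustrative vocabulary and sequence length

# Toy stand-ins; in practice the texts and labels come from the
# pre-processed tabular data.
texts = tf.constant(["Forest fire near La Ronge Sask. Canada",
                     "I love fruits"])
labels = tf.constant([1, 0])

# Tokenize and pad the tweets into fixed-length integer sequences.
vectorizer = layers.TextVectorization(max_tokens=MAX_TOKENS,
                                      output_sequence_length=SEQ_LEN)
vectorizer.adapt(texts)
X = vectorizer(texts)

model = tf.keras.Sequential([
    layers.Embedding(input_dim=MAX_TOKENS, output_dim=64, mask_zero=True),
    layers.Bidirectional(layers.LSTM(32)),  # the recurrent layer
    layers.Dense(1, activation="sigmoid"),  # disaster vs. not
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(X, labels, epochs=1)  # toy run on the two examples
```

Setting `mask_zero=True` lets the LSTM ignore the padded positions that `TextVectorization` fills with zeros.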