Project description
The goal is to build a production-grade system for predicting the buying behavior of current and
prospective customers of insurance companies, and a model for predicting the probability that a
customer cancels their policy in the following year.
Overview
The insurance industry in Denmark is buzzing with machine learning and data-driven
opportunities, yet few companies have attempted to improve their processes with these
technologies. In a collaboration between TIA Technology and one of our customers, we set out
to build a product recommendation system and a churn model that bring measurable
business value and set a precedent for how machine learning can be used efficiently in the Nordic
insurance market.
We use a KNN model and a neural network for product recommendations, and an LSTM and XGBoost for
churn prediction.
The GIF below shows a live integration between a front-end platform and a machine learning API.
The model's predictions change when an insurance product is added to a customer's profile.
Technologies
Oracle SQL Developer, Python, Pandas, Keras, Sklearn, Flask, Docker
Technical Details
We work mostly with data stored in a data warehouse that was built for our BI department, which
also provides us with expertise on data extraction. We use Pandas for data transformations,
Sklearn for the product recommendations, and Keras for time series modelling.
The slides are part of a presentation for a client we are working with.
- Product recommendation
The product recommendation system consists of two parts: a KNN model that automatically
approximates current customers’ buying behavior by comparing each customer to their closest
neighbor, and a neural network that does the same for new customers. We use XGBoost to
interpret the results in terms of feature importances.
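The neighbor-based part can be illustrated with a minimal sketch. It assumes each customer is represented only by the set of products they hold and that similarity is measured with the Jaccard index. The customer IDs, product names, and the `jaccard` and `recommend` helpers are hypothetical, and plain Python stands in here for the Sklearn implementation used in the project.

```python
from collections import Counter


def jaccard(a: set, b: set) -> float:
    """Similarity between two product sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(customer_id, portfolios, k=2):
    """Score products the customer lacks by the share of the k nearest neighbors holding them."""
    owned = portfolios[customer_id]
    # Rank every other customer by similarity to the target customer and keep the top k.
    neighbors = sorted(
        (cid for cid in portfolios if cid != customer_id),
        key=lambda cid: jaccard(owned, portfolios[cid]),
        reverse=True,
    )[:k]
    # Count, per product not yet owned, how many neighbors hold it.
    votes = Counter(
        product
        for cid in neighbors
        for product in portfolios[cid] - owned
    )
    return {product: count / len(neighbors) for product, count in votes.items()}


# Hypothetical toy data: customer id -> set of products held.
portfolios = {
    "c1": {"car", "home"},
    "c2": {"car", "home", "travel"},
    "c3": {"car", "home", "accident"},
    "c4": {"pet"},
}
scores = recommend("c1", portfolios, k=2)
```

Here `scores` maps each candidate product to the fraction of the selected neighbors that hold it, which can be read as a rough purchase propensity for an existing customer.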
- Churn
The churn model works at the insurance policy level: year-by-year time series data is
used to estimate the probability that a policy will not be renewed next year. As the
baseline we use an XGBoost linear model on the last row before the prediction, and we try to
beat its performance with an RNN.
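A minimal sketch of how the two model inputs could be derived from the same policy-year data: the last row per policy for the baseline, and a fixed-length sequence per policy for the RNN. The column layout, the zero left-padding, and the `build_inputs` helper are assumptions made for illustration, and plain Python is used instead of Pandas to keep the example dependency-free.

```python
from collections import defaultdict

# Hypothetical policy-year rows: (policy_id, year, premium, claims).
rows = [
    ("p1", 2016, 1000.0, 0),
    ("p1", 2017, 1100.0, 1),
    ("p1", 2018, 1150.0, 0),
    ("p2", 2018, 800.0, 2),
]


def build_inputs(rows, seq_len=3):
    """Return (last_rows, sequences), each a dict keyed by policy id."""
    history = defaultdict(list)
    # Sort by policy and year so each policy's history is in chronological order.
    for policy_id, _year, *features in sorted(rows, key=lambda r: (r[0], r[1])):
        history[policy_id].append(features)

    last_rows, sequences = {}, {}
    for policy_id, steps in history.items():
        # Baseline input: only the most recent year before the prediction.
        last_rows[policy_id] = steps[-1]
        # RNN input: the last seq_len years, left-padded with zero rows.
        recent = steps[-seq_len:]
        n_features = len(recent[0])
        padding = [[0.0] * n_features for _ in range(seq_len - len(recent))]
        sequences[policy_id] = padding + recent
    return last_rows, sequences


last_rows, sequences = build_inputs(rows)
```

The baseline only ever sees `last_rows`, so any gain from the RNN has to come from the earlier years kept in `sequences`.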
My contribution as an ML Software Engineer
I am involved in every part of the pipeline and responsible for the entire process: from the data
being available in CSV format, through conceptualization, experimentation, model building, and
communication with the client, to a live API.
I developed the product recommendation and churn models and created a Flask API for retraining
them and exposing their predictions. I integrated it with a front-end application and used
docker-compose to make an online version of the product available for demo purposes.
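A minimal sketch of what such a Flask API could look like, with one endpoint for predictions and one for retraining. The route names, the payload shape, and the in-memory `model` dictionary are assumptions for illustration, not the project's actual interface.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical stand-in for a trained model: product -> recommendation score.
model = {"version": 0, "scores": {"travel": 0.5, "accident": 0.3}}


@app.route("/predict", methods=["POST"])
def predict():
    """Return scores for products the customer does not hold yet."""
    payload = request.get_json(force=True)
    owned = set(payload.get("products", []))
    scores = {p: s for p, s in model["scores"].items() if p not in owned}
    return jsonify({"model_version": model["version"], "scores": scores})


@app.route("/retrain", methods=["POST"])
def retrain():
    """Placeholder retraining step: only bumps the version to signal a new model."""
    model["version"] += 1
    return jsonify({"model_version": model["version"]})
```

Because the front end sends the customer's current products with every request, adding a product to the profile immediately changes the returned scores, which is the behavior shown in the demo.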