Success stories
The way of success is the way of continuous pursuit of knowledge.
Napoleon Hill
We have worked on several projects for Admiral Group since 2015. In 2018 we signed a long-term contract with Admiral Group for data analytics consultancy in Spain, Italy and the UK.
Admiral Group is one of the largest motor insurance companies in the UK with a presence in eight countries. The Group now offers home, motor and travel insurance as well as personal loans and car finance in the UK, and has operations in Spain, Italy, France, the US and Mexico, with over five million customers worldwide.
We have an international client portfolio:
(*) Due to confidentiality agreements we are not able to reveal our clients' identity.
We usually achieve double-digit improvements with our models compared with existing solutions.
More than 80% of our revenue comes from overseas clients.
Data Science Competitions
Competition is the key to evolution in nature. Sporting competitions such as Formula One or the 24 Hours of Le Mans have led to improvements in industry throughout history.
Since the Netflix Prize, machine learning competitions have proved to be the best laboratory for testing new ideas and tools. In this respect, Kaggle, a Google platform, has become the reference arena.
SeeClickFix competition: 1st / 532
Would you like to detect which events or topics will be trending in a community before they become widespread?
Predict which ‘311’ issues are most important to citizens. ‘311’ is a mechanism through which citizens can ask the city or government to solve a problem by submitting a description of what needs to be done, fixed, or changed.
Keys: Response stacking. Geographical feature engineering.
Low signal - high noise modeling
Genentech Flu forecasting competition: 2nd / 50
What do earthquakes, markets and pandemic outbreaks have in common?
All of them are very hard to predict because the data are very noisy and carry little signal.
The goal of this competition is to predict when, where and how strong the flu will be. We worked on this problem with Sergey Yurgenson (currently Director of Advanced Data Science Services at DataRobot).
Keys: Autoregressive models. Lagged variables for time series. Geographical model. Blending.
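As an illustration of the lagged-variable idea listed above, here is a minimal sketch rather than the competition code. The function name `add_lags` and the weekly index values are hypothetical, chosen only for the example. Each row gets the values observed one and two steps earlier, so a model can only learn from information that was already available at prediction time.

```python
from typing import Dict, List, Optional


def add_lags(series: List[float], lags: List[int]) -> List[Dict[str, Optional[float]]]:
    """Build one feature row per time step using past values only."""
    rows = []
    for t, value in enumerate(series):
        row: Dict[str, Optional[float]] = {"y": value}
        for lag in lags:
            # A lag that reaches before the start of the series has no value.
            row[f"lag_{lag}"] = series[t - lag] if t - lag >= 0 else None
        rows.append(row)
    return rows


# Hypothetical weekly flu-activity index for one region (made-up numbers).
weekly_index = [1.0, 1.5, 2.5, 4.0, 3.0]
features = add_lags(weekly_index, lags=[1, 2])
```

An autoregressive model is then a regression of `y` on the `lag_*` columns, with the first rows (where lags are missing) dropped or imputed.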
Deloitte competition: 5th / 37
Would you like to know whether your customers will churn even before they think about it?
Predicting which customers are likely to churn enables early intervention to retain them.
The goal of this competition is to predict which customers will leave an insurance company in the next 12 months. Customer churn can be modeled as a survival problem.
Keys: Survival modeling. Feature engineering. Lagged variables.
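To show what treating churn as a survival problem can look like, here is a minimal sketch of the Kaplan-Meier estimator, a standard survival technique used here for illustration and not necessarily the model used in the competition. The function name and the data are hypothetical. The key point is that customers who are still active when observation ends are censored: they count as "at risk" up to their last observed month, but never as churn events.

```python
from typing import Dict, List


def kaplan_meier(durations: List[int], churned: List[bool]) -> Dict[int, float]:
    """Estimate the probability of still being a customer after each month
    in which at least one churn was observed."""
    survival = 1.0
    curve: Dict[int, float] = {}
    event_times = sorted({d for d, c in zip(durations, churned) if c})
    for t in event_times:
        # Customers observed for at least t months were at risk at month t.
        at_risk = sum(1 for d in durations if d >= t)
        events = sum(1 for d, c in zip(durations, churned) if c and d == t)
        survival *= 1.0 - events / at_risk
        curve[t] = survival
    return curve


# Made-up data: months observed per customer, and whether they churned.
# False means the customer was still active when observation ended (censored).
months = [2, 3, 3, 5, 8, 12, 12]
left = [True, True, False, True, False, False, False]
curve = kaplan_meier(months, left)
```

Simply dropping the censored customers would overstate churn, which is the main reason to prefer a survival formulation over a plain classifier on this kind of data.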
Heritage Health Prize: 3rd / 1353
Which patients will be admitted to a hospital within the next year?
Heritage Provider Network (HPN) is a limited organization that provides health care in California.
The goal of this challenge is to develop a breakthrough algorithm that uses available historical patient data to predict and prevent unnecessary hospitalizations.
Keys: Survival models. Advanced feature engineering. Advanced categorical feature engineering. Model blending.
CTR (Click Through Rate) prediction
Avito CTR competition: 4th / 414
Which context ads will earn a user's click?
Avito is the largest general classifieds website in Russia. In this competition, the challenge was to accurately predict click-through rates for their ads.
Keys: High cardinality levels in categorical variables. Advanced categorical engineering. Lagged variables.
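One common way to handle categorical variables with many levels is smoothed target encoding, sketched below. This is an illustrative assumption, not a description of the competition pipeline, and the names `smoothed_ctr_encoding` and `prior_weight` are hypothetical. Each level is replaced by its observed click rate, shrunk toward the global rate so that levels seen only a few times do not get extreme values.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def smoothed_ctr_encoding(
    levels: List[str], clicks: List[int], prior_weight: float = 10.0
) -> Tuple[Dict[str, float], float]:
    """Return a per-level smoothed click rate and the global click rate."""
    global_ctr = sum(clicks) / len(clicks)
    shown: Dict[str, int] = defaultdict(int)
    clicked: Dict[str, int] = defaultdict(int)
    for level, click in zip(levels, clicks):
        shown[level] += 1
        clicked[level] += click
    encoding = {
        # prior_weight acts like that many pseudo-impressions at the global rate.
        level: (clicked[level] + prior_weight * global_ctr) / (shown[level] + prior_weight)
        for level in shown
    }
    return encoding, global_ctr


# Made-up impressions: one frequent level and one rare level.
levels = ["big_site"] * 8 + ["tiny_site"] * 2
clicks = [1, 0, 0, 0, 0, 0, 0, 0, 1, 1]
encoding, global_ctr = smoothed_ctr_encoding(levels, clicks, prior_weight=2.0)

# Levels never seen in training fall back to the global rate.
unseen_value = encoding.get("new_site", global_ctr)
```

In practice the encoding should be computed out of fold, or on earlier time periods only, so that a row's own label never leaks into its feature.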
Expedia competition: 6th / 337
What is the best ranked list an OTA (Online Travel Agency) can show for a user's search?
Task: provide the best ranking of hotels (“sort”) for specific users, integrating price competitiveness as well as possible. This gives the OTA the best chance of winning the sale.
Keys: Rank learning. Categorical features. Lagged variables.
Multilabel classification
Tradeshift text classification competition: 9th / 375
How do you classify an entity in a multi-class system?
Task: predict the probability that a piece of text belongs to one or more of the given classes. We used this challenge to test new approaches for multiclass problems: iterative fitting using previous out-of-fold predictions for each response variable, and response stacking.
Keys: Multilabel - multiclass. Response stacking. Iterative fitting using previous out-of-fold predictions.
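The out-of-fold idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the competition code: the helper names are hypothetical, the folds are assigned by simple index interleaving, and the base model is a deliberately trivial placeholder that predicts the training mean. Each row is predicted by a model that never saw that row's label, so the predictions can safely be appended as a feature for a later model, for example the model of another label in a multilabel problem.

```python
from typing import Any, Callable, List


def out_of_fold_predictions(
    X: List[List[float]],
    y: List[float],
    fit: Callable[[List[List[float]], List[float]], Any],
    predict: Callable[[Any, List[float]], float],
    n_folds: int = 3,
) -> List[float]:
    """Predict every row with a model trained on the other folds only."""
    n = len(X)
    oof = [0.0] * n
    for fold in range(n_folds):
        holdout = [i for i in range(n) if i % n_folds == fold]
        train = [i for i in range(n) if i % n_folds != fold]
        model = fit([X[i] for i in train], [y[i] for i in train])
        for i in holdout:
            oof[i] = predict(model, X[i])
    return oof


# Placeholder base model: ignores the features and predicts the training mean.
def fit_mean(X_train: List[List[float]], y_train: List[float]) -> float:
    return sum(y_train) / len(y_train)


def predict_mean(model: float, x: List[float]) -> float:
    return model


X = [[0.1], [0.9], [0.8], [0.2], [0.7], [0.6]]
y_label_a = [0, 1, 1, 0, 1, 1]
oof_a = out_of_fold_predictions(X, y_label_a, fit_mean, predict_mean, n_folds=3)

# Stacking step: the out-of-fold prediction for label A becomes an extra
# input column for the model of another label.
stacked_X = [row + [p] for row, p in zip(X, oof_a)]
```

Iterating this step, so that each round is refitted on the previous round's out-of-fold predictions, is one reading of the iterative fitting listed above.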
Avazu CTR sponsored search: 7th / 1604
Which online ads will be clicked?
Task: predict Click Through Rate (CTR) in online sponsored advertising. Online prediction requires specific incremental models. This challenge involved many high-cardinality categorical variables, with new levels appearing in the test set, which made the problem hard to solve.
Keys: Factorization machines. Advanced categorical variable management. Incremental learning.
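To make incremental learning concrete, here is a minimal sketch of an online logistic regression over hashed categorical features. It is a simpler linear stand-in for illustration, not a factorization machine and not the competition solution. The class name, the field names and the hyperparameters are hypothetical. Hashing every `field=value` pair into a fixed number of buckets means that a level never seen before still maps to a valid weight, which is one way to cope with new levels in the test set.

```python
import math
import zlib
from typing import Dict, List


class HashedOnlineLogReg:
    """Logistic regression trained one example at a time with plain SGD."""

    def __init__(self, n_buckets: int = 2 ** 18, learning_rate: float = 0.1) -> None:
        self.n_buckets = n_buckets
        self.learning_rate = learning_rate
        self.weights = [0.0] * n_buckets

    def _indices(self, row: Dict[str, str]) -> List[int]:
        # crc32 is deterministic across runs, unlike Python's built-in hash().
        return [
            zlib.crc32(f"{field}={value}".encode("utf-8")) % self.n_buckets
            for field, value in row.items()
        ]

    def predict(self, row: Dict[str, str]) -> float:
        z = sum(self.weights[i] for i in self._indices(row))
        z = max(min(z, 35.0), -35.0)  # keep exp() in a safe range
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, row: Dict[str, str], clicked: int) -> None:
        # Gradient of the log loss for a binary feature is (p - y).
        error = self.predict(row) - clicked
        for i in self._indices(row):
            self.weights[i] -= self.learning_rate * error


# Made-up click stream: predict first, then learn from the observed outcome.
stream = [
    ({"site": "a", "device": "phone"}, 1),
    ({"site": "b", "device": "desktop"}, 0),
]
model = HashedOnlineLogReg()
for row, clicked in stream * 10:
    probability = model.predict(row)
    model.update(row, clicked)
```

Because each example is used once and then discarded, memory use is bounded by the number of buckets rather than by the size of the data or the number of distinct levels.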
Multiclass classification
Expedia hotel recommendation: 10th / 1974
Which of 100 hotel groups will a customer book?
The challenge involved a high-cardinality multiclass response and user history.
Keys: Geographical feature engineering. Lagged variables. Response stacking.