XGBoost reigned as king for a while, in both accuracy and performance, until a contender rose to the challenge. LightGBM came out of Microsoft Research as a more efficient GBM, which was the need of the hour as datasets kept growing in size. LightGBM was faster than XGBoost and in some cases gave higher accuracy as … Continue reading The Gradient Boosters IV: LightGBM
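As a quick illustration (not from the post itself), here is a minimal sketch of training a LightGBM model through its scikit-learn wrapper; the synthetic dataset and hyperparameters are illustrative placeholders, not recommendations from the article.

```python
# Minimal sketch, assuming lightgbm and scikit-learn are installed.
# Data and hyperparameters below are placeholders for illustration only.
from lightgbm import LGBMClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=10_000, n_features=50, random_state=42)
X_train, X_valid, y_train, y_valid = train_test_split(X, y, random_state=42)

# Train a gradient boosted tree model and score it on held-out data.
model = LGBMClassifier(n_estimators=500, learning_rate=0.05)
model.fit(X_train, y_train, eval_set=[(X_valid, y_valid)])
print("validation accuracy:", model.score(X_valid, y_valid))
```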
The Gradient Boosters III: XGBoost
Now let's get the elephant out of the way - XGBoost. This is the most popular cousin in the Gradient Boosting family. XGBoost, with its blazing-fast implementation, stormed onto the scene and almost unanimously turned the tables in its favor. Soon enough, Gradient Boosting, via XGBoost, was the reigning king in Kaggle competitions and … Continue reading The Gradient Boosters III: XGBoost
The Gradient Boosters II: Regularized Greedy Forest
In 2011, Rie Johnson and Tong Zhang proposed a modification to the Gradient Boosting model. They called it Regularized Greedy Forest. When they came up with the modification, GBDTs were already, sort of, ruling the tabular world. They tested the new modification on a wide variety of datasets, both synthetic and real-world, and found … Continue reading The Gradient Boosters II: Regularized Greedy Forest
The Gradient Boosters I: The Good Old Gradient Boosting
In 2001, Jerome H. Friedman wrote a seminal paper - Greedy function approximation: A gradient boosting machine. Little did he know that it was going to evolve into a class of methods which threatens Wolpert's No Free Lunch theorem in the tabular world. Gradient Boosting and its cousins (XGBoost and LightGBM) have conquered the world by … Continue reading The Gradient Boosters I: The Good Old Gradient Boosting
Deep Learning and Information Theory
If you have tried to understand the maths behind machine learning, including deep learning, you would have come across topics from Information Theory - Entropy, Cross Entropy, KL Divergence, etc. The concepts from information theory are ever present in the realm of machine learning, right from the splitting criteria of a Decision Tree to loss … Continue reading Deep Learning and Information Theory
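As a quick taste of those quantities (a minimal sketch, not taken from the post), here is how entropy, cross entropy, and KL divergence can be computed for two small discrete distributions; the probability values are made up for illustration.

```python
# Minimal sketch: entropy, cross entropy and KL divergence with NumPy.
# p and q are arbitrary example distributions, not from the article.
import numpy as np

p = np.array([0.7, 0.2, 0.1])   # "true" distribution
q = np.array([0.5, 0.3, 0.2])   # model's predicted distribution

entropy = -np.sum(p * np.log(p))            # H(p)
cross_entropy = -np.sum(p * np.log(q))      # H(p, q)
kl_divergence = np.sum(p * np.log(p / q))   # D_KL(p || q) = H(p, q) - H(p)

print(entropy, cross_entropy, kl_divergence)
```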
Practical Debugging for Data Science
Prologue: Before writing about this topic, I did a quick Google search to see how much of it was already covered, and I quickly observed a phenomenon that I see increasingly in the field - Data Science = Modelling, or at best, Modelling + Data Processing. Open a MOOC and they talk about the different models and … Continue reading Practical Debugging for Data Science
Interpretability: Cracking open the black box – Part III
Previously, we looked at the pitfalls of the default "feature importance" in tree-based models, and talked about permutation importance, LOOC importance, and Partial Dependence Plots. Now let's switch lanes and look at a few model-agnostic techniques which take a bottom-up approach to explaining predictions. Instead of looking at the model and trying to come … Continue reading Interpretability: Cracking open the black box – Part III
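For readers who want to try one of the techniques mentioned above, here is a minimal sketch (not from the post) of permutation importance using scikit-learn's implementation; the dataset and model are placeholders chosen purely for illustration.

```python
# Minimal sketch: permutation importance on a fitted tree-based model.
# Dataset and model below are illustrative placeholders, not from the article.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature on held-out data and measure the drop in score.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
for i in result.importances_mean.argsort()[::-1][:5]:
    print(f"feature {i}: {result.importances_mean[i]:.4f} +/- {result.importances_std[i]:.4f}")
```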
Interpretability: Cracking open the black box – Part II
In the last post in the series, we defined what interpretability is and looked at a few interpretable models and the quirks and 'gotchas' in them. Now let's dig deeper into the post-hoc interpretation techniques, which are useful when your model itself is not transparent. This resonates with most real-world use cases, because whether … Continue reading Interpretability: Cracking open the black box – Part II
Interpretability: Cracking open the black box – Part I
Interpretability is the degree to which a human can understand the cause of a decision - Miller, Tim [1]. Explainable AI (XAI) is a sub-field of AI which has been gaining ground in the recent past. And as a machine learning practitioner dealing with customers day in and day out, I can see why. I've been … Continue reading Interpretability: Cracking open the black box – Part I