XGBoost reigned as king for a while, in both accuracy and performance, until a contender rose to the challenge. LightGBM came out of Microsoft Research as a more efficient GBM, which was the need of the hour as datasets kept growing in size. LightGBM was faster than XGBoost and in some cases gave higher accuracy as … Continue reading The Gradient Boosters IV: LightGBM
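As a quick illustration (not from the post itself), here is a minimal sketch of training a LightGBM model through its scikit-learn wrapper; the synthetic dataset and hyperparameters are illustrative placeholders, not recommendations from the article.

```python
# Minimal sketch, assuming lightgbm and scikit-learn are installed.
# Data and hyperparameters below are placeholders for illustration only.
from lightgbm import LGBMClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=10_000, n_features=50, random_state=42)
X_train, X_valid, y_train, y_valid = train_test_split(X, y, random_state=42)

# Train a gradient boosted tree model and score it on held-out data.
model = LGBMClassifier(n_estimators=500, learning_rate=0.05)
model.fit(X_train, y_train, eval_set=[(X_valid, y_valid)])
print("validation accuracy:", model.score(X_valid, y_valid))
```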
The Gradient Boosters III: XGBoost
Now let's get the elephant out of the way - XGBoost. This is the most popular cousin in the Gradient Boosting family. XGBoost, with its blazing-fast implementation, stormed onto the scene and almost unanimously turned the tables in its favor. Soon enough, Gradient Boosting, via XGBoost, was the reigning king in Kaggle competitions and … Continue reading The Gradient Boosters III: XGBoost
The Gradient Boosters II: Regularized Greedy Forest
In 2011, Rie Johnson and Tong Zhang proposed a modification to the Gradient Boosting model. They called it Regularized Greedy Forest. When they came up with the modification, GBDTs were already, sort of, ruling the tabular world. They tested the new modification on a wide variety of datasets, both synthetic and real-world, and found … Continue reading The Gradient Boosters II: Regularized Greedy Forest
The Gradient Boosters I: The Good Old Gradient Boosting
In 2001, Jerome H. Friedman wrote a seminal paper - Greedy function approximation: A gradient boosting machine. Little did he know that it was going to evolve into a class of methods which threatens Wolpert's No Free Lunch theorem in the tabular world. Gradient Boosting and its cousins (XGBoost and LightGBM) have conquered the world by … Continue reading The Gradient Boosters I: The Good Old Gradient Boosting
Deep Learning and Information Theory
If you have tried to understand the maths behind machine learning, including deep learning, you would have come across topics from Information Theory - Entropy, Cross Entropy, KL Divergence, etc. The concepts from information theory are ever present in the realm of machine learning, right from the splitting criteria of a Decision Tree to loss … Continue reading Deep Learning and Information Theory
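As a quick taste of those quantities (a minimal sketch, not taken from the post), here is how entropy, cross entropy, and KL divergence can be computed for two small discrete distributions; the probability values are made up for illustration.

```python
# Minimal sketch: entropy, cross entropy and KL divergence with NumPy.
# p and q are arbitrary example distributions, not from the article.
import numpy as np

p = np.array([0.7, 0.2, 0.1])   # "true" distribution
q = np.array([0.5, 0.3, 0.2])   # model's predicted distribution

entropy = -np.sum(p * np.log(p))            # H(p)
cross_entropy = -np.sum(p * np.log(q))      # H(p, q)
kl_divergence = np.sum(p * np.log(p / q))   # D_KL(p || q) = H(p, q) - H(p)

print(entropy, cross_entropy, kl_divergence)
```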
Practical Debugging for Data Science
Prologue: Before writing about this topic, I did a quick Google search to see how much of it was already covered, and I quickly observed a phenomenon that I see increasingly in the field - Data Science = Modelling, or at best, Modelling + Data Processing. Open a MOOC and they talk about the different models and … Continue reading Practical Debugging for Data Science
Interpretability: Cracking open the black box – Part III
Previously, we looked at the pitfalls of the default "feature importance" in tree-based models, and talked about permutation importance, LOOC importance, and Partial Dependence Plots. Now let's switch lanes and look at a few model-agnostic techniques which take a bottom-up approach to explaining predictions. Instead of looking at the model and trying to come … Continue reading Interpretability: Cracking open the black box – Part III
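For readers who want to try one of the techniques mentioned above, here is a minimal sketch (not from the post) of permutation importance using scikit-learn's implementation; the dataset and model are placeholders chosen purely for illustration.

```python
# Minimal sketch: permutation importance on a fitted tree-based model.
# Dataset and model below are illustrative placeholders, not from the article.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature on held-out data and measure the drop in score.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
for i in result.importances_mean.argsort()[::-1][:5]:
    print(f"feature {i}: {result.importances_mean[i]:.4f} +/- {result.importances_std[i]:.4f}")
```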
Interpretability: Cracking open the black box – Part II
In the last post in the series, we defined what interpretability is and looked at a few interpretable models and the quirks and 'gotchas' in them. Now let's dig deeper into the post-hoc interpretation techniques, which are useful when your model itself is not transparent. This resonates with most real-world use cases, because whether … Continue reading Interpretability: Cracking open the black box – Part II
Interpretability: Cracking open the black box – Part I
Interpretability is the degree to which a human can understand the cause of a decision - Miller, Tim [1]. Explainable AI (XAI) is a sub-field of AI which has been gaining ground in the recent past. And as a machine learning practitioner dealing with customers day in and day out, I can see why. I've been … Continue reading Interpretability: Cracking open the black box – Part I