Notes About the Logics Behind the Development of Tree-Based Models

Tree-based methods contain a lot of tricks that are frequently tested in data science and machine learning interviews but are very often mixed up. Going through these tricks while knowing the reasons behind them can be very helpful for understanding and memorization. Overview of Tree-based Methods: Generally speaking, simple decision/regression trees offer better interpretability (as they can be visualized), at some cost in performance (when compared to regression with regularization and non-linear regression methods, e....

December 8, 2020 · 6 min · Yiheng "Terry" Li

Further Discussion of Relative Importance

In this article, two more methods are discussed that take into account not only the linear correlation of a single predictor variable with the dependent variable but also the intertwined effects. These two methods are Commonality Analysis (CA) and Dominance Analysis (DA). They share some similarities yet differ in their focus when analyzing relative importance. Commonality Analysis: Idea: Commonality Analysis is a statistical technique within multiple linear regression that decomposes a model’s R² statistic (i....

August 10, 2020 · 10 min · Yiheng "Terry" Li

Multivariate Linear Regression -- Collinearity and Feature Importance

This article discusses some problems in implementing multivariate linear regression: What should be done about collinearity in linear regression? How can the feature importance of a linear regression be extracted; is it just the coefficients of the model? Collinearity: What is collinearity? Collinearity is a condition in which some of the independent variables are highly correlated. Collinearity occurs more often when the number of predictors is very large, since there is a higher chance that some predictors are correlated with others....

July 15, 2020 · 12 min · Yiheng "Terry" Li

Survival Analysis -- the Basics

What is Survival Analysis and When to Use It? Survival analysis can be generalized as time-to-event analysis, that is, analyzing the time until a certain event occurs (e.g. disease, death, or some positive event). Survival analysis gives us estimates of, for example: the time it takes before an event occurs; the probability of having experienced a certain event at a certain time point; or which factors have what effect on the time to a certain event....

June 14, 2020 · 8 min · Yiheng "Terry" Li