Generative Models -- Gaussian Discriminant Analysis

Brief Introduction to Generative Models To talk about generative models (vs. discriminative models), we can first learn from this story: a father has two kids, Kid A and Kid B. Kid A has a special trait: he can learn everything in depth. Kid B has a special trait: he can only learn the differences between the things he sees. One fine day, the father takes his two kids (Kid A and Kid B) to a zoo....

April 24, 2020 · 10 min · Yiheng "Terry" Li

Logistic Regression Updated with Newton's Method

Logistic regression is a very important binary classification algorithm; in this article, some essential details inside the algorithm will be discussed. Plain language will be used to cover the most detailed aspects so that beginners in machine learning can easily get the idea. Assumptions of Logistic Regression Logistic regression does not require as many assumptions as linear regression. There are a few of interest, which we will discuss shortly....
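As a sketch of the method named in the title: Newton's method fits logistic regression by iterating theta ← theta − H⁻¹∇, where ∇ = Xᵀ(p − y) and H = Xᵀ diag(p(1 − p)) X. A minimal illustration (the function name and toy data below are ours, not the article's):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def newton_logistic(X, y, n_iter=10):
    # Newton update: theta <- theta - H^{-1} grad, with
    # grad = X^T (p - y) and Hessian H = X^T diag(p(1-p)) X
    theta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ theta)
        grad = X.T @ (p - y)
        H = (X * (p * (1 - p))[:, None]).T @ X
        theta -= np.linalg.solve(H, grad)
    return theta

# Toy, non-separable data; the first column is an intercept term
X = np.array([[1., -2.], [1., -1.], [1., -.5], [1., .5], [1., 1.], [1., 2.]])
y = np.array([0., 0., 1., 0., 1., 1.])
theta = newton_logistic(X, y)
print(theta)
```

Because the Hessian is used, Newton's method typically converges in far fewer iterations than plain gradient ascent on this objective.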

April 20, 2020 · 5 min · Yiheng "Terry" Li

Store Almost Any Objects of Python in Files

The module pickle implements binary protocols for serializing and de-serializing a Python object structure. We can store almost any type of Python object using pickle. Quick Example For example, we want to save a dictionary dict_obj to a file.

import pickle

# Save file
def saveFile(obj):
    out_file = open('obj.pickle', 'wb')
    pickle.dump(obj, out_file)
    out_file.close()

# Read file
def readFile(obj_file_name):
    file = open(obj_file_name, 'rb')
    obj = pickle.load(file)
    return obj

>>> dict_obj = {'itemA': ['item', 'A'], 'itemB': [1, 3]}
>>> saveFile(dict_obj)
>>> obj_file_name = 'obj....

April 15, 2020 · 2 min · Yiheng "Terry" Li

Probabilistic Interpretation of Sum of Square Loss Function

Square Loss Function (in Linear Regression) For linear regression, the way we find the optimal parameters $\overrightarrow \theta$ is called gradient descent, in which we seek the $\overrightarrow \theta$ that minimizes the loss function: $$ \mathcal{J}(\theta) = \frac{1}{2} \sum_{i=1}^{n}(y^{(i)} - \theta^T x^{(i)})^2 $$ That is: $$ \hat \theta = \underset{\theta}{\mathrm{argmin}}\left[\frac{1}{2} \sum_{i=1}^{n}(y^{(i)} - \theta^T x^{(i)})^2\right] $$ Interpret the Loss Function as MLE In linear regression, we assume the model to be: $$ y^{(i)} = \theta^T x^{(i)} + \epsilon^{(i)} $$ where $\epsilon^{(i)}$ is called the error term, which consists of unmodelled factors and random noise....
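Under the standard i.i.d. Gaussian noise assumption $\epsilon^{(i)} \sim \mathcal{N}(0, \sigma^2)$, the MLE interpretation the excerpt points at can be sketched in outline (the full article presumably carries this out in detail):

$$ p(y^{(i)} \mid x^{(i)}; \theta) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{(y^{(i)} - \theta^T x^{(i)})^2}{2\sigma^2} \right) $$

so the log-likelihood over $n$ independent samples is

$$ \ell(\theta) = \sum_{i=1}^{n} \log p(y^{(i)} \mid x^{(i)}; \theta) = n \log\frac{1}{\sqrt{2\pi}\,\sigma} - \frac{1}{\sigma^2} \cdot \frac{1}{2}\sum_{i=1}^{n} (y^{(i)} - \theta^T x^{(i)})^2 $$

and since the first term and $\sigma^2$ do not depend on $\theta$, maximizing $\ell(\theta)$ is exactly minimizing $\mathcal{J}(\theta)$.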

April 13, 2020 · 2 min · Yiheng "Terry" Li

Positive Semidefinite Matrix in Machine Learning

What is a Positive Semidefinite (PSD) Matrix Definition A matrix $A \in \mathbb{R}^{n\times n}$ is positive semi-definite (PSD), denoted $A \succeq 0$, if: $A = A^{T}$ ($A$ is symmetric) and $x^{T}Ax \geq 0$ for all $x \in \mathbb{R}^{n}$. From the definition, we can infer some properties of PSD matrices. Properties If $A \succ 0$ (positive definite), then $A$ is invertible and $A^{-1} \succ 0$. If $A \succeq 0$, then $\forall Q \in \mathbb{R}^{n\times n}$, we have $Q^{T}AQ \succeq 0$....
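The two defining conditions can be checked numerically: a symmetric matrix satisfies $x^{T}Ax \geq 0$ for all $x$ exactly when all its eigenvalues are nonnegative. A quick sketch using numpy (the helper name `is_psd` is ours, not from the article):

```python
import numpy as np

def is_psd(A, tol=1e-10):
    # Condition 1: A must be symmetric
    if not np.allclose(A, A.T):
        return False
    # Condition 2: for symmetric A, x^T A x >= 0 for all x
    # iff every eigenvalue is nonnegative (up to numerical tolerance)
    return bool(np.all(np.linalg.eigvalsh(A) >= -tol))

A = np.array([[2.0, 1.0], [1.0, 2.0]])   # eigenvalues 1 and 3 -> PSD
B = np.array([[1.0, 2.0], [2.0, 1.0]])   # eigenvalues -1 and 3 -> not PSD
print(is_psd(A), is_psd(B))              # prints: True False
```

`eigvalsh` is used rather than `eig` because it exploits symmetry and returns real eigenvalues.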

April 12, 2020 · 3 min · Yiheng "Terry" Li

SMOTE with Python

Motivation When working on classification problems, especially in medical data practice, we often face imbalanced class sizes. While other machine learning model builders might get away with overlooking this, it is essential that we pay attention to the imbalanced data problem in medical applications, because in most scenarios predictive accuracy on the minority classes is far more important than on the most common classes....
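The core SMOTE idea is to oversample the minority class with synthetic points: each new sample lies on the line segment between a minority sample and one of its k nearest minority neighbors. A toy numpy sketch of that interpolation step (in practice one would reach for a library implementation such as imbalanced-learn's SMOTE; this is only an illustration):

```python
import numpy as np

def smote_sample(X_min, n_new, k=3, seed=0):
    # Generate n_new synthetic minority samples by interpolating between
    # a random minority point and one of its k nearest minority neighbors.
    rng = np.random.default_rng(seed)
    n = len(X_min)
    # pairwise distances among minority samples only
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)                 # exclude self-distance
    neighbors = np.argsort(d, axis=1)[:, :k]    # k nearest neighbors per point
    new = []
    for _ in range(n_new):
        i = rng.integers(n)                     # pick a minority sample
        j = neighbors[i, rng.integers(k)]       # pick one of its neighbors
        gap = rng.random()                      # interpolation factor in [0, 1)
        new.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(new)

# Minority samples at the corners of the unit square
X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
X_new = smote_sample(X_min, n_new=3)
print(X_new.shape)  # prints: (3, 2)
```

Because each synthetic point is a convex combination of two existing minority samples, it stays inside the minority region rather than being a duplicated observation, which is what distinguishes SMOTE from plain random oversampling.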

April 11, 2020 · 3 min · Yiheng "Terry" Li