Brief Walkthrough of Famous ImageNet Contenders

"Image nets" often refers to neural networks that take in one image (usually an RGB image) and output the class of the object shown in it. There are many famous, published image nets. They were pre-trained on slightly different datasets and developed by different teams at different times, but all are widely used, not only in object classification but also in many other applications. This article goes through several famous image neural networks (AlexNet, VGG, ResNet, InceptionNet, EfficientNet)....

July 24, 2021 · 8 min · Yiheng "Terry" Li

Using HDF5 format for python file saving and loading

What are the advantages of using HDF5 for file saving and loading? I wrote about pickle and JSON before, both of which are Python packages for serialization. More specifically, pickle is a binary serialization format for Python objects: it saves objects to a file that is not human-readable, that can be loaded back in Python, and that is not sharable with other programming languages. JSON is a text serialization format that saves Python dictionaries, strings, and list-like objects in a readable format....

April 21, 2021 · 4 min · Yiheng "Terry" Li

LSTM Walk Through

Thanks to colah’s blog for the nice illustrative pictures of LSTMs and RNNs. Recurrent neural networks (RNNs) use the same set of parameters to process sequential inputs. Inputs are usually broken into parts of the same length and fed into the RNN sequentially. In this way, the model learns and preserves information from sequences of arbitrary length. This trait is very useful in natural language use cases, where a model capable of handling sentences of any length is needed....

February 6, 2021 · 5 min · Yiheng "Terry" Li

Notes About the Logics Behind the Development of Tree-Based Models

Tree-based methods contain a lot of tricks that are often tested in data/machine learning interviews but are very often mixed up. Going through these tricks while knowing the reasons behind them can be very helpful for understanding and memorization. Overview of Tree-based Methods Overall, simple decision/regression trees offer better interpretability (as they can be visualized), with some loss of performance (when compared to regression with regularization and non-linear regression methods, e....

December 8, 2020 · 6 min · Yiheng "Terry" Li

Projects Archive

BIOMEDIN 273B: Deep Learning in Genomics and Biomedicine Time: Fall 2020 Description: This course was taught by Professors Anshul Kundaje and James Zou. In the course, we formed groups and were given data and directions for course projects. Our group had 4 people. With data from an HT-recruit RNA-seq study, we wanted to build models to identify the repression effects of protein tiles and Pfam proteins. Responsibility: I was responsible for building the coding pipeline for data loading, visualizing figures, model tuning and prediction analysis, finalizing the presentations and final report....

November 29, 2020 · 3 min · Yiheng "Terry" Li

Complex Heatmap, flexible package for heatmap based on R

A heatmap is a great tool for visualizing the magnitude of two-dimensional data using colors. This package provides a nearly complete set of features for building heatmaps under multiple conditions. The parameters used to customize a heatmap can be extremely complicated. This article tries to archive some of the features that have been useful in my recent work, noting down their usage and syntax. Data The dataset used in this article comes from Project Tycho and covers recorded cases of measles in the United States....

September 14, 2020 · 8 min · Yiheng "Terry" Li

Pyradiomics Simple Usage

Pyradiomics is an open-source Python package for the extraction of radiomics data from medical images. Image loading and preprocessing (e.g. resampling and cropping) are first done using SimpleITK. Loaded data is then converted into NumPy arrays for further calculation using multiple feature classes. Optional filters are also built in. Ways to Deal with Medical Image Data The reasons for choosing pyradiomics include its openness, wide recognition, and reasonably good performance, with a variety of features available....

September 11, 2020 · 5 min · Yiheng "Terry" Li