Generative model

A generative model is a model that describes how all variables are generated, including the outputs. I will give a very simple example with strong assumptions. The data $\boldsymbol{x}^{(n)}$ are generated from source variables $\boldsymbol{s}$ through an unknown matrix $\boldsymbol{G}$:

$$ \boldsymbol{x} = \boldsymbol{G}~\boldsymbol{s} $$

The goal is to find the source variable $\boldsymbol{s}$. We assume that the number of sources is equal to the number of observations. We assume that the latent variables are independently distributed, with marginal distributions. We assume, for simplicity, that the vector $\boldsymbol{x}$ is generated without noise.
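The setup above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the Laplace marginals, the random mixing matrix, and the names `n_sources`, `n_samples`, and `W` are mine, not taken from the post. In the real problem $\boldsymbol{G}$ is unknown; here it is inverted directly just to show that, with as many sources as observations and no noise, the sources are exactly recoverable once an unmixing matrix is found.

```python
import numpy as np

rng = np.random.default_rng(0)

n_sources = 3      # number of sources equals the number of observed components
n_samples = 1000   # number of data points x^(n)

# Independent latent sources. The Laplace marginal is an assumption made
# for illustration; the post does not specify the marginals here.
s = rng.laplace(size=(n_sources, n_samples))

# Square mixing matrix G. It is drawn at random for the demo; in the
# actual problem it is unknown and has to be inferred.
G = rng.normal(size=(n_sources, n_sources))

# Noise-free generative model: x = G s
x = G @ s

# If G were known, the unmixing matrix would simply be its inverse.
W = np.linalg.inv(G)
s_recovered = W @ x
```

Inferring `W` from `x` alone is the actual learning problem; the sketch only shows the forward model and why a square, noise-free mixing makes exact recovery possible in principle.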


As I keep saying, the chronological order of these posts is not arranged for beginners. There are many gaps I have skipped, and I will fill them in later.

Variational method

During my physics coursework and research, I used this method countless times. I even had a book with that title. The method is quite simple, yet the topic is big enough to fill a book. Simply put, it is a technique for finding equations and solutions (sometimes approximate ones) by extremizing functionals, which are mostly integrals over fields, while treating the functions inside the integral as the parameters to vary.
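As a minimal sketch of "treating the function as the parameter", consider the functional $J[y] = \int_0^1 y'(x)^2 \, dx$ with $y(0)=0$ and $y(1)=1$. This functional, the grid size, and the plain gradient descent below are my own choices for illustration, not the example from the post. The Euler-Lagrange equation is $y''=0$, so the minimizer should be the straight line $y=x$; the code discretizes $y$ on a grid and minimizes numerically.

```python
import numpy as np

def dirichlet_energy(y, h):
    """Discretized functional J[y] = integral of y'(x)^2 dx on a uniform grid."""
    return np.sum(np.diff(y) ** 2) / h

def minimize_functional(n_points=21, n_steps=5000):
    """Gradient descent on the grid values of y, with both end points held fixed."""
    x = np.linspace(0.0, 1.0, n_points)
    h = x[1] - x[0]
    y = np.zeros(n_points)
    y[-1] = 1.0          # boundary conditions: y(0) = 0, y(1) = 1
    lr = h / 5.0         # step size chosen small enough for stable descent
    for _ in range(n_steps):
        # dJ/dy_i for interior points: 2 * (2*y_i - y_{i-1} - y_{i+1}) / h
        grad = 2.0 * (2.0 * y[1:-1] - y[:-2] - y[2:]) / h
        y[1:-1] -= lr * grad
    return x, y

x_grid, y_min = minimize_functional()
```

The grid values `y_min` play the role of the function being varied, and the vanishing gradient at the minimum is the discrete counterpart of the Euler-Lagrange equation.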


Yay! Finally, something that goes more directly from physics to data science. We will also have a chance to see how the Metropolis-Hastings algorithm works! The Hamiltonian Monte Carlo method is a kind of Metropolis-Hastings method. One of the weak points of Monte Carlo sampling is its random-walk behavior. The Hamiltonian Monte Carlo method (HMC) is an approach to reducing that random-walk behavior in the sampling algorithm. Its original name was the hybrid Monte Carlo method.
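A rough sketch of the idea, under assumptions of my own: the target is a one-dimensional standard normal (not necessarily the target used in the post), and the step size, trajectory length, and function names are illustrative. Each iteration draws a fresh momentum, follows Hamiltonian dynamics with the leapfrog integrator, and then applies a Metropolis accept/reject step on the change in total energy.

```python
import numpy as np

def potential(x):
    # U(x) = -log p(x) up to a constant; standard normal target (an assumption)
    return 0.5 * x ** 2

def grad_potential(x):
    return x

def hmc_sample(n_samples=5000, step_size=0.15, n_leapfrog=10, seed=0):
    rng = np.random.default_rng(seed)
    x = 0.0
    samples = np.empty(n_samples)
    n_accepted = 0
    for i in range(n_samples):
        p = rng.normal()                 # resample the momentum
        x_new, p_new = x, p
        # Leapfrog: half step in momentum, alternating full steps, half step
        p_new -= 0.5 * step_size * grad_potential(x_new)
        for step in range(n_leapfrog):
            x_new += step_size * p_new
            if step < n_leapfrog - 1:
                p_new -= step_size * grad_potential(x_new)
        p_new -= 0.5 * step_size * grad_potential(x_new)
        # Metropolis step on the change in H = U(x) + p^2 / 2
        h_old = potential(x) + 0.5 * p ** 2
        h_new = potential(x_new) + 0.5 * p_new ** 2
        if rng.random() < np.exp(h_old - h_new):
            x = x_new
            n_accepted += 1
        samples[i] = x
    return samples, n_accepted / n_samples

hmc_samples, acceptance_rate = hmc_sample()
```

Because the leapfrog integrator nearly conserves the energy, most proposals are accepted even though each one moves far from the current point, which is how the random-walk behavior is reduced.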


A note in advance: this post is an extension of the previous one. I will use the same target distribution function and a similar Gaussian proposal distribution. Even the Python script will be easier to understand if you’ve already read the previous post about importance sampling. Rejection sampling may be the most familiar Monte Carlo sampling method. When you need to introduce the Monte Carlo method to somebody, it is very intuitive and effective to give the example of computing the area of a circle (or any shape) using random samples.
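The circle example can be sketched as follows. This is a minimal illustration with names and sample counts of my own choosing: points are drawn uniformly in the square enclosing the circle, those falling outside the circle are rejected, and the accepted fraction times the area of the square estimates the area of the circle.

```python
import numpy as np

def estimate_circle_area(n_samples=200_000, radius=1.0, seed=0):
    rng = np.random.default_rng(seed)
    # Uniform points in the square [-radius, radius] x [-radius, radius]
    points = rng.uniform(-radius, radius, size=(n_samples, 2))
    # Keep (accept) only the points that land inside the circle
    inside = np.sum(points ** 2, axis=1) <= radius ** 2
    square_area = (2.0 * radius) ** 2
    # Accepted fraction times the area of the enclosing square
    return square_area * inside.mean()

area = estimate_circle_area()
```

For a unit radius the estimate should land near $\pi$, with an error that shrinks like $1/\sqrt{N}$ in the number of samples.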


Importance sampling was the first sampling method I encountered when I studied the Monte Carlo method. Nevertheless, I haven’t seen many examples of importance sampling. Maybe that is because importance sampling is not effective for high-dimensional systems. Its weak point is that its performance is determined by how close the chosen proposal distribution is to the target distribution. Here, I will present a simple example of importance sampling.
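A minimal sketch of the technique, under assumptions that are mine rather than the post's: the target is a standard normal, the proposal is a wider Gaussian with standard deviation 2, and the quantity estimated is the second moment $E_p[x^2]$. Samples are drawn from the proposal and each one is reweighted by the ratio of target density to proposal density.

```python
import numpy as np

def normal_pdf(x, mean=0.0, std=1.0):
    z = (x - mean) / std
    return np.exp(-0.5 * z ** 2) / (std * np.sqrt(2.0 * np.pi))

def importance_estimate(f, n_samples=100_000, proposal_std=2.0, seed=0):
    """Estimate E_p[f(x)] for a standard normal target p (an assumption),
    using samples from a wider Gaussian proposal q."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, proposal_std, size=n_samples)           # x ~ q
    weights = normal_pdf(x) / normal_pdf(x, std=proposal_std)    # p(x) / q(x)
    return np.mean(weights * f(x))

# Second moment of the target; the exact value for a standard normal is 1.
second_moment = importance_estimate(lambda x: x ** 2)
```

Making `proposal_std` much smaller than the target's width puts most of the weight on a few rare samples in the tails, which is the sensitivity to the proposal choice described above.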



Namshik Kim

physicist, data scientist

Data Scientist

Vancouver, BC, Canada.