For a better experience, future blog articles will be posted at http://kehang.github.io.
How I became a Google AI resident
It's been three months since I started my Google AI residency journey, and sometimes I'm asked how I became a Google AI resident. I thought it would be a good idea to write it down as a personal reflection. Readers who are interested in applying to this program in the future can hopefully find it a helpful reference point. Continue [...]
2020 End of Year Summary
It’s an understatement to say that 2020 was a challenging year. If I’d known what was going to happen, I’d probably have adjusted my list of New Year’s resolutions on my 2019 birthday. 2020 unfolded anyway as the first year of my 30s.
So many special things took place this year that I figure it won’t hurt to add one more: my first time writing some kind of end-of-year summary. Continue [...]
Tuning ML Models with optuna
Since my previous writings - Tuning sklearn Models with hyperopt and Tuning keras Models with hyperopt - I've been using the
hyperopt library. Recently, though, I read a well-written post and noticed that a similar package,
optuna, has been gaining a lot of momentum. After first trying it out, I started loving it for its user-friendliness, its intuitiveness and, importantly, the minimal changes required to parallelize computation. So today I decided to write a sibling post to show you how to tune ML models via optuna.
Selected Reads on Neural Architecture Search
Neural Architecture Search (NAS), which is closely related to my research work on self-evolving machines, has become increasingly important to online learning and lifelong learning. As data is continuously collected, retraining neural networks with a predefined architecture gets more difficult, and at some point it becomes impossible to keep up with the underlying data distribution. Here, searching for a new neural network architecture can come to the rescue. So in this post, I have selected three papers to present the research on this topic. These papers are not the most recent, but they are important representatives in the field. Here we go.
Zoph's paper: Learning Transferable Architectures for Scalable Image Recognition
Liu's paper: DARTS: Differentiable Architecture Search
Gaier's paper: Weight Agnostic Neural Networks
Tuning keras Models with hyperopt
In the last post, Tuning sklearn Models with hyperopt, I shared how to use
hyperopt to tune hyper-parameters for a traditional sklearn model. Continuing this topic, today's post covers how that works for a keras neural network model. I tried to write this post in a similar style to Tuning sklearn Models with hyperopt, so that readers can compare them side by side. In this post, I'll also share two scenarios: a single-machine (base) scenario and a distributed scenario.
Tuning sklearn Models with hyperopt
Hyper-parameter tuning has long been one of the least wanted tasks for a data scientist. In the past, people had to rely on either a grid search or a random search strategy. Recently, there's been a trend toward bayesian optimization strategies, e.g., in AutoML. I've been wanting to try out a python library called hyperopt. In this post, I'll share how I'd use it in two scenarios. Continue [...]
Demystifying Named Entity Recognition - Part II
As a continuation for Demystifying Named Entity Recognition - Part I, in this post I'll discuss popular models available in the field and try to cover:
popular traditional models
deep learning models
Over the history of NER, there have been three major approaches: grammar-based, dictionary-based and machine-learning-based. The grammar-based approach produces a set of empirical rules hand-crafted by experienced computational linguists, which usually takes months of work. The dictionary-based approach basically organizes all the known entities into a lookup table, which can be used to detect whether a candidate belongs to a defined category or not. By design it doesn't work well with newly invented entities. The machine-learning-based approach typically needs annotated data, but it doesn't rely on domain experts to come up with rules, nor does it necessarily fail on unseen entities.
This post focuses only on machine-learning-based models. Continue [...]
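Before moving on, the dictionary-based approach described above can be sketched in a few lines of plain Python. The entity table here is purely illustrative, not from any real gazetteer, and the whitespace tokenization is deliberately naive.

```python
# Known entities organized into a lookup table, as in the
# dictionary-based approach: span text -> entity category.
ENTITY_DICT = {
    "google": "ORG",
    "mit": "ORG",
    "boston": "LOC",
    "new york": "LOC",
}

def dictionary_ner(text, max_span_len=2):
    """Label word spans that exactly match a dictionary entry."""
    tokens = text.lower().split()
    entities = []
    i = 0
    while i < len(tokens):
        matched = False
        # Prefer the longest span starting at position i.
        for span_len in range(max_span_len, 0, -1):
            span = " ".join(tokens[i:i + span_len])
            if span in ENTITY_DICT:
                entities.append((span, ENTITY_DICT[span]))
                i += span_len
                matched = True
                break
        if not matched:
            i += 1
    return entities

print(dictionary_ner("I moved from New York to Boston to join MIT"))
# [('new york', 'LOC'), ('boston', 'LOC'), ('mit', 'ORG')]
```

The design flaw mentioned above shows up immediately: any entity missing from the table, such as a newly founded company, is silently skipped.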
Demystifying Named Entity Recognition - Part I
Recently I've been working on a project related to Named Entity Recognition (NER). At the very beginning, I was trying to find a well-explained document to get myself started, but couldn't find one (instead I found redundant pieces here and there on the Internet). My requirements are simple. Such a document should include
what is NER
how to formulate it
what are the traditional and state-of-the-art models
what are the off-the-shelf options
how to build a customized NER model
So this post will try to provide a complete set of explanations for these questions. Continue [...]