My Deep Learning Journey with Fast.AI: Chapter 1

I recently took part in a hackathon. Despite little to no guidance, and a problem space my team and I weren't really experts in, we decided to participate anyway. We came up with a simple solution, implementable in two days, that addressed one of the problems we found when we Googled "What are some problems with Edge computing". Was it too simplistic a solution? Yes. Did we have fun? Sure. But it didn't seem to garner any interest from the judges, who fumbled for questions because it was so straightforward.

But those who know me know that I'm competitive as hell. For next year, I want an ace up my sleeve, so I decided to take the plunge with the Fast.AI Deep Learning Course for Coders. The first lesson has me hooked. They make the case that:

  • You do not need lots of math experience
  • You do not need a PhD
  • You don’t even need a lot of data

So even though this blog will probably never be read, I think this is a good place for me to summarize my thoughts and notes for future reference. Hold on tight!

The first project

I was introduced to Jupyter Notebooks, and they really are a neat technology: a very nice mix of code and documentation.

In the first video lesson they guide you through an image classifier and it really is simple. Here’s the Kaggle Jupyter Notebook for it: Smile or Frown Predictor | Kaggle

I did try to branch out on my own and build a Tabular learner (it's covered in the book but not in the video), since tabular data is what I see most often, but I couldn't take the example and apply it to a different data set. I let that get to me for a while, but I'm back on the horse; I'll push past it for now and maybe come back later.

My notes from the 1st chapter

  • Image Classifiers can be used for more than images
    • Spectrograms of Sound
    • Visualized Data
    • Etc
  • Training Data vs Validation Data vs Test Data
    • Validation Data is needed to make sure that the model isn’t just memorizing the training data (overfitting)
    • Test Data can be used to make sure that there’s no user bias (like Unit Testing another person’s code)
    • In time series data:
      • Training data should be chunks of time, not random subsets
      • Validation/test data should be at the end to emulate not knowing the future
    • Test data should be data the model has never encountered during training (e.g. in a person classifier, the test set might contain a person who never appears in the training data)
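Those time-series rules can be sketched in a few lines of plain Python (my own toy illustration, not from the book): split a chronologically sorted list so that the newest chunks become the validation and test sets, instead of sampling randomly.

```python
# A minimal sketch of a time-ordered split, assuming the records are
# already sorted oldest-to-newest. The 20%/10% fractions are illustrative.
def time_split(records, valid_frac=0.2, test_frac=0.1):
    """Split a chronologically sorted list into train/valid/test,
    keeping the most recent slices for validation and test so the
    model never 'sees the future' during training."""
    n = len(records)
    n_test = int(n * test_frac)
    n_valid = int(n * valid_frac)
    train = records[: n - n_valid - n_test]
    valid = records[n - n_valid - n_test : n - n_test]
    test = records[n - n_test:]
    return train, valid, test

days = list(range(100))  # pretend these are 100 daily observations
train, valid, test = time_split(days)
```

Everything in `valid` and `test` comes strictly after everything in `train`, which is the whole point: evaluation should emulate predicting an unknown future.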