It’s been ages since I wrote my last post. I am planning to be more active from now on (I hope).

I’ve been wanting to do a mini-site with machine learning tutorials for years and finally here it is!

The mini-site is at ml-tutorials.kyrcha.info and its GitHub repo is at https://github.com/kyrcha/ml-tutorials

The main reason for finally getting it done was that I started teaching two data mining courses in two postgraduate programs (one in the fall and one in the spring semester, with different audiences). I wanted to have some notes to give to students, with R implementations of the algorithms I teach in theory in the classroom. The mini-site also includes introductory material on R to help you get familiar with it.

At the moment I only discuss the R specifics of the algorithms, but I plan to add some theory for each algorithm as well, in order to make the tutorials more self-contained.

To create the site I used R Markdown Websites and RStudio. A great resource is this cheatsheet.

Jupyter vs. R Markdown

I started this effort by working with Jupyter notebooks with an R kernel, but I switched to R Markdown and RStudio for the following reasons:

  1. You can create mini-sites like this one out of the box.
  2. I ran into problems when I tried to render the Jupyter notebooks to pdf to hand out to students.
  3. R Markdown is plain markdown and not JSON, so it is easier to edit with a text editor.
  4. It works well with GitHub and GitHub Pages project sites.

Deployment

I wanted GitHub to serve the rendered html pages via the GitHub Pages project site functionality, using a custom domain for the site: the subdomain ml-tutorials.kyrcha.info. After searching a bit online, I set it up as follows:

Step 1: Configured the site rendering tool to put the generated html files into a docs folder.

Step 2: Added a footer with a new Google Analytics property to track the traffic.

Step 3: In the repo settings on GitHub I added the custom domain under GitHub Pages:

GitHub pages configuration

The above will add a CNAME file in the docs folder. Since the docs folder is deleted and re-created when rendering the site, I placed the CNAME file in the root folder of the project and added include: ["CNAME"] to the _site.yml configuration file, so that it is copied into the docs folder every time the site is rendered.
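To make Steps 1 and 3 a bit more concrete, here is a minimal sketch of what the relevant part of _site.yml could look like. The output_dir and include fields are the ones described above; the name value is only a placeholder I made up for illustration, and the real file in the repo has more settings (navbar, footer, etc.) that are left out here.

```yaml
# Sketch of _site.yml, not the full file from the repo.
name: "ml-tutorials"     # placeholder site name (assumption)
output_dir: "docs"       # Step 1: rendered html goes into the docs folder
include: ["CNAME"]       # copy the CNAME file from the project root on every render
```

The CNAME file itself is a plain text file containing just the custom domain, in this case ml-tutorials.kyrcha.info.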

Step 4: Finally, I also created a CNAME record with my DNS provider, with name: ml-tutorials and value: kyrcha.github.io.

Custom DNS configuration

Now http://ml-tutorials.kyrcha.info/ shows whatever is served from the GitHub Pages site at https://kyrcha.github.io/ml-tutorials, and https://kyrcha.github.io/ml-tutorials redirects to http://ml-tutorials.kyrcha.info/.

Whenever I want to add a new tutorial or update an older one I:

  1. Make the changes in my Rmd files.
  2. Render the site: rmarkdown::render_site()
  3. Do a git add and a git commit in the local repository and push both the source and the rendered html pages to GitHub.
  4. If I want to render a specific page to pdf I enter: rmarkdown::render("knn.Rmd", output_format="pdf_document")
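The steps above could be collected in a small R script along the following lines. This is only a sketch: it assumes it is run from the project root (where _site.yml lives), that git is available on the PATH, and that a LaTeX installation exists for the pdf output. The commit message is a made-up example, and calling git through system2() is just one way to do it; I normally type the git commands in a terminal.

```r
# Sketch of the update workflow, run from the project root.
library(rmarkdown)

# Step 2: re-render every page of the site into the output
# directory configured in _site.yml (the docs folder).
render_site()

# Step 3: stage, commit and push both the Rmd sources and the rendered html.
# The commit message below is only an example.
system2("git", c("add", "-A"))
system2("git", c("commit", "-m", shQuote("Update tutorials")))
system2("git", "push")

# Step 4 (optional): render a single tutorial as a pdf handout.
# By default the pdf is written next to knn.Rmd.
render("knn.Rmd", output_format = "pdf_document")
```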