It’s been ages since I wrote my last post. I am planning to be more active from now on (I hope).
I’ve been wanting to do a mini-site with machine learning tutorials for years and finally here it is!
The mini-site lives at ml-tutorials.kyrcha.info and its GitHub repo is at https://github.com/kyrcha/ml-tutorials
The main reason for finally getting it done was that I started teaching two data mining courses in two postgraduate programs (one in the fall and one in the spring semester, with different audiences), and I wanted to give students notes with R implementations of the algorithms I teach in theory in the classroom. The mini-site also includes introductory material on R to help you get familiar with it.
At the moment I only discuss the R specifics of the algorithms, but I plan to add some theory for each algorithm as well, to make the tutorials more self-contained.
To create the site I used R Markdown's website support and RStudio. A great resource is this cheatsheet.
Jupyter vs. R Markdown
I started this effort by working with Jupyter notebooks with an R kernel, but I switched to R Markdown and RStudio because:
- You can create mini-sites like this one out of the box.
- I ran into problems when I tried to render the Jupyter notebooks to PDF to hand out to students.
- R Markdown files are Markdown rather than JSON, so they are easier to edit in a text editor.
- It works well with GitHub and GitHub Pages project sites.
Deployment
I wanted GitHub to serve the rendered HTML pages via the GitHub Pages project site functionality, using a custom domain: the subdomain ml-tutorials.kyrcha.info. After searching a bit online, I set it up as follows:
Step 1: Configured the site rendering tool to put the generated HTML files in a docs folder.
Step 2: Added a footer with a new Google Analytics property to track the traffic.
Step 3: In the repo settings in GitHub I added the custom domain.
The above adds a CNAME file to the docs folder. Since the docs folder is deleted and re-created whenever the site is rendered, I also put the CNAME file in the root folder of the project and added include: ["CNAME"] to the _site.yml configuration file, so that it is copied to the docs folder every time the site is rendered.
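As a rough sketch of the CNAME handling described above: the helper below writes the CNAME file in the project root and checks whether _site.yml lists it under include. The function name check_cname_setup and its arguments are hypothetical, and the _site.yml lines in the comment only illustrate the shape I would expect, they are not a copy of the real file.

```r
# Illustrative shape of _site.yml, assuming the settings described above:
#   output_dir: "docs"
#   include: ["CNAME"]

# Hypothetical helper: writes CNAME in the project root and reports whether
# _site.yml mentions it under include, so that rendering the site copies it
# into the output folder.
check_cname_setup <- function(project_dir = ".",
                              domain = "ml-tutorials.kyrcha.info") {
  cname_path <- file.path(project_dir, "CNAME")
  writeLines(domain, cname_path)

  site_yml <- file.path(project_dir, "_site.yml")
  if (!file.exists(site_yml)) {
    return(FALSE)
  }
  config <- readLines(site_yml, warn = FALSE)
  # Plain text check on a single-line include entry; a YAML parser
  # would be more robust.
  any(grepl("^include:.*CNAME", config))
}
```

Calling check_cname_setup() from the project root before rendering should return TRUE when the configuration is in place.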
Step 4: Finally, I also created a CNAME record at my DNS provider, with name ml-tutorials and value kyrcha.github.io.
Now http://ml-tutorials.kyrcha.info/ shows whatever GitHub Pages serves at https://kyrcha.github.io/ml-tutorials, and https://kyrcha.github.io/ml-tutorials redirects to http://ml-tutorials.kyrcha.info/.
Whenever I want to add a new tutorial or update an older one I:
- Make the changes in my Rmd files
- Render the site:
rmarkdown::render_site()
- Do a git add and a git commit in the local repository, then push both the source and the rendered HTML pages to GitHub.
- If I want to render a specific page to PDF, I run:
rmarkdown::render("knn.Rmd", output_format="pdf_document")
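The update steps above could be wrapped in a small helper. This is only a sketch: publish_site and its dry_run argument are hypothetical names, it assumes git is on the PATH and that the working directory is the project root, and it stages everything with git add -A, which may be broader than what you want to commit.

```r
# Hypothetical helper for the update workflow: render the site, then
# add, commit and push. Assumes git is on the PATH and that the working
# directory is the project root containing _site.yml.
publish_site <- function(message, dry_run = FALSE) {
  git_steps <- list(
    c("add", "-A"),
    c("commit", "-m", shQuote(message)),
    c("push")
  )
  if (dry_run) {
    # Only report the git commands; nothing is rendered or committed.
    cmds <- vapply(
      git_steps,
      function(args) paste("git", paste(args, collapse = " ")),
      character(1)
    )
    return(cmds)
  }
  rmarkdown::render_site()
  for (args in git_steps) {
    status <- system2("git", args)
    if (status != 0) {
      stop("git ", args[1], " failed with status ", status)
    }
  }
  invisible(TRUE)
}
```

With dry_run = TRUE the function only returns the git commands it would run, which is a cheap way to check the commit message before publishing.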