Advanced R Training


Due to COVID-19, our training courses will be taught via an online classroom.

Receive in-depth knowledge from industry professionals, test your skills with hands-on assignments & demos, and get access to valuable resources and tools.

This course is the advanced version of the R training and also includes classes on the Linux CLI and on statistics & math. R is used for data analysis, in-depth statistics, time series forecasting and, especially, data visualization. The lessons focus on Linux Bash scripting, basic statistics, and R libraries such as the tidyverse for data processing and ggplot2 for visualization. After this course, you will be able to analyze and visualize data with R. The course is open to everyone, and is especially suited to data scientists who want to improve their skills and expand their toolkit.

Are you interested? Contact us and we will get in touch with you.

 

Get in touch for more information

Fill in the form and we will contact you about the Advanced R training.

About the training & classes

The Advanced R training is split over 3 days. Each class is described in detail below:

 
Data Handling & Visualization with R

R has grown into a well-developed ecosystem with powerful packages for data analysis, data visualization, in-depth statistics, time series forecasting, and machine learning, to mention a few. This training gives a fast-paced introduction to R, its most relevant features, and its basic workflow, and shows how to apply them.

We start the training by discussing the basics of the R programming language and the RStudio IDE: logical operations, data structures, workflow, and so on. We then delve into a number of powerful packages such as dplyr, ggplot2, readr and other tidyverse packages, and show how they are used for data preprocessing, analysis, and visualization.

Finally, we apply these concepts and tools in practice during a hands-on lab session. We implement a complete data analysis workflow in R, from retrieving real-time earthquake data from a web service to preprocessing, analyzing and eventually visualizing the data on an interactive map.

The training includes theory, demos, and hands-on exercises.

After this training you will have gained knowledge about:

  • R Programming Basics
  • Packages: dplyr, ggplot2, readr and other tidyverse packages
  • Working with R Markdown notebooks
  • Tips & conventions
  • Lab session to get hands-on experience with a complete data analysis workflow in R

Linux CLI

Regardless of your OS of choice, knowing how to work with Linux through the command line is a valuable skill for any engineer or scientist. Whether you need to look under the hood of the application you deployed, debug a job running on one of those nodes that keep crashing, or simply prepare a dataset that would take too long to download, transform and upload again, the power of Bash will not only often save you, it will actually speed up your work. As with any power tool, it is of course also very easy to cut off your own foot, so join us on this journey towards getting to know Bash and unlocking its power.

The training includes theory, demos, and hands-on exercises.

After this training you will have gained knowledge about:

  • Some concepts behind Linux
  • Everyday Bash tools
  • Tricks that will make Bash use easier
  • Basic Bash scripting

Basic Statistics & Math

In the last training of the series, we expand our knowledge of how to score machine learning models, discuss common pitfalls and show how to deal with them. We first examine the concepts of bias, variance, overfitting and underfitting, then dive into important performance metrics for classification problems, such as accuracy, precision, recall, F1 scores and ROC curves, and finally elaborate on commonly used metrics for regression. This last part of our basic toolkit allows us to properly assess a prediction model that we train to recognize images of handwritten digits during the hands-on lab session.
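To give a flavour of the classification metrics mentioned above, here is a minimal sketch in plain Python. The labels are made up for illustration and are not course data, and the helper names `confusion_counts` and `classification_metrics` are hypothetical. In practice, scikit-learn's `sklearn.metrics` module provides equivalent functions.

```python
# Hypothetical binary labels, for illustration only (1 = positive, 0 = negative).
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels encoded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these example labels there are 4 true positives, 1 false positive, 1 false negative and 4 true negatives, so all four metrics come out to 0.80. On an unbalanced dataset the values typically diverge, which is why accuracy alone can be misleading.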

The training includes theory, demos, and hands-on exercises.

After this training you will have gained knowledge about:

  • Overfitting, underfitting, bias-variance tradeoff
  • Model evaluation in practice using scikit-learn
  • Evaluation metrics for classification, such as accuracy, precision, recall, F1, area under the curve (AUC)
  • Interpreting confusion matrices, classification reports and ROC curves
  • Decision function and classification probabilities
  • Dealing with unbalanced datasets
  • Evaluation metrics for regression, such as MAE, RMSE, R^2
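
As an illustration of the regression metrics in the last item, the following is a minimal sketch in plain Python. The values are made up and the function name `regression_metrics` is hypothetical. scikit-learn's `sklearn.metrics` module offers equivalent functions for real use.

```python
import math

# Hypothetical targets and predictions, for illustration only.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 8.0]


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2); assumes y_true is not constant."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    # Mean absolute error: average size of the errors.
    mae = sum(abs(r) for r in residuals) / n

    # Root mean squared error: penalizes large errors more heavily.
    rmse = math.sqrt(sum(r * r for r in residuals) / n)

    # R^2: share of the variance in y_true explained by the predictions.
    mean_true = sum(y_true) / n
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot

    return mae, rmse, r2


mae, rmse, r2 = regression_metrics(y_true, y_pred)
print(f"MAE:  {mae:.3f}")
print(f"RMSE: {rmse:.3f}")
print(f"R^2:  {r2:.3f}")
```

MAE and RMSE are expressed in the units of the target, while R^2 is unitless, so they answer slightly different questions about the same model.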