R Data Pre-Processing & Data Management – Shape your Data!

What you’ll learn
- import data into R in several ways and identify a suitable import tool
- select and implement a proper object class (data.frame, data.table, data_frame)
- convert your data into (and understand) a tidy data format
- filter and query your data based on a wide range of parameters
- join two tables with dplyr’s two-table verb syntax
- use SQL code within R
- translate basic R into SQL
- work with dates and time
- work with strings using regular expressions
- detect outliers in datasets
Requirements
- Computer with R and RStudio ready to use
- You should have basic R / RStudio knowledge
- Required add-on packages will be listed in the course orientation video
Let’s get your data in shape!
Data pre-processing is the very first step in data analytics. You cannot escape it, because it is too important. Unfortunately, this topic is widely overlooked and information is hard to find.
With this course, I will change that!
Data Pre-Processing as taught in this course has the following steps:
1. Data import: this might sound trivial, but given all the different data formats out there, it can get confusing. In the course we will take a look at the standard way of importing csv files, we will learn about the very fast fread function from data.table, and I will show you what you can do if you have more exotic file formats to handle.
2. Selecting the object class: a standard data.frame might be fine for simple standard tasks, but there are more advanced classes out there, like the data.table. With today’s huge datasets in particular, a data.frame might no longer be enough. Alternatives will be demonstrated in this course.
3. Getting your data into a tidy form: a tidy dataset has one row for each observation and one column for each variable. This might sound trivial, but in your daily work you will find instances where this simple rule is not followed. Often you will not even notice that the dataset is not tidy in its layout. We will learn how tidyr can help you get your data into a clean and tidy format.
4. Querying and filtering: when you have a huge dataset, you need to filter for the desired parameters. We will learn how to combine parameters and implement advanced filtering methods. data.table in particular has proven effective for this sort of querying on huge datasets, so we will focus on this package in the querying section.
5. Joining data: there are several ways to join data in R, but here we will take a look at dplyr and its two-table verbs, which are a great tool for working with two tables at the same time.
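As a rough illustration of the five steps above, here is a minimal sketch in R. The file contents, the column names (id, height_2019, height_2020) and the groups lookup table are made up for this example, and pivot_longer is just one tidyr option for reshaping, not necessarily the function used in the course.

```r
library(data.table)
library(tidyr)
library(dplyr)

# 1. Data import: write a tiny, made-up csv file, then read it back in two ways.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,height_2019,height_2020",
             "1,150,152",
             "2,160,161",
             "3,170,171"), csv_path)
df_base <- read.csv(csv_path)  # base R import, returns a data.frame
dt_fast <- fread(csv_path)     # data.table import, returns a data.table

# 2. Object class: the same data can live in different classes.
class(df_base)                 # "data.frame"
class(dt_fast)                 # "data.table" "data.frame"
tb <- as_tibble(df_base)       # tibble, the tidyverse flavour of a data.frame

# 3. Tidy form: the two height columns hide a variable (the year).
#    Reshape so that each row is one observation (one id in one year).
tidy <- pivot_longer(df_base,
                     cols = starts_with("height_"),
                     names_to = "year",
                     names_prefix = "height_",
                     values_to = "height")

# 4. Querying and filtering with data.table: combine two conditions.
tidy_dt <- as.data.table(tidy)
tall_2020 <- tidy_dt[year == "2020" & height > 155]

# 5. Joins with dplyr two-table verbs (hypothetical lookup table).
groups <- data.frame(id = c(1L, 2L, 4L), group = c("a", "b", "c"))
joined    <- left_join(df_base, groups, by = "id")   # keep every row of df_base
matched   <- inner_join(df_base, groups, by = "id")  # only ids found in both
unmatched <- anti_join(df_base, groups, by = "id")   # ids in df_base without a match
```

In this toy dataset, left_join keeps all three ids and fills the missing group with NA, while anti_join returns only the id that has no partner in the lookup table.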
How do you best prepare yourself for this course?
You only need a basic knowledge of R to fully benefit from this course. Once you know the basics of RStudio and R, you are ready to follow along with the course material. Of course, you will also get the R scripts, which makes following along even easier.
The screencasts are made in RStudio, so you should install this program on top of R. Required add-on packages are listed in the course.
Again, if you want to make sure that you have proper data with a tidy format, take a look at this course. It will make your analytics with R much easier!
Who this course is for:
- Data pre-processing is a crucial step of data-related work, so this course is intended for all R users