Data Science with R


Overview

Data Science with R is useful at multiple career stages, so it is recommended for professionals, students, and managers alike. Data science now plays a central role in decision-making across industries. This course covers foundational data science tools and techniques, including getting, cleaning, and exploring data, programming in R, and conducting reproducible research. Learners who complete this course will be prepared to take the Data Science Statistics and Machine Learning specialization, in which they build a data product using real-world data.

In this course, you’ll learn how to transform and clean your data and how to create and interpret descriptive statistics, data visualizations, and statistical models. You’ll then learn how to handle Big Data, make predictions using machine learning algorithms, and deploy R code to production. By the end of this course, you’ll have the skills necessary to use R and the principles of data science to transform your data into actionable insight.

Objectives

  • In-depth knowledge of the Data Science Life Cycle and Machine Learning Algorithms
  • Comprehensive knowledge of tools and techniques for Data Transformation
  • The ability to perform Text Mining and Sentiment Analysis on text data, plus insight into Data Visualization and Optimization techniques
  • Exposure to real-life, industry-based projects executed in RStudio
  • Diverse projects covering media, healthcare, social media, aviation, and HR
  • Close involvement of a subject-matter expert (SME) throughout the Data Science Training, so you learn industry standards and best practices

Outline

Introduction to Data Science

  • What is Data Science?
  • What is Machine Learning?
  • What is Deep Learning?
  • What is AI?
  • Data Analytics & its types

Introduction to R

  • What is R?
  • Why R?
  • Installing R
  • R environment
  • How to get help in R
  • RStudio Overview

R Basics

  • Environment setup
  • Data Types
  • Variables
  • Vectors
  • Lists
  • Matrix
  • Array
  • Factors
  • Data Frames
  • Loops
  • Packages
  • Functions
  • In-Built Data sets

R Packages

  • DMwR
  • dplyr/plyr
  • caret
  • lubridate
  • e1071
  • cluster/fpc
  • data.table
  • stats/utils
  • ggplot/ggplot2
  • glmnet

Importing Data

  • Reading CSV files
  • Saving data as R data objects
  • Loading R data objects
  • Writing data to CSV file
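
To give a flavor of this module, here is a minimal sketch of a CSV round trip using only base R. The data frame `scores`, its values, and the temporary file paths are illustrative assumptions and are not taken from the course materials. `saveRDS()` and `readRDS()` are shown as one common way to store and reload a native R object.

```r
# Illustrative data frame (made-up values, not course data)
scores <- data.frame(id = 1:3, score = c(82.5, 91.0, 77.25))

# Write the data frame to a CSV file in a temporary location
csv_path <- tempfile(fileext = ".csv")
write.csv(scores, csv_path, row.names = FALSE)

# Read the CSV file back into a new data frame
scores_csv <- read.csv(csv_path)

# Save the object in R's native serialized format and load it again
rds_path <- tempfile(fileext = ".rds")
saveRDS(scores, rds_path)
scores_rds <- readRDS(rds_path)
```

Setting `row.names = FALSE` keeps R from writing an extra index column that would otherwise reappear as a field when the file is read back.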

Manipulating Data

  • Selecting rows/observations
  • Rounding numbers
  • Selecting columns/fields
  • Merging data
  • Data aggregation
  • Data munging techniques
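
The operations listed above can be sketched in base R as follows. The `employees` and `locations` tables, their column names, and their values are hypothetical and serve only as an illustration. The course also lists dplyr, which offers equivalent verbs, but it is not used here.

```r
# Illustrative tables (made-up values)
employees <- data.frame(
  emp_id = 1:5,
  dept   = c("HR", "IT", "IT", "HR", "Ops"),
  salary = c(52000.456, 61000.123, 58000.789, 49500.5, 45000.0)
)
locations <- data.frame(
  dept  = c("HR", "IT", "Ops"),
  floor = c(1, 2, 3)
)

# Select rows (observations) that satisfy a condition
it_staff <- employees[employees$dept == "IT", ]

# Select columns (fields) by name
pay <- employees[, c("emp_id", "salary")]

# Round a numeric column to one decimal place
employees$salary <- round(employees$salary, 1)

# Merge two data frames on their shared key column
merged <- merge(employees, locations, by = "dept")

# Aggregate: mean salary per department
dept_mean <- aggregate(salary ~ dept, data = employees, FUN = mean)
```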

Statistics Basics

  • Central Tendency
  • Mean
  • Median
  • Mode
  • Skewness
  • Normal Distribution
  • Probability Basics
  • What is probability?
  • Types of Probability
  • Odds Ratio
  • Standard Deviation
  • Data deviation & distribution
  • Variance
  • Bias-Variance Tradeoff
  • Underfitting
  • Overfitting
  • Distance metrics
  • Euclidean Distance
  • Manhattan Distance
  • Outlier analysis
  • What is an Outlier?
  • Interquartile Range
  • Box & whisker plot
  • Upper Whisker
  • Lower Whisker
  • Scatter plot
  • Cook’s Distance
  • Missing Value treatments
  • What is an NA?
  • Central Imputation
  • KNN imputation
  • Dummification
  • Correlation
  • Pearson correlation
  • Positive & Negative correlation
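
Several of the statistics topics above can be tried directly in base R. The following is a minimal sketch on made-up numbers: the vector `x`, the two points, and the correlation inputs are all illustrative assumptions. Base R has no built-in function for the statistical mode, so the most frequent value is taken from a frequency table.

```r
# Illustrative sample with one missing value and one extreme value
x <- c(4, 7, 7, 9, 10, 12, NA, 45)

# Central tendency (NA values dropped)
x_mean   <- mean(x, na.rm = TRUE)
x_median <- median(x, na.rm = TRUE)
x_mode   <- as.numeric(names(which.max(table(x))))  # most frequent value

# Spread
x_var <- var(x, na.rm = TRUE)
x_sd  <- sd(x, na.rm = TRUE)

# Outliers by the 1.5 * IQR rule
q     <- unname(quantile(x, c(0.25, 0.75), na.rm = TRUE))
iqr   <- q[2] - q[1]
lower <- q[1] - 1.5 * iqr
upper <- q[2] + 1.5 * iqr
outliers <- x[!is.na(x) & (x < lower | x > upper)]

# Central imputation: replace each NA with the median
x_imputed <- ifelse(is.na(x), x_median, x)

# Distance metrics between two points
p1 <- c(1, 2)
p2 <- c(4, 6)
euclidean <- sqrt(sum((p1 - p2)^2))
manhattan <- sum(abs(p1 - p2))

# Pearson correlation between two numeric vectors
pearson <- cor(c(1, 2, 3, 4, 5), c(2, 4, 5, 4, 5), method = "pearson")
```

Imputing with the median rather than the mean is a design choice made here because the sample contains an extreme value that would pull the mean upward.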

Error Metrics

  • Classification
  • Confusion Matrix
  • Precision
  • Recall
  • Specificity
  • F1 Score
  • Regression
  • MSE
  • RMSE
  • MAPE
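
The metrics above follow directly from counts and residuals. Here is a minimal base R sketch, assuming made-up labels and predictions (`actual`, `predicted`, `y`, and `y_hat` are hypothetical names and values). The caret package listed earlier provides a ready-made confusion matrix helper, but it is not used here.

```r
# Illustrative labels (1 = positive class, 0 = negative class)
actual    <- c(1, 0, 1, 1, 0, 0, 1, 0, 1, 0)
predicted <- c(1, 0, 0, 1, 0, 1, 1, 0, 1, 0)

# Confusion matrix: rows are predictions, columns are actual labels
confusion <- table(Predicted = predicted, Actual = actual)

# Counts of true/false positives and negatives
tp <- sum(predicted == 1 & actual == 1)
tn <- sum(predicted == 0 & actual == 0)
fp <- sum(predicted == 1 & actual == 0)
fn <- sum(predicted == 0 & actual == 1)

# Classification metrics
precision   <- tp / (tp + fp)
recall      <- tp / (tp + fn)
specificity <- tn / (tn + fp)
f1          <- 2 * precision * recall / (precision + recall)

# Regression metrics on illustrative actual and predicted values
y     <- c(100, 150, 200, 250)
y_hat <- c(110, 140, 210, 240)
mse  <- mean((y - y_hat)^2)
rmse <- sqrt(mse)
mape <- mean(abs((y - y_hat) / y)) * 100  # in percent
```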

Machine Learning

Supervised Learning

  • Linear Regression
  • Linear Equation
  • Slope
  • Intercept
  • R-squared value
  • Logistic regression
  • Odds ratio
  • Probability of success
  • Probability of failure
  • ROC curve
  • Bias-Variance Tradeoff
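
As a minimal sketch of the two regression models above, the following uses `mtcars`, a data set that ships with base R. The choice of `mpg`, `wt`, and `am` as variables is an assumption made for illustration and is not taken from the course materials.

```r
# Linear regression: fuel economy (mpg) as a linear function of weight (wt)
lin_fit   <- lm(mpg ~ wt, data = mtcars)
intercept <- coef(lin_fit)[["(Intercept)"]]
slope     <- coef(lin_fit)[["wt"]]
r_squared <- summary(lin_fit)$r.squared

# Logistic regression: probability of a manual transmission (am = 1) given weight
log_fit    <- glm(am ~ wt, data = mtcars, family = binomial)
odds_ratio <- exp(coef(log_fit)[["wt"]])            # multiplicative change in odds per unit of wt
p_success  <- predict(log_fit, type = "response")   # fitted P(am = 1)
p_failure  <- 1 - p_success                         # fitted P(am = 0)
```

Exponentiating a logistic regression coefficient gives the odds ratio, which ties the "Odds ratio" and "Probability of success" items together: odds are `p_success / p_failure`.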

Unsupervised Learning

  • K-Means
  • K-Means++
  • Hierarchical Clustering
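
A minimal sketch of both clustering approaches, using the `iris` data set that ships with base R. The choice of three clusters and of complete linkage is an assumption for illustration. Note that base R's `kmeans()` starts from randomly chosen rows, so K-Means++ seeding is not shown here.

```r
# Use the four numeric measurements, standardized to mean 0 and sd 1
measurements <- scale(iris[, 1:4])

# K-means with k = 3 and several random starts; seed fixed for repeatability
set.seed(42)
km <- kmeans(measurements, centers = 3, nstart = 25)

# Agglomerative hierarchical clustering on Euclidean distances, cut into 3 groups
hc        <- hclust(dist(measurements, method = "euclidean"), method = "complete")
hc_groups <- cutree(hc, k = 3)

# Cross-tabulate cluster assignments against the known species labels
cluster_vs_species <- table(Cluster = km$cluster, Species = iris$Species)
```

Standardizing first keeps variables measured on larger scales from dominating the distance calculation.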

Machine Learning using R

  • Linear Regression
  • Logistic Regression
  • K-Means
  • K-Means++
  • Hierarchical Clustering – Agglomerative
  • CART
  • C5.0
  • Random forest
  • Naïve Bayes

Key Features

  • 24 hours of instructor-led training
  • Course materials
  • Course completion certificate
  • 100% money-back guarantee
  • Flexibility to choose classes
  • Post-training support
  • Acquire end-to-end knowledge of Data Analytics and R tools
  • 24 PDUs
  • Master core analysis concepts and apply them in predictive data modeling
  • Data Science with R certified trainer
  • Industry-specific, real-life examples
  • Expert advice and tips for applying theoretical skills
  • 10% discount on any online course

Who should attend

  • Professionals from any domain who have logical, mathematical, and analytical skills
  • Professionals working on business intelligence, data warehousing, and reporting tools
  • Statisticians, economists, and mathematicians
  • Software programmers
  • Business analysts
  • Six Sigma consultants
  • Freshers from any stream with good analytical and logical skills

What you will learn

  • Gain insight into the roles played by a Data Scientist
  • Analyze several types of data using R
  • Describe the Data Science Life Cycle
  • Work with different data formats like XML, CSV, etc.
  • Learn tools and techniques for Data Transformation
  • Discuss Data Mining techniques and their implementation
  • Analyze data using Machine Learning algorithms in R
  • Explain Time Series and its related concepts
  • Perform Text Mining and Sentiment Analysis on text data
  • Gain insight into Data Visualization and Optimization techniques
  • Understand the concepts of Deep Learning