R

DATA SCIENCE WITH R

  • PARAMETERS: SPECIFICATIONS
  • Tools Used: R
  • Learning Mode: Classroom (instructor-based)
  • Duration: 58 – 52 Hours
  • Batch Size: 5 – 8 students
  • Location: Delhi (Saket)
  • Course Includes: Live scenarios, case studies, project, assessments, mock interview
  • Study Material: PPTs, docs, data, PDFs, etc.

TARGET AUDIENCE:

  • Any graduate - No prior knowledge of Data Science / Analytics is required.

WHAT IS R?

  • R is open-source data analysis software, widely used by data scientists, statisticians, researchers, and data analysts. Anyone who needs to draw insight from data can use R for statistical analysis, data visualization, and predictive modeling.
  • R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, in the 1990s as a statistical platform for their students. Over the decades it has been extended by thousands of user-created libraries/packages.
  • R is a programming language: an object-oriented language created by statisticians, it provides objects, operators, and functions that allow users to explore, model, and visualize data.
  • R is a vectorized language, so a function or operation can be applied to an entire vector without writing a loop. Together with its package ecosystem, this makes it quick and simple to implement machine learning algorithms.
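
As a minimal sketch of what "vectorized" means in practice, the example below converts a set of temperatures in one expression and compares the result with an explicit loop. The data values and variable names are illustrative assumptions, not part of the course material.

```r
# Hypothetical temperature readings in Celsius (illustrative data only)
celsius <- c(21.5, 25.0, 18.2, 30.1)

# Vectorized arithmetic: the formula is applied to every element at once
fahrenheit <- celsius * 9 / 5 + 32

# The equivalent explicit loop, shown only for comparison
fahrenheit_loop <- numeric(length(celsius))
for (i in seq_along(celsius)) {
  fahrenheit_loop[i] <- celsius[i] * 9 / 5 + 32
}

# Both approaches give the same values
print(all.equal(fahrenheit, fahrenheit_loop))
```

The vectorized form is shorter and is usually the idiomatic choice in R.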

JOB PROFILES IN R:

  • R Programmer
  • Data Analyst/Miner
  • Data Modeler
  • Data Scientist
  • ML specialist
  • NLP specialist and many more.

*****COURSE CONTENT*****

FUNDAMENTALS OF STATISTICS:

  • Population and sample
  • Descriptive and Inferential Statistics
  • Statistical data analysis
  • Variables
  • Central Tendency, Sample and Population Distributions
  • Central Limit Theorem (CLT)
  • Estimation & Confidence interval
  • Normal Distribution
  • Skewness
  • Boxplot
  • Standard deviation
  • Standard Error
  • Hypothesis testing
  • P-value
  • Scatter plot and correlation coefficient
  • Scales of Measurements and Data Types
  • Numerical Summarization
  • Outliers & Summary
  • Data Summarization
  • Visual Summarization
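
Several of the topics above (standard deviation, standard error, confidence intervals, hypothesis testing, p-values) can be sketched together in a few lines of base R. The simulated sample, its mean of 170, and its standard deviation of 10 are arbitrary assumptions chosen for illustration.

```r
set.seed(42)  # fixed seed so the simulated sample is reproducible

# Simulated sample of 50 values; the mean and sd are arbitrary assumptions
heights <- rnorm(50, mean = 170, sd = 10)

sample_mean <- mean(heights)
sample_sd   <- sd(heights)
std_error   <- sample_sd / sqrt(length(heights))  # standard error of the mean

# 95% confidence interval for the mean, based on the t distribution
t_crit <- qt(0.975, df = length(heights) - 1)
ci <- c(lower = sample_mean - t_crit * std_error,
        upper = sample_mean + t_crit * std_error)

# One-sample t-test of the null hypothesis that the population mean is 170
test <- t.test(heights, mu = 170)

print(summary(heights))  # numerical summary: quartiles, mean, min, max
print(ci)
print(test$p.value)
```

The interval computed by hand should match the one reported by `t.test()`, which is a useful check when learning the formulas.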

MODULE 1- INTRODUCTION TO R PROGRAMMING

  • Installing & starting with R
  • Basic and environmental features of R.
  • Calculations with R
  • Functions
  • Understanding R language and programming guidelines
  • Listing the objects in the workspace
  • Vectors
  • Extracting elements from vectors
  • Vector arithmetic
  • Simple patterned vectors
  • Missing values and other special values
  • Character vectors and factors
  • More on extracting elements from vectors
  • Matrices and arrays
  • Data frames
  • Dates and times
  • Assignments with Datasets
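
The core objects listed in this module can be illustrated with a short base R sketch. All data values and object names below are hypothetical and chosen only to show the syntax.

```r
# Creating vectors, including simple patterned ones
scores  <- c(72, 85, NA, 90, 66)        # NA marks a missing value
evens   <- seq(2, 10, by = 2)           # 2 4 6 8 10
pattern <- rep(c("a", "b"), times = 3)  # "a" "b" "a" "b" "a" "b"

# Extracting elements by position, by exclusion, and by logical condition
first_two   <- scores[1:2]
without_3rd <- scores[-3]
high_scores <- scores[!is.na(scores) & scores > 80]

# Missing values propagate through calculations unless removed explicitly
mean_score <- mean(scores, na.rm = TRUE)

# A factor stores categorical data with a fixed set of levels
grade <- factor(c("pass", "pass", "absent", "pass", "fail"),
                levels = c("fail", "pass", "absent"))

# A matrix holds one data type; a data frame can mix types by column
m <- matrix(1:6, nrow = 2)
students <- data.frame(score = scores, grade = grade)

# Dates are stored as Date objects, so differences are measured in days
days_between <- as.numeric(as.Date("2024-03-01") - as.Date("2024-01-01"))

print(ls())  # list the objects currently in the workspace
```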

MODULE 2- INTRODUCTION TO DATA ANALYTICS:

  • This module introduces important concepts such as Business Intelligence, Business Analytics, Data, and Information, and shows how R can help solve complex analytical problems.
  • It explains what R is and how it is used by large companies such as Google and Facebook. You will also learn how R is used in industry, compare R with other analytics software, and install R and its packages.
  • Business Analytics, Data, Information
  • Understanding Business Analytics and R
  • Comparing R with other software in analytics
  • Installing R
  • Performing basic operations in R using the command line

MODULE 3- IMPORT AND EXPORT DATA IN R

  • Importing data into R
  • CSV File
  • Excel File
  • Import data from text table
  • DATA SCIENCE USING R-PROGRAMMING
  • Topics
  • Variables in R
  • Scalars
  • Vectors
  • R Matrices
  • List
  • R – Data Frames
  • Using the c(), cbind(), rbind(), attach(), and detach() functions in R
  • R – Factors
  • R – CSV Files
  • R – Excel File
  • Assignments
  • Business Scenario/Group Discussion
  • R Nuts and Bolts
  • Entering input, evaluation, R objects, numbers, attributes, creating vectors, mixing objects, explicit coercion, summary, names, data frames
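
A minimal sketch of the import/export round trip, using only base R and temporary files so that it can run anywhere. The data frame contents are hypothetical. Reading Excel files needs an add-on package (readxl is one common choice) and is not shown here.

```r
# Small illustrative data frame (hypothetical values)
employees <- data.frame(
  name   = c("Asha", "Ravi", "Meena"),
  salary = c(52000, 48000, 61000),
  stringsAsFactors = FALSE
)

# Combine by rows with rbind() and by columns with cbind()
new_row   <- data.frame(name = "Karan", salary = 45000,
                        stringsAsFactors = FALSE)
employees <- rbind(employees, new_row)
employees <- cbind(employees, dept = c("HR", "IT", "IT", "Ops"))

# Export to a temporary CSV file, then import it back
csv_path <- tempfile(fileext = ".csv")
write.csv(employees, csv_path, row.names = FALSE)
imported <- read.csv(csv_path, stringsAsFactors = FALSE)

# The same data as a tab-separated text table
txt_path <- tempfile(fileext = ".txt")
write.table(employees, txt_path, sep = "\t", row.names = FALSE, quote = FALSE)
from_text <- read.table(txt_path, header = TRUE, sep = "\t",
                        stringsAsFactors = FALSE)

str(imported)  # inspect column names and types after import
```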

MODULE 4- MANAGING DATA FRAMES WITH THE DPLYR PACKAGE:

  • The dplyr Package
  • Installing the dplyr package
  • select()
  • filter()
  • arrange()
  • rename()
  • mutate()
  • group_by()
  • %>% (the pipe operator)
  • Assignments
  • Business Scenario/Group Discussion
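
The verbs listed in this module are typically chained with the pipe. The sketch below assumes the dplyr package is already installed (for example via `install.packages("dplyr")`). The `sales` data frame and its column names are hypothetical.

```r
library(dplyr)  # assumes dplyr is installed

# Hypothetical sales records
sales <- data.frame(
  region = c("North", "South", "North", "East", "South", "East"),
  units  = c(10, 4, 7, 12, 9, 3),
  price  = c(2.5, 3.0, 2.5, 4.0, 3.0, 4.0)
)

revenue_by_area <- sales %>%
  filter(units > 3) %>%                     # keep rows with more than 3 units
  mutate(revenue = units * price) %>%       # add a computed column
  rename(area = region) %>%                 # new_name = old_name
  select(area, units, revenue) %>%          # keep only these columns
  group_by(area) %>%                        # one group per area
  summarise(total_revenue = sum(revenue)) %>%
  arrange(desc(total_revenue))              # largest total first

print(revenue_by_area)
```

`summarise()` is not in the list above but is the usual companion to `group_by()`, so it is included here to make the grouping visible.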

MODULE 5- LOOP FUNCTIONS:

  • Looping on the Command Line
  • lapply()
  • sapply()
  • tapply()
  • apply()
  • Assignments
  • Business Scenario/Group Discussion
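
The four loop functions differ mainly in what they iterate over and what they return. A short base R sketch follows, using hypothetical marks and sales figures.

```r
# Hypothetical exam marks for three subjects
marks <- list(maths   = c(80, 65, 90),
              physics = c(70, 75),
              art     = c(88, 92, 79, 85))

# lapply() always returns a list
means_list <- lapply(marks, mean)

# sapply() simplifies the result to a vector where possible
means_vec <- sapply(marks, mean)

# tapply() applies a function to groups defined by a grouping variable
sales  <- c(100, 150, 200, 50, 300)
region <- c("North", "South", "North", "South", "North")
sales_by_region <- tapply(sales, region, sum)

# apply() works over the rows (1) or columns (2) of a matrix
m <- matrix(1:6, nrow = 2)
row_totals <- apply(m, 1, sum)
col_totals <- apply(m, 2, sum)

print(means_vec)
print(sales_by_region)
```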

MODULE 6- DATA MANIPULATION IN R OBJECTIVES:

  • In this module, we start with a sample dirty data set and perform data cleaning on it, producing a data set that is ready for analysis. Along the way, we use and explore the popular functions required to clean data in R.
  • Topics
  • Data sorting
  • Finding and removing duplicate records
  • Cleaning data
  • Merging data
  • Statistical Plotting
  • Bar charts and dot charts
  • Pie charts
  • Histograms
  • Box plots
  • Scatter plots
  • QQ plots
  • Assignments with Datasets
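
The cleaning steps above (removing duplicates, cleaning values, sorting, merging) and one of the plots can be sketched in base R as follows. The `orders` and `cities` tables are hypothetical, and filling the missing amount with the median is just one possible choice.

```r
# A small "dirty" data set: a duplicate row, a missing value, untidy text
orders <- data.frame(
  order_id = c(3, 1, 2, 2, 4),
  customer = c(" asha", "Ravi ", "MEENA", "MEENA", "karan"),
  amount   = c(250, NA, 400, 400, 120),
  stringsAsFactors = FALSE
)

# Find and remove duplicate records
orders <- orders[!duplicated(orders), ]

# Clean the text column and fill the missing amount with the column median
orders$customer <- tolower(trimws(orders$customer))
orders$amount[is.na(orders$amount)] <- median(orders$amount, na.rm = TRUE)

# Sort by order_id
orders <- orders[order(orders$order_id), ]

# Merge with a second table on the shared key column
cities <- data.frame(customer = c("asha", "ravi", "meena", "karan"),
                     city     = c("Delhi", "Pune", "Delhi", "Jaipur"),
                     stringsAsFactors = FALSE)
clean <- merge(orders, cities, by = "customer")

# Box plot of the cleaned amounts; pdf(NULL) opens a null graphics device so
# the sketch also runs without a display (omit it to see the plot on screen)
pdf(NULL)
boxplot(clean$amount, main = "Order amounts")
invisible(dev.off())

print(clean)
```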

OBJECTIVES

  • Control Structure Programming with R
  • The for() loop
  • The if() statement
  • The while() loop
  • The repeat loop, and the break and next statements
  • apply(), sapply(), lapply()
  • Assignments with Datasets
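
A compact sketch of the control structures listed above, in base R. The numbers are arbitrary and serve only to make each construct do something observable.

```r
# for() with if()/else: classify each number as even or odd
values <- c(3, 8, 5, 10)
labels <- character(length(values))
for (i in seq_along(values)) {
  if (values[i] %% 2 == 0) {
    labels[i] <- "even"
  } else {
    labels[i] <- "odd"
  }
}

# while(): keep doubling until the value reaches at least 100
x <- 1
steps <- 0
while (x < 100) {
  x <- x * 2
  steps <- steps + 1
}

# repeat with next and break: sum the odd numbers from 1 to 9
total <- 0
n <- 0
repeat {
  n <- n + 1
  if (n > 9) break        # leave the loop entirely
  if (n %% 2 == 0) next   # skip even numbers
  total <- total + n
}

print(labels)
print(c(x = x, steps = steps, total = total))
```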

FACTORS:

  • Using Factors
  • Manipulating Factors
  • Numeric Factors
  • Creating Factors from Continuous Variables
  • Converting variables to factors or other types
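
The factor topics above can be sketched in base R as follows. The category names, years, and age breakpoints are illustrative assumptions.

```r
# Creating and inspecting a factor with an explicit level order
size <- factor(c("small", "large", "medium", "small"),
               levels = c("small", "medium", "large"))
size_counts <- table(size)

# Manipulating a factor: rename its levels (order follows the levels above)
levels(size) <- c("S", "M", "L")

# Numeric factors: convert via character to recover the original numbers;
# as.numeric() applied directly would return the internal level codes instead
year_f <- factor(c("2019", "2021", "2019"))
years  <- as.numeric(as.character(year_f))

# Creating a factor from a continuous variable with cut()
age <- c(15, 34, 67, 45)
age_group <- cut(age, breaks = c(0, 18, 60, Inf),
                 labels = c("minor", "adult", "senior"))

print(size_counts)
print(age_group)
```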

RESHAPING:

  • Data Modifying
  • Data Frame Variables
  • Recoding Variables
  • The recode Function
  • Reshaping Data Frames
  • The reshape Package
  • Assignments with Datasets
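
The outline names the reshape package and a recode function. The sketch below deliberately swaps in base R equivalents (a named lookup vector for recoding and `stats::reshape()` for reshaping) so that it runs without extra packages. It is not the package-based approach the module covers, and all data values are hypothetical.

```r
# Recoding a variable with a named lookup vector (base R alternative)
grade  <- c("a", "b", "a", "c")
lookup <- c(a = "Excellent", b = "Good", c = "Fair")
grade_label <- unname(lookup[grade])

# Modifying a data frame variable in place
wide <- data.frame(id = 1:3,
                   score_1 = c(10, 20, 30),
                   score_2 = c(11, 21, 31))
wide$score_1 <- wide$score_1 + 0  # placeholder edit to show the $<- syntax

# Reshaping from wide to long format with base reshape()
long <- reshape(wide,
                direction = "long",
                varying   = c("score_1", "score_2"),  # columns to stack
                v.names   = "score",                  # name of stacked column
                timevar   = "round",                  # which column each came from
                times     = 1:2,
                idvar     = "id")

print(grade_label)
print(long)
```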

MODULE 7- BASICS OF STATISTICS & LINEAR & MULTIPLE REGRESSION:

  • This module covers the basics of descriptive and inferential statistics, probability, and regression techniques.
  • Linear and logistic regression are explained from the basics with examples, and each type of regression is implemented in R through its own dedicated case study.
  • Assessing the Accuracy of the Coefficient Estimates
  • Assessing the Accuracy of the Model
  • Estimating the Regression Coefficients
  • Some Important Questions
  • Lab: Linear Regression
  • Libraries
  • Simple Linear Regression
  • Multiple Linear Regression
  • Interaction Terms
  • Qualitative Predictors
  • Writing Functions
  • Assignments with Different Datasets
  • Business Scenario/Group Discussion
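
The lab topics above can be sketched with `lm()` on the `mtcars` data set that ships with R. The choice of data set and predictors is an assumption made for illustration, and `rmse()` is a hypothetical helper written for this example, not a built-in function.

```r
# mtcars is a data set included with R
fit_simple   <- lm(mpg ~ wt, data = mtcars)                # simple linear regression
fit_multiple <- lm(mpg ~ wt + hp, data = mtcars)           # multiple linear regression
fit_interact <- lm(mpg ~ wt * hp, data = mtcars)           # main effects plus wt:hp
fit_qual     <- lm(mpg ~ wt + factor(cyl), data = mtcars)  # qualitative predictor

# Accuracy of the coefficient estimates: standard errors, t values, p-values
coef_table <- summary(fit_multiple)$coefficients

# Accuracy of the model: R-squared and residual standard error
r_squared <- summary(fit_multiple)$r.squared
rse       <- summary(fit_multiple)$sigma

# Writing a function: root mean squared error of a fitted model
# (hypothetical helper defined here for illustration)
rmse <- function(model) sqrt(mean(residuals(model)^2))

print(coef_table)
print(c(simple = rmse(fit_simple), multiple = rmse(fit_multiple)))
```

Because the multiple regression contains the simple model as a special case, its in-sample error cannot be larger. Comparing the two is a quick sanity check, not evidence that the larger model generalizes better.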