## R

### DATA SCIENCE WITH R

| Parameter | Specification |
|---|---|
| Tools used | R |
| Learning mode | Classroom, instructor based |
| Duration | 58 – 52 Hours |
| Batch size | 5-8 students |
| Location | Delhi (Saket) |
| Course includes | Live scenarios, case studies, project, assessments, mock interview |
| Study material | PPTs, docs, data, PDFs, etc. |

### TARGET AUDIENCE:

• Any graduate - No prior knowledge of Data Science / Analytics is required.

### WHAT IS R?

• R is open source data analysis software. It is widely used by data scientists, statisticians, researchers and data analysts; anyone who needs to draw insight from data can use R for statistical analysis, data visualization, and predictive modeling. R was created by Ross Ihaka and Robert Gentleman at the University of Auckland in New Zealand in the 1990s as a statistical platform for their students, and it has since been extended over the decades by thousands of user-created libraries/packages.
• R is a programming language. Created by statisticians, it is an object-oriented language that provides objects, operators, and functions that allow users to explore, model, and visualize data.
• R is a vector language, so a function can be applied to an entire vector without writing a loop. Combined with its package ecosystem, this makes it quick and simple to implement machine learning algorithms.
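
To illustrate what "vector language" means in practice, here is a minimal sketch. The `prices` data and the 18% rate are made up for illustration and are not part of the course material.

```r
# Hypothetical example data: three item prices.
prices <- c(100, 250, 80)

# Vectorized arithmetic: the multiplication is applied to every element at once.
with_tax <- prices * 1.18   # assumed 18% rate, for illustration only

# The equivalent explicit loop, shown only for comparison.
with_tax_loop <- numeric(length(prices))
for (i in seq_along(prices)) {
  with_tax_loop[i] <- prices[i] * 1.18
}

print(with_tax)
print(all.equal(with_tax, with_tax_loop))
```

Both versions compute the same values; the vectorized form is simply shorter and is the style most R code uses.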

### JOB PROFILES IN R:

• R Programmer
• Data Analyst/Miner
• Data Modeler
• Data Scientist
• ML specialist
• NLP specialist and many more.

### FUNDAMENTALS OF STATISTICS:

• Population and sample
• Descriptive and Inferential Statistics
• Statistical data analysis
• Variables
• Central Tendency, Sample and Population Distributions
• Central Limit Theorem (CLT)
• Estimation & Confidence interval
• Normal Distribution
• Skewness
• Boxplot
• Standard deviation
• Standard Error
• Hypothesis testing
• P-value
• Scatter plot and correlation coefficient
• Scales of Measurements and Data Types
• Numerical Summarization
• Outliers & Summary
• Data Summarization
• Visual Summarization
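
Several of the topics above (standard deviation, standard error, confidence interval, hypothesis testing, p-value) can be tried together in a few lines of base R. The sketch below uses a simulated sample; the variable names and the hypothesized mean of 170 are assumptions chosen for illustration.

```r
set.seed(42)                                 # make the simulated sample reproducible
heights <- rnorm(50, mean = 170, sd = 8)     # hypothetical sample of 50 heights (cm)

# Numerical summarization
sample_mean <- mean(heights)
sample_sd   <- sd(heights)
std_error   <- sample_sd / sqrt(length(heights))

# 95% confidence interval for the mean, based on the t distribution
t_crit <- qt(0.975, df = length(heights) - 1)
ci <- c(sample_mean - t_crit * std_error,
        sample_mean + t_crit * std_error)

# One-sample t-test of the null hypothesis that the population mean is 170
test <- t.test(heights, mu = 170)

print(ci)
print(test$p.value)
```

The interval computed by hand matches the one reported in `test$conf.int`, which is a useful check when learning how the pieces fit together.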

### MODULE 1- INTRODUCTION TO R PROGRAMMING

• Installing & starting with R
• Basic and environmental features of R.
• Calculations with R
• Functions
• Understanding R language and programming guidelines
• Listing the objects in the workspace
• Vectors
• Extracting elements from vectors
• Vector arithmetic
• Simple patterned vectors
• Missing values and other special values
• Character vectors
• Factors
• More on extracting elements from vectors
• Matrices and arrays
• Data frames
• Dates and times
• Assignments with Datasets
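
As a rough preview of the objects covered in this module, the following sketch touches vectors, element extraction, missing values, patterned vectors, factors, matrices, data frames and dates. All data and names are hypothetical.

```r
scores <- c(72, 85, NA, 90, 64)          # numeric vector with one missing value

second     <- scores[2]                              # extract by position
high       <- scores[scores > 80 & !is.na(scores)]   # extract by logical condition
avg_score  <- mean(scores, na.rm = TRUE)             # ignore NA when averaging

evens    <- seq(2, 10, by = 2)           # simple patterned vector
repeated <- rep(c("a", "b"), times = 2)  # repeated character vector

grade <- factor(c("low", "high", "low"), levels = c("low", "high"))

m <- matrix(1:6, nrow = 2)               # filled column by column

students <- data.frame(name = c("Asha", "Ravi"), score = c(85, 90))

start_date <- as.Date("2024-01-15")
due_date   <- start_date + 30            # date arithmetic in days

print(ls())                              # list the objects in the workspace
```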

### MODULE 2- INTRODUCTION TO DATA ANALYTICS:

• This module introduces key terms in analytics such as Business Intelligence, Business Analytics, Data, and Information. You will also learn how R can play an important role in solving complex analytical problems.
• This module explains what R is and how it is used by large companies such as Google and Facebook. You will also learn how R is used in industry, compare R with other analytics software, and install R and its packages.
• Business Analytics, Data, Information
• Understanding Business Analytics and R
• Compare R with other software in analytics
• Install R
• Perform basic operations in R using command line

### MODULE 3- IMPORT AND EXPORT DATA IN R

• Importing data into R
• CSV File
• Excel File
• Import data from text table
• DATA SCIENCE USING R-PROGRAMMING
• Topics
• Variables in R
• Scalars
• Vectors
• R Matrices
• List
• R – Data Frames
• Using c(), cbind(), rbind(), attach(), detach() and related functions in R
• R – Factors
• R – CSV Files
• R – Excel File
• Assignments
• Business Scenario/Group Discussion
• R Nuts and Bolts
• Entering Input, Evaluation, R Objects, Numbers, Attributes, Creating Vectors, Mixing Objects, Explicit Coercion, Summary, Names, Data Frames
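
A minimal sketch of the CSV round trip and a few of the functions named above, using only base R. The data, column names and temporary file path are made up for illustration; Excel import needs an add-on package and is not shown here.

```r
# Hypothetical data written to a temporary CSV file, then read back in.
sales <- data.frame(region = c("North", "South"), units = c(120, 95))
csv_path <- tempfile(fileext = ".csv")
write.csv(sales, csv_path, row.names = FALSE)
imported <- read.csv(csv_path, stringsAsFactors = FALSE)

# rbind() adds rows, cbind() adds columns.
more_sales <- data.frame(region = "East", units = 110)
all_sales  <- rbind(imported, more_sales)
all_sales  <- cbind(all_sales, price = c(10, 12, 11))

# Mixing objects: c() coerces everything to the most general type (character here).
mixed <- c(1, "a", TRUE)

# Explicit coercion
pi_approx <- as.numeric("3.14")

print(all_sales)
print(class(mixed))
```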

### MODULE 4- MANAGING DATA FRAMES WITH THE DPLYR PACKAGE:

• The dplyr Package
• Installing the dplyr package
• select()
• filter()
• arrange()
• rename()
• mutate()
• group_by()
• %>%
• Assignments
• Business Scenario/Group Discussion
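
The verbs listed above are usually chained with `%>%`. The sketch below assumes the dplyr package is installed; the `orders` data is hypothetical, and `summarise()` and `desc()` are dplyr functions added here to complete the example even though they are not named in the topic list.

```r
library(dplyr)

# Hypothetical order data
orders <- data.frame(
  customer = c("A", "B", "A", "C"),
  amount   = c(200, 50, 120, 300),
  city     = c("Delhi", "Pune", "Delhi", "Agra")
)

summary_tbl <- orders %>%
  select(customer, amount) %>%          # keep only these columns
  filter(amount > 100) %>%              # keep rows matching a condition
  rename(client = customer) %>%         # new_name = old_name
  mutate(amount_k = amount / 1000) %>%  # add a derived column
  group_by(client) %>%                  # define groups
  summarise(total = sum(amount)) %>%    # one row per group
  arrange(desc(total))                  # sort by total, largest first

print(summary_tbl)
```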

### MODULE 5- LOOP FUNCTIONS:

• Looping on the Command Line
• lapply()
• sapply()
• tapply()
• apply()
• Assignments
• Business Scenario/Group Discussion
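
As a quick illustration of how the four functions differ in what they loop over and what they return, consider this sketch with made-up data:

```r
marks <- list(maths = c(70, 80, 90), physics = c(60, 65))

avg_list <- lapply(marks, mean)   # loops over a list, always returns a list
avg_vec  <- sapply(marks, mean)   # same loop, simplified to a named vector

m <- matrix(1:6, nrow = 2)
row_sums <- apply(m, 1, sum)      # margin 1 = rows
col_sums <- apply(m, 2, sum)      # margin 2 = columns

salary <- c(30, 40, 50, 60)
dept   <- c("HR", "IT", "HR", "IT")
dept_avg <- tapply(salary, dept, mean)   # apply a function within each group

print(avg_vec)
print(dept_avg)
```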

### MODULE 6- DATA MANIPULATION IN R OBJECTIVES:

• In this module, we start with a sample dirty data set and clean it, producing a data set that is ready for analysis, while exploring the popular functions used to clean data in R.
• Topics
• Data sorting
• Finding and removing duplicate records
• Cleaning data
• Merging data
• Statistical Plotting
• Bar charts and dot charts
• Pie charts
• Histograms
• Box plots
• Scatter plots
• QQ plots
• Assignments with Datasets
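
The cleaning steps in this module (duplicates, missing values, sorting, merging) might look like the following in base R. This is only a sketch on a tiny made-up data set; the plotting topics are not covered here.

```r
# Hypothetical "dirty" data: one duplicated row and one missing name.
customers <- data.frame(
  id   = c(3, 1, 2, 2, 4),
  name = c("Cara", "Anil", "Bina", "Bina", NA),
  stringsAsFactors = FALSE
)

deduped <- customers[!duplicated(customers), ]    # drop exact duplicate rows
cleaned <- deduped[complete.cases(deduped), ]     # drop rows containing NA
sorted  <- cleaned[order(cleaned$id), ]           # sort by id, ascending

# Merge with a second table on the shared key (inner join by default).
purchases <- data.frame(id = c(1, 2, 5), amount = c(250, 100, 75))
merged <- merge(sorted, purchases, by = "id")

print(merged)
```

Only ids present in both tables survive the default `merge()`; passing `all.x = TRUE` would keep unmatched customers with `NA` amounts instead.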

### OBJECTIVES

• Control Structure Programming with R
• The for() loop
• The if() statement
• The while() loop
• The repeat loop, and the break and next statements
• apply(), sapply(), lapply()
• Assignments with Datasets
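
A compact sketch of the control structures listed above, with illustrative variable names:

```r
# for() with if() and next: sum only the odd numbers from 1 to 5.
odd_total <- 0
for (i in 1:5) {
  if (i %% 2 == 0) next        # skip even numbers
  odd_total <- odd_total + i
}

# while(): keep doubling until the value reaches at least 100.
n <- 1
while (n < 100) {
  n <- n * 2
}

# repeat with break: loops until the exit condition is met.
count <- 0
repeat {
  count <- count + 1
  if (count >= 3) break
}

print(c(odd_total, n, count))
```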

### FACTORS:

• Using Factors
• Manipulating Factors
• Numeric Factors
• Creating Factors from Continuous Variables
• Converting variables to factors or other types
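
The sketch below illustrates the factor topics with made-up data, including a common pitfall when converting numeric-looking factors back to numbers.

```r
# Creating a factor with an explicit level order
size <- factor(c("small", "large", "medium", "small"),
               levels = c("small", "medium", "large"))
size_counts <- table(size)

# Manipulating factors: rename one level
levels(size)[levels(size) == "medium"] <- "mid"

# Numeric factors: as.numeric() alone returns the internal level codes,
# so convert through character to recover the original values.
num_f  <- factor(c("10", "20", "10"))
codes  <- as.numeric(num_f)
values <- as.numeric(as.character(num_f))

# Creating a factor from a continuous variable with cut()
ages <- c(15, 34, 67, 45)
age_group <- cut(ages, breaks = c(0, 18, 60, Inf),
                 labels = c("child", "adult", "senior"))

print(size_counts)
print(age_group)
```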

### RESHAPING:

• Data Modifying
• Data Frame Variables
• Recoding Variables
• The recode Function
• Reshaping Data Frames
• The reshape Package
• Assignments with Datasets
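
The module names the recode function and the reshape package, both of which come from add-on packages. To stay self-contained, the sketch below swaps in base R equivalents instead: a named lookup vector for recoding and `stats::reshape()` for wide-to-long conversion. The data and column names are hypothetical.

```r
# Recoding a variable with a named lookup vector (base R stand-in for a recode function)
grade  <- c("a", "b", "a")
lookup <- c(a = "low", b = "high")
grade_recoded <- unname(lookup[grade])

# Modifying a data frame variable in place
wide <- data.frame(id = 1:2, score.1 = c(10, 20), score.2 = c(11, 21))
wide$score.2 <- wide$score.2 + 1        # add one point to the second score

# Reshaping from wide to long with base reshape();
# column names of the form "score.1", "score.2" are split at the "." automatically.
long <- reshape(wide,
                direction = "long",
                varying   = c("score.1", "score.2"),
                idvar     = "id",
                timevar   = "time")

print(grade_recoded)
print(long)
```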

### MODULE 7- BASICS OF STATISTICS & LINEAR & MULTIPLE REGRESSION:

• This module covers the basics of descriptive and inferential statistics, probability, and regression techniques.
• Linear and logistic regression are explained from the basics with examples, and implemented in R using two case studies dedicated to each type of regression discussed.
• Assessing the Accuracy of the Coefficient Estimates
• Assessing the Accuracy of the Model
• Estimating the Regression Coefficients.
• Some Important Questions
• Lab: Linear Regression.
• Libraries
• Simple Linear Regression
• Multiple Linear Regression
• Interaction Terms
• Qualitative Predictors
• Writing Functions
• Assignments with Different Datasets
• Business Scenario/Group Discussion
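
The lab topics above can be previewed with R's built-in `mtcars` data set. The choice of data set, the predictors, and the `rmse()` helper are assumptions made for this sketch, not the course's own case studies.

```r
data(mtcars)

# Simple linear regression: fuel economy as a function of weight
simple_fit <- lm(mpg ~ wt, data = mtcars)
coef_est   <- coef(simple_fit)                  # estimated coefficients
coef_ci    <- confint(simple_fit)               # accuracy of the coefficient estimates
r_squared  <- summary(simple_fit)$r.squared     # accuracy of the model

# Multiple linear regression
multi_fit <- lm(mpg ~ wt + hp, data = mtcars)

# Interaction term: wt * hp expands to wt + hp + wt:hp
inter_fit <- lm(mpg ~ wt * hp, data = mtcars)

# Qualitative predictor: treat the number of cylinders as a factor
qual_fit <- lm(mpg ~ wt + factor(cyl), data = mtcars)

# Writing functions: a hypothetical helper for in-sample root mean squared error
rmse <- function(model) {
  sqrt(mean(residuals(model)^2))
}

print(coef_est)
print(c(simple = rmse(simple_fit), multiple = rmse(multi_fit)))
```

Adding a predictor never increases the in-sample error of a least-squares fit, which is why out-of-sample checks are needed before preferring the larger model.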