Introduction to Data Science

Download the free course Introduction to Data Science, a 722-page PDF by Rafael A Irizarry.
The demand for skilled data science practitioners in industry, academia, and government is rapidly growing. This book introduces concepts and skills that can help you tackle real-world data analysis challenges. It covers concepts from probability, statistical inference, linear regression, and machine learning. It also helps you develop skills such as R programming, data wrangling with dplyr, data visualization with ggplot2, algorithm building with caret, file organization with the UNIX/Linux shell, version control with Git and GitHub, and reproducible document preparation with knitr and R Markdown. The book is divided into six parts: R; Data Visualization; Data Wrangling; Probability, Inference, and Regression with R; Machine Learning; and Productivity Tools. Each part has several chapters, each meant to be presented as one lecture. The book includes dozens of exercises distributed across most chapters.

Table of contents

  • R
      • Getting Started with R and RStudio
      • R Basics
      • Programming basics
      • The tidyverse
      • Importing data
  • Data Visualization
      • Introduction to data visualization
      • ggplot2
      • Visualizing data distributions
      • Data visualization in practice
      • Data visualization principles
      • Robust summaries
  • Statistics with R
      • Introduction to Statistics with R
      • Probability
      • Random variables
      • Statistical Inference
      • Statistical models
      • Regression
      • Linear Models
      • Association is not causation
  • Data Wrangling
      • Introduction to Data Wrangling
      • Reshaping data
      • Joining tables
      • Web Scraping
      • String Processing
      • Parsing Dates and Times
      • Text mining
  • Machine Learning
      • Introduction to Machine Learning
      • Smoothing
      • Cross validation
      • The caret package
      • Examples of algorithms
      • Machine learning in practice
      • Large datasets
      • Clustering
  • Productivity tools
      • Introduction to productivity tools
      • Organizing with Unix
      • Git and GitHub
      • Reproducible projects with RStudio and R markdown
Pages: 722
Size: 55.8 MB
File type: PDF
Downloads: 163
Created: 2022-02-03
License: CC BY-NC-SA
Author(s): Rafael A Irizarry