# Learning Resources and Roadmap for Data Scientists

This roadmap is meant to help you enter the data science field. All of the resources listed here are available for free. Data science has several prerequisites: mathematics, programming, and machine learning theory. These areas are highly interconnected: in some cases, knowing the implementation helps you understand the mathematics behind it, and vice versa. If you get stuck on a topic and don’t fully understand it, move on to the next topic, or study the same topic from another angle (e.g. the math instead of the theory). Think of this learning as cyclical: go over things once to grasp the concepts, then revisit them to understand them fully.

## Free Resources for Beginners

**1. Programming**

**Online interactive Python course:**

Learn Python – Free Interactive Python Tutorial

**Corey Schafer (YouTube):**

Python Tutorials

**2. Calculus**

**3. Linear Algebra**

**4. Probability and Statistics**

Khan Academy Statistics and Probability

## Main Topics

**1. Machine Learning**

**Stanford lecture with Andrew Ng on Machine Learning**

Lecture 1 – Welcome | Stanford CS229: Machine Learning (Autumn 2018)

**Introduction to Statistical Learning (Must Read)**

https://www.statlearning.com/s/ISLRSeventhPrinting.pdf

**Stanford machine learning course, which includes R code**

In-depth introduction to machine learning in 15 hours of expert videos

**2. Useful Libraries and YouTube Videos**

**numpy**

Working with vectors, numbers, matrices, tensors
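As a rough illustration of what that means in practice, here is a minimal sketch; the array values are made up for the example.

```python
import numpy as np

# A 1-D vector and a 2-D matrix built from plain Python lists
vector = np.array([1.0, 2.0, 3.0])
matrix = np.array([[1.0, 0.0, 2.0],
                   [0.0, 1.0, 1.0]])

# Matrix-vector product: shape (2, 3) @ shape (3,) gives shape (2,)
product = matrix @ vector

# Element-wise arithmetic applies to every entry without an explicit loop
scaled = vector * 2

# A tensor is simply an array with more axes
tensor = np.zeros((2, 3, 4))

print(product)
print(scaled)
print(tensor.shape)
```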

Keith Galli (Numpy):

Complete Python NumPy Tutorial (Creating Arrays, Indexing, Math, Statistics, Reshaping)

**pandas**

To read csv files and manipulate tabular data
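A small sketch of that workflow follows. The CSV contents and column names are hypothetical, and the data is read from an in-memory string so the example runs on its own; with a real file you would pass its path to `pd.read_csv`.

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for a file on disk
csv_text = """name,team,score
Ana,red,90
Ben,blue,75
Cleo,red,82
"""

df = pd.read_csv(io.StringIO(csv_text))

# Keep only the rows that satisfy a condition
high_scores = df[df["score"] > 80]

# Group rows by a column and aggregate another column
mean_by_team = df.groupby("team")["score"].mean()

print(high_scores)
print(mean_by_team)
```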

Keith Galli (Pandas):

Complete Python Pandas Data Science Tutorial! (Reading CSV/Excel files, Sorting, Filtering, Groupby)

**sklearn**

For classical ML solutions, most algorithms can be found here
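To give a feel for the typical fit-and-score pattern, here is a minimal sketch. The choice of the built-in iris dataset and of logistic regression is mine, purely for illustration; most other scikit-learn estimators follow the same `fit` / `score` interface.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Small built-in dataset: 150 flowers, 4 numeric features, 3 classes
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the data to estimate performance on unseen samples
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Train a classical classifier and measure accuracy on the held-out set
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)

print(f"Test accuracy: {accuracy:.2f}")
```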

Keith Galli (Sklearn):

Real-World Python Machine Learning Tutorial w/ Scikit Learn (sklearn basics, NLP, classifiers, etc)

**Matplotlib and seaborn**

For visualizing tabular data

Derek Banas:

Seaborn Tutorial 2021

**os**

To create folders, get folder/file names, make usable path strings
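The sketch below touches each of those tasks. The folder and file names are hypothetical, and everything happens inside a temporary directory so nothing is left behind.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build a path string that works on any operating system
    data_dir = os.path.join(root, "data", "raw")

    # Create the folder and any missing parents; no error if it already exists
    os.makedirs(data_dir, exist_ok=True)

    # Create an empty-ish file so there is something to list
    file_path = os.path.join(data_dir, "example.txt")
    with open(file_path, "w") as f:
        f.write("hello")

    # List the names inside a folder
    names = os.listdir(data_dir)

    # Split a path into its file name, stem, and extension
    base_name = os.path.basename(file_path)
    stem, extension = os.path.splitext(base_name)

print(names, stem, extension)
```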

Corey Schafer:

Python Tutorial: OS Module – Use Underlying Operating System Functionality

**glob**

To get a list of files/folders within a folder. If you find yourself using `os.listdir` or `os.walk`, consider using glob instead
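A minimal sketch of pattern-based listing, using hypothetical file names created in a temporary directory:

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Set up a few empty files, one of them in a subfolder
    os.makedirs(os.path.join(root, "sub"))
    for relative in ["a.csv", "b.csv", "notes.txt", os.path.join("sub", "c.csv")]:
        with open(os.path.join(root, relative), "w") as f:
            f.write("")

    # "*" matches any file name within a single folder
    top_level_csvs = sorted(glob.glob(os.path.join(root, "*.csv")))

    # "**" with recursive=True also descends into subfolders
    all_csvs = sorted(glob.glob(os.path.join(root, "**", "*.csv"), recursive=True))

    top_names = [os.path.basename(p) for p in top_level_csvs]
    all_names = [os.path.basename(p) for p in all_csvs]

print(top_names)
print(all_names)
```

`glob.glob` returns matches in arbitrary order, which is why the results are sorted here.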

PythonHumanities:

Python and the Glob Function Easy Tutorial

**shutil**

To move and copy files around
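For instance, copying and moving a file might look like the sketch below. The file names are hypothetical and the work is done in a temporary directory.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Create a small file to work with
    source = os.path.join(root, "report.txt")
    with open(source, "w") as f:
        f.write("draft")

    # Copy the file; the original stays in place
    backup = shutil.copy(source, os.path.join(root, "report_backup.txt"))

    # Move the original into an existing folder
    archive_dir = os.path.join(root, "archive")
    os.makedirs(archive_dir)
    moved = shutil.move(source, archive_dir)

    source_still_exists = os.path.exists(source)
    moved_name = os.path.basename(moved)
    remaining = sorted(os.listdir(root))

print(remaining, moved_name, source_still_exists)
```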

shutil — High-level file operations — Python 3.9.6 documentation

**cv2**

To read images and visualize them, and a lot more computer vision operations

OpenCV Course – Full Tutorial with Python

**PIL**

To read images and visualize them, and detect corrupted images
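One possible way to screen for corrupted images is Pillow's `Image.verify()`, sketched below. The helper name `is_valid_image` is my own, and the check works on in-memory bytes so the example is self-contained. Note that `verify()` does not decode the pixel data, so it may not catch every kind of damage; calling `load()` on a freshly opened image is a stricter, slower alternative.

```python
import io

from PIL import Image


def is_valid_image(data: bytes) -> bool:
    """Return True if Pillow can open and verify the bytes as an image."""
    try:
        with Image.open(io.BytesIO(data)) as img:
            img.verify()  # raises an exception if the file looks broken
        return True
    except Exception:
        return False


# Build a small valid PNG in memory
buffer = io.BytesIO()
Image.new("RGB", (8, 8), color=(255, 0, 0)).save(buffer, format="PNG")
good_bytes = buffer.getvalue()

# Bytes that are not an image at all
bad_bytes = b"this is not an image"

print(is_valid_image(good_bytes))
print(is_valid_image(bad_bytes))
```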

Corey Schafer:

Python Tutorial: Image Manipulation with Pillow

**3. Deep Learning**

You need to develop an intuition for how neural networks work and how they are implemented. This section lists courses and materials that help with that.

Deep Learning Book

**Coursera Andrew Ng**

This course is a must-watch. It covers many of the underlying concepts and mathematical tools you need to understand what deep learning is. You can watch it for free on Coursera.

Deep Learning by deeplearning.ai

I recommend watching the first 4 courses of this specialization.

**PyTorch Framework:**

Official tutorials are really good if you have a good understanding of deep learning theory: https://pytorch.org/tutorials/

I found two YouTube tutorials that cover a lot of topics:

PyTorch for Deep Learning – Full Course / Tutorial (watch until the 8th hour)

PyTorch Tutorial 01 – Installation (watch until tutorial 17)

**Krish Naik (YouTube)**

Complete Road Map To Prepare For Deep Learning

**3Blue1Brown**

This is an awesome channel that visualizes various mathematical concepts. I highly recommend it.

Neural Network: https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi