Learning Resources and Roadmap for Data Scientists
The following is a roadmap that will help you enter the data science field. The resources mentioned here are available for free. Data science has multiple prerequisites: mathematics, programming, and machine learning theory. These fields are highly interconnected. In some cases, understanding the mathematics is easier if you already know the implementation, and vice versa. If you get stuck on a topic and don’t understand it fully, move on to the next topic, or study the same topic from another angle (e.g. math vs. theory). Treat this learning as cyclical: go over things once to grasp the concepts, then revisit them to understand them fully.
Free Resources for Beginners
Calculus 1 | Math
Essence of calculus
3. Linear Algebra
Khan Academy Linear Algebra
Essence of linear algebra
4. Probability and Statistics
Khan Academy Statistics and Probability
1. Machine Learning
- Stanford lecture with Andrew Ng on Machine Learning.
Lecture 1 – Welcome | Stanford CS229: Machine Learning (Autumn 2018)
- Introduction to Statistical Learning (Must Read)
- Stanford course on machine learning, which includes R code.
In-depth introduction to machine learning in 15 hours of expert videos
2. Useful Libraries and YouTube Videos
Working with vectors, numbers, matrices, tensors
Keith Galli (Numpy):
Complete Python NumPy Tutorial (Creating Arrays, Indexing, Math, Statistics, Reshaping)
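As a rough sketch of what "working with vectors, matrices, tensors" looks like in NumPy, the snippet below touches the topics named in the tutorial title (creating arrays, indexing, math, statistics, reshaping). The array values are arbitrary and chosen only for illustration.

```python
import numpy as np

# A vector (1-D), a matrix (2-D), and a tensor (3-D) are all ndarrays.
vector = np.array([1.0, 2.0, 3.0])
matrix = np.arange(6, dtype=float).reshape(2, 3)  # rows: [0, 1, 2] and [3, 4, 5]
tensor = np.zeros((2, 3, 4))

# Indexing and slicing.
second_row = matrix[1]       # the row [3, 4, 5]
last_column = matrix[:, -1]  # the last entry of every row

# Math: matrix-vector product and element-wise scaling.
product = matrix @ vector
scaled = vector * 2

# Statistics along an axis: the mean of each column.
column_means = matrix.mean(axis=0)

print(product, column_means, tensor.shape)
```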
To read csv files and manipulate tabular data
Keith Galli (Pandas):
Complete Python Pandas Data Science Tutorial! (Reading CSV/Excel files, Sorting, Filtering, Groupby)
For classical ML solutions, most algorithms can be found here
Keith Galli (Sklearn):
Real-World Python Machine Learning Tutorial w/ Scikit Learn (sklearn basics, NLP, classifiers, etc)
- Matplotlib and seaborn
For visualizing tabular data
Seaborn Tutorial 2021
To create folders, get folder/file names, make usable path strings
Python Tutorial: OS Module – Use Underlying Operating System Functionality
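A minimal sketch of the folder and path tasks mentioned above, using only standard-library `os` calls. The folder and file names (`dataset`, `train`, `cat_001.jpg`) are hypothetical, and a temporary directory stands in for a real project folder.

```python
import os
import tempfile

# A temporary directory stands in for a real project folder (assumption).
root = tempfile.mkdtemp()

# Create nested folders; exist_ok=True avoids an error if they already exist.
train_dir = os.path.join(root, "dataset", "train")
os.makedirs(train_dir, exist_ok=True)

# Build a path string in an OS-independent way and create an empty file there.
image_path = os.path.join(train_dir, "cat_001.jpg")
open(image_path, "wb").close()

# Get folder and file names back out of the path.
folder_name = os.path.basename(os.path.dirname(image_path))
file_name = os.path.basename(image_path)
stem, extension = os.path.splitext(file_name)

print(folder_name, file_name, stem, extension)
print(os.listdir(train_dir))
```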
To get a list of files/folders within a folder. If you find yourself using os.listdir or os.walk, consider using glob instead
Python and the Glob Function Easy Tutorial
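To illustrate why `glob` is often more convenient than walking directories by hand, here is a small sketch that collects the same files both ways. The folder layout and file names are made up for the example.

```python
import glob
import os
import tempfile

# Build a small hypothetical folder tree inside a temporary directory.
root = tempfile.mkdtemp()
for rel in ["a.jpg", "b.txt", os.path.join("sub", "c.jpg")]:
    path = os.path.join(root, rel)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    open(path, "w").close()

# os.walk: visit every folder and filter the file names manually.
walked = []
for dirpath, dirnames, filenames in os.walk(root):
    for name in filenames:
        if name.endswith(".jpg"):
            walked.append(os.path.join(dirpath, name))

# glob: one pattern does the same; "**" with recursive=True also matches subfolders.
globbed = glob.glob(os.path.join(root, "**", "*.jpg"), recursive=True)

# Only the .jpg files directly inside root (no recursion).
top_level = glob.glob(os.path.join(root, "*.jpg"))

print(sorted(globbed))
```

Note that `glob` returns paths in arbitrary order, so sort the result if the order matters.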
To move and copy files around
shutil — High-level file operations — Python 3.9.6 documentation
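Copying and moving files with `shutil` can be sketched as follows. The folder names (`raw`, `processed`) and the file `notes.txt` are hypothetical, and everything happens inside a temporary directory that is removed at the end.

```python
import os
import shutil
import tempfile

# Set up a hypothetical source and destination folder.
root = tempfile.mkdtemp()
src_dir = os.path.join(root, "raw")
dst_dir = os.path.join(root, "processed")
os.makedirs(src_dir)
os.makedirs(dst_dir)

src_file = os.path.join(src_dir, "notes.txt")
with open(src_file, "w") as f:
    f.write("hello")

# copy2 copies the contents and tries to preserve metadata such as timestamps.
copied = shutil.copy2(src_file, os.path.join(dst_dir, "notes_copy.txt"))

# move relocates the file; the source path no longer exists afterwards.
moved = shutil.move(src_file, os.path.join(dst_dir, "notes.txt"))

dst_contents = sorted(os.listdir(dst_dir))
src_contents = os.listdir(src_dir)
print(dst_contents, src_contents)

# rmtree deletes a folder together with everything inside it.
shutil.rmtree(root)
```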
To read images and visualize them, and a lot more computer vision operations
OpenCV Course – Full Tutorial with Python
To read images and visualize them, and detect corrupted images
Python Tutorial: Image Manipulation with Pillow
3. Deep Learning
You need to develop an intuition for how neural networks work and how they are implemented. This section lists courses and materials for that purpose.
Deep Learning Book
This course is a must-watch: it covers many of the underlying concepts and mathematical tools needed to understand deep learning. You can watch it for free on Coursera.
Deep Learning by deeplearning.ai
I recommend watching the first 4 courses of this specialization.
Official tutorials are really good if you have a good understanding of deep learning theory: https://pytorch.org/tutorials/
I found two YouTube videos that cover a lot of topics:
PyTorch for Deep Learning – Full Course / Tutorial (watch until the 8th hour)
PyTorch Tutorial 01 – Installation (watch until tutorial 17)
Complete Road Map To Prepare For Deep Learning
This is an awesome mathematics channel that visualizes various concepts; I highly recommend it.
Neural Network: https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi