Top 5 Data-Engineering tools.
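1) Prefect :- A workflow orchestration tool for building, scheduling, and monitoring data pipelines.
2) Dask :- A Python library for parallel computing. It is lightweight and scales pandas- and NumPy-style workloads from a single machine to a cluster.
3) DVC :- Data Version Control, a Git-like tool for versioning data.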
4) Great Expectations :- This Python library lets you declare rules that you expect certain datasets to satisfy, and also validate…
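The idea in item 4, declaring rules for a dataset and then validating against them, can be sketched in plain Python. This is only an illustration of the concept and does not use the Great Expectations API; the helper names (expect_not_null, expect_between, validate) and the sample rows are hypothetical.

```python
# Plain-Python sketch of declarative data expectations.
# NOTE: this is NOT the Great Expectations API; every name here is hypothetical.

def expect_not_null(column):
    """Rule: no row may have a missing (None) value in `column`."""
    def check(rows):
        return all(row.get(column) is not None for row in rows)
    return (f"{column} is not null", check)


def expect_between(column, low, high):
    """Rule: every value in `column` must be present and lie in [low, high]."""
    def check(rows):
        return all(
            row.get(column) is not None and low <= row[column] <= high
            for row in rows
        )
    return (f"{column} between {low} and {high}", check)


def validate(rows, expectations):
    """Run every declared rule and report pass/fail per rule."""
    return {name: check(rows) for name, check in expectations}


# Made-up sample data: one age is out of range and one is missing.
rows = [
    {"id": 1, "age": 34},
    {"id": 2, "age": 151},
    {"id": 3, "age": None},
]

expectations = [
    expect_not_null("id"),
    expect_between("age", 0, 120),
]

report = validate(rows, expectations)
for rule, passed in report.items():
    print(f"{rule}: {'PASS' if passed else 'FAIL'}")
```

The rules are declared once, separately from the data, so the same list of expectations can be run again against every new batch.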
Reading list for distributed systems shared by Arvind Sir on LinkedIn :- https://backendology.com/2018/09/10/distributed-systems-course-reading-list/
Mesos is a cluster resource manager that allows you to treat many servers as a single entity.
Big data: Hadoop and MapReduce code by Cloudera and Udacity.
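For anyone new to MapReduce, here is a minimal single-process sketch of the map, shuffle, and reduce phases using a word count. It is not taken from the Cloudera/Udacity material and does not use Hadoop; the function names and sample documents are assumptions made for illustration.

```python
from collections import defaultdict

# Single-process sketch of the MapReduce pattern (word count).
# Real Hadoop distributes these phases across machines; this only shows the data flow.

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield (word, 1)


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return (key, sum(values))


def word_count(documents):
    """Run map, shuffle, and reduce over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Made-up input documents.
counts = word_count(["big data big ideas", "data pipelines"])
print(counts)
```

Because each map call sees only one document and each reduce call sees only one key, both phases can in principle be spread over many workers, which is what a Hadoop cluster does.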
Big Data Analytics using Python and Apache Spark | Machine Learning : https://www.youtube.com/watch?v=5HWveDdrosk&list=LLzVUJI_CHKiqFFBGhstk3vA&index=3&t=9372s