Best Tutorials + Resources to Build from Scratch, Not Copying Code
Built for programmers who learn by doing, not watching or reading. Tutorials cover key skill sets in computer vision, ML + AI, and data engineering.
In this article, I share project tutorials designed to help you build your own project alongside them: not by copying code, but by developing the actual idea yourself. Each tutorial teaches you to build things natively rather than leaning heavily on external APIs.
The Flask Mega-Tutorial: Your Path to Full-Stack Development
I know how to code and have some really cool ideas, but I often struggle to turn code into a usable platform. Enter the Holy Grail that takes you from “I don’t know how to build anything” to “Buy my B2B SaaS product that I built in the last two days”: The Flask Mega-Tutorial.
This tutorial is for anyone who doesn’t have the patience to sit through another boring tutorial that teaches you to copy code. It is structured to help you build the boilerplate for the project idea you actually want to create. In my opinion, this is the best and quickest way to move from coding in notebooks to building full-stack applications. All you need is some knowledge of Python, and it takes you from coder to full-stack developer in a single walkthrough.
The best part is that it explains what each piece of code does, how it works, and why we use it in a concise but effective manner. By the end of it, you’ll be able to walk anyone through your code and how it works.
By the end you’ll know how to:
set up your environment
create, use, and migrate databases
build APIs
create basic front-ends
connect your front-end and back-end
deploy your application
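To make that list concrete, here is a minimal sketch of a database, a small API, and a front-end page that talks to it, all in one file. It is not code from the Mega-Tutorial: as far as I know the tutorial uses Flask-SQLAlchemy and Flask-Migrate for the database layer, while this sketch swaps in the standard library’s sqlite3 so it stays self-contained. The ideas table, the /api/ideas route, and DB_PATH are names I made up for illustration.

```python
import sqlite3
import tempfile
from pathlib import Path

from flask import Flask, g, jsonify, request

# Hypothetical database location; a real project would read this from config.
DB_PATH = Path(tempfile.gettempdir()) / "ideas_demo.sqlite3"

app = Flask(__name__)

# A tiny front-end: plain HTML plus a fetch() call to the API below.
PAGE = """<!doctype html>
<title>Ideas</title>
<ul id="ideas"></ul>
<script>
  fetch("/api/ideas")
    .then((response) => response.json())
    .then((ideas) => {
      const list = document.getElementById("ideas");
      for (const idea of ideas) {
        const item = document.createElement("li");
        item.textContent = idea.title;
        list.appendChild(item);
      }
    });
</script>
"""


def get_db():
    """Open one SQLite connection per request and cache it on flask.g."""
    if "db" not in g:
        g.db = sqlite3.connect(DB_PATH)
        g.db.row_factory = sqlite3.Row
        # Stand-in for a real migration tool: create the table if missing.
        g.db.execute(
            "CREATE TABLE IF NOT EXISTS ideas "
            "(id INTEGER PRIMARY KEY, title TEXT NOT NULL)"
        )
    return g.db


@app.teardown_appcontext
def close_db(exc):
    """Close the cached connection when the request context ends."""
    db = g.pop("db", None)
    if db is not None:
        db.close()


@app.route("/")
def index():
    return PAGE


@app.route("/api/ideas", methods=["GET"])
def list_ideas():
    rows = get_db().execute("SELECT id, title FROM ideas ORDER BY id").fetchall()
    return jsonify([dict(row) for row in rows])


@app.route("/api/ideas", methods=["POST"])
def add_idea():
    payload = request.get_json(silent=True) or {}
    title = str(payload.get("title", "")).strip()
    if not title:
        return jsonify({"error": "title is required"}), 400
    db = get_db()
    cursor = db.execute("INSERT INTO ideas (title) VALUES (?)", (title,))
    db.commit()
    return jsonify({"id": cursor.lastrowid, "title": title}), 201
```

Assuming you save this as app.py, running `flask --app app run` (Flask 2.2 or newer) should serve it locally. Deployment and proper migrations are left out on purpose; the tutorial covers both in far more depth.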
From here you can learn more complex frameworks like Next.js or Django, but really this is all you need.
Computer Vision
The best tutorials I’ve found for learning advanced skills you can use to build your own product, instead of a copy of someone else’s, are from the Computer Vision Engineer channel on YouTube. Here are some of my favorites:
Customized Object Detection:
Train Yolov9 object detection custom data on Google Colab | 21:32 min
Custom Dataset Creation:
Emotion detection synthetic dataset | 35:24 min
Image Segmentation:
Image segmentation with Yolov8 custom dataset | 46:25 min
Facial Analysis:
Face recognition and face matching | 18:17 min
Image Classification:
Image classification + feature extraction | 21:59 min
Object Tracking:
Yolov8 object detection + deep sort object tracking | 34:32 min
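Most of the tutorials above revolve around bounding boxes, so here is a small sketch of the one calculation that keeps showing up: intersection over union (IoU), the standard way to measure how much two boxes overlap. The greedy matcher below is my own simplified stand-in for a tracker. Deep SORT, as far as I know, adds motion and appearance models on top, so treat this as the idea and not as that algorithm. Box and match_boxes are hypothetical names.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned box given by its top-left and bottom-right corners."""
    x1: float
    y1: float
    x2: float
    y2: float


def area(box: Box) -> float:
    """Area of a box; degenerate or inverted boxes count as zero."""
    return max(0.0, box.x2 - box.x1) * max(0.0, box.y2 - box.y1)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes, between 0.0 and 1.0."""
    overlap = Box(max(a.x1, b.x1), max(a.y1, b.y1), min(a.x2, b.x2), min(a.y2, b.y2))
    intersection = area(overlap)
    union = area(a) + area(b) - intersection
    return intersection / union if union > 0 else 0.0


def match_boxes(previous: list[Box], current: list[Box], threshold: float = 0.5) -> dict[int, int]:
    """Greedily map each current box index to the best unused previous box index."""
    matches: dict[int, int] = {}
    used: set[int] = set()
    for cur_idx, cur_box in enumerate(current):
        best_idx, best_iou = None, threshold
        for prev_idx, prev_box in enumerate(previous):
            if prev_idx in used:
                continue
            score = iou(prev_box, cur_box)
            if score >= best_iou:
                best_idx, best_iou = prev_idx, score
        if best_idx is not None:
            matches[cur_idx] = best_idx
            used.add(best_idx)
    return matches
```

Feed it the boxes from two consecutive frames and you get a rough answer to “which detection is the same object as before?”. Boxes that match nothing are simply left out of the result.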
AI / ML
Everyone wants to build AI, but few are actually creating AI projects. Here are tutorials that will help you learn how to build AI projects instead of just API wrappers:
Retrieval Augmented Generation (RAG):
Local Retrieval Augmented Generation (RAG) from Scratch (step by step tutorial) | 5:40:58 h
Large Language Models:
Create a Large Language Model from Scratch with Python | 5:43:40 h
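If RAG is new to you, the core loop is small enough to sketch with the standard library: turn text into vectors, rank documents by similarity to the question, and paste the best ones into the prompt. This is a toy under loud assumptions. Word counts stand in for the learned embeddings a real pipeline would use, the prompt wording is made up, and no language model is called. None of it is taken from the tutorial above.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (an assumption, not a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k documents most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(documents, key=lambda doc: cosine(query_vector, embed(doc)), reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, documents: list[str], top_k: int = 2) -> str:
    """Assemble the text you would hand to a language model."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

Swapping embed for a real embedding model and sending the result of build_prompt to a local LLM is, roughly, the part the long tutorial walks you through.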
Data Extraction + Pipelines
ML and AI models are only as good as the data they’re trained on. Learn how to extract data and create data pipelines with these top tutorials from the CodeWithYu channel on YouTube. Here are some of my favorite tutorials:
Streaming Data:
End to End Realtime Streaming with Unstructured Data | 2:30:31 h
Data Lakehouse:
Building Data Lakehouse from Scratch | 55:05 min
Algorithmic Trading:
Realtime Algorithmic Trading with Apache Flink | 1:41:34 h
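Streaming tutorials like these lean heavily on windowing, so here is a plain-Python sketch of one common flavor, a tumbling window that averages prices per symbol over fixed time slices. It is not Flink’s API and not code from the videos. Tick, tumbling_average, and the 60-second default are my own assumptions, and the sketch assumes ticks arrive in timestamp order.

```python
from collections import defaultdict
from typing import Iterable, Iterator, NamedTuple


class Tick(NamedTuple):
    """One hypothetical price update."""
    timestamp: float  # seconds since some start point
    symbol: str
    price: float


def tumbling_average(
    ticks: Iterable[Tick], window_seconds: float = 60.0
) -> Iterator[tuple[float, str, float]]:
    """Yield (window_start, symbol, average_price) each time a window closes."""
    current_window = None
    sums: dict[str, float] = defaultdict(float)
    counts: dict[str, int] = defaultdict(int)

    def flush(window_start: float) -> Iterator[tuple[float, str, float]]:
        # Emit one averaged row per symbol seen in the window, in name order.
        for symbol in sorted(sums):
            yield window_start, symbol, sums[symbol] / counts[symbol]

    for tick in ticks:
        window_start = tick.timestamp // window_seconds * window_seconds
        if current_window is not None and window_start != current_window:
            yield from flush(current_window)
            sums.clear()
            counts.clear()
        current_window = window_start
        sums[tick.symbol] += tick.price
        counts[tick.symbol] += 1

    # The stream ended: emit whatever is left in the last open window.
    if current_window is not None:
        yield from flush(current_window)
```

Because it is a generator, it works on an endless stream just as well as on a list. A real engine also has to deal with late and out-of-order events, which this sketch ignores.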
Cool Libraries and Frameworks
Babit Multimedia Framework (BMF): BMF is an open source framework built by ByteDance to simplify complex video and audio processing tasks. BMF enables efficient creation of applications for video transcoding, real-time filtering, live streaming, and more. It offers flexibility for custom module integration and supports GPU acceleration for high-performance processing.
YOLOv9: YOLO is a real-time object detection system widely used in computer vision applications. It's renowned for its speed and accuracy, capable of detecting multiple objects in images or video streams in a single forward pass of its neural network. Its ease of use, coupled with pre-trained models for common object classes, makes it accessible to developers with varying levels of machine learning expertise, enabling rapid integration of object detection capabilities into diverse projects.
Video-LLaMA: This is an instruction-tuned audio-visual language model for video understanding. It enables LLMs to understand both the visual and the auditory content of a video.