Developing a deep learning framework

Status

Last updated

Jan 21, 2026

Topics

Early stages of writing up my project to create a neural network in C++. Over time I want to improve the training latency of the net via software optimisations. It also serves as a foundation for exploring CUDA to make incredibly fast training loops - or at least fast compared to the CPU?

GitHub link: @codegen-cnn

I'll turn this into a series. Currently the training algorithm is taking a while. I think I need to do vectorised updates for the matrix processing with SIMD? Right now I’m doing the naive “update each cell in the matrix on its own” approach :/
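To make the idea concrete, here is a rough sketch of the two styles of weight update. The `Matrix` type, the function names, and the plain SGD rule `w -= lr * grad` are all assumptions for illustration and are not taken from the repo. The second version does not use SIMD intrinsics directly; it just walks the contiguous storage in one flat loop, which is the shape of loop a compiler can usually auto-vectorise.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical row-major matrix, for illustration only.
struct Matrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<float> data;  // contiguous storage, rows * cols elements

    Matrix(std::size_t r, std::size_t c, float fill = 0.0f)
        : rows(r), cols(c), data(r * c, fill) {}

    float& at(std::size_t r, std::size_t c) { return data[r * cols + c]; }
    float at(std::size_t r, std::size_t c) const { return data[r * cols + c]; }
};

// Naive version: visit each cell through (row, col) indexing.
void sgd_update_naive(Matrix& w, const Matrix& grad, float lr) {
    assert(w.rows == grad.rows && w.cols == grad.cols);
    for (std::size_t r = 0; r < w.rows; ++r) {
        for (std::size_t c = 0; c < w.cols; ++c) {
            w.at(r, c) -= lr * grad.at(r, c);
        }
    }
}

// Flat version: one loop over the contiguous buffers with no index
// arithmetic per cell, which is friendly to compiler auto-vectorisation.
void sgd_update_flat(Matrix& w, const Matrix& grad, float lr) {
    assert(w.data.size() == grad.data.size());
    float* wp = w.data.data();
    const float* gp = grad.data.data();
    const std::size_t n = w.data.size();
    for (std::size_t i = 0; i < n; ++i) {
        wp[i] -= lr * gp[i];
    }
}
```

Whether the compiler actually emits SIMD instructions depends on the compiler and flags (for example `-O3` with GCC or Clang), and it may well vectorise the naive loop too once `at()` is inlined, so it is worth benchmarking and checking the generated assembly before reaching for hand-written intrinsics.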

Other notes about AI and/or Deep Learning and/or C++

🌲
Part 1. Thread Pools
Thread pooling with C++20 primitives
🌲
Part 2. Work Stealing Thread Pools
Work Stealing Thread Pools in C++20
🌱
MPMC Queue
MPMC Queue
🌱
C++ low-latency design patterns
A brief description of what this note covers
🌱
Atomics
Atomics
🌿
SPSC Queue
SPSC Thread-Safe Queue
🌿
Implementing STL's std::shared_ptr
Implementing STL's std::shared_ptr
🌿
Implementing STL's std::unique_ptr
Implementing STL's std::unique_ptr
🌿
Implementing STL's std::vector
A brief description of what this note covers
🌿
Type Erasure in C++
Type Erasure in C++