How GPT Works: Implementing a Minimal GPT (2023) with 200 Lines of Python Code
Blog: http://arthurchiao.art/blog/gpt-as-a-finite-state-markov-chain-en/
This article was compiled and translated by arthurchiao from Andrej Karpathy's tweets and an article: GPT as a finite-state Markov chain.
Note that the implementation here is based on PyTorch; it does not rely solely on basic Python packages to implement a GPT. The main purpose is to provide an intuitive understanding of the internal workings of a complex system like GPT, without going down to a very low level.
It's a bit long...