Python
A versatile programming language widely used for machine learning, data science, and LLM development due to its simplicity and rich ecosystem.
PyTorch
An open-source deep learning framework that provides flexibility and speed for building and training neural networks, popular in research and industry.
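As a rough sketch of the workflow, the snippet below defines a tiny network and runs a single training step; the layer sizes and random data are illustrative only.

```python
import torch
import torch.nn as nn

# A toy two-layer network: 4 input features -> 8 hidden units -> 1 output.
model = nn.Sequential(
    nn.Linear(4, 8),
    nn.ReLU(),
    nn.Linear(8, 1),
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

x = torch.randn(16, 4)   # a random batch of 16 examples
y = torch.randn(16, 1)   # random regression targets

pred = model(x)          # forward pass
loss = loss_fn(pred, y)  # compute the loss
loss.backward()          # backpropagate gradients
optimizer.step()         # update the weights
```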
TensorFlow
A powerful open-source platform for machine learning and deep learning, developed by Google, supporting large-scale model training and deployment.
Hugging Face
A company and open-source community known for the Transformers library, providing state-of-the-art NLP models and tools for LLMs.
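A minimal sketch of the Transformers pipeline API; the default model is downloaded on first run, so the exact output shown is illustrative.

```python
from transformers import pipeline

# Load a ready-made sentiment-analysis pipeline (model downloaded on first use).
classifier = pipeline("sentiment-analysis")
print(classifier("Large language models are remarkably capable."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```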
Transformers
A neural network architecture based on self-attention, enabling parallel processing and powering modern LLMs like BERT and GPT.
Datasets
Large collections of text or data used to train, validate, and test LLMs. The quality and diversity of these datasets are crucial for model performance.
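A brief sketch, assuming the Hugging Face `datasets` library and the public IMDB corpus as an example:

```python
from datasets import load_dataset

imdb = load_dataset("imdb")            # downloads and caches the dataset
print(imdb)                            # shows the available splits and their sizes
print(imdb["train"][0]["text"][:80])   # peek at the first training example
```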
Jupyter Notebooks/Google Colab
Interactive environments for writing, running, and sharing code, widely used for prototyping, experimentation, and education in ML and LLMs.
Fundamentals of Machine Learning
Core concepts such as supervised/unsupervised learning, loss functions, optimization, and generalization that underpin LLMs.
Neural Networks
Computational models inspired by the human brain, consisting of layers of interconnected nodes (neurons) for learning complex patterns.
Recurrent Neural Networks (RNNs)
A type of neural network designed for sequential data, in which the hidden state from each step is fed back into the next step so the model can carry context across a sequence.
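The loop below sketches that recurrence with a single PyTorch RNN cell; the dimensions are toy values.

```python
import torch
import torch.nn as nn

rnn_cell = nn.RNNCell(input_size=10, hidden_size=20)

sequence = torch.randn(5, 1, 10)   # 5 time steps, batch of 1, 10 features each
h = torch.zeros(1, 20)             # initial hidden state

for x_t in sequence:               # iterate over time steps
    h = rnn_cell(x_t, h)           # h_t depends on both x_t and h_{t-1}

print(h.shape)                     # torch.Size([1, 20]) -- summary of the whole sequence
```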
Long Short-Term Memory (LSTM)
A special kind of RNN capable of learning long-term dependencies, widely used in NLP before the rise of Transformers.
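A minimal sketch using PyTorch's built-in LSTM layer, which tracks both a hidden state and a cell state; the dimensions are illustrative.

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=10, hidden_size=20, batch_first=True)

x = torch.randn(1, 5, 10)        # batch of 1, sequence length 5, 10 features
output, (h_n, c_n) = lstm(x)     # per-step outputs, final hidden state, final cell state

print(output.shape)              # torch.Size([1, 5, 20])
print(h_n.shape, c_n.shape)      # torch.Size([1, 1, 20]) each
```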
NLP
Natural Language Processing: the field of AI focused on enabling computers to understand, interpret, and generate human language.
Tokenization
The process of splitting text into smaller units (tokens), such as words or subwords, for input into language models.
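A short sketch, assuming a pretrained subword tokenizer from Hugging Face (downloaded on first use); the exact split depends on the chosen model's vocabulary.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

text = "Tokenization splits text into subwords."
print(tokenizer.tokenize(text))       # e.g. ['token', '##ization', 'splits', ...]
print(tokenizer(text)["input_ids"])   # the integer IDs actually fed to the model
```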
Word Embeddings
Numerical representations of words in a continuous vector space, capturing semantic relationships and used as input to neural networks.
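A minimal sketch: an embedding layer maps integer token IDs to dense vectors. The vocabulary size and embedding dimension here are toy values.

```python
import torch
import torch.nn as nn

embedding = nn.Embedding(num_embeddings=1000, embedding_dim=16)

token_ids = torch.tensor([4, 21, 99])   # three IDs from a hypothetical vocabulary
vectors = embedding(token_ids)

print(vectors.shape)   # torch.Size([3, 16]) -- one 16-dimensional vector per token
```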
Language Modeling
The task of predicting the next word or sequence of words in a sentence, fundamental to training LLMs.
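The snippet below sketches the next-token objective with random placeholder logits; in practice the logits come from running the model on the shifted inputs.

```python
import torch
import torch.nn.functional as F

vocab_size = 50
tokens = torch.randint(0, vocab_size, (1, 7))   # a toy 7-token sequence

inputs  = tokens[:, :-1]    # what the model sees at each position
targets = tokens[:, 1:]     # the next token it should predict

logits = torch.randn(1, 6, vocab_size)          # stand-in for model(inputs)
loss = F.cross_entropy(logits.reshape(-1, vocab_size), targets.reshape(-1))
print(loss.item())          # average negative log-likelihood of the next tokens
```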
Transformer
A deep learning model architecture based on self-attention, enabling efficient parallelization and long-range context understanding.
Self-Attention Mechanism
A technique allowing models to weigh the importance of different words in a sequence when encoding meaning, core to Transformers.
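A compact sketch of scaled dot-product self-attention for a single head; the dimensions and projection matrices are random toy values.

```python
import torch
import torch.nn.functional as F

seq_len, d_model = 4, 8
x = torch.randn(seq_len, d_model)       # token representations

W_q, W_k, W_v = (torch.randn(d_model, d_model) for _ in range(3))
Q, K, V = x @ W_q, x @ W_k, x @ W_v     # queries, keys, values

scores = Q @ K.T / d_model ** 0.5       # how much each token attends to every other
weights = F.softmax(scores, dim=-1)     # each row sums to 1
output = weights @ V                    # weighted mix of the value vectors

print(weights.shape, output.shape)      # torch.Size([4, 4]) torch.Size([4, 8])
```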
Encoder-Decoder Structure
A neural network design where the encoder processes input data and the decoder generates output, used in translation and summarization.
Positional Encoding
A method for injecting information about the order of tokens into Transformer models, since self-attention is order-agnostic.
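A sketch of the sinusoidal encoding used in the original Transformer paper; the sequence length and model dimension are toy values.

```python
import math
import torch

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> torch.Tensor:
    positions = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)   # (seq_len, 1)
    dims = torch.arange(0, d_model, 2, dtype=torch.float32)               # even dimensions
    freqs = torch.exp(-math.log(10000.0) * dims / d_model)
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(positions * freqs)   # sine on even indices
    pe[:, 1::2] = torch.cos(positions * freqs)   # cosine on odd indices
    return pe

pe = sinusoidal_positional_encoding(seq_len=10, d_model=16)
print(pe.shape)   # torch.Size([10, 16]); added to the token embeddings before attention
```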
LLM Pre-training
The initial phase in which a language model learns general language patterns from large text corpora, typically through self-supervised next-token prediction.
LLM Fine-tuning
The process of adapting a pre-trained LLM to specific tasks or domains using smaller, task-specific datasets.
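A condensed sketch of one fine-tuning step for a classification task, assuming a Hugging Face model and two toy examples; a real fine-tuning loop iterates over a full task-specific dataset.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

texts = ["great movie", "terrible movie"]    # toy task-specific examples
labels = torch.tensor([1, 0])

batch = tokenizer(texts, padding=True, return_tensors="pt")
outputs = model(**batch, labels=labels)      # the model returns a loss when labels are passed
outputs.loss.backward()                      # gradients flow into the pretrained weights
optimizer.step()                             # one fine-tuning update
```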
Prompt Engineering
The art of designing effective prompts to guide LLMs in generating desired outputs for various applications.
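As a small illustration, the few-shot prompt below steers a model toward a fixed output format; the wording and examples are just one possible design.

```python
prompt = """Classify the sentiment of each review as Positive or Negative.

Review: "The plot was gripping from start to finish."
Sentiment: Positive

Review: "I walked out halfway through."
Sentiment: Negative

Review: "The acting felt wooden and the pacing dragged."
Sentiment:"""

# The string above would be sent to an LLM (via an API or a local model) for completion.
print(prompt)
```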
Reinforcement Learning from Human Feedback (RLHF)
A training approach where LLMs are further improved using feedback from humans, often to align outputs with user intent and safety.
Model Evaluation
The process of assessing LLM performance using metrics, benchmarks, and real-world tasks to ensure quality and reliability.
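One common intrinsic metric is perplexity, the exponential of the average cross-entropy on held-out text; the loss value below is illustrative.

```python
import math

avg_cross_entropy = 3.2                    # average per-token negative log-likelihood (nats) on held-out text
perplexity = math.exp(avg_cross_entropy)   # ~24.5
print(f"perplexity = {perplexity:.1f}")    # roughly: the model is as uncertain as a ~25-way choice per token
```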