🧠 Resources for Generative AI & LLM

This post shares all the useful Generative AI & LLM resources that I came across or created. Some are held in my GitHub, such as the Intro to AI repository. All tutorials introduced should be beginner-friendly.
✍️ Prompting
In the GenAI context, a prompt is the input or instruction given to an AI model to generate a specific output, such as text, images, or code. Prompts act as a guide, shaping the AI’s response and influencing its style, content, and level of detail. Effective prompting is crucial for harnessing the full potential of an AI model.
To learn the basics of prompting, I recommend this general introduction to prompting by Google Cloud (link). If you need prompt examples when creating your own, prompt.chat and Useful ChatGPT Prompts maintain large directories of effective prompts for GenAI tools.
Lastly, I am listing a few high-quality prompting guides that I found useful.
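As a simple illustration of the structure most of those guides recommend, here is a hypothetical prompt (the wording is my own, not taken from any of the guides) that states a role, a task, constraints, and an output format:

```text
Role: You are an experienced technical editor.
Task: Summarize the attached article for a non-expert audience.
Constraints: Use at most 150 words and avoid jargon.
Output format: One paragraph, followed by three bullet-point takeaways.
```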
🖥️ Platform for LLM Systems
Ollama
Ollama is a free, open-source tool that allows users to run large language models (LLMs) locally on their computers. Details can be found on their GitHub Page.
- ollama-python, the Python library for Ollama.
- Or use a command-line program, e.g., Command Prompt on Windows or Terminal on macOS.
- Please be aware of Ollama's security issues: read the article vulnerabilities in ollama.
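As a minimal sketch of what the ollama-python route looks like (the model name `llama3.2` and the prompts are illustrative, and the actual call is commented out because it requires a running Ollama server):

```python
# Sketch of chatting with a locally running model via the ollama-python
# library (pip install ollama). Assumes the Ollama server is up and the
# chosen model has already been pulled with `ollama pull`.
def build_chat_messages(system_prompt: str, user_prompt: str) -> list[dict]:
    """Assemble the messages list in the format ollama.chat() expects."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

messages = build_chat_messages(
    "You are a concise assistant.",
    "Explain what a context window is in one sentence.",
)

# With the Ollama server running locally:
# import ollama
# response = ollama.chat(model="llama3.2", messages=messages)
# print(response["message"]["content"])
```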
Ollama Tutorial 1: get started with Ollama: The tutorial explains how to install Ollama, download and run open-source AI models (e.g., DeepSeek-R1) via the command line, and manage them with key CLI and in-model commands. It also covers using Ollama in Python, customizing models with `Modelfile`s, and optional GUI tools like Chatbox and LMStudio.
Ollama Tutorial 2: build your custom LLM with Ollama: The tutorial shows how to create a custom Python Instructor AI using Ollama by writing a `Modelfile` based on an open-source model like `codellama:7b`, adjusting parameters, and adding a system prompt to define its teaching style. It also explains deploying the model in the Chatbox AI interface for easier interaction, with guidance on security when connecting to remote Ollama services.
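As a rough sketch of what such a customization might look like (the parameter value and system prompt below are illustrative assumptions, not taken from the tutorial):

```
FROM codellama:7b
PARAMETER temperature 0.3
SYSTEM """
You are a patient Python instructor. Explain concepts step by step
and include a short runnable example with every answer.
"""
```

A file like this can then be built with `ollama create py-instructor -f Modelfile` and started with `ollama run py-instructor` (the model name `py-instructor` is just an example).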
🏗️ Context Engineering
With the rising popularity of Agentic AI and multi-agent GenAI systems, managing and optimizing the context window for AI agents has evolved into an art. A leading strategy for context engineering introduces a four-step approach: write, select, compress, and isolate.
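To make the "compress" step concrete, here is a minimal, self-contained sketch of one common tactic: keeping only the most recent messages that fit a token budget. The whitespace-based token count is a rough stand-in for a real tokenizer, and the function name is my own.

```python
# A minimal sketch of the "compress" step of context engineering: retain the
# newest messages that fit a token budget, dropping older ones. Token
# counting here is a crude whitespace heuristic, not a real tokenizer.
def compress_history(messages: list[dict], max_tokens: int) -> list[dict]:
    """Return the most recent messages whose combined size fits the budget."""
    kept, used = [], 0
    for msg in reversed(messages):          # walk from newest to oldest
        cost = len(msg["content"].split())  # rough token count
        if used + cost > max_tokens:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))             # restore chronological order

history = [
    {"role": "user", "content": "first question about setup"},
    {"role": "assistant", "content": "a long detailed answer " * 20},
    {"role": "user", "content": "follow up question"},
]
print(len(compress_history(history, 10)))  # prints 1: only the newest fits
```

Real systems refine this in many ways (summarizing dropped turns instead of discarding them, pinning the system prompt), but the budget-trimming loop above is the core idea.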
I also highly recommend reading the manual for Anthropic Claude Code to understand how this cutting-edge agentic AI tool structures and optimizes its use of the context window.
🔊 Speech & Text
Open-source TTS
I’m particularly interested in text-to-speech (TTS) models and excited about the future of running compact, locally hosted versions. For instance, the top-ranked TTS model on the Hugging Face TTS Arena Leaderboard as of January 15, 2025, Kokoro-TTS, is remarkably small at only 82 million parameters, trained on fewer than 100 hours of audio. Despite its size, it delivers impressive performance, surpassing much larger models such as MetaVoice-1B (a 1.2-billion-parameter model trained on 100,000 hours of speech) and Edge TTS (a proprietary Microsoft model).
If you’re interested, I’ve written a detailed article about it: 2025’s Best Text-To-Speech (TTS) Model: Kokoro.
Open-source STT
In the fast-evolving fields of simultaneous speech-to-speech translation (S2ST) and simultaneous speech translation (SST), the forefront is occupied by streaming, end-to-end models that better balance latency, translation quality, and naturalness. Notable recent innovations include:
- StreamSpeech, which unifies offline and simultaneous transcription, translation, and synthesis in a single multitask model and achieves state-of-the-art performance on the CVSS benchmark (aclanthology.org, huggingface.co).
- Hibiki, a decoder-only, chunk-based model capable of simultaneous voice-preserving translation with human-like naturalness and compatibility with on-device deployment (openreview.net).
- SimulS2S-LLM, which equips speech LLMs for streaming translation via boundary-aware speech prompts and achieves better quality-latency trade-offs (arxiv.org).
Complementing these academic advances, the open-source project WhisperLiveKit (latest release 0.2.8 as of September 2, 2025) provides a practical, real-time, locally runnable toolkit with speech-to-text, translation, and speaker diarization capabilities. Recent improvements include a new “simulstreaming” backend, voice activity control enabled by default, accurate timestamping, efficient backend model reuse, and better speaker turn detection through silence handling (github.com).
📚 Additional References
📰 Media Resources
Did you find this page helpful? Consider sharing it 😊