🧠 Resources for Generative AI & LLM
This post shares all the useful Generative AI & LLM resources that I came across or created. Some are held in my GitHub, such as the Intro to AI repository. All tutorials introduced should be beginner-friendly.
✍️ Prompting
In the GenAI context, a prompt is the input or instruction given to an AI model to generate a specific output, such as text, images, or code. Prompts act as a guide, shaping the AI’s response and influencing its style, content, and level of detail. Effective prompting is crucial for harnessing the full potential of an AI model.
To learn the basics of prompting, I recommend this general introduction to prompting by Google Cloud (link). If you need prompt examples when creating your own, prompt.chat and Useful ChatGPT Prompts maintain large directories of effective prompts for GenAI tools.
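A common pattern those guides teach is to state the role, task, constraints, and expected output format explicitly rather than writing a single vague sentence. As a minimal sketch (the `build_prompt` helper and its field names are my own illustration, not from any specific guide):

```python
def build_prompt(role, task, constraints, output_format):
    # Spell out each component of the prompt on its own line so the
    # model receives an unambiguous, structured instruction.
    return "\n".join([
        f"You are {role}.",
        f"Task: {task}",
        f"Constraints: {constraints}",
        f"Output format: {output_format}",
    ])

prompt = build_prompt(
    role="a technical editor",
    task="Summarize the article below in three bullet points.",
    constraints="Use plain language; no jargon.",
    output_format="A markdown bullet list.",
)
print(prompt)
```

The same four components work whether the prompt is typed into a chat window or assembled programmatically for an API call.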
Lastly, I am listing a few high-quality prompting guides that I found useful.
🖥️ Platforms for LLM Systems
Ollama
Ollama is a free, open-source tool that allows users to run large language models (LLMs) locally on their computers. Details can be found on their GitHub Page.
- ollama-python, the Python library for Ollama.
- Or use a command-line program, e.g., Command Prompt on Windows or Terminal on macOS.
- Be aware of Ollama's security issues: read the article on vulnerabilities in Ollama.
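With the ollama-python library, a chat request is just a model name plus a list of role-tagged messages. The sketch below builds that request payload (the `build_chat_request` helper is my own wrapper for illustration); the actual `ollama.chat` call is commented out because it requires a local Ollama server to be running:

```python
def build_chat_request(model, prompt, system=None):
    # Assemble the keyword arguments that ollama.chat() expects:
    # a model name and a list of {"role", "content"} messages.
    messages = []
    if system:
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": prompt})
    return {"model": model, "messages": messages}

request = build_chat_request("deepseek-r1", "Explain what a context window is.")

# With Ollama installed and the model pulled, you would send it like this:
# import ollama
# response = ollama.chat(**request)
# print(response["message"]["content"])
```

Keeping payload construction separate from the network call makes it easy to swap models or add a system prompt without touching the rest of your code.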
- Ollama Tutorial 1: get started with Ollama: The tutorial explains how to install Ollama, download and run open-source AI models (e.g., DeepSeek-R1) via the command line, and manage them with key CLI and in-model commands. It also covers using Ollama in Python, customizing models with `Modelfile`s, and optional GUI tools like Chatbox and LM Studio.
- Ollama Tutorial 2: build your custom LLM with Ollama: The tutorial shows how to create a custom Python Instructor AI using Ollama by writing a `Modelfile` based on an open-source model like `codellama:7b`, adjusting parameters, and adding a system prompt to define its teaching style. It also explains deploying the model in the Chatbox AI interface for easier interaction, with guidance on security when connecting to remote Ollama services.
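A `Modelfile` is plain text: `FROM` picks the base model, `PARAMETER` tunes inference settings, and `SYSTEM` bakes in a persistent system prompt. The sketch below generates one for the Python-instructor use case described in Tutorial 2 (the `make_modelfile` helper and the model name `python-instructor` are my own illustrations):

```python
def make_modelfile(base_model, system_prompt, temperature=0.7):
    # Modelfile syntax: FROM sets the base model, PARAMETER tunes
    # inference, SYSTEM defines the persistent system prompt.
    return "\n".join([
        f"FROM {base_model}",
        f"PARAMETER temperature {temperature}",
        f'SYSTEM """{system_prompt}"""',
    ])

modelfile = make_modelfile(
    "codellama:7b",
    "You are a patient Python instructor. Explain concepts step by step.",
    temperature=0.3,
)
print(modelfile)

# Save the output to a file named Modelfile, then build and run the model:
#   ollama create python-instructor -f Modelfile
#   ollama run python-instructor
```

A lower temperature like 0.3 is a reasonable choice for an instructor persona, since teaching benefits from consistent, focused answers rather than creative variation.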
Agentic AI & Skills
AI Agents (open- and closed-source)
OpenClaw
OpenClaw is an open-source personal AI assistant that runs on your own devices and integrates directly into messaging platforms like WhatsApp, Telegram, Slack, Discord, Teams, and more, so you can interact with it through the channels you already use. It operates locally and continuously, with voice support on macOS, iOS, and Android, plus a live Canvas interface you can control; the Gateway serves only as the control plane behind the assistant. Setup is easiest through the terminal onboarding wizard (`openclaw onboard`), which walks you through configuring the gateway, channels, and skills. It also supports subscriptions like Anthropic Claude and OpenAI models for advanced capabilities.
- Use case 1 - Automated Quant Finance alpha model
- Use case 2 - PolyMarket trader agent
n8n
n8n is a source-available (fair-code licensed) workflow automation platform that enables the development of agentic AI systems. It supports the construction of AI agents and retrieval-augmented generation (RAG) pipelines through extensive, highly flexible integrations with a wide range of software tools.
- Self-hosted AI Starter Kit.
- Use case 1 - Chat with a database using AI.
- Use case 2 - Automated competitor pricing monitor with Bright Data MCP & OpenAI.
- Use case 3 - Generate funny AI videos with Sora 2 and auto-publish to TikTok.
Agent Skills
Agent Skills provide an open, discoverable standard for extending AI agents with new capabilities. A skill packages instructions, scripts, and supporting resources into a structured format that agents can load to perform specific tasks.
At the core of each skill is a folder anchored by a SKILL.md file. This file defines metadata and step-by-step procedural guidance that instructs the agent how to execute a particular workflow. By standardizing how tasks are described and performed, skills enable reusable, consistent, and reliable task execution.
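Concretely, a `SKILL.md` file opens with YAML frontmatter carrying the skill's metadata, followed by the free-form procedural guidance the agent reads when it loads the skill. A minimal sketch (the `make_skill_md` helper and the `csv-report` example skill are my own illustrations; `name` and `description` are the metadata fields I'm assuming here):

```python
def make_skill_md(name, description, instructions):
    # YAML frontmatter (name, description) followed by free-form,
    # step-by-step guidance the agent follows when the skill is loaded.
    frontmatter = "\n".join([
        "---",
        f"name: {name}",
        f"description: {description}",
        "---",
    ])
    return frontmatter + "\n\n" + instructions

# Hypothetical example skill:
skill_md = make_skill_md(
    name="csv-report",
    description="Summarize a CSV file into a short markdown report.",
    instructions="1. Load the CSV.\n2. Compute column summaries.\n3. Write report.md.",
)
print(skill_md)
```

Because the whole skill is plain text in a folder, it can be version-controlled, reviewed, and shared like any other code artifact.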
Skills give agents on-demand access to domain expertise and contextual knowledge, helping bridge capability gaps and support repeatable, auditable workflows across compatible platforms. Because the format is open and portable, skill creators can build capabilities once and deploy them across multiple agents, while preserving organizational knowledge in version-controlled, shareable packages. I created a 2-section tutorial in my INFS348 Advanced Analytics and AI class:
🗂️ Context Management
With the rapid rise of agentic AI and multi-agent generative AI systems, managing and optimizing the context window has become less of a technical detail and more of a craft. Effective context management is now a key differentiator in building capable AI agents. Inspired by Anthropic’s flagship Agentic AI product, Claude Code, I created a 4-section tutorial in my INFS348 Advanced Analytics and AI class:
Moreover, several industry leaders have begun formalizing best practices in this space: a leading strategy for context engineering introduces a four-step approach: write, select, compress, and isolate.
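To make the "select" and "compress" steps concrete, here is a minimal sketch of trimming a chat history to a token budget: keep the most recent messages that fit, and collapse everything older into a one-line stub (the `fit_context` helper and its crude characters-divided-by-four token estimate are my own illustration; a real system would summarize the dropped messages with an LLM call):

```python
def fit_context(messages, budget, tokens=lambda m: len(m["content"]) // 4):
    # 'Select': walk backwards, keeping the most recent messages
    # whose estimated token cost fits within the budget.
    kept, used = [], 0
    for msg in reversed(messages):
        cost = tokens(msg)
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    kept.reverse()
    dropped = len(messages) - len(kept)
    # 'Compress': replace everything dropped with a one-line stub.
    # (A real system would substitute an LLM-generated summary.)
    if dropped:
        kept.insert(0, {"role": "system",
                        "content": f"[{dropped} earlier messages summarized elsewhere]"})
    return kept

history = [{"role": "user", "content": "x" * 400} for _ in range(5)]
recent = fit_context(history, budget=250)
```

The "write" and "isolate" steps would sit around this: persisting dropped context to external memory, and keeping sub-agent contexts separate so one task's history doesn't crowd out another's.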
I also highly recommend reading through the Claude Code documentation by Anthropic, which provides a practical look at how a state-of-the-art agentic system structures, manages, and optimizes its context.
🗣️ Speech & Text
Open-source TTS
I'm particularly interested in text-to-speech (TTS) models and excited about the future of running compact, locally hosted versions. For instance, the top-ranked TTS model on the Hugging Face TTS Arena Leaderboard as of January 15, 2025, Kokoro-TTS, is remarkably small at only 82 million parameters, trained on fewer than 100 hours of audio. Despite its size, it delivers impressive performance, surpassing much larger models such as MetaVoice-1B (a 1.2-billion-parameter model trained on 100,000 hours of speech) and Edge TTS (a proprietary Microsoft model).
If you're interested, I've written a detailed article about it: 2025's Best Text-To-Speech (TTS) Model: Kokoro.
Open-source STT
In the fast-evolving field of simultaneous speech-to-speech translation (Simul-S2ST) and simultaneous speech translation (SimulST), the forefront is occupied by streaming, end-to-end models that better balance latency, translation quality, and naturalness. Notable recent innovations include:
- StreamSpeech, which unifies offline and simultaneous transcription, translation, and synthesis in a single multitask model and achieves state-of-the-art performance on the CVSS benchmark (aclanthology.org, huggingface.co);
- Hibiki, a decoder-only, chunk-based model capable of simultaneous voice-preserving translation with human-like naturalness and compatibility with on-device deployment (openreview.net);
- SimulS2S-LLM, which equips speech LLMs for streaming translation via boundary-aware speech prompts and achieves better quality-latency trade-offs (arxiv.org).

Complementing these academic advances, the open-source project WhisperLiveKit (latest release 0.2.8 as of September 2, 2025) provides a practical, real-time, locally runnable toolkit with speech-to-text, translation, and speaker diarization capabilities. Recent improvements include a new "simulstreaming" backend, voice activity control enabled by default, accurate timestamping, efficient backend model reuse, and better speaker-turn detection through silence handling (github.com).
📚 Additional References
OpenAI Cookbook: an official OpenAI website resource providing practical examples, code, guides, and prompt engineering tips for developers to use the OpenAI API and models effectively.
📰 Media Resources
Did you find this page helpful? Consider sharing it 🙂