Night Field's Blog

Make it work, make it right, make it fast.

LLM Billing: What Are You Actually Paying For?

Without any optimization, input tokens grow linearly with each turn, and the total cost ends up roughly proportional to n squared. In practice it is not that bad. API providers have caching, and t...
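The n-squared claim can be checked with a few lines of arithmetic. This is a minimal sketch under simplifying assumptions that are not from the post: every turn adds the same number of tokens, each request resends the full history, there is no caching, and output tokens are ignored. The function name and parameters are hypothetical.

```python
def cumulative_input_tokens(turns: int, tokens_per_turn: int) -> int:
    """Total input tokens billed across a conversation with no caching.

    Assumes (for illustration only) that each turn adds a fixed number of
    tokens and that every request resends the entire history so far.
    """
    total = 0
    history = 0
    for _ in range(turns):
        history += tokens_per_turn  # history grows linearly with each turn
        total += history            # each request is billed for the whole history
    return total


if __name__ == "__main__":
    # Closed form: tokens_per_turn * n * (n + 1) / 2, which grows like n squared.
    for n in (10, 20, 40):
        print(n, cumulative_input_tokens(n, tokens_per_turn=100))
```

Doubling the number of turns roughly quadruples the total, which is the quadratic growth described above.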

LLM Serving: GPU Memory, KV Cache, and Large-Scale Inference

This article covers GPU memory usage, KV Cache, and serving engineering for large language models. No prior knowledge assumed. Model Weights vs KV Cache: A Chef and Their Sticky Notes Analogy...

AI Agent Framework Deep Dive: From Feature Comparison to Layered Architecture

Last updated: 2026-06-20. 1. Overview (table columns: Nature, Vendor Lock-in, Typical Use Case). DeepAgents: Lightweight library (based on LangGraph) ...

The Divide and Convergence of Agents: Coding Agent vs Personal Agent — A Panoramic Comparison

Preface: Many people think of OpenClaw and Hermes as “Claude Code + Telegram” or “OpenCode + Slack.” That’s one way to look at it, but not quite accurate. A better way to think about it: Claud...

LLM Agent Context & Memory Management: Problems, Solutions, and Practice

Problem 1: Context Length Explosion. Description: When a user stays in the same session for a long time, the conversation history keeps accumulating and eventually exceeds the LLM’s conte...

AG-UI (Agent–User Interaction Protocol) Notes

Why AG-UI? The first time someone comes across AG-UI, they usually ask: We already have SSE / WebSocket. Why do we need AG-UI? Here’s the disti...

LangChain vs LangGraph vs DeepAgents vs OpenCode: A Framework Comparison and DeepAgents Architecture Deep Dive

I recently went through the major Agent frameworks on the market. LangChain, LangGraph, DeepAgents, and OpenCode each have their own role, but put them together and it’s easy to get confused. This ...

Transformer Explained: A Plain Introduction

This article explains Transformer fundamentals for readers with no AI background. Some simplifications are made for clarity. Why Transformer Matters: In 2017, Google published “Attention Is Al...

What Is Spec-Driven Development?

Everyone uses the word “Spec.” OpenAPI is a spec. Protobuf is a spec. Kubernetes even has a dedicated spec field. But what is a spec, really? In One Sentence: A Spec (Specification) is a model tha...

GraphRAG & Agentic RAG: From Classic RAG to Intelligent Retrieval

Following up on RAG Learning Notes, this post covers two advanced RAG directions: GraphRAG and Agentic RAG. In short: GraphRAG moves from “text retrieval” to “knowledge structure retrieval” ...