LangChain Skills Pattern in Practice: Building an On-Demand Knowledge Loading SQL Assistant
In the previous article, we explored how to simulate the Deep Agent using the Skills pattern through the Deep Agents CLI. Now, LangChain natively supports this feature, greatly simplifying the development process. This article will guide you through an in-depth experience of this feature, building a more intelligent SQL assistant.
When building complex AI Agents, developers often face a dilemma: inject all context (database table structures, API documentation, business rules) into the System Prompt at once, overflowing the context window and diluting the model's attention, or resort to costly and frequent fine-tuning?
Skills Pattern provides an elegant middle ground. It achieves efficient context utilization by dynamically loading the required knowledge. LangChain's native support for this pattern means we can more easily build Agents with "on-demand learning" capabilities.
Following the official guide Build a SQL assistant with on-demand skills, this article walks through building, from scratch, a SQL Assistant that supports on-demand knowledge loading.
1. Core Concepts: Why Choose the Skills Pattern?
Limitations of Traditional SQL Agents
In the traditional SQL Agent architecture, we usually need to provide the complete Database Schema in the System Prompt. As the business develops, when the number of tables expands to hundreds, this approach brings significant problems:
- Huge Token Consumption: every conversation carries a large number of irrelevant table structures, wasting resources.
- Increased Risk of Hallucinations: excess irrelevant information reduces the model's reasoning accuracy.
- Difficult Maintenance: the knowledge of all business lines is tightly coupled, making it hard to iterate independently.
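To make the token cost concrete, here is a rough back-of-envelope sketch. The per-schema and base token counts are hypothetical illustrative numbers, not measurements:

```python
def prompt_tokens(num_tables: int, tokens_per_schema: int = 150, base: int = 500) -> int:
    """Estimate the system-prompt size when every table schema is inlined.

    tokens_per_schema and base are hypothetical illustrative values.
    """
    return base + num_tables * tokens_per_schema

# A 300-table schema dump is carried on every single request,
# while a skills-based agent pays only for the one or two schemas it loads.
print(prompt_tokens(300))  # 45500
print(prompt_tokens(2))    # 800
```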
Skills Pattern: A Solution Based on Progressive Disclosure
The Skills pattern is based on the principle of Progressive Disclosure, which processes the knowledge acquisition process in layers:
- Agent Initial State: the Agent only knows which skills are available and their brief descriptions, keeping the initial context lightweight.
- Runtime Loading: when faced with a concrete problem (such as "query inventory"), the Agent actively calls a tool (`load_skill`) to pull in the skill's detailed context (Schema + Prompt).
- Execute Task: with the precise context loaded, the Agent carries out the task (such as writing and executing SQL).
This pattern effectively supports unlimited expansion and team decoupling, enabling the Agent to adapt to increasingly complex business scenarios.
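The three steps above can be sketched in plain Python. This is an illustrative structure only (names and schema strings are simplified); the full LangChain version appears in the source code later in the article:

```python
# Step 1: the catalog the Agent starts with -- names plus one-line descriptions.
SKILLS = {
    "sales_analytics": {
        "description": "Analyze sales revenue and trends.",
        "content": "Table Schema: sales_data(id, transaction_date, product_id, amount, region)",
    },
    "inventory_management": {
        "description": "Check stock levels and warehouse locations.",
        "content": "Table Schema: inventory_items(id, product_id, product_name, stock_count, warehouse_location)",
    },
}

def skill_catalog() -> str:
    """What goes into the initial system prompt: lightweight descriptions only."""
    return "\n".join(f"- {name}: {s['description']}" for name, s in SKILLS.items())

def load_skill(skill_name: str) -> str:
    """Step 2: detailed context is loaded only when the Agent asks for it."""
    skill = SKILLS.get(skill_name)
    return skill["content"] if skill else f"Unknown skill: {skill_name}"
```

The initial prompt stays small no matter how many skills exist; the heavy `content` strings are paid for only when a question actually touches that domain.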
2. System Architecture Design
This practical project will build a SQL Assistant containing two core Skills to demonstrate the practical application of this pattern:
- Sales Analytics: owns the `sales_data` table, handling revenue statistics, order trend analysis, etc.
- Inventory Management: owns the `inventory_items` table, handling stock level monitoring, location queries, etc.
3. Development Environment Setup
This project uses the Python package manager `uv` for efficient dependency management.
Core Dependency Installation
```bash
uv add langchain langchain-openai langgraph psycopg2-binary python-dotenv langchain-community
```
PostgreSQL Environment Configuration
Start a local Postgres instance and create the `agent_platform` database. We provide a `setup_db.py` script to automatically initialize the table structure and test data (see the source code at the end of the article for details).
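The scripts in this article read their connection settings from a `.env` file. A template matching the environment variable names used in the source code (all values are placeholders to replace with your own) might look like:

```ini
# PostgreSQL connection
DB_HOST=localhost
DB_PORT=5432
DB_USER=postgres
DB_PASSWORD=your_password
DB_NAME=agent_platform

# OpenAI-compatible model endpoint
BASIC_MODEL_BASE_URL=https://your-endpoint/v1
BASIC_MODEL_API_KEY=your_api_key
BASIC_MODEL_MODEL=your-model-name
```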
4. Detailed Explanation of Core Implementation Steps
Step 1: Define Domain Skills (The Knowledge)
We define skills as a dictionary structure, simulating the process of loading from a file system or database. Note the distinction between `description` (used for the Agent's decision-making) and `content` (the detailed context actually loaded).
```python
SKILLS = {
    "sales_analytics": {
        "description": "Useful for analyzing sales revenue, trends...",
        "content": """... Table Schema: sales_data ...""",
    },
    "inventory_management": {
        "description": "Useful for checking stock levels...",
        "content": """... Table Schema: inventory_items ...""",
    },
}
```
Step 2: Implement Core Tools (The Capabilities)
The Agent relies on two key tools to complete tasks:
- `load_skill(skill_name)`: dynamically loads the details of the specified skill at runtime.
- `run_sql_query(query)`: executes the specific SQL statement.
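Since `run_sql_query` executes whatever SQL the model writes, a production deployment may want a read-only guard in front of it. A minimal illustrative check (not part of the original project code; a real system should use a proper SQL parser or a read-only database role) could be:

```python
def is_safe_select(query: str) -> bool:
    """Rough read-only guard: accept a single SELECT statement, nothing else.

    Illustrative only -- string matching is not a substitute for a real
    SQL parser or database-level permissions.
    """
    q = query.strip().rstrip(";").strip().lower()
    # Reject anything that is not a SELECT, or that chains a second statement.
    return q.startswith("select") and ";" not in q

print(is_safe_select("SELECT SUM(amount) FROM sales_data;"))  # True
print(is_safe_select("DROP TABLE sales_data;"))               # False
```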
Step 3: Orchestrate Agent Logic (The Brain)
Use LangGraph to build the ReAct Agent. The System Prompt plays a crucial role here, guiding the Agent to strictly follow the standard operating procedure (SOP) of Identify -> Load -> Query.
```python
system_prompt = """
1. Identify the relevant skill.
2. Use 'load_skill' to get schema.
3. Write and execute SQL using 'run_sql_query'.
...
Do not guess table names. Always load the skill first.
"""
```
5. Verification of Running Results
By running test_agent.py, we tested queries in two different domains: Sales and Inventory. The following is the actual output log from the console, demonstrating how the Agent dynamically loads skills based on the question:
```
Testing Sales Query...
Agent calling tools: [{'name': 'load_skill', 'args': {'skill_name': 'sales_analytics'}, 'id': 'call_f270d76b7ce4404cb5f61bf2', 'type': 'tool_call'}]
Tool output:
You are a Sales Analytics Expert.
You have access to the 'sales_data' table.
Table Schema:
- id: integer
...
Agent calling tools: [{'name': 'run_sql_query', 'args': {'query': 'SELECT SUM(amount) as total_revenue FROM sales_data;'}, 'id': 'call_b4f3e686cc7f4f22b3bb9ea7', 'type': 'tool_call'}]
Tool output: [(Decimal('730.50'),)]
...
Agent response: The total revenue is $730.50.

Testing Inventory Query...
Agent calling tools: [{'name': 'load_skill', 'args': {'skill_name': 'inventory_management'}, 'id': 'call_18c823b2d5064e95a0cfe2e3', 'type': 'tool_call'}]
Tool output:
You are an Inventory Management Expert.
You have access to the 'inventory_items' table.
Table Schema...
Agent calling tools: [{'name': 'run_sql_query', 'args': {'query': "SELECT warehouse_location FROM inventory_items WHERE product_name = 'Laptop';"}, 'id': 'call_647ee3a444804bd98a045f00', 'type': 'tool_call'}]
Tool output: [('Warehouse A',)]
...
Agent response: The Laptop is located in **Warehouse A**.
```
6. Complete Source Code Reference
The following is the complete source code of the project, including the database initialization script and the Agent main program.
1. Database Initialization (`setup_db.py`)
```python
import psycopg2
from psycopg2.extensions import ISOLATION_LEVEL_AUTOCOMMIT
import os
from dotenv import load_dotenv

load_dotenv()

# Please ensure that the database connection information is configured in .env
DB_HOST = os.getenv("DB_HOST", "localhost")
DB_PORT = os.getenv("DB_PORT", "5432")
DB_USER = os.getenv("DB_USER", "postgres")
DB_PASSWORD = os.getenv("DB_PASSWORD", "your_password")  # Please replace with the actual password
DB_NAME = os.getenv("DB_NAME", "agent_platform")

def create_database():
    try:
        # Connect to default 'postgres' database to create new db
        conn = psycopg2.connect(
            host=DB_HOST, port=DB_PORT, user=DB_USER,
            password=DB_PASSWORD, dbname="postgres"
        )
        conn.set_isolation_level(ISOLATION_LEVEL_AUTOCOMMIT)
        cur = conn.cursor()
# Check if database exists
cur.execute(f"SELECT 1 FROM pg_catalog.pg_database WHERE datname = '{DB_NAME}'")
exists = cur.fetchone()
if not exists:
print(f"Creating database {DB_NAME}...")
cur.execute(f"CREATE DATABASE {DB_NAME}")
else:
print(f"Database {DB_NAME} already exists.")
cur.close()
conn.close()
except Exception as e: print(f"Error creating database: {e}")
def create_tables_and_data():
    try:
        conn = psycopg2.connect(
            host=DB_HOST, port=DB_PORT, user=DB_USER,
            password=DB_PASSWORD, dbname=DB_NAME
        )
        cur = conn.cursor()
        # Create Sales Table
        print("Creating sales_data table...")
        cur.execute(
            """
            CREATE TABLE IF NOT EXISTS sales_data (
                id SERIAL PRIMARY KEY,
                transaction_date DATE,
                product_id VARCHAR(50),
                amount DECIMAL(10, 2),
                region VARCHAR(50)
            )
            """
        )
        # Create Inventory Table
        print("Creating inventory_items table...")
        cur.execute(
            """
            CREATE TABLE IF NOT EXISTS inventory_items (
                id SERIAL PRIMARY KEY,
                product_id VARCHAR(50),
                product_name VARCHAR(100),
                stock_count INTEGER,
                warehouse_location VARCHAR(50)
            )
            """
        )
        # Insert Mock Data
        print("Inserting mock data...")
        cur.execute("TRUNCATE sales_data, inventory_items")
        sales_data = [
            ('2023-01-01', 'P001', 100.00, 'North'),
            ('2023-01-02', 'P002', 150.50, 'South'),
            ('2023-01-03', 'P001', 120.00, 'East'),
            ('2023-01-04', 'P003', 200.00, 'West'),
            ('2023-01-05', 'P002', 160.00, 'North')
        ]
        cur.executemany(
            "INSERT INTO sales_data (transaction_date, product_id, amount, region) VALUES (%s, %s, %s, %s)",
            sales_data
        )
        inventory_data = [
            ('P001', 'Laptop', 50, 'Warehouse A'),
            ('P002', 'Mouse', 200, 'Warehouse B'),
            ('P003', 'Keyboard', 150, 'Warehouse A'),
            ('P004', 'Monitor', 30, 'Warehouse C')
        ]
        cur.executemany(
            "INSERT INTO inventory_items (product_id, product_name, stock_count, warehouse_location) VALUES (%s, %s, %s, %s)",
            inventory_data
        )
        conn.commit()
        cur.close()
        conn.close()
        print("Database setup complete.")
    except Exception as e:
        print(f"Error setting up tables: {e}")
if __name__ == "__main__":
    create_database()
    create_tables_and_data()
```
2. Agent Main Program
```python
import os
from typing import Annotated, Literal, TypedDict, Union, Dict
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
from langchain_core.messages import SystemMessage, HumanMessage, AIMessage, ToolMessage
from langchain_community.utilities import SQLDatabase
from langchain_community.agent_toolkits import SQLDatabaseToolkit
from langgraph.graph import StateGraph, START, END, MessagesState
from langgraph.prebuilt import ToolNode, tools_condition
load_dotenv()
# --- Configuration ---
BASE_URL = os.getenv("BASIC_MODEL_BASE_URL")
API_KEY = os.getenv("BASIC_MODEL_API_KEY")
MODEL_NAME = os.getenv("BASIC_MODEL_MODEL")
DB_URI = f"postgresql://{os.getenv('DB_USER')}:{os.getenv('DB_PASSWORD')}@{os.getenv('DB_HOST')}:{os.getenv('DB_PORT')}/{os.getenv('DB_NAME')}"
# --- Database Setup ---
db = SQLDatabase.from_uri(DB_URI)
# --- Skills Definition ---
SKILLS: Dict[str, Dict[str, str]] = {
    "sales_analytics": {
        "description": "Useful for analyzing sales revenue, trends, and regional performance.",
        "content": """
You are a Sales Analytics Expert.
You have access to the 'sales_data' table.
Table Schema:
- id: integer (primary key)
- transaction_date: date
- product_id: varchar(50)
- amount: decimal(10, 2)
- region: varchar(50)
Common queries:
- Total revenue: SUM(amount)
- Revenue by region: GROUP BY region
- Sales trend: GROUP BY transaction_date
""",
    },
    "inventory_management": {
        "description": "Useful for checking stock levels, product locations, and warehouse management.",
        "content": """
You are an Inventory Management Expert.
You have access to the 'inventory_items' table.
Table Schema:
- id: integer (primary key)
- product_id: varchar(50)
- product_name: varchar(100)
- stock_count: integer
- warehouse_location: varchar(50)
Common queries:
- Check stock: WHERE product_name = '...'
- Low stock: WHERE stock_count < threshold
""",
    },
}

# --- Tools ---
@tool
def load_skill(skill_name: str) -> str:
    """
    Load the detailed prompt and schema for a specific skill.
    Available skills:
    - sales_analytics: For sales, revenue, and transaction analysis.
    - inventory_management: For stock, products, and warehouse queries.
    """
    skill = SKILLS.get(skill_name)
    if not skill:
        return f"Error: Skill '{skill_name}' not found. Available skills: {list(SKILLS.keys())}"
    return skill["content"]
@tool
def run_sql_query(query: str) -> str:
"""
Execute a SQL query against the database.
Only use this tool AFTER loading the appropriate skill to understand the schema.
"""
try:
return db.run(query)
except Exception as e:
return f"Error executing SQL: {e}"
@tool
def list_tables() -> str:
"""List all available tables in the database."""
return str(db.get_usable_table_names())
tools = [load_skill, run_sql_query, list_tables]
# --- Agent Setup ---
llm = ChatOpenAI(
    base_url=BASE_URL,
    api_key=API_KEY,
    model=MODEL_NAME,
    temperature=0
)
llm_with_tools = llm.bind_tools(tools)
# --- Graph Definition ---
class AgentState(MessagesState):
    # We can add custom state if needed, but MessagesState is sufficient for simple chat
    pass
def agent_node(state: AgentState):
    messages = state["messages"]
    response = llm_with_tools.invoke(messages)
    return {"messages": [response]}
workflow = StateGraph(AgentState)
workflow.add_node("agent", agent_node)
workflow.add_node("tools", ToolNode(tools))
workflow.add_edge(START, "agent")
workflow.add_conditional_edges("agent", tools_condition)
workflow.add_edge("tools", "agent")
graph = workflow.compile()

# --- Interactive Loop ---
if __name__ == "__main__":
    system_prompt = """You are a SQL assistant.
1. Identify the relevant skill.
2. Use 'load_skill' to get schema.
3. Write and execute SQL using 'run_sql_query'.
Do not guess table names. Always load the skill first."""
    messages = [SystemMessage(content=system_prompt)]
    while True:
        user_input = input("You: ").strip()
        if user_input.lower() in ("exit", "quit"):
            break
        messages.append(HumanMessage(content=user_input))
        try:
            final_state = graph.invoke({"messages": messages})
            last_msg = final_state["messages"][-1]
            if isinstance(last_msg, AIMessage):
                print(last_msg.content)
            # Update history
            messages = final_state["messages"]
            print("-" * 50)
        except Exception as e:
            print(f"\nError: {e}")
            break
```




