langchain
About
LangChain is a framework for building LLM applications using agents, chains, and RAG pipelines. It supports multiple LLM providers and offers key features like tool calling, memory management, and vector store retrieval. Use it for rapid prototyping or deploying production systems like chatbots, autonomous agents, and question-answering tools.
Quick Install
Claude Code
Recommended: /plugin add https://github.com/davila7/claude-code-templates
Alternative: git clone https://github.com/davila7/claude-code-templates.git ~/.claude/skills/langchain
Copy and paste this command in Claude Code to install this skill.
Documentation
LangChain - Build LLM Applications with Agents & RAG
The most popular framework for building LLM-powered applications.
When to use LangChain
Use LangChain when:
- Building agents with tool calling and reasoning (ReAct pattern)
- Implementing RAG (retrieval-augmented generation) pipelines
- Need to swap LLM providers easily (OpenAI, Anthropic, Google)
- Creating chatbots with conversation memory
- Rapid prototyping of LLM applications
- Production deployments with LangSmith observability
Metrics:
- 119,000+ GitHub stars
- 272,000+ repositories use LangChain
- 500+ integrations (models, vector stores, tools)
- 3,800+ contributors
Use alternatives instead:
- LlamaIndex: RAG-focused, better for document Q&A
- LangGraph: Complex stateful workflows, more control
- Haystack: Production search pipelines
- Semantic Kernel: Microsoft ecosystem
Quick start
Installation
# Core library (Python 3.10+)
pip install -U langchain
# With OpenAI
pip install langchain-openai
# With Anthropic
pip install langchain-anthropic
# Common extras
pip install langchain-community # 500+ integrations
pip install langchain-chroma # Vector store
Basic LLM usage
from langchain_anthropic import ChatAnthropic
# Initialize model
llm = ChatAnthropic(model="claude-sonnet-4-5-20250929")
# Simple completion
response = llm.invoke("Explain quantum computing in 2 sentences")
print(response.content)
Create an agent (ReAct pattern)
from langchain.agents import create_agent
from langchain_anthropic import ChatAnthropic
# Define tools
def get_weather(city: str) -> str:
    """Get current weather for a city."""
    return f"It's sunny in {city}, 72°F"

def search_web(query: str) -> str:
    """Search the web for information."""
    return f"Search results for: {query}"
# Create agent (<10 lines!)
agent = create_agent(
    model=ChatAnthropic(model="claude-sonnet-4-5-20250929"),
    tools=[get_weather, search_web],
    system_prompt="You are a helpful assistant. Use tools when needed."
)
# Run agent
result = agent.invoke({"messages": [{"role": "user", "content": "What's the weather in Paris?"}]})
print(result["messages"][-1].content)
Core concepts
1. Models - LLM abstraction
from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic
from langchain_google_genai import ChatGoogleGenerativeAI
# Swap providers easily
llm = ChatOpenAI(model="gpt-4o")
llm = ChatAnthropic(model="claude-sonnet-4-5-20250929")
llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash-exp")
# Streaming
for chunk in llm.stream("Write a poem"):
    print(chunk.content, end="", flush=True)
2. Chains - Sequential operations
from langchain_core.prompts import PromptTemplate
# Define prompt template
prompt = PromptTemplate(
    input_variables=["topic"],
    template="Write a 3-sentence summary about {topic}"
)
# Compose prompt and model with the pipe syntax (LLMChain is deprecated)
chain = prompt | llm
# Run chain
result = chain.invoke({"topic": "machine learning"})
3. Agents - Tool-using reasoning
ReAct (Reasoning + Acting) pattern:
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_core.prompts import ChatPromptTemplate
from langchain.tools import Tool
# Define custom tool (eval is for demo only; never eval untrusted input)
calculator = Tool(
    name="Calculator",
    func=lambda x: str(eval(x)),
    description="Useful for math calculations. Input: a valid Python expression."
)
# The prompt must include an agent_scratchpad placeholder for tool-call history
prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer questions using the available tools."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])
# Create agent with tools
agent = create_tool_calling_agent(llm=llm, tools=[calculator], prompt=prompt)
# Create executor (pass the same tools the agent was created with)
agent_executor = AgentExecutor(agent=agent, tools=[calculator], verbose=True)
# Run with reasoning
result = agent_executor.invoke({"input": "What is 25 * 17 + 142?"})
4. Memory - Conversation history
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain
# Add memory to track conversation
memory = ConversationBufferMemory()
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True
)
# Multi-turn conversation
conversation.predict(input="Hi, I'm Alice")
conversation.predict(input="What's my name?") # Remembers "Alice"
RAG (Retrieval-Augmented Generation)
Basic RAG pipeline
from langchain_community.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma
from langchain.chains import RetrievalQA
# 1. Load documents
loader = WebBaseLoader("https://docs.python.org/3/tutorial/")
docs = loader.load()
# 2. Split into chunks
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200
)
splits = text_splitter.split_documents(docs)
# 3. Create embeddings and vector store
vectorstore = Chroma.from_documents(
    documents=splits,
    embedding=OpenAIEmbeddings()
)
# 4. Create retriever
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
# 5. Create QA chain
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=retriever,
    return_source_documents=True
)
# 6. Query (use .invoke; calling the chain directly is deprecated)
result = qa_chain.invoke({"query": "What are Python decorators?"})
print(result["result"])
print(f"Sources: {result['source_documents']}")
Conversational RAG with memory
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
# RAG with conversation memory
qa = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=retriever,
    memory=ConversationBufferMemory(
        memory_key="chat_history",
        return_messages=True
    )
)
# Multi-turn RAG (use .invoke; calling the chain directly is deprecated)
qa.invoke({"question": "What is Python used for?"})
qa.invoke({"question": "Can you elaborate on web development?"})  # Remembers context
Advanced agent patterns
Structured output
# LangChain 0.3+ uses Pydantic v2 directly (langchain_core.pydantic_v1 is deprecated)
from pydantic import BaseModel, Field
# Define schema
class WeatherReport(BaseModel):
    city: str = Field(description="City name")
    temperature: float = Field(description="Temperature in Fahrenheit")
    condition: str = Field(description="Weather condition")
# Get structured response
structured_llm = llm.with_structured_output(WeatherReport)
result = structured_llm.invoke("What's the weather in SF? It's 65F and sunny")
print(result.city, result.temperature, result.condition)
Parallel tool execution
from langchain.agents import create_agent
# Tool-calling models can request several independent tool calls in one turn,
# and the agent executes them together
agent = create_agent(
    model=llm,
    tools=[get_weather, search_web, calculator]
)
# The model may call get_weather("Paris") and get_weather("London") in a single step
result = agent.invoke({
    "messages": [{"role": "user", "content": "Compare weather in Paris and London"}]
})
Streaming agent execution
# Stream agent steps
for step in agent_executor.stream({"input": "Research AI trends"}):
if "actions" in step:
print(f"Tool: {step['actions'][0].tool}")
if "output" in step:
print(f"Output: {step['output']}")
Common patterns
Multi-document QA
from langchain.chains.qa_with_sources import load_qa_with_sources_chain
from langchain_community.document_loaders import WebBaseLoader
# Load multiple documents (WebBaseLoader takes URLs up front; .load() takes no arguments)
docs = WebBaseLoader(["https://docs.python.org", "https://docs.numpy.org"]).load()
# QA with source citations
chain = load_qa_with_sources_chain(llm, chain_type="stuff")
result = chain.invoke({"input_documents": docs, "question": "How do I use numpy arrays?"})
print(result["output_text"])  # Includes source citations
Custom tools with error handling
from langchain.tools import tool
@tool
def risky_operation(query: str) -> str:
    """Perform a risky operation that might fail."""
    try:
        # Your operation here
        result = perform_operation(query)
        return f"Success: {result}"
    except Exception as e:
        return f"Error: {str(e)}"
# Agent handles errors gracefully
agent = create_agent(model=llm, tools=[risky_operation])
LangSmith observability
import os
# Enable tracing
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "your-api-key"
os.environ["LANGCHAIN_PROJECT"] = "my-project"
# All chains/agents automatically traced
agent = create_agent(model=llm, tools=[calculator])
result = agent.invoke({"messages": [{"role": "user", "content": "Calculate 123 * 456"}]})
# View traces at smith.langchain.com
Vector stores
Chroma (local)
from langchain_chroma import Chroma
vectorstore = Chroma.from_documents(
    documents=docs,
    embedding=OpenAIEmbeddings(),
    persist_directory="./chroma_db"
)
Pinecone (cloud)
from langchain_pinecone import PineconeVectorStore
vectorstore = PineconeVectorStore.from_documents(
    documents=docs,
    embedding=OpenAIEmbeddings(),
    index_name="my-index"
)
FAISS (similarity search)
from langchain_community.vectorstores import FAISS
vectorstore = FAISS.from_documents(docs, OpenAIEmbeddings())
vectorstore.save_local("faiss_index")
# Load later (recent versions require opting in to pickle deserialization)
vectorstore = FAISS.load_local("faiss_index", OpenAIEmbeddings(), allow_dangerous_deserialization=True)
Document loaders
# Web pages
from langchain_community.document_loaders import WebBaseLoader
loader = WebBaseLoader("https://example.com")
# PDFs
from langchain_community.document_loaders import PyPDFLoader
loader = PyPDFLoader("paper.pdf")
# GitHub
from langchain_community.document_loaders import GithubFileLoader
loader = GithubFileLoader(repo="user/repo", access_token="your-token", file_filter=lambda path: path.endswith(".py"))
# CSV
from langchain_community.document_loaders import CSVLoader
loader = CSVLoader("data.csv")
Text splitters
# Recursive (recommended for general text)
from langchain.text_splitter import RecursiveCharacterTextSplitter
splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200,
    separators=["\n\n", "\n", " ", ""]
)
# Code-aware
from langchain.text_splitter import PythonCodeTextSplitter
splitter = PythonCodeTextSplitter(chunk_size=500)
# Semantic (by meaning)
from langchain_experimental.text_splitter import SemanticChunker
splitter = SemanticChunker(OpenAIEmbeddings())
Best practices
- Start simple - Use create_agent() for most cases
- Enable streaming - Better UX for long responses
- Add error handling - Tools can fail, handle gracefully
- Use LangSmith - Essential for debugging agents
- Optimize chunk size - 500-1000 chars for RAG
- Version prompts - Track changes in production
- Cache embeddings - Expensive, cache when possible (see the sketch after this list)
- Monitor costs - Track token usage with LangSmith
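A minimal sketch of embedding caching using LangChain's CacheBackedEmbeddings wrapper; the ./embedding_cache path and namespace value are illustrative choices, not defaults:
from langchain.embeddings import CacheBackedEmbeddings
from langchain.storage import LocalFileStore
from langchain_openai import OpenAIEmbeddings
# Back the embedder with an on-disk byte store so each text is embedded only once
store = LocalFileStore("./embedding_cache")
cached_embeddings = CacheBackedEmbeddings.from_bytes_store(
    OpenAIEmbeddings(),
    store,
    namespace="openai-embeddings"  # keeps caches from different models separate
)
# Pass cached_embeddings anywhere an embedding model is expected, e.g. Chroma.from_documents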
Performance benchmarks
| Operation | Latency | Notes |
|---|---|---|
| Simple LLM call | ~1-2s | Depends on provider |
| Agent with 1 tool | ~3-5s | ReAct reasoning overhead |
| RAG retrieval | ~0.5-1s | Vector search + LLM |
| Embedding 1000 docs | ~10-30s | Depends on model |
LangChain vs LangGraph
| Feature | LangChain | LangGraph |
|---|---|---|
| Best for | Quick agents, RAG | Complex workflows |
| Abstraction level | High | Low |
| Code to start | <10 lines | ~30 lines |
| Control | Simple | Full control |
| Stateful workflows | Limited | Native |
| Cyclic graphs | No | Yes |
| Human-in-loop | Basic | Advanced |
Use LangGraph when:
- Need stateful workflows with cycles
- Require fine-grained control
- Building multi-agent systems
- Production apps with complex logic
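To make the contrast concrete, here is a minimal sketch of a LangGraph stateful workflow (assumes pip install langgraph; the state fields and node name are illustrative):
from typing import TypedDict
from langgraph.graph import StateGraph, START, END

class State(TypedDict):
    question: str
    answer: str

def answer_node(state: State) -> dict:
    # Call an LLM here in a real app; hardcoded for the sketch
    return {"answer": f"You asked: {state['question']}"}

graph = StateGraph(State)
graph.add_node("answer", answer_node)
graph.add_edge(START, "answer")
graph.add_edge("answer", END)
app = graph.compile()
print(app.invoke({"question": "What is LangGraph?"}))
Explicit nodes and edges are more code than create_agent, but they give you full control over state, cycles, and human-in-the-loop checkpoints.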
References
- Agents Guide - ReAct, tool calling, streaming
- RAG Guide - Document loaders, retrievers, QA chains
- Integration Guide - Vector stores, LangSmith, deployment
Resources
- GitHub: https://github.com/langchain-ai/langchain ⭐ 119,000+
- Docs: https://docs.langchain.com
- API Reference: https://reference.langchain.com/python
- LangSmith: https://smith.langchain.com (observability)
- Version: 0.3+ (stable)
- License: MIT
Related Skills
sglang
SGLang is a high-performance LLM serving framework that enables fast structured generation with JSON/regex outputs and constrained decoding. It's ideal for agentic workflows with tool calls and multi-turn conversations, offering significantly faster inference through RadixAttention prefix caching. Use it when you need production-scale performance with shared context across requests.
training-llms-megatron
This skill trains massive language models (2B-462B parameters) using NVIDIA's Megatron-Core framework for maximum GPU efficiency. Use it when training models over 1B parameters, requiring advanced parallelism strategies like tensor or pipeline, or needing production-ready performance. It's a proven framework used for models like Nemotron and LLaMA.
pinecone
Pinecone is a fully managed vector database for production AI applications, featuring auto-scaling, low latency (<100ms p95), and hybrid search. It's ideal for developers who need a serverless solution for production RAG, semantic search, or recommendation systems without managing infrastructure. Use it when you require metadata filtering, namespaces, and scaling to billions of vectors.
tensorrt-llm
TensorRT-LLM is an NVIDIA-optimized library for deploying LLMs on NVIDIA GPUs, delivering up to 100x faster inference than PyTorch. Use it for production serving where you need maximum throughput, low latency, and support for features like quantization (FP8/INT4), in-flight batching, and multi-GPU scaling.
