Two powerful paradigms for eliciting structured, step-by-step reasoning from large language models — transforming opaque outputs into transparent, verifiable thinking.
Technique 01
Chain-of-Thought (CoT)
Chain-of-Thought prompting encourages a model to articulate its intermediate reasoning steps before arriving at a final answer. Rather than jumping straight to a conclusion, the model “thinks aloud,” producing a sequential chain of logical steps that mirrors how a human expert would work through a problem.
Core idea: Prefixing or seeding a prompt with a reasoning chain (e.g., “Let’s think step by step…”) dramatically improves accuracy on multi-step math, commonsense, and symbolic reasoning tasks.
How it works
1
Pose the problem
Provide the question or task in natural language, optionally with a few worked examples (few-shot CoT).
2
Elicit reasoning
Include a trigger phrase — “Let’s think step by step” — or supply exemplar chains that demonstrate the desired format.
3
Generate the chain
The model produces intermediate reasoning tokens: sub-calculations, logical deductions, or conceptual bridges.
4
Extract the answer
The final answer follows naturally from the chain, and can be parsed or verified separately.
Prompt example
# Zero-shot CoT
Q: If a store sells 3 pens for $4.50, how much do 7 pens cost?
A: Let’s think step by step.
→ Cost of 1 pen = $4.50 ÷ 3 = $1.50
→ Cost of 7 pens = $1.50 × 7 = $10.50
The answer is $10.50.
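The four steps above can be sketched in a few lines of Python. Here `call_llm` is a hypothetical stand-in for whatever completion API you use (stubbed below to return the worked chain from the example), and the extraction step simply parses the last dollar amount out of the generated chain.

```python
import re

def call_llm(prompt):
    """Hypothetical LLM call -- replace with your completion API.
    Stubbed to return the worked chain from the example above."""
    return ("Cost of 1 pen = $4.50 / 3 = $1.50\n"
            "Cost of 7 pens = $1.50 * 7 = $10.50\n"
            "The answer is $10.50.")

def zero_shot_cot(question):
    # Steps 1-2: pose the problem and append the trigger phrase
    prompt = f"Q: {question}\nA: Let's think step by step."
    # Step 3: the model generates the reasoning chain
    chain = call_llm(prompt)
    # Step 4: parse the final answer out of the chain
    amounts = re.findall(r"\$\d+(?:\.\d+)?", chain)
    return chain, (amounts[-1] if amounts else None)

chain, answer = zero_shot_cot(
    "If a store sells 3 pens for $4.50, how much do 7 pens cost?")
print(answer)  # → $10.50
```

In practice the extraction step is often made more robust by asking the model to end with a fixed marker such as "The answer is", then parsing only what follows it.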
Best for
Arithmetic & math word problems · Logical deduction · Commonsense reasoning · Code explanation · Multi-hop QA
vs.
Technique 02
Tree-of-Thoughts (ToT)
Tree-of-Thoughts generalises CoT from a single linear chain into a branching tree of reasoning paths. The model generates multiple candidate “thoughts” at each step, evaluates them (via scoring or voting), and uses search algorithms such as BFS or DFS to navigate toward the most promising solution.
Core idea: By exploring and pruning a tree of partial solutions rather than committing to one path, ToT enables deliberate, backtracking-capable reasoning that handles ambiguous or creative tasks far better than CoT alone.
[Tree diagram: 24 Game example]
The four pillars of ToT
T
Thought decomposition
Break the problem into a sequence of intermediate “thought” units — each small enough to generate and evaluate independently.
G
Thought generation
Sample multiple candidate continuations per node using either independent proposals or sequential sampling with temperature.
E
State evaluation
Score each thought via a value function or majority vote across multiple LLM queries — deciding “sure”, “maybe”, or “impossible”.
S
Search algorithm
Apply BFS (breadth-first) to maintain the top-k thoughts per level, or DFS with pruning for deeper exploration with backtracking.
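As a concrete sketch of the evaluation pillar: a value function maps the verbal labels (“sure”, “maybe”, “impossible”) to scores and majority-votes across repeated LLM queries. The weights below are illustrative choices, and `ask_llm` is a hypothetical judge function, not a real API.

```python
from collections import Counter

# Illustrative weights for the verbal value labels
LABEL_SCORES = {"sure": 1.0, "maybe": 0.5, "impossible": 0.0}

def evaluate_state(ask_llm, problem, partial_thought, n_votes=5):
    """Score a partial thought by majority vote over n LLM judgments.

    `ask_llm(problem, thought)` is assumed to return one of the
    labels "sure", "maybe", or "impossible".
    """
    votes = Counter(ask_llm(problem, partial_thought)
                    for _ in range(n_votes))
    label, _ = votes.most_common(1)[0]
    return LABEL_SCORES.get(label, 0.0)

# Usage with a stub judge that always answers "maybe":
score = evaluate_state(lambda p, t: "maybe", "24 game", "13 - 9 = 4")
# score == 0.5
```

Voting over several queries smooths out the noise of any single judgment, at the cost of multiplying the number of LLM calls per node.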
Minimal implementation
# Simplified ToT loop (pseudocode; assumes llm_generate / llm_evaluate helpers)
def tree_of_thoughts(problem, breadth=3, depth=4):
    frontier = [""]  # start from an empty thought
    for step in range(depth):
        candidates = []
        for thought in frontier:
            # generate `breadth` candidate next-thoughts
            next_thoughts = llm_generate(problem, thought, n=breadth)
            # evaluate each candidate partial solution
            scores = [llm_evaluate(problem, thought + t)
                      for t in next_thoughts]
            candidates += list(zip(scores, next_thoughts))
        # keep the top-k candidates by score (BFS)
        frontier = [t for _, t in
                    sorted(candidates, reverse=True)[:breadth]]
    return frontier[0]  # best leaf thought
Best for
Creative writing & storytelling · Mathematical puzzles (Game of 24) · Strategic planning · Code generation with constraints · Mini crossword solving · Tasks requiring backtracking
Side-by-side
Comparison at a Glance
| Dimension | CoT | ToT |
| --- | --- | --- |
| Structure | Single linear chain | Branching tree of paths |
| Exploration | One pass, no backtracking | Multi-path with pruning & backtracking |
| LLM calls | 1 (or a few for self-consistency) | Many (generation + evaluation per node) |
| Latency / cost | Low | High (but parallelisable) |
| Error recovery | None — errors propagate forward | Built-in via pruning and search |
| Ideal task type | Sequential, low-ambiguity problems | Combinatorial, creative, high-ambiguity problems |
| Variants | Zero-shot CoT, Few-shot CoT, Self-Consistency | BFS-ToT, DFS-ToT, MCTS-ToT, Graph-of-Thought |
| Human analogy | Writing a proof step-by-step | Chess player exploring moves mentally |
Decision guide
Which technique should you use?
The choice between CoT and ToT usually comes down to task complexity and your budget for compute.
Use CoT when — the problem has a clear sequential structure, latency matters, you need a quick and transparent rationale, or you’re working with a smaller model that can’t afford multiple calls.
Use ToT when — the solution space is vast or ambiguous, mistakes early on are costly and hard to detect, you need the model to explore creative alternatives, or you can afford the extra API calls and want state-of-the-art accuracy on hard benchmarks.
In practice, combining both is powerful: use CoT to quickly generate candidate thoughts cheaply, then apply ToT-style evaluation to select and refine the best chain. Self-Consistency CoT (sampling multiple chains and majority-voting) is a lightweight middle ground.
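That middle ground can be sketched as sample-then-vote. Here `sample_chain` is a hypothetical function that runs one CoT pass at a nonzero temperature and returns the parsed final answer; the stub below simulates five chains where four agree.

```python
from collections import Counter

def self_consistency(sample_chain, question, n_samples=5):
    """Sample n independent CoT chains and majority-vote their answers.

    `sample_chain(question)` is assumed to run one CoT pass and
    return that chain's extracted final answer.
    """
    answers = [sample_chain(question) for _ in range(n_samples)]
    answer, count = Counter(answers).most_common(1)[0]
    return answer, count / n_samples  # winner plus agreement ratio

# Usage with a stubbed sampler: 4 of 5 chains agree on "$10.50"
fake_answers = iter(["$10.50", "$10.50", "$9.00", "$10.50", "$10.50"])
ans, agreement = self_consistency(lambda q: next(fake_answers),
                                  "How much do 7 pens cost?",
                                  n_samples=5)
# ans == "$10.50", agreement == 0.8
```

The agreement ratio doubles as a cheap confidence signal: a low ratio suggests the problem is ambiguous enough to justify escalating to full ToT search.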