Large Language Model Reasoning Process and Prompting techniques Part 1

Make LLM smarter to solve complex tasks

Xin Cheng
Jun 16, 2024

This is the 9th article in the series on building LLM-powered AI applications. Let’s survey LLM prompting and reasoning techniques.

Prompt Engineering

Few Shot: after the task description, provide a few input/output examples of the task
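A minimal sketch of how a few-shot prompt can be assembled. The function name `build_few_shot_prompt`, the "Input:/Output:" layout, and the sentiment examples are illustrative assumptions, not a required format.

```python
def build_few_shot_prompt(task, examples, query):
    """Assemble a few-shot prompt: task description, worked examples, then the new input."""
    lines = [task, ""]
    for text, label in examples:
        lines.append(f"Input: {text}")
        lines.append(f"Output: {label}")
        lines.append("")
    # Leave the last output slot empty so the model completes it.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "Classify the sentiment of each review as positive or negative.",
    [("Great battery life.", "positive"),
     ("Stopped working after a week.", "negative")],
    "The screen is gorgeous.",
)
print(prompt)
```

The resulting string is sent to the model as-is; the examples show the expected output format as much as the task itself.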

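Generated Knowledge: a little like retrieval-augmented generation, except that the knowledge is produced by the model itself: first prompt the LLM to generate relevant facts, then add them to the prompt as context and generate the final answer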

Chain of Thought (CoT): the simplest way to achieve this is to include the instruction “Let’s think step by step”, instructing the model to decompose the answer process into intermediate steps before providing the final response.
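A rough sketch of zero-shot CoT, assuming a two-call variant (one call to elicit the reasoning, one to extract the answer); a single call with the trigger phrase is also common. `llm` is a hypothetical callable that maps a prompt string to a completion string, and `fake_llm` is a canned stand-in so the sketch runs without a model.

```python
from typing import Callable


def zero_shot_cot(question: str, llm: Callable[[str], str]) -> str:
    # Stage 1: elicit intermediate steps with the trigger phrase.
    reasoning_prompt = f"Q: {question}\nA: Let's think step by step."
    reasoning = llm(reasoning_prompt)
    # Stage 2: ask for the final answer, conditioned on the reasoning.
    answer_prompt = f"{reasoning_prompt}\n{reasoning}\nTherefore, the final answer is"
    return llm(answer_prompt).strip()


def fake_llm(prompt: str) -> str:
    # Canned stand-in for a real model call (assumption for illustration).
    if prompt.endswith("the final answer is"):
        return " 11"
    return "Roger starts with 5 balls. 2 cans of 3 balls is 6 balls. 5 + 6 = 11."


print(zero_shot_cot(
    "Roger has 5 tennis balls and buys 2 cans of 3 balls each. How many does he have now?",
    fake_llm,
))
```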

Self Reflection: add a verification layer to the generated response to detect errors, inconsistencies, etc. (e.g. does the output meet the requirement?); it can be used in an Iterate-Refine framework
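One way to picture the Iterate-Refine loop is sketched below. The names `iterate_refine`, `generate`, and `critique` are hypothetical; in practice both callables would wrap LLM prompts, while here they are deterministic stand-ins so the example runs on its own.

```python
def iterate_refine(task, generate, critique, max_rounds=3):
    """Generate a draft, then alternate critique and revision until the critic passes it."""
    draft = generate(task, None, None)
    for _ in range(max_rounds):
        feedback = critique(task, draft)
        if feedback is None:  # the verifier found no problems
            return draft
        draft = generate(task, draft, feedback)
    return draft  # best effort after max_rounds revisions


def fake_generate(task, previous_draft, feedback):
    # Stand-in for an LLM call: the first draft is too long, the revision is short.
    if feedback is None:
        return "Our product is the very best choice for everyone"
    return "Simply the best"


def fake_critique(task, draft):
    # Stand-in for a verification prompt: checks the length requirement.
    if len(draft.split()) < 5:
        return None
    return "Too long: use fewer than 5 words."


result = iterate_refine("Write a tagline with fewer than 5 words.",
                        fake_generate, fake_critique)
print(result)
```

Capping the loop with `max_rounds` is a design choice to bound cost when the critic never accepts a draft.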

Decomposed: decompose the original prompt into sub-prompts and then combine the results into the final response (e.g. “How many Oscars did the main actor of Titanic win?” becomes “Who was the main actor of Titanic?” (answer1) and “How many Oscars did {answer1} win?”)
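The Titanic example can be sketched as a chain of templated sub-prompts, where each answer is substituted into the next prompt. `answer_decomposed` and the canned replies in `fake_llm` are assumptions for illustration; a real system would also need the LLM (or a human) to produce the decomposition itself.

```python
def answer_decomposed(sub_prompts, llm):
    """Answer sub-prompts in order, feeding earlier answers into later templates."""
    answers = {}
    for i, template in enumerate(sub_prompts, start=1):
        prompt = template.format(**answers)  # fills {answer1}, {answer2}, ...
        answers[f"answer{i}"] = llm(prompt)
    return answers[f"answer{len(sub_prompts)}"], answers


def fake_llm(prompt):
    # Canned replies standing in for a real model.
    canned = {
        "Who was the main actor of Titanic?": "Leonardo DiCaprio",
        "How many Oscars did Leonardo DiCaprio win?": "1",
    }
    return canned[prompt]


final, trace = answer_decomposed(
    ["Who was the main actor of Titanic?",
     "How many Oscars did {answer1} win?"],
    fake_llm,
)
print(final, trace)
```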

Self Consistency: sample several responses to the same question at a higher temperature (a higher temperature means more randomness in the model’s answers) and then combine them into a final response. For classification problems this is done by majority voting.
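A small sketch of the majority-voting step. `sample` is a hypothetical callable that returns one final answer per call (for a real model, one sampled completion at a non-zero temperature); the canned answers below only stand in for that randomness.

```python
from collections import Counter


def self_consistent_answer(question, sample, n_samples=5):
    """Sample several answers and return the most frequent one with its vote share."""
    votes = Counter(sample(question) for _ in range(n_samples))
    answer, count = votes.most_common(1)[0]
    return answer, count / n_samples


# Canned samples standing in for five sampled completions.
canned = iter(["11", "12", "11", "10", "11"])


def fake_sample(question):
    return next(canned)


answer, agreement = self_consistent_answer("How many tennis balls?", fake_sample)
print(answer, agreement)
```

The vote share can double as a rough confidence signal: low agreement suggests the question deserves more samples or a different strategy.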

Similar techniques: Least-to-Most Prompting, Decomposed, Self-Ask, Chain-of-Thought, Iterative

ReAct: interleaves reasoning (CoT prompting) and acting (generation of action plans, such as tool calls whose results are fed back to the model)
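The reason/act loop might look like the sketch below. The "Thought / Action: tool[input] / Observation / Final Answer" text format, the `react` function, and the toy `lookup` tool are assumptions chosen for illustration; `fake_llm` replays a fixed script in place of a model.

```python
import re

ACTION_RE = re.compile(r"Action: (\w+)\[(.*)\]")


def react(question, llm, tools, max_steps=5):
    """Alternate model steps and tool calls until the model emits a final answer."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript)
        transcript += step + "\n"
        if "Final Answer:" in step:
            return step.split("Final Answer:", 1)[1].strip()
        match = ACTION_RE.search(step)
        if match:
            tool_name, tool_input = match.groups()
            # Run the tool and feed the result back as an observation.
            transcript += f"Observation: {tools[tool_name](tool_input)}\n"
    return None  # no answer within the step budget


def lookup(query):
    # Toy tool backed by a tiny in-memory table.
    return {"capital of France": "Paris"}.get(query, "unknown")


def fake_llm(transcript):
    # Scripted stand-in: act first, answer once an observation is available.
    if "Observation:" not in transcript:
        return "Thought: I should look this up.\nAction: lookup[capital of France]"
    return "Thought: The observation answers the question.\nFinal Answer: Paris"


print(react("What is the capital of France?", fake_llm, {"lookup": lookup}))
```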

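Symbolic Reasoning & PAL: the model should be able to perform not only mathematical reasoning but also symbolic reasoning, for example reasoning about colours and object types. With PAL (Program-Aided Language models), the LLM writes its reasoning steps as a program and an interpreter computes the result.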

ART (Automatic Reasoning and Tool-use): similar to ReAct (uses tools to take actions)

Self-Consistency: prompt the LLM to generate chain-of-thought (CoT) reasoning, sample a diverse set of reasoning paths, and select the most consistent output as the final answer.

X of Thought

https://ai.plainenglish.io/chain-tree-and-graph-of-thought-for-neural-networks-6d69c895ba7f

chain of thought: The road is straightforward, with clear signs guiding you from the start to your destination, no detours or intersections, just a direct path, ideal for tasks that require a sequential approach

tree of thought: branching out to multiple sub-ideas, each offering a different perspective or solution, helps organize thoughts or tasks in order of importance or sequence

graph of thought: ideas interconnecting in a dense web, allowing for a rich exploration of topics, mirrors human thought processes’ non-linear and interconnected nature

More complex tasks require a more advanced reasoning process. Solving them takes additional components, not just better prompts.

Chain-of-Thought: provide the language model with intermediate reasoning examples to guide its response.

Chain-of-Thought-Self-Consistency: starts multiple concurrent reasoning pathways in response to a query and applies weighting mechanisms before finalizing an answer

Tree-of-Thoughts: First, the system breaks down a problem and, from its current state, generates a list of potential reasoning steps or ‘thought’ candidates. These thoughts are then evaluated, with the system gauging the likelihood that each one will lead to the desired solution. Standard search algorithms, such as Breadth-first search (BFS) and Depth-first search (DFS), are used to navigate this tree, aiding the model in identifying the most effective sequence of thoughts.

Graph-of-Thoughts: adds the ability to apply transformations to thoughts, further refining the reasoning process. The cardinal transformations are Aggregation, which fuses several thoughts into a consolidated idea; Refinement, which iterates on a single thought to improve its precision; and Generation, which derives new thoughts from existing ones.

Algorithm-of-Thoughts: ToT and GoT are computationally inefficient because of the multitude of paths and queries. AoT addresses this by 1) decomposing complex problems into digestible subproblems, considering both their interrelation and the ease with which they can be individually addressed; 2) proposing coherent solutions for these subproblems in a continuous and uninterrupted manner; 3) intuitively evaluating the viability of each solution or subproblem without relying on explicit external prompts; and 4) determining the most promising paths to explore or backtrack to, based on in-context examples and algorithmic guidelines.

Skeleton-of-Thought: designed not primarily to augment the reasoning capabilities of Large Language Models (LLMs), but to address the pivotal challenge of minimizing end-to-end generation latency. In the initial “Skeleton Stage,” rather than producing a comprehensive response, the model is prompted to generate a concise answer skeleton. In the ensuing “Point-Expanding Stage,” the LLM systematically amplifies each component delineated in the answer skeleton.
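A sketch of the two stages, under the assumption that the point expansions are independent and can be requested in parallel (which is where the latency saving would come from). The prompt wording, `skeleton_of_thought`, and the scripted `fake_llm` are illustrative only.

```python
from concurrent.futures import ThreadPoolExecutor


def skeleton_of_thought(question, llm, max_workers=4):
    # Skeleton Stage: ask for a concise outline instead of a full answer.
    skeleton = llm(f"Give a short numbered outline answering: {question}")
    points = [line.strip() for line in skeleton.splitlines() if line.strip()]

    def expand(point):
        return llm(f"Question: {question}\nExpand this outline point in 1-2 sentences: {point}")

    # Point-Expanding Stage: expand every point concurrently; map keeps the order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        expansions = list(pool.map(expand, points))
    return "\n".join(f"{p} {e}" for p, e in zip(points, expansions))


def fake_llm(prompt):
    # Scripted stand-in for a real model.
    if prompt.startswith("Give a short"):
        return "1. Eat well\n2. Exercise\n3. Sleep enough"
    point = prompt.rsplit(": ", 1)[1]
    return f"(expanded text for '{point}')"


answer = skeleton_of_thought("How can I stay healthy?", fake_llm)
print(answer)
```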

Program-of-Thoughts: formulate the reasoning behind question answering as an executable program, and incorporate the program interpreter’s output as part of the final answer.
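A minimal sketch of the idea: the model writes code, and the interpreter does the arithmetic. The convention of storing the result in a variable named `ans`, the prompt wording, and `fake_llm` are assumptions; note that running model-written code with `exec` is unsafe outside a proper sandbox, and emptying `__builtins__` is not a real security boundary.

```python
def program_of_thought(question, llm):
    prompt = (
        "Write Python code that computes the answer to the question and "
        "stores it in a variable named `ans`.\n"
        f"Question: {question}\n"
    )
    program = llm(prompt)
    namespace = {}
    # WARNING: sketch only. Real systems should execute generated code in a sandbox.
    exec(program, {"__builtins__": {}}, namespace)
    return namespace["ans"]


def fake_llm(prompt):
    # Stand-in for the model: returns a short program instead of prose.
    return (
        "principal = 1000\n"
        "rate = 0.05\n"
        "years = 3\n"
        "ans = principal * (1 + rate) ** years\n"
    )


result = program_of_thought(
    "What is 1000 worth after 3 years at 5% annual compound interest?", fake_llm
)
print(round(result, 3))
```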

CoT/ToT

  1. Represent the reasoning process as a tree, where each node is an intermediate “thought” or coherent piece of reasoning that serves as a step towards the final solution.
  2. Actively generate multiple possible thoughts at each step, rather than just sampling one thought sequentially as in chain-of-thought prompting. This allows the model to explore diverse reasoning paths.
  3. Evaluate the promise of different thoughts/nodes using the LLM itself, by prompting it to assess the validity or likelihood of success of each thought. This provides a heuristic to guide the search through the reasoning tree.
  4. Use deliberate search algorithms like breadth-first search or depth-first search to systematically explore the tree of thoughts. Unlike chain of thought, ToT can look ahead, backtrack, and branch out to consider different possibilities.
  5. The overall framework is general and modular — the thought representation, generation, evaluation, and search algorithm can all be customized for different problems. No extra training of models is needed.

The implementation process

  1. Define the problem input and desired output.
  2. Decompose the reasoning process into coherent thought steps. Determine an appropriate granularity for thoughts based on what the LLM can generate and evaluate effectively.
  3. Design a thought generator prompt to propose k possible next thoughts conditioned on the current thought sequence. This could sample thoughts independently or sequentially in context.
  4. Design a thought evaluation prompt to assess the promise of generated thoughts. This could value thoughts independently or vote/rank thoughts relative to each other.
  5. Choose a search algorithm like BFS or DFS based on the estimated tree depth and branching factor.
  6. Initialize the tree with the problem input as the root state. Use the thought generator to expand the leaf nodes and the thought evaluator to prioritize newly generated thoughts.
  7. Run a search for up to a maximum number of steps or until a solution is found. Extract the reasoning chain from the highest valued leaf node.
  8. Analyze results and refine prompts as needed to improve performance. Adjust search hyperparameters like branching factor and depth as needed.
  9. For new tasks, iterate on the design by adjusting the thought representation, search algorithm, or evaluation prompts. Leverage the LM’s strengths and task properties.
  10. Compare ToT performance to baseline approaches like input-output prompting and analyze errors to identify areas for improvement.
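The steps above can be sketched as a breadth-first search that keeps only the best few states at each level. This is a simplified illustration, not a reference implementation: in a real setup `propose` and `evaluate` would be LLM prompts (the thought generator and thought evaluator), whereas here they are plain functions on a toy problem (pick three numbers from 2, 3, 4 that sum to 9). All names are hypothetical.

```python
def tree_of_thoughts_bfs(root, propose, evaluate, is_solution,
                         beam_width=2, max_depth=3):
    """BFS over thought sequences, pruning to the `beam_width` best states per level."""
    frontier = [root]
    for _ in range(max_depth):
        # Expand every state in the frontier with the thought generator.
        candidates = [child for state in frontier for child in propose(state)]
        if not candidates:
            break
        for candidate in candidates:
            if is_solution(candidate):
                return candidate
        # Score candidates with the thought evaluator and keep the most promising.
        candidates.sort(key=evaluate, reverse=True)
        frontier = candidates[:beam_width]
    return max(frontier, key=evaluate)  # best partial chain if nothing solved it


TARGET = 9
CHOICES = (2, 3, 4)


def propose(state):
    return [state + (n,) for n in CHOICES]


def evaluate(state):
    return -abs(TARGET - sum(state))  # closer to the target is better


def is_solution(state):
    return len(state) == 3 and sum(state) == TARGET


result = tree_of_thoughts_bfs((), propose, evaluate, is_solution)
print(result)
```

Raising `beam_width` explores more of the tree at the price of more generator and evaluator calls, which is the main cost knob mentioned in step 8.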

CoT sequential logic

Example framework of CoT in marketing analysis: Identify Target Audience, Analyze Channel Preferences, Evaluate Channel Reach and Engagement, Consider Budget Constraints, Recommend Optimal Marketing Channel

Example framework of CoT in customer feedback analysis: Categorize Feedback, Sentiment Analysis, Identify Recurring Issues, Suggest Improvements

GoT

The article mentions several approaches related to Graph of Thoughts that enhance LLM reasoning:

  • Knowledge graphs — Represent factual knowledge through entities, relationships and rules. They provide structured external knowledge to guide the LLM.
  • Tree of Thoughts — Decomposes reasoning into a search over thoughts. It provides a framework to explore diverse reasoning paths.
  • Reasoning modes — Deductive (chaining logical rules), inductive (generalizing patterns), abductive (hypothesizing explanations), and analogical reasoning (drawing parallels) can be composed.

But relying solely on the LLM’s generation limits the reasoning.

Components of GoT: Controller, Operations, Prompter, Parser, Graph Reasoning State.

Reasoning Swarm (in conceptual stage) consists of multiple specialized agents that collectively expand the LLM’s graph of thoughts using different reasoning approaches and external knowledge. Agents can include deduction, induction, web search, and vector search.

  • Graph-Based Modeling: In GoT, LLM reasoning is represented as a graph where vertices represent “thoughts” or intermediate solutions, and edges indicate dependencies between them.
  • Flexible Reasoning: Unlike linear or tree-based prompting schemes, graph structure allows aggregating the best thoughts, refining thoughts through feedback loops, etc.
  • Advantages in Task Handling: break down complex tasks into smaller subtasks, independently solve subtasks, and incrementally combine solutions. This improves accuracy and reduces inference costs.
  • Suitable tasks and Performance: sorting, set operations, keyword counting, and document merging. For example, it improves sorting accuracy by 62% over tree-of-thoughts while cutting costs by >31%.
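To make the graph structure concrete, here is a toy sketch of the sorting use case: split the list (generation), sort each part (refinement), and merge the parts (aggregation). In GoT each operation would be carried out by an LLM call; here plain Python stands in for those calls, and the `Thought` class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from heapq import merge


@dataclass
class Thought:
    content: list
    parents: list = field(default_factory=list)  # edges to the thoughts this one depends on


def generate(thought, n_parts):
    """Generation: derive new thoughts (sub-lists) from an existing one."""
    size = max(1, -(-len(thought.content) // n_parts))  # ceiling division
    return [Thought(thought.content[i:i + size], [thought])
            for i in range(0, len(thought.content), size)]


def refine(thought):
    """Refinement: improve a single thought; here, sort the sub-list."""
    return Thought(sorted(thought.content), [thought])


def aggregate(thoughts):
    """Aggregation: fuse several thoughts into one; here, merge sorted sub-lists."""
    return Thought(list(merge(*(t.content for t in thoughts))), list(thoughts))


root = Thought([5, 2, 9, 1, 7, 3, 8, 4])
parts = generate(root, n_parts=4)
final = aggregate([refine(p) for p in parts])
print(final.content)
```

The aggregated thought has four parents, which is what a chain or a tree cannot express: a node in a tree has exactly one parent.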

AoT

Key Components of AoT
  • Decomposing Problems into Subproblems
  • Generating Solutions Without Pauses
  • Exploring Branches Using Heuristics
  • Backtracking to Traverse Promising Paths
  • Emulating Algorithmic Search Using LLM Generation

Search-based methods are costly: the Tree of Thoughts (ToT) method, for example, requires multiple rounds of querying as it traverses dozens of branches and nodes, which is computationally heavy.

AoT is designed to address this cost: it gives the LLM a structured, algorithm-like reasoning path to follow, aiming to improve efficiency without compromising the quality of outputs.
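A rough sketch of what "emulating algorithmic search using LLM generation" can mean in practice: one prompt carries an in-context example of a search trace with backtracking, and the model is expected to produce a similar trace and answer in a single completion. The example trace, the "Search:/Answer:" layout, and `fake_llm` are all made up for illustration.

```python
AOT_EXAMPLE = """\
Problem: pick numbers from [5, 4, 3] that sum to 7.
Search:
- Try 5. 5 + 4 = 9 > 7, backtrack. 5 + 3 = 8 > 7, backtrack. Drop 5.
- Try 4. 4 + 3 = 7. Reached 7.
Answer: 4 + 3
"""


def algorithm_of_thoughts(problem, llm):
    """Single query: the explore/backtrack search happens inside the generation."""
    prompt = f"{AOT_EXAMPLE}\nProblem: {problem}\nSearch:\n"
    completion = llm(prompt)
    for line in completion.splitlines():
        if line.startswith("Answer:"):
            return line[len("Answer:"):].strip()
    return None  # the model never committed to an answer


def fake_llm(prompt):
    # Scripted stand-in for a model that imitates the in-context trace.
    return (
        "- Try 6. 6 + 5 = 11 > 8, backtrack. 6 + 2 = 8. Reached 8.\n"
        "Answer: 6 + 2\n"
    )


print(algorithm_of_thoughts("pick numbers from [6, 5, 2] that sum to 8.", fake_llm))
```

Compared with a tree search that issues one query per node, this uses one call per problem, at the cost of trusting the model to carry out the search faithfully in its own output.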

Mimic algorithmic thinking: Define the Problem, Gather Information, Analyze the Information, Formulate a Hypothesis, Test the Hypothesis, Draw Conclusions, Reflect

Appendix

Xin Cheng

Multi/Hybrid-cloud, Kubernetes, cloud-native, big data, machine learning, IoT developer/architect, 3x Azure-certified, 3x AWS-certified, 2x GCP-certified