Enhance User Queries with Retrieval-Augmented Generation Using Sensei 7b V1

25 Apr 2025

In today's fast-paced digital world, users are inundated with vast amounts of information. The ability to distill this information into accurate, concise responses is paramount for developers looking to enhance user experiences. The Sensei 7b V1 offers a powerful Cognitive Action that enables retrieval-augmented generation (RAG) over detailed web search results, providing well-cited summaries for user queries. This service not only simplifies the process of information retrieval but also ensures that the responses are both relevant and informative.

Imagine a scenario where users frequently ask questions that require comprehensive answers derived from multiple sources. Instead of manually sifting through search results, developers can leverage the Sensei 7b V1 to automatically generate insightful summaries, making it a valuable tool for applications in customer support, educational platforms, and content creation. By integrating this action, developers can significantly enhance the efficiency and effectiveness of their applications.

Prerequisites

To utilize the Sensei 7b V1 Cognitive Action, you will need an API key and a fundamental understanding of making API calls.

Perform Retrieval-Augmented Generation with Sensei

The Perform Retrieval-Augmented Generation with Sensei action harnesses the capabilities of the Sensei-7B-V1 model to perform RAG over search results. This action addresses the challenge of generating accurate responses from a plethora of online information, ensuring that users receive clear and concise answers.

Input Requirements:

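Prompt (prompt): The instruction that guides the model. It should contain the user query together with the search results to be summarized.
Stop (stop): A string at which generation halts. The example payload uses "### Instruction:" so the model does not run on into a new instruction block.
Top K (topK): The number of highest-probability tokens considered at each generation step. The example payload sets it to -1, which appears to disable the limit.
Top P (topP): A cumulative probability threshold for nucleus sampling. Only the most likely tokens whose probabilities together reach this value remain candidates.
Max Tokens (maxTokens): The upper bound on the number of tokens generated. It caps the response length; it does not by itself make the answer concise.
Temperature (temperature): Scales the randomness of the output. Lower values give more deterministic responses, higher values give more varied ones.
Presence Penalty (presencePenalty): Adjusts the likelihood of any token that has already appeared in the output, regardless of how many times. In most implementations, positive values discourage reuse.
Frequency Penalty (frequencyPenalty): Adjusts the likelihood of a token in proportion to how often it has already appeared in the output. In most implementations, positive values discourage repetition and negative values encourage it.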
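
To make the sampling parameters more concrete, the following is a minimal sketch of how temperature, top-k, and top-p are commonly applied to a model's next-token scores. It is not the actual implementation behind Sensei or the Cognitive Actions API. The function name filter_next_token_probs and the toy scores are made up for illustration, and treating a top-k of -1 as "no limit" is an assumption based on the example payload.

```python
import math


def filter_next_token_probs(logits, temperature=1.0, top_k=-1, top_p=1.0):
    """Return the renormalized candidate distribution after temperature,
    top-k, and top-p (nucleus) filtering. Hypothetical helper for illustration."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")

    # Temperature scaling followed by a numerically stable softmax.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())
    exps = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(exps.values())
    ranked = sorted(
        ((tok, e / total) for tok, e in exps.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )

    # Top-k: keep only the k most likely tokens (assumed: k <= 0 disables it).
    if top_k > 0:
        ranked = ranked[:top_k]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, prob in ranked:
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    kept_total = sum(prob for _, prob in kept)
    return {tok: prob / kept_total for tok, prob in kept}


# Toy scores, invented for this example.
toy_logits = {"Hussein": 4.0, "Barack": 2.0, "II": 1.0, "the": 0.0}
print(filter_next_token_probs(toy_logits, temperature=0.8, top_k=-1, top_p=0.95))
```

Lowering the temperature or top-p narrows the candidate set toward the single most likely token, which tends to suit factual summarization better than open-ended generation.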

Expected Output: The action returns a structured JSON response that includes a summary of the search results and a list of related queries. This ensures that users receive not just an answer but also context and additional avenues for exploration.
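
Because the example prompt ends with the opening fragment {"summary":, the model's completion likely continues that JSON object instead of starting a new one. The sketch below shows one way to turn such a completion back into a dictionary. It assumes the completion text has already been extracted from the API response as a string (the response schema is not documented here); the helper name parse_rag_completion, the sample completion, and the related_queries key are hypothetical.

```python
import json

# Assumption: the prompt ended with this prefix, so the model's completion
# continues the JSON object instead of repeating it.
JSON_PREFIX = '{"summary":'


def parse_rag_completion(completion, prefix=JSON_PREFIX):
    """Parse a completion into a dict, re-attaching the prompt's JSON prefix
    when the completion does not already start with an opening brace."""
    text = completion.strip()
    if not text.startswith("{"):
        text = prefix + " " + text
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        # Generation may have been cut off by maxTokens or the stop string.
        return None
    return parsed if isinstance(parsed, dict) else None


# Hypothetical completion text, written by hand for illustration only.
sample = (
    ' "Barack Obama\'s middle name is Hussein.", '
    '"related_queries": ["Who was Obama named after?"]}'
)
result = parse_rag_completion(sample)
print(result["summary"] if result else "Could not parse completion")
```

Returning None on a parse failure lets the caller decide whether to retry with a larger maxTokens value or fall back to showing the raw text.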

Use Cases for this specific action:

Customer Support: Automatically generate responses to common inquiries, improving response times and user satisfaction.
Educational Tools: Provide students with summarized information on complex topics, leveraging multiple sources to enhance learning.
Content Creation: Assist writers in generating summaries or insights from research, streamlining the content development process.

import requests
import json

# Replace with your actual Cognitive Actions API key and endpoint
# Ensure your environment securely handles the API key
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
# This endpoint URL is hypothetical and should be documented for users
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"

action_id = "689de106-b7a4-453a-9695-419e7e430afb" # Action ID for: Perform Retrieval-Augmented Generation with Sensei

# Construct the exact input payload based on the action's requirements
# This example uses the predefined example_input for this action:
payload = {
  "stop": "### Instruction:",
  "topK": -1,
  "topP": 0.95,
  "prompt": "### Instruction: Your task is to perform retrieval augmented generation (RAG) over the given query and search results. Return your answer in a json format that includes a summary of the search results and a list of related queries.\n\nQuery:\nWhat is Obama's middle name?\n\nSearch Results:\n1. Title: Obama's Middle Name -- My Last Name -- is 'Hussein.' So?\nURL: https://www.cair.com/cair_in_the_news/obamas-middle-name-my-last-name-is-hussein-so/\nText: I wasn’t sure whether to laugh or cry a few days back listening to radio talk show host Bill Cunningham repeatedly scream Barack <strong>Obama</strong>’<strong>s</strong> <strong>middle</strong> <strong>name</strong> — my last <strong>name</strong> — as if he had anti-Muslim Tourette’s. “Hussein,” Cunningham hissed like he was beckoning Satan when shouting the ...\n\n2. Title: What's up with Obama's middle name? - Quora\nURL: https://www.quora.com/Whats-up-with-Obamas-middle-name\nText: Answer (1 of 15): A better question would be, “What’s up with <strong>Obama</strong>’s first <strong>name</strong>?” President Barack Hussein <strong>Obama</strong>’s father’s <strong>name</strong> was Barack Hussein <strong>Obama</strong>. He was <strong>named</strong> after his father. Hussein, <strong>Obama</strong>’<strong>s</strong> <strong>middle</strong> <strong>name</strong>, is a very common Arabic <strong>name</strong>, meaning \"good,\" \"handsome,\" or ...\n\n3. Title: Barack Obama | Biography, Parents, Education, Presidency, Books, ...\nURL: https://www.britannica.com/biography/Barack-Obama\nText: Barack <strong>Obama</strong>, in full Barack Hussein <strong>Obama</strong> II, (born August 4, 1961, Honolulu, Hawaii, U.S.), 44th president of the United States (2009–17) and the first African American to hold the office. Before winning the presidency, <strong>Obama</strong> represented Illinois in the U.S.\n\n\n\nQuery:\nWhat is Obama's middle name?\n### Response:\n{\"summary\":",
  "maxTokens": 256,
  "temperature": 0.8,
  "presencePenalty": 0,
  "frequencyPenalty": 0
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json",
    # Add any other required headers for the Cognitive Actions API
}

# Prepare the request body for the hypothetical execution endpoint
request_body = {
    "action_id": action_id,
    "inputs": payload
}

print("--- Calling Cognitive Action: Perform Retrieval-Augmented Generation with Sensei ---")
print(f"Endpoint: {COGNITIVE_ACTIONS_EXECUTE_URL}")
print(f"Action ID: {action_id}")
print("Payload being sent:")
print(json.dumps(request_body, indent=2))
print("------------------------------------------------")

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json=request_body,
        timeout=60  # Avoid hanging indefinitely if the service does not respond
    )
    response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully. Result:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Covers every JSON decode error that requests can raise
            print(f"Response body (non-JSON): {e.response.text}")
    print("------------------------------------------------")
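
The hard-coded prompt in the payload follows a regular layout: the instruction, the query, a numbered list of search results, the query again, and a response header that ends with the opening JSON fragment. A minimal sketch of assembling that layout for arbitrary queries is shown here. The helper name build_rag_prompt and the shape of the search_results items (dicts with title, url, and text keys) are assumptions made for this example, and the sample result text is shortened.

```python
INSTRUCTION = (
    "### Instruction: Your task is to perform retrieval augmented generation "
    "(RAG) over the given query and search results. Return your answer in a "
    "json format that includes a summary of the search results and a list of "
    "related queries."
)


def build_rag_prompt(query, search_results):
    """Assemble a prompt in the same layout as the example payload.

    `search_results` is a list of dicts with "title", "url", and "text" keys
    (a hypothetical shape chosen for this sketch).
    """
    blocks = [
        f"{i}. Title: {r['title']}\nURL: {r['url']}\nText: {r['text']}"
        for i, r in enumerate(search_results, start=1)
    ]
    return (
        f"{INSTRUCTION}\n\nQuery:\n{query}\n\nSearch Results:\n"
        + "\n\n".join(blocks)
        + f"\n\n\n\nQuery:\n{query}\n### Response:\n"
        + '{"summary":'
    )


# One search result taken from the example payload, with the text shortened.
results = [
    {
        "title": "Barack Obama | Biography, Parents, Education, Presidency, Books, ...",
        "url": "https://www.britannica.com/biography/Barack-Obama",
        "text": "Barack Obama, in full Barack Hussein Obama II, ...",
    }
]
prompt = build_rag_prompt("What is Obama's middle name?", results)
print(prompt)
```

The returned string can be assigned to the "prompt" key of the payload in place of the literal shown earlier.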

Conclusion

The Sensei 7b V1’s retrieval-augmented generation capability offers developers a robust solution for enhancing user interactions by providing accurate and well-cited information. With applications ranging from customer support to educational tools, integrating this action can significantly improve the quality of responses in various contexts. As you explore the possibilities of the Sensei 7b V1, consider how you can leverage its capabilities to create more engaging and informative user experiences.