Harness Text Generation with the OpenHermes Mistral Model Cognitive Actions

25 Apr 2025
In the ever-evolving landscape of artificial intelligence, text generation models have become pivotal in enhancing user interactions and automating content creation. The OpenHermes 2.5 Mistral 7B AWQ model offers a powerful Cognitive Action designed specifically for text prediction. This action lets developers generate coherent, contextually relevant text, with adjustable parameters to control the variability of the output. In this article, we will explore how to effectively use the "Execute Text Prediction with Mistral Model" action.

Prerequisites

Before diving into the integration of this Cognitive Action, ensure you have the following:

  • An API key for the Cognitive Actions platform, which will be required for authentication.
  • Familiarity with JSON structure, as the input and output will be in this format.

Authentication typically involves passing the API key in the request headers, allowing secure access to the Cognitive Actions endpoints.

Cognitive Actions Overview

Execute Text Prediction with Mistral Model

The Execute Text Prediction with Mistral Model action performs text prediction using the OpenHermes 2.5 Mistral 7B AWQ model. It provides flexibility in generating responses by allowing the adjustment of parameters such as Top K, Top P, and temperature. This action supports both sampling and beam search decoding methods, making it versatile for various text generation needs.

Input

The input schema for this action requires the following fields:

  • prompt (required): A JSON string of message objects (with keys 'role' and 'content') that the model will process. This mirrors the role/content chat message format used by many chat-style APIs; note that the value must be a serialized JSON string, not a raw JSON array.
  • topK (optional): Defines the number of top most likely tokens to sample from, with a default value of 50.
  • topP (optional): Specifies the cumulative probability threshold for sampling, defaulting to 0.9.
  • temperature (optional): Controls the randomness of model outputs, with a default of 0.75.
  • useBeamSearch (optional): Indicates whether to use beam search decoding, defaulting to false.
  • maximumNewTokens (optional): Defines the maximum number of tokens to generate, with a default of 512.

Here is an example of the input payload:

{
  "topK": 50,
  "topP": 0.9,
  "prompt": "[\n      {\n        \"role\": \"system\",\n        \"content\": \"You are a helpful assistant.\"\n      },\n      {\n        \"role\": \"user\",\n        \"content\": \"What is Slite?\"\n      }\n    ]",
  "temperature": 0.75,
  "useBeamSearch": false,
  "maximumNewTokens": 512
}
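Because the prompt field expects a JSON string rather than a JSON array, a common pitfall is passing the message list directly. One way to avoid hand-escaping the string is to build the messages as Python dictionaries and serialize them with json.dumps (a sketch; field names follow the schema above):

```python
import json

# Build the conversation as plain Python dicts...
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is Slite?"},
]

# ...then serialize the list into the JSON string the `prompt` field expects.
payload = {
    "prompt": json.dumps(messages),
    "topK": 50,
    "topP": 0.9,
    "temperature": 0.75,
    "useBeamSearch": False,
    "maximumNewTokens": 512,
}
```

This produces the same payload as the example above, without manual escaping of quotes and newlines.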

Output

The output of this action typically includes the generated text based on the provided prompt. For instance, the model might return:

[
  "Slite is a collaborative knowledge management platform designed for teams. It allows users to create, organize, and share documents, notes, and knowledge bases within a team or organization. Slite's features include real-time collaboration, version control, search functionality, and integration with other tools like Google Drive, Dropbox, and Microsoft Teams. The platform is accessible through a web browser or mobile app, making it easy for teams to access and contribute to shared knowledge wherever they are."
]
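Assuming the output keeps this shape (a JSON array of generated strings), extracting the text is a one-liner; the raw string below is a placeholder for the actual response body:

```python
import json

# Placeholder for the raw response body returned by the action.
raw = '["Slite is a collaborative knowledge management platform designed for teams."]'

outputs = json.loads(raw)
# The generated text is the first element of the array; guard against an empty result.
generated_text = outputs[0] if outputs else ""
```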

Conceptual Usage Example (Python)

Below is a conceptual Python code snippet demonstrating how to invoke the Execute Text Prediction action via a hypothetical endpoint:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "f3ee52f3-6f5b-4313-81b3-0191f8a8f201" # Action ID for Execute Text Prediction

# Construct the input payload based on the action's requirements
payload = {
    "topK": 50,
    "topP": 0.9,
    "prompt": "[\n      {\n        \"role\": \"system\",\n        \"content\": \"You are a helpful assistant.\"\n      },\n      {\n        \"role\": \"user\",\n        \"content\": \"What is Slite?\"\n      }\n    ]",
    "temperature": 0.75,
    "useBeamSearch": False,
    "maximumNewTokens": 512
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload} # Hypothetical structure
    )
    response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except json.JSONDecodeError:
            print(f"Response body: {e.response.text}")

In this snippet, you'll need to replace the API key and endpoint with your actual values. The action ID is specified for the text prediction action, and the input payload follows the schema discussed earlier.
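Since the action supports both sampling and beam search, it can help to centralize the defaults and override only what a given call needs. Below is a small sketch (the build_payload helper is hypothetical, not part of the platform) showing a more deterministic configuration with a lower temperature and beam search enabled:

```python
import json

def build_payload(messages, **overrides):
    """Build an input payload, merging schema defaults with per-call overrides."""
    payload = {
        "prompt": json.dumps(messages),
        "topK": 50,
        "topP": 0.9,
        "temperature": 0.75,
        "useBeamSearch": False,
        "maximumNewTokens": 512,
    }
    payload.update(overrides)
    return payload

# More deterministic generation: low temperature plus beam search decoding.
deterministic = build_payload(
    [{"role": "user", "content": "What is Slite?"}],
    temperature=0.1,
    useBeamSearch=True,
)
```

Higher temperature and topP values favor varied, creative output; lowering them (or switching to beam search) trades that variety for more predictable responses.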

Conclusion

The OpenHermes 2.5 Mistral 7B AWQ Cognitive Action for text prediction offers developers a robust tool for generating context-aware text. By leveraging adjustable parameters, you can fine-tune the output to suit your specific application needs. Consider experimenting with different prompts and configurations to unlock the full potential of this powerful model. Happy coding!