Enhance Your Applications with the Llama-3 Cognitive Actions: Extend Context Length and More

23 Apr 2025

In today's landscape of AI and machine learning, the ability to manage and generate long-context text is crucial for applications ranging from chatbots to content generation. The Llama-3 8B model, published as tomasmcm/llama-3-8b-instruct-gradient-4194k, offers a Cognitive Action designed to extend the context length from 8K tokens to roughly 4,194K (about 4.2 million). This capability transforms how your applications handle complex tasks and long dialogues, while requiring only minimal adjustments.

Prerequisites

Before you get started with the Llama-3 Cognitive Actions, ensure you have the following:

  • An API key for the Cognitive Actions platform.
  • Basic knowledge of JSON formatting and Python for making API calls.

Authentication typically involves passing your API key through the request headers, which will grant you access to the Cognitive Actions functionality.
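As a sketch, the header construction might look like the following. The Bearer-token scheme shown here is an assumption based on common API conventions; verify the exact header names against the platform's documentation.

```python
# Hypothetical: construct request headers carrying the API key.
# The "Authorization: Bearer" scheme is an assumption based on common practice.
API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}
```

These headers are then passed with every request to the Cognitive Actions endpoint, as shown in the full example later in this post.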

Cognitive Actions Overview

Extend Context Length for Llama-3 Model

The Extend Context Length for Llama-3 Model action allows you to maximize the context length, enabling your application to handle extensive text generation tasks effectively.

  • Category: text-generation
  • Description: This action extends the context length of the Llama-3 8B model by adjusting the RoPE theta (the rotary positional embedding base frequency), enhancing its ability to manage long-context operations with minimal additional training.
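The RoPE theta adjustment can be illustrated numerically: rotary position embeddings assign each dimension pair a frequency derived from a base value (theta), and raising that base stretches the rotation wavelengths so positions remain distinguishable far beyond the original window. Below is a minimal sketch using the standard RoPE frequency formula; the specific base values are illustrative, not the model's actual configuration.

```python
import numpy as np

def rope_frequencies(theta_base: float, head_dim: int) -> np.ndarray:
    """Per-pair rotary frequencies: theta_base ** (-2i / head_dim)."""
    i = np.arange(head_dim // 2)
    return theta_base ** (-2.0 * i / head_dim)

# Illustrative bases: a Llama-style default vs. a much larger base for long context.
freqs_short = rope_frequencies(10_000.0, head_dim=128)
freqs_long = rope_frequencies(10_000_000.0, head_dim=128)

# A larger theta base yields lower frequencies (longer wavelengths),
# so positional encodings rotate more slowly and stay distinguishable
# across much longer contexts.
print(freqs_long[-1] < freqs_short[-1])  # True
```

This is why the context extension requires only minimal training: the architecture is unchanged, and only the positional frequency schedule is rescaled.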

Input

The input schema for this action is structured as follows:

{
  "prompt": "string",
  "maxTokens": "integer",
  "temperature": "number",
  "topP": "number",
  "topK": "integer",
  "presencePenalty": "number",
  "frequencyPenalty": "number",
  "stop": "string"
}

Required Field:

  • prompt: The initial text prompt provided to the model for generating an output sequence.

Optional Fields:

  • maxTokens: Maximum number of tokens the model will generate (default: 128).
  • temperature: Controls randomness in output (default: 0.8).
  • topP: Cumulative probability for top tokens consideration (default: 0.95).
  • topK: Number of top tokens to consider during generation (default: -1, which considers all tokens).
  • presencePenalty: Adjusts likelihood of generating new vs. repeated tokens (default: 0).
  • frequencyPenalty: Modifies likelihood based on prior frequency (default: 0).
  • stop: A string that, when generated, halts the generation process.
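To make the defaults above explicit, a small helper can merge user-supplied fields over them before sending a request. The default values come from the list above; the helper function itself is illustrative, not part of the platform's API.

```python
# Defaults taken from the input schema above; user fields are merged over them.
DEFAULTS = {
    "maxTokens": 128,
    "temperature": 0.8,
    "topP": 0.95,
    "topK": -1,           # -1 means all tokens are considered
    "presencePenalty": 0,
    "frequencyPenalty": 0,
}

def build_payload(prompt: str, **overrides) -> dict:
    """Return a complete input payload; 'prompt' is the only required field."""
    payload = {**DEFAULTS, "prompt": prompt}
    payload.update(overrides)
    return payload

payload = build_payload("Hello", maxTokens=1024)
print(payload["maxTokens"], payload["temperature"])  # 1024 0.8
```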

Example Input:

{
  "stop": "</s>",
  "topK": -1,
  "topP": 0.95,
  "prompt": "<|start_header_id|>system<|end_header_id|>\nYou are a helpful assistant. Perform the task to the best of your ability.<|eot_id|>\n<|start_header_id|>user<|end_header_id|>\nYou're standing on the surface of the Earth. You walk one mile south, one mile west and one mile north. You end up exactly where you started. Where are you?<|eot_id|>\n<|start_header_id|>assistant<|end_header_id|>\n",
  "maxTokens": 1024,
  "temperature": 0.8,
  "presencePenalty": 0,
  "frequencyPenalty": 0
}
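The prompt field in the example uses Llama-3's chat template tokens (<|start_header_id|>, <|end_header_id|>, <|eot_id|>). A small helper can assemble such a prompt from system and user messages so the token layout stays consistent; this sketch simply follows the layout shown in the example input above.

```python
def build_llama3_prompt(system: str, user: str) -> str:
    """Assemble a Llama-3 chat prompt matching the layout in the example input."""
    return (
        f"<|start_header_id|>system<|end_header_id|>\n{system}<|eot_id|>\n"
        f"<|start_header_id|>user<|end_header_id|>\n{user}<|eot_id|>\n"
        f"<|start_header_id|>assistant<|end_header_id|>\n"
    )

prompt = build_llama3_prompt(
    "You are a helpful assistant. Perform the task to the best of your ability.",
    "You're standing on the surface of the Earth. You walk one mile south, "
    "one mile west and one mile north. You end up exactly where you started. "
    "Where are you?",
)
# The prompt ends with the assistant header, cueing the model to respond.
print(prompt.endswith("<|start_header_id|>assistant<|end_header_id|>\n"))  # True
```

Ending the prompt with the assistant header is what signals the model to begin generating its reply, which is why the example input terminates at exactly that point.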

Output

The action typically returns a text output based on the input prompt.

Example Output:

"I would be back where I started!"

This output showcases the model's ability to understand and respond to complex prompts effectively.

Conceptual Usage Example (Python)

To use the Extend Context Length for Llama-3 Model action, you can follow this conceptual Python code snippet:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "032daa01-deed-4e63-abff-0fcbcf15444e" # Action ID for Extend Context Length for Llama-3 Model

# Construct the input payload based on the action's requirements
payload = {
    "stop": "</s>",
    "topK": -1,
    "topP": 0.95,
    "prompt": "<|start_header_id|>system<|end_header_id|>\nYou are a helpful assistant. Perform the task to the best of your ability.<|eot_id|>\n<|start_header_id|>user<|end_header_id|>\nYou're standing on the surface of the Earth. You walk one mile south, one mile west and one mile north. You end up exactly where you started. Where are you?<|eot_id|>\n<|start_header_id|>assistant<|end_header_id|>\n",
    "maxTokens": 1024,
    "temperature": 0.8,
    "presencePenalty": 0,
    "frequencyPenalty": 0
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},
        timeout=60  # avoid hanging indefinitely on a slow or unreachable endpoint
    )
    response.raise_for_status()

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # response body was not valid JSON
            print(f"Response body: {e.response.text}")

In this example, replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key. The action ID is set to the specific action you wish to execute, and the payload is structured according to the input schema outlined above. The endpoint URL and request structure are illustrative, focusing on how to format the input correctly.

Conclusion

The Llama-3 Cognitive Actions provide developers with powerful capabilities to enhance text generation and context handling in applications. By leveraging these actions, you can unlock new potential for your tools and services. Explore the various use cases, and consider integrating this functionality into your applications for improved performance and user engagement. Happy coding!