Generate Text with Llama-2-70B: A Guide to Upstage Cognitive Actions

23 Apr 2025

In the ever-evolving landscape of artificial intelligence, the Upstage/Llama-2-70B-instruct-v2 API gives developers access to advanced text generation. Built on the Llama-2-70B model with GPTQ quantization, it can produce high-quality, contextually appropriate responses tuned through a handful of sampling parameters. This blog post walks you through integrating this Cognitive Action into your applications.

Prerequisites

Before diving into the implementation, ensure you have the following:

  • An API key for the Cognitive Actions platform.
  • Familiarity with making API calls in your preferred programming language.
  • A basic understanding of JSON structure, as you'll be working with JSON payloads for input and output.

For authentication, you typically pass your API key in the headers of each request, which lets you securely access the Cognitive Actions services.
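As a minimal sketch, the header construction might look like the following. The `Bearer` scheme and header names follow a common convention; confirm the exact format in the platform's documentation before relying on it:

```python
# Hypothetical helper: build the headers sent with every Cognitive Actions request.
API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"  # placeholder, not a real key

def build_headers(api_key: str) -> dict:
    """Return request headers carrying the API key as a Bearer token."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

headers = build_headers(API_KEY)
```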

Cognitive Actions Overview

Generate Text with Llama-2-70B Instructions

Description: This action generates text using the Upstage/Llama-2-70B-instruct-v2 model, exposing sampling parameters such as Top P, Temperature, Max New Tokens, and Repetition Penalty. Its default system prompt steers the model toward respectful, safe, and unbiased output.

Category: Text Generation

Input

The input for this action consists of the following fields:

  • prompt (string, required): The input prompt for the model, e.g., "Tell me about AI".
  • systemPrompt (string, optional): Initial instructions guiding the AI's behavior. Default is a detailed prompt encouraging respectful and unbiased responses.
  • temperature (number, optional): Controls randomness in responses. Default is 0.75, with a range from 0 to 5.
  • topP (number, optional): The probability threshold for sampling tokens during text generation. Default is 0.95, ranging from 0.01 to 1.
  • maxNewTokens (integer, optional): The maximum number of tokens to generate beyond the prompt. Default is 512, allowing 1 to 2048 tokens.
  • repetitionPenalty (number, optional): Adjusts token repetition likelihood. Default is 1.1, with a range from 0 to 5.
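Catching out-of-range values client-side saves a round trip to the API. Here is a sketch of a validator built from the ranges documented above; the helper name is our own, not part of the API:

```python
# Validate a request payload against the documented parameter ranges.
RANGES = {
    "topP": (0.01, 1),
    "temperature": (0, 5),
    "maxNewTokens": (1, 2048),
    "repetitionPenalty": (0, 5),
}

def validate_inputs(inputs: dict) -> list:
    """Return a list of validation errors; an empty list means the payload is valid."""
    errors = []
    if not inputs.get("prompt"):
        errors.append("prompt is required")
    for field, (lo, hi) in RANGES.items():
        value = inputs.get(field)
        if value is not None and not (lo <= value <= hi):
            errors.append(f"{field} must be between {lo} and {hi}")
    return errors
```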

Example Input:

{
  "topP": 0.95,
  "prompt": "Tell me about AI",
  "temperature": 0.75,
  "maxNewTokens": 512,
  "systemPrompt": "You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe...",
  "repetitionPenalty": 1.1
}

Output

The output is a text response generated by the model based on the provided prompt and parameters. An example output could be:

AI stands for Artificial Intelligence, which refers to the development of computer systems that can perform tasks that would normally require human intelligence...
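The exact response envelope is not documented here. Assuming the generated text comes back in a JSON body under an "output" key, extraction might look like this sketch; adjust the key to match the real schema:

```python
# Hypothetical: pull the generated text out of the action's JSON response.
def extract_text(result: dict) -> str:
    """Return the generated text, raising if the assumed field is missing."""
    output = result.get("output")
    if output is None:
        raise KeyError("no 'output' field in response")
    return output

sample = {"output": "AI stands for Artificial Intelligence, which refers to..."}
print(extract_text(sample))
```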

Conceptual Usage Example (Python)

Here's a conceptual Python code snippet to illustrate how to call the Cognitive Action:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "60c1df9f-c2a3-46ed-b0dd-d3b2a6d29a77"  # Action ID for Generate Text with Llama-2-70B Instructions

# Construct the input payload based on the action's requirements
payload = {
    "topP": 0.95,
    "prompt": "Tell me about AI",
    "temperature": 0.75,
    "maxNewTokens": 512,
    "systemPrompt": "You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe...",
    "repetitionPenalty": 1.1
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload}  # Hypothetical structure
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # body was not valid JSON (covers json/requests JSONDecodeError)
            print(f"Response body: {e.response.text}")

In this code snippet, replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key. The action_id corresponds to the "Generate Text with Llama-2-70B Instructions" action, and the payload is constructed based on the required input schema. The endpoint URL and request structure are illustrative.

Conclusion

The Upstage/Llama-2-70B-instruct-v2 Cognitive Action provides a powerful tool for developers looking to integrate advanced text generation capabilities into their applications. By customizing parameters like temperature, top P, and repetition penalty, you can generate responses that are not only contextually relevant but also safe and respectful.
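One practical pattern for that customization is to keep named presets for different use cases. The preset names and values below are illustrative choices within the documented ranges, not recommendations from the API itself; lower temperature and top P narrow sampling toward likely tokens, while higher values increase variety:

```python
# Illustrative sampling presets within the documented parameter ranges.
PRESETS = {
    "factual": {"temperature": 0.2, "topP": 0.9, "repetitionPenalty": 1.1},
    "creative": {"temperature": 1.2, "topP": 0.95, "repetitionPenalty": 1.05},
}

def make_payload(prompt: str, preset: str = "factual", max_new_tokens: int = 512) -> dict:
    """Merge a named preset with the caller's prompt into a request payload."""
    return {"prompt": prompt, "maxNewTokens": max_new_tokens, **PRESETS[preset]}
```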

Explore various use cases, from chatbots to content generation, and leverage the potential of this cognitive action to enhance user experiences. Happy coding!