Create Stunning Videos from Text Prompts with fofr/damo-text2video Cognitive Actions

23 Apr 2025
In today's digital landscape, video content is king. The fofr/damo-text2video API offers developers an innovative way to generate dynamic videos from textual prompts. These pre-built Cognitive Actions allow you to create engaging visual content quickly and efficiently. Whether you're building a creative application or looking to enhance user engagement, integrating these capabilities can take your project to the next level.

Prerequisites

Before diving into the integration, ensure you have the following:

  • An API key for the Cognitive Actions platform.
  • Basic knowledge of making API requests, particularly in JSON format.
  • Familiarity with Python (or another language of your choice); the conceptual example below uses Python for the API calls.

To authenticate your requests, you'll typically include your API key in the request headers, allowing you to access the Cognitive Actions capabilities securely.
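As a quick sketch, the header construction typically looks like the following (the Bearer scheme here matches the usage example later in this post; replace the placeholder key with your own):

```python
# Placeholder -- substitute your real Cognitive Actions API key.
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"

# Headers sent with every request: the API key as a Bearer token,
# plus a JSON content type for the request body.
headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json",
}
```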

Cognitive Actions Overview

Generate Video from Text Prompt

The Generate Video from Text Prompt action transforms a textual description into a video. This feature supports customization through various parameters, allowing developers to tailor the video output to their needs.

  • Category: Video Generation
  • Description: This operation generates a video based on a textual prompt by using a customizable number of frames and frame rate settings. Advanced video generation allows for tailored video length and smoothness, as well as seed-based reproducibility.

Input

To invoke this action, you'll need to structure your input according to the following schema:

{
  "prompt": "An astronaut riding a llama",
  "numberOfFrames": 16,
  "framesPerSecond": 8,
  "numberOfInferenceSteps": 50,
  "seed": 12345
}

  • prompt (string): Describes the desired output video content (e.g., "An astronaut riding a llama").
  • numberOfFrames (integer): Total frames in the video (default: 16).
  • framesPerSecond (integer): Frame rate of the video (default: 8).
  • numberOfInferenceSteps (integer): Steps used during the denoising process, affecting output quality (default: 50, maximum: 500, minimum: 1).
  • seed (integer, optional): Random seed for reproducibility (omit the field for a random seed).
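The constraints above can be checked client-side before sending a request. The helper below is a minimal sketch (not part of the API itself) that applies the documented defaults and bounds, omits seed when none is given, and computes the resulting clip duration (frames ÷ frames per second):

```python
def build_video_input(prompt, number_of_frames=16, frames_per_second=8,
                      number_of_inference_steps=50, seed=None):
    """Assemble and sanity-check an input payload for the
    Generate Video from Text Prompt action (illustrative helper)."""
    if not prompt:
        raise ValueError("prompt must be a non-empty string")
    if not 1 <= number_of_inference_steps <= 500:
        raise ValueError("numberOfInferenceSteps must be between 1 and 500")
    payload = {
        "prompt": prompt,
        "numberOfFrames": number_of_frames,
        "framesPerSecond": frames_per_second,
        "numberOfInferenceSteps": number_of_inference_steps,
    }
    if seed is not None:  # omit the field entirely for a random seed
        payload["seed"] = seed
    return payload

payload = build_video_input("An astronaut riding a llama")
# 16 frames at 8 fps -> a 2.0 second clip
duration_seconds = payload["numberOfFrames"] / payload["framesPerSecond"]
```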

Example Input

Here is an example JSON payload that could be used to call this action:

{
  "prompt": "An astronaut riding a llama",
  "numberOfFrames": 16,
  "framesPerSecond": 8,
  "numberOfInferenceSteps": 50
}

Output

Upon successful execution, the action returns a URL to the generated video:

"https://assets.cognitiveactions.com/invocations/52a92770-3166-479e-9a0e-7350752552a8/5a0b8707-13b2-49ba-a081-343605151053.mp4"

This URL points to the video created based on your prompt, which you can then use in your applications.
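Because the result is a plain MP4 URL, you can fetch it like any other HTTP resource. The sketch below streams the file to disk with the requests library; `filename_from_url` is a small hypothetical helper that derives a local filename from the URL's last path segment:

```python
import requests
from pathlib import PurePosixPath
from urllib.parse import urlparse

def filename_from_url(url):
    """Derive a local filename from the last segment of the URL path."""
    return PurePosixPath(urlparse(url).path).name

def download_video(url, dest=None):
    """Stream the generated MP4 to disk and return the local path."""
    dest = dest or filename_from_url(url)
    with requests.get(url, stream=True, timeout=60) as resp:
        resp.raise_for_status()
        with open(dest, "wb") as fh:
            for chunk in resp.iter_content(chunk_size=8192):
                fh.write(chunk)
    return dest
```

For the example output above, `download_video(...)` would save the clip as 5a0b8707-13b2-49ba-a081-343605151053.mp4 in the working directory.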

Conceptual Usage Example (Python)

Here’s how you might implement this action in Python:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "98cb9d5f-d7b9-4adc-b043-c359a9de8f5c"  # Action ID for Generate Video from Text Prompt

# Construct the input payload based on the action's requirements
payload = {
    "prompt": "An astronaut riding a llama",
    "numberOfFrames": 16,
    "framesPerSecond": 8,
    "numberOfInferenceSteps": 50
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload}  # Hypothetical structure
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # body was not valid JSON
            print(f"Response body: {e.response.text}")

In this example, replace "YOUR_COGNITIVE_ACTIONS_API_KEY" with your actual API key. The action_id identifies the specific action being executed, and payload holds the structured JSON input required by the action. The try/except block surfaces both transport failures and unsuccessful HTTP status codes, printing the response body when one is available.
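The exact shape of the response body is not documented here, so the key holding the video URL is an assumption. A defensive extraction step might look like this (the "output" key and the mocked response are both hypothetical):

```python
def extract_video_url(result):
    """Pull the generated video URL out of the action response.
    The 'output' key is a guess -- adapt it to the real response shape."""
    output = result.get("output")
    if isinstance(output, str) and output.endswith(".mp4"):
        return output
    raise KeyError("could not find a video URL in the response")

# Example with a mocked response body:
mocked = {"output": "https://assets.cognitiveactions.com/example.mp4"}
video_url = extract_video_url(mocked)
```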

Conclusion

The fofr/damo-text2video Cognitive Actions provide developers with powerful tools to create videos from simple text prompts. This capability opens up a world of possibilities for enhancing user engagement and delivering unique content. By integrating these actions into your applications, you can streamline video creation and offer dynamic experiences.

Consider exploring additional use cases, such as generating educational content, promotional videos, or even interactive storytelling, to fully leverage the potential of video generation in your projects. Happy coding!