Create Stunning Videos from Text Descriptions with Hunyuan Video

25 Apr 2025

Video can capture attention and convey a message in ways that static images and text often cannot. The Hunyuan Video service offers developers Cognitive Actions that generate videos directly from text descriptions. Turning a written prompt into a clip shortens the production process and leaves more room for creative iteration. Whether you're creating marketing materials, educational content, or entertainment, Hunyuan Video lets you produce polished videos quickly.

Prerequisites

To get started with Hunyuan Video, you will need a Cognitive Actions API key and a basic understanding of how to make API calls.
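
One common way to keep the key out of source code is to read it from an environment variable. The sketch below assumes a variable named COGNITIVE_ACTIONS_API_KEY; that name is an illustrative choice for this example, not something the service is known to prescribe.

```python
import os


def load_api_key(env_var: str = "COGNITIVE_ACTIONS_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    The default variable name is an assumption made for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key.")
    return key
```

Failing early with a clear message tends to be easier to debug than an authentication error returned by the API later on.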

Generate Video from Text

The "Generate Video from Text" action is at the heart of the Hunyuan Video service. This action allows users to create high-quality videos with realistic motion based on text descriptions, making it an invaluable tool for content creators and developers alike.

Purpose

This action lowers the barrier to video production by generating videos from simple text prompts. It supports a range of resolutions and is designed for stable motion and temporal consistency across frames, resulting in visually appealing content.

Input Requirements

To use this action, you will need to provide several parameters:

  • Prompt: A text description that guides the video generation process. (e.g., "A cat walks on the grass, realistic style")
  • Width: The width of the video in pixels. It must be a multiple of 16 between 16 and 1280 (default is 864).
  • Height: The height of the video in pixels, subject to the same multiple-of-16 and range rules as the width (default is 480).
  • Video Length: The total number of frames in the video. It must be of the form 4k+1 (e.g., 49 or 129), with a minimum of 1 and a maximum of 200 (default is 129).
  • Frames Per Second: The number of frames displayed per second (default is 24).
  • Inference Steps: Defines the number of denoising steps in the video generation process, with a minimum of 1 (default is 50).
  • Embedded Guidance Scale: Controls the balance between guidance and model creativity, ranging from 1 to 10 (default is 6).
  • Seed: An optional seed for randomization (leave empty for a random seed).
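
The constraints above can be checked locally before a request is sent. The following is a minimal sketch, not part of any official SDK: the function names are hypothetical, the payload keys are copied from the example request in this article, and the "seed" key name and the positive-fps check are assumptions.

```python
def validate_video_params(width=864, height=480, frame_count=129,
                          fps=24, steps=50, guidance=6):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    for name, value in (("width", width), ("height", height)):
        if not 16 <= value <= 1280 or value % 16 != 0:
            problems.append(f"{name} must be a multiple of 16 between 16 and 1280")
    if not 1 <= frame_count <= 200 or frame_count % 4 != 1:
        problems.append("frame_count must be of the form 4k+1, between 1 and 200")
    if fps <= 0:
        # The article gives no range for fps; requiring a positive value is an assumption.
        problems.append("fps must be positive")
    if steps < 1:
        problems.append("steps must be at least 1")
    if not 1 <= guidance <= 10:
        problems.append("guidance must be between 1 and 10")
    return problems


def build_payload(prompt, width=864, height=480, frame_count=129,
                  fps=24, steps=50, guidance=6, seed=None):
    """Validate the values and build a payload shaped like the article's example."""
    problems = validate_video_params(width, height, frame_count, fps, steps, guidance)
    if problems:
        raise ValueError("; ".join(problems))
    payload = {
        "prompt": prompt,
        "width": width,
        "height": height,
        "videoFrameCount": frame_count,
        "framesPerSecond": fps,
        "inferenceSteps": steps,
        "embeddedGuidanceScale": guidance,
    }
    if seed is not None:
        payload["seed"] = seed  # key name assumed; omit it to get a random seed
    return payload


def clip_seconds(frame_count, fps):
    """Playback duration implied by the frame count and frame rate."""
    return frame_count / fps
```

With the defaults, clip_seconds(129, 24) gives 5.375, so a default request describes a clip a little over five seconds long.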

Expected Output

The output will be a URL link to the generated video, allowing users to easily access and share their creations. For example:

https://assets.cognitiveactions.com/invocations/fe7b700a-c5d0-4e03-a7f2-b684b011ec69/19521e2a-4916-4a6b-b00c-de648164242f.mp4
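
Once you have the URL, you may want to save the file locally. The sketch below uses only the Python standard library and assumes the URL can be fetched without extra authentication headers, which this article does not confirm; both helper names are hypothetical.

```python
import os
import posixpath
import shutil
import urllib.request
from urllib.parse import urlparse


def filename_from_url(video_url: str, default: str = "output.mp4") -> str:
    """Use the last path segment of the URL as the local file name."""
    name = posixpath.basename(urlparse(video_url).path)
    return name or default


def download_video(video_url: str, dest_dir: str = ".") -> str:
    """Stream the video into dest_dir and return the path that was written."""
    dest_path = os.path.join(dest_dir, filename_from_url(video_url))
    with urllib.request.urlopen(video_url, timeout=60) as response, \
            open(dest_path, "wb") as out_file:
        # Copy in chunks so the whole video is never held in memory at once.
        shutil.copyfileobj(response, out_file)
    return dest_path
```

For the example URL above, filename_from_url returns "19521e2a-4916-4a6b-b00c-de648164242f.mp4".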

Use Cases for this Action

  • Marketing and Advertising: Create promotional videos that capture product features or brand stories in a visually compelling way.
  • Education and Training: Generate engaging educational content that illustrates concepts through dynamic visuals, enhancing learning experiences.
  • Entertainment: Develop short films or animated stories based on narrative prompts, giving creators a new medium to express their ideas.
  • Social Media Content: Quickly produce eye-catching videos for social media platforms to boost engagement and reach.
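
The following Python example sends the request using the requests library. The endpoint URL is hypothetical; replace it and the API key placeholder with your own values.

import requests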
import json

# Replace with your actual Cognitive Actions API key and endpoint
# Ensure your environment securely handles the API key
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
# This endpoint URL is hypothetical and should be documented for users
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"

action_id = "25b386dc-2af6-403a-af2a-a97d20267894" # Action ID for: Generate Video from Text

# Build the input payload for the action
# These values mirror the defaults listed under Input Requirements:
payload = {
  "width": 864,
  "height": 480,
  "prompt": "A cat walks on the grass, realistic style",
  "inferenceSteps": 50,
  "framesPerSecond": 24,
  "videoFrameCount": 129,
  "embeddedGuidanceScale": 6
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json",
    # Add any other required headers for the Cognitive Actions API
}

# Prepare the request body for the hypothetical execution endpoint
request_body = {
    "action_id": action_id,
    "inputs": payload
}

print(f"--- Calling Cognitive Action: {action_id} ---")
print(f"Endpoint: {COGNITIVE_ACTIONS_EXECUTE_URL}")
print(f"Action ID: {action_id}")
print("Payload being sent:")
print(json.dumps(request_body, indent=2))
print("------------------------------------------------")

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json=request_body
    )
    response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully. Result:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # raised when the error body is not valid JSON
            print(f"Response body (non-JSON): {e.response.text}")
    print("------------------------------------------------")

Conclusion

Hunyuan Video gives developers a way to turn text into video content, simplifying the video creation process while leaving room for creativity. With customizable resolution, frame count, frame rate, inference steps, and guidance scale, this action suits content creators in marketing, education, entertainment, and social media. Start exploring Hunyuan Video today and take your video projects to the next level!