Create Stunning Videos from Text Descriptions with Text2video Zero

26 Apr 2025

Generating engaging video content quickly is valuable for many teams. Text2video Zero turns simple text descriptions into videos without requiring extensive video datasets or complex training processes. The service builds on text-to-image synthesis techniques, so developers can create visual narratives with minimal effort.

Imagine the possibilities: from crafting promotional videos for products to generating dynamic educational content or even storytelling through visuals, the applications are numerous. Text2video Zero simplifies content creation, enabling you to focus on your creative vision rather than the technical hurdles often associated with video production.

Prerequisites

To get started with Text2video Zero, you'll need a Cognitive Actions API key and a basic understanding of how to make API calls.
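
One common way to keep the key out of source code is to read it from an environment variable. The sketch below assumes a variable named COGNITIVE_ACTIONS_API_KEY; that name and the helpers load_api_key and build_headers are illustrative choices for this article, not something the service prescribes.

```python
import os


def load_api_key(env_var: str = "COGNITIVE_ACTIONS_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    The variable name is an assumption made for this sketch.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before calling the API."
        )
    return api_key


def build_headers(api_key: str) -> dict:
    """Build bearer-token JSON headers for an API request."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Failing early with a clear message when the variable is missing tends to be easier to debug than an authentication error returned by the server.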

Generate Video from Text Zero-Shot

This action uses the Picsart Text2Video-Zero model to generate videos from text prompts. Because generation is zero-shot, it does not depend on large-scale video datasets or heavy computational resources for training, which makes it accessible to developers at any level.

Input Requirements

The action accepts a CompositeRequest object containing the following parameters:

  • Prompt: A descriptive text that guides the video content (e.g., "a beautiful sunset, clouds").
  • Seed: Optional random seed; set it to get reproducible results.
  • Chunk Size: Number of frames processed at once (range: 1-10; default: 8).
  • Resolution: Output video resolution (default: 512 pixels).
  • End Timestep: Ending timestep for DDPM steps (default: 47).
  • Video Length: Total number of frames in the video (default: 8).
  • Merging Ratio: Compression ratio for merged tokens (range: 0-0.9; default: 0).
  • Start Timestep: Starting timestep for DDPM steps (default: 44).
  • Negative Prompt: Elements to avoid in the video.
  • Frames Per Second: Frame rate for the video (range: 5-60; default: 15).
  • Motion Field Strength X: Motion field strength along the X-axis (default: 12).
  • Motion Field Strength Y: Motion field strength along the Y-axis (default: 12).
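
The ranges and defaults listed above can be checked on the client before a request is sent. The following is a minimal sketch, not part of any official SDK: build_payload and clip_duration_seconds are hypothetical helpers, the key names seed, negativePrompt, and mergingRatio are guesses that follow the camelCase pattern of the other fields, and the check that the video has at least one frame is a sanity check rather than a documented limit.

```python
from typing import Optional


def build_payload(
    prompt: str,
    *,
    seed: Optional[int] = None,
    chunk_size: int = 8,
    resolution: int = 512,
    start_timestep: int = 44,
    end_timestep: int = 47,
    video_length: int = 8,
    merging_ratio: float = 0.0,
    negative_prompt: Optional[str] = None,
    frames_per_second: int = 15,
    motion_field_strength_x: int = 12,
    motion_field_strength_y: int = 12,
) -> dict:
    """Validate the documented ranges and return a request payload."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if not 1 <= chunk_size <= 10:
        raise ValueError("chunk_size must be between 1 and 10")
    if not 0 <= merging_ratio <= 0.9:
        raise ValueError("merging_ratio must be between 0 and 0.9")
    if not 5 <= frames_per_second <= 60:
        raise ValueError("frames_per_second must be between 5 and 60")
    if video_length < 1:  # Sanity check, not a documented limit
        raise ValueError("video_length must be at least 1")

    payload = {
        "prompt": prompt,
        "chunkSize": chunk_size,
        "resolution": resolution,
        "endTimestep": end_timestep,
        "videoLength": video_length,
        "startTimestep": start_timestep,
        "framesPerSecond": frames_per_second,
        "motionFieldStrengthX": motion_field_strength_x,
        "motionFieldStrengthY": motion_field_strength_y,
    }
    # Optional fields are sent only when set. These three key names are
    # assumptions that mirror the camelCase style of the fields above.
    if seed is not None:
        payload["seed"] = seed
    if negative_prompt:
        payload["negativePrompt"] = negative_prompt
    if merging_ratio:
        payload["mergingRatio"] = merging_ratio
    return payload


def clip_duration_seconds(payload: dict) -> float:
    """Playback length: number of frames divided by the frame rate."""
    return payload["videoLength"] / payload["framesPerSecond"]
```

For instance, a video length of 20 frames played at 8 frames per second lasts 2.5 seconds, while the defaults (8 frames at 15 frames per second) give a clip of roughly half a second.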

Expected Output

The action returns a link to the generated video, which can be opened or downloaded directly. For example, a successful invocation may return a URL such as https://assets.cognitiveactions.com/invocations/14cf9e92-8284-4209-9e31-5794c4f6e24e/6e7c0041-8864-4a55-97cd-15a5410c3ffd.mp4
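
This article does not specify how the link is wrapped in the JSON response, so code that consumes the result may need to be tolerant of the exact shape. The sketch below walks any decoded JSON value and collects strings that look like .mp4 URLs; find_video_urls is a hypothetical helper written for illustration, not part of the service.

```python
from typing import Any, List
from urllib.parse import urlparse


def find_video_urls(result: Any) -> List[str]:
    """Collect every http(s) URL ending in .mp4 inside a decoded JSON value."""
    found: List[str] = []

    def walk(value: Any) -> None:
        if isinstance(value, str):
            try:
                parsed = urlparse(value)
            except ValueError:
                return  # Not a parseable URL, so ignore it
            is_web_url = parsed.scheme in ("http", "https")
            if is_web_url and parsed.path.lower().endswith(".mp4"):
                found.append(value)
        elif isinstance(value, dict):
            for item in value.values():
                walk(item)
        elif isinstance(value, list):
            for item in value:
                walk(item)

    walk(result)
    return found
```

Calling find_video_urls on the decoded response body returns a list of candidate links, which will be empty if the response contains no video URL.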

Use Cases for this Action

  • Marketing Content: Quickly create promotional videos for products based on descriptive text, enhancing marketing campaigns.
  • Education and Training: Generate instructional videos that visually explain concepts using simple text prompts.
  • Creative Storytelling: Transform narratives into visual stories, allowing creators to share ideas in an engaging format.
  • Social Media Content: Produce eye-catching videos for platforms like Instagram or TikTok, leveraging trending topics or themes.
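
The following Python example shows how this action might be invoked. The endpoint URL and request structure are illustrative.

import requests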
import json

# Replace with your actual Cognitive Actions API key and endpoint
# Ensure your environment securely handles the API key
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
# This endpoint URL is hypothetical; check the Cognitive Actions documentation for the actual one
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"

action_id = "1bc0069d-30aa-4704-b20a-16bc758f09b6" # Action ID for: Generate Video from Text Zero-Shot

# Construct the exact input payload based on the action's requirements
# This example uses the predefined example_input for this action:
payload = {
  "prompt": "a beautiful sunset, clouds",
  "chunkSize": 10,
  "resolution": 512,
  "endTimestep": 47,
  "videoLength": 20,
  "startTimestep": 44,
  "framesPerSecond": 8,
  "motionFieldStrengthX": 12,
  "motionFieldStrengthY": 12
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json",
    # Add any other required headers for the Cognitive Actions API
}

# Prepare the request body for the hypothetical execution endpoint
request_body = {
    "action_id": action_id,
    "inputs": payload
}

print("--- Calling Cognitive Action: Generate Video from Text Zero-Shot ---")
print(f"Endpoint: {COGNITIVE_ACTIONS_EXECUTE_URL}")
print(f"Action ID: {action_id}")
print("Payload being sent:")
print(json.dumps(request_body, indent=2))
print("------------------------------------------------")

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json=request_body,
        timeout=300,  # Assumed limit in seconds; without it the call can wait indefinitely
    )
    response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully. Result:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Covers every JSON decode error variant that requests can raise
            print(f"Response body (non-JSON): {e.response.text}")
    print("------------------------------------------------")

Conclusion

Text2video Zero lets developers turn text prompts into videos through an API call, without assembling video datasets or training a model. Whether you’re looking to enhance marketing strategies, develop educational materials, or create engaging social media content, it is a useful addition to your content creation toolkit.

Ready to elevate your projects with video? Start experimenting with Text2video Zero today!