Transform Static Images into Motion with Video Inference Cognitive Actions

25 Apr 2025

In the realm of video generation and transformation, the sebi75/video-inference API offers cutting-edge Cognitive Actions that empower developers to create dynamic video content from static images. With the latest advancements in Stable Video Diffusion, these pre-built actions allow for high-quality video transformations, enabling unique use cases ranging from artistic video effects to practical applications in marketing and content creation.

Prerequisites

Before diving into the integration of these Cognitive Actions, ensure you have the following:

  • An API key for the Cognitive Actions platform to authenticate your requests.
  • Basic familiarity with JSON and Python for structuring your API calls.

Authentication typically involves passing your API key in the headers of your HTTP requests, ensuring secure and authorized access to the API.
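As a minimal sketch, assuming a bearer-token scheme (the exact header name and scheme may differ for your deployment, so check the platform documentation), the request headers could be assembled with a small helper:

```python
# Sketch of authenticated request headers for the Cognitive Actions API.
# The "Bearer" scheme shown here is an assumption; verify against your platform docs.

def build_headers(api_key: str) -> dict:
    """Return HTTP headers carrying the API key for authorized JSON requests."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

headers = build_headers("YOUR_COGNITIVE_ACTIONS_API_KEY")
```

Keeping header construction in one place makes it easy to later load the key from an environment variable instead of hard-coding it.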

Cognitive Actions Overview

Generate Stable Video Diffusion

Description:
This action implements the latest Stable Video Diffusion model, allowing developers to perform inference on Replicate. It transforms static images into motion by utilizing parameters such as motion buckets, frame numbers, and sizing strategies.

Category: Video Generation

Input

The input for the Generate Stable Video Diffusion action requires various parameters to ensure proper functionality. Below is a breakdown of the required and optional fields:

  • inputImage (required): A URI path to the input image file or directory containing image files.
    Example: "https://replicate.delivery/pbxt/JwbPVJ24A2Igxw2g0NJ1b3LM4XTXj4GvEYpuUOU958IMlt4o/test_image.png"
  • seed (optional): An integer used for random number generation, defaulting to 23.
    Example: 23
  • version (optional): Specifies the version of the model, default is 'svd'.
    Example: "svd_xt"
  • motionBucketId (optional): Identifier for the motion bucket used in video processing, with a default value of 127.
    Example: 127
  • numberOfFrames (optional): Total number of frames to process.
    Example: 25
  • sizingStrategy (optional): Determines the resizing approach for the input image. Options include maintaining aspect ratio, cropping to 16:9, or using original image dimensions. Default is 'maintain_aspect_ratio'.
    Example: "crop_to_16_9"
  • decodingThreshold (optional): Sets the number of frames decoded at a time, influencing VRAM usage. Default is 14.
    Example: 7
  • framesPerSecondId (optional): Specifies frames per second for video playback; must be between 5 and 30. Default is 6.
    Example: 6
  • conditionAugmentation (optional): Amount of noise augmentation applied to the conditioning image; higher values loosen how closely the output follows the input frame. Default value is 0.02.
    Example: 0.02
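The constraints above (a required inputImage, an fps between 5 and 30, and the documented defaults) can be checked client-side before a request is sent, catching invalid payloads early. A minimal sketch; the helper name validate_inputs and the client-side check itself are illustrative, not part of the API:

```python
# Documented defaults from the parameter list above.
# numberOfFrames has no documented default, so it is left to the caller.
DEFAULTS = {
    "seed": 23,
    "version": "svd",
    "motionBucketId": 127,
    "sizingStrategy": "maintain_aspect_ratio",
    "decodingThreshold": 14,
    "framesPerSecondId": 6,
    "conditionAugmentation": 0.02,
}

def validate_inputs(payload: dict) -> dict:
    """Merge documented defaults into the payload and enforce the fps range."""
    if "inputImage" not in payload:
        raise ValueError("inputImage is required")
    merged = {**DEFAULTS, **payload}
    fps = merged["framesPerSecondId"]
    if not 5 <= fps <= 30:
        raise ValueError("framesPerSecondId must be between 5 and 30")
    return merged
```

Merging defaults locally also makes the final payload explicit in your logs, which helps when debugging unexpected outputs.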

Example Input

Here’s how the input JSON payload might look for this action:

{
  "seed": 23,
  "version": "svd_xt",
  "inputImage": "https://replicate.delivery/pbxt/JwbPVJ24A2Igxw2g0NJ1b3LM4XTXj4GvEYpuUOU958IMlt4o/test_image.png",
  "motionBucketId": 127,
  "numberOfFrames": 25,
  "sizingStrategy": "crop_to_16_9",
  "decodingThreshold": 7,
  "framesPerSecondId": 6,
  "conditionAugmentation": 0.02
}

Output

The action typically returns a URL to the generated video file. For instance:

Example Output:
"https://assets.cognitiveactions.com/invocations/392f14a1-d975-48e0-9d16-b540c3f374c3/5e647275-ddf3-4890-97e7-222f8931ef36.mp4"

Conceptual Usage Example (Python)

Here is a conceptual Python code snippet that demonstrates how to call the Generate Stable Video Diffusion action:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "1c3bfc7a-f011-4045-95e8-847fa50625d2"  # Action ID for Generate Stable Video Diffusion

# Construct the input payload based on the action's requirements
payload = {
    "seed": 23,
    "version": "svd_xt",
    "inputImage": "https://replicate.delivery/pbxt/JwbPVJ24A2Igxw2g0NJ1b3LM4XTXj4GvEYpuUOU958IMlt4o/test_image.png",
    "motionBucketId": 127,
    "numberOfFrames": 25,
    "sizingStrategy": "crop_to_16_9",
    "decodingThreshold": 7,
    "framesPerSecondId": 6,
    "conditionAugmentation": 0.02
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload}  # Hypothetical structure
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except json.JSONDecodeError:
            print(f"Response body: {e.response.text}")

In this code, replace the placeholder API key with your actual key. The input payload is constructed according to the specifications, and the action ID is provided to execute the Generate Stable Video Diffusion action.

Conclusion

The Cognitive Actions provided by the sebi75/video-inference API open up exciting possibilities for developers looking to create engaging video content from static images. By leveraging the Generate Stable Video Diffusion action, you can easily implement video transformations that enhance your applications. Whether for creative projects or commercial use, these tools provide a powerful way to innovate in the video generation space. Start experimenting with these actions today to unlock their full potential!