Streamline Video Transformations with fofr/lcm-video2video Cognitive Actions

22 Apr 2025

In multimedia applications, the ability to manipulate and enhance video content is essential. The fofr/lcm-video2video spec provides a Cognitive Action designed for fast, efficient video-to-video conversion. Built on a latent consistency model (LCM), this action lets developers transform videos based on text prompts while maintaining both quality and speed. In this article, we will look at this Cognitive Action in detail, covering its capabilities, input requirements, and how to integrate it into your applications.

Prerequisites

Before you can start using the Cognitive Actions, you need to have a few things in place:

  • An API key for the Cognitive Actions platform to authenticate your requests.
  • Familiarity with making HTTP requests, as you will be sending requests to a Cognitive Actions endpoint.

Conceptually, authentication works by including the API key in the request headers; without it, requests to the Cognitive Actions endpoints will be rejected.
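As a minimal sketch, the headers might be built like this (the exact header name and scheme are assumptions based on the common Bearer-token pattern; consult the platform documentation for the authoritative format):

```python
# Hypothetical header construction for the Cognitive Actions platform.
API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"

def build_auth_headers(api_key: str) -> dict:
    """Return request headers carrying the API key as a Bearer token."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

headers = build_auth_headers(API_KEY)
```

These headers are reused verbatim in the full request example later in this article.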

Cognitive Actions Overview

Perform Fast Video-to-Video Conversion

The Perform Fast Video-to-Video Conversion action allows for quick transformations of videos guided by text prompts. This action is particularly useful for applications that require high-quality video editing capabilities without sacrificing performance.

Input

The input schema for this action requires several fields, with video being mandatory:

  • video: (string) The URI of the video to be processed.
    Example: "https://replicate.delivery/pbxt/Jm6KWurXwbMM19Oyu5novOS5wUD0Y8kOQsrlIEpogFC4dSBr/image%20%281%29.mp4"
  • fps: (integer, default: 8) Frames per second for the video export.
  • seed: (integer) Random seed for generating results.
  • prompt: (string, default: "Self-portrait oil painting...") Text prompt guiding the conversion process.
  • maxWidth: (integer, default: 512) Maximum width of the output video.
  • controlnet: (string, default: "none") Type of ControlNet to be used.
  • returnFrames: (boolean, default: false) Whether to return a tar file containing all frames.
  • guidanceScale: (number, default: 8) Controls the strength of classifier-free guidance.
  • promptStrength: (number, default: 0.2) Determines the influence of the prompt on the final output.
  • extractAllFrames: (boolean, default: false) If true, extracts every frame from the video.
  • cannyLowThreshold: (number, default: 100) Lower threshold for the Canny edge detection.
  • numInferenceSteps: (integer, default: 4) Number of denoising steps to perform on each frame.
  • cannyHighThreshold: (number, default: 200) Higher threshold for the Canny edge detection.
  • controlGuidanceEnd: (number, default: 1) Point to end ControlNet guidance.
  • controlGuidanceStart: (number, default: 0) Starting point for ControlNet guidance.
  • controlnetConditioningScale: (number, default: 2) Adjusts the conditioning scale for ControlNet application.
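Since only video is mandatory, a small helper can fill in the documented defaults so callers override just what they need. This is a sketch, not part of the spec: the field names and default values below simply mirror the schema above (the prompt default is elided in the schema, so it is left to the caller here).

```python
# Documented defaults from the input schema above; `video` is the
# only mandatory field, and `prompt`/`seed` are left to the caller.
DEFAULTS = {
    "fps": 8,
    "maxWidth": 512,
    "controlnet": "none",
    "returnFrames": False,
    "guidanceScale": 8,
    "promptStrength": 0.2,
    "extractAllFrames": False,
    "cannyLowThreshold": 100,
    "numInferenceSteps": 4,
    "cannyHighThreshold": 200,
    "controlGuidanceEnd": 1,
    "controlGuidanceStart": 0,
    "controlnetConditioningScale": 2,
}

def build_payload(video: str, **overrides) -> dict:
    """Merge the schema defaults with caller-supplied overrides."""
    return {**DEFAULTS, "video": video, **overrides}

# Example: only override the fields that differ from the defaults.
payload = build_payload(
    "https://example.com/input.mp4",  # illustrative URI
    prompt="stars and nebula",
    fps=16,
)
```

Overrides are applied last in the merge, so an explicit fps=16 wins over the default of 8.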

Here is how a typical input JSON payload looks:

{
  "fps": 16,
  "seed": 2000,
  "video": "https://replicate.delivery/pbxt/Jm6KWurXwbMM19Oyu5novOS5wUD0Y8kOQsrlIEpogFC4dSBr/image%20%281%29.mp4",
  "prompt": "stars and nebula",
  "maxWidth": 512,
  "controlnet": "none",
  "returnFrames": false,
  "guidanceScale": 6,
  "promptStrength": 0.6,
  "extractAllFrames": true,
  "cannyLowThreshold": 100,
  "numInferenceSteps": 4,
  "cannyHighThreshold": 200,
  "controlGuidanceEnd": 1,
  "controlGuidanceStart": 0,
  "controlnetConditioningScale": 2
}

Output

The action typically returns a processed video as a URL, which can be accessed directly after the conversion is complete. Here’s an example output:

[
  "https://assets.cognitiveactions.com/invocations/10e1a3e3-f1a4-4115-a687-58ef9415b34e/b8481742-ee19-4975-bbff-b3770bbcb62f.mp4"
]
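Because the output is a JSON array of URLs, client code can pull out the first entry and derive a local filename for downloading. A minimal sketch, using the example output above:

```python
# The action returns a list of URLs; typically there is a single entry.
output = [
    "https://assets.cognitiveactions.com/invocations/10e1a3e3-f1a4-4115-a687-58ef9415b34e/b8481742-ee19-4975-bbff-b3770bbcb62f.mp4"
]

video_url = output[0]                     # URL of the converted video
filename = video_url.rsplit("/", 1)[-1]   # last path segment, handy for saving locally
```

From here, the file can be fetched with any HTTP client (for example, requests.get(video_url)) and written to disk under filename.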

Conceptual Usage Example (Python)

Below is a conceptual Python code snippet that demonstrates how a developer might invoke the Perform Fast Video-to-Video Conversion action using a hypothetical Cognitive Actions execution endpoint.

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "d8b9deb0-7640-4c82-af2c-fa80d5d3b45f" # Action ID for Perform Fast Video-to-Video Conversion

# Construct the input payload based on the action's requirements
payload = {
    "fps": 16,
    "seed": 2000,
    "video": "https://replicate.delivery/pbxt/Jm6KWurXwbMM19Oyu5novOS5wUD0Y8kOQsrlIEpogFC4dSBr/image%20%281%29.mp4",
    "prompt": "stars and nebula",
    "maxWidth": 512,
    "controlnet": "none",
    "returnFrames": false,
    "guidanceScale": 6,
    "promptStrength": 0.6,
    "extractAllFrames": true,
    "cannyLowThreshold": 100,
    "numInferenceSteps": 4,
    "cannyHighThreshold": 200,
    "controlGuidanceEnd": 1,
    "controlGuidanceStart": 0,
    "controlnetConditioningScale": 2
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload} # Hypothetical structure
    )
    response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except json.JSONDecodeError:
            print(f"Response body: {e.response.text}")

In this snippet, action_id identifies the video conversion action, and the input payload follows the schema defined above. Replace the API key and endpoint URL with actual values before running it. Note that in Python the boolean fields must be written as False and True; the lowercase false/true forms are valid only in raw JSON.

Conclusion

The Perform Fast Video-to-Video Conversion action under the fofr/lcm-video2video spec opens up a wide array of possibilities for developers looking to integrate intelligent video transformations into their applications. With its ability to quickly process videos while maintaining quality, this action is an essential tool for any multimedia project. Consider exploring other potential use cases or combining this action with other Cognitive Actions to create even more sophisticated applications. Happy coding!