Add Karaoke-Style Captions to Your Videos with fictions-ai/autocaption

24 Apr 2025

Captions make video content more accessible and easier to follow. The fictions-ai/autocaption API exposes Cognitive Actions that let developers add automated captioning to their video projects. One of these actions adds karaoke-style captions, with configurable styling and an optional transcript that you can edit or reuse. This article walks through how to call that action.

Prerequisites

Before diving into the implementation, ensure you have the following:

  • An API key for the Cognitive Actions platform.
  • Familiarity with making API calls and handling JSON data.
  • Python installed on your local machine, along with the requests library, to run the provided code snippet.

To authenticate your requests, include your API key in the request headers. The usage example in this article passes it as a Bearer token in the Authorization header.
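
As a rough sketch of how those headers might be assembled, the helper below reads the key from an environment variable instead of hard-coding it. The variable name COGNITIVE_ACTIONS_API_KEY and the function name build_headers are assumptions made for illustration; the Bearer scheme mirrors the conceptual usage example and is not confirmed platform behavior.

```python
import os


def build_headers(api_key=None):
    """Return request headers carrying the API key as a Bearer token."""
    # Fall back to a (hypothetical) environment variable when no key is passed in.
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError(
            "No API key found; pass api_key or set COGNITIVE_ACTIONS_API_KEY."
        )
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Keeping the key out of source code makes it harder to leak through version control.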

Cognitive Actions Overview

Add Karaoke-Style Captions

The Add Karaoke-Style Captions action automatically overlays karaoke-style captions onto a video. You can configure the font, colors, and position of the captions, and optionally translate them to English. The action also generates a transcript, which is useful for editing or reuse.

Input

The action takes a JSON object with the following fields:

  • videoFileInput (required): URI of the video file to which captions will be added.
  • font (optional): Specifies the font used for the captions (default: Poppins/Poppins-ExtraBold.ttf).
  • color (optional): The color of the captions (default: white).
  • kerning (optional): Adjusts the spacing between characters (default: -5).
  • opacity (optional): Opacity level for the subtitles background (default: 0).
  • fontSize (optional): Font size for subtitles (default: 7).
  • translate (optional): Translates subtitles to English if set to true (default: false).
  • outputVideo (optional): Outputs the video with embedded subtitles if set to true (default: true).
  • rightToLeft (optional): Use right-to-left subtitles for applicable languages (default: false).
  • strokeColor (optional): Outline color of the subtitles (default: black).
  • strokeWidth (optional): Width of the outline around subtitles (default: 2.6).
  • maxCharacters (optional): Maximum characters per line (default: 20).
  • highlightColor (optional): Color used to highlight certain captions (default: yellow).
  • outputTranscript (optional): Outputs a transcript file if set to true (default: true).
  • subtitlesPosition (optional): Position of subtitles within the video frame (default: bottom75).
  • transcriptFileInput (optional): URI of a pre-generated transcript file (if provided).
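
To make the field list concrete, here is a minimal sketch of a payload builder that starts from the documented defaults and rejects misspelled field names before anything is sent. The names DEFAULTS, OPTIONAL_WITHOUT_DEFAULT, and build_payload are hypothetical helpers, not part of the API; only the field names and default values come from the list above.

```python
# Optional fields and their defaults, as listed above.
DEFAULTS = {
    "font": "Poppins/Poppins-ExtraBold.ttf",
    "color": "white",
    "kerning": -5,
    "opacity": 0,
    "fontSize": 7,
    "translate": False,
    "outputVideo": True,
    "rightToLeft": False,
    "strokeColor": "black",
    "strokeWidth": 2.6,
    "maxCharacters": 20,
    "highlightColor": "yellow",
    "outputTranscript": True,
    "subtitlesPosition": "bottom75",
}

# Optional fields for which no default is documented.
OPTIONAL_WITHOUT_DEFAULT = {"transcriptFileInput"}


def build_payload(video_url, **overrides):
    """Merge caller overrides onto the documented defaults."""
    if not video_url:
        raise ValueError("videoFileInput is required")
    # Catch typos such as 'fontsize' early instead of sending them to the API.
    unknown = set(overrides) - set(DEFAULTS) - OPTIONAL_WITHOUT_DEFAULT
    if unknown:
        raise ValueError(f"Unknown input field(s): {sorted(unknown)}")
    return {**DEFAULTS, **overrides, "videoFileInput": video_url}
```

For example, build_payload("https://example.com/clip.mp4", fontSize=9, translate=True) yields a full payload with only those two values changed from their defaults.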

Example Input:

{
  "color": "white",
  "opacity": 0,
  "fontSize": 7,
  "outputVideo": true,
  "maxCharacters": 20,
  "highlightColor": "yellow",
  "videoFileInput": "https://replicate.delivery/pbxt/K5zuJ6HCdsffhegX0JZwDl10qm7fYAh5txe0FZc7XFccpdtm/kingnobelbig.mp4",
  "outputTranscript": true,
  "subtitlesPosition": "bottom75"
}

Output

On success, the action returns a JSON array of URIs. With both outputVideo and outputTranscript enabled, as in the example input, the array contains:

  1. A URI to the video with embedded captions.
  2. A URI to the generated transcript file in JSON format.

Example Output:

[
  "https://assets.cognitiveactions.com/invocations/c4180712-e629-4bc3-88f6-124432214fdd/1a1437dc-b262-46ae-887d-c0426d6af624.mp4",
  "https://assets.cognitiveactions.com/invocations/c4180712-e629-4bc3-88f6-124432214fdd/44b69320-40ce-429f-8ec7-1c080d32a050.json"
]
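
Since the output is a plain list of URIs, it can help to label each entry before using it. The sketch below sorts URIs by file extension; it assumes the video ends in .mp4 and the transcript in .json, as in the example output, and split_outputs is a hypothetical helper name.

```python
import posixpath
from urllib.parse import urlparse


def split_outputs(uris):
    """Label returned URIs as 'video' or 'transcript' by file extension."""
    result = {"video": None, "transcript": None}
    for uri in uris:
        # Inspect only the URL path so query strings do not affect the extension.
        ext = posixpath.splitext(urlparse(uri).path)[1].lower()
        if ext == ".json":
            result["transcript"] = uri
        elif ext == ".mp4":
            result["video"] = uri
    return result
```

An entry stays None when the corresponding output is missing, for instance if outputVideo or outputTranscript was set to false.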

Conceptual Usage Example (Python)

Below is a conceptual Python code snippet demonstrating how to call the Add Karaoke-Style Captions action through a hypothetical Cognitive Actions endpoint.

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "7c620133-a7cd-42ed-873b-b77c8550f613"  # Action ID for Add Karaoke-Style Captions

# Construct the input payload based on the action's requirements
payload = {
    "color": "white",
    "opacity": 0,
    "fontSize": 7,
    "outputVideo": True,
    "maxCharacters": 20,
    "highlightColor": "yellow",
    "videoFileInput": "https://replicate.delivery/pbxt/K5zuJ6HCdsffhegX0JZwDl10qm7fYAh5txe0FZc7XFccpdtm/kingnobelbig.mp4",
    "outputTranscript": True,
    "subtitlesPosition": "bottom75"
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=300  # Avoid hanging indefinitely; adjust to your expected processing time
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # JSON decode errors from requests subclass ValueError
            print(f"Response body: {e.response.text}")

In this code snippet:

  • Replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key.
  • The payload variable holds the action's input fields; only videoFileInput is required.
  • The request goes to a hypothetical execution endpoint, and the JSON response is printed.
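
Once you have the result URIs, you will usually want to save the files locally. The following is a minimal sketch using only the standard library; it assumes you have already extracted the list of URIs from the response (the response shape is hypothetical) and that the URIs can be fetched without extra authentication. The helper names local_name and download_outputs are illustrative.

```python
import os
import posixpath
import shutil
import urllib.request
from urllib.parse import urlparse


def local_name(uri, dest_dir="."):
    """Derive a local file path from the last path segment of a URI."""
    name = posixpath.basename(urlparse(uri).path)
    if not name:
        raise ValueError(f"Cannot derive a file name from {uri!r}")
    return os.path.join(dest_dir, name)


def download_outputs(uris, dest_dir="."):
    """Download each URI into dest_dir and return the saved paths."""
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for uri in uris:
        path = local_name(uri, dest_dir)
        # Stream to disk so large videos are not held in memory.
        with urllib.request.urlopen(uri, timeout=60) as resp, open(path, "wb") as fh:
            shutil.copyfileobj(resp, fh)
        saved.append(path)
    return saved
```

Calling download_outputs(uris, "captions_out") would write the captioned video and the transcript into a captions_out directory.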

Conclusion

The Add Karaoke-Style Captions action in the fictions-ai/autocaption API lets developers add customizable, automated captions to video content, which can improve accessibility and audience engagement. Experiment with the styling options to tailor captions to your needs, and consider use cases such as educational content or social media posts. Happy coding!