Generate Automatic Subtitles with the razvandrl/subtitler Cognitive Action

Video is now a core medium for communication and storytelling, but it is not equally accessible to everyone, particularly non-native speakers and viewers who are deaf or hard of hearing. The razvandrl/subtitler API addresses this with a Cognitive Action that automatically generates subtitles from video files. Integrating this action lets developers make video content in their applications more accessible and easier to follow.
Prerequisites
Before you start integrating the razvandrl/subtitler Cognitive Actions, ensure you have the following:
- An API key for the Cognitive Actions platform.
- Basic knowledge of working with APIs and JSON.
- Familiarity with Python for testing the integration.
Authentication typically involves passing the API key in the request headers, allowing your application to securely interact with the Cognitive Actions services.
Cognitive Actions Overview
Generate Subtitles from Video
Purpose
This action transcribes the audio track of a specified video file and returns timed subtitles. Transcription runs in batches, and the batch size is configurable, so you can tune how many audio samples are processed at once for videos in various formats.
Input
The input for this action consists of a JSON object with the following fields:
- file (required): A URI pointing to the video file location.
- batchSize (optional): An integer defining the number of audio samples to process concurrently during transcription. The default is 32.
Example Input:
{
  "file": "https://replicate.delivery/pbxt/KPkWPhskZlqgphzzvsh116zazIqDYcxleNkWuq12grfcTqAF/Two%202-minute%20Rules%20to%20Beat%20Procrastination%20%28in%202%20minutes%29.mp4",
  "batchSize": 32
}
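The field rules above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the action's API: the helper name `build_subtitler_input` is hypothetical, and the requirement that `file` be an http(s) URL is an assumption (the field description only says "URI").

```python
from urllib.parse import urlparse

DEFAULT_BATCH_SIZE = 32  # default stated in the field list above


def build_subtitler_input(file_url: str, batch_size: int = DEFAULT_BATCH_SIZE) -> dict:
    """Return an input object for the action, rejecting obviously bad values.

    Hypothetical helper: the http(s) check is an assumption, since the
    action's documentation only requires a URI.
    """
    parsed = urlparse(file_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"file must be an http(s) URI, got {file_url!r}")
    # bool is a subclass of int, so exclude it explicitly
    if isinstance(batch_size, bool) or not isinstance(batch_size, int) or batch_size < 1:
        raise ValueError(f"batchSize must be a positive integer, got {batch_size!r}")
    return {"file": file_url, "batchSize": batch_size}
```

Calling `build_subtitler_input("https://example.com/video.mp4")` returns an object with `batchSize` set to the default of 32, while a malformed URL or a non-positive batch size raises `ValueError` before any network call is made.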
Output
The output of this action is a JSON array containing subtitle entries. Each entry includes:
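- start: The start time of the subtitle segment (the example values suggest seconds from the beginning of the video).
- end: The end time of the subtitle segment, in the same unit as start.
- text: The subtitle text for the segment.
- words: An array of the individual words in the segment, each with its own start and end times and a confidence score.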
Example Output:
[
  {
    "start": 0.171,
    "end": 7.617,
    "text": " After reading tons of productivity books, I came across so many rules like the two year rule, the five minute rule, the five second rule.",
    "words": [
      {"word": "After", "start": 0.171, "end": 0.351, "score": 0.84},
      {"word": "reading", "start": 0.371, "end": 0.611, "score": 0.769},
      // Additional words...
    ]
  },
  // Additional subtitles...
]
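The word-level scores can help decide which parts of a transcript deserve a manual check. The sketch below is illustrative only: the function name `low_confidence_words` and the default threshold of 0.5 are assumptions, and it presumes that a higher score means higher confidence, as the example values suggest.

```python
def low_confidence_words(entries, threshold=0.5):
    """Return (word, start, score) tuples for words scoring below threshold.

    Assumes each entry follows the output shape shown above. The threshold
    is an arbitrary illustration, not a value recommended by the action.
    """
    flagged = []
    for entry in entries:
        for word in entry.get("words", []):
            score = word.get("score")
            # Skip words without a score rather than guessing a value
            if score is not None and score < threshold:
                flagged.append((word["word"], word.get("start"), score))
    return flagged
```

With the two words from the example output and a threshold of 0.8, only "reading" (score 0.769) would be flagged.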
Conceptual Usage Example (Python)
Here’s a conceptual Python snippet demonstrating how to invoke the Generate Subtitles from Video action:
import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "6f65f314-ace8-4f07-a0eb-a8119738c2f5"  # Action ID for Generate Subtitles from Video

# Construct the input payload based on the action's requirements
payload = {
    "file": "https://replicate.delivery/pbxt/KPkWPhskZlqgphzzvsh116zazIqDYcxleNkWuq12grfcTqAF/Two%202-minute%20Rules%20to%20Beat%20Procrastination%20%28in%202%20minutes%29.mp4",
    "batchSize": 32
}

# The API key is passed as a bearer token in the request headers
headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload}
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)
    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))
except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Body was not valid JSON; fall back to raw text
            print(f"Response body: {e.response.text}")
In this example, replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key. The code constructs an input payload according to the action’s requirements and sends a POST request to the hypothetical execution endpoint.
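Once a result is available, a common next step is to write the subtitle entries to an SRT file that video players can load. The following is a rough sketch under two assumptions: that the subtitle array described in the Output section has already been extracted from the response (the exact response envelope of the hypothetical endpoint is not documented here), and that start and end are expressed in seconds. The function names are illustrative.

```python
def format_srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(entries: list) -> str:
    """Convert subtitle entries (start, end, text) into SRT-formatted text.

    Assumes start and end are in seconds, which the example output suggests.
    """
    blocks = []
    for index, entry in enumerate(entries, start=1):
        start = format_srt_timestamp(entry["start"])
        end = format_srt_timestamp(entry["end"])
        text = entry["text"].strip()  # the example text has a leading space
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    # Cues are separated by a blank line
    return "\n".join(blocks)
```

The returned string can be saved with, for example, `open("subtitles.srt", "w", encoding="utf-8").write(to_srt(entries))`. Word-level timings are ignored in this sketch, since basic SRT cues carry only segment-level times.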
Conclusion
The razvandrl/subtitler Cognitive Action offers a straightforward way to generate timed subtitles from video files. Integrating it into your applications makes your content more accessible and opens it to a wider audience.
Next steps might include tuning batchSize for your workload or combining this action with others on the Cognitive Actions platform to extend your application's capabilities.