Transform Your Music with the All-in-One Audio Cognitive Actions

23 Apr 2025

The erickluis00/all-in-one-audio Cognitive Actions give developers a pre-built way to analyze the structure of a piece of music and split its audio into separate stems. With a single request you can automate work that would otherwise require running several models by hand, and get back both the separated audio and the analysis results for your music content.

Prerequisites

To get started with the Cognitive Actions, you will need:

  • An API key for the Cognitive Actions platform to authenticate your requests.
  • A valid URL for the audio file you wish to analyze, ensuring it meets any format requirements.

Authentication typically involves passing the API key in the request headers, allowing you to securely access the functionalities provided by the Cognitive Actions.
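
As a minimal sketch of the header-based authentication described above, the helper below builds the request headers from an API key. The Bearer scheme matches the conceptual example later in this article; the function name build_headers and the environment variable COGNITIVE_ACTIONS_API_KEY are assumptions made for illustration, not documented parts of the platform.

```python
import os


def build_headers(api_key=None):
    """Return request headers that carry the API key as a Bearer token.

    Falls back to the COGNITIVE_ACTIONS_API_KEY environment variable
    (a hypothetical name chosen for this sketch) when no key is passed.
    """
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError("No API key supplied and COGNITIVE_ACTIONS_API_KEY is not set")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source control; passing it explicitly remains possible for tests or scripts.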

Cognitive Actions Overview

Analyze Music Structure and Split Stems

This action analyzes the structure of a piece of music and separates the audio into stems using models such as Demucs and MDX-Net. Optional flags control whether sonifications and visualizations are saved and whether embeddings and activations are included in the results.

  • Category: audio-processing

Input

The input for this action is a JSON object with various parameters:

{
  "model": "harmonix-all",
  "sonify": true,
  "visualize": true,
  "musicInput": "https://replicate.delivery/pbxt/KVM8Zd9OOQ9SqVRnCY0unu2NL0bL9X5drWpGVSsNMItoEmbR/Bruno%20Mars%20-%2024k%20Magic.mp3",
  "audioSeparator": true,
  "includeEmbeddings": false,
  "includeActivations": false,
  "audioSeparatorModel": "Kim_Vocal_2.onnx"
}
  • Required Fields:
    • musicInput: A URI pointing to the audio file to analyze (must be a valid URL).
  • Optional Fields:
    • model: Specifies which pretrained model to use (default is "harmonix-all").
    • sonify: Indicates whether to save the sonifications (default is false).
    • visualize: Indicates whether to save visualizations (default is false).
    • audioSeparator: Determines if the audio should be separated into vocals and instrumental parts (default is false).
    • includeEmbeddings: Specifies whether to include embeddings in the results (default is false).
    • includeActivations: Specifies whether to include activations in the results (default is false).
    • audioSeparatorModel: Specifies the pretrained model for audio separation (default is "Kim_Vocal_2.onnx").
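
The field list above can be turned into a small payload builder. This is only a sketch: the defaults are copied from the list, while the helper name build_payload, the rejection of unknown option names, and the restriction to http/https URLs are assumptions made here rather than rules stated by the platform.

```python
from urllib.parse import urlparse

# Defaults as listed in the optional fields above.
DEFAULTS = {
    "model": "harmonix-all",
    "sonify": False,
    "visualize": False,
    "audioSeparator": False,
    "includeEmbeddings": False,
    "includeActivations": False,
    "audioSeparatorModel": "Kim_Vocal_2.onnx",
}


def build_payload(music_input, **options):
    """Build the input object: musicInput is required, the rest is optional."""
    parsed = urlparse(music_input)
    # Assumption: only absolute http(s) URLs count as valid here.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"musicInput must be an http(s) URL, got {music_input!r}")
    unknown = set(options) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"Unknown option(s): {sorted(unknown)}")
    payload = {"musicInput": music_input}
    payload.update(DEFAULTS)   # start from the documented defaults
    payload.update(options)    # then apply the caller's overrides
    return payload
```

For example, build_payload("https://example.com/song.mp3", audioSeparator=True) would request vocal/instrumental separation and leave every other option at its default.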

Output

The action typically returns a JSON object that includes various audio outputs and other analysis results:

{
  "mdx_other": [],
  "mdx_vocals": "https://assets.cognitiveactions.com/invocations/97d08c2c-b937-494e-be0b-a93f2c3ffec6/491c87e3-9e83-45ad-a749-81d807a78524.wav",
  "demucs_bass": "https://assets.cognitiveactions.com/invocations/97d08c2c-b937-494e-be0b-a93f2c3ffec6/8d5c91c3-76ef-4323-8ea3-22f90265bd91.wav",
  "demucs_drums": "https://assets.cognitiveactions.com/invocations/97d08c2c-b937-494e-be0b-a93f2c3ffec6/1593bf05-59a8-4935-ac07-728e1e41730b.wav",
  "demucs_other": "https://assets.cognitiveactions.com/invocations/97d08c2c-b937-494e-be0b-a93f2c3ffec6/5459c5d0-127e-45ef-a02e-69f3131e315e.wav",
  "demucs_piano": null,
  "sonification": "https://assets.cognitiveactions.com/invocations/97d08c2c-b937-494e-be0b-a93f2c3ffec6/302e5bbe-4762-414f-917d-b3c3e1a3b04e.mp3",
  "demucs_guitar": null,
  "demucs_vocals": "https://assets.cognitiveactions.com/invocations/97d08c2c-b937-494e-be0b-a93f2c3ffec6/7c6b7892-7c35-49a2-bd4f-90307217ad76.wav",
  "visualization": "https://assets.cognitiveactions.com/invocations/97d08c2c-b937-494e-be0b-a93f2c3ffec6/e664b18e-98fb-414e-b0fd-ce0034cd27ba.png",
  "analyzer_result": "https://assets.cognitiveactions.com/invocations/97d08c2c-b937-494e-be0b-a93f2c3ffec6/4df40439-3d9e-46f3-9b77-2726330eec83.json",
  "mdx_instrumental": "https://assets.cognitiveactions.com/invocations/97d08c2c-b937-494e-be0b-a93f2c3ffec6/e5a21f7e-c451-4729-b2bc-e7dc9c27598a.wav"
}
  • Output Fields:
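    • Stem fields prefixed with demucs_ or mdx_ (for example mdx_vocals, mdx_instrumental, demucs_bass, demucs_drums), each holding a link to a separated audio file. As the sample shows, a stem that was not produced may come back as null or an empty list (here demucs_piano, demucs_guitar, and mdx_other).
    • sonification and visualization: Links to the sonification audio and the visualization image.
    • analyzer_result: A link to a JSON file containing the analyzer results.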
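
Because some stem fields in the sample output are null or an empty list, client code will usually want to filter them out before downloading anything. The sketch below does that; the helper name collect_stems and the example.com URLs are illustrative assumptions, and the prefix check is inferred from the sample output rather than from a documented schema.

```python
def collect_stems(result):
    """Return {field_name: url} for every stem that actually has a link.

    Stem fields are recognized by the demucs_ and mdx_ prefixes seen in
    the sample output; null values and empty lists are skipped.
    """
    stems = {}
    for name, value in result.items():
        if name.startswith(("demucs_", "mdx_")) and isinstance(value, str) and value:
            stems[name] = value
    return stems


# Shortened stand-in for a real response (URLs are placeholders).
sample = {
    "mdx_other": [],
    "mdx_vocals": "https://example.com/mdx_vocals.wav",
    "demucs_bass": "https://example.com/demucs_bass.wav",
    "demucs_piano": None,
    "visualization": "https://example.com/visualization.png",
}

for name, url in sorted(collect_stems(sample).items()):
    print(f"{name}: {url}")
```

Non-stem fields such as visualization and analyzer_result are deliberately left out so they can be handled separately.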

Conceptual Usage Example (Python)

Here's how you could invoke this action using Python:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "528e4e28-1ea4-4403-bf71-002ea2ca93b9" # Action ID for Analyze Music Structure and Split Stems

# Construct the input payload based on the action's requirements
payload = {
    "model": "harmonix-all",
    "sonify": True,
    "visualize": True,
    "musicInput": "https://replicate.delivery/pbxt/KVM8Zd9OOQ9SqVRnCY0unu2NL0bL9X5drWpGVSsNMItoEmbR/Bruno%20Mars%20-%2024k%20Magic.mp3",
    "audioSeparator": True,
    "includeEmbeddings": False,
    "includeActivations": False,
    "audioSeparatorModel": "Kim_Vocal_2.onnx"
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload}, # Hypothetical structure
        timeout=300 # Arbitrary limit so a stalled request cannot hang forever
    )
    response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError: # Base class of every JSON decode error that response.json() can raise
            print(f"Response body: {e.response.text}")

In this code snippet:

  • Replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key.
  • The action_id is set to the ID of the "Analyze Music Structure and Split Stems" action.
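  • The payload variable holds the required musicInput field plus the optional settings shown in the input example.
  • The endpoint URL and the request body structure are hypothetical; on success the JSON response is pretty-printed, and on failure the status code and response body are printed.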

Conclusion

The erickluis00/all-in-one-audio Cognitive Actions combine music structure analysis and stem separation in a single call. Integrating them into your application lets you work with separated stems, sonifications, visualizations, and analysis data without running the underlying models yourself. Experiment with the options in the input schema to find the combination that best fits your audio processing needs. Happy coding!