Generate Captivating Music with the pphu/musicgen-small Cognitive Actions

22 Apr 2025

Personalized audio content is changing how people engage with media. The pphu/musicgen-small API offers a Cognitive Action that lets developers generate music from short text prompts, using an auto-regressive Transformer model. In this blog post, we look at what the Generate Music from Prompt action can do, walk through its input and output specifications, and provide a conceptual example of how to integrate it into your own applications.

Prerequisites

Before diving into the integration of the Cognitive Actions, ensure you have the following:

  • An API key for accessing the Cognitive Actions platform.
  • A basic understanding of JSON and RESTful APIs.

Authentication typically involves passing your API key in the headers of your requests, allowing you to securely interact with the platform.

Cognitive Actions Overview

Generate Music from Prompt

The Generate Music from Prompt action allows developers to create music based on a descriptive text prompt. This action employs a MusicGen model that can produce diverse musical compositions of up to 30 seconds in length. The model is trained on licensed datasets to deliver high-quality audio outputs.

Input

The input for this action is structured as follows:

  • seed (integer, optional): A seed for the random number generator. If set to -1 or not provided, a random seed will be used.
  • topK (integer, optional): Limits sampling to the K most probable tokens. Default is 250.
  • topP (number, optional): Limits sampling to the most probable tokens whose cumulative probability reaches P. When set to 0, top K sampling is used instead. Default is 0.
  • prompt (string, required): A descriptive text prompt for the music generation.
  • duration (integer, optional): Duration of the generated audio in seconds (maximum 30 seconds). Default is 8 seconds.
  • temperature (number, optional): Controls how much variability the sampling process allows. Higher values produce more diverse output, lower values more conservative output. Default is 1.
  • outputFormat (string, optional): Format of the generated audio output. Options are 'wav' and 'mp3'. Default is 'wav'.
  • normalizationStrategy (string, optional): Method for normalizing the audio. Options include 'loudness', 'clip', 'peak', and 'rms'. Default is 'loudness'.
  • classifierFreeGuidance (integer, optional): Controls how strongly the prompt steers the output. Higher values make the output follow the prompt more closely. Default is 3.
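
To make topK, topP, and temperature more concrete, here is a toy sketch of how such filters are commonly applied to a model's token scores before sampling. It is an illustration of the general technique only, not the model's actual code; the function names filter_logits and sample_token are hypothetical, and it assumes temperature is greater than 0.

```python
import math
import random


def filter_logits(logits, top_k=250, top_p=0.0, temperature=1.0):
    """Return {token_index: probability} after temperature scaling and filtering.

    Illustrative only: mirrors the documented rule that top-k is used
    when top_p is 0, and top-p (nucleus) filtering otherwise.
    """
    # Temperature scaling: values above 1 flatten the distribution,
    # values below 1 sharpen it.
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Token indices ordered from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    if top_p > 0:
        # Keep the smallest prefix whose cumulative probability reaches top_p.
        kept, cumulative = [], 0.0
        for i in ranked:
            kept.append(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
    else:
        # Keep only the top_k most probable tokens.
        kept = ranked[:top_k]

    # Renormalize so the kept probabilities sum to 1.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


def sample_token(logits, rng=None, **kwargs):
    """Draw one token index from the filtered distribution."""
    rng = rng or random.Random()
    dist = filter_logits(logits, **kwargs)
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]
```

With the defaults listed above (topK 250, topP 0), only the top-k branch runs; setting topP to a value such as 0.9 would switch to the cumulative-probability branch.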

Here is an example of the JSON payload for invoking this action:

{
  "topK": 250,
  "topP": 0,
  "prompt": "Edo25 major g melodies that sound triumphant and cinematic. Leading up to a crescendo that resolves in a 9th harmonic",
  "duration": 20,
  "temperature": 1,
  "outputFormat": "wav",
  "normalizationStrategy": "peak",
  "classifierFreeGuidance": 3
}
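
Before sending a payload like this, it can help to check it against the documented constraints on the client side. The following is a minimal sketch, not part of the platform: validate_inputs is a hypothetical helper, and the lower bound of 1 second for duration is an assumption, since only the 30-second maximum is documented.

```python
ALLOWED_FORMATS = {"wav", "mp3"}
ALLOWED_NORMALIZATION = {"loudness", "clip", "peak", "rms"}
MAX_DURATION_SECONDS = 30


def validate_inputs(inputs):
    """Return a list of problems found in the payload (empty if none)."""
    problems = []

    # prompt is the only required field.
    prompt = inputs.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt: required, must be a non-empty string")

    # duration defaults to 8 and may not exceed 30 seconds.
    duration = inputs.get("duration", 8)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(duration, bool) or not isinstance(duration, int):
        problems.append("duration: must be an integer")
    elif not 1 <= duration <= MAX_DURATION_SECONDS:  # lower bound is an assumption
        problems.append(f"duration: must be between 1 and {MAX_DURATION_SECONDS}")

    if inputs.get("outputFormat", "wav") not in ALLOWED_FORMATS:
        problems.append("outputFormat: must be 'wav' or 'mp3'")

    if inputs.get("normalizationStrategy", "loudness") not in ALLOWED_NORMALIZATION:
        problems.append("normalizationStrategy: must be loudness, clip, peak, or rms")

    return problems
```

An empty list means the payload passed these checks. The server remains the final authority on what it accepts.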

Output

When the action is executed successfully, it returns the URL of the generated audio file. For instance, you might receive a response like:

https://assets.cognitiveactions.com/invocations/f415880e-47e9-4975-9804-7f154ea8b2c3/24fae0cf-7c11-4dba-ab9d-ba76681b2066.wav

This URL can be used to access and play the generated music file.
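
If you want to keep a local copy, the file behind that URL can be downloaded like any other. The sketch below uses only the Python standard library; filename_from_url and download_audio are hypothetical helper names, and it assumes the URL is publicly readable without extra authentication headers.

```python
import os
import posixpath
import shutil
import urllib.request
from urllib.parse import urlparse


def filename_from_url(url):
    """Extract the last path segment of the URL to use as a local file name."""
    return posixpath.basename(urlparse(url).path)


def download_audio(url, dest_dir=".", timeout=60):
    """Stream the audio file at `url` into dest_dir and return the local path.

    Assumes the URL needs no authentication; urlopen raises
    urllib.error.HTTPError on 4xx/5xx responses.
    """
    path = os.path.join(dest_dir, filename_from_url(url))
    with urllib.request.urlopen(url, timeout=timeout) as response:
        with open(path, "wb") as fh:
            # Copy in chunks so the whole file is never held in memory.
            shutil.copyfileobj(response, fh)
    return path
```

Calling download_audio with the URL returned by the action would save the file under its original name in the current directory.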

Conceptual Usage Example (Python)

Here’s how you might call the Generate Music from Prompt action using Python:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "fc8f664d-bc96-432b-990b-bea464c80391" # Action ID for Generate Music from Prompt

# Construct the input payload based on the action's requirements
payload = {
    "topK": 250,
    "topP": 0,
    "prompt": "Edo25 major g melodies that sound triumphant and cinematic. Leading up to a crescendo that resolves in a 9th harmonic",
    "duration": 20,
    "temperature": 1,
    "outputFormat": "wav",
    "normalizationStrategy": "peak",
    "classifierFreeGuidance": 3
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload}, # Hypothetical structure
        timeout=120 # Avoid hanging forever; generation can take a while, adjust as needed
    )
    response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError: # Body is not valid JSON (covers every JSON decode error requests can raise)
            print(f"Response body: {e.response.text}")

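In this code snippet, replace "YOUR_COGNITIVE_ACTIONS_API_KEY" with your actual API key. The payload variable follows the input schema described above. The endpoint URL and the request body structure (action_id and inputs) are hypothetical, so check them against the platform documentation before use. On success, the snippet prints the JSON response; on failure, it prints the error along with the response status code and body when they are available.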

Conclusion

The pphu/musicgen-small Cognitive Action opens up a world of possibilities for developers looking to integrate music generation into their applications. By leveraging the Generate Music from Prompt action, you can easily create tailored audio experiences that enhance user engagement. Experiment with various prompts and settings to discover the full potential of this innovative tool. Happy coding!