Create Stunning Images with the philipanheuser/philip Cognitive Actions

24 Apr 2025

The philipanheuser/philip spec offers Cognitive Actions for generating and customizing images, including image-to-image and inpainting workflows. These pre-built actions let you add image generation to your applications while controlling image inputs, aspect ratio, output format, output quality, and more.

Prerequisites

Before diving into the Cognitive Actions, ensure you have the following:

  • An API key for the Cognitive Actions platform, which will be used for authentication.
  • Familiarity with JSON format, as you'll be constructing and sending JSON payloads in API requests.

To authenticate, include your API key in the headers of each HTTP request. The Python example later in this article sends it as a Bearer token in the Authorization header.
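
As a minimal sketch of the authentication step, the helper below builds request headers from an API key. The Bearer scheme follows the Python example later in this article. The function name build_headers and the environment variable name COGNITIVE_ACTIONS_API_KEY are illustrative assumptions, not part of the platform.

```python
import os


def build_headers(api_key=None):
    """Return HTTP headers carrying the API key as a Bearer token."""
    # Fall back to an environment variable (assumed name) so the key
    # does not have to be hard-coded in source files.
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError(
            "No API key provided; pass api_key or set COGNITIVE_ACTIONS_API_KEY"
        )
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of version control; passing api_key explicitly is convenient for quick experiments.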

Cognitive Actions Overview

Generate Image with Inpainting and Customization

Description: This action allows you to create and customize images using inpainting techniques. You can provide various image and text inputs, and fine-tune settings such as aspect ratio, image format, and output quality. Two models, 'dev' and 'schnell', offer different speeds and quality levels.

Category: image-generation

Input

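Only the prompt field is required; every other field is optional. Below is the schema breakdown along with an example input. Note that the example input also sets a few fields the breakdown does not cover (mainLoraScale, additionalLoraScale, promptIntensity, approxMegapixels, and imageAspectRatio).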

  • prompt (string): A detailed textual prompt that guides the image generation.
  • mask (string, optional): An image mask for inpainting mode.
  • image (string, optional): An input image for the image-to-image or inpainting mode.
  • model (string, optional): Specifies the model ('dev' or 'schnell') to use for inference (default: "dev").
  • width (integer, optional): Width of the generated image.
  • height (integer, optional): Height of the generated image.
  • fastMode (boolean, optional): Run faster predictions (default: false).
  • imageFormat (string, optional): Format of the output images (default: "webp").
  • outputCount (integer, optional): Number of images to generate (default: 1).
  • guidanceLevel (number, optional): Defines the guidance scale for realism (default: 3).
  • outputQuality (integer, optional): Quality of output images (default: 80).
  • inferenceStepCount (integer, optional): Number of denoising steps (default: 28).
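
The field list above can be turned into a small payload builder. This is a sketch under stated assumptions: the defaults are copied from the list, the names build_inputs, DEFAULTS, and ALLOWED_MODELS are hypothetical, and the rule that a mask must come with an image is an assumption about inpainting rather than something the schema states. The schema only says image and mask are strings; whether they are URLs or encoded data is not specified here.

```python
ALLOWED_MODELS = {"dev", "schnell"}

# Defaults copied from the field list above.
DEFAULTS = {
    "model": "dev",
    "fastMode": False,
    "imageFormat": "webp",
    "outputCount": 1,
    "guidanceLevel": 3,
    "outputQuality": 80,
    "inferenceStepCount": 28,
}


def build_inputs(prompt, image=None, mask=None, **overrides):
    """Assemble an input payload: listed defaults first, then caller overrides."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt is required and must be a non-empty string")
    if mask is not None and image is None:
        # Assumption: a mask only makes sense alongside the image it applies to.
        raise ValueError("mask requires image")

    inputs = {**DEFAULTS, **overrides, "prompt": prompt}
    if inputs["model"] not in ALLOWED_MODELS:
        raise ValueError(f"model must be one of {sorted(ALLOWED_MODELS)}")

    # image and mask are added only when the caller provides them.
    if image is not None:
        inputs["image"] = image
    if mask is not None:
        inputs["mask"] = mask
    return inputs
```

For example, build_inputs("A portrait of Philip", model="schnell", imageFormat="png") returns a payload that uses the faster model and PNG output while keeping the other defaults.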

Example Input:

{
    "model": "dev",
    "prompt": "Philip in a light blue shirt, sleeves rolled up to the elbows, standing and leaning on a large wooden meeting table with both hands, looking seriously into the camera, modern meeting room with clean lines, glass walls, minimalistic furniture, soft natural light entering through floor-to-ceiling windows, only notebooks and water glasses on the table, candid and documentary feel, slightly desaturated tones, visible muscle tension in forearms, natural skin texture, slight imperfections, subtle stubble, muted background activity, handheld shot with off-center framing, soft depth of field, slight motion blur in background, ISO 400, Canon EOS R5, 50mm f/1.2 lens, warm daylight, light film grain, in the style of Peter Lindbergh\n",
    "fastMode": false,
    "imageFormat": "png",
    "outputCount": 1,
    "guidanceLevel": 3,
    "mainLoraScale": 1,
    "outputQuality": 80,
    "promptIntensity": 0.8,
    "approxMegapixels": "1",
    "imageAspectRatio": "1:1",
    "inferenceStepCount": 28,
    "additionalLoraScale": 1
}

Output

The action typically returns an array of image URLs in the requested format:

Example Output:

[
    "https://assets.cognitiveactions.com/invocations/dd7c3044-9871-4cb4-b89e-e8d287a02662/3a8d8c54-4b54-44a7-a880-7bb479a05d6b.png"
]
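
To work with that array, a caller will usually want a file name for each URL and perhaps a check that the extension matches the requested imageFormat. The sketch below does both with the standard library only. The function name describe_outputs is hypothetical, and the assumption that the extension always equals the requested format is based solely on the example above.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def describe_outputs(urls, expected_format=None):
    """Return (url, filename) pairs, optionally checking each file extension."""
    described = []
    for url in urls:
        # The last path segment of the URL is used as the local file name.
        filename = PurePosixPath(urlparse(url).path).name
        extension = PurePosixPath(filename).suffix.lstrip(".").lower()
        if expected_format and extension != expected_format.lower():
            raise ValueError(f"{url} is not a .{expected_format} file")
        described.append((url, filename))
    return described
```

Calling describe_outputs(result, expected_format="png") on the example output yields one pair whose file name is 3a8d8c54-4b54-44a7-a880-7bb479a05d6b.png.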

Conceptual Usage Example (Python)

Here’s a conceptual Python code snippet showing how to call the Cognitive Actions execution endpoint:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "5a105b40-2701-4502-a9eb-43530f2ec4f5" # Action ID for Generate Image with Inpainting and Customization

# Construct the input payload based on the action's requirements
payload = {
    "model": "dev",
    "prompt": "Philip in a light blue shirt, sleeves rolled up to the elbows, standing and leaning on a large wooden meeting table with both hands, looking seriously into the camera, modern meeting room with clean lines, glass walls, minimalistic furniture, soft natural light entering through floor-to-ceiling windows, only notebooks and water glasses on the table, candid and documentary feel, slightly desaturated tones, visible muscle tension in forearms, natural skin texture, slight imperfections, subtle stubble, muted background activity, handheld shot with off-center framing, soft depth of field, slight motion blur in background, ISO 400, Canon EOS R5, 50mm f/1.2 lens, warm daylight, light film grain, in the style of Peter Lindbergh\n",
    "fastMode": False,
    "imageFormat": "png",
    "outputCount": 1,
    "guidanceLevel": 3,
    "mainLoraScale": 1,
    "outputQuality": 80,
    "promptIntensity": 0.8,
    "approxMegapixels": "1",
    "imageAspectRatio": "1:1",
    "inferenceStepCount": 28,
    "additionalLoraScale": 1
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload}, # Hypothetical structure
        timeout=120, # Fail instead of hanging indefinitely; adjust to your needs
    )
    response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError: # Raised by response.json() when the body is not valid JSON
            print(f"Response body: {e.response.text}")

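In the code above, replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key. The action ID identifies the "Generate Image with Inpainting and Customization" action, and the payload matches the example input shown earlier. The endpoint URL and the request body structure are marked as hypothetical in the comments, so check them against the platform's documentation before use.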

Conclusion

The philipanheuser/philip Cognitive Actions let developers add image generation, including image-to-image and inpainting, to their applications. With the "Generate Image with Inpainting and Customization" action, you can produce images tailored to a specific prompt, output format, and quality setting.

Experiment with different prompts and with parameters such as model, guidanceLevel, and inferenceStepCount to see how they affect your results.