Create Stunning Images with Tencentarc's PhotoMaker Cognitive Actions

22 Apr 2025
The Tencentarc/PhotoMaker Cognitive Actions let developers generate customized photos, paintings, and avatars from photos of people, using the PhotoMaker model. Because the actions are pre-built, they streamline the image generation process and save development time.

Prerequisites

Before you can start using the Tencentarc/PhotoMaker Cognitive Actions, ensure you have the following:

  • An API key for the Cognitive Actions platform.
  • Basic familiarity with making HTTP requests and handling JSON data.

Authentication typically involves passing your API key in the request headers, allowing secure access to the Cognitive Actions services.
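
As a minimal sketch of the header-based authentication described above, the helper below reads the key from an environment variable and builds the request headers. The environment variable name COGNITIVE_ACTIONS_API_KEY, the function name build_headers, and the Bearer scheme are assumptions made for illustration; check the platform documentation for the exact header format.

```python
import os


def build_headers(api_key: str) -> dict:
    """Return HTTP headers carrying the API key (the Bearer scheme is an assumption)."""
    if not api_key:
        raise ValueError("API key is missing")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# Hypothetical environment variable name; a placeholder is used when it is unset or empty.
api_key = os.environ.get("COGNITIVE_ACTIONS_API_KEY") or "YOUR_COGNITIVE_ACTIONS_API_KEY"
headers = build_headers(api_key)
```

Keeping the key in the environment rather than in source code makes it harder to commit the secret by accident.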

Cognitive Actions Overview

Generate Styled Images

Description:
Create customized photos, paintings, and avatars in various styles from human images using the PhotoMaker model. This action focuses on speed, quality, and accuracy improvements for stylized image generation.

Category: Image Generation

Input

The input schema for this action accepts the following fields. Only inputImage is required; the rest are optional:

  • inputImage (required): The URI of the primary input image, typically a photo of a face.
  • prompt (optional): A textual description to guide the generation process. Place the token 'img' after the subject (as in "A girl img ...") to stand for the person's features from the input image.
  • styleName (optional): Select a style template to apply to the generated image (default is "(No style)").
  • guidanceScale (optional): Determines the influence of the prompt on the image (default is 5).
  • numberOfSteps (optional): The number of steps in the sampling process (default is 20).
  • negativePrompt (optional): Specify features to exclude from the generated image.
  • numberOfOutputs (optional): How many output images to generate (default is 1).
  • styleStrengthRatio (optional): Strength of the style application as a percentage (default is 20).
  • disableSafetyChecker (optional): If true, disables the safety checker for generated images (default is false).
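
The field list above can be turned into a small payload builder. This is a sketch, not part of the platform: the function name build_payload and its snake_case arguments are hypothetical, and it assumes that leaving an optional field out of the request makes the service fall back to the default listed above.

```python
from typing import Optional


def build_payload(
    input_image: str,
    prompt: Optional[str] = None,
    style_name: Optional[str] = None,
    guidance_scale: Optional[float] = None,
    number_of_steps: Optional[int] = None,
    negative_prompt: Optional[str] = None,
    number_of_outputs: Optional[int] = None,
    style_strength_ratio: Optional[float] = None,
    disable_safety_checker: Optional[bool] = None,
) -> dict:
    """Build the action input, omitting optional fields that were not set."""
    if not input_image:
        raise ValueError("inputImage is required")
    candidate = {
        "inputImage": input_image,
        "prompt": prompt,
        "styleName": style_name,
        "guidanceScale": guidance_scale,
        "numberOfSteps": number_of_steps,
        "negativePrompt": negative_prompt,
        "numberOfOutputs": number_of_outputs,
        "styleStrengthRatio": style_strength_ratio,
        "disableSafetyChecker": disable_safety_checker,
    }
    # Drop unset fields (assumption: the service then applies its own defaults).
    return {key: value for key, value in candidate.items() if value is not None}


payload = build_payload(
    "https://example.com/face.jpg",  # Placeholder image URL
    prompt="A girl img riding dragon over a whimsical castle",
    number_of_outputs=2,
)
```

Here payload contains only inputImage, prompt, and numberOfOutputs; every other field is left for the service to fill in.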

Example Input:

{
  "prompt": "A girl img riding dragon over a whimsical castle, 3D CGI, art by Pixar, half-body, screenshot from animation",
  "styleName": "(No style)",
  "inputImage": "https://replicate.delivery/pbxt/KFRveCbE71qFTQGSF509CXYC16qB1bcZmAWq8O172ael04Ga/lenna.jpg",
  "guidanceScale": 5,
  "numberOfSteps": 50,
  "negativePrompt": "realistic, photo-realistic, worst quality, greyscale, bad anatomy, bad hands, error, text",
  "numberOfOutputs": 2,
  "styleStrengthRatio": 35
}

Output

The action returns an array of URLs pointing to the generated images. Each element in the array represents a different output image based on your input parameters.

Example Output:

[
  "https://assets.cognitiveactions.com/invocations/65be8e9b-d51a-4b51-aab5-fcc487ff70aa/bd3f856c-7ec3-417d-a098-ee57baec0f2d.png",
  "https://assets.cognitiveactions.com/invocations/65be8e9b-d51a-4b51-aab5-fcc487ff70aa/ceaa66eb-9ba0-4c13-8bf5-79f30b540deb.png"
]
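
Since the output is a list of image URLs, a typical next step is saving the files locally. The sketch below uses only the Python standard library; the helper names filename_from_url and save_images and the outputs directory are hypothetical, and it assumes the returned URLs can be fetched without extra authentication.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse
from urllib.request import urlopen


def filename_from_url(url: str) -> str:
    """Return the last path segment of a URL, used as the local file name."""
    return PurePosixPath(urlparse(url).path).name


def save_images(urls: list, out_dir: str = "outputs") -> list:
    """Download each URL into out_dir and return the saved file paths."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in urls:
        destination = target / filename_from_url(url)
        # Assumption: the URL is publicly readable; add headers if it is not.
        with urlopen(url, timeout=60) as response:
            destination.write_bytes(response.read())
        saved.append(destination)
    return saved
```

Calling save_images with the array returned by the action would write one file per generated image, named after the final segment of its URL.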

Conceptual Usage Example (Python)

Here’s how you might invoke the Generate Styled Images action using Python:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "8c459ec9-fa13-4b7d-922d-e9baf1241f61"  # Action ID for Generate Styled Images

# Construct the input payload based on the action's requirements
payload = {
    "prompt": "A girl img riding dragon over a whimsical castle, 3D CGI, art by Pixar, half-body, screenshot from animation",
    "styleName": "(No style)",
    "inputImage": "https://replicate.delivery/pbxt/KFRveCbE71qFTQGSF509CXYC16qB1bcZmAWq8O172ael04Ga/lenna.jpg",
    "guidanceScale": 5,
    "numberOfSteps": 50,
    "negativePrompt": "realistic, photo-realistic, worst quality, greyscale, bad anatomy, bad hands, error, text",
    "numberOfOutputs": 2,
    "styleStrengthRatio": 35
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=120,  # Avoid hanging indefinitely; image generation can be slow
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Body is not valid JSON (covers every JSON decode error type)
            print(f"Response body: {e.response.text}")

This conceptual example sends a request to the Cognitive Actions API, specifying the action ID and an input payload structured for the Generate Styled Images action. The endpoint URL and request structure are illustrative; adapt them to match the actual Cognitive Actions API before use.

Conclusion

The Tencentarc/PhotoMaker Cognitive Actions let you generate stylized images of people from a single input photo, with control over the prompt, style template, and style strength. To explore further, try integrating this action into your projects and experimenting with different styles and prompts. Happy coding!