Unlocking Multilingual Capabilities with the Multilingual-E5-Large-Instruct Cognitive Actions

21 Apr 2025

Applications increasingly need to process text in more than one language. The Multilingual-E5-large-instruct API exposes Cognitive Actions for multilingual text processing. These pre-built actions generate text embeddings optimized for semantic similarity and text retrieval, supporting up to 100 languages. With them, developers can add cross-language search and similarity features to their applications.

Prerequisites

Before you can start using the Cognitive Actions, ensure you have:

  • An API key for the Cognitive Actions platform.
  • Familiarity with making HTTP requests in your programming language of choice.

Authentication typically involves passing the API key in a request header so that the platform can authorize each call.

Cognitive Actions Overview

Embed Text with Custom Instructions

This action generates text embeddings with the Multilingual-E5-large-instruct model. A query can carry a short task instruction (the "Instruct: ... Query: ..." prefix shown in the example input) that describes the kind of retrieval intended. The action is particularly useful for semantic similarity and text retrieval.

  • Category: Text Embedding

Input: The action accepts the following fields in its input schema:

  • textList (required): The texts to embed, given as a JSON list of strings. In the example input, the list is serialized into a single JSON-encoded string.
  • batchSize (optional): An integer defining the number of text items to process in a single batch (default is 32).
  • normalizeEmbeddings (optional): A boolean indicating whether to normalize embeddings (default is true).

Example Input:

{
  "textList": "[\"As a general guideline, the CDC's average requirement of protein for women ages 19 to 70 is 46 grams per day. But, as you can see from this chart, you'll need to increase that if you're expecting or training for a marathon. Check out the chart below to see how much protein you should be eating each day.\", \"Instruct: Given a web search query, retrieve relevant passages that answer the query\\nQuery: how much protein should a female eat\"]",
  "batchSize": 32,
  "normalizeEmbeddings": true
}
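The example input combines two details that are easy to get wrong by hand: the query carries an "Instruct: ...\nQuery: ..." prefix while the passage is embedded as-is, and textList is a JSON-encoded string rather than a native list. The sketch below shows one way to assemble such an input. The helper names format_query and build_input are hypothetical and not part of the API, and the passage is shortened to its first sentence.

```python
import json


def format_query(task: str, query: str) -> str:
    """Prefix a query with a one-line task instruction, as in the example input."""
    return f"Instruct: {task}\nQuery: {query}"


def build_input(texts, batch_size=32, normalize=True):
    """Assemble the action input; textList is serialized to a JSON string."""
    return {
        # ensure_ascii=False keeps non-ASCII characters readable in the payload
        "textList": json.dumps(texts, ensure_ascii=False),
        "batchSize": batch_size,
        "normalizeEmbeddings": normalize,
    }


task = "Given a web search query, retrieve relevant passages that answer the query"
texts = [
    # Passages are embedded without an instruction prefix
    "As a general guideline, the CDC's average requirement of protein "
    "for women ages 19 to 70 is 46 grams per day.",
    # Queries get the instruction prefix
    format_query(task, "how much protein should a female eat"),
]
payload = build_input(texts)
print(json.dumps(payload, indent=2))
```

Letting json.dumps do the serialization avoids hand-escaping quotes and newlines inside the string.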

Output: The action returns a list of embeddings, where each embedding is a list of floating-point numbers. Here is a snippet of the output:

[
  [0.03266950324177742, 0.004132503643631935, -0.05025511234998703, ...],
  [0.020185844972729683, 0.011204423382878304, -0.04514245316386223, ...]
]

Each array corresponds to the embedding of a text input from textList.
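Once the embeddings are returned, a common next step for retrieval is to score each passage against the query by cosine similarity. The following is a minimal sketch with toy three-dimensional vectors standing in for real embeddings. The function names are hypothetical, and the code assumes you have already separated the query embedding from the passage embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_passages(query_embedding, passage_embeddings):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [
        (i, cosine_similarity(query_embedding, p))
        for i, p in enumerate(passage_embeddings)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors; real embeddings from the action are much longer
query = [1.0, 0.0, 0.0]
passages = [[0.0, 1.0, 0.0], [0.6, 0.8, 0.0], [1.0, 0.0, 0.0]]

ranking = rank_passages(query, passages)
for index, score in ranking:
    print(f"passage {index}: {score:.3f}")
```

Dividing by the norms makes the function usable whether or not normalizeEmbeddings was enabled. For unit-length vectors, the cosine similarity reduces to the plain dot product.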

Conceptual Usage Example (Python): Here’s how you can use the Embed Text with Custom Instructions action in Python:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "7cfbcacb-1dbb-4558-ac58-bb36ecb2b399" # Action ID for "Embed Text with Custom Instructions"

# Construct the input payload based on the action's requirements
payload = {
    "textList": "[\"As a general guideline, the CDC's average requirement of protein for women ages 19 to 70 is 46 grams per day. But, as you can see from this chart, you'll need to increase that if you're expecting or training for a marathon. Check out the chart below to see how much protein you should be eating each day.\", \"Instruct: Given a web search query, retrieve relevant passages that answer the query\\nQuery: how much protein should a female eat\"]",
    "batchSize": 32,
    "normalizeEmbeddings": True  # Python boolean; requests serializes it as JSON true
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload} # Hypothetical structure
    )
    response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # JSON decode errors subclass ValueError across requests versions
            print(f"Response body: {e.response.text}")

In this example, replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key. The endpoint URL and the request body structure are hypothetical, so adjust them to match the platform's documentation. The payload follows the action's input schema: the text list to embed, the batch size, and whether to normalize the embeddings.

Conclusion

The Multilingual-E5-large-instruct Cognitive Actions give developers a straightforward way to add multilingual capabilities to their applications. With the Embed Text with Custom Instructions action, you can embed queries and passages in many languages and compare them by semantic similarity. Good starting points include multilingual search and content recommendations.