Streamline Receipt Processing with the Donut Cognitive Actions

Automating data extraction from documents removes a large amount of manual work. The willywongi/donut API exposes document understanding to developers through pre-built Cognitive Actions. One of these actions extracts structured data from receipt images, which makes it useful for workflow automation in a range of applications.
Prerequisites
Before diving into the Donut Cognitive Actions, ensure you have the following:
- API Key: You'll need an API key to authenticate your requests to the Cognitive Actions platform.
- Setup: Familiarity with making HTTP requests and handling JSON data will be beneficial.
- Authentication: Typically, authentication is handled by including your API key in the headers of your requests.
Cognitive Actions Overview
Extract Receipt Data with Donut
The Extract Receipt Data with Donut action allows you to extract structured data from receipt images using the Donut 🍩 (Document Understanding Transformer) model. This model provides efficient information extraction without relying on traditional OCR techniques.
Input
This action requires a single input parameter:
- image (string, required): The URI of the input image to be processed. The URL must be accessible and point directly to the image file.
Example Input:
{
  "image": "https://replicate.delivery/pbxt/IgCzf30UdmaTYhtlaifqpVA7V7nQf7a8muE6AWie2Fm5bNv3/sample_image_cord_test_receipt_00004.png"
}
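Because the image must be supplied as a directly reachable URL, it can help to validate the value before sending a request. The following is a minimal sketch, not part of the API: the helper name build_receipt_payload is hypothetical, and the check only confirms that the string looks like an http(s) URL, not that the image actually exists.

```python
from urllib.parse import urlparse


def build_receipt_payload(image_url: str) -> dict:
    """Return the input payload for the action (hypothetical helper).

    Raises ValueError if image_url is not an absolute http(s) URL.
    """
    parsed = urlparse(image_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"Expected an absolute http(s) URL, got: {image_url!r}")
    # The action takes a single required field named "image".
    return {"image": image_url}


payload = build_receipt_payload("https://example.com/receipt.png")
print(payload)
```

A local file path such as "receipt.png" is rejected by this check, which surfaces the problem before any request is made.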
Output
The action returns a structured JSON object containing the data extracted from the receipt. All values, including counts and prices, are returned as strings, and item names appear as the model read them, so they may contain recognition errors.
Example Output:
{
  "menu": [
    {"nm": "ICE BLAOKCOFFE", "cnt": "2", "price": "82,000"},
    {"nm": "AVOCADO COFFEE", "cnt": "1", "price": "61,000"},
    {"nm": "Oud CHINEN KATSU FF", "cnt": "1", "price": "51,000"}
  ],
  "sub_total": {
    "subtotal_price": "194,000",
    "discount_price": "19,400"
  },
  "total": {
    "total_price": "174,600",
    "cashprice": "200,000",
    "changeprice": "25,400"
  }
}
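Since the amounts arrive as strings with thousands separators, downstream code usually needs to convert them to numbers, and the totals offer a cheap sanity check on the extraction. The sketch below assumes the key names shown in the example output, that commas are thousands separators with no decimal part, and that each item's price is the line total (which is consistent with the sample). The helper names parse_amount and check_receipt are hypothetical.

```python
def parse_amount(text: str) -> int:
    """Convert a string such as "82,000" to the integer 82000.

    Assumes commas are thousands separators and there is no decimal part.
    """
    return int(text.replace(",", "").strip())


def check_receipt(receipt: dict) -> dict:
    """Compare the extracted amounts against each other."""
    items_sum = sum(parse_amount(item["price"]) for item in receipt["menu"])
    subtotal = parse_amount(receipt["sub_total"]["subtotal_price"])
    discount = parse_amount(receipt["sub_total"].get("discount_price", "0"))
    total = parse_amount(receipt["total"]["total_price"])
    cash = parse_amount(receipt["total"]["cashprice"])
    change = parse_amount(receipt["total"]["changeprice"])
    return {
        "items_match_subtotal": items_sum == subtotal,
        "total_matches_subtotal_minus_discount": subtotal - discount == total,
        "change_matches_cash_minus_total": cash - total == change,
    }


# Sample data copied from the example output.
sample = {
    "menu": [
        {"nm": "ICE BLAOKCOFFE", "cnt": "2", "price": "82,000"},
        {"nm": "AVOCADO COFFEE", "cnt": "1", "price": "61,000"},
        {"nm": "Oud CHINEN KATSU FF", "cnt": "1", "price": "51,000"},
    ],
    "sub_total": {"subtotal_price": "194,000", "discount_price": "19,400"},
    "total": {
        "total_price": "174,600",
        "cashprice": "200,000",
        "changeprice": "25,400",
    },
}

checks = check_receipt(sample)
print(checks)
```

For the sample receipt all three checks pass: 82,000 + 61,000 + 51,000 = 194,000, 194,000 - 19,400 = 174,600, and 200,000 - 174,600 = 25,400. A failed check on real data is a hint that a field was misread and the receipt may need manual review.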
Conceptual Usage Example (Python)
Here’s a conceptual example of how you might call the Extract Receipt Data action using Python. This example demonstrates structuring the input payload correctly.
import json

import requests

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "d83909cc-47a6-4fe7-a0df-7bcdec907132"  # Action ID for Extract Receipt Data with Donut

# Construct the input payload based on the action's requirements
payload = {
    "image": "https://replicate.delivery/pbxt/IgCzf30UdmaTYhtlaifqpVA7V7nQf7a8muE6AWie2Fm5bNv3/sample_image_cord_test_receipt_00004.png"
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json",
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=60,  # Avoid waiting indefinitely on a stalled connection
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)
    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))
except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Body is not valid JSON; fall back to raw text
            print(f"Response body: {e.response.text}")
In the code above:
- Replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key.
- The action_id is set to the ID for the Extract Receipt Data action.
- The payload is structured based on the required input schema.
Conclusion
The Extract Receipt Data with Donut action turns a receipt image into structured JSON with a single request. Because the model reads the document directly rather than relying on a separate OCR step, it can save time and reduce manual data entry errors.
As a next step, consider exploring additional actions within the Donut API to further enhance your document processing capabilities!