Transform Receipt Images into JSON with sulthonmb/ocr-receipt Cognitive Actions

Converting physical documents into structured data is a common step in digitizing financial workflows. The sulthonmb/ocr-receipt Cognitive Action uses Optical Character Recognition (OCR) to convert receipt images into structured JSON. This lets developers extract and process receipt data programmatically, which streamlines tasks such as expense tracking and financial analysis.
Prerequisites
Before you dive into integrating the sulthonmb/ocr-receipt Cognitive Action, ensure you have the following:
- An API key for the Cognitive Actions platform.
- A valid endpoint for accessing the Cognitive Actions API.
- Basic knowledge of handling HTTP requests and JSON data.
For authentication, you will typically need to pass your API key in the request headers. This allows the platform to verify your access and manage your usage effectively.
Cognitive Actions Overview
Convert Receipt Image to JSON
The Convert Receipt Image to JSON action transforms a receipt image into structured JSON, which makes it easier to extract details such as item names, quantities, prices, and totals.
- Category: document-ocr
Input
This action requires the following input:
- image (required): A URI pointing to the receipt image. This must be a valid URL.
Example Input:
{
  "image": "https://replicate.delivery/pbxt/JBOFOmUfX5CA1S8SSgXT0SpvuYLDchyni4Kgk8yD7srz7EOo/image.jpeg"
}
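Because the image field must be a valid URL, it can help to check the value on the client before sending a request. The following is a minimal sketch: build_payload is a hypothetical helper, not part of the platform, and restricting the scheme to http and https is an assumption, since the accepted schemes are not documented here.

```python
from urllib.parse import urlparse


def build_payload(image_url: str) -> dict:
    """Return the action input after a basic URL sanity check (hypothetical helper)."""
    parsed = urlparse(image_url)
    # Assumption: only absolute http(s) URLs are accepted by the action.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"image must be an absolute http(s) URL, got: {image_url!r}")
    return {"image": image_url}


payload = build_payload(
    "https://replicate.delivery/pbxt/JBOFOmUfX5CA1S8SSgXT0SpvuYLDchyni4Kgk8yD7srz7EOo/image.jpeg"
)
print(payload)
```

This check only confirms that the string looks like a URL. It does not confirm that the image exists or is readable by the OCR model.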
Output
The output of this action is a structured JSON object that includes detailed information extracted from the receipt, including:
- A list of menu items with names, quantities, and prices.
- Total price, subtotal, tax, and service charges.
Example Output:
{
  "menu": [
    {
      "nm": "Nasi Campur Bali",
      "cnt": "1 x",
      "price": "75,000"
    },
    {
      "nm": "Bbk Bengil Nasi",
      "cnt": "1 x",
      "price": "125,000"
    },
    // Additional items...
  ],
  "total": {
    "total_price": "1,591,600"
  },
  "sub_total": {
    "etc": "-45",
    "tax_price": "144,695",
    "service_price": "100,950",
    "subtotal_price": "1,346,000"
  }
}
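Every value in the example output is a string, so downstream code will usually need to convert quantities and amounts to numbers. The sketch below does this for the sample above and checks that the parts add up to the reported total. It rests on several assumptions drawn only from this one sample: commas are thousands separators, amounts are whole numbers, quantities look like "1 x", and etc is an adjustment that is added to the other amounts. The helpers parse_amount and parse_count are hypothetical names, and real receipts may need more tolerant parsing.

```python
def parse_amount(text: str) -> int:
    """Convert a string such as "75,000" or "-45" to an integer.

    Assumes commas are thousands separators and there is no decimal part.
    """
    return int(text.replace(",", "").strip())


def parse_count(text: str) -> int:
    """Convert a quantity string such as "1 x" to an integer."""
    return int(text.lower().replace("x", "").strip())


# Sample output from above, limited to the items actually shown.
result = {
    "menu": [
        {"nm": "Nasi Campur Bali", "cnt": "1 x", "price": "75,000"},
        {"nm": "Bbk Bengil Nasi", "cnt": "1 x", "price": "125,000"},
    ],
    "total": {"total_price": "1,591,600"},
    "sub_total": {
        "etc": "-45",
        "tax_price": "144,695",
        "service_price": "100,950",
        "subtotal_price": "1,346,000",
    },
}

# Normalize each menu entry to (name, quantity, price).
items = [
    (item["nm"], parse_count(item["cnt"]), parse_amount(item["price"]))
    for item in result["menu"]
]

# Cross-check: subtotal + tax + service + adjustment should equal the total.
sub = result["sub_total"]
computed_total = sum(
    parse_amount(sub[key])
    for key in ("subtotal_price", "tax_price", "service_price", "etc")
)
reported_total = parse_amount(result["total"]["total_price"])
totals_match = computed_total == reported_total

print(items)
print(f"computed={computed_total} reported={reported_total} match={totals_match}")
```

For this sample the components sum to the reported total, so a mismatch on other receipts may be a useful signal that the OCR misread a digit.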
Conceptual Usage Example (Python)
Here’s a conceptual example of how you might invoke the Convert Receipt Image to JSON action using Python. This snippet focuses on structuring the input JSON payload correctly.
import json
import requests
# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint
action_id = "c1a64c3a-0add-455e-8ca1-ad359be1b2f3"  # Action ID for Convert Receipt Image to JSON
# Construct the input payload based on the action's requirements
payload = {
    "image": "https://replicate.delivery/pbxt/JBOFOmUfX5CA1S8SSgXT0SpvuYLDchyni4Kgk8yD7srz7EOo/image.jpeg"
}
headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json",
}
try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=60,  # Avoid waiting indefinitely; adjust to your needs
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)
    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))
except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Body is not valid JSON; fall back to raw text
            print(f"Response body: {e.response.text}")
In this code snippet, replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key. The input payload follows the action's input schema, while the endpoint URL and the request body structure (action_id and inputs) are hypothetical, so adjust them to match your platform's documentation.
Conclusion
The sulthonmb/ocr-receipt Cognitive Action turns a receipt image into structured JSON with a single request, which removes much of the manual work of digitizing receipts. From here, consider integrating the action into an expense management system or another financial application that needs automated receipt processing. Happy coding!