
OCR For Business

Shufti Pro’s AI-powered OCR technology helps businesses reduce operational costs by extracting relevant information from documents in real time with an accuracy of up to 90%. The instant image-to-text OCR supports multilingual documents and offers global coverage. Shufti Pro’s intelligent OCR services provide businesses with optimized data extraction.


Info: The Business OCR service call is performed independently, without including other services.


OCR for Business includes two parts:

  1. Training of the Model
  2. Data extraction using Trained models

Training of the Model

To train a model, the client needs to follow these steps in the backoffice, under the OCR FOR BUSINESS section:

  • Upload sample document(s) to train the model.
  • Select and name the data fields that the client wishes to extract from that document.
  • Give the model a name and train it.

Tip: Once the model is trained, the client can start data extraction right away or test the trained model in the backoffice.



Data extraction using Trained models

For data extraction, the client needs to send an API request to the server with the following parameters:


Parameters    Description

model_name    Required: Yes
              Type: string
              A valid model name provided by the client during the model training phase.
              Example: invoice_sales

proof         Required: Yes
              Type: string
              Image Format: JPG, JPEG, PNG, PDF
              Maximum image size: 16 MB
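
The limits listed for the proof parameter can be checked on the client side before a request is sent. The following Python sketch is illustrative only: the helper name validate_proof is hypothetical, the check looks only at the file extension and raw byte size, and it assumes that 16 MB means 16 * 1024 * 1024 bytes measured before base64 encoding, which the documentation does not state.

```python
from pathlib import Path

# Limits taken from the parameter table above.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".pdf"}
# Assumption: 1 MB = 1024 * 1024 bytes, measured on the raw file.
MAX_PROOF_BYTES = 16 * 1024 * 1024


def validate_proof(filename: str, data: bytes) -> None:
    """Raise ValueError if the proof violates the documented limits.

    Hypothetical helper: it inspects the extension and the size only,
    not the actual file contents.
    """
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported proof format: {extension or 'none'}")
    if len(data) > MAX_PROOF_BYTES:
        raise ValueError("proof exceeds the 16 MB limit")
```

A call such as validate_proof("invoice.png", image_bytes) returns silently when the file passes both checks.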

In response to the API call, the client receives a JSON response with the event request.received. Shufti Pro then extracts the required data and sends it to the provided callback URL. The data is also updated in the client’s backoffice, where the client can access it at any time.



//POST /service/ocr_for_business/extraction HTTP/1.1
//Host: api.shuftipro.com
//Content-Type: application/json
//Authorization: Basic NmI4NmIyNzNmZjM0ZmNlMTlkNmI4WJRTUxINTJHUw==
//replace "Basic" with "Bearer" in case of Access Token

{
    "reference": "1234567",
    "business_ocr": {
        "model_name": "invoice_sales",
        "proof": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAgAAAALCAYAAABCm8wlAAAABmJLR0QA/wD/AP+gvaeTAAAACXBIWXMAAAsTAAALEwEAmpwYAAAAB3RJTUUH4QoPAxIb88htFgAAABl0RVh0Q29tbWVudABDcmVhdGVkIHdpdGggR0lNUFeBDhcAAACxSURBVBjTdY6xasJgGEXP/RvoonvAd8hDyD84+BZBEMSxL9GtQ8Fis7i6BkGI4DP4CA4dnQON3g6WNjb2wLd8nAsHWsR3D7JXt18kALFwz2dGmPVhJt0IcenUDVsgu91eCRZ9IOMfAnBvSCz8I3QYL0yV6zfyL+VUxKWfMJuOEFd+dE3pC1Finwj0HfGBeKGmblcFTIN4U2C4m+hZAaTrASSGox6YV7k+ARAp4gIIOH0BmuY1E5TjCIUAAAAASUVORK5CYII="
    }
}
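
The sample request above can also be assembled programmatically. The Python sketch below uses only the standard library and makes several assumptions that the page does not confirm: that the endpoint is served over HTTPS at the host and path shown in the sample, and that the Basic credential is the base64 encoding of "client_id:secret_key". The function names are hypothetical.

```python
import base64
import json
import urllib.request

# Assumption: HTTPS plus the Host and path from the sample request.
API_URL = "https://api.shuftipro.com/service/ocr_for_business/extraction"


def build_payload(reference: str, model_name: str, image: bytes, mime_type: str) -> dict:
    """Build the request body with the proof encoded as a base64 data URI."""
    encoded = base64.b64encode(image).decode("ascii")
    return {
        "reference": reference,
        "business_ocr": {
            "model_name": model_name,
            "proof": f"data:{mime_type};base64,{encoded}",
        },
    }


def basic_auth_header(client_id: str, secret_key: str) -> str:
    """Assumption: the Basic credential is base64("client_id:secret_key")."""
    raw = f"{client_id}:{secret_key}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def send_extraction_request(payload: dict, auth_header: str) -> dict:
    """POST the payload as JSON and return the decoded JSON response."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": auth_header},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping payload construction separate from the network call makes it possible to inspect or log the body before anything is sent.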

OCR-for-Business-response-sample-object
{
    "reference": "17374217",
    "event": "request.received",
    "email": "[email protected]",
    "country": "UK"
}
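
Because the extracted data arrives later through the callback, the immediate response only acknowledges the request. A minimal Python sketch for checking that acknowledgement follows; the helper name is_request_received is hypothetical, and matching on the reference field is an assumption about how a client might correlate requests.

```python
import json


def is_request_received(response_body: str, expected_reference: str) -> bool:
    """Return True if the body acknowledges the request with this reference."""
    response = json.loads(response_body)
    return (
        response.get("event") == "request.received"
        and response.get("reference") == expected_reference
    )
```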

OCR-for-Business-callback-response-sample-object
//Content-Type: application/json
//Signature: NmI4NmIyNzNmZjM0ZmNl

{
    "reference": "17374217",
    "event": "verification.accepted",
    "email": "[email protected]",
    "country": "GB",
    "verification_result": {},
    "verification_data": {
        "business_ocr": {
            "last_name": "Doe",
            "first_name": "John",
            "issue_date": "2018-01-31",
            "expiry_date": "2028-01-30",
            "nationality": "BRITISH CITIZEN",
            "gender": "M",
            "place_of_birth": "BRISTOL",
            "document_number": "GB1234567"
        }
    },
    "info": {
        "agent": {
            "is_desktop": true,
            "is_phone": false,
            "useragent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36",
            "device_name": "Macintosh",
            "browser_name": "",
            "platform_name": "OS X - 10_14_0"
        },
        "geolocation": {
            "host": "212.103.50.243",
            "ip": "212.103.50.243",
            "rdns": "212.103.50.243",
            "asn": "9009",
            "isp": "M247 Ltd",
            "country_name": "Germany",
            "country_code": "DE",
            "region_name": "Hesse",
            "region_code": "HE",
            "city": "Frankfurt am Main",
            "postal_code": "60326",
            "continent_name": "Europe",
            "continent_code": "EU",
            "latitude": "50.1049",
            "longitude": "8.6295",
            "metro_code": "",
            "timezone": "Europe/Berlin"
        }
    }
}
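
On the receiving side, the extracted fields sit under verification_data.business_ocr in the callback body. The Python sketch below shows one way to read them, assuming the body is valid JSON; the helper name extract_fields is hypothetical, the field names depend on what was defined during model training, and checking the Signature header is out of scope here.

```python
import json


def extract_fields(callback_body: str) -> dict:
    """Return the extracted OCR fields from a callback body.

    Returns an empty dict when the callback carries no business_ocr data.
    """
    callback = json.loads(callback_body)
    verification_data = callback.get("verification_data") or {}
    return verification_data.get("business_ocr") or {}


# Shortened version of the callback sample above.
sample = """
{
    "reference": "17374217",
    "event": "verification.accepted",
    "verification_data": {
        "business_ocr": {"first_name": "John", "last_name": "Doe"}
    }
}
"""
fields = extract_fields(sample)
print(fields["first_name"], fields["last_name"])  # prints "John Doe"
```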