POST /transformations
OCR
curl --request POST \
  --url https://api.gonitro.dev/transformations \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: multipart/form-data' \
  --form method=ocr \
  --form 'file=<string>' \
  --form 'params={
  "language": "french",
  "quality": "high",
  "isOutputPDFEditable": false,
  "compressionLevel": "low",
  "PDFVersion": "pdf17",
  "pageIndices": [
    0,
    1,
    2
  ]
}' \
  --form file.0='@example-file'
{
  "file": {
    "URL": "<string>",
    "contentType": "application/json",
    "metadata": {
      "fileSizeBytes": 123,
      "pageCount": 123
    }
  }
}
Apply OCR (Optical Character Recognition) to a scanned or image-based PDF, producing a searchable and selectable text layer without altering the document’s visual appearance. This is useful for making archived scans, fax documents, or photo-captured pages fully text-searchable and accessible.
OCR processes the entire document by default. Use pageIndices to target specific pages and reduce processing time on large files.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| language | string | english | Language of the text to recognise. Choosing the correct language improves accuracy significantly. See supported languages below. |
| quality | string | high | OCR recognition quality. high is recommended for most documents; low trades accuracy for speed. |
| isOutputPDFEditable | boolean | false | When true, the recognised text is embedded as editable content rather than an invisible search layer. |
| compressionLevel | string | low | Compression applied to the output PDF. low preserves maximum image fidelity; high reduces file size. |
| PDFVersion | string | pdf17 | PDF specification version of the output file. Defaults to PDF 1.7. |
| pageIndices | integer[] | (all pages) | Zero-based indices of the pages to process. Omit or pass null to process the entire document. |

Supported languages

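| Value | Language |
| --- | --- |
| english | English |
| german | German |
| french | French |
| spanish | Spanish |
| italian | Italian |
| finnish | Finnish |
| swedish | Swedish |
| danish | Danish |
| norwegian | Norwegian |
| dutch | Dutch |
| portuguese | Portuguese |
| brazilian | Brazilian Portuguese |
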
Only one language can be specified per request. If your document contains mixed languages, choose the dominant language for best results.
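
As an illustration of the parameter rules above, the following is a minimal sketch of a helper that builds the JSON string for the params form field. The name build_ocr_params is hypothetical and not part of the API; the language values are copied from the table above, and other parameters are passed through without validation because their full set of accepted values is not listed here.

```python
import json

# Language values copied from the "Supported languages" table.
SUPPORTED_LANGUAGES = {
    "english", "german", "french", "spanish", "italian", "finnish",
    "swedish", "danish", "norwegian", "dutch", "portuguese", "brazilian",
}


def build_ocr_params(language="english", page_indices=None, **extra):
    """Return the JSON string for the `params` form field (hypothetical helper)."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    # Extra keyword arguments (e.g. quality="high") are passed through unchecked.
    params = {"language": language, **extra}
    if page_indices is not None:
        for i in page_indices:
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(i, bool) or not isinstance(i, int) or i < 0:
                raise ValueError("pageIndices must be zero-based, non-negative integers")
        params["pageIndices"] = list(page_indices)
    # Omitting pageIndices processes the entire document.
    return json.dumps(params)
```

For example, build_ocr_params("french", page_indices=[0, 1, 2], quality="high") yields a JSON object with the same language, quality, and pageIndices values as the request example at the top of this page.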

Output File Format

The endpoint can return output either as JSON or as a binary file. The format depends on the Accept header (details below), which defaults to application/json.

Processing

When requesting JSON, you can run the operation synchronously or asynchronously. This is determined by the Prefer header (details below).
  • In sync mode, the response includes a URL pointing to the processed file.
  • In async mode, the request creates a Job, and the response contains the Job ID and status so you can track progress.
Binary (octet-stream) responses are only available for synchronous operations.
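
The header combinations described above can be sketched as a small helper. The function name build_headers and its arguments are assumptions made for illustration; the header names and values (Authorization, Accept, Prefer: respond-async) come from this page.

```python
def build_headers(token, binary=False, run_async=False):
    """Build request headers for the desired response format and mode."""
    headers = {
        "Authorization": f"Bearer {token}",
        "Accept": "application/octet-stream" if binary else "application/json",
    }
    # Binary responses are synchronous only, so Prefer is sent only with JSON.
    if run_async and not binary:
        headers["Prefer"] = "respond-async"
    return headers
```

Leaving out the Prefer header entirely is what selects synchronous processing, which is why the helper adds it only in the one case where it has an effect.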

Custom File Delivery

The endpoint supports custom file-delivery destinations through the optional delivery parameter. You can provide an upload target, such as your own PUT endpoint or a pre-signed S3 URL, and Nitro will upload the resulting file there. This works for both synchronous and asynchronous processing.
  • Sync delivery

    In synchronous calls, the delivery parameter lets you direct Nitro to upload the output file to a custom endpoint or a pre-signed URL (e.g., S3) by providing an upload URL in the uploadResultTo or uploadResultsTo property.

    Custom endpoint

    If you implement the upload endpoint yourself, make sure your code or middleware configuration accepts requests that have no Content-Type header.

    S3 delivery

    If you are using S3 to manage delivery uploads, follow this AWS documentation to generate a pre-signed PUT URL.
    If you use the AWS-provided Python script, omit Content-Type from Params when generating the pre-signed URL. For example:
        url = generate_presigned_url(
            s3_client,
            "put_object",
            {
                "Bucket": args.bucket,
                "Key": args.key,
                # Content-Type: "application/octet-stream" => Omit!
            },
            1000,
        )
    
  • Async delivery

    In asynchronous flows, you can also provide a custom URL or a pre-signed S3 URL via uploadResultTo or uploadResultsTo. Nitro uploads your file(s) there once the Job has finished processing.

    Callback

    For asynchronous processing, you can also include a callback URL within the delivery parameter. This callback is a POST endpoint that Nitro will call once the Job is created and running, providing details about the file-processing job. Example of Nitro’s callback request body:
    {
        "jobID": "babe2aa7-9b5d-4eb2-a679-5fc12cf0a490",
        "location": "https://api.gonitro.dev/jobs/babe2aa7-9b5d-4eb2-a679-5fc12cf0a490"
    }
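
On the receiving side, a callback handler only needs to read the two fields shown in the example body above. The following is a minimal sketch; parse_job_callback is a hypothetical helper name, and it assumes the body contains exactly the jobID and location fields from the example.

```python
import json


def parse_job_callback(body):
    """Extract (jobID, location) from a callback request body (bytes or str)."""
    payload = json.loads(body)
    try:
        return payload["jobID"], payload["location"]
    except (KeyError, TypeError) as exc:
        # KeyError: a field is missing; TypeError: the body is not a JSON object.
        raise ValueError("callback body is missing jobID or location") from exc
```

The returned location can then be used to look up the Job.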
    

Response Behavior Matrix

This matrix shows the expected response behavior based on content type, sync/async mode, and custom file-delivery settings.
| Mode | JSON (application/json) | Binary (application/octet-stream) |
| --- | --- | --- |
| Synchronous | Returns a JSON object with the file URL(s). | Returns the processed file directly as binary. |
| Asynchronous (respond-async) | Returns a JSON object with a Job ID and status. | Async preference is ignored; returns the binary file synchronously. |
| Synchronous delivery | The file is delivered to the custom endpoint or bucket; returns a success confirmation. | N/A |
| Asynchronous delivery | Returns a JSON object with a Job ID and status. The file(s) are uploaded to the provided PUT endpoint or S3 bucket when processing finishes. If a callback URL is provided, Nitro notifies that endpoint with the Job ID and location. | N/A |
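
The matrix can also be read as a decision function. The sketch below encodes it for client-side reasoning only; the function name and the returned description strings are illustrative and are not values returned by the API.

```python
def expected_response(binary=False, run_async=False, delivery=False):
    """Describe the documented response for a combination of request options."""
    if delivery:
        if binary:
            return "not applicable"
        if run_async:
            return "JSON with Job ID and status; file uploaded when processing finishes"
        return "file uploaded to the delivery target; success confirmation"
    if binary:
        # The async preference is ignored for binary responses.
        return "processed file as binary"
    if run_async:
        return "JSON with Job ID and status"
    return "JSON with file URL(s)"
```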

Limits

The Platform API has the following limits:
  • File size: Maximum of 100 MB per request. This applies to single-file and multi-file requests.
  • Page count: Maximum of 500 pages per individual document. This applies to single-file and multi-file requests. Multiple documents may exceed 500 pages in total.
  • Retention time: Inputs and outputs are deleted approximately 15 minutes after the operation completes.
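
A client can check these limits before uploading. The following is a rough sketch under two assumptions that this page does not confirm: that 1 MB means 1,000,000 bytes, and that the 100 MB limit applies to the combined size of all files in a request. The helper name check_limits is hypothetical.

```python
MAX_REQUEST_BYTES = 100 * 1000 * 1000  # assumption: 1 MB = 1,000,000 bytes
MAX_PAGES_PER_DOCUMENT = 500


def check_limits(documents):
    """Return a list of limit violations for [(size_bytes, page_count), ...]."""
    problems = []
    # Assumption: the size limit covers the sum of all files in the request.
    total = sum(size for size, _ in documents)
    if total > MAX_REQUEST_BYTES:
        problems.append(f"request is {total} bytes; limit is {MAX_REQUEST_BYTES}")
    # The page limit is per document, so the combined total may exceed 500.
    for index, (_, pages) in enumerate(documents):
        if pages > MAX_PAGES_PER_DOCUMENT:
            problems.append(
                f"document {index} has {pages} pages; limit is {MAX_PAGES_PER_DOCUMENT}"
            )
    return problems
```

An empty list means the request is within the documented limits under those assumptions.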

Request

Authorizations

Authorization · string · header · required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Headers

Accept · enum<string> · default: */*

Controls response format and behavior. See endpoint description above for detailed response combinations.

  • application/json: Returns JSON response with operation result
  • application/octet-stream: Returns binary file content
  • */*: Defaults to JSON response
Available options: application/json, application/octet-stream, */*

Prefer · enum<string>

Controls synchronous vs asynchronous operation. See endpoint description above for behavior details.

  • respond-async: Makes request asynchronous, returns job status for polling
  • No value: Synchronous response
Available options: respond-async

Body

multipart/form-data
method · enum<string> · default: ocr · required

The Transformations endpoint method: ocr

Available options: ocr

file · required

The file to process. It can be provided as a binary upload or as a JSON remote file reference.

params · OCR · object · required

delivery · object

This endpoint lets you supply your own URL to receive the single-file output. The URL may point to a custom API endpoint or a pre-signed S3 URL.

The HTTP method defaults to PUT, but you can change it based on your implementation needs via the verb parameter. You can also provide custom headers, such as authentication headers or any others required by your endpoint.
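
For a custom API endpoint, the receiving side could look like the sketch below, which uses only the Python standard library. It accepts a PUT upload whether or not a Content-Type header is present. The class and function names are hypothetical, the in-memory storage is for illustration only, and the sketch assumes the upload carries a Content-Length header.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class UploadHandler(BaseHTTPRequestHandler):
    """Accept a PUT upload regardless of whether Content-Type is present."""

    received = {}  # path -> bytes; in-memory only, for illustration

    def do_PUT(self):
        # Assumption: the upload includes a Content-Length header.
        length = int(self.headers.get("Content-Length", 0))
        # Deliberately no check on Content-Type: the header may be absent.
        UploadHandler.received[self.path] = self.rfile.read(length)
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the sketch quiet


def make_server(port=0):
    """Bind to localhost; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), UploadHandler)
```

Calling make_server(8080).serve_forever() would run the receiver locally. A production endpoint would also need authentication (for example, by checking the custom headers mentioned above) and durable storage.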

Response

Returns either JSON or binary output depending on the Accept header (defaults to JSON). JSON responses include a file URL for synchronous tasks or a job status for asynchronous tasks.

file · object