Tables
Extract Tables from PDF document
title: The title of the table (if detected).cells: A 2D array of string values. Indexed ascells[row][column].footers: An array of table footer notes or annotations (if available).confidences: Confidence scores (0 to 1) for each extracted value in thecellsarray.averageConfidence: The mean confidence score for all cell values within the table.headerCells: Coordinates of detected header cells, provided as[[row, column], ...].summaryCells: Coordinates of summary cells, provided as[[row, column], ...].
Uniform Grid
To ensure a consistent and predictable output format, the Extract Tables method processes tables using a uniform grid of cells.Original PDF

Uniform Grid Division

Handling Merged Cells
When a table in a PDF contains merged cells, the content is mapped to the cells array based on their position within the uniform grid. Vertical Overlap (Rows) When a cell merges multiple rows, the remaining indices covered by the original merge are returned as empty placeholders ("") in separate rows, to maintain the grid’s structural integrity. You can see it in thecells field in the example below:
Merged rows PDF

Exported rows

Original PDF

Uniform Grid Division

Output File Format
The endpoint can return output either as JSON or as a binary file. The format depends on theAccept header (details below), which defaults to application/json.
Processing
When requesting JSON, you can run the operation synchronously or asynchronously. This is determined by thePrefer header (details below).
- In sync mode, the response includes a URL pointing to the processed file.
- In async mode, the request creates a Job, and the response contains the Job ID and status so you can track progress.
Custom File Delivery
The endpoint supports custom file-delivery destinations through the optional delivery parameter. You can provide an upload target, such as your own PUT endpoint or a pre-signed S3 URL, and Nitro will upload the resulting file there. This works for both synchronous and asynchronous processing.-
Sync delivery
In synchronous calls, the delivery parameter lets you direct Nitro to upload the output file to a custom URL endpoint or a pre-signed URL (e.g S3), by providing an upload url in theuploadResultTooruploadResultsToproperties.Custom endpoint
S3 delivery
If you are using S3 to manage delivery uploads, follow this AWS documentation to generate a pre-signed PUT URL. -
Async delivery
In asynchronous flows, you can also provide a custom URL or pre-signed S3 object viauploadResultTooruploadResultsTo, to upload your file(s) once the Job is done processing.Callback
For asynchronous processing, you can also include a callback URL within the delivery parameter. This callback is a POST endpoint that Nitro will call once the Job is created and running, providing details about the file-processing job. Example of Nitro’s callback request body:
Response behavior Matrix
This matrix shows the expected response behavior based on content type, sync/async mode, and custom file-delivery settings.JSON (application/json) | Binary (application/octet-stream) | |
|---|---|---|
| Synchronous | Returns a JSON object with a file(s) URL(s). | Returns the processed file directly as binary. |
Asynchronous (respond-async) | Returns a JSON object with a Job ID and status. | Async preference ignored, returns sync binary. |
| Synchronous delivery | File delivered to custom endpoint / Bucket, returns success confirmation. | N/A |
| Asynchronous delivery | Returns a JSON object with a Job ID and status. The file(s) will be uploaded to the provided PUT endpoint / S3 Bucket at the end of process. If callback url is provided, Nitro will notify the endpoint with a JOB ID and location. | N/A |
Limits
The Platform API has the following limits:- File size: Maximum of 100 MB per request. This applies to single-file and multi-file requests.
- Page count: Maximum of 500 pages per individual document. This applies to single-file and multi-file requests. Multiple documents may exceed 500 pages in total.
- Retention time: Inputs and outputs are deleted approximately 15 minutes after the operation completes.
Request
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Headers
Controls response format and behavior. See endpoint description above for detailed response combinations.
application/json: Returns JSON response with operation resultapplication/octet-stream: Returns binary file content*/*: Defaults to JSON response
application/json, application/octet-stream, */* Controls synchronous vs asynchronous operation. See endpoint description above for behavior details.
respond-async: Makes request asynchronous, returns job status for polling- No value: Synchronous response
respond-async Body
The Extractions' endpoint method: extract-tables
extract-tables The file to process. It can be provided as a binary upload or as a JSON remote file reference.
This endpoint lets you supply your own URL to receive the single-file output. The URL may point to a custom API endpoint or a pre-signed S3 URL.
The HTTP method defaults to PUT, but you can change it based on your implementation needs via the verb parameter. You can also provide custom headers, such as authentication headers or any others required by your endpoint.
Response
Returns either JSON or binary output depending on the Accept header (defaults to JSON). JSON responses include a file URL for synchronous tasks or a job status for asynchronous tasks.
- Sync - Tables
- Async - Job