OCR Inferencer Node

Overview

The OCR Inferencer node provides specialized Optical Character Recognition (OCR) capabilities using PaddleOCR models running in Docker containers. It supports the full OCR pipeline, text detection only, text recognition only, and document orientation detection for processing documents and images that contain text.

Key Features

  • Multiple OCR tasks: Full OCR, detection only, recognition only, document rotation
  • PaddleOCR integration: Industry-leading OCR accuracy and speed
  • Multi-language support: Configurable detection and recognition models
  • Document preprocessing: Unwarping, orientation correction, text line rotation
  • Docker-based execution: Isolated Python OCR server
  • Promise mode: Asynchronous batch processing
  • Debug visualization: Display OCR results on canvas
  • Auto-warmup: Pre-warm models for faster first requests
  • Configurable thresholds: Fine-tune detection and recognition parameters

Architecture

Components

  1. Node-RED Node: Configuration and message handling
  2. Docker Container: PaddleOCR Python server
  3. gRPC Protocol: Communication layer
  4. PaddleOCR Models: Detection and recognition models

Supported Tasks

| Task | Description | Output |
|------|-------------|--------|
| OCR | Full pipeline (detect + recognize) | Text boxes with recognized text |
| Detection | Find text regions only | Text box coordinates |
| Recognition | Recognize text from images | Text strings |
| Document Rotation | Detect document orientation | Rotation angle (0°, 90°, 180°, 270°) |

Configuration

Settings Tab

Name

  • Type: String
  • Optional: Yes
  • Description: Display name for the node

Task

  • Type: Select
  • Options: OCR, Detection, Recognition, Document Rotation
  • Default: OCR
  • Description: Type of OCR operation to perform

Input Field

  • Type: Message property path
  • Default: payload
  • Description: Message field containing input images

Output Field

  • Type: Message property path
  • Default: payload
  • Description: Where OCR results will be stored

Detection Model

  • Type: Select
  • Visible: OCR and Detection tasks
  • Options:
    • en_PP-OCRv3_det: English optimized
    • ch_PP-OCRv4_det: Chinese optimized
    • ml_PP-OCRv3_det: Multi-language
  • Description: Model for text detection

Recognition Model

  • Type: Select
  • Visible: OCR and Recognition tasks
  • Options:
    • en_PP-OCRv4_rec: English optimized
    • ch_PP-OCRv4_rec: Chinese optimized
    • latin_PP-OCRv3_rec: Latin languages
    • arabic_PP-OCRv3_rec: Arabic
    • cyrillic_PP-OCRv3_rec: Cyrillic
    • korean_PP-OCRv3_rec: Korean
    • japan_PP-OCRv3_rec: Japanese
  • Description: Model for text recognition

Use Document Unwarping

  • Type: Checkbox
  • Visible: OCR task only
  • Default: Disabled
  • Description: Straighten curved/distorted documents

Use Document Orientation

  • Type: Checkbox
  • Visible: OCR task only
  • Default: Disabled
  • Description: Correct document rotation (0°, 90°, 180°, 270°)

Use Text Line Orientation

  • Type: Checkbox
  • Visible: OCR task only
  • Default: Disabled
  • Description: Correct individual text line rotation

Promise Mode

  • Type: Checkbox
  • Default: Disabled
  • Description: Return promises for async processing

Promises Field

  • Type: Message property path
  • Default: promises
  • Visible: When Promise Mode is enabled

Show Debug Image

  • Type: Checkbox
  • Default: Disabled

Debug Interval

  • Type: Number
  • Default: 1

Debug Image Width

  • Type: Number
  • Default: 200

JSON Config Tab

Advanced PaddleOCR configuration:

OCR Task Configuration

json
{
  "common": {
    "model_name": "PADDLE",
    "config_type": "predict",
    "task": "OCR",
    "device": "auto",
    "verbose": false,
    "max_batch": 100
  },
  "task_specific": {
    "det_model": "en_PP-OCRv3_det",
    "rec_model": "en_PP-OCRv4_rec"
  },
  "model_specific": {
    "required": {
      "use_doc_unwarping": false,
      "use_doc_orientation_classify": false,
      "use_textline_orientation": false
    }
  }
}

Detection Task Configuration

json
{
  "common": {
    "model_name": "PADDLE",
    "task": "Detection"
  },
  "task_specific": {
    "det_model": "en_PP-OCRv3_det"
  }
}

Recognition Task Configuration

json
{
  "common": {
    "model_name": "PADDLE",
    "task": "Recognition"
  },
  "task_specific": {
    "rec_model": "en_PP-OCRv4_rec"
  }
}

Document Rotation Task Configuration

json
{
  "common": {
    "model_name": "PADDLE",
    "task": "Document_Orientation"
  }
}

Note: Document Rotation is a classification task and does NOT accept OCR preprocessing parameters.
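
Because of this restriction, a function node can strip the OCR-only parameters defensively before a config reaches a Document Rotation node. A minimal sketch (the helper name `sanitizeRotationConfig` is ours; field names follow the JSON examples above):

```javascript
// Strip OCR-only preprocessing parameters from a Document Rotation config.
// Field names follow the JSON Config examples above.
function sanitizeRotationConfig(config) {
  const clean = JSON.parse(JSON.stringify(config)); // deep copy
  if (clean.common && clean.common.task === "Document_Orientation") {
    // use_doc_unwarping etc. live under model_specific and are rejected
    delete clean.model_specific;
    // det/rec models are not used by the classification task either
    delete clean.task_specific;
  }
  return clean;
}
```

Configs for other tasks pass through unchanged, so the helper is safe to apply to every message.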

Image Format

Same as Inferencer node:

Rosepetal Bitmap Format

javascript
{
  width: 1920,
  height: 1080,
  data: Buffer,
  colorSpace: "RGB",  // or "GRAY", "BGR"
  channels: 3,
  dtype: "uint8"
}

JPEG/PNG Buffers

javascript
msg.payload = imageBuffer;

Array of Images

javascript
msg.payload = [image1, image2, image3];

Input

Basic OCR

javascript
msg.payload = {
  width: 1024,
  height: 768,
  data: documentImageBuffer,
  colorSpace: "RGB"
};
return msg;

Document Rotation Detection

javascript
// Configure node with Task: Document Rotation
msg.payload = documentImage;
return msg;

Batch Processing

javascript
msg.payload = [doc1, doc2, doc3];
return msg;

Output

OCR Results

javascript
{
  payload: [
    {
      text: "Hello World",
      confidence: 0.98,
      box: {
        points: [[x1, y1], [x2, y2], [x3, y3], [x4, y4]],
        xyxy: [x_min, y_min, x_max, y_max]
      }
    },
    {
      text: "Sample Text",
      confidence: 0.95,
      box: {
        points: [[x1, y1], [x2, y2], [x3, y3], [x4, y4]],
        xyxy: [x_min, y_min, x_max, y_max]
      }
    }
  ],
  performance: {
    payload: {
      inferenceTime: 150.5,
      preprocessTime: 12.3,
      postprocessTime: 8.7,
      totalTime: 171.5
    }
  }
}
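
Because each result carries its bounding box, the flat list can be re-sorted into rough reading order before joining the text. A sketch assuming the `xyxy` layout shown above (`lineTolerance` is our own parameter for grouping boxes onto the same line):

```javascript
// Sort OCR results into rough reading order (top-to-bottom, then
// left-to-right) using the xyxy boxes [x_min, y_min, x_max, y_max].
function toReadingOrder(results, lineTolerance = 10) {
  return [...results].sort((a, b) => {
    const dy = a.box.xyxy[1] - b.box.xyxy[1];
    // Boxes whose tops are within the tolerance count as the same line.
    if (Math.abs(dy) > lineTolerance) return dy;
    return a.box.xyxy[0] - b.box.xyxy[0];
  });
}

const results = [
  { text: "World", box: { xyxy: [120, 52, 200, 80] } },
  { text: "Hello", box: { xyxy: [10, 50, 100, 80] } },
  { text: "Below", box: { xyxy: [10, 100, 90, 130] } },
];
const line = toReadingOrder(results).map(r => r.text).join(" ");
// line === "Hello World Below"
```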

Detection Results

javascript
{
  payload: [
    {
      box: {
        points: [[100, 50], [300, 50], [300, 80], [100, 80]],
        xyxy: [100, 50, 300, 80]
      }
    },
    {
      box: {
        points: [[100, 100], [400, 100], [400, 140], [100, 140]],
        xyxy: [100, 100, 400, 140]
      }
    }
  ]
}
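
Detection output is often post-filtered before recognition, for example to drop tiny boxes that are usually noise. A sketch using the `xyxy` form above (`minArea` is our own parameter):

```javascript
// Drop detection boxes below a minimum pixel area, computed from the
// xyxy form [x_min, y_min, x_max, y_max].
function filterSmallBoxes(detections, minArea = 100) {
  return detections.filter(d => {
    const [x1, y1, x2, y2] = d.box.xyxy;
    return (x2 - x1) * (y2 - y1) >= minArea;
  });
}

const detections = [
  { box: { xyxy: [100, 50, 300, 80] } },  // area 6000, kept
  { box: { xyxy: [10, 10, 15, 15] } },    // area 25, dropped
];
const kept = filterSmallBoxes(detections);
// kept.length === 1
```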

Recognition Results

javascript
{
  payload: [
    {
      text: "Recognized text line 1",
      confidence: 0.97
    },
    {
      text: "Recognized text line 2",
      confidence: 0.89
    }
  ]
}

Document Rotation Results

javascript
{
  payload: {
    rotation: 90,        // 0, 90, 180, or 270
    confidence: 0.99
  }
}
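
To straighten the document, the detected angle is typically inverted before rotating. A one-line helper (our own naming) that normalizes the counter-rotation into the 0–359 range instead of passing a negative angle:

```javascript
// Counter-rotation to apply for a detected rotation of 0/90/180/270,
// normalized to the 0-359 range.
function counterRotation(rotation) {
  return (360 - rotation) % 360;
}
// counterRotation(90) === 270, counterRotation(0) === 0
```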

Usage Examples

Example 1: Extract Text from Document

javascript
// Function node: Load document
const fs = require('fs');
msg.payload = fs.readFileSync('/path/to/document.jpg');
return msg;

OCR Inferencer configuration:

  • Task: OCR
  • Detection: en_PP-OCRv3_det
  • Recognition: en_PP-OCRv4_rec
javascript
// Function node: Extract all text
const allText = msg.payload.map(result => result.text).join('\n');
msg.payload = allText;
return msg;

Example 2: Document Preprocessing Pipeline

[Load Image] → [OCR: Document Rotation] → [Function: Rotate] → [OCR: Full OCR] → [Output]

Step 1: Detect rotation

javascript
// Function node after OCR Inferencer (Task: Document Rotation)
// (assumes the original image was saved to msg.originalImage upstream)
msg.rotation = msg.payload.rotation;
msg.payload = msg.originalImage;
return msg;

Step 2: Rotate image

javascript
// Function node: Rotate based on detected angle
const sharp = require('sharp');
msg.payload = await sharp(msg.payload)
  .rotate(-msg.rotation)  // Counter-rotate
  .toBuffer();
return msg;

Step 3: Extract text

javascript
// OCR Inferencer (Task: OCR)
msg.extractedText = msg.payload.map(r => r.text);
return msg;

Example 3: Multi-Language Documents

javascript
// Function node: split the page into per-language regions
// (cropEnglishRegion/cropArabicRegion are user-supplied helpers)
msg.englishPart = {
  width: 800,
  height: 400,
  data: cropEnglishRegion(imageBuffer)
};

msg.arabicPart = {
  width: 800,
  height: 400,
  data: cropArabicRegion(imageBuffer)
};
return msg;

Two OCR nodes with different models:

  • Node 1: English detection + recognition models
  • Node 2: Arabic detection + recognition models

Example 4: Table Detection and OCR

javascript
// Step 1: Detect text regions
// OCR Inferencer (Task: Detection); the original image was saved
// to msg.originalImage before this node
msg.textBoxes = msg.payload;
msg.payload = msg.originalImage;
return msg;
javascript
// Step 2: Filter boxes by position (table columns)
msg.column1 = msg.textBoxes.filter(box =>
  box.box.xyxy[0] < 200  // X position < 200px
);

msg.column2 = msg.textBoxes.filter(box =>
  box.box.xyxy[0] >= 200 && box.box.xyxy[0] < 400
);
return msg;
javascript
// Step 3: Crop and recognize each region
// (cropImage is a user-supplied helper)
const crops = msg.column1.map(box => cropImage(msg.originalImage, box.box.xyxy));
msg.payload = crops;
return msg;

OCR Inferencer (Task: Recognition)
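
A possible Step 4: the Recognition node returns one result per submitted crop, in submission order (Example 6 relies on the same index alignment), so text can be re-attached to the boxes the crops came from. A self-contained sketch with placeholder data standing in for the earlier steps:

```javascript
// Step 4: Re-attach recognized text to the source boxes.
// column1 and recResults are placeholder data here; in the flow they
// come from the detection filter (Step 2) and the Recognition node.
const column1 = [
  { box: { xyxy: [10, 50, 180, 80] } },
  { box: { xyxy: [10, 100, 180, 130] } },
];
const recResults = [
  { text: "Invoice #", confidence: 0.97 },
  { text: "Total", confidence: 0.93 },
];
// Recognition results arrive in submission order, so index i matches.
const cells = column1.map((b, i) => ({
  text: recResults[i].text,
  confidence: recResults[i].confidence,
  xyxy: b.box.xyxy,
}));
```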

Example 5: Confidence Filtering

javascript
// Function node: Filter low confidence results
const minConfidence = 0.85;

msg.highConfidence = msg.payload.filter(result =>
  result.confidence >= minConfidence
);

msg.lowConfidence = msg.payload.filter(result =>
  result.confidence < minConfidence
);

// Log low confidence for review
if (msg.lowConfidence.length > 0) {
  node.warn(`${msg.lowConfidence.length} low confidence results`);
}

msg.payload = msg.highConfidence;
return msg;

Example 6: Batch Document Processing

javascript
// Function node: Load all documents
const fs = require('fs');
const files = fs.readdirSync('/documents/inbox');

const imageFiles = files
  .filter(f => f.endsWith('.jpg') || f.endsWith('.png'));

msg.payload = imageFiles.map(f => fs.readFileSync(`/documents/inbox/${f}`));
msg.fileNames = imageFiles;  // keep names aligned with the filtered payload
return msg;

OCR Inferencer (Promise mode enabled)

javascript
// Promise Reader resolves all
msg.ocrResults = await Promise.all(msg.promises);

// Combine with filenames
msg.documents = msg.fileNames.map((name, idx) => ({
  fileName: name,
  text: msg.ocrResults[idx].map(r => r.text).join('\n')
}));

return msg;

Performance Optimization

Model Selection

For speed:

  • Detection: en_PP-OCRv3_det (faster)
  • Recognition: en_PP-OCRv4_rec

For accuracy:

  • Detection: ch_PP-OCRv4_det (more accurate)
  • Recognition: Match to language

Preprocessing

Enable for challenging documents:

  • Document unwarping: Curved pages, photos of documents
  • Document orientation: Scanned documents in wrong rotation
  • Text line orientation: Mixed orientation text

Disable for clean documents:

  • All preprocessing OFF for scanned documents
  • Faster processing, less overhead

Image Preparation

Before OCR:

javascript
const sharp = require('sharp');

// Enhance contrast
msg.payload = await sharp(msg.payload)
  .normalize()
  .toBuffer();

// Increase resolution
msg.payload = await sharp(msg.payload)
  .resize(2000, 2000, { fit: 'inside', withoutEnlargement: true })
  .toBuffer();

Batch Size

Optimal batch sizes:

  • Document rotation: 10-20 images
  • Detection only: 10-15 images
  • Full OCR: 5-10 images
  • Recognition only: 15-25 images
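
The sizes above can be applied with a small batching helper (our own sketch) before feeding the node, e.g. sending one message per batch with Promise mode enabled:

```javascript
// Split a large image array into batches of a chosen size
// (e.g. 5-10 for full OCR, per the guidance above).
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
// chunk([1, 2, 3, 4, 5, 6, 7], 3) -> [[1, 2, 3], [4, 5, 6], [7]]
```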

Docker Container Management

Container Lifecycle

Similar to Inferencer node:

  1. Auto-starts on deploy
  2. Ready indicator in node status
  3. Clean removal on redeploy

Memory Usage

Per container (approximate):

  • Detection model: 500MB-1GB
  • Recognition model: 500MB-1GB
  • Full OCR: 1.5GB-2.5GB
  • Document rotation: 300MB-500MB

Container Logs

bash
docker logs <ocr-container-id>

Available Models

Detection Models

| Model | Language | Version | Use Case |
|-------|----------|---------|----------|
| en_PP-OCRv3_det | English | v3 | English documents, fast |
| ch_PP-OCRv4_det | Chinese | v4 | Chinese/mixed, accurate |
| ml_PP-OCRv3_det | Multi-lang | v3 | Multiple languages |

Recognition Models

| Model | Language | Version | Use Case |
|-------|----------|---------|----------|
| en_PP-OCRv4_rec | English | v4 | English text, latest |
| ch_PP-OCRv4_rec | Chinese | v4 | Chinese characters |
| latin_PP-OCRv3_rec | Latin | v3 | European languages |
| arabic_PP-OCRv3_rec | Arabic | v3 | Arabic script |
| cyrillic_PP-OCRv3_rec | Cyrillic | v3 | Russian, etc. |
| korean_PP-OCRv3_rec | Korean | v3 | Korean characters |
| japan_PP-OCRv3_rec | Japanese | v3 | Japanese text |

Error Handling

Common Errors

"No text detected"

  • Cause: Image too low quality, wrong preprocessing
  • Solution: Enhance image contrast, adjust preprocessing

"Low confidence results"

  • Cause: Poor image quality, wrong recognition model
  • Solution: Improve image quality, select correct language model

"Container startup failed"

  • Cause: Insufficient memory, model download failed
  • Solution: Check available memory, verify model files

"Invalid task configuration"

  • Cause: Document rotation with OCR preprocessing params
  • Solution: Remove preprocessing params from Document_Orientation task

Debugging

Check text detection:

  • Use Detection task first
  • Visualize detected boxes
  • Verify boxes capture all text

Check recognition:

  • Extract detected regions manually
  • Test with Recognition task
  • Validate model matches language

Enable debug visualization:

  • Shows detected boxes on image
  • Displays recognized text
  • Helps identify issues

Best Practices

Image Quality

  1. Minimum resolution: 100 DPI for good results
  2. Contrast: High contrast between text and background
  3. Lighting: Even illumination, no shadows
  4. Focus: Sharp, clear text

Model Selection

  1. Match model to language: Don't use English model for Chinese
  2. Use latest versions: v4 models generally better than v3
  3. Test on sample data: Validate before production

Preprocessing

  1. Start simple: Try without preprocessing first
  2. Add selectively: Enable only needed features
  3. Test impact: Measure accuracy improvement vs speed cost

Confidence Thresholds

  1. Set appropriate filters: 0.8-0.9 for critical applications
  2. Log low confidence: Review for quality issues
  3. Adjust per use case: Medical records need higher threshold than labels

Batch Processing

  1. Group similar documents: Same language, same orientation
  2. Use promise mode: For large batches
  3. Monitor memory: Don't exceed capacity

Troubleshooting

Poor OCR Accuracy

Possible causes:

  • Wrong language model
  • Poor image quality
  • Incorrect preprocessing

Solutions:

  • Verify model matches text language
  • Enhance image (contrast, resolution)
  • Try different preprocessing combinations

Missing Text

Possible causes:

  • Text too small
  • Low contrast
  • Unusual fonts

Solutions:

  • Increase image resolution
  • Enhance contrast
  • Use multi-language detection model

Slow Processing

Possible causes:

  • Too many preprocessing steps
  • Large images
  • Complex documents

Solutions:

  • Disable unnecessary preprocessing
  • Resize images before OCR
  • Process in smaller batches

See Also