
Inferencer Node

Overview

The Inferencer node provides AI-powered computer vision inference by running machine learning models in Docker containers. It acts as a bridge between Node-RED flows and Python-based ML models, supporting detection, classification, segmentation, and anomaly detection tasks.

Key Features

  • Multiple ML frameworks: YOLO, PaddlePaddle, RT-DETR, and anomaly-detection models
  • Task support: Object detection, image classification, instance/semantic segmentation
  • Docker-based execution: Isolated Python containers for model inference
  • Load balancing: Run multiple container instances for higher throughput
  • Dynamic models: Load models at runtime or configure statically
  • Auto-warmup: Pre-warm models on startup for faster first predictions
  • Firebase integration: Automatic model download from Firebase storage
  • Promise mode: Asynchronous processing with promise-reader integration
  • Output customization: Configure detection boxes, masks, and result formats
  • Class mapping: Rename or filter model output classes
  • Debug visualization: Display processed images on Node-RED canvas
  • Performance tracking: Built-in metrics for inference timing

Architecture

Components

  1. Node-RED Node: Configuration and message handling
  2. Docker Containers: Python inference servers (1-3 instances)
  3. gRPC Protocol: Communication between Node-RED and containers
  4. Model Storage: /opt/storage/models/ directory
  5. Firebase: Optional model download source

Data Flow

Input Images → Queue → gRPC → Docker Container(s) → ML Model → Predictions → Output

When more than one server is configured, the queue load-balances requests across the containers.

Configuration

Settings Tab

Name

  • Type: String
  • Optional: Yes
  • Description: Display name for the node

Input Field

  • Type: Message property path
  • Default: payload
  • Description: Message field containing input images (single image or array)

Output Field

  • Type: Message property path
  • Default: payload
  • Description: Where inference results will be stored
  • Note: Performance stats saved to msg.performance.<field>
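
For example, with Output Field set to detections, a downstream function node could read the timing stats like this (a sketch; the stat names follow the Output examples later on this page):

javascript
// Read inference timing for the configured output field ("detections" here)
const stats = msg.performance && msg.performance.detections;
if (stats) {
  node.warn(`inference ${stats.inferenceTime} ms, total ${stats.totalTime} ms`);
}
return msg;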

Model Source

  • Type: Select
  • Options:
    • Select here: Choose from available models in /opt/storage/models
    • Dynamic: Provide model name at runtime via message/flow/global property

Model Name (Static Mode)

  • Type: Dropdown
  • Description: Select from available models
  • Display: Shows model name, task type, and framework

Model Field (Dynamic Mode)

  • Type: TypedInput (msg, flow, global)
  • Example: msg.modelName, flow.currentModel, global.activeModel
  • Description: Property path containing model name at runtime
  • Note: Enables automatic model download if Firebase is configured

Auto Warmup (Static Mode)

  • Type: Checkbox
  • Default: Disabled
  • Description: Trigger synthetic inference on startup to prepare model
  • Benefit: Eliminates cold-start latency on first real request

Promise Mode

  • Type: Checkbox
  • Default: Disabled
  • Description: Return promises instead of waiting for results
  • Requires: promise-reader node to resolve promises
  • Use Case: High-throughput batch processing

Promises Field

  • Type: Message property path
  • Default: promises
  • Visible: When Promise Mode enabled
  • Description: Array field to store pending promises

Number of Concurrent Servers

  • Type: Number
  • Range: 1-3
  • Default: 1
  • Description: Docker container instances for load balancing
  • Impact: Higher = more throughput, more memory usage

Maximum Concurrent Predictions

  • Type: Number
  • Range: 1-20
  • Default: 5
  • Description: Parallel requests per server
  • Behavior: Excess requests queue until slot available

Show Debug Image

  • Type: Checkbox
  • Default: Disabled
  • Description: Display processed images on Node-RED canvas

Debug Interval

  • Type: Number
  • Default: 1
  • Visible: When debug enabled
  • Description: Show every Nth image (1 = all images)

Debug Image Width

  • Type: Number (pixels)
  • Default: 200
  • Visible: When debug enabled
  • Description: Display width for debug images

JSON Config Tab

Advanced model configuration in JSON format. Structure varies by model type:

Common Fields (All Models)

json
{
  "common": {
    "model_name": "my-model",
    "config_type": "predict",
    "task": "Detection",
    "device": "auto",
    "image_shape": {
      "width": 640,
      "height": 640
    },
    "verbose": false,
    "max_batch": 100
  }
}

Fields:

  • model_name: Model identifier
  • config_type: Always "predict"
  • task: Detection, Classification, Segmentation, Anomaly
  • device: "auto", "cpu", "cuda", "mps"
  • image_shape: Target dimensions for resizing
  • verbose: Enable detailed logging
  • max_batch: Maximum batch size

Detection Models (YOLO, RT-DETR, PaddlePaddle)

json
{
  "common": { /* ... */ },
  "task_specific": {
    "conf_threshold": 0.5,
    "nms_iou_threshold": 0.7,
    "max_det": 300
  }
}

Task-specific fields:

  • conf_threshold: Confidence threshold (0.0-1.0)
  • nms_iou_threshold: Non-maximum suppression IoU threshold
  • max_det: Maximum detections per image

Classification Models

json
{
  "common": { /* ... */ },
  "task_specific": {
    "top_k": 5
  }
}

Task-specific fields:

  • top_k: Return top K predictions

Segmentation Models (YOLO)

json
{
  "common": { /* ... */ },
  "task_specific": {
    "conf_threshold": 0.5,
    "retina_masks": true
  }
}

Task-specific fields:

  • conf_threshold: Detection confidence threshold
  • retina_masks: High-resolution mask output

Anomaly Detection

json
{
  "common": { /* ... */ },
  "task_specific": {
    "threshold": 0.5,
    "normalize": true
  },
  "model_specific": {
    "optional": {
      "image_threshold": 0.5,
      "pixel_threshold": 0.5
    }
  }
}

Output Formats Tab

Configure detection and segmentation output formats.

Detection Output Formats

  • boxes_xyxy: Bounding boxes [x1, y1, x2, y2] (top-left, bottom-right)
  • boxes_xywh: Bounding boxes [x_center, y_center, width, height]
  • boxes_tlwh: Bounding boxes [x_top_left, y_top_left, width, height]
  • boxes_corners: Four corners [[x1,y1], [x2,y2], [x3,y3], [x4,y4]]
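
For illustration, the same 200×250 detection box expressed in each format (a worked example, not output from the node):

javascript
// One box with top-left (100, 150) and bottom-right (300, 400)
const formats = {
  boxes_xyxy:    [100, 150, 300, 400],
  boxes_xywh:    [200, 275, 200, 250],   // center x, center y, width, height
  boxes_tlwh:    [100, 150, 200, 250],   // top-left x, top-left y, width, height
  boxes_corners: [[100, 150], [300, 150], [300, 400], [100, 400]]
};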

Segmentation Output Formats

  • masks_rle: Run-length encoding (compact format)
  • masks_polygon: Polygon contours
  • masks_bitmap: Binary pixel masks

Classes Tab

Map or filter model output classes.

Features:

  • Rename classes to custom labels
  • Filter results by remapping to empty string
  • Per-model configuration storage

Example:

Original: "person" → Custom: "human"
Original: "car" → Custom: "vehicle"
Original: "background" → Custom: "" (filtered out)

Image Format

The node accepts multiple image formats:

1. Rosepetal Bitmap Format

javascript
{
  width: 1920,
  height: 1080,
  data: Buffer,           // Raw pixel data
  colorSpace: "RGB",      // "GRAY", "RGB", "RGBA", "BGR", "BGRA"
  channels: 3,            // Auto-inferred if omitted
  dtype: "uint8"          // Currently only uint8 supported
}

Color space mapping:

  • GRAY: 1 channel
  • RGB / BGR: 3 channels
  • RGBA / BGRA: 4 channels

2. JPEG/PNG Buffers

javascript
const fs = require('fs'); // available in function nodes when external modules are enabled
msg.payload = fs.readFileSync('image.jpg');

The node automatically decodes standard image formats.

3. Array of Images

javascript
msg.payload = [image1, image2, image3];

Processes multiple images in batch for improved performance.

Input

Basic Input

javascript
msg.payload = imageBuffer;
return msg;

Dynamic Model Selection

javascript
// Configure node with Model Source: Dynamic, Model Field: msg.modelName
msg.modelName = "yolo-detection-v8";
msg.payload = imageBuffer;
return msg;

Warmup Request

javascript
msg.warmup = true;
msg.payload = syntheticImage; // Or omit for auto-generated
return msg;

Triggers model warmup without returning results.

Output

Detection Results

javascript
{
  payload: [
    {
      box: {
        xyxy: [100, 150, 300, 400],
        xywh: [200, 275, 200, 250],
        confidence: 0.95
      },
      class: "person",
      class_id: 0
    }
  ],
  performance: {
    payload: {
      inferenceTime: 45.2,
      preprocessTime: 5.1,
      postprocessTime: 3.8,
      totalTime: 54.1
    }
  }
}

Classification Results

javascript
{
  payload: [
    {
      class: "cat",
      class_id: 281,
      confidence: 0.98
    },
    {
      class: "dog",
      class_id: 179,
      confidence: 0.01
    }
  ]
}
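
To keep only the best prediction per message, a small function-node sketch:

javascript
// Pick the highest-confidence class from the classification results
if (!Array.isArray(msg.payload) || msg.payload.length === 0) return null;
const top = msg.payload.reduce((best, p) => (p.confidence > best.confidence ? p : best));
msg.topic = top.class;
msg.payload = top;
return msg;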

Segmentation Results

javascript
{
  payload: [
    {
      box: { xyxy: [100, 150, 300, 400] },
      class: "person",
      mask: {
        rle: "...",           // Run-length encoded
        polygon: [[x1, y1], [x2, y2], ...],
        bitmap: Buffer        // Binary mask
      }
    }
  ]
}

Promise Mode Output

javascript
{
  promises: [
    Promise { <pending> },
    Promise { <pending> }
  ]
}

Resolve with promise-reader node.

Usage Examples

Example 1: Simple Object Detection

javascript
// Function node: Load image
msg.payload = {
  width: 640,
  height: 480,
  data: imageBuffer,
  colorSpace: "RGB"
};
return msg;

Inferencer configuration:

  • Model: yolo-v8-detection
  • Input: payload
  • Output: detections

javascript
// Function node: Filter high confidence
msg.payload = msg.detections.filter(det => det.box.confidence > 0.8);
return msg;

Example 2: Batch Processing

javascript
// Function node: Prepare batch
const images = [
  { width: 640, height: 480, data: buffer1, colorSpace: "RGB" },
  { width: 640, height: 480, data: buffer2, colorSpace: "RGB" },
  { width: 640, height: 480, data: buffer3, colorSpace: "RGB" }
];
msg.payload = images;
return msg;

Output:

javascript
msg.payload = [
  [detection1a, detection1b],  // Results from image 1
  [detection2a],                // Results from image 2
  []                            // No detections in image 3
];
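
If you need a flat list afterwards, one way to flatten the per-image arrays while keeping track of the source image (a sketch):

javascript
// Flatten per-image result arrays and tag each detection with its image index
msg.payload = msg.payload.flatMap((detections, imageIndex) =>
  detections.map(det => ({ ...det, imageIndex }))
);
return msg;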

Example 3: Dynamic Model Selection

javascript
// Function node: Choose model based on input
if (msg.topic === "quality-check") {
  msg.modelName = "defect-detection-model";
} else if (msg.topic === "classification") {
  msg.modelName = "product-classifier";
}
msg.payload = imageData;
return msg;

Example 4: Promise Mode for High Throughput

javascript
// Function node: Send batch with promises
msg.payload = imageArray;
return msg;

Inferencer (Promise enabled)

javascript
// Promise Reader node resolves all
msg.results = await Promise.all(msg.promises);
return msg;

Example 5: Class Filtering

Inferencer Classes config:

person → person
car → vehicle
truck → vehicle
bicycle →
motorcycle →

Result: Only "person" and "vehicle" classes in output, bicycle/motorcycle filtered.

Example 6: Multi-Model Pipeline

[Camera] → [Inferencer: Detection] → [Function: Filter] → [Inferencer: Classification] → [Output]

Detection step:

javascript
// Keep the raw detections and replace the payload with cropped regions.
// cropDetections is a user-supplied helper (see the sketch below); it needs
// the original image, e.g. saved upstream in msg.image.
msg.detections = msg.payload;
msg.payload = cropDetections(msg.detections, msg.image); // Extract regions
return msg;
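
A hedged sketch of such a helper, assuming detections use the xyxy box format and the original Rosepetal bitmap was preserved in msg.image by an earlier node:

javascript
// Hypothetical helper: cut each detected region out of a raw bitmap
function cropDetections(detections, image) {
  const channels = image.channels ||
    { GRAY: 1, RGB: 3, BGR: 3, RGBA: 4, BGRA: 4 }[image.colorSpace];
  return detections.map(det => {
    const [x1, y1, x2, y2] = det.box.xyxy.map(Math.round);
    const width = x2 - x1;
    const height = y2 - y1;
    const data = Buffer.alloc(width * height * channels);
    for (let row = 0; row < height; row++) {
      const src = ((y1 + row) * image.width + x1) * channels;
      image.data.copy(data, row * width * channels, src, src + width * channels);
    }
    return { width, height, data, colorSpace: image.colorSpace };
  });
}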

Classification step:

javascript
// Classify each detected region
msg.classifications = msg.payload;
return msg;

Performance Optimization

Concurrency Settings

Single server, low concurrency:

  • Servers: 1
  • Max concurrent: 5
  • Use case: Low memory systems, sporadic requests

Multiple servers, high concurrency:

  • Servers: 3
  • Max concurrent: 10
  • Use case: High-throughput production, powerful hardware

Batch Processing

Optimal batch size:

  • Small models (YOLO-nano): 10-20 images
  • Medium models (YOLO-v8): 5-10 images
  • Large models (Segmentation): 2-5 images
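
A minimal function-node sketch for chunking a large image array before the inferencer, assuming a medium model and a batch size of 10 (tune for your model and hardware):

javascript
// Split a large image array into batches of 10 and send one message per batch
const BATCH_SIZE = 10;
const images = msg.payload;
const messages = [];
for (let i = 0; i < images.length; i += BATCH_SIZE) {
  messages.push({ ...msg, payload: images.slice(i, i + BATCH_SIZE) });
}
return [messages]; // array of messages sent individually on output 1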

Image Preprocessing

Before sending to inferencer:

  • Resize to model's expected dimensions
  • Convert color space if needed
  • Compress if bandwidth limited

javascript
// Resize an encoded image (e.g. a JPEG buffer in msg.payload.data) to the
// model's input size and forward raw pixels in the Rosepetal bitmap format
const sharp = require('sharp');
msg.payload.data = await sharp(msg.payload.data)
  .resize(640, 640)
  .raw()
  .toBuffer();
msg.payload.width = 640;
msg.payload.height = 640;
msg.payload.colorSpace = "RGB"; // raw() output from a typical JPEG source is 3-channel
return msg;

Model Warmup

Enable auto-warmup for:

  • Production environments
  • Time-sensitive applications
  • Models with slow cold-start

Disable warmup for:

  • Development/testing
  • Dynamic model switching
  • Memory-constrained systems

Docker Container Management

Container Lifecycle

  1. Startup: Node creates Docker containers
  2. Ready: Containers accept requests
  3. Running: Process inference requests
  4. Shutdown: Clean container removal on deploy

Container Ports

Automatically assigned from available ports. No manual configuration needed.

Container Logs

bash
# View container logs
docker logs <container-id>

# Find inferencer containers
docker ps | grep rosepetal-serving

Memory Usage

Per container (approximate):

  • YOLO-nano: 500MB-1GB
  • YOLO-v8: 2GB-4GB
  • Segmentation: 4GB-8GB
  • PaddlePaddle OCR: 2GB-3GB

Multiple servers multiply memory usage.

Model Management

Model Directory Structure

/opt/storage/models/
├── yolo-v8-detection/
│   ├── model.pt
│   └── config.json
├── product-classifier/
│   ├── model.onnx
│   └── config.json
└── segmentation-model/
    ├── model.pt
    └── config.json

Adding Models Manually

  1. Create directory: /opt/storage/models/<model-name>/
  2. Copy model file: model.pt, model.onnx, etc.
  3. Create config.json (optional, for auto-detection)
  4. Refresh Node-RED editor

Firebase Model Download

Requirements:

  • Firebase config node connected
  • Model exists in Firebase storage
  • Dynamic mode enabled

Behavior:

  • Checks local storage first
  • Downloads if missing
  • Caches for future use
  • Shows download progress in logs

Error Handling

Common Errors

"Docker image not available"

  • Cause: Inference Docker image not pulled
  • Solution: docker pull <image-name>

"Model not found"

  • Cause: Model doesn't exist in /opt/storage/models
  • Solution: Verify model name, check directory, download if needed

"Container failed to start"

  • Cause: Port conflict, insufficient memory, corrupted model
  • Solution: Check Docker logs, verify system resources

"gRPC connection failed"

  • Cause: Container not ready, network issue
  • Solution: Wait for container startup, check Docker status

"Invalid image format"

  • Cause: Missing required fields, wrong data type
  • Solution: Validate image object structure

"CUDA out of memory"

  • Cause: Batch too large, concurrent requests exceeded capacity
  • Solution: Reduce batch size, lower concurrency, use CPU

Debugging

Enable verbose logging:

json
{
  "common": {
    "verbose": true
  }
}

Check container logs:

bash
docker logs <container-id>

Enable debug visualization:

  • Check "Show debug image"
  • Set debug interval: 1
  • Verify images display correctly

Promise Mode

Overview

Promise mode enables asynchronous batch processing:

  • Send multiple images without waiting
  • Process results when ready
  • Higher throughput for large batches

Configuration

Inferencer:

  • Enable "Promise" checkbox
  • Set "Promises field": promises

Flow:

[Function: Batch] → [Inferencer: Promise] → [Promise Reader] → [Function: Process]

Usage

Send batch:

javascript
msg.payload = arrayOf100Images;
return msg;

Inferencer outputs:

javascript
msg.promises = [Promise, Promise, ...]; // 100 promises

Promise Reader resolves:

javascript
msg.results = [...]; // 100 resolved results
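
If you prefer resolving in a function node rather than the promise-reader node, a hedged sketch with per-image error handling:

javascript
// Resolve all pending predictions; keep failures per image instead of
// rejecting the whole batch
const settled = await Promise.allSettled(msg.promises);
msg.results = settled.map(s => (s.status === "fulfilled" ? s.value : null));
msg.errors = settled.filter(s => s.status === "rejected").map(s => s.reason);
return msg;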

Benefits

  • Non-blocking operation
  • Better resource utilization
  • Simplified batch handling
  • Automatic parallelization

Best Practices

Model Selection

  1. Match task to model: Detection vs Classification vs Segmentation
  2. Consider speed/accuracy tradeoff: nano (fast) vs large (accurate)
  3. Test on representative data: Validate before production
  4. Version models: Track changes, enable rollback

Configuration

  1. Use JSON config for repeatability: Save configurations
  2. Document class mappings: Clear naming conventions
  3. Set appropriate thresholds: Balance false positives/negatives
  4. Enable warmup in production: Eliminate first-request latency

Performance

  1. Right-size concurrency: Match hardware capabilities
  2. Use batch processing: Process multiple images together
  3. Monitor memory usage: Prevent OOM errors
  4. Profile inference times: Identify bottlenecks

Maintenance

  1. Clean up unused containers: docker system prune
  2. Monitor disk space: Models can be large
  3. Update Docker images: Stay current with improvements
  4. Log performance metrics: Track degradation over time

Troubleshooting

Slow Inference

Possible causes:

  • Model too large for hardware
  • CPU inference on GPU model
  • High concurrent load
  • Network bottleneck (if remote storage)

Solutions:

  • Use smaller model variant
  • Enable GPU if available
  • Reduce concurrency
  • Cache models locally

High Memory Usage

Possible causes:

  • Too many servers
  • Large batch sizes
  • Memory leak in model

Solutions:

  • Reduce number of servers
  • Process smaller batches
  • Restart containers periodically
  • Update to latest image

Inconsistent Results

Possible causes:

  • Wrong color space
  • Incorrect image dimensions
  • Threshold too sensitive

Solutions:

  • Validate input format
  • Check preprocessing
  • Adjust confidence thresholds
  • Enable debug visualization

See Also