From Dataset Node
Overview
The From Dataset node streams images from Firebase datasets at a controlled rate. It's designed for processing large datasets without overwhelming the system, featuring intelligent buffering, automatic download management, and flexible filtering options.
Key Features
- Controlled streaming: Set interval between images
- Intelligent buffering: Pre-downloads images for smooth delivery
- Tag filtering: Process only specific labels
- Set filtering: Filter by TRAIN/VALID/TEST/PREDETERMINED sets
- Loop mode: Continuous streaming or one-time processing
- Pause/Resume: Control stream mid-process
- Multiple color spaces: RGB, BGR, RGBA, BGRA
- Progress tracking: Monitor processing status
- BMP decoding: Native BMP file support
Configuration
Properties
Firebase Config
- Type: Node reference
- Required: Yes
- Description: Firebase configuration node
Dataset
- Type: Dropdown
- Required: Yes
- Description: Dataset to stream from
Interval (ms)
- Type: Number
- Default: 5000
- Description: Milliseconds between images
Loop
- Type: Checkbox
- Default: false
- Description: Restart from beginning after completion
Color Space
- Type: Select
- Options: RGB, BGR, RGBA, BGRA
- Default: RGB
- Description: Output image color format
Tag Filter
- Type: Multi-select
- Required: No
- Description: Only process images with selected tags
Set Filter
- Type: Multi-select
- Options: TRAIN, VALID, TEST, PREDETERMINED
- Required: No
- Description: Only process images from selected sets
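The properties above can be summarized as a plain object with the documented defaults. The sketch below is only an illustration: the property names (`interval`, `loop`, `colorSpace`, `tagFilter`, `setFilter`) and the `normalizeConfig` helper are hypothetical and are not taken from the node's actual schema. The channel counts follow from the color space names (3 without alpha, 4 with alpha).

```javascript
// Hypothetical helper: property names are illustrative, not the node's real schema.
const COLOR_SPACE_CHANNELS = new Map([
  ["RGB", 3],
  ["BGR", 3],
  ["RGBA", 4],
  ["BGRA", 4],
]);
const SET_TYPES = ["TRAIN", "VALID", "TEST", "PREDETERMINED"];

function normalizeConfig(config) {
  // Start from the documented defaults, then apply the caller's values.
  const merged = {
    interval: 5000,
    loop: false,
    colorSpace: "RGB",
    tagFilter: [],
    setFilter: [],
    ...config,
  };
  if (!Number.isFinite(merged.interval) || merged.interval <= 0) {
    throw new RangeError("interval must be a positive number of milliseconds");
  }
  if (!COLOR_SPACE_CHANNELS.has(merged.colorSpace)) {
    throw new RangeError(`unsupported color space: ${merged.colorSpace}`);
  }
  const badSet = merged.setFilter.find((s) => !SET_TYPES.includes(s));
  if (badSet !== undefined) {
    throw new RangeError(`unknown set type: ${badSet}`);
  }
  // Attach the channel count implied by the chosen color space.
  return { ...merged, channels: COLOR_SPACE_CHANNELS.get(merged.colorSpace) };
}
```

Validating up front like this makes a misconfigured interval or color space fail early rather than partway through a long stream.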
Input
Control Messages
Start Streaming
javascript
msg.payload = { action: "start" };
Pause Streaming
javascript
msg.payload = { action: "pause" };
Resume Streaming
javascript
msg.payload = { action: "resume" };
Stop Streaming
javascript
msg.payload = { action: "stop" };
Output
Image Message
javascript
{
  payload: {
    width: 1920,
    height: 1080,
    data: Buffer,
    colorSpace: "RGB",
    channels: 3,
    dtype: "uint8"
  },
  imageIndex: 0,
  totalImages: 1000,
  imageId: "img_abc123",
  tags: ["OK", "frontal"],
  setType: "TRAIN",
  progress: {
    current: 1,
    total: 1000,
    percentage: 0.1
  }
}
Progress Fields
- imageIndex: Current image index (0-based)
- totalImages: Total images to process
- progress.current: Images processed so far
- progress.total: Total images in dataset
- progress.percentage: Completion percentage
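The relationship between these fields can be sketched as follows. This is an assumption inferred from the sample message (index 0 maps to `current: 1` and `percentage: 0.1` out of 1000), not a copy of the node's code; `computeProgress` and `isComplete` are hypothetical helper names.

```javascript
// Assumption: progress.current counts the image just emitted, i.e. imageIndex + 1.
function computeProgress(imageIndex, totalImages) {
  if (!Number.isInteger(totalImages) || totalImages <= 0) {
    throw new RangeError("totalImages must be a positive integer");
  }
  if (!Number.isInteger(imageIndex) || imageIndex < 0 || imageIndex >= totalImages) {
    throw new RangeError("imageIndex must be in [0, totalImages)");
  }
  const current = imageIndex + 1;
  return {
    current,
    total: totalImages,
    // Multiply before dividing so simple cases are not skewed by rounding.
    percentage: (current * 100) / totalImages,
  };
}

// True once the last image of the pass has been emitted.
function isComplete(progress) {
  return progress.current === progress.total;
}
```

Because `imageIndex` is 0-based while `progress.current` is 1-based, completion checks are simpler against `progress` than against `imageIndex`.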
Usage Examples
Example 1: Process All Images
[Inject: Start] → [From Dataset] → [Inferencer] → [To Dataset]
Inject node:
javascript
msg.payload = { action: "start" };
return msg;
Example 2: Filter by Tag
Configuration:
- Tag Filter: ["DEFECT", "SCRATCH"]
- Set Filter: ["TRAIN"]
Result: Only training images tagged as DEFECT or SCRATCH
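The filtering in this example can be pictured with the sketch below. It assumes that an image passes the tag filter when it carries at least one selected tag (matching "DEFECT or SCRATCH" above), that an empty filter means no restriction, and that image records expose `tags` and `setType` as in the output message. `matchesFilters` is a hypothetical name, not part of the node.

```javascript
// Assumptions: any-tag matching, and an empty filter places no restriction.
function matchesFilters(image, tagFilter = [], setFilter = []) {
  const tagOk =
    tagFilter.length === 0 || image.tags.some((tag) => tagFilter.includes(tag));
  const setOk = setFilter.length === 0 || setFilter.includes(image.setType);
  return tagOk && setOk;
}

// Illustrative records shaped like the fields of the output message.
const images = [
  { imageId: "a", tags: ["DEFECT"], setType: "TRAIN" },
  { imageId: "b", tags: ["OK"], setType: "TRAIN" },
  { imageId: "c", tags: ["SCRATCH", "frontal"], setType: "TEST" },
];

const selected = images.filter((img) =>
  matchesFilters(img, ["DEFECT", "SCRATCH"], ["TRAIN"])
);
```

With these records only image "a" is selected: "b" has no matching tag and "c" is in the TEST set.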
Example 3: Continuous Loop
Configuration:
- Loop: Enabled
- Interval: 1000ms
Behavior: Cycles through dataset repeatedly
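Loop behavior can be thought of as an index that wraps around at the end of the dataset. The following is a minimal sketch under that assumption; `nextIndex` and `passDurationMs` are hypothetical helpers, and the duration estimate ignores any stalls caused by slow downloads.

```javascript
// Returns the index of the next image, or null when streaming should stop.
function nextIndex(currentIndex, totalImages, loop) {
  if (totalImages <= 0) return null; // nothing to stream
  const next = currentIndex + 1;
  if (next < totalImages) return next;
  return loop ? 0 : null; // wrap around only in loop mode
}

// Approximate time for one full pass at the configured interval.
function passDurationMs(totalImages, intervalMs) {
  return totalImages * intervalMs;
}
```

For the configuration above, a dataset of 1000 images at a 1000 ms interval would take roughly 1,000,000 ms (about 17 minutes) per pass.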
Example 4: Progress Monitoring
javascript
// Function node after From Dataset
const pct = msg.progress.percentage.toFixed(1);
node.status({ fill: "blue", shape: "dot", text: `${pct}%` });
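// Forward a message only once the last image has arrived
if (msg.progress.current === msg.progress.total) {
  msg.payload = "Processing complete!";
  return msg;
}
// Otherwise emit nothing; this node only updates its status
return null;
Example 5: Batch Processing with Pause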
javascript
// Function node with two outputs: output 1 continues downstream,
// output 2 is assumed to be wired back to the From Dataset input.
// Pause after every 100 images (imageIndex is 0-based)
if ((msg.imageIndex + 1) % 100 === 0) {
  // Forward the image and pause the stream
  const pauseMsg = { payload: { action: "pause" } };
  node.send([msg, pauseMsg]);
  // Resume after 5 seconds
  setTimeout(() => {
    const resumeMsg = { payload: { action: "resume" } };
    node.send([null, resumeMsg]);
  }, 5000);
} else {
  return msg;
}
Performance Considerations
Buffer Size
The node automatically calculates optimal buffer size based on:
- Download speed
- Interval time
- Target: buffer size = download time divided by 20% of the interval, rounded up
Example:
- Interval: 5000ms
- Download time: 1000ms
- Buffer size: ceil(1000 / (5000 * 0.2)) = 1 image
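The worked example above can be written as a small function. This mirrors only the arithmetic shown in this section; how the node actually measures download time or updates the buffer size is not documented here, and `computeBufferSize` and `targetFraction` are hypothetical names.

```javascript
// Buffer size = ceil(downloadTime / (interval * targetFraction)).
// targetFraction defaults to the 20% figure quoted in this section.
function computeBufferSize(downloadTimeMs, intervalMs, targetFraction = 0.2) {
  if (!(downloadTimeMs > 0) || !(intervalMs > 0) || !(targetFraction > 0)) {
    throw new RangeError("all arguments must be positive numbers");
  }
  return Math.ceil(downloadTimeMs / (intervalMs * targetFraction));
}

const exampleSize = computeBufferSize(1000, 5000); // the example from this section
const slowNetworkSize = computeBufferSize(3000, 5000); // slower downloads, same interval
```

Slower downloads or shorter intervals both push the buffer size up, which is why interval and network speed appear together in the troubleshooting advice.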
Memory Usage
Per buffered image: ~5-10 MB (typical)
- 1920x1080 RGB: ~6 MB (1920 x 1080 x 3 bytes of uint8 data)
- Buffer of 5 images: ~30 MB
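These figures follow from the decoded pixel size. The sketch below assumes uint8 data (one byte per channel, as in the output message) and counts only raw pixel bytes, so real memory use will be somewhat higher; `rawImageBytes` and `bufferMegabytes` are hypothetical helpers.

```javascript
const CHANNELS_PER_PIXEL = new Map([
  ["RGB", 3],
  ["BGR", 3],
  ["RGBA", 4],
  ["BGRA", 4],
]);

// Raw size of one decoded image, assuming one byte per channel (uint8).
function rawImageBytes(width, height, colorSpace = "RGB") {
  const channels = CHANNELS_PER_PIXEL.get(colorSpace);
  if (channels === undefined) {
    throw new RangeError(`unsupported color space: ${colorSpace}`);
  }
  return width * height * channels;
}

// Raw size of a buffer holding imageCount images, in megabytes (10^6 bytes).
function bufferMegabytes(imageCount, width, height, colorSpace = "RGB") {
  return (imageCount * rawImageBytes(width, height, colorSpace)) / 1e6;
}
```

For a 1920x1080 RGB image this gives 6,220,800 bytes, and about 31 MB for five buffered images. Choosing RGBA or BGRA raises both figures by one third.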
Network Usage
Optimization:
- Pre-downloads images before needed
- Only downloads once (no re-fetching)
- Uses Firebase Storage bandwidth only once per image
Best Practices
- Set appropriate intervals: Match to processing speed
- Use tag filters: Reduce unnecessary downloads
- Monitor progress: Track completion percentage
- Handle completion: Check when processing finishes
- Memory management: Don't let downstream nodes queue more images than they can process
- Error handling: Handle download failures gracefully
Troubleshooting
Slow streaming
Causes:
- Slow network connection
- Large images
- Interval too short
Solutions:
- Increase interval
- Check network speed
- Resize images in Firebase
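To confirm whether the stream is really slower than configured, one option is to measure the gap between consecutive messages and compare it with the Interval setting. The sketch below keeps a running mean; `trackInterval` and its state shape are hypothetical and not part of the node.

```javascript
// state: { lastMs: number | null, count: number, meanMs: number }
// Returns a new state with the running mean gap between arrivals.
function trackInterval(state, nowMs) {
  if (state.lastMs === null) {
    // First message: no gap to measure yet.
    return { lastMs: nowMs, count: 0, meanMs: 0 };
  }
  const gap = nowMs - state.lastMs;
  const count = state.count + 1;
  // Incremental mean avoids storing every gap.
  const meanMs = state.meanMs + (gap - state.meanMs) / count;
  return { lastMs: nowMs, count, meanMs };
}

// Simulated arrival times in milliseconds.
let state = { lastMs: null, count: 0, meanMs: 0 };
for (const t of [0, 1000, 3000]) {
  state = trackInterval(state, t);
}
```

In a Function node placed after From Dataset, the state could be kept between messages with `context.get` and `context.set`, passing `Date.now()` as the timestamp. A mean gap well above the configured interval suggests downloads are not keeping up.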
Missing images
Causes:
- Tag filter too restrictive
- Set filter excludes images
- Dataset empty
Solutions:
- Check filter settings
- Verify dataset has images
- Review tag/set assignments
High memory usage
Causes:
- Too many images buffered
- Processing slower than streaming
Solutions:
- Increase interval
- Optimize processing
- Pause during heavy operations
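Pausing during heavy operations can be automated with a simple high/low watermark rule driven by the pause and resume control messages. This is a sketch under assumptions: the count of pending images must be tracked by your own flow, and `backpressureAction`, `controlMessage`, and the watermark values are hypothetical.

```javascript
// Decide whether to pause or resume based on how many images are still
// waiting downstream. Returns "pause", "resume", or null (no change).
function backpressureAction(pending, paused, highWater = 5, lowWater = 1) {
  if (!paused && pending >= highWater) return "pause";
  if (paused && pending <= lowWater) return "resume";
  return null;
}

// Wrap an action in the control message format accepted by From Dataset.
function controlMessage(action) {
  return action === null ? null : { payload: { action } };
}
```

In a flow, the resulting message would be wired back to the From Dataset input. Using two thresholds rather than one avoids rapid pause/resume toggling when the pending count hovers around a single limit.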
See Also
- To Dataset Node - Save processed images
- List Dataset Node - Browse available datasets
- Dataset Upload Node - Upload images to datasets
- Firebase Config Node - Configure Firebase