Datasets - Dataset Management
What are Datasets?
Datasets are organized sets of images used to train artificial intelligence models. Each dataset contains labeled images that allow the system to learn to recognize specific patterns, objects, or defects.
Creating a Dataset
New Dataset
- (+) button at the top of the interface
- Creation form with required fields:
  - Name: Unique dataset identifier
  - Description: Purpose and content of the dataset
  - Type: Type of model to train
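The required fields above can be sketched as a small validation routine. This is only an illustration: the class name `NewDatasetForm`, the `errors` method, and the error messages are hypothetical, and treating the type identifiers used in this guide as the values the form accepts is an assumption.

```python
from dataclasses import dataclass

# Type identifiers as they appear in this guide; treating them as the
# values accepted by the form is an assumption.
ALLOWED_TYPES = {"MULTICLASS", "ANOMALY", "imageObjectDetection", "MULTILABEL"}


@dataclass
class NewDatasetForm:  # hypothetical name, not a platform class
    name: str
    description: str
    type: str

    def errors(self, existing_names: set[str]) -> list[str]:
        """Return human-readable problems; an empty list means the form is valid."""
        problems = []
        if not self.name.strip():
            problems.append("Name is required")
        elif self.name in existing_names:
            # The name is the unique dataset identifier.
            problems.append("Name must be unique")
        if not self.description.strip():
            problems.append("Description is required")
        if self.type not in ALLOWED_TYPES:
            problems.append("Type must be one of the supported dataset types")
        return problems
```

Calling `NewDatasetForm("screws", "Screw inspection", "MULTICLASS").errors(set())` returns an empty list, while the same call with `{"screws"}` as the existing names reports the duplicate name.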
Dataset Types
🏷️ Classification (MULTICLASS)
- Purpose: Categorize images into different classes
- Example: Classify products as "good" or "defective"
- Labeling: One label per image
- Usage: Quality control, automatic categorization
🚨 Anomaly Detection (ANOMALY)
- Purpose: Identify anomalous elements or defects
- Example: Detect scratches, cracks, or deformations
- Labeling: Only "normal" (defect-free) images are needed for training
- Usage: Defect inspection, quality control
🎯 Object Detection (imageObjectDetection)
- Purpose: Locate specific objects in images
- Example: Detect screws, components, or products
- Labeling: Bounding boxes around objects
- Usage: Automatic counting, component localization
🎨 Segmentation (MULTILABEL)
- Purpose: Delimit precise areas of interest
- Example: Mark exactly the defective area
- Labeling: Pixel-by-pixel area selection
- Usage: Area measurement, morphological analysis
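To make the labeling differences between the four types concrete, the sketch below models one plausible annotation shape per type and checks it. The identifiers in `DatasetType` come from the headings above; everything else (`BoundingBox`, `is_valid_annotation`, and the chosen data shapes) is an assumption made for illustration and may not match how the platform stores labels.

```python
from dataclasses import dataclass
from enum import Enum


class DatasetType(Enum):
    CLASSIFICATION = "MULTICLASS"
    ANOMALY_DETECTION = "ANOMALY"
    OBJECT_DETECTION = "imageObjectDetection"
    SEGMENTATION = "MULTILABEL"


@dataclass
class BoundingBox:  # hypothetical shape: top-left corner plus size, in pixels
    label: str
    x: float
    y: float
    width: float
    height: float


def is_valid_annotation(dataset_type: DatasetType, annotation: object) -> bool:
    """Check that an annotation has the shape expected for the dataset type."""
    if dataset_type is DatasetType.CLASSIFICATION:
        # One non-empty label per image.
        return isinstance(annotation, str) and annotation != ""
    if dataset_type is DatasetType.ANOMALY_DETECTION:
        # Training images are all labeled "normal".
        return annotation == "normal"
    if dataset_type is DatasetType.OBJECT_DETECTION:
        # A list of boxes, each with a positive width and height.
        return isinstance(annotation, list) and all(
            isinstance(box, BoundingBox) and box.width > 0 and box.height > 0
            for box in annotation
        )
    # Segmentation: a non-empty rectangular mask of 0/1 pixel values.
    if not isinstance(annotation, list) or not annotation:
        return False
    if not all(isinstance(row, list) for row in annotation):
        return False
    width = len(annotation[0])
    return width > 0 and all(
        len(row) == width and all(pixel in (0, 1) for pixel in row)
        for row in annotation
    )
```

For example, `is_valid_annotation(DatasetType.SEGMENTATION, [[0, 1], [1, 1]])` accepts a 2x2 mask, while a ragged mask such as `[[0, 1], [1]]` is rejected.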
Dataset Management
Dataset Visualization
The interface displays all datasets in card format with:
- Preview image: Representative of the content
- Dataset name: Primary identifier
- Type: Classification, detection, etc.
- Training status: Trained/Untrained
- Creation date: When it was created
- Number of images: Total images contained
Visual Indicators
Dataset States
- 🟢 Green border: Trained dataset ready for predictions
- ⚪ Gray border: Untrained dataset, still being prepared
Preview
- Representative image: Automatically shows an image from the dataset
- Default logo: Shows the platform logo if the dataset has no images
- Hover effect: The card is highlighted when the cursor hovers over it
Filters and Search
🔍 Available Filters
Text Search
- Search field: Search by dataset name
- Real-time search: Filters as you type
- Visual highlighting: Active filters shown in blue
Filter by Type
- All types: View all datasets
- Classification: Only classification datasets
- Anomaly detection: Only anomaly datasets
- Object detection: Only detection datasets
- Segmentation: Only segmentation datasets
Filter by Training Status
- All: View all regardless of status
- Trained: Only datasets with trained models
Sorting
- Date descending: Most recent first (default)
- Date ascending: Oldest first
Reset Filters
- Trash button: Clears all applied filters
- Appears automatically: Only when filters are active
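The way these filters combine can be sketched as a single function: every active filter must match, and the result is then sorted by creation date. The names `DatasetCard` and `filter_datasets` are hypothetical, and case-insensitive substring matching for the text search is an assumption about how the search field behaves.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DatasetCard:  # hypothetical record for one card in the list
    name: str
    type: str
    trained: bool
    created: date


def filter_datasets(
    cards: list[DatasetCard],
    text: str = "",
    type_filter: Optional[str] = None,  # None means "All types"
    trained_only: bool = False,         # False means "All"
    newest_first: bool = True,          # date descending is the default
) -> list[DatasetCard]:
    """Apply all active filters together, then sort by creation date."""
    needle = text.strip().lower()
    selected = [
        card
        for card in cards
        if needle in card.name.lower()
        and (type_filter is None or card.type == type_filter)
        and (not trained_only or card.trained)
    ]
    return sorted(selected, key=lambda card: card.created, reverse=newest_first)
```

Calling `filter_datasets(cards)` with no arguments corresponds to the state after the reset button has cleared every filter.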
Actions on Datasets
Open Dataset
- Click on any card: Opens detailed dataset view
- Full modal view: Opens in full screen
- Navigate between datasets: Use the arrows to move to the previous or next dataset
Available Options
- Edit information: Modify name and description
- Delete dataset: Delete completely (requires confirmation)
- Download: Export images and labels
- Duplicate: Create copy for experiments
Dataset Interface
Main Tabs
📊 Overview
- Dataset statistics: Number of images, classes, etc.
- Distribution charts: Class balance
- General information: Description, creation date
- Training status: Associated models
🏷️ Tags
- Tag management: Create, edit, and delete tags
- Mass assignment: Apply tags to multiple images
- Tag combination: Merge similar tags
- Labeling statistics: Distribution by category
🖼️ Labeling
- Annotation tools: According to dataset type
- Image view: Zoom, pan, rotation
- Quick navigation: Between dataset images
- Auto-save: Annotations saved automatically
🤖 Models
- Train new models: Parameter configuration
- View existing models: Training history
- Performance metrics: Precision, recall, F1-score
- Model comparison: Between different versions
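For readers comparing models, the three metrics listed above follow the standard definitions based on true positives, false positives, and false negatives. The function below is a minimal sketch of those definitions; the name `precision_recall_f1` is hypothetical, and returning 0.0 when a denominator is zero is a convention chosen here, not necessarily what the platform reports.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall, and F1-score from raw counts.

    tp: predictions that are correct detections (true positives)
    fp: predictions that are wrong (false positives)
    fn: real cases the model missed (false negatives)
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1
```

With 8 true positives, 2 false positives, and 2 false negatives, precision and recall are both 8/10, so the F1-score is also 0.8 (up to floating-point rounding).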
📤 Upload
- Drag & drop: Drag images directly
- Multiple selection: Upload several images at once
- Supported formats: JPG, PNG, BMP, TIFF
- Upload progress: Real-time progress bar
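Before dragging a large folder into the uploader, it can help to check locally which files have a supported format. The sketch below does this by file extension. The function name `split_by_support` is hypothetical, and accepting the `.jpeg` and `.tif` spellings alongside `.jpg` and `.tiff` is an assumption, since this guide lists only the format names.

```python
from pathlib import Path

# Extensions for the listed formats (JPG, PNG, BMP, TIFF).
# Including the ".jpeg" and ".tif" variants is an assumption.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff"}


def split_by_support(filenames: list[str]) -> tuple[list[str], list[str]]:
    """Split filenames into (accepted, rejected) by extension, ignoring case."""
    accepted: list[str] = []
    rejected: list[str] = []
    for name in filenames:
        if Path(name).suffix.lower() in SUPPORTED_EXTENSIONS:
            accepted.append(name)
        else:
            rejected.append(name)
    return accepted, rejected
```

This checks the extension only; it does not open the file, so a corrupt or mislabeled image would still pass.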
✅ Validate (Object Detection Only)
- Annotation validation: Verify labeling quality
- Error detection: Duplicate or incorrect annotations
- Validation statistics: Percentage of valid images
- Assisted correction: Improvement suggestions
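The kinds of errors this tab looks for can be illustrated with a small check over bounding boxes. This is a sketch of one possible approach, not the platform's validation logic: the `Box` type, the `find_problems` function, and the rule that two same-label boxes with an intersection-over-union (IoU) of at least 0.9 count as duplicates are all assumptions.

```python
from typing import NamedTuple


class Box(NamedTuple):  # hypothetical box format: corner coordinates in pixels
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes; 0.0 when they do not overlap."""
    inter_w = min(a.x_max, b.x_max) - max(a.x_min, b.x_min)
    inter_h = min(a.y_max, b.y_max) - max(a.y_min, b.y_min)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    area_a = (a.x_max - a.x_min) * (a.y_max - a.y_min)
    area_b = (b.x_max - b.x_min) * (b.y_max - b.y_min)
    return inter / (area_a + area_b - inter)


def find_problems(
    boxes: list[Box],
    image_width: float,
    image_height: float,
    duplicate_iou: float = 0.9,  # assumed threshold
) -> list[tuple[int, str]]:
    """Return (box index, problem description) pairs for one image."""
    problems = []
    for i, box in enumerate(boxes):
        if box.x_max <= box.x_min or box.y_max <= box.y_min:
            problems.append((i, "empty box"))
        elif (box.x_min < 0 or box.y_min < 0
              or box.x_max > image_width or box.y_max > image_height):
            problems.append((i, "outside image"))
    # Compare every pair once; flag the later box as the duplicate.
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            same_label = boxes[i].label == boxes[j].label
            if same_label and iou(boxes[i], boxes[j]) >= duplicate_iou:
                problems.append((j, f"duplicate of box {i}"))
    return problems
```

An image counts as valid in this sketch when `find_problems` returns an empty list, which is one way to arrive at a "percentage of valid images" figure.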
Best Practices
Image Preparation
- Consistent quality: Similar resolution and lighting
- Variety of angles: Different object perspectives
- Class balance: Similar number of images per category
- Edge cases: Include difficult-to-classify examples
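Class balance can be checked with a quick count before uploading or training. The sketch below counts images per label and reports the ratio between the largest and smallest class. The function name `class_balance` is hypothetical, and this guide does not state what ratio is acceptable, so no threshold is applied here.

```python
from collections import Counter


def class_balance(labels: list[str]) -> tuple[dict[str, int], float]:
    """Return per-class image counts and the largest-to-smallest class ratio.

    A ratio of 1.0 means perfectly balanced classes; larger values mean
    more imbalance. An empty label list yields ({}, 0.0).
    """
    counts = Counter(labels)
    if not counts:
        return {}, 0.0
    ratio = max(counts.values()) / min(counts.values())
    return dict(counts), ratio
```

For a dataset with 90 "good" images and 10 "defective" images, the ratio is 9.0, a sign that more examples of the smaller class may be worth collecting.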
Organization
- Descriptive names: Reflecting dataset purpose
- Detailed descriptions: Document content and objective
- Versioning: Create new datasets for experiments
- Regular cleanup: Review and remove incorrect images
Quality Labeling
- Consistency: Apply uniform criteria to every label
- Precision: Exact and careful annotations
- Review: Validate labels before training
- Documentation: Record labeling criteria used