Ready-made dataset
139-Video Dexterous Hand Movements Pose Dataset
Bulk export bundle for pose estimation, locomotion research, and robotics-style training pipelines.
A ready-to-use pose export bundle (JSONL + metadata) built with the Quality Vision Motion Dataset Engine (layer11_pose): 139 source clips, HQ gating, and standard augmentations where applicable.
- One-time: $950
- Source videos: 139
- Rows exported (incl. augments): 123,732
- HQ accepted (pre-aug): 30,933
- Motion intelligence: enabled
Dataset overview
- Action label: Dexterous Hand Movements (bulk job `bulk_139_videos_626d9c41`)
- Exported rows (incl. augmentations): 123,732 in merged `data.jsonl`
- HQ frames (pre-augmentation): 30,933 accepted
- Mean quality (accepted): ~0.975 frame-quality score
- Mean landmark visibility (accepted): ~0.950
- Stride: 1 · Gaussian smoothing: window=5 (x/y/z only)
Technical stack
- Pose: BlazePose-style landmarks (normalized), optional Layer 1.1 scene stats
- Augmentations: horizontal_flip, keypoint_noise, random_scale_translate (see `manifest.json`)
- Deliverables: `data.jsonl`, `per_video/`, `bulk_combined_manifest.json`, QA JSON
- Motion intelligence enabled (gait proxies, symmetry, etc.)
Structured layout
- `data.jsonl` — merged training file
- `per_video/*.jsonl` — per-source splits
- `features.json`, `global_stats.json`, `export_quality_report.json`
ONEPAGER
# Quality Vision — Hand Motion Dataset
**Train your AI on real hand motion data — ready in minutes.**
> Unlike many academic datasets, Quality Vision B2B exports are intended for commercial use under your agreed license terms — see LICENSE_EXPORT.txt in the ZIP and manifest.license / dataset_license (not limited to research-only).
## 1. Features
- 21-point hand keypoints (MediaPipe-style)
- Temporal sequences with timestamps
- Per-frame action field + metadata/actions.csv (optional per-frame taxonomy via job payload)
- Hand–object interaction metadata when hq.object_interaction is configured
- Optional camera_pov: first_person | third_person (manual)
- JSONL + optional COCO-like keypoints export
- PyTorch loader guide: examples/README_PYTORCH.md
## 2. Use cases
- 🤖 Robotics
- 🥽 AR/VR
- ✋ Gesture recognition
- 🧠 Human behavior AI
## 3. What's inside (dataset/)
```
dataset/
├── data.jsonl
├── metadata/
│ ├── actions.csv
│ └── dataset_meta.json
├── per_video/
├── features.json
├── global_stats.json # includes buyer_marketing for hand jobs
└── examples/
└── README_PYTORCH.md
```
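Rows in `data.jsonl` are standard JSON Lines, so they can be streamed with the standard library alone. A minimal parsing sketch, assuming each line is one JSON object with `frame_id`, `action`, and `keypoints` fields (the exact key set for your export is defined in `SCHEMA.md`; the sample row below is invented for illustration):

```python
import io
import json

# One illustrative row; real rows in data.jsonl carry the full schema
# (keypoints, visibility, dexterous_hand, motion_intelligence, ...).
sample_jsonl = (
    '{"frame_id": 0, "action": "Dexterous Hand Movements",'
    ' "keypoints": [[0.51, 0.42, 0.0], [0.53, 0.40, 0.01]]}\n'
)

def iter_rows(fp):
    """Yield one dict per non-empty JSONL line."""
    for line in fp:
        line = line.strip()
        if line:
            yield json.loads(line)

rows = list(iter_rows(io.StringIO(sample_jsonl)))
print(rows[0]["action"])          # Dexterous Hand Movements
print(len(rows[0]["keypoints"]))  # 2
```

For real use, replace the `StringIO` with `open("dataset/data.jsonl")`; streaming line by line avoids loading all 123k rows at once.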
## Dataset snapshot (metrics)
- Job ID: `bulk_139_videos_626d9c41`
- Action label: `Dexterous Hand Movements`
- Videos processed: 139
- Exported frames (HQ, incl. augmentations): 123,732
- Accepted frames (pre-augmentation): 30,933
- Sampled candidate frames: 54,141
- Rejected frames: 23,208
- Acceptance rate (pre-aug / sampled): 57.13%
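The counts above are internally consistent, which is easy to verify with plain arithmetic (no assumptions beyond the figures listed; the interpretation of the 4x export ratio in the comment is an inference, not a documented guarantee):

```python
sampled = 54_141    # candidate frames
accepted = 30_933   # pre-augmentation accepted
rejected = 23_208
exported = 123_732  # incl. augmentation duplicates

assert accepted + rejected == sampled
rate = 100 * accepted / sampled
print(f"{rate:.2f}%")  # 57.13%

# exported / accepted == 4.0, consistent with each accepted frame
# appearing once plus three augmented copies in this export.
print(exported / accepted)  # 4.0
```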
## Quality Metrics (Accepted Frames)
- Mean quality score: 0.9747
- Mean landmark visibility: 0.9495
- Mean lower-body visibility: 0.0000
- Mean motion_local: 1.0000
## Processing
- Gaussian temporal smoothing: window=5 (x/y/z only; visibility not smoothed)
- Keypoints: MediaPipe Hands (21 landmarks per frame when detected)
- Dexterous block: `finger_angles_deg`, `grip_type_proxy`, tip velocities, `hand_visibility_mean`, `dexterous_quality_proxy`, hand `motion_intelligence` (v3)
- Augmentations: horizontal flip + keypoint noise
- Motion intelligence: hand-centric (v3) — wrist speed, finger-tip motion, grip heuristic, activity class
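The window=5 temporal smoothing could look like the sketch below. The engine's exact sigma and edge handling are not documented here, so `sigma=1.0` and replicate-border padding are assumptions; the key documented property is that x/y/z are smoothed while visibility passes through untouched:

```python
import math

def gaussian_kernel(window=5, sigma=1.0):
    """Normalized symmetric Gaussian weights over `window` taps."""
    half = window // 2
    w = [math.exp(-(i * i) / (2 * sigma * sigma)) for i in range(-half, half + 1)]
    total = sum(w)
    return [v / total for v in w]

def smooth_xyz(frames, window=5, sigma=1.0):
    """frames: list of (x, y, z, visibility) tuples. Smooth x/y/z only."""
    kernel = gaussian_kernel(window, sigma)
    half = window // 2
    out = []
    for t in range(len(frames)):
        acc = [0.0, 0.0, 0.0]
        for offset, weight in zip(range(-half, half + 1), kernel):
            # Clamp indices at sequence edges (replicate-border assumption).
            src = frames[min(max(t + offset, 0), len(frames) - 1)]
            for d in range(3):
                acc[d] += weight * src[d]
        # Visibility is copied through unsmoothed, as documented.
        out.append((acc[0], acc[1], acc[2], frames[t][3]))
    return out

frames = [(0.1 * t, 0.0, 0.0, 0.9) for t in range(6)]
smoothed = smooth_xyz(frames)
print(all(abs(f[3] - 0.9) < 1e-12 for f in smoothed))  # True
```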
## Hand motion intelligence (dexterous)
- Mean finger-tip speed: 0.3599
- Mean wrist speed (norm/sec): 0.2708
- Frames with hand MI: 30933
- Hand activity histogram: {"finger_active": 26601, "hand_translating": 29, "manipulation": 961, "static": 3342}
- Grip type histogram: {"neutral": 1516, "open": 12078, "pinch": 6392, "power": 1740, "precision": 9207}
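The histograms above can be cross-checked against the per-frame count; a quick consistency check on the reported figures (pure arithmetic, no assumptions beyond the numbers listed):

```python
hand_activity = {"finger_active": 26601, "hand_translating": 29,
                 "manipulation": 961, "static": 3342}
grip_type = {"neutral": 1516, "open": 12078, "pinch": 6392,
             "power": 1740, "precision": 9207}

frames_with_mi = 30_933
print(sum(hand_activity.values()) == frames_with_mi)  # True
print(sum(grip_type.values()) == frames_with_mi)      # True
```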
## Deliverables (inside this ZIP)
- data.jsonl
- per_video/video_*.jsonl
- per_video/rejected_*.jsonl
- features.json
- export_quality_report.json
- manifest.json
- global_stats.json
- runtime_config.json
- SCHEMA.md
- examples/
- examples/sample_rows_accepted.jsonl
- examples/sample_rows_rejected.jsonl
- examples/sample_features.json
- examples/LOAD_EXAMPLE.py
- examples/README_PYTORCH.md
- metadata/actions.csv
- metadata/dataset_meta.json
- SHA256SUMS
## Quality Score Definition
- Per frame: frame_quality_score = 0.5 * avg_landmark_visibility + 0.0 * lower_body_visibility + 0.5 * motion_local.
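Because the score is linear in its inputs, plugging the reported accepted-frame means into the formula reproduces the reported mean quality score, which is a useful sanity check:

```python
def frame_quality_score(avg_landmark_visibility, lower_body_visibility, motion_local):
    # Weights as documented above: lower-body visibility is zero-weighted,
    # so hands-only frames are not penalized for a missing lower body.
    return (0.5 * avg_landmark_visibility
            + 0.0 * lower_body_visibility
            + 0.5 * motion_local)

# Reported accepted-frame means from this export:
score = frame_quality_score(0.9495, 0.0000, 1.0000)
print(abs(score - 0.9747) < 1e-3)  # True
```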
https://qvision.space/
README
# Motion dataset export

## Overview

- **Job**: `bulk_139_videos_626d9c41`
- **Source**: 139 videos from Pexels (inferred from filenames)
- **Action label**: `Dexterous Hand Movements`
- **HQ exported frames**: 123,732 (includes augmentation duplicates)

## Key metrics (quick table)

| Metric | Value |
|---|---:|
| Videos processed | 139 |
| Frames exported (HQ, incl. augmentations) | 123,732 |
| Frames sampled (pre-augmentation candidates) | 54,141 |
| Frames accepted (pre-augmentation) | 30,933 |
| Accepted percentage (pre-aug / sampled) | 57.13% |
| Mean frame quality score (accepted) | 0.9747 |
| Mean avg landmark visibility (accepted) | 0.9495 |
| Mean lower-body visibility (accepted) | 0.0000 |
| Mean motion_local (accepted) | 1.0000 |
| Mean fps | 28.24 |
| Stride | 1 |

See `dataset/global_stats.json` and `dataset/export_quality_report.json` for full details.

## Buyer marketing (hand / dexterous exports)

- **Tagline:** Train your AI on real hand motion data — ready in minutes.
- **Commercial use:** Unlike many academic datasets, B2B exports are licensed for agreed commercial use — see **LICENSE_EXPORT.txt** in the ZIP and manifest license fields; structured copy also appears in **features.json** → `aggregate.buyer_marketing` and **global_stats.json** → `buyer_marketing`.
- **PyTorch:** see **examples/README_PYTORCH.md**.
- **Actions table:** **metadata/actions.csv** (`frame_id`, `action`).

## Processing applied

- **Gaussian smoothing** (temporal): enabled (x/y/z only; visibility is not smoothed)
- **Body normalization**: hip-centered + torso-scale when full-body landmarks exist (omitted for MediaPipe Hands-only rows); wrist-centric: `wrist_position`, `wrist_relative_keypoints` (subtract wrist xyz) (`keypoints_body_normalized`, `body_normalization`)

## Layout

- `data.jsonl` — all accepted frames from all videos (global `frame_id`).
- `per_video/video_NNN.jsonl` — same schema, only frames from source index NNN.
- `per_video/rejected_NNN.jsonl` — low-quality / no-pose frames for that source.
- `low_quality_frames.jsonl` — all rejected rows across sources.
- `bulk_combined_manifest.json` — index of combined vs per-video files.
- `quality_distribution_histogram.json` — histogram of accepted pre-augmentation quality scores.
- `sample_frames_visualized/` — small PNG previews with keypoints overlaid.
- `use_cases.md` — practical use-case examples.
- `examples/preview_rows.jsonl` — medium sample for generating a ~40s skeleton-only preview.
- `viewer.html` — single-file local keypoint viewer (no server).
- `data.csv` — optional CSV export (enable with `MDE_EXPORT_CSV=1`).
- `coco_keypoints.json` — optional COCO-like keypoints-only JSON (enable with `MDE_EXPORT_COCO=1`).
- `features.json` — per-video sequence metrics (motion consistency, velocity proxies, etc.).
- `manifest.json` — job metadata and post-processing flags.
- `metadata/actions.csv` — `frame_id,action` (job-level label per row unless `hq.per_frame_actions`).
- `metadata/hand_activity.csv` — dexterous exports only: per-frame `hand_activity_class`, `grip_type_proxy`, `motion_class_rule`.
- `metadata/dataset_meta.json` — optional `camera_pov`, schema notes.

## Accepted percentage

- `accepted_percentage` is computed as **accepted (pre-augmentation) / sampled frames**.
- Exported frame count can exceed sampled frames when augmentations duplicate accepted rows.

## Dexterous Hand Movements (this export)

`manifest.json` sets **`dexterous_hand_export`: true**. Each accepted row in `data.jsonl` includes a **`dexterous_hand`** object and top-level **`motion_intelligence`** (version 3, hand domain) for buyer-ready hand analytics.

Fields:

- **`finger_angles_deg`** — 2D flexion/spread proxies per finger chain (see `SCHEMA.md` for keys)
- **`grip_type_proxy`** — `pinch` | `precision` | `power` | `open` | `neutral` | `unknown`
- **`finger_tip_velocity_norm_per_sec`** — per-tip speeds (`*_tip_speed`) in normalized coords/sec
- **`finger_tip_speed_mean`** — mean tip speed for the frame
- **`hand_visibility_mean`** — mean landmark visibility
- **`dexterous_quality_proxy`** — hand-only quality score (0–1)
- **`motion_intelligence`** (inside `dexterous_hand`) — wrist speed, finger energy, grip change flag, `hand_activity_class` (`static` / `finger_active` / `hand_translating` / `manipulation`)

Aggregate hand motion stats: **`global_stats.json` → `motion_intelligence`** (`hand_activity_histogram`, `grip_type_histogram`). Regular pose-only exports omit this block.
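The aggregate histograms in `global_stats.json` could be reproduced from the per-row `dexterous_hand` block. A minimal sketch using `collections.Counter` (field names follow the list above; the two sample rows are invented for illustration):

```python
from collections import Counter

# Two invented rows shaped like accepted data.jsonl entries.
rows = [
    {"frame_id": 0,
     "dexterous_hand": {"grip_type_proxy": "pinch",
                        "motion_intelligence": {"hand_activity_class": "finger_active"}}},
    {"frame_id": 1,
     "dexterous_hand": {"grip_type_proxy": "open",
                        "motion_intelligence": {"hand_activity_class": "static"}}},
]

grip_hist = Counter(r["dexterous_hand"]["grip_type_proxy"] for r in rows)
activity_hist = Counter(
    r["dexterous_hand"]["motion_intelligence"]["hand_activity_class"]
    for r in rows
)
print(dict(grip_hist))      # {'pinch': 1, 'open': 1}
print(dict(activity_hist))  # {'finger_active': 1, 'static': 1}
```

Running the same two `Counter` passes over the full `data.jsonl` should match the shipped `grip_type_histogram` and `hand_activity_histogram`, which makes this a cheap integrity check after download.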
Free samples
Validate parsing on GitHub: QualityVision-Motion-Dataset-Samples.
Purchase & licensing
- Price: $950 one-time (Gumroad)
- Deliverable: engineered exports (JSONL + metadata) — not raw source videos
- Contact: info@qvision.space