Ready-made dataset

139-Video Dexterous Hand Movements Pose Dataset

Bulk export bundle for pose estimation, locomotion research, and robotics-style training pipelines.

A ready-to-use pose export bundle (JSONL + metadata) built with the Quality Vision Motion Dataset Engine (layer11_pose): 139 source clips, HQ gating, and standard augmentations where applicable.

  • One-time: $950
  • Source videos: 139
  • Rows exported (incl. augments): 123,732
  • HQ accepted (pre-aug): 30,933
  • Motion intelligence: enabled

Dataset overview

  • Action label: Dexterous Hand Movements (bulk job bulk_139_videos_626d9c41)
  • Exported rows (incl. augmentations): 123,732 in merged data.jsonl
  • HQ frames (pre-augmentation): 30,933 accepted
  • Mean quality (accepted): ~0.975 frame-quality score
  • Mean landmark visibility (accepted): ~0.950
  • Stride: 1 · Gaussian smoothing: enabled (window=5; x/y/z only)

Technical stack

  • Pose: BlazePose-style landmarks (normalized), optional Layer 1.1 scene stats
  • Augmentations: horizontal_flip, keypoint_noise, random_scale_translate (see manifest.json)
  • Deliverables: data.jsonl, per_video/, bulk_combined_manifest.json, QA JSON
  • Motion intelligence enabled (gait proxies, symmetry, etc.)

Structured layout

  • data.jsonl — merged training file
  • per_video/*.jsonl — per-source splits
  • features.json, global_stats.json, export_quality_report.json
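All of the row-level artifacts above are line-delimited JSON, so a merged file can be streamed one row at a time without loading everything into memory. A minimal sketch (any field names beyond those documented in SCHEMA.md are illustrative):

```python
import json

def iter_rows(path):
    """Stream one JSON object per line from a JSONL export."""
    with open(path, "r", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # skip blank lines defensively
                yield json.loads(line)

# Usage sketch: count rows in the merged training file.
# n_rows = sum(1 for _ in iter_rows("dataset/data.jsonl"))
```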

ONEPAGER

# Quality Vision — Hand Motion Dataset

**Train your AI on real hand motion data — ready in minutes.**

> Unlike many academic datasets, Quality Vision B2B exports are intended for commercial use under your agreed license terms — see LICENSE_EXPORT.txt in the ZIP and manifest.license / dataset_license (not limited to research-only).

## 1. Features
- 21-point hand keypoints (MediaPipe-style)
- Temporal sequences with timestamps
- Per-frame action field + metadata/actions.csv (optional per-frame taxonomy via job payload)
- Hand–object interaction metadata when hq.object_interaction is configured
- Optional camera_pov: first_person | third_person (manual)
- JSONL + optional COCO-like keypoints export
- PyTorch loader guide: examples/README_PYTORCH.md
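The shipped loader guide lives in examples/README_PYTORCH.md; as a rough, torch-free sketch of the same idea (the `keypoints` and `action` field names are assumptions about the row schema — check SCHEMA.md), a map-style dataset over the JSONL can look like:

```python
import json

class HandPoseDataset:
    """Map-style dataset over a JSONL export.

    Implements __len__/__getitem__, so it can be wrapped directly by
    torch.utils.data.DataLoader when PyTorch is installed.
    """

    def __init__(self, jsonl_path):
        with open(jsonl_path, "r", encoding="utf-8") as fh:
            self.rows = [json.loads(l) for l in fh if l.strip()]

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, idx):
        row = self.rows[idx]
        # Assumed fields; consult SCHEMA.md for the authoritative layout.
        return row.get("keypoints"), row.get("action")
```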

## 2. Use cases
- 🤖 Robotics
- 🥽 AR/VR
- ✋ Gesture recognition
- 🧠 Human behavior AI

## 3. What's inside (dataset/)
```
dataset/
├── data.jsonl
├── metadata/
│   ├── actions.csv
│   └── dataset_meta.json
├── per_video/
├── features.json
├── global_stats.json   # includes buyer_marketing for hand jobs
└── examples/
    └── README_PYTORCH.md
```

## Dataset snapshot (metrics)
- Job ID: `bulk_139_videos_626d9c41`
- Action label: `Dexterous Hand Movements`
- Videos processed: 139
- Exported frames (HQ, incl. augmentations): 123732
- Accepted frames (pre-augmentation): 30933
- Sampled candidate frames: 54141
- Rejected frames: 23208
- Acceptance rate (pre-aug / sampled): 57.13%

## Quality Metrics (Accepted Frames)
- Mean quality score: 0.9747
- Mean landmark visibility: 0.9495
- Mean lower-body visibility: 0.0000
- Mean motion_local: 1.0000

## Processing
- Gaussian temporal smoothing: window=5 (x/y/z only; visibility not smoothed)
- Keypoints: MediaPipe Hands (21 landmarks per frame when detected)
- Dexterous block: `finger_angles_deg`, `grip_type_proxy`, tip velocities, `hand_visibility_mean`, `dexterous_quality_proxy`, hand `motion_intelligence` (v3)
- Augmentations: horizontal flip + keypoint noise
- Motion intelligence: hand-centric (v3) — wrist speed, finger-tip motion, grip heuristic, activity class

## Hand motion intelligence (dexterous)
- Mean finger-tip speed: 0.3599
- Mean wrist speed (norm/sec): 0.2708
- Frames with hand MI: 30933
- Hand activity histogram: {"finger_active": 26601, "hand_translating": 29, "manipulation": 961, "static": 3342}
- Grip type histogram: {"neutral": 1516, "open": 12078, "pinch": 6392, "power": 1740, "precision": 9207}
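The two histograms above can be re-derived from the rows themselves. A sketch, assuming each accepted row carries the `dexterous_hand` object documented in the README below, with `grip_type_proxy` and `motion_intelligence.hand_activity_class`:

```python
from collections import Counter

def hand_histograms(rows):
    """Tally hand activity class and grip type across accepted rows."""
    activity, grip = Counter(), Counter()
    for row in rows:
        dex = row.get("dexterous_hand") or {}
        mi = dex.get("motion_intelligence") or {}
        if "hand_activity_class" in mi:
            activity[mi["hand_activity_class"]] += 1
        if "grip_type_proxy" in dex:
            grip[dex["grip_type_proxy"]] += 1
    return activity, grip

# Consistency check on the published aggregates: both histograms
# should sum to the 30,933 accepted (pre-augmentation) frames.
reported_activity = {"finger_active": 26601, "hand_translating": 29,
                     "manipulation": 961, "static": 3342}
assert sum(reported_activity.values()) == 30933
```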

## Deliverables (inside this ZIP)
- data.jsonl
- per_video/video_*.jsonl
- per_video/rejected_*.jsonl
- features.json
- export_quality_report.json
- manifest.json
- global_stats.json
- runtime_config.json
- SCHEMA.md
- examples/
- examples/sample_rows_accepted.jsonl
- examples/sample_rows_rejected.jsonl
- examples/sample_features.json
- examples/LOAD_EXAMPLE.py
- examples/README_PYTORCH.md
- metadata/actions.csv
- metadata/dataset_meta.json
- SHA256SUMS
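The bundled SHA256SUMS file can be checked with standard tooling (`sha256sum -c SHA256SUMS` from the ZIP root) or in a few lines of Python. A sketch, assuming the common `<hex digest>  <relative path>` line format:

```python
import hashlib
from pathlib import Path

def verify_sha256sums(sums_path):
    """Yield (path, ok) for each entry in a SHA256SUMS file."""
    root = Path(sums_path).parent
    for line in Path(sums_path).read_text().splitlines():
        if not line.strip():
            continue
        digest, name = line.split(None, 1)
        name = name.strip().lstrip("*")  # '*' marks binary-mode entries
        actual = hashlib.sha256((root / name).read_bytes()).hexdigest()
        yield name, actual == digest
```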

## Quality Score Definition
- Per frame: frame_quality_score = 0.5 * avg_landmark_visibility + 0.0 * lower_body_visibility + 0.5 * motion_local.
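Plugging the reported accepted-frame means into this formula reproduces the mean quality score (0.5 · 0.9495 + 0.0 · 0.0000 + 0.5 · 1.0000 ≈ 0.9747), which makes for a quick consistency check:

```python
def frame_quality_score(avg_landmark_visibility, lower_body_visibility, motion_local):
    """Weighted per-frame score as defined above (lower-body weight is zero here)."""
    return (0.5 * avg_landmark_visibility
            + 0.0 * lower_body_visibility
            + 0.5 * motion_local)

# Reported accepted-frame means from this export:
mean_score = frame_quality_score(0.9495, 0.0000, 1.0000)
assert abs(mean_score - 0.9747) < 1e-3
```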

https://qvision.space/

README

# Motion dataset export

## Overview

- **Job**: `bulk_139_videos_626d9c41`
- **Source**: **139 videos from Pexels (inferred from filenames)**
- **Action label**: `Dexterous Hand Movements`
- **HQ exported frames**: **123732** (includes augmentation duplicates)

## Key metrics (quick table)

| Metric | Value |
|---|---:|
| Videos processed | 139 |
| Frames exported (HQ, incl. augmentations) | 123732 |
| Frames sampled (pre-augmentation candidates) | 54141 |
| Frames accepted (pre-augmentation) | 30933 |
| Accepted percentage (pre-aug / sampled) | 57.13% |
| Mean frame quality score (accepted) | 0.9747 |
| Mean avg landmark visibility (accepted) | 0.9495 |
| Mean lower-body visibility (accepted) | 0.0000 |
| Mean motion_local (accepted) | 1.0000 |
| Mean fps | 28.24 |
| Stride | 1 |

See `dataset/global_stats.json` and `dataset/export_quality_report.json` for full details.

## Buyer marketing (hand / dexterous exports)

- **Tagline:** Train your AI on real hand motion data — ready in minutes.
- **Commercial use:** Unlike many academic datasets, B2B exports are licensed for agreed commercial use — see **LICENSE_EXPORT.txt** in the ZIP and manifest license fields; structured copy also appears in **features.json** → `aggregate.buyer_marketing` and **global_stats.json** → `buyer_marketing`.
- **PyTorch:** see **examples/README_PYTORCH.md**.
- **Actions table:** **metadata/actions.csv** (`frame_id`, `action`).

## Processing applied

- **Gaussian smoothing** (temporal): enabled (x/y/z only; visibility is not smoothed)
- **Body normalization**: hip-centered + torso-scale when full-body landmarks exist (fields `keypoints_body_normalized`, `body_normalization`); omitted for MediaPipe Hands-only rows
- **Wrist-centric normalization**: `wrist_position` plus `wrist_relative_keypoints` (each landmark's xyz minus the wrist's)
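The wrist-centric transform is just a per-landmark subtraction of the wrist coordinates (landmark 0 in MediaPipe Hands). A sketch, assuming `keypoints` is a list of 21 `[x, y, z]` triples:

```python
def wrist_relative(keypoints):
    """Subtract the wrist (landmark 0) xyz from every landmark."""
    wx, wy, wz = keypoints[0][:3]
    return [[x - wx, y - wy, z - wz]
            for x, y, z in (kp[:3] for kp in keypoints)]
```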

## Layout

- `data.jsonl` — all accepted frames from all videos (global `frame_id`).
- `per_video/video_NNN.jsonl` — same schema, only frames from source index NNN.
- `per_video/rejected_NNN.jsonl` — low-quality / no-pose frames for that source.
- `low_quality_frames.jsonl` — all rejected rows across sources.
- `bulk_combined_manifest.json` — index of combined vs per-video files.
- `quality_distribution_histogram.json` — histogram of accepted pre-augmentation quality scores.
- `sample_frames_visualized/` — small PNG previews with keypoints overlaid.
- `use_cases.md` — practical use-case examples.
- `examples/preview_rows.jsonl` — medium sample for generating a ~40s skeleton-only preview.
- `viewer.html` — single-file local keypoint viewer (no server).
- `data.csv` — optional CSV export (enable with `MDE_EXPORT_CSV=1`).
- `coco_keypoints.json` — optional COCO-like keypoints-only JSON (enable with `MDE_EXPORT_COCO=1`).
- `features.json` — per-video sequence metrics (motion consistency, velocity proxies, etc.).
- `manifest.json` — job metadata and post-processing flags.
- `metadata/actions.csv` — `frame_id,action` (job-level label per row unless `hq.per_frame_actions`).
- `metadata/hand_activity.csv` — dexterous exports only: per-frame `hand_activity_class`, `grip_type_proxy`, `motion_class_rule`.
- `metadata/dataset_meta.json` — optional `camera_pov`, schema notes.

## Accepted percentage

- `accepted_percentage` is computed as **accepted (pre-augmentation) / sampled frames**.
- Exported frame count can exceed sampled frames when augmentations duplicate accepted rows.
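Concretely, for this job: 30,933 accepted of 54,141 sampled gives 57.13%, and accepted + rejected recovers the sampled count:

```python
sampled, accepted, rejected = 54141, 30933, 23208
exported = 123732  # includes augmentation duplicates

assert accepted + rejected == sampled
accepted_percentage = 100.0 * accepted / sampled
assert round(accepted_percentage, 2) == 57.13
assert exported > sampled  # augmentations duplicate accepted rows
```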

## Dexterous Hand Movements (this export)

`manifest.json` sets **`dexterous_hand_export`: true**. Each accepted row in `data.jsonl` includes a **`dexterous_hand`** object and top-level **`motion_intelligence`** (version 3, hand domain) for buyer-ready hand analytics. Fields:

- **`finger_angles_deg`** — 2D flexion/spread proxies per finger chain (see `SCHEMA.md` for keys)
- **`grip_type_proxy`** — `pinch` | `precision` | `power` | `open` | `neutral` | `unknown`
- **`finger_tip_velocity_norm_per_sec`** — per-tip speeds (`*_tip_speed`) in normalized coords/sec
- **`finger_tip_speed_mean`** — mean tip speed for the frame
- **`hand_visibility_mean`** — mean landmark visibility
- **`dexterous_quality_proxy`** — hand-only quality score (0–1)
- **`motion_intelligence`** (inside `dexterous_hand`) — wrist speed, finger energy, grip change flag, `hand_activity_class` (`static` / `finger_active` / `hand_translating` / `manipulation`)

Aggregate hand motion stats: **`global_stats.json` → `motion_intelligence`** (`hand_activity_histogram`, `grip_type_histogram`). Regular pose-only exports omit this block.

Free samples

Validate parsing before purchase with the free sample rows on GitHub: QualityVision-Motion-Dataset-Samples.

Purchase & licensing

  • Price: $950 one-time (Gumroad)
  • Deliverable: engineered exports (JSONL + metadata) — not raw source videos
  • Contact: info@qvision.space