joshua
Human data for autonomous agents.
Reproducible demonstrations, rigorous evaluations, and human feedback - so your agents act with precision.
What we make
We capture how people actually work on the web - clicks, keys, timing, outcomes - then package it as clean trajectories for training and testing agents.
Demos
JSONL + MP4 + DOM metadata
Eval
pass/fail, action scoring, Playwright replays
RLHF
pairwise rankings with rationales
Safety
adversarial and jailbreak scenarios
Deterministic replays ≥90% · First-pass acceptance ≥95% · PII-safe by design.
Example: Frontier Lab Order
A recent large-scale dataset we delivered
Request a Sample
See real examples of our human-generated web agent training data. Get access to a curated dataset with 200 task demonstrations.
Includes JSONL trajectories, MP4 recordings, and evaluation metrics
Process
Define targets, constraints, and success criteria.
Humans complete tasks. Dual-pass QA. PII scrubbed. Deterministic replays.
JSONL + MP4 + metrics. Ready for training, eval, or fine-tuning.
Services
Tailored data solutions for your agent development needs
Human Web Demonstrations
Screen recordings with precise action sequences for training web agents
- JSONL trajectories
- MP4 recordings
- DOM metadata
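For readers who want a feel for what a JSONL trajectory could look like, here is a minimal sketch. It is illustrative only: the field names (`t_ms`, `action`, `selector`, `value`, `outcome`) and the `to_jsonl` helper are assumptions made for this example, not the delivered schema.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class Step:
    """One recorded user action (hypothetical schema)."""
    t_ms: int                       # milliseconds since the task started
    action: str                     # e.g. "navigate", "type", "click"
    selector: Optional[str] = None  # DOM target of the action, if any
    value: Optional[str] = None     # typed text or URL, if any


def to_jsonl(task_id: str, steps: list[Step], success: bool) -> str:
    """Serialize one demonstration: one JSON object per step, outcome last."""
    lines = [
        json.dumps({"task_id": task_id, "step": i, **asdict(s)})
        for i, s in enumerate(steps)
    ]
    lines.append(json.dumps({"task_id": task_id, "outcome": {"success": success}}))
    return "\n".join(lines)


demo = [
    Step(0, "navigate", value="https://example.com/login"),
    Step(1200, "type", selector="#email", value="user@example.com"),
    Step(2500, "click", selector="button[type=submit]"),
]
jsonl = to_jsonl("demo-001", demo, success=True)
print(jsonl)
```

Keeping one object per line means a trajectory can be streamed and filtered step by step without loading the whole file.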
Complex Workflow Capture
Multi-step processes across enterprise applications and platforms
- End-to-end workflows
- Error handling paths
- Edge case coverage
Agent Evaluation Sets
Comprehensive test suites to validate agent performance
- Pass/fail criteria
- Action scoring
- Playwright replays
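As a rough illustration of how pass/fail criteria and action scoring can fit together, the sketch below scores an agent run against a reference demonstration. The scoring rule (share of reference actions matched in order, via longest common subsequence) and the 0.8 threshold are assumptions chosen for this example, not the scoring method used in the delivered evaluation sets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A single step, reduced to what gets compared (hypothetical)."""
    kind: str      # e.g. "click", "type"
    selector: str  # DOM target


def action_score(expected: list[Action], actual: list[Action]) -> float:
    """Fraction of expected actions the agent reproduced in the same order."""
    m, n = len(expected), len(actual)
    if m == 0:
        return 1.0
    # Classic longest-common-subsequence table.
    lcs = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if expected[i] == actual[j]:
                lcs[i + 1][j + 1] = lcs[i][j] + 1
            else:
                lcs[i + 1][j + 1] = max(lcs[i][j + 1], lcs[i + 1][j])
    return lcs[m][n] / m


def passed(score: float, reached_goal: bool, threshold: float = 0.8) -> bool:
    """Pass only if the goal state was reached and the path was close enough."""
    return reached_goal and score >= threshold


reference = [Action("click", "#a"), Action("type", "#b"), Action("click", "#c")]
agent_run = [Action("click", "#a"), Action("click", "#x"), Action("click", "#c")]
score = action_score(reference, agent_run)
print(round(score, 3), passed(score, reached_goal=True))
```

Separating the score from the pass/fail decision lets a team tighten or relax the threshold without re-scoring runs.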
RLHF & Safety Data
Human feedback for reinforcement learning and safety alignment
- Pairwise rankings
- Rationale generation
- Adversarial scenarios
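To make pairwise rankings with rationales concrete, here is a small sketch of one possible record shape and a simple win-rate tally over it. The `Preference` fields and the `win_rates` helper are hypothetical names introduced for this example, and a real preference-training pipeline would likely consume the pairs directly rather than aggregate them this way.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Preference:
    """One human judgment between two candidate trajectories (hypothetical)."""
    task_id: str
    winner: str     # identifier of the preferred trajectory
    loser: str      # identifier of the rejected trajectory
    rationale: str  # the annotator's short explanation


def win_rates(prefs: list[Preference]) -> dict[str, float]:
    """Share of comparisons each candidate won."""
    wins = Counter(p.winner for p in prefs)
    games = Counter()
    for p in prefs:
        games[p.winner] += 1
        games[p.loser] += 1
    return {name: wins[name] / games[name] for name in games}


prefs = [
    Preference("t1", "A", "B", "A confirmed the order before submitting."),
    Preference("t1", "A", "C", "C skipped a required form field."),
    Preference("t1", "B", "C", "B recovered from the error page."),
]
rates = win_rates(prefs)
print(rates)
```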
Frontier labs · Agent platforms · Enterprise AI teams.
Contact Us
Ready to enhance your agent with high-quality human data? Let's discuss your specific needs.