Multimodal Data
Our dataset offers video + audio + transcript bundles, enriched with detailed metadata for AI training. Here’s a sample of the data we capture:
Data Package
- Audio & Video: 1080p video, clear audio, natural settings
- Transcripts: Dialect + Modern Standard Arabic (MSA) + English + phonetic versions
- Metadata: Location, dialect, demographics, environment
- Contextual Tags: Topic, sentiment, emotion, code-switching, non-verbal cues
- Annotations: Gestures, facial expressions, lighting
- Consent: Verbal consent + signed model release
- Diarization: Speaker turns with timestamps
Metadata Example
{
  "recording_id": "LEB_AKKAR_001",
  "file_paths": { ... },
  "dialect": { "country": "Lebanon", "region": "Akkar" },
  "speakers": [
    { "age": 45, "gender": "Male" },
    { "age": 38, "gender": "Female" }
  ],
  "recording_environment": { "location_type": "Outdoor farm" },
  "conversation_context": { "topic": "Agriculture", "code_switching": true },
  "annotations": { "face_visible": true, "emotion_tags": ["happy", "curious"] },
  "consent": { "form_signed": true }
}
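To show how a record like this might be used in a training pipeline, here is a minimal Python sketch that parses the sample and pulls out a few fields. It makes several assumptions: the `file_paths` entry is left out because its contents are elided in the example, the `REQUIRED_KEYS` list is inferred from this one sample rather than from a published schema, and the `summarize` helper is a hypothetical name for illustration only.

```python
import json

# Sample record copied from the metadata example above.
# "file_paths" is omitted because its contents are elided in the source.
SAMPLE = """
{
  "recording_id": "LEB_AKKAR_001",
  "dialect": { "country": "Lebanon", "region": "Akkar" },
  "speakers": [
    { "age": 45, "gender": "Male" },
    { "age": 38, "gender": "Female" }
  ],
  "recording_environment": { "location_type": "Outdoor farm" },
  "conversation_context": { "topic": "Agriculture", "code_switching": true },
  "annotations": { "face_visible": true, "emotion_tags": ["happy", "curious"] },
  "consent": { "form_signed": true }
}
"""

# Assumption: top-level keys inferred from the single sample, not an official schema.
REQUIRED_KEYS = (
    "recording_id",
    "dialect",
    "speakers",
    "recording_environment",
    "conversation_context",
    "annotations",
    "consent",
)


def summarize(record: dict) -> dict:
    """Check that the expected keys exist and return a flat summary (hypothetical helper)."""
    missing = [key for key in REQUIRED_KEYS if key not in record]
    if missing:
        raise ValueError(f"record is missing keys: {missing}")
    return {
        "recording_id": record["recording_id"],
        "country": record["dialect"]["country"],
        "region": record["dialect"]["region"],
        "num_speakers": len(record["speakers"]),
        "code_switching": record["conversation_context"]["code_switching"],
        "emotion_tags": list(record["annotations"]["emotion_tags"]),
        "consent_signed": record["consent"]["form_signed"],
    }


if __name__ == "__main__":
    summary = summarize(json.loads(SAMPLE))
    # Print one field per line for a quick visual check.
    for field, value in summary.items():
        print(f"{field}: {value}")
```

A flat summary like this can make it easier to filter recordings, for example keeping only those with signed consent or with code-switching present, before loading the heavier audio and video files.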
Want to learn more or request a sample? Contact us.