Multimodal Data
Each recording in our dataset is a bundle of video, audio, and transcripts, enriched with detailed metadata for AI training. Here’s an overview of the data we capture:
Data Package
- Audio & Video: 1080p video, clear audio, natural settings
- Transcripts: Dialect, Modern Standard Arabic (MSA), English, and phonetic versions
- Metadata: Location, dialect, demographics, environment
- Contextual Tags: Topic, sentiment, emotion, code-switching, non-verbal cues
- Annotations: Gestures, facial expressions, lighting
- Consent: Verbal consent plus a signed model release
- Diarization: Speaker turns with timestamps
Metadata Example
{
  "recording_id": "LEB_AKKAR_001",
  "file_paths": { ... },
  "dialect": { "country": "Lebanon", "region": "Akkar" },
  "speakers": [ { "age": 45, "gender": "Male" }, { "age": 38, "gender": "Female" } ],
  "recording_environment": { "location_type": "Outdoor farm" },
  "conversation_context": { "topic": "Agriculture", "code_switching": true },
  "annotations": { "face_visible": true, "emotion_tags": ["happy", "curious"] },
  "consent": { "form_signed": true }
}
Want to learn more or request a sample? Contact us.
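If you want to see how a record like the metadata example might be consumed in a training pipeline, here is a minimal Python sketch. The field names come from the example; everything else is an assumption for illustration: the helper names (`load_record`, `has_signed_consent`, `summarize`) are hypothetical, the list of required keys is our guess at a sensible check, and because `file_paths` is elided in the example, an empty object stands in for it.

```python
import json

# Sample record mirroring the metadata example. "file_paths" is elided in the
# example, so an empty object is used here as a placeholder (assumption).
SAMPLE = """
{
  "recording_id": "LEB_AKKAR_001",
  "file_paths": {},
  "dialect": { "country": "Lebanon", "region": "Akkar" },
  "speakers": [ { "age": 45, "gender": "Male" }, { "age": 38, "gender": "Female" } ],
  "recording_environment": { "location_type": "Outdoor farm" },
  "conversation_context": { "topic": "Agriculture", "code_switching": true },
  "annotations": { "face_visible": true, "emotion_tags": ["happy", "curious"] },
  "consent": { "form_signed": true }
}
"""

# Top-level keys seen in the example; treating them all as required is an
# assumption, not a documented schema.
REQUIRED_KEYS = {
    "recording_id", "file_paths", "dialect", "speakers",
    "recording_environment", "conversation_context", "annotations", "consent",
}


def load_record(text):
    """Parse one metadata record and check that the expected keys exist."""
    record = json.loads(text)
    missing = REQUIRED_KEYS - record.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return record


def has_signed_consent(record):
    """Hypothetical filter: keep only recordings with a signed consent form."""
    return record["consent"].get("form_signed") is True


def summarize(record):
    """Flatten the nested record into a few fields useful for filtering."""
    dialect = record["dialect"]
    context = record["conversation_context"]
    return {
        "id": record["recording_id"],
        "dialect": f'{dialect["country"]}/{dialect["region"]}',
        "num_speakers": len(record["speakers"]),
        "topic": context["topic"],
        "code_switching": context["code_switching"],
    }


record = load_record(SAMPLE)
if has_signed_consent(record):
    print(summarize(record))
```

Checking consent before anything else is a deliberate choice in this sketch: a record without a signed form is dropped before its content is read any further.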