Join Us in Building Arabic Dialect Data for Production AI

Current in-hand inventory includes conversational Lebanese Arabic and licensed Egyptian academic text (MSA). We’re expanding responsibly across Iraq, Yemen, Algeria, and Morocco.

At Dialect Data, we build rights-cleared, commercially licensable Arabic datasets for production AI. Our currently published inventory includes conversational Lebanese Arabic from Mount Lebanon and licensed Egyptian academic text in MSA.

We are a small operating team focused on reproducible collection, metadata consistency, and licensing clarity. If you want to work on speech/data operations with concrete delivery standards, we’re happy to hear from you.

These roles map to current operations and approved custom scopes. Expansion is currently available on request for Iraq, Yemen, Algeria, and Morocco.

Open Positions

Project Manager

Coordinate field operations for the Mount Lebanon conversational data program, manage local teams, and keep collection and delivery timelines on track.

Country Expansion Lead

Build local partner networks and guide collection readiness for approved on-request country or dialect scopes.

Data Collectors

Explain the project, secure consent, and capture conversations in homes, markets, and cafés.

Annotators

Transcribe dialect Arabic, normalize to MSA, and validate metadata fields and consent/license flags for each record.

Technical Lead

Build data pipelines, manage file structures, and ensure data quality.

AI Research Advisor

Guide dataset development for AI readiness.

Marketing & Outreach

Grow our community and public presence. Run campaigns that engage dialect speakers and communicate our mission to partners and the public.

To apply, email team@dialectdata.com with your CV and a short note on why you care about dialects and AI.

Why Dialect Data?

  • Shape the future of AI with real, diverse voices.
  • Be part of an ethical, impact-driven project.
  • Work remotely, collaborate globally.

Interested? Email us at team@dialectdata.com. Tell us why you care about dialects and AI!

Prefer a quick message? Use our contact form.

Updated Mar 2026. If none of the above roles fit but you care about our mission, feel free to drop us a line. We’re growing, and new opportunities may arise.