Join Us in Building the First Arabic Dialect AI Dataset

At Dialect Data, we’re on a mission to make AI truly understand the Middle East—its languages, cultures, and people. We capture real conversations in Levantine, Iraqi, and Yemeni dialects to build datasets that are ethical, authentic, and deeply human.

We’re not a big corporation. We’re a small, scrappy team—passionate about inclusion, fairness, and amplifying real voices. If you’re mission-driven and want to build something that doesn’t exist yet, we’d love to hear from you.

Open Positions

  • Project Manager (Beirut-based)
    Lead daily operations for Lebanon & Syria pilots. Coordinate teams, budgets, and timelines.
  • Syria Country Lead (Remote + Field Visits)
    Build networks, guide data collection, and ensure cultural and ethical fit in Syria.
  • Data Collectors (Lebanon & Syria)
    Capture conversations in homes, markets, and cafés. Explain the project and secure consent.
  • Annotators (Levantine Dialects)
    Transcribe recordings, paraphrase them into Modern Standard Arabic (MSA), translate them into English, and tag metadata such as code-switching, emotions, and cultural notes.
  • Technical Lead (Remote)
    Build data pipelines, manage file structures, and ensure data quality.
  • AI Research Advisor (Part-Time)
    Guide dataset development so the data is ready for use in AI models.
  • Marketing & Outreach (Remote)
    Share our story with the world. Build our brand and engage communities.

Why Dialect Data?

  • Shape the future of AI with real, diverse voices.
  • Be part of an ethical, impact-driven project.
  • Work remotely, collaborate globally.

Interested? Email us at team@dialectdata.com. Tell us why you care about dialects and AI!