Junior AI & Evaluation Engineer

Confidential   Amman - Jordan  Date Posted: 2025/11/18

Responsibilities

  • Collaborate with the AI Product team to deliver high-quality, scope-aligned features that meet product acceptance criteria.
  • Own end-to-end delivery: define problems, write documentation, design prompts/RAG/tools, and ship features with clear success metrics.
  • Build and maintain RAG pipelines, including chunking/embedding, hybrid retrieval (keyword + dense), reranking, vector indexing, and enforcing tenant isolation.
  • Implement safe and reliable tool/function calling into internal APIs with proper guardrails, monitoring, and tracing.
  • Develop golden test sets and regression suites; run offline/online evaluations for groundedness, factuality, safety, latency, and cost.
  • Monitor retrieval performance (recall@k, MRR/NDCG, coverage, drift) and ensure launch readiness based on retrieval and answer quality.
  • Uphold safety and compliance by mitigating injections/jailbreaks, applying policy filters, maintaining audit logs, and ensuring privacy alignment.
  • Instrument token usage, latency, and caching strategies; implement semantic/fallback caching; manage vendor SLAs and cost optimization.
  • Support LLMOps processes: version prompts/datasets/configurations, maintain CI/CD pipelines, run canary/shadow tests, and enforce “evals-as-gates.”
  • Produce design documents and model cards; build reusable components, SDKs, and templates; mentor junior team members.

Qualifications

  • Bachelor’s degree in Computer Science, Information Systems, or a related field.
  • 3+ years of experience building production-grade AI/ML backend features, including recent LLM-based applications.
  • Experience working at a leading MENA company or a high-bar startup.
  • Native-level proficiency in Arabic and English, both written and spoken.
  • Strong technical stack: Python and/or Node.js, AWS/GCP, containers, IaC, and modern CI/CD pipelines.
  • Hands-on experience with LLM productization (prompt engineering, tool/function calling, RAG architectures).
  • Strong evaluation capabilities: test-set creation, offline metrics, online A/B testing, regression suites, and quality dashboards.
  • Familiarity with LLM and embedding models, as well as frameworks like LangChain, LlamaIndex, and Haystack.
  • Experience with vector databases (Pinecone, Milvus, pgvector, OpenSearch), SQL/NoSQL, Redis, and data schema/versioning.
  • Knowledge of governance principles: privacy, security, PDPL compliance, multi-tenant isolation, and auditability.
  • Demonstrated AI fluency: actively and securely uses AI/LLMs to enhance research, coding, testing, and documentation.

Required Skills

  • AI

Job Details

  • Location: Amman - Jordan
  • Industry: Information & Communication Technologies
  • Job Type: Full-Time
  • Degree: Bachelor
  • Experience: 3+ years
  • Nationality: Unspecified