Software Engineer, RL Data

Ant*** · London, ENG, United Kingdom
London, ENG, United Kingdom · 320,000-485,000 GBP/yearly · Hybrid
Remuneration
320,000-485,000 GBP/yearly
Location
London, ENG, United Kingdom
Visa sponsorship
Sponsors visa

Job summary

About Ant***

Ant***’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

Benefits

Guidance on Candidates' AI Usage: Learn about our policy for using AI in our app

Qualifications

  • We're looking for experienced engineers who own outcomes end-to-end — down to reading transcripts, supporting users, and wrangling vendors.
  • Trusted to run key projects: you lead and inspire others, plan workstreams effectively, collaborate with cross-functional stakeholders, and proactively eliminate or escalate blockers.
  • Strong software engineering skills.
  • Experience with reinforcement learning on LLMs, particularly on the data side: creating evals, environments, rewards, graders, or training data.
  • Experience helping organizations use AI more effectively, including integrating with third-party
  • for the position
  • Location-based hybrid policy: Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.
  • Visa sponsorship: We do sponsor visas! However, we aren't able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.
  • We encourage you to apply even if you do not believe you meet every single qualification.

Responsibilities

  • This is a senior, foundational role on a new team: you'll make architecture decisions the rest of the team builds on, and help shape what we build first.
  • Own significant parts of our stack end-to-end, from technical architecture through the unglamorous operational work that makes it succeed.
  • Build data collection pipelines, read the transcripts they produce, and iterate on prompts, evals, and graders until the output is good.
  • Develop and improve QA frameworks to catch reward hacking and ensure environment quality.
  • Build interfaces that make collecting human data fast and painless for the people providing it.
  • Harden execution environments — sandboxing, snapshotting, tool coverage — so tasks hold up at training scale.
  • Embed with the teams and domain experts who use our systems day-to-day, and work with operations, security, and compliance partners to roll our systems out to new users and vendors.

Skills

Communication

Degrees

Associate Degree

Industry

Automotive, Banking, Education, Energy, Gaming, Logistics

Company size

Startup