Software Engineer, RL Data
Ant*** · London, ENG, United Kingdom
London, ENG, United Kingdom · 320,000-485,000 GBP/yearly · Hybrid
Remuneration
320,000-485,000 GBP/yearly
Location
London, ENG, United Kingdom
Visa sponsorship
Sponsors visa
Job summary
About Ant***
Ant***’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.
Benefits
Guidance on Candidates' AI Usage: Learn about our policy for using AI in our app
Qualifications
- We're looking for experienced engineers who own outcomes end-to-end — down to reading transcripts, supporting users, and wrangling vendors.
- Trusted to run key projects: you lead and inspire others, plan workstreams effectively, collaborate with cross-functional stakeholders, and proactively eliminate or escalate blockers.
- Strong software engineering
- Experience with reinforcement learning on LLMs, particularly on the data side: creating evals, environments, rewards, graders, or training data.
- Experience helping organizations use AI more effectively, including integrating with third-party
- for the position
- Location-based hybrid policy: Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.
- Visa sponsorship: We do sponsor visas! However, we aren't able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.
- We encourage you to apply even if you do not believe you meet every single qualification.
Responsibilities
- This is a senior, foundational role on a new team: you'll make architecture decisions the rest of the team builds on, and help shape what we build first.
- Own significant parts of our stack end-to-end, from technical architecture through the unglamorous operational work that makes it succeed.
- Build data collection pipelines, read the transcripts they produce, and iterate on prompts, evals, and graders until the output is good.
- Develop and improve QA frameworks to catch reward hacking and ensure environment quality.
- Build interfaces that make collecting human data fast and painless for the people providing it.
- Harden execution environments — sandboxing, snapshotting, tool coverage — so tasks hold up at training scale.
- Embed with the teams and domain experts who use our systems day-to-day, and work with operations, security, and compliance partners to roll our systems out to new users and vendors.
Skills
Communication
Degrees
Associate Degree
Industry
Automotive, Banking, Education, Energy, Gaming, Logistics
Company size
Startup