Senior Platform Software Engineer
Job description
Shepherd is OCI’s release orchestration platform, responsible for safely coordinating software, infrastructure, configuration, and operational changes across regions, realms, and critical cloud services. As OCI grows across commercial, sovereign, disconnected, regulated, and large-scale region-build environments, Shepherd must continue evolving as a reliable platform for cloud-scale change.

We are seeking an IC3 Senior Platform Software Engineer to design, build, operate, and improve core Shepherd platform capabilities. This role is suited for an engineer who can independently own bounded components, write production-quality code, debug distributed systems, improve APIs and workflows, and collaborate with partner teams to solve practical release-orchestration problems.

The successful candidate will help Shepherd improve deployment safety, workflow reliability, approval automation, dependency modeling, rollback behavior, operational visibility, and developer experience. They should care about readable code, clear abstractions, strong tests, simple interfaces, secure implementation, and systems that are maintainable under real production pressure.

This is a hands-on software engineering role. The engineer will deliver features and fixes, participate in design and code reviews, improve team practices, write useful documentation, and contribute to automation that helps Shepherd scale without normalizing manual toil.

Responsibilities
- Own and evolve bounded Shepherd platform components, services, APIs, workflows, or integration surfaces used by OCI service teams.
- Design, implement, test, and operate software features that improve release orchestration, deployment safety, approval workflows, dependency handling, rollback behavior, or operational automation.
- Write clean, maintainable, production-ready code with clear boundaries, practical abstractions, meaningful tests, and appropriate observability.
- Debug moderately complex issues across Shepherd services, databases, APIs, workflow execution paths, deployment integrations, and downstream service interactions.
- Improve API contracts, service interfaces, compatibility behavior, error handling, versioning, and integration patterns for Shepherd consumers.
- Contribute to design documents, implementation plans, code reviews, operational guides, and adoption documentation for team-owned platform capabilities.
- Lead or contribute to code reviews in owned areas, helping improve correctness, readability, test coverage, maintainability, security, and production readiness.
- Collaborate with SRE, security, compliance, region build, infrastructure, and service teams to understand requirements and deliver safe, practical platform improvements.
- Use AI-assisted engineering and automation responsibly where useful, including test generation, refactoring support, documentation, migration assistance, and productivity tooling, while validating output carefully.
- Improve system resilience through better retry behavior, idempotency, failure handling, telemetry, capacity awareness, and safer rollout mechanisms.
- Help reduce operational load by automating repetitive workflows, improving diagnostics, removing confusing behavior, and simplifying support paths.
- Participate in incident follow-up and defect resolution, translating production feedback into durable code, test, documentation, or process improvements.
- Contribute to a team culture that values thoughtful design, clear communication, constructive review, disciplined delivery, continuous improvement, and shared ownership of Shepherd’s long-term quality.