Job Description:
GitLab is an open-core software company that develops the most comprehensive AI-powered DevSecOps Platform, used by more than 100,000 organizations. Our mission is to enable everyone to contribute to and co-create the software that powers our world. When everyone can contribute, consumers become contributors, significantly accelerating human progress. Our platform unites teams and organizations, breaking down barriers and redefining what's possible in software development. Thanks to products like Duo Enterprise and Duo Agent Platform, customers get AI benefits at every stage of the SDLC.
The same principles built into our products are reflected in how our team works: we embrace AI as a core productivity multiplier, with all team members expected to incorporate AI into their daily workflows to drive efficiency, innovation, and impact. GitLab is where careers accelerate, innovation flourishes, and every voice is valued. Our high-performance culture is driven by our values and continuous knowledge exchange, enabling our team members to reach their full potential while collaborating with industry leaders to solve complex problems. Co-create the future with us as we build technology that transforms how the world develops software.
An Overview Of This Role
As a Staff Engineer within the GitLab Operate team, you will lead the technical direction for GitLab's self-managed deployment strategy, with a particular focus on solving zero-downtime upgrades and operational excellence at scale. This is a high-impact technical leadership role where you'll architect and implement the systems that enable thousands of organizations to deploy, upgrade, and operate GitLab reliably in their own infrastructure.
You'll be the technical anchor for our newly formed Operate team, driving the evolution of GitLab's deployment tooling from traditional packaging approaches toward cloud-native, operator-driven automation. Your work will directly impact GitLab's ability to deliver new features to self-managed customers faster while dramatically reducing operational complexity and upgrade friction.
The GitLab Operate team serves as a critical bridge between GitLab engineering and our self-managed customers, ensuring our products are easily deployable, secure, and scalable across a range of environments—from single-node VM deployments to large-scale Kubernetes clusters supporting tens of thousands of users.
What You'll Do
Technical Leadership & Architecture
- Define the technical vision for the future of GitLab's cloud-native deployments and upgrades, balancing operational simplicity, customer needs, and engineering constraints
- Lead the design and implementation of the new tooling, including Operator(s), enabling automated lifecycle management and zero-downtime upgrades
- Architect upgrade orchestration systems that safely coordinate complex multi-component upgrades across databases, application services, and auxiliary components
- Establish operational maturity standards and guidance for new services being integrated into GitLab's deployment tooling, empowering development teams to own their components end to end
- Drive technical decisions around service integration patterns, deployment models, and operational interfaces
- Lead complex initiatives spanning multiple groups, and be the technical leadership voice that sets the direction and drives technical decisions
Platform Engineering & Development
- Design production-grade Kubernetes Operators with reliable reconciliation logic for complex stateful applications
- Design and implement upgrade orchestration that handles database migrations, rolling deployments, compatibility checks, and rollback capabilities
- Develop tooling and automation to reduce the operational complexity of running GitLab at scale
- Create integration frameworks that enable development teams to ship new services with standardized deployment patterns
- Maintain and evolve GitLab Helm Charts to support both simple and complex deployment topologies
Database & Application Lifecycle Management
- Contribute to safe database migration strategies for zero-downtime upgrades across PostgreSQL and other stateful components
- Implement compatibility layers that enable incremental upgrades without requiring simultaneous updates across all components
- Design and contribute to build validation and pre-flight check systems that detect potential upgrade issues before they impact production
Cross-Functional Collaboration & Enablement
- Partner with development teams to define integration requirements for new services and features
- Collaborate with GitLab Dedicated and GitLab.com SRE teams to align deployment patterns and operational practices
- Work with Product Management to translate customer needs into technical requirements
- Mentor and guide other engineers on the team, establishing technical standards and best practices
- Create technical documentation and runbooks that enable customer success and support teams
Production Operations & Reliability
- Define and implement observability standards for self-managed deployments, including metrics, logging, and alerting
- Build automated testing frameworks that validate deployment and upgrade scenarios across reference architectures
- Establish performance benchmarks and capacity planning guidance for different deployment scales
- Design resilience patterns for handling failures during upgrades and operations
- Contribute to incident response and post-mortems for self-managed deployment issues
What You'll Bring
Required Experience & Skills
- 8+ years of software engineering experience, with at least 3 years in platform engineering or infrastructure roles
- Expert-level Go proficiency (Ruby and Rails are a plus) with demonstrated ability to work in large, complex codebases
- Production Kubernetes experience, including:
  - Building and maintaining Kubernetes Operators
  - Designing Helm charts for complex stateful applications
  - Understanding of Custom Resource Definitions (CRDs), admission controllers, and controller patterns
  - Experience with stateful workloads, persistent volumes, and storage classes
- Cloud-native architecture experience, including service mesh, observability stacks, and infrastructure as code
- Experience shipping production software that customers install and operate in their own infrastructure
- Understanding of Linux systems, including package management, systemd, and system-level debugging
Highly Valued Experience
- Experience building or maintaining Operators for complex stateful applications (databases, message queues, etc.)
- Ruby on Rails expertise and understanding of Rails application architecture
- Infrastructure automation using Terraform, Ansible, or similar tools
- Background in Site Reliability Engineering or DevOps with production on-call experience
- Understanding of compliance and security requirements for enterprise software deployments
- Experience with observability platforms
- Open source contribution history, particularly in infrastructure or deployment tooling
Technical Leadership Qualities
- Technical influence and communication: Ability to design holistic solutions balancing multiple constraints, write clear technical proposals and documentation, and work across teams influencing without direct authority
- Team development and execution: Track record of mentoring and elevating team capabilities through teaching and code review, combined with pragmatic decision-making and bias for action when facing incomplete information
What Makes You Stand Out
- You've built Kubernetes Operators in production and dealt with the operational complexities of stateful workload management
- You have deep PostgreSQL expertise, including schema design and migration strategies, replication, backup and recovery, and handling database upgrades with minimal downtime
- You have deep experience with database migrations at scale and understand the tradeoffs between downtime and complexity
- You've shipped software that customers install on-premises and have felt the pain of upgrade friction firsthand
- You contribute to open source infrastructure projects and understand community dynamics
- You can explain complex technical concepts clearly to both technical and non-technical audiences
- You have experience with zero-downtime deployment strategies for monolithic applications transitioning to microservices
- You've been on-call for production systems and understand what makes software operable
About The Team
The Operate team is part of GitLab Delivery and focuses on delivering GitLab to self-managed users through supported and validated tooling. This includes maintaining and evolving the GitLab Omnibus package, Helm Charts, GitLab Operator, and the GitLab Environment Toolkit (GET).
We partner with SRE, Release, Security, and Development teams to ensure GitLab is easily deployable, supportable, and production-ready in diverse environments—from small single-node deployments to large enterprise-scale Kubernetes clusters.
Current Challenges We're Tackling
- Zero-downtime upgrades: Enabling self-managed customers to upgrade GitLab without service interruption
- Operational complexity: Reducing the burden of managing GitLab at scale while expanding our service architecture
- Cloud-native transition: Building the next generation of deployment tooling while supporting existing customers
- Upgrade velocity: Reducing the time it takes for 80% of self-managed customers to adopt new releases from 7.8 months to 4 months
Team Structure
You'll be joining a newly consolidated Operate team that is building the capability to deliver GitLab's expanding service architecture to self-managed customers. As a Staff Engineer, you'll work closely with the engineering manager and product manager to define technical direction while mentoring other engineers on the team.
How GitLab Will Support You
- Benefits to support your health, finances, and well-being
- Flexible Paid Time Off
- Team Member Resource Groups
- Equity Compensation & Employee Stock Purchase Plan
- Growth and Development Fund
- Parental leave
- Home office support
Please note that we welcome interest from candidates with varying levels of experience; many successful candidates do not meet every single requirement. Additionally, studies have shown that people from underrepresented groups are less likely to apply to a job unless they meet every single qualification. If you're excited about this role, please apply and allow our recruiters to assess your application.
Country Hiring Guidelines: GitLab hires new team members in countries around the world. All of our roles are remote; however, some roles may carry specific location-based eligibility requirements. Our Talent Acquisition team can help answer any questions about location after starting the recruiting process.
Privacy Policy: Please review our Recruitment Privacy Policy. Your privacy is important to us.
GitLab is proud to be an equal opportunity workplace and is an affirmative action employer. GitLab’s policies and practices relating to recruitment, employment, career development and advancement, promotion, and retirement are based solely on merit, regardless of race, color, religion, ancestry, sex (including pregnancy, lactation, sexual orientation, gender identity, or gender expression), national origin, age, citizenship, marital status, mental or physical disability, genetic information (including family medical history), discharge status from the military, protected veteran status (which includes disabled veterans, recently separated veterans, active duty wartime or campaign badge veterans, and Armed Forces service medal veterans), or any other basis protected by law. GitLab will not tolerate discrimination or harassment based on any of these characteristics. See also GitLab’s EEO Policy and EEO is the Law. If you have a disability or special need that requires accommodation, please let us know during the recruiting process.
NOTE:
Please mention Fuchsjobs as the source of your application.