Our client is a global UK-based financial services and investment banking organization developing an AI-powered document processing and automation platform used across multiple business units. The platform enables large-scale data processing, workflow automation, and secure integration with internal systems, supporting the bank’s digital transformation initiatives.
Position Overview:
We are looking for a Senior DevOps (GCP) Engineer to design, build, and maintain cloud-native infrastructure and CI/CD pipelines for a large-scale AI-driven document processing platform. The ideal candidate has extensive hands‑on experience with Google Cloud Platform (GCP), container orchestration, automation, observability, and DevOps best practices. Experience supporting AI/ML or GenAI workloads is a strong advantage. You will work closely with engineering, architecture, platform, and GenAI teams to ensure secure, scalable, and reliable delivery environments across development, testing, and production.
Key Responsibilities:
- Design, implement, and maintain cloud infrastructure on Google Cloud (GKE, Cloud Run, Cloud Functions, Pub/Sub, Cloud Storage).
- Build and optimize CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins, or similar).
- Develop infrastructure-as-code using Terraform or similar tools.
- Set up and maintain container orchestration (Kubernetes, GKE) and automated deployment workflows.
- Implement monitoring, alerting, and observability using tools such as Prometheus, Grafana, ELK/Elastic, Stackdriver, or OpenTelemetry.
- Ensure compliance with security and governance standards across all environments.
- Collaborate closely with engineering teams to ensure scalable, high‑performance deployment architectures.
- Support AI/ML and GenAI workloads (Vertex AI pipelines, model hosting, GPU workloads, inference optimization).
- Manage environment strategies, release pipelines, configuration management, and secrets management.
- Optimize cloud costs and recommend improvements for performance and reliability.
- Troubleshoot production issues, ensure system uptime, and improve incident response processes.
Requirements:
- 5+ years of experience in DevOps, Cloud Engineering, or SRE roles.
- Strong hands‑on experience with Google Cloud Platform, including:
  - Cloud Run, Cloud Functions
  - VPC, IAM, networking, security
- Expertise in Terraform, Helm, or other IaC tools.
- Strong understanding of containerization and orchestration: Docker, Kubernetes.
- Solid experience with monitoring, observability, and logging stacks.
- Familiarity with networking, load balancing, security hardening, and zero‑trust principles.
- Experience supporting production systems in high‑availability, distributed environments.
- Strong scripting skills (Python, Bash, or similar).
- Experience working with agile engineering teams.
- Excellent communication skills and ability to collaborate across technical and business teams.
Nice to Have:
- Experience supporting AI/ML or GenAI pipelines, especially Vertex AI, model deployment, or GPU workloads.
- Experience with service mesh (Istio, Linkerd) or advanced networking patterns.
- Familiarity with FinTech, banking, or regulated environments.
- Experience with policy enforcement tools (OPA, Gatekeeper) or secrets solutions (Vault, Secret Manager).
- Experience with cost optimization for large‑scale cloud workloads.
We offer:
- Flexible working format: remote, office‑based, or a combination of the two
- A competitive salary and good compensation package
- Professional development tools (mentorship program, tech talks and trainings, centers of excellence, and more)
- Active tech communities with regular knowledge sharing