Job Overview
At Moss, we give finance professionals the power to automate their day-to-day and make forward-thinking decisions.
Our team and culture make us unique - we’re driven by impact and growth, where every one of us strives to learn and excel. Recognised by Sifted’s Rising 100 and LinkedIn's Top Startups, we’re here to help propel your career and, together, make Moss a lasting success.
As a Senior Platform Engineer, you will join our core Platform team that designs, builds, and maintains the infrastructure powering Moss. You will work on critical systems that must be updated without downtime, ensuring our services remain secure, scalable, and resilient. You’ll collaborate closely with product, data, and security teams, balancing planned initiatives with incident response, cloud engineering, and regular maintenance.
Your responsibilities
• Design, build, and operate cloud-native infrastructure (GKE, Kubernetes, networking, databases) supporting a high-availability, low-latency FinTech platform processing real-time payments across Europe.
• Own the reliability and scalability of 100+ microservices - including defining and enforcing SLOs, managing autoscaling strategies, and driving resilience patterns like circuit breakers, bulkheads, and graceful degradation.
• Lead safe, continuous deployment practices across a fully automated CD pipeline - including rollout strategies, rollback mechanisms, and deployment observability at scale.
• Drive observability across the platform - metrics, distributed tracing, and structured logging - with a focus on reducing MTTR and enabling engineers to self-serve incident diagnosis.
• Manage and evolve infrastructure-as-code (Terraform, Helm) with a no-ClickOps discipline - every change peer-reviewed, version-controlled, and auditable.
• Champion security and compliance practices including Zero Trust architecture, Workload Identity, dynamic secrets via Vault, network policies, and audit readiness (ISO27001, SOC2).
• Own incident response across networking, load balancing, Kubernetes, and cloud services - and drive post-incident improvements that prevent recurrence.
• Raise the engineering bar - actively contribute to architectural decisions, review platform changes, and help grow the early-senior engineers on the team.
About you
• 7+ years of total experience, with at least 4 years in platform, infrastructure, or SRE roles in a cloud-native environment.
• Deep Kubernetes expertise - scheduling internals, autoscaling (HPA/VPA/KEDA), pod lifecycle, network policies, PodDisruptionBudgets, and multi-zone topology. Beyond operational familiarity, you understand what breaks and why.
• Strong grasp of microservices operational challenges at scale - service mesh, inter-service resilience patterns, connection pool management, graceful shutdown, and database migration safety in a continuous deployment model.
• Solid CI/CD experience - designing pipelines for 100+ services, immutable artefact management, Workload Identity Federation, and automated rollback. GitHub Actions experience is a plus.
• Hands-on observability experience - building platforms covering metrics, logs, and distributed traces, including across async boundaries (e.g. Kafka). Able to connect instrumentation to incident workflows, not just set up tooling.
• Proficiency in infrastructure-as-code - Terraform and Helm as primary tools, with a strong IaC-first mindset.
• Programming proficiency in Golang and/or shell scripting for platform tooling; familiarity with Java/SpringBoot operational characteristics is a plus.
• Proven troubleshooting skills across distributed systems - latency contagion, cascading failures, connection exhaustion, and autoscaling lag under traffic spikes.
• Collaborative, low-ego working style - comfortable in a small, high-trust team where engineers raise PRs instead of tickets.
Nice to Have
• Experience with dynamic secrets management via HashiCorp Vault, including database credential rotation.
• Familiarity with GCP-specific primitives - Workload Identity, GKE Autopilot vs. Standard tradeoffs, Cloud Armor, VPC-native networking.
• Experience with KEDA or scheduled scaling strategies for predictable traffic spikes.
• Cloud cost optimisation - spot/preemptible node strategies, resource right-sizing, log volume and cardinality management.
• Prior experience in a regulated FinTech or financial services environment.
About Moss
Moss is a SaaS scale-up founded in Berlin, with a team of 300+ people from 50+ nationalities in 5 offices across Europe.
Our ambition is bold: to power every SMB’s spend across Europe - fully digital, AI-driven, and seamlessly integrated for complete control. To date, over 5000 businesses in Germany, the Netherlands and the UK use Moss’ leading spend management product, with modules such as corporate cards, accounts payable, employee cash reimbursements and procurement.
Moss has raised a total of €180 million in funding and is backed by the most renowned tech investors including Valar Ventures, Tiger Global, Global Founders Capital, Cherry Ventures and A-Star.
Be part of a culture that thrives on impact and speed, where you can make bold moves, learn fast and accomplish more. We’re a place where you can fast-track your career - here's what else to expect:
• Top-of-market compensation package, including equity.
• Our vibrant offices are at the heart of our culture, where in-person time fuels collaboration and connection over weekly breakfasts and Friday demos.
• Additional benefits include: 20 days of “work from abroad”, a 600 EUR/GBP Learning & Development budget, and other local benefits.
Unless stated otherwise, benefits apply to full-time positions (interns and working students receive a tailored package).
By applying for the above position, you confirm that you have reviewed and agreed to our Data Privacy Policy.