Job Overview:
As an engineer focused on node operations and infrastructure at Alchemy, you'll work within a fast-paced engineering team on the design, deployment, and continuous improvement of the blockchain infrastructure that powers our globally used developer platform. You'll operate state-of-the-art systems for running blockchain RPC nodes at scale across many chains and regions, and draw on your experience to keep a highly available live system running for our customers.
Responsibilities:
• Deploy, operate, and maintain blockchain RPC nodes across multiple chains and multiple geographic regions
• Manage the Kubernetes clusters that serve as the underlying platform for all blockchain nodes
• Perform rolling upgrades and hard fork migrations for blockchain clients across EVM and non-EVM chains
• Participate in on-call rotations, triage live incidents via PagerDuty, and coordinate resolution across teams for node outages, latency spikes, and SLO breaches
• Develop and maintain AI agents and automation tooling for health checks, auto-healing, and hard fork notifications
• Deploy and manage services via ArgoCD and GitOps workflows (Helm charts)
• Manage bare-metal and cloud infrastructure including provisioning, benchmarking, and hardware replacement
• Respond to security advisories promptly and coordinate upgrades with minimal downtime
• Contribute to postmortems and async review processes; track action items and follow up on resolutions
• Collaborate cross-functionally with product, customer success, and other engineering teams on chain deprecations, capacity planning, and SLO reporting
What We're Looking For:
• Experience designing and operating large-scale, multi-region, multi-cloud production systems
• Experience with Kubernetes (k3s or similar), including StatefulSets, storage management, Secrets, and service mesh (Istio)
• Experience with secrets management and access control in multi-cluster environments
• Familiarity with automation frameworks for node health checks, upgrades, and remediation workflows
• Experience with Infrastructure-as-Code (e.g. Terraform, Ansible, Pulumi, CloudFormation, Chef, Puppet)
• Experience with GitOps tooling (ArgoCD, Helm) for managing deployments
• (Preferred) Experience with service mesh deployments such as Istio
• Proficiency with cloud infrastructure and bare-metal management, including storage provisioning and snapshot management
• Strong grasp of observability tooling (Grafana, Prometheus, Alertmanager) and experience building or tuning dashboards and alert rules
• Comfort working in an on-call environment, triaging production incidents quickly and calmly using PagerDuty and structured runbooks
• Ability to write clear technical documentation and postmortems, and contribute to async-first team communication
• Experience with networking, including configuring and managing VPC networks
• A basic understanding of security best practices
• (Preferred) Good understanding of web applications and microservice architecture
• (Preferred) Experience working with startups
• Passion for blockchain technologies and Web3
Perks:
• Attractive salary package
• Opportunity to work with the latest cloud and blockchain technologies
• Flexible time away
• Private Medical Insurance
• Start-up environment: internal off-site hackathons and access to a company-rented hacker house during the summer
• Opportunity to travel across offices