ABOUT THE ROLE
Peloton is looking for an outstanding Site Reliability Engineer with an EKS (Kubernetes) focus to work across the Platform teams to help build and maintain a multi-cluster, multi-region, reliable, and highly scalable Kubernetes platform. In this role you will have a rare and exciting opportunity to work with groundbreaking technologies that drive innovation and ensure that workloads run reliably in a flexible, scalable, and secure way.
YOUR DAILY IMPACT AT PELOTON
You will be a technical leader within your team, influencing and driving technical investments across partner teams with a "Platform Thinking" mindset. You will help others in design, execution, and problem solving.
- Architect, develop, test, release, and support CI/CD systems such as Jenkins, GitHub Actions, Gradle, and Artifactory
- Adhere to best practices in architectural design, testing (unit, integration, visual, and regression), and scrum methodology
- Evaluate developer platform designs, technical decisions, and code to ensure all are high quality, efficient, and well documented
- Assist in planning, execution, and updating of technical roadmaps
As a Site Reliability Engineer, you will:
- Host critical infrastructure that gives our developers the best possible experience running Kubernetes pods across multiple clusters
- Deliver fast, automatic scaling for Connected Fitness devices and the eCommerce platform
- Develop and manage our Container Orchestration Platform, overseeing a diverse ecosystem of over 2,000 applications. This includes Multi-Cluster/Multi-tenant Kubernetes with 15+ clusters per environment, Istio Multi-cluster Mesh, and an AWS multi-account structure.
- Design, enhance, and implement additional services for our centralized Observability Platforms, ensuring efficient log management based on Splunk, and effective monitoring and alerting powered by Datadog and PagerDuty
- Design, build, and automate new solutions centered around the Kubernetes container orchestration platform and its ecosystem of projects
- Provide a platform for machine learning (and other exciting workloads)
- Allow developers to move quickly and experiment, without getting in the way
- Promote standard methodologies for building and operating highly reliable systems
- Consult in code and design reviews, planning, and technical discussions to ensure that all work is high quality, efficient, and well documented, and that it meets reliability and capacity requirements
- Automate everything, from infrastructure down to day-to-day tasks
- Adhere to best practices in architectural design, testing (unit, integration, load testing, and regression), and Agile scrum methodology
- Follow the standard incident management process, conduct timely post-mortems of infrastructure incidents, and exercise sound judgment about when to triage and when to dive into a root-cause analysis
- Assist with all aspects of operational security and compliance, seek out potential threats to security and reliability, and advocate solutions
- Participate in a rotating on-call duty schedule, providing support and assistance for the services within our team's responsibility
Above all, be a technical leader within your team, influencing and driving technical investments across partner teams with a "Platform Thinking" attitude
YOU BRING TO PELOTON
- A degree in Computer Science, Engineering or similar field of study or equivalent work experience
- 3+ years of experience in software engineering, with a solid understanding of Kubernetes and Infrastructure as Code
- 1+ years of systems configuration and automation experience (e.g. Ansible, Chef, Puppet, Terraform)
- Extensive knowledge and hands-on experience in AWS Cloud infrastructure and services, including CI/CD and IaC provisioning tools such as Jenkins, ArgoCD, Scalr, Terraform, and GitHub Actions
- Experience in a cloud environment like AWS or GCP, and familiarity with running containerized services
- Experience with a programming language like Python, Golang or Java.
- Knowledge of best practices in observability and monitoring for Kubernetes clusters at scale, plus experience with cost optimization tools such as Kubecost and Goldilocks
- Knowledge of standard processes for securing a Kubernetes cluster and its deployments at scale
BONUS
- Passion for helping development teams make the transition to a container-native world
- Passion for reliable, scalable, observable software with a strong sense of ownership
- Experience designing and operating large, reliable, and scalable distributed systems
- Knowledge of network infrastructure basics, including DNS, DHCP, firewalling, and load balancing, to facilitate multi-functional collaboration.
ABOUT PELOTON
Peloton provides Members with expert instruction, world-class content and the fitness industry's leading music library to create impactful and entertaining workout experiences for anyone, anywhere and at any stage in their fitness journey. At home, outdoors, traveling, or at the gym, Peloton offers an immersive and personalized experience with or without equipment. Access Peloton content via the Peloton Bike, Bike+, Tread, Guide, Row or the Peloton App, now with multiple membership tiers. Founded in 2012 and headquartered in New York City, Peloton has a highly engaged community of nearly 7 million Members across the US, UK, Canada, Germany, and Australia.
Peloton is an equal opportunity employer and complies with all applicable federal, state, and local fair employment practices laws. Equal employment opportunity has been, and will continue to be, a fundamental principle at Peloton, where all team members, applicants, and other covered persons are considered on the basis of their personal capabilities and qualifications without discrimination because of race, color, religion, sex, age, national origin, disability, pregnancy, genetic information, military or veteran status, sexual orientation, gender identity or expression, marital and civil partnership/union status, alienage or citizenship status, creed, genetic predisposition or carrier status, unemployment status, familial status, domestic violence, sexual violence or stalking victim status, caregiver status, or any other protected characteristic as established by applicable law. This policy of equal employment opportunity applies to all practices and procedures relating to recruitment and hiring, compensation, benefits, termination, and all other terms and conditions of employment. If you would like to request any accommodations from application through to interview, please email: applicantaccommodations@onepeloton.com
Please be aware that fictitious job openings, consulting engagements, solicitations, or employment offers may be circulated on the Internet in an attempt to obtain privileged information, or to induce you to pay a fee for services related to recruitment or training. Peloton does NOT charge any application, processing, or training fee at any stage of the recruitment or hiring process. All genuine job openings will be posted here on our careers page and all communications from the Peloton recruiting team and/or hiring managers will be from an @onepeloton.com email address.
If you have any doubts about the authenticity of an email, letter or telephone communication purportedly from, for, or on behalf of Peloton, please email applicantaccommodations@onepeloton.com before taking any further action in relation to the correspondence.
Peloton does not accept unsolicited agency resumes. Agencies should not forward resumes to our jobs alias, Peloton employees or any other organization location. Peloton is not responsible for any agency fees related to unsolicited resumes.