Job Description:
About Us
At Bank of America, we are guided by a common purpose to help make financial lives better through the power of every connection. Responsible Growth is how we run our company and how we deliver for our clients, teammates, communities, and shareholders every day.
One of the keys to driving Responsible Growth is being a great place to work for our teammates around the world. We’re devoted to being a diverse and inclusive workplace for everyone. We hire individuals with a broad range of backgrounds and experiences and invest heavily in our teammates and their families by offering competitive benefits to support their physical, emotional, and financial well-being.
Bank of America believes in the importance of both working together and offering flexibility to our employees. We use a multi-faceted approach to flexibility, depending on the various roles in our organization.
Working at Bank of America will give you a great career with opportunities to learn, grow and make an impact, along with the power to make a difference. Join us!
Position Summary
We are seeking an experienced and dynamic leader for our Enterprise Cloud Operations SRE team. As the Cloud SRE leader, you will oversee the design, implementation, and maintenance of our cloud infrastructure, ensuring its reliability, scalability, and performance. You will lead a team of skilled site reliability engineers, collaborating closely with cross-functional teams to optimize our cloud systems and improve operational efficiency.
Responsibilities:
- Leadership: Provide strategic direction and leadership to the Cloud SRE team, fostering a culture of innovation, collaboration, and continuous improvement.
- Cloud Infrastructure Operations: Oversee the management and operations of our cloud infrastructure, ensuring high availability, scalability, and performance across all environments.
- Reliability and Resilience: Define and implement best practices for system reliability, fault tolerance, disaster recovery, and incident management to minimize service disruptions and improve system resilience.
- Performance Optimization: Collaborate with software engineering teams to optimize cloud infrastructure performance, capacity planning, and resource utilization, ensuring efficient scalability and cost-effectiveness.
- Automation and Tooling: Drive automation initiatives to streamline operational workflows, leveraging modern tooling and DevOps practices to enhance efficiency and reliability.
- Monitoring and Alerting: Define and implement comprehensive monitoring and alerting strategies to identify and address potential issues early, enabling rapid incident response and troubleshooting.
- Incident Management: Establish incident response processes and lead incident management efforts, ensuring timely resolution, root cause analysis, and prevention of recurring issues.
- Security and Compliance: Collaborate with security teams to implement cloud infrastructure security measures, ensuring compliance with relevant regulations and industry best practices.
- Vendor Management: Manage relationships with cloud service providers, overseeing contract negotiations, service-level agreements (SLAs), and vendor performance.
- Team Development: Mentor and develop the Cloud SRE team, fostering their technical growth and professional advancement through training, performance management, and regular feedback.
Required Skills:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 10+ years of proven experience in a similar leadership role, managing cloud infrastructure operations and leading SRE teams.
- Strong knowledge and hands-on experience with cloud platforms such as AWS, Azure, or GCP.
- Proficiency in infrastructure-as-code (IaC) and configuration management tools like Terraform, Ansible, or Chef.
- Solid understanding of DevOps principles, CI/CD pipelines, and automation frameworks.
- Experience in incident management, root cause analysis, and implementing reliability engineering practices.
- Excellent problem-solving skills and ability to analyze complex systems to identify areas for improvement.
- Strong communication and collaboration skills to work effectively with cross-functional teams.
- Familiarity with security best practices and regulatory compliance frameworks (e.g., GDPR, HIPAA).
Desired Skills:
- Relevant certifications (e.g., AWS Certified DevOps Engineer, Azure DevOps Engineer) are a plus.
Shift:
1st shift (United States of America)
Hours Per Week:
40