Software Engineer (SRE Tools & Automation), IS&T Enterprise Systems (Apple)
Apple, Sunnyvale, United States
2024-10-27
Job posting number: #153654 (Ref:apl-200570513)
Job Description
Summary
The people here at Apple don’t just build products — we craft the kind of wonder that’s revolutionized entire industries. It’s the diversity of those people and their ideas that supports the innovation that runs through everything we do, from amazing technology to industry-leading environmental efforts. Join Apple, and help us leave the world better than we found it!
We at Apple build products that amaze our users with ultra-fast, thoughtfully designed, and carefully crafted solutions. We are not just any team but a highly motivated, fast-paced, ever-evolving, close-knit team of individuals looking to scale to new heights. We have a thing for individuals like yourself who don’t stop at mediocrity or settle for anything less than perfect.
Apple's Service Management & GCRM team builds the applications and systems that power customer support experiences for Apple’s global stores and service providers. Our team drives the experiences in the MobileGenius and Repair Central client apps and the Global Service Exchange web app, as well as the GCRM services that are integrated with global repairs.
At Apple, customer experience is at the forefront of everything we do. To help us build functional systems that improve the customer experience, we're looking for an SRE/DevOps lead engineer who can be responsible for managing production support for multiple global applications, deploying product components, enabling developer productivity, and implementing CI/CD integrations that meet our product needs. The ideal candidate will possess a strong background in production monitoring, readiness, measurement of system health, and incident management, a deep understanding of operations excellence, and a proven track record in managing large-scale production systems. If you have a passion for building scalable cloud infrastructure that is highly reliable, secure, highly available, and responsive, then this job is for you.
Description
Proven experience as an SRE with a focus on operations management, and demonstrated expertise in managing large-scale production outages and leading incident response.
Engage with teams across Apple to architect and build infrastructure that's reliable, available, secure, and performant.
Experience in strategizing and achieving operational excellence in global distributed systems.
Deep understanding of production monitoring systems, log analysis, troubleshooting, and support dashboards, plus proficiency in scripting languages and automation tools.
Strong knowledge of production support practices for managing web and iOS applications, and a passion for eliminating repetitive manual processes through automation.
Build and maintain CI/CD infrastructure to enable rapid build-to-release cycles for software engineering teams.
Envision and build automation tools to deliver infrastructure services reliably and in a repeatable fashion. Use AI and ML models to achieve operations excellence in application support.
Lead a team of 10 highly skilled engineers and guide their work towards operations excellence, gaining efficiency by automating their daily routine tasks.
Be a self-directed problem-solver capable of deftly handling multiple simultaneous competing priorities and delivering solutions in a timely manner.
Create and maintain accurate, up-to-date documentation reflecting configuration, and be responsible for writing justifications, training users in complex topics, writing status reports, documenting procedures, and interacting with other Apple staff and management.
Provide guidance to improve the stability, efficiency and scalability of systems. Strong troubleshooting ability will be used daily.
Determine future needs for capacity by closely reviewing upcoming application features and load.
Continuously work with the engineering teams to improve reliability, implement an actionable monitoring framework, and be part of the production on-call rotation.
Minimum Qualifications
- Proven experience in designing and building infrastructure, including compute and storage, in the cloud and on premises.
- Proven experience with containerization and orchestration technologies such as Docker, Kubernetes or equivalent.
- Deep understanding of programming in a high-level language such as C, Java, Ruby, Python, or Perl.
- Proven experience with Helm and Kustomize for managing Kubernetes applications and configurations through GitOps practices.
- Proven experience in MongoDB, AWS S3 and similar storage technologies.
Preferred Qualifications
- 2+ years proven experience in designing and building infrastructure, including compute and storage, in the cloud and on premises.
- 2+ years proven experience with containerization and orchestration technologies such as Docker, Kubernetes or equivalent.
- 2+ years proven experience with Helm and Kustomize for managing Kubernetes applications and configurations through GitOps practices.
- 2+ years proven experience in MongoDB, AWS S3 and similar storage technologies.
- 2+ years proven experience with Linux or other POSIX operating systems, shell scripting, and networking technologies.
- 2+ years proven experience with CI/CD using Jenkins, GitHub Actions or similar systems.
- Experience with configuration management tools (e.g. Ansible, Terraform).
- Proficiency with logging and observability technologies such as Prometheus, Grafana, Splunk or similar.
- Passionate about operational excellence through proper automation and engineering processes using programming languages such as Go, Python, Java, or other JVM languages.
- Experience integrating security practices into all stages of the software development lifecycle.
- Experience with preparing and executing PCI, SOC2, SOX, or other compliance audits.
- Analytical and problem-solving skills, with the ability to communicate ideas clearly.
- Strong sense of ownership, customer service, and integrity, demonstrated through clear communication with a strong focus on teamwork.
- BS degree in Computer Science or equivalent work experience is preferred.