Site Reliability Engineer (Apple)

Apple    Hyderabad, Telangana, India    2024-09-29

Job posting number: #152395 (Ref:apl-200559413)

Job Description

Summary
Do you love working on highly scalable and secure distributed applications? Do you want your technical abilities to be challenged every day and your work to make a difference in the lives of millions of people? If so, Apple is looking for dedicated, hands-on Site Reliability Engineers who are not afraid to share knowledge, think creatively, and question assumptions. Imagine what you could do here! At Apple, we believe new insights have a way of becoming excellent products, services, and customer experiences very quickly. The people here at Apple don't just create products; they create the kind of wonder that has revolutionized entire industries. It's the diversity of those people and their ideas that inspires the innovation that runs through everything we do, from amazing technology to industry-leading environmental efforts. Join us to do the best work of your life with a welcoming, diverse, and hard-working group of engineers. Bring passion and dedication to the job, and there's no telling what you could accomplish!
Description
You demonstrate a passion for achieving the highest level of uptime, with an emphasis on scalability and high performance. You have the zeal to enhance our system observability, ensuring that we have the insights and tools needed to monitor, troubleshoot, and optimize our applications and infrastructure. You bring expertise in debugging and root-causing issues, along with an instinct to automate repetitive tasks.

- Enhance System Observability: Implement and maintain robust observability solutions that provide real-time insight into the performance and health of our systems, so that potential issues are identified and addressed before they impact users.
- Troubleshooting and Root Cause Analysis: Use your expertise to investigate and resolve incidents quickly during crisis situations, and perform root cause analysis to prevent recurrence.
- Automation: Use your coding skills to create tools and automate runbooks to improve efficiency.
- Documentation: Document and maintain runbooks and best practices to support knowledge sharing and team efficiency.
- Communication: Apply strong interpersonal skills to work effectively across multiple business and technical teams.
Minimum Qualifications
  • At least 4 years of demonstrated experience in a Site Reliability Engineering, DevOps, or infrastructure-focused role.
  • Proficiency in at least one programming or scripting language, such as Perl, Python, or Ruby, for developing observability and ETL tools.
  • Hands-on experience with Java programming and REST APIs for application debugging and root cause analysis.
  • Experience supporting internet-facing production services and distributed systems through deployments, on-call duty, and incident management.
  • Proficiency in implementing and coordinating telemetry using monitoring and observability tools such as Splunk, Grafana, Prometheus, or similar.
  • Experience troubleshooting and resolving issues in Kubernetes from both an operating system and an application perspective.
  • Experience building and operating container orchestration systems such as Kubernetes or EKS.
  • Strong understanding of database principles and working knowledge of distributed storage and infrastructure solutions such as Oracle, Cassandra, SOLR, and Kafka.
  • Firsthand experience in performance tuning of applications and databases.
  • Good command of Linux and networking concepts (TLS/SSL, DNS, load balancers, etc.), and troubleshooting skills in large-scale environments.
  • Deep understanding of basic security concepts and protocols: authentication, authorization, signing, encryption, SSL/TLS, SSH/SFTP, PKI, X.509 certificates, and PGP.
  • Experience with container management tools such as Docker and with microservices architectures, in cloud and on-premises infrastructure.
  • Excellent knowledge of ITIL terminology for incident and problem management.
  • Track record of excellent interpersonal, analytical, and communication skills.
  • Bachelor of Science in Computer Science or other related discipline.
Preferred Qualifications
  • N/A


Employer Info

Application Deadline: 2024-10-29
Employer Location: apple, Alabama, US