AIML - Machine Learning Engineer, Machine Learning Platform & Infrastructure (Apple)
Apple, Santa Clara, United States
2024-10-27
Job posting number: #154057 (Ref:apl-200575590)
Job Description
Summary
Imagine what you could do here. At Apple, great ideas have a way of becoming great products, services, and customer experiences very quickly. Bring passion and dedication to your job and there’s no telling what you could accomplish.
Do you want to make Siri and Apple products more intelligent for our users? The Information Intelligence Infrastructure team is building groundbreaking technology for search, natural language processing, artificial intelligence, and machine learning. Our infrastructure is the backbone of Apple Intelligence. It powers the largest Apple foundation models on servers and a wide gamut of services at Apple, including Apple Search, Apple Music, Apple TV, the App Store, iMessage, Photos & Camera, Spotlight, Safari, Siri, and exciting upcoming Apple products, serving millions of queries every day at incredibly low latencies while drawing every ounce of compute from our hardware.
As part of this group, you will work in one of the most exciting high-performance computing environments, with petabytes of data and millions of queries per second, and have the opportunity to imagine and build products that delight our customers every single day. You will have the chance to optimize multi-billion-parameter language, vision, and speech models using state-of-the-art technologies and make them run at Apple scale.
Description
We design, build, and maintain infrastructure to support features that empower billions of Apple users. Our team processes billions of requests every day across our search and foundation model platform. We take full end-to-end ownership of our services, driving them meticulously through every stage: conception, design, implementation, deployment, and maintenance. As a result, each one of us takes our responsibilities seriously. On this team, you'll have the opportunity to work on incredibly complex large-scale systems with trillions of records and petabytes of data, work alongside the Foundation Model Research team to optimize inference for cutting-edge model architectures, and work closely with product teams to build production-grade solutions for millions of customers in real time.
Minimum Qualifications
- Strong background in computer science: algorithms, data structures, and system design
- 10+ years of experience in large-scale distributed system design, operation, and optimization
- Familiarity with a popular ML framework such as PyTorch or TensorFlow
- Excellent interpersonal skills; able to work independently as well as cross-functionally
Preferred Qualifications
- Proficiency in building and maintaining systems written in modern languages (e.g., Golang, Python)
- Familiarity with fundamental deep learning architectures such as Transformers and encoder/decoder models
- Familiarity with NVIDIA TensorRT-LLM, vLLM, DeepSpeed, NVIDIA Triton Inference Server, etc.
- Experience writing custom GPU kernels using CUDA or OpenAI Triton