Job Description:
Senior ML Systems Engineer – London – Hybrid – £90,000
We are supporting our client, an innovative technology company developing next-generation networking solutions, advanced AI/ML solutions, and high-performance computing environments, with the hire of a Senior ML Systems Engineer.
As a Senior ML Systems Engineer, you'll hold a hands-on technical position, building simulations, system models, and infrastructure for ML training and inference.
Responsibilities:
- Design simulation models for compute, memory, interconnect, and communication behaviour in ML systems.
- Build tools to simulate performance.
- Model accelerators, hosts, and network fabrics.
- Develop performance experiments for ML systems.
- Analyse end-to-end performance of ML systems.
- Produce technical reports and design recommendations.
Qualifications:
- Master’s or PhD in Computer Science, Electrical Engineering, Computer Engineering, or a related field.
- Strong experience in ML systems, distributed systems, performance engineering, computer architecture, or simulation.
- Experience with distributed training concepts such as data parallelism, tensor/model parallelism, pipeline parallelism, collectives, and synchronisation overheads.
- Experience with languages such as Python, C++, or Rust, and with frameworks and libraries such as PyTorch, JAX, TensorFlow, NCCL, XLA, CUDA, or similar tools.
- Understanding of accelerator-based systems such as GPUs, TPUs, or custom ML hardware.
Benefits:
- Salary circa £90,000 depending on experience.
- 25 days holiday plus bank holidays.
- Hybrid working model.
- Private healthcare and life assurance.
- Relocation support available.