Job details
Company name
- High Tech Engineering Center Kft.
Place of work
- Nationwide
Working hours, type of employment
- Full-time
- Standard work schedule
Required technologies
- Docker, Python, Linux
Expectations
- English: intermediate level
- 1-3 years of experience
- Secondary school education
Job description
Responsibilities
Optimize training and inference pipelines for large language models such as Llama 2, Llama 3, DeepSeek, and GPT-OSS
Work on MLPerf Training and/or Inference benchmarks for LLM workloads
Profile GPU workloads to identify compute, memory, and communication bottlenecks
Improve scaling efficiency across multi-GPU and multi-node setups
Tune distributed training strategies (DDP, FSDP, ZeRO, tensor/pipeline parallelism)
Build and maintain reproducible benchmark environments (Docker / Singularity)
Collaborate with engineers on performance, stability, and scalability improvements
Document findings and contribute to benchmark submissions and internal reports
Requirements
1-2 years of experience in AI engineering, deep learning, GPU, or HPC-related roles
Strong Python skills and solid experience with PyTorch
Hands-on experience with LLM training or inference (Llama, GPT-style models, or similar)
Experience with distributed training (DDP, FSDP, ZeRO, DeepSpeed, or equivalent)
Good understanding of GPU performance fundamentals (compute vs memory, profiling, optimization)
Experience working in Linux-based environments
Familiarity with container technologies (Docker or similar)
Good level of spoken and written English
Nice-to-have
Experience working with MLPerf or other standardized benchmarking frameworks
Exposure to LLM optimization techniques (activation checkpointing, KV-cache optimization, sequence parallelism)
Experience with GPU profiling tools (torch.profiler, Nsight, or equivalent)
Knowledge of GPU kernel optimization (CUDA, HIP, Triton, or similar)
Experience working with job schedulers (Slurm or equivalent)
Familiarity with quantization or mixed precision (FP16, BF16, FP8)
How to apply
You can submit your application on the company's website, which you can reach by clicking the "Apply on company page" button.