Senior ML Embedded Engineer

Overview

Job Type: Hybrid

Experience: 5 years

Job Position: Embedded

Updated: Oct 16, 2025

Location: Tel Aviv District

Salary: N/A

Skills

C (5 years)
C++ (5 years)
Python
Computer Vision
Memory-constrained programming
Performance optimization
Processor architecture
AI pipelines
CUDA
DSP
Hardware-specific SDKs
Kernel fusion
NVIDIA Nsight
OpenCL
Pruning
PyTorch Profiler
Quantization
RT-Embedded
Transformer architectures
Vision-language models

Senior ML Embedded Engineer

Location: Ramat Hahayal, Tel Aviv

Employment Type: Full-time

Company: GSI Technology, a publicly traded, international high-tech company (NASDAQ: GSIT) developing the cutting-edge Gemini® Associative Processing Unit (APU) for compute-in-memory acceleration.

GSI is pioneering the Gemini APU, a processor designed to accelerate compute-intensive tasks such as large language models, machine learning, advanced image processing, and radar imaging.

If you're passionate about architecting high-performance software systems, implementing advanced algorithms, and drilling into low-level technical details, this is the role for you.

We’re seeking a dynamic and fast-learning engineer with a passion for diving deep into large language model implementations, and a keen focus on performance optimization and efficient execution.

Position Overview

We are seeking a highly skilled and motivated Senior ML Embedded Software Engineer to lead the development and optimization of AI models, including Large Language Models (LLMs) and Vision-Language Models (VLMs), on GSI’s proprietary APU. This role bridges high-level machine learning understanding with low-level system and performance engineering, primarily in Python, C, and C++. You will be responsible for architecting, implementing, and optimizing AI pipelines under hardware constraints, with a strong emphasis on computer vision and transformer architectures.

Key Responsibilities

Develop and optimize software libraries for CNN, LLM, and VLM implementations on embedded hardware.
Design end-to-end system flows integrating AI models, especially in computer vision domains.
Lead performance tuning efforts under constraints such as memory, compute, and latency.
Work closely with hardware teams to co-design software optimized for GSI’s APU.
Debug and optimize AI inference pipelines, including Python-based pre/post-processing where applicable.
Team up across disciplines to turn wild ideas into reliable, high-performance code.
Architect and develop a high-performance AI compiler framework for deploying quantized neural networks on the GSI Gemini edge platform, enabling advanced edge AI workloads and optimizing for low-latency inference, efficient hardware utilization, and seamless integration with hardware acceleration pipelines.

Required Qualifications

B.Sc. in Computer Science or Electrical Engineering from a leading university.
5+ years of experience in embedded software development using C++ and C.
Solid experience in one or more of the following: Computer Vision, RT-Embedded, DSP.
Proven experience in developing and optimizing AI pipelines under performance, memory, and latency constraints.
Proven track record in performance/memory-constrained programming.
Strong communication skills, analytical mindset, and attention to detail.
Independent, solution-oriented, and highly motivated to make things happen.
Proven track record developing and optimizing software algorithms with deep consideration for hardware architecture, memory bandwidth, and system constraints.
Strong understanding of processor architecture fundamentals: caches, pipeline stages, execution units, and memory hierarchies.
Ability to interpret detailed hardware specifications and translate them into robust, efficient software solutions.

Preferred Qualifications

Practical experience with transformer architectures and/or vision-language models (VLMs).
Deep knowledge of computer vision pipelines and multimodal systems.
Experience designing complex software systems from concept to deployment.
Familiarity with hardware-aware optimization techniques such as quantization, pruning, and kernel fusion.
Experience with performance profiling tools (e.g., PyTorch Profiler, NVIDIA Nsight).
Low-level optimization experience with CUDA, OpenCL, or hardware-specific SDKs.

Privacy Statement

All applications will be handled with strict confidentiality. Your information will not be shared without your consent.

GSI Technology
