About the job
Duration: Minimum 6 months; ideally 9–12 months, depending on the candidate’s experience.
At Scandit, we empower individuals with technology that enhances their capabilities. From optimizing delivery routes for drivers to matching patients with their medications and streamlining retail operations, our innovative solutions automate workflows to generate actionable insights across diverse industries. Join us on our journey of expansion, innovation, and growth, and be a part of Scandit's exciting future.
About the Internship
This research-oriented internship focuses on advancing machine learning techniques for challenging visual understanding tasks. The project applies deep learning methods for image-to-sequence modeling, including Transformers, attention mechanisms, and state-of-the-art representation learning, to complex computer vision problems. Your contributions will play a vital role in long-term research initiatives aimed at improving performance, robustness, and generalization in large-scale visual applications.
Your Responsibilities
Work alongside seasoned machine learning researchers and engineers on pioneering research that merges computer vision with sequence modeling. Your responsibilities will include:
- Designing and experimenting with innovative ML architectures for structured visual data.
- Evaluating various modeling paradigms (e.g., encoder–decoder, hybrid Transformer models, sequence-based representations).
- Exploring methodologies to enhance robustness, generalization, and multi-view reasoning capabilities.
- Conducting systematic experiments, ablations, and error analyses to substantiate research hypotheses.
This internship offers a unique opportunity for model innovation, extensive experimentation, and scholarly research that could impact millions of users. It is an ideal role for advanced master’s students, PhD candidates, or those preparing for a research career in academia or industry.
