About the job

Join our dynamic team at apiphany as an Associate Data Scientist, where you will play a crucial role in advancing our AI/ML engineering initiatives. In this hands-on position, you will be responsible for preparing, validating, and structuring data for large language model (LLM)-driven systems. Your expertise will contribute to real-world data processing, pipeline support, and model evaluation.

Key Responsibilities

Process and clean both structured and unstructured data for AI/ML pipelines.
Prepare datasets for training, fine-tuning, and evaluating LLM workflows.
Support retrieval-augmented generation (RAG) and natural-language-to-SQL (NL→SQL) systems through meticulous data preparation and validation.
Conduct data quality checks to ensure completeness and consistency.
Assist in the development and maintenance of data pipelines and APIs (for example, services built with FastAPI).
Collaborate with engineering teams to troubleshoot and optimize data workflows.

Required Skills

At least 2 years of experience in data processing or related roles.
Proficiency in Python, along with experience in data libraries including Pandas, NumPy, and Scikit-learn.
Experience with LLM workflows, including fine-tuning, prompt engineering, and evaluation.
Familiarity with structured (SQL) and unstructured text data.
Solid understanding of data preparation techniques for AI/ML systems.

Nice to Have

Exposure to RAG pipelines, embeddings, or evaluation metrics.
Familiarity with machine learning frameworks such as PyTorch or TensorFlow, and Docker-based workflows.
Experience with CI/CD pipelines for ML systems.
Knowledge of vector databases (e.g., Chroma) and reranking techniques.
Research experience with Transformer-based architectures.

Note: This position is exclusively open to candidates based in India.

Associate Data Scientist - AI/ML Engineering Support
