Optimizing Machine Learning Models: From Fine-Tuning to Distillation
In this talk, we'll explore machine learning model optimization techniques, with a focus on fine-tuning and distillation. As models grow larger and more complex, optimizing them for efficiency is crucial, especially when deploying in resource-constrained environments. We'll dive deep into fine-tuning and distillation, explaining their processes, benefits, and trade-offs. We'll also introduce LoRA (Low-Rank Adaptation) and QLoRA (Quantized LoRA), two cutting-edge techniques that enable efficient fine-tuning of large models, particularly large language models (LLMs), with minimal computational overhead. The session will briefly cover other optimization methods as well, such as pruning, quantization, and reinforcement learning (RL), providing a comprehensive overview of how to select the right technique for different use cases. By the end of this talk, you'll have a deeper understanding of these optimization strategies and when to apply them to improve model performance, scalability, and efficiency.
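As a small preview of the LoRA approach mentioned above, the sketch below shows how a low-rank adapter can be attached to a pretrained language model so that only a tiny fraction of parameters is trained. It assumes the Hugging Face `transformers` and `peft` libraries; the model name, rank, and target modules are illustrative choices, not anything prescribed by the talk.

```python
# Minimal LoRA fine-tuning sketch using Hugging Face `transformers` and `peft`.
# The model name and hyperparameters below are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model_name = "facebook/opt-350m"  # small LLM chosen purely for illustration
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
model = AutoModelForCausalLM.from_pretrained(base_model_name)

# LoRA injects small low-rank adapter matrices into selected attention projections;
# the base weights stay frozen and only the adapters are updated during training.
lora_config = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,                        # scaling factor applied to the adapter output
    target_modules=["q_proj", "v_proj"],  # projection layers that receive adapters
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of parameters are trainable
```

From here, the wrapped model can be trained with a standard fine-tuning loop or `transformers.Trainer`; QLoRA follows the same pattern but loads the frozen base model in 4-bit precision to cut memory further.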
Date and Time
- Date: 29 May 2025
- Time: 12:00 AM UTC to 01:00 AM UTC
Speakers
Raja Krishna of LOOP
Biography:
Raja is a Senior Software Engineer at LOOP, a Series A startup. With a Master's degree in Computer Science from the University of Texas at Arlington, Raja specializes in building web and AI applications using React, Python, TypeScript, and GraphQL. Raja actively shares knowledge on AI/ML through Medium articles and has spoken at industry conferences, helping fellow developers explore and implement AI-driven solutions.