Optimizing Machine Learning Models: From Fine-Tuning to Distillation

#computer #software #computational #intelligence #FTW

In this talk, we'll explore machine learning model optimization techniques, with a focus on fine-tuning and distillation. As models grow larger and more complex, optimizing them for efficiency becomes crucial, especially when deploying them in resource-constrained environments. We'll dive deep into fine-tuning and distillation, explaining their processes, benefits, and trade-offs. Additionally, we'll introduce LoRA (Low-Rank Adaptation) and QLoRA (Quantized LoRA), two cutting-edge techniques that enable efficient fine-tuning of large models, particularly large language models (LLMs), with minimal computational overhead. The session will also briefly cover other optimization methods, such as pruning, quantization, and reinforcement learning (RL), providing a comprehensive overview of how to select the right technique for different use cases. By the end of this talk, you'll have a deeper understanding of these optimization strategies and when to apply them to improve model performance, scalability, and efficiency.
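To preview the two headline techniques, here are minimal illustrative sketches in PyTorch. These are not material from the talk; the function names, class names, and hyperparameter defaults are our own. First, knowledge distillation: a smaller student model is trained to match a larger teacher's temperature-softened output distribution in addition to the ground-truth labels.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Hinton-style distillation: blend soft (teacher) and hard (label) targets."""
    # Soft targets: KL divergence between temperature-softened distributions,
    # scaled by T^2 to keep gradient magnitudes comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```

Second, LoRA: rather than updating a pretrained weight matrix W, freeze it and learn a low-rank delta BA, so only r x (d_in + d_out) parameters are trained per adapted layer. Again a hedged sketch under the same caveats, not the speaker's code.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a frozen nn.Linear with a trainable low-rank update (LoRA)."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pretrained weights stay frozen
        # A is initialized with small Gaussian noise, B with zeros, so the
        # low-rank delta starts at zero and training begins from the base model.
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r  # output = W x + (alpha / r) * B A x

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

# Usage: adapt a 768-dim linear layer with rank-8 updates.
layer = LoRALinear(nn.Linear(768, 768), r=8)
y = layer(torch.randn(2, 768))
```

QLoRA follows the same adapter idea, but additionally stores the frozen base weights in a quantized (e.g., 4-bit) format to cut memory further.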



  Date and Time

  • Date: 29 May 2025
  • Time: 12:00 AM UTC to 01:00 AM UTC

  Location

  • Virtual event

  Hosts

  • Contact Event Hosts

  Registration

  • Starts: 09 April 2025, 05:00 AM UTC
  • Ends: 28 May 2025, 05:00 AM UTC
  • No admission charge


  Speakers

Raja Krishna of LOOP

Topic:

Optimizing Machine Learning Models: From Fine-Tuning to Distillation


Biography:

Raja is a Senior Software Engineer at LOOP, a Series A startup. With a Master's degree in Computer Science from the University of Texas at Arlington, Raja specializes in building web and AI applications using React, Python, TypeScript, and GraphQL. Raja actively shares knowledge on AI/ML through Medium articles and has spoken at industry conferences, helping fellow developers explore and implement AI-driven solutions.
