Improving response time, reliability, and lifetime of flash drives using machine learning


Abstract:

Solid-state drives (SSDs) are everywhere! Flash-based SSDs have established themselves as a higher-performance alternative to hard disk drives (HDDs) in cloud and mobile environments, and they are widely used for storage in mobile devices, laptops, digital cameras, and cloud servers. Hence, improving the performance of SSDs affects the overall computing system and the experience of millions of end users. SSDs deliver significantly higher speeds and are more reliable than HDDs; however, they remain a performance bottleneck in computing systems. They also still fail, which can result in data loss or system unavailability. Datacenter operators are therefore interested in predicting future drive failures to manage drive replacement, data migration, and drive acquisition strategies. This talk describes my research on improving the reliability and response time of flash-based storage systems using machine learning.

 

To improve reliability, we propose a machine-learning-based approach for automatically predicting SSD failures. We analyzed telemetry data collected over a span of six years from more than 30,000 drives running live applications in data centers to find the most critical reasons for SSD failures. The resulting approach predicts future SSD failures in data centers and keeps the model's predictions interpretable. To improve response time, we propose a neural-network-based approach to prefetching in SSDs. Prefetching is a technique that speeds up fetch operations by predicting future block accesses and preloading those blocks into main memory ahead of time. This research identifies the challenges of prefetching in SSDs, explains why prior approaches fail to achieve high accuracy, and presents a deep neural network (DNN) based prefetching approach that significantly outperforms the state of the art. I will conclude my talk with research challenges that I plan to address in the future.
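For readers new to prefetching, the idea can be sketched with a deliberately simple baseline. The Python example below is not the DNN-based approach presented in the talk. It is a toy successor-count predictor, and the class name `SuccessorPrefetcher`, the FIFO cache, and the block trace are all assumptions made for illustration. After each access it looks up which block most often followed the current one and preloads that block into a small in-memory cache.

```python
from collections import Counter, defaultdict


class SuccessorPrefetcher:
    """Toy prefetcher (illustrative assumption, not the talk's DNN model).

    Predicts the next block as the most frequent successor seen so far
    and preloads it into a small FIFO cache standing in for main memory.
    """

    def __init__(self, cache_size=4):
        self.successors = defaultdict(Counter)  # block -> counts of next blocks
        self.cache = []                         # cached blocks, oldest first
        self.cache_size = cache_size
        self.last_block = None
        self.hits = 0
        self.accesses = 0

    def _load(self, block):
        """Bring a block into the cache, evicting the oldest one if full."""
        if block in self.cache:
            return
        if len(self.cache) >= self.cache_size:
            self.cache.pop(0)
        self.cache.append(block)

    def access(self, block):
        """Serve one block request; return True if it was already cached."""
        self.accesses += 1
        hit = block in self.cache
        if hit:
            self.hits += 1
        else:
            self._load(block)

        # Learn the transition from the previous block to this one.
        if self.last_block is not None:
            self.successors[self.last_block][block] += 1
        self.last_block = block

        # Prefetch the most frequently observed successor, if any.
        candidates = self.successors[block]
        if candidates:
            predicted, _count = candidates.most_common(1)[0]
            self._load(predicted)
        return hit


if __name__ == "__main__":
    # Hypothetical repeating access pattern over four block addresses.
    trace = [10, 11, 12, 50] * 5
    prefetcher = SuccessorPrefetcher(cache_size=2)
    for block in trace:
        prefetcher.access(block)
    print(f"hit rate: {prefetcher.hits / prefetcher.accesses:.2f}")
```

A predictor like this only captures patterns that repeat exactly, which hints at why more expressive models are attractive for real workloads whose access patterns are more complex and interleaved.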



  Date and Time

  Location

  Hosts

  Registration



  • Date: 27 Apr 2022
  • Time: 06:30 PM to 08:00 PM
  • All times are (UTC-05:00) Eastern Time (US & Canada)

Join Zoom Meeting

https://Fairfield.zoom.us/j/99362571394

 

Meeting ID: 993 6257 1394

One tap mobile

+16468769923,,99362571394# US (New York)

+13017158592,,99362571394# US (Washington DC)

 

Dial by your location

        +1 646 876 9923 US (New York)

        +1 301 715 8592 US (Washington DC)

        +1 312 626 6799 US (Chicago)

        +1 669 900 6833 US (San Jose)

        +1 253 215 8782 US (Tacoma)

        +1 346 248 7799 US (Houston)

Meeting ID: 993 6257 1394

Find your local number: https://Fairfield.zoom.us/u/abkMOyC91h

  • Starts 15 April 2022 12:00 PM
  • Ends 27 April 2022 08:00 PM
  • All times are (UTC-05:00) Eastern Time (US & Canada)
  • No Admission Charge


  Speakers

Chandranil “Nil” Chakraborttii

Topic:

Improving response time, reliability, and lifetime of flash drives using machine learning


Biography:

Chandranil “Nil” Chakraborttii is a faculty member in the Department of Computer Science at Trinity College. Chakraborttii received his Ph.D. and master’s degree in computer science from the University of California, Santa Cruz, and an undergraduate degree in information technology from West Bengal State University, India. His main research interests are in machine learning, data science, and storage systems, with a focus on data centers. More specifically, Chakraborttii is interested in optimizing the performance of flash-based solid-state drives for cloud systems using machine learning techniques. This work follows two major directions: the reliability and the response time of flash-based storage devices. Before starting his Ph.D. program, Nil spent three years as a software engineer in industry, and he has also taught for the Stanford Summer Institutes for four years. He has collaborated with industry partners Samsung and Intel on research projects related to cloud storage systems and has jointly authored patents and publications.