Resource Allocation for Generative AI Workloads: Advanced Cloud Resource Management Strategies for Optimized Model Performance
  • Author(s): Pranav Murthy; Aditya Mehra; Lalit Mishra
  • Paper ID: 1704589
  • Page: 1428-1437
  • Published Date: 06-07-2023
  • Published In: Iconic Research And Engineering Journals
  • Publisher: IRE Journals
  • e-ISSN: 2456-8880
  • Volume/Issue: Volume 6 Issue 12 June-2023

This article builds on earlier work and examines cloud resource management strategies for improving the performance of generative AI models. It describes dynamic resource allocation approaches, including auto-scaling and spot instances, for handling unpredictable workloads. It also covers resource optimization through right-sizing and load balancing, which improve both performance and cost. Among cost management measures, cost allocation tags and budget alerts are explained as ways to track spending without hindering service delivery. The article further explains how real-time metrics and customized dashboards support performance monitoring of AI workloads. In addition, it covers data management techniques such as caching and data partitioning, along with model optimization options such as pruning and quantization. Hybrid cloud strategies that combine on-premises and cloud resources are presented as a way to balance resource use against business requirements, while security and compliance measures ensure that data is adequately protected. The article analyzes these strategies and draws lessons for organizations seeking to get the most out of generative AI in the cloud.


Generative AI, Cloud Computing, Dynamic Resource Allocation, Auto-scaling, Spot Instances, Resource Optimization, Right-Sizing, Load Balancing, Cost Management


IRE Journals:
Pranav Murthy, Aditya Mehra, Lalit Mishra, "Resource Allocation for Generative AI Workloads: Advanced Cloud Resource Management Strategies for Optimized Model Performance," Iconic Research And Engineering Journals, Volume 6, Issue 12, 2023, Page 1428-1437

Pranav Murthy, Aditya Mehra, Lalit Mishra "Resource Allocation for Generative AI Workloads: Advanced Cloud Resource Management Strategies for Optimized Model Performance" Iconic Research And Engineering Journals, 6(12)