The Doctoral School in Science and Engineering is happy to invite you to Karthick PANNER SELVAM’s defence entitled
Performance Prediction Models for Deep Learning: A Graph Neural Network and Large Language Model Approach
Supervisor: Dr. Mats Håkan BRORSSON
In this thesis, we develop performance prediction models that estimate critical metrics of Deep Learning (DL) models, such as latency, memory consumption, and energy usage. These models are designed to support neural architecture search and efficient cloud deployment.
Deep learning has transformed domains such as computer vision, natural language processing, climate modeling, and scientific computing. However, the increasing complexity of DL models introduces significant computational demands that require efficient hardware utilization and resource allocation. Accurate prediction of performance metrics is essential for optimizing hardware-specific compilers, enabling cost-effective cloud deployments, and minimizing environmental impacts.
To address these challenges, we present a comprehensive framework for performance prediction in this thesis. The work begins with a systematic benchmarking study of DL models that highlights computational bottlenecks and establishes the need for accurate performance prediction, laying the foundation for the contributions that follow.
We introduce a Graph Neural Network (GNN)-based performance prediction model capable of analyzing DL models from various software frameworks, including PyTorch and TensorFlow. This model predicts performance metrics and recommends NVIDIA Multi-Instance GPU (MIG) profiles for efficient deployment. Building on this, we propose a semi-supervised performance prediction approach that leverages unlabeled data to accelerate training convergence. Using a graph autoencoder for unsupervised learning, we generate high-quality embeddings that enhance supervised training, leading to faster and more accurate predictions.
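To give a flavor of this approach, the following is a minimal sketch of a GNN regressor over a model's operator graph, assuming PyTorch Geometric; the node-feature encoding (e.g., one-hot operator type plus tensor-shape statistics), the two-layer GCN, and all dimensions are illustrative assumptions, not the architecture from the thesis.

import torch
import torch.nn as nn
from torch_geometric.nn import GCNConv, global_mean_pool

class GraphPerfPredictor(nn.Module):
    """Regresses per-model metrics (e.g., latency, memory, energy)
    from a DL model's operator graph."""
    def __init__(self, in_dim: int, hidden: int = 64, n_metrics: int = 3):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden)
        self.conv2 = GCNConv(hidden, hidden)
        self.head = nn.Sequential(
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Linear(hidden, n_metrics)
        )

    def forward(self, x, edge_index, batch):
        h = self.conv1(x, edge_index).relu()  # message passing over operators
        h = self.conv2(h, edge_index).relu()
        g = global_mean_pool(h, batch)        # one embedding per model graph
        return self.head(g)                   # predicted [latency, memory, energy]

A framework-specific front end (for instance, a torch.fx trace for PyTorch or a graph parser for TensorFlow) would convert each model into the node features x and connectivity edge_index before training.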
For Large Language Models (LLMs), which present unique challenges due to the sheer number of nodes and edges in their computation graphs, we propose a tree-based performance prediction model. This method significantly improves inference speed compared to traditional GNN-based techniques, making it particularly suitable for complex LLM architectures.
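The thesis's exact tree-based model is not detailed here; as one plausible reading, a gradient-boosted-tree regressor fitted on flattened architecture features avoids message passing entirely. The sketch below uses scikit-learn, and every feature name and value is an illustrative placeholder, not data from the thesis.

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Illustrative placeholder features for transformer configurations:
# [num_layers, hidden_size, num_heads, sequence_length, batch_size]
X_train = np.array([
    [12,  768, 12,  512, 1],
    [24, 1024, 16, 1024, 1],
    [32, 4096, 32, 2048, 1],
], dtype=float)
# Targets would come from profiling runs; these values are placeholders.
y_train = np.array([1.0, 4.0, 30.0])

reg = GradientBoostingRegressor(n_estimators=200, max_depth=4)
reg.fit(X_train, y_train)

# Prediction is a handful of tree traversals per estimator, which is why
# inference is far cheaper than GNN message passing over a huge graph.
print(reg.predict([[16, 1024, 16, 512, 1]]))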
Finally, we explore multimodal learning by combining an LLM with a GNN to create a hybrid performance prediction model. This model adapts quickly to new hardware environments with only sparse training samples, using a novel three-stage training strategy to integrate the GNN and LLM components effectively.
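As a hypothetical sketch of such a fusion (the module structure, dimensions, and staging below are assumptions for illustration, not the design from the thesis):

import torch
import torch.nn as nn

class HybridPerfPredictor(nn.Module):
    """Fuses a graph embedding of the DL model with an LLM embedding of,
    e.g., a textual hardware description, then regresses a metric."""
    def __init__(self, g_dim: int = 64, t_dim: int = 384, hidden: int = 128):
        super().__init__()
        self.g_proj = nn.Linear(g_dim, hidden)  # projects the GNN output
        self.t_proj = nn.Linear(t_dim, hidden)  # projects the LLM output
        self.head = nn.Sequential(
            nn.Linear(2 * hidden, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, g_emb: torch.Tensor, t_emb: torch.Tensor) -> torch.Tensor:
        z = torch.cat([self.g_proj(g_emb), self.t_proj(t_emb)], dim=-1)
        return self.head(z)

# An assumed three-stage schedule: (1) pre-train the GNN encoder on abundant
# data from known hardware; (2) freeze both encoders and fit the fusion head;
# (3) fine-tune end-to-end on the few samples available for the new hardware.

Freezing the encoders in the middle stage is one common way to keep pre-trained representations intact when only a handful of labeled samples exist for the target hardware.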