EarthSight: Efficient Scheduling for Scalable Low-Latency Satellite Intelligence; Ansel Erol, Seungjun Lee, Divya Mahajan; Conference on Machine Learning and Systems (MLSys 2026), May 2026
NEST: Network- and Memory-Aware Device Placement for Distributed Deep Learning; Irene Wang, Vishnu Venkat Varma, Arvind Krishnamurthy, Divya Mahajan; Conference on Machine Learning and Systems (MLSys 2026), May 2026
Flashlight: PyTorch Compiler Extensions to Accelerate Attention Variants; Bozhi You, Irene Wang, Zelal Su Mustafaoglu, Abhinav Jangda, Angélica Moreira, Roshan Dathathri, Divya Mahajan, Keshav Pingali; Conference on Machine Learning and Systems (MLSys 2026), May 2026
Stream2LLM: Overlap Context Streaming and Prefill for Reduced Time-to-First-Token; Rajveer Bachkaniwala, Divya Mahajan, Kexin Rong; Conference on Machine Learning and Systems (MLSys 2026), May 2026
An Adaptive Vector Index Partitioning Scheme for Low-Latency RAG Pipeline; Junkyum Kim, Divya Mahajan; IEEE International Symposium on High-Performance Computer Architecture (HPCA 2026), February 2026
Artifact Available, Evaluated, and Reproduced
CATransformers: Carbon Aware Transformers Through Joint Model-Hardware Optimization; Irene Wang, Newsha Ardalani, Mostafa Elhoushi, Daniel Jiang, Samuel Hsia, Ekin Sumbul, Divya Mahajan, Carole-Jean Wu, Bilge Acun; Conference on Neural Information Processing Systems (NeurIPS 2025), December 2025
Characterizing the Efficiency of Distributed Training: A Power, Performance, and Thermal Perspective; Seokjin Go, Joongun Park, Spandan More, Hanjiang Wu, Irene Wang, Aaron Jezghani, Tushar Krishna, Divya Mahajan; International Symposium on Microarchitecture (MICRO 2025), October 2025
Pimba: A Processing-in-Memory Acceleration for Post-Transformer Large Language Model Serving; Wonung Kim, Yubin Lee, Yoonsung Kim, Jinwoo Hwang, Seongryong Oh, Jiyong Jung, Aziz Huseynov, Woong Gyu Park, Chang Hyun Park, Divya Mahajan, Jongse Park; International Symposium on Microarchitecture (MICRO 2025), October 2025
Artifact Available, Evaluated, and Reproduced
Characterizing Compute-Communication Overlap in GPU-Accelerated Distributed Deep Learning: Performance and Power Implications; Seonho Lee, Seokjin Go, Divya Mahajan; IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2025), May 2025 (Short Paper)
Gliding the Slipstream: Popularity-Based Embedding Skipping for Recommender Training; Yassaman Ebrahimzadeh Maboud, Muhammad Adnan, Divya Mahajan, Prashant J. Nair; Design Automation and Test in Europe (DATE 2025), April 2025
Artifact Available
Forecasting GPU Performance for Deep Learning Training and Inference; Seonho Lee, Amar Phanishayee, Divya Mahajan; International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2025), April 2025
Artifact Available, Evaluated, and Reproduced
Integrated Device Placement and Hardware Architecture Search; Irene Wang, Jakub Tarnawski, Amar Phanishayee, Divya Mahajan; International Conference on Machine Learning (ICML 2024), July 2024
Artifact Available, Featured in Conversation article
A Heterogeneous Acceleration Pipeline for Training Deep Recommendation Systems; Muhammad Adnan, Yassaman Ebrahimzadeh Maboud, Divya Mahajan, Prashant J. Nair; International Symposium on Computer Architecture (ISCA 2024), June 2024
Accelerating String-key Learned Index Structures via Memoization-based Incremental Training; Minsu Kim, Jinwoo Hwang, Guseul Heo, Sanghyuk An, Divya Mahajan, Jongse Park; Proceedings of the VLDB Endowment (VLDB 2024), May 2024
NeuPIMs: A NPU-PIM Heterogeneous Acceleration for Batched Inference of Large Language Model; Guseul Heo, Sangyeop Lee, Jaehong Cho, Hyunmin Choi, Sanghyeon Lee, Hyungkyu Ham, Gwangsun Kim, Divya Mahajan, Jongse Park; International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2024), April 2024
FLuID: Mitigating Stragglers in Federated Learning using Invariant Dropout; Irene Wang, Prashant J. Nair, Divya Mahajan; Conference on Neural Information Processing Systems (NeurIPS 2023), December 2023
Accelerating Recommendation System Training by Leveraging Popular Choices; Muhammad Adnan, Yassaman Ebrahimzadeh Maboud, Divya Mahajan, Prashant J. Nair; Proceedings of the VLDB Endowment (VLDB 2022)
A Computational Stack for Cross-Domain Acceleration; Sean Kinzer, Joon Kyung Kim, Soroush Ghodrati, Brahmendra Yatham, Alric Althoff, Divya Mahajan, Sorin Lerner, Hadi Esmaeilzadeh; IEEE International Symposium on High-Performance Computer Architecture (HPCA 2021), February 2021
Efficient Algorithms for Device Placement of DNN Graph Operators; Jakub Tarnawski, Amar Phanishayee, Nikhil R. Devanur, Divya Mahajan, Fanny Nina Paravecino; Conference on Neural Information Processing Systems (NeurIPS 2020), December 2020
In-RDBMS Hardware Acceleration of Advanced Analytics; Divya Mahajan, Joon Kyung Kim, Jacob Sacks, Adel Ardalan, Arun Kumar, Hadi Esmaeilzadeh; Proceedings of the VLDB Endowment (VLDB 2018), August 2018
RoboX: An End-to-End Solution to Accelerate Autonomous Control in Robotics; Jacob Sacks, Divya Mahajan, Behnam Khaleghi, R. Connor Lawson, Hadi Esmaeilzadeh; International Symposium on Computer Architecture (ISCA 2018), June 2018
Scale-out Acceleration for Machine Learning; Jongse Park, Hardik Sharma, Divya Mahajan, Joon Kyung Kim, Preston Olds, Hadi Esmaeilzadeh; International Symposium on Microarchitecture (MICRO 2017), October 2017
From High-Level Deep Neural Models to FPGAs; Hardik Sharma, Jongse Park, Divya Mahajan, Emmanuel Amaro, Joon Kyung Kim, Chenkai Shao, Asit Mishra, Hadi Esmaeilzadeh; International Symposium on Microarchitecture (MICRO 2016), October 2016
Towards Statistical Guarantees in Controlling Quality Tradeoffs in Approximate Acceleration; Divya Mahajan, Amir Yazdanbakhsh, Jongse Park, Bradley Thwaites, Hadi Esmaeilzadeh; International Symposium on Computer Architecture (ISCA 2016), June 2016
ApproxiGame: Towards Crowd-sourcing Quality Target Determination in Approximate Computing; Jongse Park, Emmanuel Amaro, Divya Mahajan, Bradley Thwaites, Hadi Esmaeilzadeh; International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2016), April 2016
Tabla: A Unified Template-based Framework for Accelerating Statistical Machine Learning; Divya Mahajan, Jongse Park, Emmanuel Amaro, Hardik Sharma, Amir Yazdanbakhsh, Joon Kyung Kim, Hadi Esmaeilzadeh; IEEE International Symposium on High-Performance Computer Architecture (HPCA 2016), March 2016
Distinguished Paper Award, Artifact Available
Axilog: Language Support for Approximate Hardware Design; Amir Yazdanbakhsh, Divya Mahajan, Bradley Thwaites, Jongse Park, Anandhavel Nagendrakumar, Sindhuja Sethuraman, Kartik Ramkrishnan, Nishanthi Ravindran, Rudra Jariwala, Abbas Rahimi, Hadi Esmaeilzadeh, Kia Bazargan; Design Automation and Test in Europe (DATE 2015), March 2015
Memristor Based Adders; Divya Mahajan, Matheen Mussadiq, Earl E. Swartzlander Jr.; Asilomar Conference on Signals, Systems and Computers, November 2014
Yin-Yang: Programming Abstractions for Cross-Domain Multi-Acceleration; Joon Kyung Kim, Byung Hoon Ahn, Sean Kinzer, Soroush Ghodrati, Rohan Mahapatra, Brahmendra Yatham, Dohee Kim, Parisa Sarikhani, Babak Mahmoudi, Divya Mahajan, Jongse Park, Hadi Esmaeilzadeh; IEEE Micro, September 2022
AxBench: A Multi-Platform Benchmark Suite for Approximate Computing; Amir Yazdanbakhsh, Divya Mahajan, Pejman Lotfi-Kamran, Hadi Esmaeilzadeh; IEEE Design & Test, May 2016
Axilog: Abstractions for Approximate Hardware Design and Reuse; Divya Mahajan, Kartik Ramkrishnan, Rudra Jariwala, Amir Yazdanbakhsh, Bradley Thwaites, Jongse Park, Anandhavel Nagendrakumar, Abbas Rahimi, Hadi Esmaeilzadeh, Kia Bazargan; IEEE Micro, October 2015
Ad-Rec: Advanced Feature Interactions to Address Covariate-Shifts in Recommendation Networks; Muhammad Adnan, Yassaman Ebrahimzadeh Maboud, Divya Mahajan, Prashant J. Nair; Workshop on Machine Learning for Systems (NeurIPS 2023)
Prediction-based Quality Control for Approximate Accelerators; Divya Mahajan, Amir Yazdanbakhsh, Jongse Park, Bradley Thwaites, Hadi Esmaeilzadeh; Workshop on Approximate Computing Across the System Stack (ASPLOS 2015)
MITHRA: Controlling Quality Tradeoffs in Approximate Acceleration; Divya Mahajan, Amir Yazdanbakhsh, Jongse Park, Bradley Thwaites, Hadi Esmaeilzadeh; Techcon, Semiconductor Research Corporation, September 2015
Thesis
Balancing Generality and Specialization for Machine Learning in the Post-ISA Era, Doctoral Thesis, Divya Mahajan, Georgia Institute of Technology, Atlanta, GA, USA
COC Best Dissertation Award
FPGA based Implementation of an ALU using Harvard Architecture, Undergraduate Thesis, Divya Mahajan, Indian Institute of Technology Ropar
Selected coverage of our work
Forbes article on creating the next generation of AI workforce: https://www.forbes.com/sites/committeeof200/2025/02/10/graduating-the-new-ai-ready-workforce-how-ga-tech-is-leading-the-way/
The future of computing: https://issuu.com/gatechcoe/docs/the_future_of_computing_helluva_engineer_magazin
How to tackle AI's growing energy demands: https://theconversation.com/ais-ballooning-energy-consumption-puts-spotlight-on-data-center-efficiency-254192
Using AI to build a better world: https://coe.gatech.edu/magazine/2024/spring/ai-better-world