About Me

I am a Ph.D. candidate at the Luddy School of Informatics, Computing, and Engineering at Indiana University, Bloomington (IU). I received a master’s degree (MS) in Computer Science from the University of California, Riverside (UCR). My research interests include large-scale AI systems, high-performance computing, distributed training and inference, and data compression.

News

  • I am graduating and actively looking for positions starting June 2026!

  • Collaborating with IU Medical School to explore early-stage cancer detection via tissue image and SNP analysis!

  • Serving as a student volunteer at SC25!

  • Internship at ByteDance in San Jose starting May 2025!

  • COMPSO paper accepted to PPoPP 2025!

Research Vision

My goal is to build next-generation high-performance AI systems infrastructure for large-scale foundation models and scientific AI on heterogeneous supercomputing platforms, in close collaboration with national laboratories and industry. I aim to close the performance gap between modern AI workloads and HPC architectures through algorithm–system co-design across data, communication, memory, and accelerators.

Selected Publications

  1. B. Sun, W. Liu, J. Pauloski, J. Tian, J. Jia, D. Wang, B. Zhang, M. Zheng, S. Di, S. Jin, Z. Zhang, X. Yu, K. Iskra, P. Beckman, G. Tan, and D. Tao, “COMPSO: Optimizing Gradient Compression for Distributed Training with Second-Order Optimizers,” accepted to ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP) 2025. Paper.
  2. B. Sun, S. Di, and F. Song, “CKVC: Improving Large Language Model Inference via KV Cache Reduction,” to be submitted to ICS 2026. Research project building upon KVSort.
  3. B. Sun, X. Yu, D. Wang, S. Jin, C. Zhang, L. Ma, K. Iskra, T. Zhou, T. Bicer, P. Beckman, N. Sun, G. Tan, and D. Tao, “A High-Performance Data Loading Framework for Distributed DNN Training in the Cloud,” 2022. Paper.
  4. C. Zhang, S. Smith, B. Sun, J. Tian, J. Soifer, X. Yu, S. L. Song, Y. He, and D. Tao, “Heat: A highly efficient and affordable training system for collaborative filtering based recommendation on CPUs,” in Proceedings of the 37th ACM International Conference on Supercomputing, ser. ICS ’23. New York, NY, USA: Association for Computing Machinery, 2023, pp. 324–335. Paper.
  5. C. Zhang, B. Sun, X. Yu, Z. Xie, W. Zheng, K. A. Iskra, P. Beckman, and D. Tao, “Benchmarking and in-depth performance study of large language models on Habana Gaudi processors,” in Proceedings of the SC ’23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, ser. SC-W ’23. New York, NY, USA: Association for Computing Machinery, 2023, pp. 1759–1766. Paper.
  6. J. Jia, C. Xie, H. Lu, D. Wang, H. Feng, C. Zhang, B. Sun, H. Lin, Z. Zhang, X. Liu, and D. Tao, “SDP4Bit: Toward 4-bit Communication Quantization in Sharded Data Parallelism for LLM Training,” accepted by the Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS) 2024. Paper.
  7. A. V. Babu, H. Chan, M. J. Cherukara, J. M. Monsalve Diaz, J. Doerfert, J. Feinstein, I. Foster, R. J. Harder, K. Hickey, J. Hückelheim, S. Kandel, R. Kettimuthu, T. Kumar, Z. Liu, M. MacDonell, A. Miceli, M. Ngom, P. Pal, N. Paulson, K. Picel, K. Raghavan, A. Ramanathan, E. Rangel, S. Raskar, V. Sastry, G. Sivaraman, B. Sun, M. Trovato, L. Valentino, Z. Xie, E. Yan, Y. Yao, K. Yoshii, X. Yu, and T. Zhou, “2022 AI testbed expeditions report,” Argonne National Laboratory (ANL), Argonne, IL (United States), Tech. Rep., 2022. Paper.

Invited Talks

  • 2025 ENGR-E 516 Engineering Cloud Computing Guest Lecturer. Link
  • 2023 Argonne Summer Student Seminar. Link
  • 2023 FZ Workshop Annual Meeting. Link

Recent Awards

  • 2025 SC Student Travel Award
  • 2025 NSF PPoPP Travel Grant
  • 2024 SIGHPC SC Student Travel Grants (1st)
  • 2024 SIGHPC SC Student Travel Grants (2nd)
  • 2022 SIGHPC Student Research Competition Travel Grants

Teaching

Fall 2025 Guest Lecturer, ISE (Indiana University Bloomington)
ENGR-E 516: Engineering Cloud Computing
Fall 2024 Associate Instructor, ISE (Indiana University Bloomington)
ENGR-E 516: Engineering Cloud Computing
Spring 2021 Teaching Assistant, EECS (Washington State University)
CPT_S 460: Operating Systems and Computer Architecture

Other Publications

  1. X. Wei, F. Ye, O. Yonay, X. Chen, B. Sun, D. Tao, and T. Yang, “FastCLIP: A suite of optimization techniques to accelerate CLIP training with limited resources.” Paper.
  2. D. Wang, P. Grosset, J. Pulido, J. Tian, T. Athawale, J. Jia, B. Sun, B. Zhang, S. Jin, K. Zhao, J. Ahrens, and F. Song, “STZ: A High Quality and High Speed Streaming Lossy Compression Framework for Scientific Data,” The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC) 2025. Paper.
  3. (Co-1st author) S. Huang, B. Sun, L. Fan, X. Chen, J. Tian, Q. Shen, H.-Y. Wan, D. Tao, and E. Bao, “Ufn: User-friendly navigation framework based on systematic utilization of user-friendliness features from historical traffic data,” Journal of Computer Science and Technology, 2024. Paper.
  4. Q. Shen, S. Huang, B. Sun, X. Chen, D. Tao, H. Wan, and E. Bao, “Pvii: A pedestrian-vehicle interactive and iterative prediction framework for pedestrian’s trajectory,” Applied Intelligence, pp. 1–11, 2024. Paper.
  5. B. Sun, X. Yu, D. Tao. 2024. KVSort: Drastically Improving LLM Inference Performance via KV Cache Compression. Accepted to the International Conference for High-Performance Computing, Networking, Storage, and Analysis (SC) 2024 ACM Student Research Competition Poster. Poster.
  6. B. Sun, X. Yu, K. Iskra, D. Tao. 2022. SurrogateTrain: Drastically Improving Performance of Data Loading for Training Scientific Surrogate Models. The International Conference for High-Performance Computing, Networking, Storage, and Analysis (SC’22) ACM Student Research Competition Poster. Poster.

Hobbies

Table Tennis, Rock Climbing, Go, Acoustic Guitar.