Introduction
I'm a PhD candidate in the Computer and Data Sciences (CDS) department at
Case Western Reserve University (CWRU)
working with my advisor
Prof. Li and co-advisor
Prof. Yin. My current research covers knowledge distillation, pruning and quantization,
self-training with direct preference optimization, truthfulness analysis,
knowledge probing, and chain-of-thought prompting in large language models
(e.g., the LLaMA model family), as well as security and privacy issues in federated learning.
Recently, I have been actively writing papers for prestigious machine learning conferences
such as NeurIPS, ICLR, ICML, ACL, EMNLP, NAACL, and COLM.
Before joining CWRU, I obtained my master's degree in Electrical and Computer Engineering
(ECE, Machine Learning and Data Science track)
from the University of Southern California (USC), where I was a research member of
the Data Science Lab supervised by
Prof. Kuppannagari, the Hardware Accelerated Learning (HAL) Research Group
supervised by Dr. Kundu, and the Neuro Image Computing Research (NICR) Group
supervised by
Prof. Shi. During my time at USC, I used PyTorch to work on
applied machine learning topics such as imputing missing smart-meter data, studying privacy
in knowledge distillation for computer vision, and filtering streamlines from diffusion MRI tractography.
I received my Bachelor's degree in Automation from Nanjing University of
Science and Technology. I also spent a fall semester as a transfer graduate student in the
ECE Department of North Carolina State University (NCSU), where I strengthened my background
in probability and improved my coding ability. I attended Chengdu No. 7
High School (成都市第七中学).
Education
• Case Western Reserve University (CWRU), Cleveland, OH
Aug. 2022 – May 2027 (Expected)
Doctor of Philosophy in Computer and Data Sciences, total GPA 3.67/4.00
• University of Southern California (USC), Los Angeles, CA
Jan. 2020 – Dec. 2021 (Graduated)
Master of Science in Electrical and Computer Engineering, total GPA 4.00/4.00
• North Carolina State University (NCSU), Raleigh, NC
Aug. 2019 – Dec. 2019 (Transfer)
Master of Science in Electrical and Computer Engineering, total GPA 4.00/4.00
• Nanjing University of Science and Technology (NJUST), Nanjing, China
Sep. 2015 – Jun. 2019 (Graduated)
Bachelor of Engineering in Automation, total GPA 3.39/4.00 (43/175)
Publications
Journal
[1] Souvik Kundu, Yao Fu, Bill Ye, Peter A. Beerel, and Massoud Pedram. Towards Adversary Aware Non-Iterative Model Pruning Through Dynamic Network Rewiring of DNNs.
Association for Computing Machinery (ACM), August 2021.
Conference
[1] Souvik Kundu, Qirui Sun, Yao Fu, Massoud Pedram, and Peter A. Beerel. Analyzing the Confidentiality of Undistillable Teachers in Knowledge Distillation.
Advances in Neural Information Processing Systems (NeurIPS 2021), May 2021.
[2] Sanmukh R. Kuppannagari, Yao Fu, Chung Ming Cheung, and Viktor K. Prasanna. Spatio-Temporal Missing Data Imputation for Smart Power Grids. 3rd International
Workshop on Applied Machine Learning for Intelligent Energy Systems (AMLIES 2021), May 2021.
[3] Yuan Li, Xinyu Nie, Yao Fu, and Yonggang Shi. FASSt: Filtering via Symmetric Autoencoder for Spherical Superficial White Matter Tractography.
In International Workshop on Computational Diffusion MRI, pp. 129-139. Cham: Springer Nature Switzerland, 2023.
Skills
• Languages: Python, Java, C/C++, SQL, MATLAB, R
• Developer Tools: PyCharm, VS Code, CLion, Git, IntelliJ, Eclipse
• Libraries: PyTorch, Hugging Face, Scikit-Learn, NumPy, Pandas, TensorFlow, Keras, OpenCV, NLTK
Experience
• PhD Research Assistant (CDS Department at CWRU)
Aug. 2022 – May 2027 (Expected)
=> Project 1: Explored knowledge distillation and parameter-efficient fine-tuning (LoRA, Quantization-aware LoRA, LoRAPrune) for
large language models (the LLaMA model family), evaluating mathematical reasoning on GSM8K.
=> Project 2: Implemented Training of Truth and Polarity Direction and Logistic Regression to probe LLaMA-3.1-8B's internal
representations to evaluate its truthfulness, measured by classification accuracy of probing classifiers on TruthQA.
=> Project 3: Implemented Activation-aware Weight Quantization (AWQ) and Pruning by Weights and Activations (Wanda) to compress
OPT-1.3B and LLaMA-2-7B. Compression quality is measured by perplexity on WikiText-2, zero-shot classification on commonsense reasoning,
and final model size.
=> Project 4: Implemented Self-training with Direct Preference Optimization and Self-correct Reasoning for small-scale
language models (LLaMA-2-7B/13B) on mathematical and commonsense reasoning.
=> Project 5: Deployed a quantized LLaMA-2-7B as a chatbot on a 14-inch MacBook Pro and measured the end-to-end latency
improvement achieved by different optimization techniques (loop unrolling, multithreading, and SIMD
programming) for the linear kernel.
=> Project 6: Implemented one-shot federated learning via data distillation or data-free knowledge distillation techniques to reduce communication costs.
=> Project 7: Used PyTorch to implement both unconditional and conditional Denoising Diffusion Probabilistic Models (DDPM) on CIFAR-10,
where the conditional one includes Classifier-Free Guidance (CFG) and Exponential Moving Average (EMA).
• Master Research Assistant (Keck School of Medicine at USC)
Sep. 2021 – Jun. 2022
=> Project: Filtered brain streamlines from diffusion MRI tractography via deep learning models (U-Net and Autoencoders).
• Master Research Assistant (HAL Research Group at USC)
Jan. 2021 – Dec. 2021
=> Project: Explored robustness and privacy issues associated with model compression (pruning,
quantization, and knowledge distillation) for image classification and object detection.
• Master Research Assistant (Data Science Lab at USC)
May 2020 – Sep. 2021
=> Project: Implemented missing data imputation using denoising Autoencoders and spatio-temporal graph neural networks.
Courses
Case Western Reserve University (CWRU)
• CSDS-410: Analysis of Algorithms. (my grade: A)
• CSDS-425: Computer Networks. (my grade: C)
• CSDS-433: Database Systems. (my grade: A)
• CSDS-456: Data Privacy. (my grade: A)
• CSDS-497: Natural Language Processing. (my grade: A)
• CSDS-600: Special Topics (Deep Generative Models). (my grade: A)
• CSDS-233: Introduction to Data Structures. (Teaching Assistant)
• CSDS-438: High Performance Data and Computing. (Teaching Assistant)
University of Southern California (USC)
• EE-510: Linear Algebra for Engineering. (my grade: A)
• EE-546: Mathematics of High-Dimensional Data. (my grade: A)
• EE-559: Machine Learning 1: Supervised Methods - basics of supervised classification and regression. (my grade: A)
• EE-569: Introduction to Digital Image Processing. (my grade: A)
• EE-660: Machine Learning 2: Mathematical Foundations and Methods - semi-supervised and unsupervised machine learning; domain adaptation and transfer learning;
techniques for interpretable machine learning; feasibility of learning, model complexity, and performance (error) on unseen data. (my grade: A)
• CSCI-455x: Introduction to Programming System Design - basics of Java, C++ and Unix/Linux. (my grade: A)
• CSCI-570: Analysis of Algorithms. (my grade: A)
North Carolina State University (NCSU)
• ECE-513: Digital Signal Processing. (my grade: A)
• ECE-514: Random Process - basics of probability, statistics, and random processes. (my grade: A)
• ECE-558: Digital Imaging Systems - basics of digital image processing and computer vision. (my grade: A)
Online Courses
• Stanford-CS221: Artificial Intelligence.
• Stanford-CS224n: Natural Language Processing with Deep Learning.
• Stanford-CS224u: Natural Language Understanding.
• Stanford-CS224w: Machine Learning with Graphs.
• Stanford-CS229: Machine Learning.
• Stanford-CS231n: Deep Learning for Computer Vision.
• Stanford-CS330: Deep Multi-Task & Meta Learning.
• MIT-6.0001: Introduction to Computer Science and Programming in Python.
• MIT-6.0002: Introduction to Computational Thinking and Data Science.
• MIT-6.0006: Introduction to Algorithms.
• MIT-6.5940: TinyML and Efficient Deep Learning Computing.
• MIT-18.650: Statistics for Applications.
Honors
• MS Honors Program: USC Ming Hsieh Department of Electrical Engineering, Fall 2021.
Hobby
• Reading: I am particularly interested in novels by Keigo Higashino (东野圭吾) such as “Journey Under the Midnight Sun (白夜行)”
and “The Miracles of the Namiya General Store (解忧杂货店)”, and in history books on China, the United States,
World War I and II, Europe, and so on. Feel free to chat with me!
• Exercising: I often run 4.5 miles or go to the gym for anaerobic exercises to relax.
• Watching: I am fond of watching Korean Dramas (All of Us Are Dead, Squid Game, inter alia)
and American Dramas (Mayor of Kingstown, Special Ops: Lioness, House of the Dragon,
The Last of Us, The Walking Dead, inter alia).
• Playing Video Games: League of Legends, PUBG: Battlegrounds, Call of Duty, World War Z: Aftermath, and so on.
Contact
• Email address: yxf484@case.edu
• Google Scholar
• LinkedIn
• GitHub
• Twitter