CV
Email: marshallrui@gmail.com
GPA: 3.80/4.0 (87.18/100)
Education
- B.S., Hangzhou City University, 2024
- Exchange Program, Zhejiang University, 2023
Work Experience
- High-Performance Computing Engineer, Hangzhou City University Supercomputing Center, Jul 2024
- Large Language Model (LLM) Optimization: Optimized the performance of Llama 70B, focusing on faster inference through parallel inference frameworks.
- Computational Operator Optimization in HPC: Optimized computational operators in traditional high-performance computing (HPC) environments to improve computational efficiency and throughput.
Awards
Feb. 2024: Team Leader, Second Prize @ ASC24 Student Supercomputer Challenge Preliminary
Aug. 2023: Team Leader, Excellence Award @ “Shenwei-Guoshi Cup” 7th National CPU Parallel Application Challenge
Aug. 2022: Team Leader, Excellence Award @ ACM China - International Parallel Computing Challenge
Skills
- Programming Languages
- C/C++: Experienced in C/C++ programming, with strong fundamentals and adherence to performance-oriented coding practices.
- Python: Experienced in Python programming, used for various computational and data analysis tasks.
- Heterogeneous Computing and Parallel Programming
- CUDA/OpenMP/MPI: Skilled in CUDA, OpenMP, and MPI parallel programming models, capable of developing efficient parallel programs using these technologies. Familiar with the CUDA computational model and its architectural evolutions.
- Libraries and Tools: Experienced in using cuBLAS and Intel MKL libraries for high-performance mathematical and linear algebra computations.
- Linux System Skills
- System Calls and Low-Level Programming: Proficient in Linux system calls and x86 assembly language; capable of low-level system programming.
- Performance Analysis and Optimization: Skilled in using profiling and analysis tools such as Profile, VTune, gperf, and Valgrind for in-depth performance analysis and optimization. Capable of diagnosing and resolving complex performance issues.
- Code Analysis and Debugging: Familiar with debugging tools; experienced in using gdb and NVIDIA Nsight for code analysis and debugging.