This seminar has been rescheduled to a later date in November.
Abstract: As large language models (LLMs) advance, their high training and inference costs have become a major bottleneck. This talk focuses on cutting-edge algorithms for improving LLM efficiency. For pre-training, we will discuss data optimization methods that accelerate training by improving data quality. For inference, we will explore model compression (knowledge distillation) and architecture optimization (efficient attention mechanisms) as pathways to next-generation efficient model design.
