Quiz Verified
How can label smoothing affect knowledge distillation?
Posted Jun 23, 2026
Question: A teacher network is trained using substantial label smoothing and is later used for knowledge distillation. Which effect is most consistent with published findings?
A) Distillation necessarily improves because smoothed targets always produce richer dark knowledge
B) The teacher's top-1 accuracy must decline, making distillation impossible
C) Distillation may become less effective because the teacher logits retain less information about inter-class similarities
D) Label smoothing changes only optimization speed and cannot affect the information transferred to the student
Correct: C
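Explanation: Label smoothing can improve the teacher's generalization and calibration, but it also tends to pull the representations of each class into tight clusters. This erases much of the relative information in the non-target logits, that is, how closely an example resembles each of the other classes. That relative information is a large part of what the student learns from during distillation, so a heavily smoothed teacher may transfer less even when its own accuracy is unchanged or better.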
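To make the explanation concrete, here is a minimal NumPy sketch of the two ingredients: uniform label smoothing of the training targets, and the temperature-softened teacher distribution that a student is trained to match. The logit vectors and the helper `non_target_spread` are hypothetical, made up for illustration rather than measured from any trained model. They only show what "less information in the non-target logits" would look like if smoothing flattened them.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def smooth_labels(target, num_classes, alpha):
    """Uniform label smoothing: (1 - alpha) * one_hot + alpha / num_classes."""
    one_hot = np.eye(num_classes)[target]
    return (1.0 - alpha) * one_hot + alpha / num_classes

def distillation_kl(teacher_logits, student_logits, temperature):
    """KL(teacher || student) between temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return float(np.sum(p * (np.log(p) - np.log(q))))

def non_target_spread(logits, target, temperature=1.0):
    """Hypothetical diagnostic: std of the renormalized non-target probabilities.
    A value near 0 means the teacher does not rank the wrong classes at all."""
    p = softmax(logits, temperature)
    rest = np.delete(p, target)
    rest = rest / rest.sum()
    return float(rest.std())

# Made-up logits for one example of class 0 in a 4-class problem.
teacher_plain = [6.0, 3.5, 0.5, -1.0]     # class 1 looks much closer than class 3
teacher_smoothed = [6.0, 1.0, 1.0, 1.0]   # assumed extreme case: non-targets flattened

if __name__ == "__main__":
    print("smoothed target:", smooth_labels(0, 4, alpha=0.1))
    for name, logits in [("plain", teacher_plain), ("smoothed", teacher_smoothed)]:
        print(name, "soft targets at T=4:", softmax(logits, temperature=4.0))
        print(name, "non-target spread:", non_target_spread(logits, 0, temperature=4.0))
```

In this toy setup both teachers predict class 0, yet only the first tells the student which wrong classes are near misses. A student minimizing `distillation_kl` against the flattened teacher receives little beyond the hard label, which is the effect option C describes.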
Topic: advanced ML / label smoothing / knowledge distillation