How does AdamW differ from Adam with L2 regularization?

What is the defining distinction between AdamW and adding an L2 penalty to the objective optimized by Adam?
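One way to make the question concrete is to write both update rules side by side for a single scalar parameter. The sketch below is a minimal illustration, not any library's implementation: the function names `adam_l2_step` and `adamw_step`, the default hyperparameters, and the choice to scale the decoupled decay by the learning rate are all assumptions made for this example. In the L2 variant the penalty gradient `wd * theta` is added to the loss gradient before the moment estimates, so it gets rescaled by the adaptive denominator. In the AdamW variant the moments see only the loss gradient, and the decay is applied to the weight directly.

```python
import math


def adam_l2_step(theta, grad, m, v, t,
                 lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8, wd=1e-2):
    """One Adam step on loss + (wd / 2) * theta**2 (L2 penalty in the objective)."""
    g = grad + wd * theta                 # penalty gradient is folded into g
    m = beta1 * m + (1 - beta1) * g       # first moment includes the penalty
    v = beta2 * v + (1 - beta2) * g * g   # second moment includes it too
    m_hat = m / (1 - beta1 ** t)          # bias correction, t starts at 1
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


def adamw_step(theta, grad, m, v, t,
               lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8, wd=1e-2):
    """One step with decoupled weight decay (assumed form: decay scaled by lr)."""
    m = beta1 * m + (1 - beta1) * grad        # moments see the loss gradient only
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    adam_update = lr * m_hat / (math.sqrt(v_hat) + eps)
    decay = lr * wd * theta                   # applied outside the adaptive scaling
    return theta - adam_update - decay, m, v


if __name__ == "__main__":
    # Zero loss gradient isolates the effect of the regularization term.
    theta_l2, _, _ = adam_l2_step(1.0, 0.0, 0.0, 0.0, t=1)
    theta_w, _, _ = adamw_step(1.0, 0.0, 0.0, 0.0, t=1)
    print("Adam + L2 moved by:", 1.0 - theta_l2)
    print("AdamW moved by:   ", 1.0 - theta_w)
```

With a zero loss gradient on the first step, the L2 variant moves the weight by roughly `lr` regardless of `wd`, because the penalty gradient is divided by the square root of its own second-moment estimate. The decoupled variant moves it by `lr * wd * theta`, so the decay strength stays proportional to the weight and independent of the gradient history.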