"""schedulers.py
=======================
Modulates the learning rate during the training process.
"""
import math

import torch
from torch.optim.lr_scheduler import ExponentialLR


class CosineAnnealingWithRestartsLR(torch.optim.lr_scheduler._LRScheduler):
    r"""Set the learning rate of each parameter group using a cosine annealing
    schedule, where :math:`\eta_{max}` is set to the initial lr and
    :math:`T_{cur}` is the number of epochs since the last restart in SGDR:

    .. math::
        \eta_t = \eta_{min} + \frac{1}{2}(\eta_{max} - \eta_{min})
        (1 + \cos(\frac{T_{cur}}{T_{max}}\pi))

    When last_epoch=-1, sets initial lr as lr.

    It has been proposed in `SGDR: Stochastic Gradient Descent with Warm
    Restarts`_. This implements the cosine annealing part of SGDR, as well as
    the restarts and the number-of-iterations multiplier.

    Args:
        optimizer (Optimizer): Wrapped optimizer.
        T_max (int): Maximum number of iterations.
        T_mult (float): Multiply T_max by this number after each restart.
            Default: 1.
        eta_min (float): Minimum learning rate. Default: 0.
        last_epoch (int): The index of last epoch. Default: -1.

    .. _SGDR\: Stochastic Gradient Descent with Warm Restarts:
        https://arxiv.org/abs/1608.03983
    """

    def __init__(self, optimizer, T_max, eta_min=0, last_epoch=-1, T_mult=1.0):
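        # NOTE: the constructor body is not recoverable from this compiled
        # dump. The lines below are a minimal reconstruction sketch based on
        # the docstring above; the bookkeeping attribute names
        # (`next_restart`, `last_restart`) are assumptions, not the author's
        # verified code.
        self.T_max = T_max          # length (in epochs) of the first annealing cycle
        self.T_mult = T_mult        # cycle-length multiplier applied at each restart
        self.eta_min = eta_min      # lower bound of the learning rate
        self.next_restart = T_max   # cycle length until the next restart
        self.last_restart = 0       # epoch index of the most recent restart
        super().__init__(optimizer, last_epoch)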