Abstract: Recently, large Transformer models have achieved impressive results on various natural language processing tasks, but they require an enormous number of parameters and intensive computation, necessitating ...