Transformer Ranker
Choosing the Right Transformer Model for Your Classification Task
The problem: There are too many pre-trained language models (LMs) out there. But which of them is best for your NLP classification task? Since fine-tuning LMs is costly, it is not feasible to try them all!
The solution: Transferability estimation with TransformerRanker!
TransformerRanker is a library that

- quickly finds the best-suited language model for a given NLP classification task. All you need to do is select a dataset and a list of pre-trained language models (LMs) from the 🤗 HuggingFace Hub. TransformerRanker will quickly estimate which of these LMs will perform best on the given task (see the usage sketch after this list)!
- efficiently performs layerwise analysis of LMs. Transformer LMs have many layers. Use TransformerRanker to identify which intermediate layer is best suited for a downstream task!
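A minimal usage sketch is shown below, assuming the package exposes a TransformerRanker class that takes a 🤗 dataset and a list of model handles; the exact parameter names (dataset_downsample, batch_size) are assumptions here, so check the repository for the current API.

```python
from datasets import load_dataset
from transformer_ranker import TransformerRanker  # assumed import path

# Load a text classification dataset from the HuggingFace Hub
dataset = load_dataset("trec")

# Candidate language models to compare (any model handles from the Hub)
language_models = [
    "bert-base-uncased",
    "roberta-base",
    "distilbert-base-uncased",
    "google/electra-base-discriminator",
]

# Rank the candidates on a downsampled portion of the dataset
# (dataset_downsample and batch_size are assumed parameter names)
ranker = TransformerRanker(dataset, dataset_downsample=0.2)
results = ranker.run(language_models, batch_size=64)
print(results)  # estimated ranking of the candidate LMs
```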
Abstract (Approach Paper)
There currently exists a multitude of pre-trained transformer language models (LMs) that are readily available. From a practical perspective, this raises the question of which pre-trained LM will perform best if fine-tuned for a specific downstream NLP task. However, exhaustively fine-tuning all available LMs to determine the best-fitting model is computationally infeasible. To address this problem, we present an approach that inexpensively estimates a ranking of the expected performance of a given set of candidate LMs for a given task. Following a layer-wise representation analysis, we extend existing approaches such as H-score and LogME by aggregating representations across all layers of the transformer model. We present an extensive analysis of 20 transformer LMs, 6 downstream NLP tasks, and various estimators (linear probing, kNN, H-score, and LogME). Our evaluation finds that averaging the layer representations significantly improves the Pearson correlation coefficient between the true model ranks and the estimate, increasing from 0.58 to 0.86 for LogME and from 0.65 to 0.88 for H-score.
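To make the layer-aggregation idea concrete, here is a rough, self-contained sketch (not the library's actual implementation): it embeds a small labeled sample with a candidate LM, averages the hidden states of all transformer layers, and scores the resulting features with a linear probe as a cheap stand-in for estimators such as H-score or LogME. The model handles and the tiny example dataset are placeholders.

```python
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def layer_mean_embeddings(model_name, texts, device="cpu"):
    """Embed each text and average the representations of all transformer layers."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name, output_hidden_states=True).to(device).eval()
    features = []
    with torch.no_grad():
        for text in texts:
            inputs = tokenizer(text, return_tensors="pt", truncation=True).to(device)
            # hidden_states: tuple of (num_layers + 1) tensors, each [1, seq_len, dim]
            hidden_states = model(**inputs).hidden_states
            layer_stack = torch.stack(hidden_states)          # [layers, 1, seq_len, dim]
            layer_mean = layer_stack.mean(dim=0)              # average across layers
            sentence_vec = layer_mean.mean(dim=1).squeeze(0)  # mean-pool over tokens
            features.append(sentence_vec.cpu())
    return torch.stack(features).numpy()

def estimate_transferability(model_name, texts, labels):
    """Linear-probe accuracy as a cheap proxy score; H-score or LogME could be used instead."""
    X = layer_mean_embeddings(model_name, texts)
    return cross_val_score(LogisticRegression(max_iter=1000), X, labels, cv=3).mean()

# Toy labeled sample (placeholder data) and two small candidate LMs
texts = ["great movie", "loved it", "fantastic plot",
         "terrible acting", "boring film", "awful script"]
labels = [1, 1, 1, 0, 0, 0]

for lm in ["prajjwal1/bert-tiny", "distilbert-base-uncased"]:
    print(lm, estimate_transferability(lm, texts, labels))
```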