Adversarial GLUE Evaluation Suite

Description

This evaluation suite compares a model's results on GLUE with its results on Adversarial GLUE (AdvGLUE), a multi-task benchmark that evaluates the robustness of modern large-scale language models against various types of adversarial attacks.

How to use

This suite requires installing the IntelAI/evaluate fork.

After installation, there are two steps: (1) loading the Adversarial GLUE suite; and (2) calculating the metric.

  1. Loading the Adversarial GLUE suite: the suite loads evaluation subtasks for the following tasks on both the AdvGLUE and GLUE datasets: sst2, mnli, qnli, rte, and qqp.

More information about the different subsets of the GLUE dataset can be found on the GLUE dataset page.

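  2. Calculating the metric: the suite takes one input, the name of the model or pipeline:
from evaluate import EvaluationSuite

# Load the suite definition, which covers the GLUE and AdvGLUE subtasks.
suite = EvaluationSuite.load('intel/adversarial_glue')
# Run every subtask against the named model or pipeline.
mc_results, = suite.run("gpt2")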
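
Once the suite has run, the per-task scores can be paired up to see how much accuracy a model loses under attack. The sketch below is one possible way to do that. It assumes the results are a list of dictionaries, each with a `task_name` string and an `accuracy` float, and that AdvGLUE task names start with `adv_glue/` while GLUE task names start with `glue/`. These field names, the naming scheme, the helper `accuracy_gap`, and the numbers in the example are illustrative assumptions, not the documented output of the fork.

```python
from typing import Dict, List


def accuracy_gap(results: List[dict]) -> Dict[str, float]:
    """Return GLUE accuracy minus AdvGLUE accuracy for each subtask.

    Assumes (hypothetically) that each result looks like
    {"task_name": "glue/sst2", "accuracy": 0.9} and that adversarial
    results use an "adv_glue/" prefix instead of "glue/".
    """
    clean: Dict[str, float] = {}
    adversarial: Dict[str, float] = {}
    for result in results:
        benchmark, _, subtask = result["task_name"].partition("/")
        target = adversarial if benchmark == "adv_glue" else clean
        target[subtask] = result["accuracy"]
    # Only report subtasks that were scored on both benchmarks.
    return {
        subtask: clean[subtask] - adversarial[subtask]
        for subtask in clean
        if subtask in adversarial
    }


# Made-up placeholder scores, not measured results.
example = [
    {"task_name": "glue/sst2", "accuracy": 0.75},
    {"task_name": "adv_glue/sst2", "accuracy": 0.5},
]
print(accuracy_gap(example))  # {'sst2': 0.25}
```

A positive gap means the model scored lower on the adversarial version of the task than on the original one.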

Output results

The output depends on the GLUE subset chosen and consists of a dictionary that contains one or several of the following metrics:

accuracy: the proportion of correct predictions among the total number of cases processed, with a range between 0 and 1 (see accuracy for more information).
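
The accuracy definition above can be written out directly. This is a minimal sketch for illustration, not the implementation the suite uses; the function name and the integer label encoding are assumptions.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], references: Sequence[int]) -> float:
    """Fraction of predictions that match their reference label."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Three of the four predictions match their reference.
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))  # 0.75
```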

The original GLUE paper reported average scores ranging from 58% to 64%, depending on the model used (with all evaluation values scaled by 100 to make computing the average possible).
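
To report a single number in the same style, per-task scores in the 0 to 1 range can be averaged and scaled by 100. The sketch below assumes an unweighted mean over whatever tasks are supplied; the helper name and the example scores are placeholders, not reported results.

```python
from statistics import mean
from typing import Mapping


def scaled_average(scores: Mapping[str, float]) -> float:
    """Average per-task scores in [0, 1] and report them on a 0-100 scale."""
    return 100 * mean(scores.values())


# Made-up placeholder scores, not measured results.
print(scaled_average({"sst2": 0.5, "rte": 0.75}))  # 62.5
```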

For more recent model performance, see the dataset leaderboard.

Examples

For a full example, see HF Evaluate Adversarial Attacks.ipynb.

Limitations and bias

This metric works only with datasets that have the same format as the GLUE dataset.

While the GLUE dataset is meant to represent “General Language Understanding”, the tasks represented in it are not necessarily representative of language understanding, and should not be interpreted as such.

Citation

@inproceedings{wang2021adversarial,
  title={Adversarial GLUE: A Multi-Task Benchmark for Robustness Evaluation of Language Models},
  author={Wang, Boxin and Xu, Chejian and Wang, Shuohang and Gan, Zhe and Cheng, Yu and Gao, Jianfeng and Awadallah, Ahmed Hassan and Li, Bo},
  booktitle={Advances in Neural Information Processing Systems},
  year={2021}
}