IEEE TRC'22
Getting Started

Team Registration Has Closed

Overview

In this competition, we challenge you to develop an efficient and effective end-to-end neural network backdoor removal technique that mitigates backdoor attacks in poisoned models. Your task is to submit a solution that takes in a (potentially) poisoned model and returns a sanitized model with the backdoor mitigated (i.e., a reduced attack success rate). We provide a set of models trained on different poisoned datasets and with different model architectures.
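
To make the expected input and output concrete, below is a minimal sketch of what a defense pipeline could look like: a function that takes a (potentially) poisoned model together with the limited clean data and returns a sanitized model. The function name, signature, and the plain clean-data fine-tuning used as a placeholder defense are illustrative assumptions only; the actual submission interface is defined in the Starter Kit Colab.

```python
# Minimal sketch of a defense pipeline (hypothetical interface; the real
# submission format is defined in the Starter Kit Colab).
import torch
import torch.nn as nn
from torch.utils.data import DataLoader


def remove_backdoor(model: nn.Module, clean_loader: DataLoader,
                    epochs: int = 5, lr: float = 1e-3) -> nn.Module:
    """Take a (potentially) poisoned model, return a sanitized model.

    As a placeholder 'defense', this simply fine-tunes on the limited
    clean data; replace the body with your own removal technique.
    """
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device).train()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    criterion = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in clean_loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            criterion(model(x), y).backward()
            optimizer.step()
    return model.eval()
```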

Model Architectures:

For backdoor removal techniques that may require specific model architectures (e.g., synthesizing specific outputs from a specific layer), we have included all the model architectures that will be used in our evaluation (including the Held-Out settings) in the "model.py" file in the Starter Kit.
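
As an illustration, restoring one of the provided architectures together with a poisoned checkpoint could look like the sketch below. The class name, constructor arguments, and checkpoint path are assumptions made for illustration; please use the actual names exposed by model.py and the loading code provided in the Starter Kit.

```python
# Hypothetical example of restoring a provided model; adjust the class name,
# arguments, and checkpoint path to match the Starter Kit.
import torch
from model import ResNet18  # assumption: model.py exposes a ResNet18 class

net = ResNet18(num_classes=10)                                # architecture from model.py
state = torch.load("poisoned_model.pt", map_location="cpu")   # hypothetical checkpoint path
net.load_state_dict(state)
net.eval()
```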

Metrics:

The evaluation will focus on comparing the model's performance before and after applying the submitted defense. In particular, we will focus on three metrics:

  1. Clean accuracy (ACC) measures the impact of the defense on the model's performance on clean data; a smaller impact on ACC is preferred. ACC also serves as a strict cut-off: we stop any evaluation session in which the defense causes ACC to drop by more than 20%.
  2. Poisoned accuracy (PACC) measures the fraction of samples carrying the backdoor trigger that are still assigned their correct labels; a higher PACC indicates better sanitization of the backdoor effects while maintaining model performance. PACC is the main evaluation metric in this competition.
  3. Attack success rate (ASR) measures the fraction of samples that are successfully misled to the target class(es) upon observing the trigger; a smaller ASR indicates better backdoor sanitization. ASR is used as the tie-breaker when two methods' PACC scores are tied.
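
For local development, the sketch below shows one way these metrics could be computed, assuming you have a clean test loader, a triggered test loader that keeps the original (correct) labels, and a known target class; these objects are hypothetical here, and the official scores are computed by our evaluation backend.

```python
# Sketch of ACC / PACC / ASR computation for local evaluation (hypothetical
# loaders and target class; the official backend computes the leaderboard scores).
import torch


@torch.no_grad()
def accuracy(model, loader):
    """Fraction of samples classified into their (original) labels."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.eval().to(device)
    correct = total = 0
    for x, y in loader:
        pred = model(x.to(device)).argmax(dim=1).cpu()
        correct += (pred == y).sum().item()
        total += y.numel()
    return correct / total


@torch.no_grad()
def attack_success_rate(model, triggered_loader, target_class):
    """Fraction of triggered samples (not originally of the target class)
    that are misclassified into the target class."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.eval().to(device)
    hits = total = 0
    for x, y in triggered_loader:
        pred = model(x.to(device)).argmax(dim=1).cpu()
        keep = y != target_class
        hits += (pred[keep] == target_class).sum().item()
        total += keep.sum().item()
    return hits / max(total, 1)


# Example usage (hypothetical objects):
#   acc  = accuracy(model, clean_test_loader)                    # ACC
#   pacc = accuracy(model, triggered_test_loader)                 # PACC (triggered inputs, original labels)
#   asr  = attack_success_rate(model, triggered_test_loader, 0)   # ASR
#   assert acc >= 0.8 * baseline_acc, "ACC dropped by more than 20%"
```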

Baseline Defenses:

We present the evaluation results and implementations of two representative Trojan removal techniques, Neural Cleanse [1] and Adversarial Unlearning [2], using the metrics above on the Public Model Set included in the Starter Kit.
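
For intuition only, the following is a deliberately simplified, first-order sketch of the adversarial-unlearning idea: perturb the limited clean data to maximize the loss, then update the model to minimize the loss on the perturbed inputs. It is not the official I-BAU implementation [2] (which relies on implicit hypergradients) and is unrelated to Neural Cleanse [1]; the reference implementations of both baselines are included in the Starter Kit.

```python
# Deliberately simplified, first-order sketch of adversarial unlearning:
# find perturbations of the clean data that maximize the loss, then update
# the model to minimize the loss on those perturbed inputs. This is NOT the
# official I-BAU baseline [2], which uses implicit hypergradients.
import torch


def adversarial_unlearn_step(model, x, y, criterion, optimizer,
                             eps=8 / 255, alpha=2 / 255, steps=5):
    # Inner maximization: PGD-style perturbation that raises the loss.
    # (Clipping to the valid input range is omitted for brevity.)
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = criterion(model(x + delta), y)
        grad = torch.autograd.grad(loss, delta)[0]
        with torch.no_grad():
            delta += alpha * grad.sign()
            delta.clamp_(-eps, eps)
    # Outer minimization: fine-tune the model on the perturbed clean batch.
    optimizer.zero_grad()
    loss = criterion(model(x + delta.detach()), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```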

The challenge consists of two rounds:

  1. The first period (until 2023/02/10) of the competition involves two evaluation model sets: 

    • The first set of models is the Public Model Set (containing four poisoned models and one clean model; details are provided in the Starter Kit), which helps participants develop and evaluate their methods locally;
    • The second set of models is the first Held-Out Model Set, which will be used to evaluate the performance of participants’ submissions and update the leaderboard.

    During the first period of the competition, along with the Public Model Set, participants will also be provided with limited in-distribution data for each model (drawn from the same distribution as the training set). Participants will be asked to submit their defense pipeline through the provided Google Colab, which checks that the code satisfies our environment, packages the submission, and forwards it to our evaluation backend. If the evaluation completes successfully, you will receive a notification and can check your score on the leaderboard.

  2. The final round (after 2023/02/10) is when we determine the winning teams using a second Held-Out Model Set combined with the first one. The second Held-Out Model Set is designed to prevent teams from overfitting their methods to the first Held-Out Model Set. The final leaderboard placement will be determined by the average performance over the two combined Held-Out Model Sets, evaluated with multiple random seeds.

To-Participate

Below, we provide step-by-step guidance for participating in our competition: 

  1. Team Registration: You need to register your team to obtain a unique participation number. Please include your participation number in the Google Colab for code submission. Note that each individual is expected to be part of only one team.
  2. Run the Starter Kit: The Starter Kit is provided as a Google Colab link (including the baseline defenses, code for loading the data and the Public Model Set, and the code submission pipeline). By following the instructions in the Colab, you will be able to access the Public Model Set and the available clean data for each model. Grading results or errors will be sent to your e-mail, so please turn on e-mail notifications to receive updates. Grading a submission usually takes about 2 hours, depending on the server's availability.
  3. Submit your results: Once you are satisfied with your results locally, you may proceed with the Google Colab, which will automatically forward your submission to our evaluation system.
  4. Evaluation: Our evaluation system will evaluate your submission and provide feedback on the errors and results. 
  5. Leaderboard: Your results will be added to the leaderboard, and you can track real-time updates.

Starter Kit

Please follow the instructions in the following Google Colab to download and participate in the competition: Link

[1] Bolun Wang, Yuanshun Yao, Shawn Shan, Huiying Li, Bimal Viswanath, Haitao Zheng, and Ben Y. Zhao. “Neural cleanse: Identifying and mitigating backdoor attacks in neural networks.” In 2019 IEEE Symposium on Security and Privacy (SP), pp. 707-723. IEEE, 2019.

[2] Yi Zeng, Si Chen, Won Park, Zhuoqing Mao, Ming Jin, and Ruoxi Jia. “Adversarial Unlearning of Backdoors via Implicit Hypergradient.” In International Conference on Learning Representations. 2022.

 

IEEE TRC'22 is supported by funding granted to the IEEE Smart Computing STC (awarded by the IEEE Computer Society Planning Committee for Emerging Techniques 2022, Dakota State University #845360).

Please contact Yi Zeng or Ruoxi Jia if you have any questions.