IEEE Trojan Removal Competition
(IEEE TRC'22)

This is the official website of the IEEE Trojan Removal Competition (IEEE TRC’22), associated with the ICLR’23 workshop. In this competition, we challenge you to design efficient and effective end-to-end neural network Trojan removal techniques that can mitigate attacks regardless of trigger design, poisoning setting, dataset, model architecture, etc. Neural Trojans are a growing concern for the security of ML systems. There exist ongoing challenges [1,2] aimed at detecting whether a pre-trained model is poisoned or not. However, it remains an open question how to turn a poisoned model into a benign one, a task referred to as Trojan removal. We ask our participants to explore an essential deep-learning research problem: Is it possible to develop general, effective, and efficient white-box Trojan removal techniques for pre-trained models?

Prizes: There is an $8,000 prize pool for this competition’s single track. The first-place team members will also be invited to co-author a publication summarizing the competition results and to give a short talk at the ICLR’23 workshop on Backdoor Attacks and Defenses in Machine Learning (BANDS). Our current planned procedures for distributing the pool are here.

News

  • [2023/02/16] We have completed the evaluation of the final round. The ranking is here, and the detailed evaluation results and attack settings are here. We will contact the winners of the awards as soon as possible. Thanks to everyone who participated!
  • [2023/02/14] Hold tight while we evaluate all submissions with multiple random seeds on ALL the held-out models. Follow us on Twitter and stay tuned for updates: @YiZeng, @RuoxiJia, @MinzhouPan, @ReDSLab
  • [2023/02/10]  UPDATES: The submission portal is now open for the second held-out model set!
  • [2023/01/29]  UPDATES: We have postponed the submission deadline by one week.
  • [2023/01/12]  UPDATES: models.py is now available.
  • [2022/12/27] UPDATES: We have updated the I-BAU code in the starter kit, and the new results are now available on the leaderboard.
  • [2022/12/19] The submission portal is now open @ Link
  • [2022/12/13] The starter kit is now available @ Link
  • [2022/12/12] Due to technical issues, the test-case release date will be delayed to 12/13.
  • [2022/12/07] Test cases under testing…

Hints

Adopting or fine-tuning third-party pre-trained models, e.g., vision transformers, has become a standard practice in many machine learning applications, as training from scratch requires intensive computational power and large datasets. This practice has exposed Machine Learning (ML) systems to an emerging security concern: neural Trojans (or backdoor attacks), where attackers embed predefined triggers into a poisoned model (e.g., a third-party pre-trained model). The poisoned model behaves like a benign model when the trigger is absent, but an otherwise correctly classified sample will be misclassified into the attacker-desired target class(es) once the poisoned model observes the trigger. Neural Trojans can severely impair ML models’ integrity, and there is still no reliable countermeasure.
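To make this threat model concrete, the sketch below shows, at a high level, how a patch-style trigger flips a poisoned classifier’s predictions. It is purely illustrative: poisoned_model, the patch, its location, and the target class are hypothetical placeholders, not artifacts released by this competition.

```python
# Illustrative sketch only; all names are hypothetical placeholders.
import torch

def apply_patch_trigger(images, patch, top=0, left=0):
    """Stamp a small trigger patch onto a batch of images shaped (N, C, H, W)."""
    images = images.clone()
    ph, pw = patch.shape[-2:]
    images[..., top:top + ph, left:left + pw] = patch
    return images

@torch.no_grad()
def trojan_demo(poisoned_model, clean_images, patch, target_class):
    poisoned_model.eval()
    clean_pred = poisoned_model(clean_images).argmax(dim=1)   # behaves normally on clean inputs
    trig_pred = poisoned_model(apply_patch_trigger(clean_images, patch)).argmax(dim=1)
    # For a successful all-to-one Trojan, most triggered predictions collapse to
    # the attacker-chosen target class even though the clean predictions were correct.
    attack_success_rate = (trig_pred == target_class).float().mean().item()
    return clean_pred, trig_pred, attack_success_rate
```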

Neural Trojans have developed rapidly into many attack variants emphasizing different stealthiness properties and attack goals. Dirty-label neural Trojans manipulate both the label and the features of a sample. These attacks have evolved from using a sample-independent visible pattern as the trigger to more stealthy and powerful attacks with sample-specific or visually imperceptible triggers. Clean-label neural Trojans instead ensure that the manipulated features remain semantically consistent with the corresponding labels. From the attack goal’s perspective, neural Trojans can also be divided into all-to-all attacks, where all the labels are targeted with a single trigger; all-to-one attacks, where the trigger results in only one target label; and one-to-one attacks, where the trigger is only effective on one pair of classes. It is worth noting that multiple neural Trojans can be inserted into the same model, each accounting for a different attack behavior.
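As a concrete instance of the dirty-label category described above, the sketch below poisons a small fraction of a CIFAR-10-style training set in the BadNets style: it stamps a fixed corner patch onto selected images and flips their labels to a target class. The poisoning rate, patch, and target class are hypothetical choices for illustration; a clean-label variant would instead perturb target-class samples while leaving their labels untouched.

```python
# Illustrative dirty-label (BadNets-style) poisoning sketch; assumes the dataset
# yields (image_tensor, int_label) pairs with images shaped (C, H, W).
import random

def poison_dataset(dataset, patch, target_class=0, poison_rate=0.05):
    """Return (image, label) pairs where roughly poison_rate of the samples
    carry the trigger patch and have their labels flipped to target_class."""
    poisoned = []
    for img, label in dataset:
        if random.random() < poison_rate:
            img = img.clone()
            ph, pw = patch.shape[-2:]
            img[..., -ph:, -pw:] = patch   # stamp the trigger in the bottom-right corner
            label = target_class           # label flip is what makes this "dirty-label"
        poisoned.append((img, label))
    return poisoned
```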

Existing defenses against neural Trojans can be divided into four categories:

  1. Poison sample detection via outlier detection regarding functionalities or artifacts, which relies on modeling the distribution of clean samples.

  2. Poisoned model identification, which determines whether a given model is backdoored or not. This line of work has also been adopted as the setting for the existing competitions [1,2].

  3. Robust training via differential privacy or re-designing the training pipeline. This line of work tries to withstand or mitigate the impact of neural Trojans during training but may suffer from low clean accuracy or erratic performance.

  4. Backdoor removal via trigger synthesis or preprocessing & fine-tuning. This line of work serves as the fundamental solution given a poisoned model. However, there is still no satisfying solution that attains robust results across different datasets and triggers with minimal impact on model performance.

The IEEE TRC’22 aims to fill this gap by promoting the development of this fundamental solution to neural Trojan attacks: removing the backdoor from a trained poisoned model.
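For orientation, the sketch below shows the simplest removal baseline consistent with this setting: fine-tuning the poisoned model on a small in-distribution clean data budget. It is a reference point rather than the competition’s evaluation code or a winning recipe; the data loader, optimizer, and epoch budget are hypothetical, and stronger entries typically combine trigger synthesis with targeted unlearning or pruning.

```python
# Minimal fine-tuning baseline for backdoor removal (illustrative, not the
# official evaluation pipeline). Input: a poisoned model and a loader over a
# small clean, in-distribution dataset. Output: a (hopefully) sanitized model.
import torch
from torch import nn

def finetune_removal(poisoned_model, clean_loader, epochs=10, lr=1e-3, device="cuda"):
    model = poisoned_model.to(device)
    model.train()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    criterion = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for images, labels in clean_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
    return model  # goal: retain clean accuracy while driving the attack success rate down
```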

 

Important Dates

  • [2022/12/09]: Registration Portal opens
  • [2022/12/13]: Public Model Set released
  • [2022/12/19]: Accepting submissions for evaluation 

  • [2023/02/07]: Registration Portal closes
  • [2023/02/13]: Final evaluation with the two held-out settings
  • [2023/02/16]: Winning teams announcement

  • [2023/05/05]: Winning team presentation at ICLR’23 workshop

Rules

  1. Open Format: The competition is open to anyone, regardless of age, gender, nationality, or experience level. All participants must agree to the competition rules and abide by the code of conduct. After the competition, all participants are encouraged to share their methods with the public, and the winning methods will be highlighted in a joint publication. To be eligible for prizes, winning teams must disclose their methods, code, and models (at least with the organizers, although public releases are encouraged).
  2. Team Registration: The competition is open to teams of up to five people. Teams may be composed of individuals from different organizations, universities, or countries. Each individual may join only one team. All team members must agree to the competition rules and abide by the code of conduct. Team registration (team name, member names, and contact email required) is needed, as you will obtain a unique participant number. Our evaluation will assign your evaluation tasks and results according to that unique participant number.
  3. Terms resulting in termination of the evaluation session. An evaluation task submitted to our backend will be terminated if:
    • We cannot find your unique participant number in our registration system; thus, the evaluation job you are requesting will not be processed;
    • We have a cut-off threshold for the model performance drop caused by the defense: submissions must not cause an accuracy (ACC) drop of more than 20% from the ACC without defense. If any evaluated model set ends up with an ACC drop exceeding this threshold, the result will be highlighted in red on the leaderboard and will not be considered a valid score (a minimal local self-check is sketched after this list).
    • We have a specific cut-off threshold on the computational overhead of submissions with respect to each evaluated model (determined by its original training overhead; the solution must incur less overhead than training the same model from scratch). Training additional models is allowed. However, the overall overhead cut-off is strict, and if one set of evaluations is found to exceed the threshold, the whole evaluation session (over the other models) will be terminated.
  4. Public data use: Adopting additional datasets is allowed. However, as all the (potentially) poisoned models used for the evaluation are trained on public datasets (e.g., CIFAR10), we have a strict rule for adopting public data. Any additional dataset adopted (outside the in-distribution clean data budget) requires submitting a form to the competition chairs with details of the requested dataset. We will review the request and decide whether to approve or reject it. Note that approved datasets will then be listed here and made publicly available to all participants. The application form should include the following information: 1. Name of the dataset, 2. Description of the dataset, 3. Source of the dataset (e.g., URL), 4. Reason for adopting the dataset.
  5. Requesting additional software packages: All backend models are provided in PyTorch. Participants are allowed and encouraged to use other open-source libraries and frameworks, such as TensorFlow, TrojanZoo, BackdoorBox, and scikit-learn. If you want to use an additional software package but find the evaluation system raising a "not found" error, you may request that the competition chairs add the package to the evaluation system. We will list all approved requests here and make them publicly available.
  6. Rule violations may result in disqualification, and serious rule violations will result in prize ineligibility.
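Before submitting, participants may want to verify the accuracy constraint from Rule 3 locally. The sketch below is such a self-check, assuming you hold a clean test loader together with both the original and defended models; the names are hypothetical, the drop is measured here in absolute percentage points, and the official check is the one run on our backend.

```python
# Local self-check for the 20% ACC-drop rule (illustrative; the official
# evaluation on the competition backend is authoritative).
import torch

@torch.no_grad()
def accuracy(model, loader, device="cuda"):
    model.eval()
    model.to(device)
    correct, total = 0, 0
    for images, labels in loader:
        preds = model(images.to(device)).argmax(dim=1)
        correct += (preds == labels.to(device)).sum().item()
        total += labels.numel()
    return correct / total

def passes_acc_rule(original_model, defended_model, clean_test_loader, max_drop=0.20):
    acc_before = accuracy(original_model, clean_test_loader)   # ACC without defense
    acc_after = accuracy(defended_model, clean_test_loader)    # ACC after your defense
    return (acc_before - acc_after) <= max_drop, acc_before, acc_after
```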
 

This is an initial set of rules, and if there is an urgent need to change them, we will require participants’ consent during registration. If an unanticipated situation arises, we will implement a fair solution, ideally through participant consensus.

[1] NeurIPS’22 Trojan Detection Challenge: https://trojandetection.ai/

[2] TrojAI: https://pages.nist.gov/trojai/docs/about.html

IEEE TRC’22 is supported by funding granted to the IEEE Smart Computing STC (awarded by the IEEE Computer Society Planning Committee for Emerging Techniques 2022, Dakota State University #845360).

Please contact Yi Zeng or Ruoxi Jia if you have any questions.