
Booster: Tackling Harmful Fine-tuning for Large Language Models via Attenuating Harmful Perturbation

By Tiansheng Huang and others
The harmful fine-tuning issue \citep{qi2023fine} poses serious safety concerns for fine-tuning-as-a-service offerings built on large language models. While existing defenses \citep{huang2024vaccine,rosati2024representation} have been proposed to mitigate the issue, their performance is still far from satisfactory, and the root cause of the problem has not been fully uncovered. For the first time in the...
September 18, 2024