An algorithm designed to defend Large Language Models (LLMs) against jailbreaking attacks that significantly reduces attack success rates.
Model-based robust deep learning algorithms allow for more accurate and robust predictions in varying environmental conditions.