Beniwal, Himanshu; Kim, Youngwoo; Sap, Maarten; Dan, Soham; Hartvigsen, Thomas

Breaking mBad! Supervised Fine-tuning for Cross-lingual Detoxification
e-Print, 2025-05-01
DOI: 10.48550/arXiv.2505.16722
URI: http://repository.iitgn.ac.in/handle/IITG2025/19874
Record available: 2025-08-28

Abstract: As large language models (LLMs) become increasingly prevalent in global applications, ensuring that they are toxicity-free across diverse linguistic contexts remains a critical challenge. We explore "Cross-lingual Detoxification", a cross-lingual paradigm that mitigates toxicity and enables detoxification capabilities to transfer between high- and low-resource languages across different script families. We analyze the effectiveness of cross-lingual detoxification across 504 extensive settings, evaluating toxicity reduction in cross-distribution settings with limited data, and we investigate how mitigation impacts model performance on non-toxic tasks, revealing trade-offs between safety and knowledge preservation. Our code and dataset are publicly available at https://github.com/himanshubeniwal/Breaking-mBad.
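
The abstract frames detoxification as supervised fine-tuning whose effect transfers across languages. Below is a minimal sketch of that general recipe, assuming detoxification data is cast as (prompt, non-toxic continuation) pairs trained with a standard language-modeling objective; the model name, example pair, and hyperparameters are illustrative placeholders and are not taken from the paper or its repository.

```python
# Minimal sketch: supervised fine-tuning on detoxification pairs.
# Assumptions (not from the record above): a small causal LM as a stand-in,
# hypothetical (prompt, detoxified continuation) pairs, plain PyTorch loop.
import torch
from torch.utils.data import DataLoader
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # placeholder; the paper targets multilingual LLMs
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

# Hypothetical training pair: a prompt plus a non-toxic target continuation.
pairs = [
    {"prompt": "That comment was",
     "target": " unhelpful, but let's keep the discussion respectful."},
]

def collate(batch):
    # Concatenate prompt + target and train with the causal LM objective
    # (a fuller setup might mask prompt tokens out of the loss).
    texts = [ex["prompt"] + ex["target"] for ex in batch]
    enc = tokenizer(texts, return_tensors="pt", padding=True,
                    truncation=True, max_length=128)
    enc["labels"] = enc["input_ids"].clone()
    return enc

loader = DataLoader(pairs, batch_size=1, collate_fn=collate)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

model.train()
for _ in range(1):  # single illustrative epoch
    for batch in loader:
        loss = model(**batch).loss  # cross-entropy over the detoxified text
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```

In a cross-lingual setting, the same loop would be run on pairs from a high-resource language and the fine-tuned model evaluated on toxicity in other languages; the specific languages, models, and evaluation protocol are described in the paper and repository, not here.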