Articles by Rongwu Xu | Synthical
Rongwu Xu
Humanity's Last Exam
12 September 2025 by Long Phan and others
Machine Learning, Artificial Intelligence
AICrypto: A Comprehensive Benchmark For Evaluating Cryptography Capabilities of Large Language Models
8 August 2025 by Yu Wang and others
Cryptography and Security
Difficulty-Based Preference Data Selection by DPO Implicit Reward Gap
6 August 2025 by Xuan Qi and others
Computation and Language, Artificial Intelligence
The Singapore Consensus on Global AI Safety Research Priorities
30 June 2025 by Yoshua Bengio and others
Artificial Intelligence, Computers and Society
AI Awareness
29 June 2025 by Xiaojian Li and others at Tsinghua University
Artificial Intelligence, Computation and Language
Does Chain-of-Thought Reasoning Really Reduce Harmfulness from Jailbreaking?
23 May 2025 by Chengda Lu and others at Tsinghua University
Artificial Intelligence
Nuclear Deployed: Analyzing Catastrophic Risks in Decision-making of Autonomous LLM Agents
23 March 2025 by Rongwu Xu and others
Computation and Language, Artificial Intelligence
On the Role of Attention Heads in Large Language Model Safety
24 February 2025 by Zhenhong Zhou and others
Computation and Language, Artificial Intelligence
Long²RAG: Evaluating Long-Context & Long-Form Retrieval-Augmented Generation with Key Point Recall
27 January 2025 by Zehan Qi and others
Computation and Language
MR-Ben: A Meta-Reasoning Benchmark for Evaluating System-2 Thinking in LLMs
20 December 2024 by Zhongshen Zeng and others
Computation and Language, Artificial Intelligence
Sing it, Narrate it: Quality Musical Lyrics Translation
29 October 2024 by Zhuorui Ye and others
Computation and Language, Sound
Course-Correction: Safety Alignment Using Synthetic Preferences
26 October 2024 by Rongwu Xu and others at Tsinghua University
Computation and Language, Artificial Intelligence
DebateQA: Evaluating Question Answering on Debatable Knowledge
2 August 2024 by Rongwu Xu and others
Computation and Language
Walking in Others' Shoes: How Perspective-Taking Guides Large Language Models in Reducing Toxicity and Bias
22 July 2024 by Rongwu Xu and others
Computation and Language, Artificial Intelligence
Knowledge Conflicts for LLMs: A Survey
22 June 2024 by Rongwu Xu and others at Tsinghua University
Computation and Language, Artificial Intelligence
How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States
13 June 2024 by Zhenhong Zhou and others at Tsinghua University
Computation and Language, Artificial Intelligence
Preemptive Answer "Attacks" on Chain-of-Thought Reasoning
31 May 2024 by Rongwu Xu and others
Computation and Language, Artificial Intelligence
The Earth is Flat because...: Investigating LLMs' Belief towards Misinformation via Persuasive Conversation
31 May 2024 by Rongwu Xu and others at Tsinghua University
Computation and Language, Artificial Intelligence
Experimental Limits on Solar Reflected Dark Matter with a New Approach on Accelerated-Dark-Matter-Electron Analysis in Semiconductors
24 April 2024 by Z. Zhang and others at Tsinghua University
High Energy Physics, Instrumentation and Detectors
First Search for Light Fermionic Dark Matter Absorption on Electrons Using Germanium Detector in CDEX-10 Experiment
15 April 2024 by Jin Xin Liu and others
High Energy Physics, Instrumentation and Detectors
Constraints on the Blazar-Boosted Dark Matter from the CDEX-10 Experiment
29 March 2024 by Rongwu Xu and others at Tsinghua University
High Energy Physics, Instrumentation and Detectors
Exploring Chinese Humor Generation: A Study on Two-Part Allegorical Sayings
16 March 2024 by Rongwu Xu
Computation and Language, Artificial Intelligence
Probabilistic central Bell polynomials
1 March 2024 by Rongwu Xu and others
Number Theory
Tempo: Confidentiality Preservation in Cloud-Based Neural Network Training
21 January 2024 by Rongwu Xu and Zhixuan Fang at Tsinghua University
Cryptography and Security, Machine Learning