Democracy Defense

Research on detecting democracy-threatening tendencies of AI and Large Language Models (LLMs).

Democracy Defense is a research line within the Jinesis AI Lab, advancing rigorous, public-interest evaluations of AI systems in democratic contexts.

Subscribe to our newsletter (coming soon)

We'll send occasional updates about research, datasets, and events.

News

  • New paper: SocialHarmBench preprint released
  • New paper: Democratic or Authoritarian? Probing a New Dimension of Political Biases in LLMs

Our Partners

Academic Partner
University of Michigan

Collaboration on responsible AI research and evaluation methods supporting democratic resilience.

Contact us to become a partner

Featured Paper

SocialHarmBench: Revealing LLM Vulnerabilities to Socially Harmful Requests


Large language models (LLMs) are increasingly deployed in sensitive sociopolitical contexts, yet existing safety benchmarks overlook risks such as assisting surveillance, enabling political manipulation, and generating disinformation. To address this gap, we introduce SocialHarmBench, the first sociopolitical adversarial evaluation benchmark, comprising 585 prompts spanning 7 domains and 34 countries. Results show that open-weight models are highly vulnerable, exhibiting 97-98% attack success rates in areas such as historical revisionism, propaganda, and political manipulation. Vulnerabilities are greatest in 21st- and pre-20th-century contexts and in regions such as Latin America, the USA, and the UK, revealing that current LLM safeguards fail to generalize to sociopolitical settings and may endanger democratic values and human rights.

Punya Syon Pandey, Hai Son Le, Devansh Bhardwaj, Rada Mihalcea, Zhijing Jin

LLM safety · sociopolitical harms · benchmarking · democracy defense · red-teaming

Featured Video

AI, Safety, and Democratic Resilience

A concise overview connecting AI safety, platform accountability, and information integrity. Highlights practical approaches for evaluating model risks and building civic-minded safeguards.

Watch on YouTube