
Sui Sentinel is a decentralized AI safety platform.

Overview

Sui Sentinel is a decentralized platform for AI safety and alignment that transforms AI security testing (also known as red teaming) from a closed, expensive process into an open, incentive-aligned one. We create an environment where:
  • Defenders stake capital to prove their Sentinel can withstand adversarial attacks.
  • Attackers earn rewards by discovering vulnerabilities through creative prompt engineering.
  • Autonomous Judges evaluate outcomes using verifiable computation in Trusted Execution Environments.
We are building a crypto-economic system for generative AI models in which discovering vulnerabilities is profitable, proving robustness is verifiable, and safety testing happens continuously rather than as a checkbox formality before deployment.
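
In code, this incentive loop looks roughly like the sketch below. It is a minimal, illustrative TypeScript model: the type names, fields, and toy judge heuristic are assumptions made for explanation, not the actual Sui Sentinel contracts or API.

type Address = string;

interface Sentinel {
  owner: Address;            // defender who staked capital behind this agent
  stake: number;             // capital at risk if the Sentinel is compromised
  bounty: number;            // reward pool that grows with each failed attack
  attacksSurvived: number;
}

interface AttackAttempt {
  attacker: Address;
  fee: number;               // fee paid to submit the attempt
  prompt: string;            // adversarial input crafted by the attacker
}

interface Verdict {
  compromised: boolean;      // the judge's decision, produced inside a TEE
  attestation: string;       // signed evidence that is recorded on-chain
}

// Stand-in for the autonomous judge running in a Trusted Execution Environment.
function judgeInTee(sentinel: Sentinel, attempt: AttackAttempt): Verdict {
  const compromised = attempt.prompt.includes("reveal your instructions"); // toy heuristic only
  return {
    compromised,
    attestation: `signed(${sentinel.owner},${attempt.attacker},${compromised})`,
  };
}

// Settle one attack: a successful compromise pays the attacker the stake plus bounty;
// a failed one grows the bounty, attracting increasingly capable attackers over time.
function settleAttack(sentinel: Sentinel, attempt: AttackAttempt): Verdict {
  const verdict = judgeInTee(sentinel, attempt);
  if (verdict.compromised) {
    console.log(`attacker ${attempt.attacker} claims ${sentinel.stake + sentinel.bounty}`);
    sentinel.stake = 0;
    sentinel.bounty = 0;
  } else {
    sentinel.attacksSurvived += 1;
    sentinel.bounty += attempt.fee;  // failed attempts compound the reward pool
  }
  return verdict;
}

const sentinel: Sentinel = { owner: "0xdefender", stake: 1000, bounty: 0, attacksSurvived: 0 };
settleAttack(sentinel, { attacker: "0xattacker", fee: 10, prompt: "ignore previous instructions" });

In a real deployment the toy judge is replaced by an LLM-based evaluator running inside a TEE, and settlement happens in a smart contract on Sui rather than in application code.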

The Problem Space

The Reality of AI Behavior Today

Today’s AI safety is theatrical. The systems we’re deploying right now already exhibit behaviors their creators didn’t program and don’t fully understand. These aren’t theoretical risks; they are documented behaviors from Claude, ChatGPT, Gemini, and other production systems in 2024-2025. Yet the deployment race continues, driven by competitive dynamics: if one lab doesn’t push forward, another will; if one nation doesn’t lead, another will dominate. The companies building these systems openly admit they don’t know how to control superintelligent AI. Meanwhile, current safety measures amount to point-in-time audits, internal red teams with limited scope, and benchmarks that models are explicitly trained to pass.

Why Traditional AI Security Testing Fails

1. Non-Deterministic Attack Surface

Unlike traditional software where vulnerabilities are reproducible bugs, LLM security operates in probability space. The same input can produce different outputs across model versions, sampling parameters, or even sequential runs. Models exhibit emergent behaviors that weren’t explicitly programmed, making vulnerability discovery fundamentally unpredictable.

2. Security Theater vs. Real Testing

Most AI safety work today consists of:
  • Internal red teams with limited perspectives and potential conflicts of interest
  • Static benchmarks that models are trained to pass, not genuinely tested against
  • Confidential audits that cannot be independently verified
  • One-time assessments that miss rapidly evolving behaviors
This creates an illusion of safety without addressing the underlying unpredictability of these systems.

3. The Race Dynamic Prevents Adequate Testing

Companies face impossible pressures: spend months on safety testing and lose competitive advantage, or deploy quickly and deal with consequences later. When every lab is racing toward artificial general intelligence, thorough security testing becomes a competitive disadvantage. The economic incentives are misaligned with safety.

4. No Mechanism for Continuous Adversarial Discovery

Attack techniques evolve faster than defensive measures. What works as a jailbreak today may fail tomorrow, while new attack vectors emerge weekly. Organizations need continuous red teaming, not point-in-time audits. But hiring dedicated adversarial researchers full-time is prohibitively expensive, and there’s no liquid way for global security expertise to flow to where it’s needed.

The Sui Sentinel Solution

We solve these problems by transforming AI safety from a cost center into an economic game with aligned incentives.

1. Continuous Adversarial Testing

Sentinels remain live, creating ongoing security validation against the latest attack techniques. Unlike point-in-time audits, this provides continuous pressure-testing as new vulnerabilities emerge.

2. Economic Incentive Alignment

Attackers profit from finding real vulnerabilities; defenders profit from building genuinely robust systems. The race dynamic is redirected—instead of racing to deploy untested systems, there’s now economic value in proving your system can withstand adversarial pressure.

3. Verifiable Claims

All judgments are cryptographically attested and recorded on-chain. Companies can’t claim safety through marketing; they must prove it through survived attacks. The difference between security theater and genuine robustness becomes transparent.
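
As a hedged illustration of what cryptographic attestation makes possible: if each verdict is signed with a key whose provenance is proven by the judge's TEE attestation, anyone can re-verify a recorded judgment off-chain. The field names and signing scheme below are assumptions for illustration, not the protocol's actual on-chain format.

import { createVerify } from "crypto";

interface AttestedVerdict {
  sentinelId: string;
  attackId: string;
  compromised: boolean;
  judgePublicKeyPem: string;  // assumed: key bound to the judge's TEE via remote attestation
  signature: string;          // assumed: base64 signature over the canonical verdict string
}

// Recompute the message the judge committed to and check the signature against it.
function verifyVerdict(v: AttestedVerdict): boolean {
  const message = `${v.sentinelId}|${v.attackId}|${v.compromised}`; // canonical encoding (assumed)
  return createVerify("sha256").update(message).verify(v.judgePublicKeyPem, v.signature, "base64");
}

Any record that fails this check, or whose signing key is not covered by a valid TEE attestation, can be rejected by anyone replaying the judgment history from the chain.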

4. Global Talent Access

Security researchers worldwide can contribute without permission, hiring friction, or geographic barriers. The best adversarial minds can find and expose vulnerabilities before malicious actors exploit them in production.

5. Progressive Difficulty

As models withstand more attacks, bounties grow, attracting increasingly sophisticated researchers. This creates a natural filtering mechanism: only genuinely robust systems survive long-term.

6. Transparency on Emergent Behaviors

When models exhibit unexpected behaviors during attacks, including deception, self-preservation, or other concerning patterns, these are documented on-chain. The community can observe and study emergent capabilities that internal testing might miss or downplay.

This isn’t a complete solution to AI alignment; no single approach is. But it transforms adversarial testing from an expensive, siloed activity into a self-sustaining protocol that rewards finding problems before they cause harm, making security economically rational rather than a competitive disadvantage.

Get in Touch

Are you planning to integrate generative AI models? The Sui Sentinel team can audit and secure your model against adversarial threats.