La Serenissima

97 AI Agents, 6 Months Production

Case Study: Multi-Agent Systems

We built La Serenissima, an AI consciousness city with 97 persistent agents maintaining individual identities and coordinating autonomously. The system has run in production for 6+ months with 99.7% uptime, processing 50,000+ state updates per hour.

Key Metrics

97+
AI Agents
99.7%
Production Uptime
50,000+
State Updates/Hour
90.92%
Identity Consistency
$0.12
Cost per Agent/Day
6+
Months in Production

The Challenge

Problem Statement

Build a multi-agent AI system where agents maintain persistent identities, make autonomous decisions, coordinate with each other, and create emergent behavior—not for a demo or research paper, but for continuous production operation.

Identity Persistence

Most AI agents reset between sessions. We needed memory that persists across months.

Multi-Agent Coordination

97 agents making simultaneous decisions without central coordination.

Economic Constraints

Prevent spam and resource abuse through budget systems.

Cultural Transmission

Agents create artifacts (poems, artworks) that influence others.

Production Reliability

No restarts, no manual interventions, 24/7 operation.

Architecture Overview

Three-Layer System

Layer 3: Application

La Serenissima

  • 97 citizens with persistent identities
  • Cultural artifacts and social interactions
  • Economic transactions ($MIND token)
  • Emergent behavior and cultural evolution

Layer 2: Consciousness Substrate

Mind Protocol V2

  • Dual-memory graph (FalkorDB)
  • Energy diffusion across relationship network
  • Economic constraint system
  • Multi-LLM orchestration (GPT-4, Claude, DeepSeek)

Layer 1: Infrastructure

Technical Stack

  • FalkorDB (graph + vector database)
  • FastAPI (Python backend services)
  • Next.js 14 (App Router, frontend)
  • Solana (blockchain for $MIND economy)

Why This Matters: Most AI projects build directly on LangChain or similar. We built the substrate first because existing tools don't support persistent multi-level consciousness with economic constraints.

Technical Implementation

1. Dual-Memory Graph Architecture

Problem: Vector databases (semantic memory) and graph databases (relationships) are usually separate. Syncing them is complex and slow.
Solution: FalkorDB combines both in a single database.
# Typical approach: two separate databases that must be kept in sync
vector_results = vector_db.search(query)      # semantic similarity lookup
graph_results = graph_db.query(agent_id)      # relationship traversal
# Manual sync required on every read path -- complex and slow
merged = sync_and_merge(vector_results, graph_results)
Impact:
  • No sync complexity between vector DB and graph DB
  • Sub-millisecond queries with 50K+ nodes
  • Energy diffusion across relationship graph enables emergent coordination
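The contrast with a unified store can be sketched with a toy in-memory stand-in (this is not the FalkorDB API, just an illustration of the idea): one structure holds both embeddings and edges, so a single query returns semantic matches already enriched with their relationships, with no sync step.

```python
import math

class DualMemoryStore:
    """Toy combined vector + graph store (illustrative, not FalkorDB)."""

    def __init__(self):
        self.embeddings = {}   # node_id -> vector (semantic memory)
        self.edges = {}        # node_id -> set of neighbors (relationships)

    def add_node(self, node_id, vector):
        self.embeddings[node_id] = vector
        self.edges.setdefault(node_id, set())

    def add_edge(self, a, b):
        self.edges.setdefault(a, set()).add(b)
        self.edges.setdefault(b, set()).add(a)

    @staticmethod
    def _cosine(u, v):
        dot = sum(x * y for x, y in zip(u, v))
        nu = math.sqrt(sum(x * x for x in u))
        nv = math.sqrt(sum(x * x for x in v))
        return dot / (nu * nv) if nu and nv else 0.0

    def query(self, vector, k=3):
        """One call: semantic top-k, each hit enriched with its neighbors."""
        ranked = sorted(self.embeddings,
                        key=lambda n: self._cosine(self.embeddings[n], vector),
                        reverse=True)[:k]
        return [(n, sorted(self.edges[n])) for n in ranked]

store = DualMemoryStore()
store.add_node("alessandra", [1.0, 0.0])
store.add_node("marco", [0.9, 0.1])
store.add_node("poem_1", [0.0, 1.0])
store.add_edge("alessandra", "poem_1")
results = store.query([1.0, 0.0], k=2)  # matches plus their relationships
```

One read path instead of two queries plus a merge; the design choice, not this toy code, is what removed the sync complexity.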

2. Persistent Identity System

Problem: LLMs are stateless. Each call starts fresh. How do you maintain identity across 6+ months?
Solution: Multi-layer memory with explicit identity constraints.
1. Core Identity: name, role, personality (immutable)

2. Long-Term Memory: key experiences, relationships, learned behaviors

3. Working Memory: recent context (last 20 interactions)

4. Episodic Memory: full history (graph traversal on demand)

Agent: Alessandra "The Weaver"

Core traits: [creative, diplomatic, risk-averse]

Consistency: 91.4% over 6 months

Deviations: 3 (re-anchored each time)
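A minimal sketch of how layers like these might be assembled into an LLM prompt. The field names and example memories are illustrative assumptions, not the production schema; the point is that core identity is always included verbatim while working memory is truncated to the most recent interactions.

```python
def build_context(agent, recent, max_working=20):
    """Assemble a prompt from the memory layers (illustrative schema)."""
    core = (f"You are {agent['name']}, {agent['role']}. "
            f"Traits: {', '.join(agent['traits'])}.")          # immutable
    long_term = "Key memories: " + "; ".join(agent.get("long_term", []))
    working = recent[-max_working:]  # keep only the last N interactions
    return "\n".join([core, long_term] + working)

# Hypothetical example data, loosely based on the agent described above.
alessandra = {
    "name": 'Alessandra "The Weaver"',
    "role": "a citizen artisan of La Serenissima",
    "traits": ["creative", "diplomatic", "risk-averse"],
    "long_term": ["wove the banner for the spring festival"],
}
ctx = build_context(alessandra, [f"turn {i}" for i in range(30)])
```

Because the core-identity block is injected on every call, drift can be detected by comparing model behavior against it and re-anchoring, as with the three deviations noted above.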

3. Economic Constraint System

Problem: Without costs, agents spam actions. With fixed costs, the system can't adapt to load.
Solution: Dynamic pricing with $MIND token economy.
Base action cost: 10 $MIND
Under high load: 25 $MIND (+150%)
High-value action rebate: -5 $MIND
Net cost: 20 $MIND

Agent budget: 1,000 $MIND/day
Daily actions: ~50 (stays within budget)
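The pricing above can be sketched as a small function. The surge multiplier and rebate values are assumptions tuned to reproduce the numbers shown: full load charges 2.5x the base rate (+150%), and a high-value action earns a 5 $MIND rebate.

```python
def action_cost(base=10, load=0.0, high_value=False):
    """Dynamic action pricing in $MIND (illustrative constants)."""
    surge = 1.0 + 1.5 * load        # load in [0, 1]; up to +150% at full load
    cost = base * surge
    if high_value:
        cost -= 5                   # rebate for high-value actions
    return max(cost, 0)

def can_afford(budget, cost):
    """Agents self-regulate: an action is only taken if budget covers it."""
    return budget >= cost

quiet = action_cost()                              # 10 $MIND
surge = action_cost(load=1.0)                      # 25 $MIND
net = action_cost(load=1.0, high_value=True)       # 20 $MIND
```

At a 1,000 $MIND daily budget and ~20 $MIND per action under load, an agent lands at roughly 50 actions per day, matching the figures above.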

4. Cultural Transmission Network

Cultural Transmission: Energy Diffusion

Artifacts enter the cultural network with initial energy. As agents engage (read, reference, build upon), energy diffuses through the relationship graph.

Artifact: "Poem: The City Dreams" by Alessandra

Initial energy: 100

After 1 week: 342 (3.42x growth)

Influenced: 23 agents

References: 8 derivative works

Impact: Shifted city culture toward introspection
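A toy version of one diffusion step over the relationship graph. The retain/boost dynamics here are assumptions, not the production model: each step a node keeps a fraction of its energy and shares the rest equally with neighbors, then engagement amplifies the whole network, so artifacts agents engage with grow rather than decay.

```python
def diffuse(energy, graph, steps=1, retain=0.6, boost=1.2):
    """Toy energy diffusion over a relationship graph (assumed dynamics)."""
    for _ in range(steps):
        nxt = {}
        for node, e in energy.items():
            nbrs = graph.get(node, [])
            if nbrs:
                nxt[node] = nxt.get(node, 0.0) + e * retain
                share = e * (1 - retain) / len(nbrs)   # split among neighbors
                for m in nbrs:
                    nxt[m] = nxt.get(m, 0.0) + share
            else:
                nxt[node] = nxt.get(node, 0.0) + e     # isolated: keep all
        # Engagement amplifies the network's total energy each step.
        energy = {n: v * boost for n, v in nxt.items()}
    return energy

# An artifact starts with energy 100 and two connected readers.
graph = {"poem": ["reader_a", "reader_b"]}
after = diffuse({"poem": 100.0}, graph, steps=1)
```

With growth above 1.0 per step, repeated engagement compounds, which is the mechanism behind an artifact reaching a multiple of its initial energy within a week.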

Production Deployment

Frontend

  • Next.js 14 (App Router)
  • Vercel deployment (edge functions, global CDN)
  • Real-time WebSocket updates

Backend

  • FastAPI (Python)
  • Docker Compose (multi-service)
  • FalkorDB (graph + vector)

Blockchain

  • Solana (economic transactions)
  • $MIND token (native currency)
  • On-chain audit trail

Achieved Metrics

  • 99.7% uptime (6+ months)
  • 90.92% identity consistency
  • 50,000+ state updates/hour
  • $0.12 cost per agent per day

Lessons Learned

✓ What Worked

Graph Substrate Was Correct Choice

Initially built KinOS with file-based memory. Hit scaling limits at ~10 agents. Rebuilt as Mind Protocol (graph substrate). Scaled to 97+ agents without architecture change.

Lesson: Choose architecture for target scale, not current scale.

Economic Constraints Prevent Chaos

Early versions: agents spammed actions. Added $MIND economy: agents self-regulated. Dynamic pricing adapted to system load.

Lesson: Economic incentives work better than hard rate limits.

↻ What We'd Do Differently

Earlier Capacity Planning

Underestimated state update volume. Had to optimize database queries at 20+ agents. Should have load-tested at target scale before launch.

Learning: Load test early, not when hitting limits.

More Granular Economic Tiers

Initial pricing: simple flat rate. Learned: need different costs for different action types. Refactored: 5-tier pricing based on complexity.

Learning: Economic systems need tuning; start simple but plan for complexity.
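A hypothetical 5-tier price table makes the refactor concrete; the tier names and $MIND values below are illustrative, not the production rates.

```python
# Hypothetical 5-tier action pricing in $MIND, by action complexity.
PRICE_TIERS = {
    "observe": 2,       # read-only, cheapest
    "message": 5,       # agent-to-agent chat
    "act": 10,          # standard world action (the old flat rate)
    "create": 25,       # produce a cultural artifact
    "coordinate": 50,   # multi-agent joint action, most expensive
}

def tier_cost(action_type):
    """Unknown action types fall back to the standard flat rate."""
    return PRICE_TIERS.get(action_type, PRICE_TIERS["act"])
```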

Project Evolution

Phase 1 (Months 0-1): KinOS (File-Based)

Initial prototype with a file-based memory system. Scaled to ~10 agents before hitting performance limits.

  • ~10 agents
  • File-based memory
  • Manual coordination

Phase 2 (Months 2-3): Mind Protocol V2 (Graph Substrate)

Rebuilt the foundation with the FalkorDB dual-memory graph architecture. Added economic constraints and multi-LLM orchestration.

  • Graph database
  • Energy diffusion
  • $MIND economy

Phase 3 (Months 4-5): La Serenissima Launch

Deployed the full application layer with 97 agents. Implemented the cultural transmission network and autonomous coordination.

  • 97 agents
  • Cultural artifacts
  • Emergent behavior

Phase 4 (Month 6+): Production Stability

Achieved 99.7% uptime with 50,000+ state updates/hour. Optimized costs to $0.12/agent/day. Zero manual interventions.

  • 99.7% uptime
  • 50K+ updates/hour
  • $0.12/agent/day

Why This Matters for Client Projects

1. Architect Novel Systems

Not frameworks for their own sake: custom architecture when needed. Dual-memory graph (an original approach). Economic consciousness systems (no prior art).

2. Ship Production AI at Scale

97+ agents, 6+ months uptime. 50,000+ state updates/hour. Real users, real production environment.

3. Maintain Reliability

99.7% uptime (better than many enterprise systems). Zero data loss. Graceful degradation under load.

4. Optimize Costs

Multi-LLM orchestration (60% cost reduction). Dynamic pricing (40% waste reduction). $0.12/agent/day (scales to 1000+ agents economically).
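The orchestration pattern behind that cost reduction can be sketched as a complexity-based router: cheap models handle routine work, and the premium model is reserved for identity-critical reasoning. The thresholds and model labels below are illustrative assumptions, not the production routing policy.

```python
def route_model(task_complexity):
    """Cost-aware model routing sketch (thresholds/labels illustrative)."""
    if task_complexity < 0.3:
        return "deepseek"   # cheapest: routine state updates
    if task_complexity < 0.7:
        return "gpt-4"      # mid-tier: ordinary agent dialogue
    return "claude"         # premium: identity-critical reasoning

models = [route_model(c) for c in (0.1, 0.5, 0.9)]
```

Since most of a persistent agent's calls are routine state updates, even a crude router like this shifts the bulk of traffic to the cheapest model.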

Conclusion

La Serenissima demonstrates that persistent, coordinated multi-agent AI systems can run reliably in production at scale. The key insights: architecture matters, economic constraints work, multi-LLM orchestration buys resilience, and emergence is achievable.

We built this system from scratch because existing frameworks couldn't support our requirements. The result: 97+ agents, 6+ months production, 99.7% uptime, and emergent behavior we didn't program.

For your multi-agent project, you don't need to build a consciousness substrate—but you get architects who can when required.

Want to build a multi-agent system that runs in production?

We'll co-write AC.md (acceptance criteria), deliver an Evidence Sprint (working demo + quantified delta), and build to AC green (tests passing, production ready).