Social Protocol Requirements - 09: Implementation

Implementation

Client Architecture

Sovereign iOS/Android clients with hardware-backed security and offline-first design.

Requirements

  • The client MUST be a Sovereign Operator that manages the user's keys, data, and social graph locally.
  • The client MUST be implemented using native platform primitives (Swift on iOS, Kotlin on Android) for maximum performance and security.
  • The client MUST use a local database (SQLite/LSM) for indexing followed personas, local CIDs, and the user's social graph.
  • The client MUST protect the Master Key using hardware-backed Secure Enclave (iOS) and Android Keystore.
  • The client MUST use a content-addressed cache to store the most recent and relevant CIDs locally.
  • The client MUST implement delta sync to only fetch new CIDs from the PDS/Relay.
  • The client MUST use a peer-to-PDS protocol for secure, encrypted synchronization with the user's remote PDS.
  • The client MUST implement conflict resolution using CID-based versioning and Merkle trees.
  • The client MUST support local publication of content while offline.
  • The client MUST provide an optimistic UI with background synchronization.
  • The client MUST provide progressive security options, with software key default and hardware key option for advanced users.
  • The client MUST aim for <2 seconds for most operations (e.g., initial load, posting).

The Abstraction Layer (UX/UI)

The client application MUST hide the complexity of DIDs and CIDs behind a familiar interface:

  • Biometric Unlock: The app MUST use FaceID/Fingerprint to sign transactions. The user MUST NEVER see a raw private key during daily operations.
  • Status Indicators: The UI MUST provide clear context, such as a "Seeding Now" icon when providing P2P bandwidth, and a "Protected by [NGO]" badge indicating which PDS is currently authoritative.

"View" Discovery & Rendering

Because the protocol relies on a Universal Note Schema, the UI MUST dynamically construct itself based on the payload.

  • MIME-Type Dispatcher: The client MUST include a rendering engine that dispatches the correct UI component based on `object.type` and `mimeType` (e.g., loading a vertical player for `video/mp4` vs. a text renderer for `text/markdown`).
  • Custom Namespaces: Applications MAY define custom metadata extensions (e.g., an `ext:ecommerce` namespace) to render specialized views like inventory trackers or shipping interfaces.
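
A minimal sketch of the MIME-Type Dispatcher as a lookup table with a wildcard fallback. The renderer names and the `NoteObject` shape are hypothetical, and for brevity the sketch keys on `mimeType` only; a fuller dispatcher would also consult `object.type`.

```typescript
// Hypothetical renderer identifiers; the requirement names only the dispatch keys.
type RendererId = "vertical-video-player" | "markdown-text" | "image-viewer" | "fallback-card";

interface NoteObject {
  type: string;     // corresponds to `object.type`
  mimeType: string; // e.g. "video/mp4" or "text/markdown"
}

// Exact MIME types are tried first, then a "family/*" wildcard entry.
const RENDERERS: Record<string, RendererId> = {
  "video/mp4": "vertical-video-player",
  "text/markdown": "markdown-text",
  "image/*": "image-viewer",
};

function dispatchRenderer(obj: NoteObject): RendererId {
  const mime = obj.mimeType.toLowerCase();
  const family = mime.split("/")[0] + "/*";
  // Unknown types fall back to a generic card instead of failing to render.
  return RENDERERS[mime] ?? RENDERERS[family] ?? "fallback-card";
}
```

Falling back to a generic card keeps the feed usable when a custom namespace introduces a payload type this client does not yet know.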

The Action-Trigger API (Async Hooks)

The client MUST be capable of handling asynchronous events pushed from the Governance and Judicial layers.

  • Notification Schema: The client MUST parse and render structured JSON events like `CONTRACT_DISPUTE_INITIATED` or `VOTE_REQUIRED`.
  • Auto-Execution: The PDS MUST run background listeners capable of automatically executing finalized smart contract rulings (e.g., releasing HODL funds) even if the user's primary mobile client is offline.
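
As an illustration of the Notification Schema requirement, the sketch below validates an incoming JSON event before it reaches the UI. Only the two event names come from this document; the payload fields (`contractCid`, `proposalCid`, `deadline`) are assumptions made for the example.

```typescript
// Event names are from the requirement above; all other fields are hypothetical.
type GovernanceEvent =
  | { type: "CONTRACT_DISPUTE_INITIATED"; contractCid: string }
  | { type: "VOTE_REQUIRED"; proposalCid: string; deadline: number };

function parseEvent(raw: string): GovernanceEvent | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // malformed JSON is dropped, never rendered
  }
  if (typeof data !== "object" || data === null) return null;

  const fields = data as Record<string, unknown>;
  const type = fields["type"];
  const contractCid = fields["contractCid"];
  const proposalCid = fields["proposalCid"];
  const deadline = fields["deadline"];

  if (type === "CONTRACT_DISPUTE_INITIATED" && typeof contractCid === "string") {
    return { type: "CONTRACT_DISPUTE_INITIATED", contractCid };
  }
  if (type === "VOTE_REQUIRED" && typeof proposalCid === "string" && typeof deadline === "number") {
    return { type: "VOTE_REQUIRED", proposalCid, deadline };
  }
  return null; // unknown event type or missing fields
}
```

Returning `null` for anything unrecognized lets the client ignore events from newer protocol versions without crashing.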

Technical Stack

  • Native Platform Primitives: Swift (iOS) and Kotlin (Android) for maximum performance and security.
  • Local Database (SQLite/LSM): An embedded database for indexing followed personas, local CIDs, and the user's social graph.
  • Cryptography Engine: Hardware-backed Secure Enclave (iOS) and Android Keystore for Master Key AND all Persona keys. Private keys must never leave secure hardware.

Data & Storage Layer

The Local Cache (Tier 1)
  • Content-Addressed Cache: Stores the most recent and relevant CIDs locally to ensure instant load times.
  • Delta Sync: Clients only fetch new CIDs (diffs) from the PDS/Relay to minimize data usage.
PDS Synchronization (Tier 2)
  • Peer-to-PDS Protocol: Secure, encrypted transport for syncing the local database with the user's remote PDS.
  • Conflict Resolution: Uses CID-based versioning and Merkle trees to resolve state discrepancies between devices.

Offline-First Design

  • Local Publication: Users can "post" (create a CID) while offline. The CID is queued in the local database and broadcast to the PDS/Relay once connectivity is restored.
  • Optimistic UI: Changes are reflected immediately in the local UI, with background synchronization.
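
The Local Publication behaviour can be sketched as an in-memory outbox that is drained in order once connectivity returns. The `Outbox` class and the `trySend` callback are hypothetical names, and the callback is synchronous only to keep the example short; a real transport to the PDS/Relay would be asynchronous and the queue would live in the local database.

```typescript
interface QueuedPost {
  cid: string;     // computed locally, so it is known before any network call
  payload: string;
}

class Outbox {
  private pending: QueuedPost[] = [];

  // Called when the user posts; succeeds with or without connectivity.
  enqueue(post: QueuedPost): void {
    this.pending.push(post);
  }

  get size(): number {
    return this.pending.length;
  }

  // Called when connectivity is restored. Stops at the first failure so that
  // posts are broadcast in the order they were created. Returns the number sent.
  flush(trySend: (post: QueuedPost) => boolean): number {
    let sent = 0;
    while (this.pending.length > 0) {
      const next = this.pending[0];
      if (!trySend(next)) break;
      this.pending.shift();
      sent++;
    }
    return sent;
  }
}
```

Because the CID is derived from the content, the optimistic UI can show the post under its final identifier immediately; the later broadcast does not change it.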

API & Protocol Specifications

Protocol-First Design

The Social Protocol is a set of open protocols, not a single API service. Developers build against the Social Protocol Specification (v1.0), which defines the core data formats and transport methods.

Core Protocol Versioning

Semantic Versioning (SemVer)
  • V1.0 (Current): The stable foundation for identity, data storage (PDS), and message routing (Relay).
  • Major Upgrades: Handled via Genesis Contract Updates. A persona or collective publishes a signed update to their governance contract, signaling their move to a new protocol version.
  • Backward Compatibility: All V1.0 clients must be able to parse and display V1.0 Content Objects, even if a newer version is available.
Feature Negotiation
  • Capabilities Object: When a client connects to a PDS or Relay, it exchanges a signed Capabilities Object to determine which protocol extensions (e.g., specific encryption Ratchets, compression methods) are supported.
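
One possible reading of Feature Negotiation is a simple intersection of the extensions each side advertises. The `Capabilities` shape and the extension names used below are assumptions, and signature verification of the object is omitted.

```typescript
// Hypothetical shape; the spec only states that a signed Capabilities Object is exchanged.
interface Capabilities {
  protocolVersion: string;
  extensions: string[]; // listed in the sender's order of preference
}

// Returns the extensions both sides support, keeping the client's preference order.
function negotiate(client: Capabilities, server: Capabilities): string[] {
  const offered = new Set(server.extensions);
  return client.extensions.filter((ext) => offered.has(ext));
}
```

For example, a client offering `["zstd", "gzip"]` to a server that supports only `["gzip", "double-ratchet"]` would end up with `["gzip"]`.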

Primary Developer APIs

The PDS API (REST/gRPC over E2EE)
  • `put(CID, Payload)` - Upload a new content object.
  • `get(CID)` - Retrieve an encrypted content object.
  • `list(PersonaDID, Filter)` - List CIDs published by a specific persona.
  • `sync()` - Merkle-tree based delta synchronization.
The Relay API (Pub/Sub over WebSocket)
  • `subscribe(FilterCID)` - Subscribe to real-time broadcasts.
  • `publish(CID)` - Broadcast a new CID to the network.
  • `prove_existence(CID)` - Request a cryptographic proof that a CID is available on the Relay.
The Client-to-PDS API (Sovereign Sync)
  • A specialized protocol for the high-security synchronization of the user's local database and their remote PDS.

Data Encoding (Multiformats)

  • CID (Content-ID): Multibase + Multicodec + Multihash.
  • Serialization: Protocol Buffers (v3) for high performance and strict typing.
  • Envelopes: Signed and encrypted payloads follow a standard social protocol Envelope format (`proof`, `encryption_metadata`, `payload`).

Testing & Adversarial

Testing Philosophy

The social protocol's decentralized and sovereign nature requires a multi-layered testing strategy that goes beyond standard unit tests. We must test for Network Resilience, Adversarial Resiliency, and Game-Theoretic Stability.

Core Testing Tiers

Unit & Integration Tests
  • Protocol Conformance: Every client and service must pass a standard Social Protocol Conformance Suite to ensure they correctly implement the V1.0 spec.
  • Cryptography Validation: Rigorous testing of key derivation, encryption/decryption, and signature verification using known-good test vectors.
Network & Chaos Testing
  • The "Chaos Relay": A specialized test environment where Relays are intentionally dropped, delayed, or return malformed data to ensure clients handle network failures gracefully.
  • PDS Synchronization Stress: Testing Merkle-tree sync with millions of CIDs and complex conflict scenarios.

Adversarial Strategy

Byzantine Fault Tolerance
  • Malicious Relays: Testing client behavior when a Relay attempts to serve stale or incorrect CIDs.
  • Sybil Attacks: Evaluating the protocol's resistance to a single attacker creating millions of fake personas.
Game-Theoretic Analysis
  • Economic Attacks: Simulating scenarios where an attacker attempts to "spam" the network.
  • Censorship Resistance: Testing the ability for a persona's content to remain available when a majority of Relays are actively blocking it.

Security Audits & Oracles

  • Automated Security Scans: Using automated tools to scan the protocol implementation for known cryptographic vulnerabilities.
  • Validator Oracle Verification: Using the Validator Oracle Network to run the protocol conformance suite against every new version.
  • Red Team / Adversarial Simulations: A dedicated testnet where a "Red Team" is paid to find and exploit protocol-level vulnerabilities.

Bridging & Interoperability

Migration from Centralized Platforms

  • The "Migration" Skill: A social protocol skill that imports a user's content and social graph from centralized platforms (e.g., via Twitter Archive or ActivityPub).
  • Social Graph Porting: Tools to extract and import follower lists, enabling seamless transition.

Social Protocol-to-Web Gateways

See Infrastructure - Social Protocol-to-Web Gateways for detailed requirements. Implementation notes:

  • Clients SHOULD provide links to Gateway-rendered versions of public content for sharing with users not on the social protocol.
  • Clients MAY embed Gateway content in web views for hybrid experiences.

Conflict Resolution Algorithm

Concept

Due to the offline-first nature of social protocol clients and multi-device usage, identical or overlapping modifications to the same logical object (e.g., updating a profile, adding to a specific thread) can occur concurrently without network coordination. A deterministic, Merkle tree-based conflict resolution algorithm ensures that all PDS nodes and clients eventually reach the same state.

Merkle Tree Structure

  • Every Persona's state is represented as a Merkle Directed Acyclic Graph (DAG).
  • Leaves are the individual Content Object CIDs.
  • Internal nodes are hashes of their children.
  • The Root Hash represents the current state of a Persona's PDS.

Conflict Detection

  1. Sync Handshake: Client connects to PDS (or PDS to PDS). They exchange Root Hashes.
  2. Path Traversal: If Root Hashes differ, they traverse down the tree exchanging hashes until they identify the divergent branches.
  3. Divergence Identification: A conflict occurs when two different CIDs claim to be the direct chronological successor of the same parent CID (a "fork" in the object history), or when there are concurrent writes to a mutable pointer (like a Repo DID branch head).
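
The path traversal in step 2 can be sketched as a recursive comparison that prunes any subtree whose hashes already match. The `MerkleNode` shape is hypothetical, and the sketch assumes both trees place children in the same positions; in practice each level would be exchanged over the network rather than held in memory.

```typescript
interface MerkleNode {
  hash: string;
  children: MerkleNode[]; // empty for a leaf (a Content Object CID)
}

function collectLeaves(node: MerkleNode): string[] {
  if (node.children.length === 0) return [node.hash];
  const out: string[] = [];
  node.children.forEach((child) => { out.push(...collectLeaves(child)); });
  return out;
}

// Returns the remote leaves that the local tree lacks or holds in a different version.
function divergentRemoteLeaves(local: MerkleNode | undefined, remote: MerkleNode): string[] {
  if (local === undefined) return collectLeaves(remote);   // whole subtree is new
  if (local.hash === remote.hash) return [];               // identical subtree: prune
  if (remote.children.length === 0) return [remote.hash];  // a differing leaf
  const localChildren = local.children;
  const out: string[] = [];
  remote.children.forEach((child, i) => {
    out.push(...divergentRemoteLeaves(localChildren[i], child));
  });
  return out;
}
```

Matching hashes let whole branches be skipped, so the work is proportional to the size of the divergence rather than the size of the repository.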

Deterministic Resolution Rules (LWW-Tiebreaker)

To automatically resolve conflicts without user intervention, the social protocol employs a deterministic algorithm based on logical clocks and cryptographic tie-breakers:

  1. Logical Clock (Lamport Timestamps):

    • Every Content Object includes a logical sequence number (`seq`) incremented with each update by the owner.
    • The object with the highest `seq` wins.
  2. Wall-Clock Tiebreaker:

    • If `seq` numbers are identical (e.g., same state modified offline on two devices simultaneously), the `createdAt` timestamp is compared.
    • The object with the most recent `createdAt` timestamp wins (Last-Write-Wins).
  3. Cryptographic Tiebreaker:

    • If both `seq` and `createdAt` are perfectly identical, the system compares the CIDs (which are hashes).
    • The CID with the numerically larger hash value wins. This guarantees a deterministic outcome across all nodes.
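
The three rules above amount to a single comparison function. The sketch below assumes, for illustration only, that `cid` is a fixed-length lowercase hex digest, so that string comparison matches numeric comparison of the hash; real multibase-encoded CIDs would need their digest bytes decoded first. The `VersionedObject` name is hypothetical.

```typescript
interface VersionedObject {
  cid: string;       // assumed: fixed-length lowercase hex digest
  seq: number;       // logical sequence number
  createdAt: number; // assumed: Unix time in milliseconds
}

function pickWinner(a: VersionedObject, b: VersionedObject): VersionedObject {
  // Rule 1: the highest logical sequence number wins.
  if (a.seq !== b.seq) return a.seq > b.seq ? a : b;
  // Rule 2: the most recent wall-clock timestamp wins (Last-Write-Wins).
  if (a.createdAt !== b.createdAt) return a.createdAt > b.createdAt ? a : b;
  // Rule 3: the larger hash wins, so every node picks the same object.
  return a.cid >= b.cid ? a : b;
}
```

The result does not depend on argument order, which is what lets two devices that see the conflicting objects in different orders converge on the same head.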

Merkle DAG Reconciliation

Once the winning CID is determined:

  1. The winning CID becomes the canonical head.
  2. The losing CID is retained in the PDS as an "orphaned branch" (preserving data).
  3. The PDS recomputes the Merkle Root Hash incorporating the resolved state.
  4. The client is notified of the resolution so it can update its local SQLite/LSM database and UI.

Manual Resolution (Edge Cases)

If the conflict involves high-stakes data (e.g., overlapping Genesis Contract updates or overlapping financial transactions where LWW is unsafe):

  • The deterministic algorithm is suspended.
  • Both CIDs are flagged with a `conflict: true` metadata tag.
  • The client UI prompts the user to manually select the canonical version or merge them into a new CID.

Related Documents

  • Social Protocol Client App Architecture
  • Social Protocol API & Protocol Versioning Spec
  • Social Protocol Testing, Chaos, and Adversarial

Delta Sync Protocol

Overview

This document fills the CRITICAL gap for Delta Sync Protocol (Section 08: Implementation). It specifies efficient differential synchronization between client and PDS, enabling minimal data transfer for content updates.

Problem Statement

Syncing entire content databases is inefficient for mobile networks. Delta sync enables:

  • Transfer only changed data (deltas)
  • Resume interrupted syncs
  • Handle offline-first scenarios
  • Minimize bandwidth usage

Design Principles

  1. Merkle Trees: Content is indexed by a content-addressed Merkle tree
  2. Vector Clocks: Causal ordering of changes
  3. Bloom Filters: Efficient "what's changed" queries
  4. Chunking: Large content split into chunks for partial sync

Sync Architecture

Merkle Tree Structure

```
                 ┌─────────────┐
                 │  Root CID   │
                 └──────┬──────┘
                        │
         ┌──────────────┼──────────────┐
         │              │              │
    ┌────▼────┐    ┌────▼────┐    ┌────▼────┐
    │ Chunk 1 │    │ Chunk 2 │    │ Chunk 3 │
    │ (post)  │    │ (post)  │    │ (image) │
    └─────────┘    └─────────┘    └─────────┘
```

Each node is content-addressed. Changing any leaf updates the entire path to root.
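
To make the "changing any leaf updates the path to root" property concrete, here is a small sketch that folds leaves pairwise into a root. Two things are assumptions for illustration: `toyHash` is a non-cryptographic FNV-1a stand-in (a real implementation would use the multihash function behind the CIDs), and the tree is built as a binary tree that duplicates the last node on odd levels.

```typescript
// Stand-in hash (32-bit FNV-1a). NOT cryptographically secure; illustration only.
function toyHash(input: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, "0");
}

// Hashes each leaf, then repeatedly hashes adjacent pairs until one root remains.
function merkleRoot(leaves: string[]): string {
  if (leaves.length === 0) return toyHash("");
  let level = leaves.map((leaf) => toyHash(leaf));
  while (level.length > 1) {
    const next: string[] = [];
    for (let i = 0; i < level.length; i += 2) {
      const left = level[i];
      const right = i + 1 < level.length ? level[i + 1] : left; // duplicate odd node
      next.push(toyHash(left + right));
    }
    level = next;
  }
  return level[0];
}
```

Comparing `merkleRoot(["post-1", "post-2", "image-1"])` with the root after editing any one leaf gives two different values, which is the signal the sync handshake relies on.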

Vector Clock

interface VectorClock {
  // Per-persona, per-device counters: DID -> device ID -> counter
  clocks: Record<DID, Record<string, number>>;
}

function compareClocks(a: VectorClock, b: VectorClock): 'before' | 'after' | 'concurrent' | 'equal' {
  let aGreater = false, bGreater = false;
  
  // Visit every persona present in either clock
  const allDids = new Set([...Object.keys(a.clocks), ...Object.keys(b.clocks)]);
  
  for (const did of allDids) {
    const aDevices = a.clocks[did] ?? {};
    const bDevices = b.clocks[did] ?? {};
    const allDevices = new Set([...Object.keys(aDevices), ...Object.keys(bDevices)]);
    
    // Compare per-device counters; a missing counter counts as 0
    for (const device of allDevices) {
      const aVal = aDevices[device] ?? 0;
      const bVal = bDevices[device] ?? 0;
      
      if (aVal > bVal) aGreater = true;
      if (bVal > aVal) bGreater = true;
    }
  }
  
  if (aGreater && bGreater) return 'concurrent';
  if (aGreater) return 'after';
  if (bGreater) return 'before';
  return 'equal';
}

Sync Protocol

Phase 1: Hello

Client announces itself and current state:

interface DeltaSyncHello {
  // Identity
  client_did: DID;
  device_id: string;  // Unique per-device
  
  // Current state
  last_sync_cid?: CID;   // Last known root CID
  local_vector: VectorClock;
  
  // Capabilities
  compression: ('gzip' | 'zstd' | 'none')[];
  encoding: ('cbor' | 'msgpack' | 'json')[];
  
  // Preferences
  full_sync_if_older_than?: number;  // Seconds
}

Phase 2: Change Query

PDS determines what changed:

interface ChangeQuery {
  // What client already has
  last_known_root_cid?: CID;
  last_sync_vector: VectorClock;
  
  // What to sync
  sync_scope: {
    personas?: DID[];     // Which personas
    since?: number;      // Since timestamp
    until?: number;      // Until timestamp
    flags?: FlagFilter;  // Filter by flags
  };
  
  // Options
  include_bloom?: boolean;  // Return bloom filter of changes
}

interface ChangeResponse {
  // Delta info
  has_changes: boolean;
  new_root_cid: CID;
  new_cids: CID[];        // New content since last sync
  deleted_cids: CID[];    // Content deleted since last sync
  
  // For large syncs
  bloom_filter?: Buffer;  // Bloom filter of all current CIDs
  chunk_count?: number;   // If using chunked transfer
  
  // Vector clock update
  updated_vector: VectorClock;
}

Phase 3: Delta Transfer

interface DeltaRequest {
  cids: CID[];
  format: 'objects' | 'chunks' | 'both';
  encoding: 'cbor' | 'msgpack';
  compression?: 'gzip' | 'zstd';
}

interface DeltaResponse {
  objects: Map<CID, ContentObject>;  // Full objects
  chunk_map?: Map<CID, ChunkInfo[]>;  // If chunked
  merkle_proofs: MerkleProof[];       // Prove CIDs belong to root
  transfer_id: string;                // For resume
}

Phase 4: Confirmation

interface SyncConfirmation {
  // What we received
  received_cids: CID[];
  received_root_cid: CID;
  
  // Verification
  merkle_valid: boolean;
  vector_clock_updated: boolean;
  
  // Next sync
  next_sync_after: number;
}

interface SyncComplete {
  status: 'success' | 'partial' | 'failed';
  new_root_cid: CID;
  updated_vector: VectorClock;
}

Full Sync vs Delta Sync

Decision Algorithm

```
IF last_sync is undefined OR older_than(threshold):
    → FULL SYNC (send bloom filter, all objects)
ELSE:
    → DELTA SYNC (send only changes)
```
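
The decision rule translates directly into a small function. The sketch assumes timestamps are in milliseconds and the threshold is in seconds (matching `full_sync_if_older_than`); the function and parameter names are hypothetical.

```typescript
type SyncMode = "full" | "delta";

// lastSyncAt and now are Unix times in milliseconds; thresholdSeconds is in seconds.
function chooseSyncMode(lastSyncAt: number | undefined, now: number, thresholdSeconds: number): SyncMode {
  if (lastSyncAt === undefined) return "full"; // never synced before
  const ageSeconds = (now - lastSyncAt) / 1000;
  return ageSeconds > thresholdSeconds ? "full" : "delta";
}
```

With a 24-hour threshold (`86400`), a client that last synced an hour ago would take the delta path, while one that last synced 25 hours ago would fall back to a full sync.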

Full Sync Flow

  1. Client sends last_sync = null
  2. PDS returns full bloom filter of all CIDs
  3. Client calculates which CIDs are missing locally
  4. Client requests missing objects in batches
  5. PDS returns objects + merkle proofs
  6. Client verifies proofs, updates local merkle tree
  7. Client confirms sync complete

Chunking Strategy

For large content (images, videos, files):

Content Hash Chunking (Baba)

interface ChunkInfo {
  chunk_id: string;      // Hash of chunk content
  offset: number;       // Position in file
  size: number;         // Chunk size in bytes
  content_hash: string;  // SHA-256 of chunk
}

interface ChunkedContent {
  original_cid: CID;    // CID of original (for small files)
  chunk_cids: CID[];    // CIDs of each chunk
  chunk_info: ChunkInfo[];
  total_size: number;
  algorithm: 'babelfish' | 'fixed' | 'rabin';
}

// Sync only changed chunks
async function syncChunks(
  localChunks: ChunkInfo[],
  remoteChunks: ChunkInfo[]
): Promise<ChunkInfo[]> {
  const localHashes = new Set(localChunks.map(c => c.content_hash));
  return remoteChunks.filter(c => !localHashes.has(c.content_hash));
}

Resume Interrupted Sync

If sync is interrupted, client can resume:

interface ResumeRequest {
  transfer_id: string;
  last_received_cid?: CID;  // Where we left off
}

interface ResumeResponse {
  // Continue from where the transfer left off
  remaining_cids: CID[];
  next_chunk_index: number;
}
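
A minimal sketch of how a PDS might fill in a `ResumeResponse`, assuming it remembers the ordered CID list for each `transfer_id`. The `remainingAfter` helper is hypothetical, and CIDs are represented as plain strings to keep the example self-contained.

```typescript
interface ResumePoint {
  remaining_cids: string[];
  next_chunk_index: number;
}

// `ordered` is the full CID list of the interrupted transfer, in send order.
function remainingAfter(ordered: string[], lastReceived?: string): ResumePoint {
  // indexOf returns -1 for an unknown CID, which restarts the transfer from index 0.
  const lastIndex = lastReceived === undefined ? -1 : ordered.indexOf(lastReceived);
  const next_chunk_index = lastIndex + 1;
  return { remaining_cids: ordered.slice(next_chunk_index), next_chunk_index };
}
```

Restarting from the beginning when `last_received_cid` is unknown is the conservative choice: the client may re-download objects it already has, but content addressing makes the duplicates harmless.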

Implementation Example

import { CID } from 'multiformats';
import { MMT } from 'merkle-mountain-range';

/**
 * Delta Sync Engine
 */
export class DeltaSyncEngine {
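  private did: DID;          // Persona DID this engine syncs for
  private deviceId: string;  // Unique per-device identifier
  private localTree: MMT;
  private vectorClock: VectorClock;
  private lastSyncCID?: CID;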
  
  /**
   * Perform delta sync with PDS
   */
  async syncWithPDS(pdsEndpoint: string): Promise<SyncResult> {
    // Phase 1: Hello
    const hello: DeltaSyncHello = {
      client_did: this.did,
      device_id: this.deviceId,
      last_sync_cid: this.lastSyncCID,
      local_vector: this.vectorClock,
      compression: ['zstd', 'gzip', 'none'],
      encoding: ['cbor', 'msgpack', 'json'],
      full_sync_if_older_than: 86400 // 24 hours
    };
    
    const helloResp = await this.post('/sync/hello', hello);
    
    // Phase 2: Query changes
    const query: ChangeQuery = {
      last_known_root_cid: this.lastSyncCID,
      last_sync_vector: this.vectorClock,
      sync_scope: { personas: [this.did] }
    };
    
    const changeResp = await this.post('/sync/query', query);
    
    if (!changeResp.has_changes) {
      return { status: 'no_changes', timestamp: Date.now() };
    }
    
    // Phase 3: Fetch delta
    // Declared outside the branch so Phase 4 can verify its proofs
    let delta: DeltaResponse | undefined;
    if (changeResp.new_cids.length > 0) {
      // Fall back to a full sync on first sync or for a very large change set
      if (changeResp.new_cids.length > 1000 || !this.lastSyncCID) {
        return await this.performFullSync(pdsEndpoint, changeResp);
      }
      
      // Delta sync
      delta = await this.fetchDelta(pdsEndpoint, changeResp.new_cids);
      await this.applyDelta(delta);
    }
    
    // Phase 4: Confirm
    const confirm: SyncConfirmation = {
      received_cids: changeResp.new_cids,
      received_root_cid: changeResp.new_root_cid,
      // Nothing to verify when no objects were fetched (e.g. deletions only)
      merkle_valid: delta ? await this.verifyMerkleProofs(delta) : true,
      vector_clock_updated: true,
      next_sync_after: Date.now() + 3600000 // one hour from now, in ms
    };
    
    const complete = await this.post('/sync/confirm', confirm);
    
    // Update local state
    this.lastSyncCID = complete.new_root_cid;
    this.vectorClock = complete.updated_vector;
    
    return {
      status: 'success',
      cids_synced: changeResp.new_cids.length,
      root_cid: complete.new_root_cid
    };
  }
  
  private async performFullSync(
    pds: string, 
    changes: ChangeResponse
  ): Promise<SyncResult> {
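    // Get the complete CID list from the PDS
    const allCIDs = await this.requestAllCIDs(pds);
    
    // Find missing. CIDs are objects, so compare their string forms
    // rather than relying on Set reference equality.
    const localCIDs = new Set((await this.getLocalCIDs()).map(c => c.toString()));
    const missingCIDs = allCIDs.filter(c => !localCIDs.has(c.toString()));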
    
    // Fetch in batches
    const batchSize = 100;
    for (let i = 0; i < missingCIDs.length; i += batchSize) {
      const batch = missingCIDs.slice(i, i + batchSize);
      const objects = await this.fetchObjects(pds, batch);
      await this.applyObjects(objects);
    }
    
    return { status: 'full_sync', cids_synced: missingCIDs.length };
  }
}

Compression & Encoding

| Format  | Compression | Typical Reduction |
|---------+-------------+-------------------|
| CBOR    | None        | 1x                |
| CBOR    | Gzip        | 3-5x              |
| CBOR    | Zstd        | 4-7x              |
| Msgpack | None        | 1.1x              |
| JSON    | None        | 0.8x (larger)     |

Recommended: CBOR + Zstd when bandwidth is the constraint, uncompressed CBOR for CPU-constrained devices.

Related Gaps

This closes:

  • Delta Sync Protocol (CRITICAL)
  • Conflict Resolution Algorithm (CRITICAL - partial, see PDS Sync doc)