Research & Working Systems

SIF funds research conducted by the Semantic Infrastructure Lab (SIL), our research division. SIL builds working systems that demonstrate semantic infrastructure in practice.

For technical deep-dives: Visit semanticinfrastructurelab.org for tool documentation, technical essays, and research papers.

Research Focus Areas

SIL’s research spans four interconnected pillars:

  • Progressive Disclosure: Token Efficiency, Agent-Help Standard, Reveal Implementation
  • Semantic Infrastructure: Provenance Systems, Trust Protocols, Universal IR
  • Multi-Agent Systems: Hierarchical Agency, Protocol Principles, Coordination Patterns
  • Deterministic Computation: Morphogen Domains, Multirate Execution, Physical Units

The Semantic OS Vision

Our north star is the Semantic Operating System: a 7-layer architecture providing semantic infrastructure the way Unix provided OS infrastructure.

The Provenance-First Layer Model

The stack, from top to bottom:

Layer 6: Reflection
Layer 5: Execution
Layer 4: Composition
Layer 3: Intent
Layer 2: Trust
Layer 1: Meaning
Layer 0: Provenance

Layer 0: Provenance (Everything Has Lineage) Every artifact has cryptographic lineage. No orphaned data, no mysterious origins.

Layer 1: Meaning (Semantic Understanding) Embeddings, type systems, similarity. Making meaning computable and inspectable.

Layer 2: Trust (Who Can Do What) Explicit authorization and identity. Trust is architectural, not implicit.

Layer 3: Intent (What We’re Accomplishing) Contracts declare what agents aim to do. Glass-box reasoning over black-box behavior.

Layer 4: Composition (Cross-Domain Integration) Universal semantic IR (Pantheon) enables cross-domain knowledge composition.

Layer 5: Execution (Doing Work Under Constraints) Deterministic engines (Morphogen) orchestrate reproducible computation.

Layer 6: Reflection (Learning From Execution) Observability and feedback loops. Systems that learn from what they’ve done.
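To make the lower layers concrete, here is a minimal sketch of a single artifact annotated across Layers 0 through 3. All class and field names are hypothetical, invented for illustration; this is not the Semantic OS API.

```python
import hashlib
from dataclasses import dataclass, field

# Hypothetical sketch: one artifact carrying metadata for each lower layer.
@dataclass
class Artifact:
    content: bytes
    parents: list = field(default_factory=list)    # Layer 0: lineage
    embedding: list = field(default_factory=list)  # Layer 1: meaning
    owner: str = "unknown"                         # Layer 2: trust
    intent: str = ""                               # Layer 3: declared intent

    def lineage_id(self) -> str:
        # Layer 0: an id derived from content plus parent ids, so every
        # artifact's origin is checkable and nothing is orphaned.
        h = hashlib.sha256(self.content)
        for p in self.parents:
            h.update(p.lineage_id().encode())
        return h.hexdigest()

src = Artifact(b"raw data", owner="lab", intent="ingest")
derived = Artifact(b"cleaned data", parents=[src], owner="lab", intent="clean")
print(derived.lineage_id()[:8])  # stable: same inputs, same lineage id
```

The point of the sketch is the shape, not the fields: lineage, meaning, trust, and intent travel with the artifact rather than living in a side channel.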

Read more: Semantic OS Architecture — Complete technical specification


Working Systems (Shipped)

These aren’t demos. They’re working systems with comprehensive test suites, validated engineering, and real-world usage.


Reveal (v0.24.0, v0.26+ planned)

Progressive code exploration for developers and AI agents

What it does: Reveal shows code structure before detail. Instead of reading 10,000 tokens to find one function, you see the outline first (100 tokens), then extract what you need (50 tokens).
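The structure-first pattern can be sketched with Python's built-in ast module. This is illustrative only, not Reveal's actual implementation: outline first, then extract only the function you need.

```python
import ast

# A toy source file to explore (stands in for a real 10,000-token module).
SOURCE = '''
def load(path):
    return open(path).read()

def parse(text):
    return text.split()
'''

def outline(source: str) -> list:
    # Step 1: the cheap overview, instead of reading the whole file.
    tree = ast.parse(source)
    return [n.name for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)]

def extract(source: str, name: str) -> str:
    # Step 2: pull out just the function the reader asked for.
    tree = ast.parse(source)
    for n in ast.walk(tree):
        if isinstance(n, ast.FunctionDef) and n.name == name:
            return ast.get_source_segment(source, n)
    raise KeyError(name)

print(outline(SOURCE))            # ['load', 'parse']
print(extract(SOURCE, "parse"))
```

For a large module, the outline costs a tiny fraction of the tokens of the full source, which is where the claimed 10-150x reduction comes from.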

Proof points:

  • 10-150x token reduction via structure-first exploration
  • 11K+ total downloads (3K+/month, 100% organic) on PyPI
  • URI-based adapters (python://, ast://, json://) for semantic navigation
  • Code quality checks (bugs, security, complexity analysis)
  • Progressive configuration (v0.26+): Three-level system from zero-config to custom extensions

Gateway value: Developers use Reveal → see structure-first thinking → ask “How does this work?” → arrive at USIR and Semantic OS concepts.

New research (Dec 2025):

  • Configuration as Semantic Contract: Config files should declare project semantics, not just tune behavior
  • Progressive Configuration Pattern: Solves zero-config vs configure-everything dilemma through three levels

Try it:

pip install reveal-cli
reveal src/mycode.py            # Structure overview
reveal src/mycode.py my_function # Extract specific function

Repository: github.com/Semantic-Infrastructure-Lab/reveal


Morphogen (v0.11)

Deterministic cross-domain computation framework

What it does: Morphogen evaluates computational DAGs (directed acyclic graphs) deterministically. Same input → same output, every time. No hidden state, no surprises.
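A deterministic DAG evaluator can be sketched in a few lines. This is an illustration of the idea, not Morphogen's API: nodes evaluate in dependency order, results are memoized, and the evaluation order is recorded.

```python
# Illustrative sketch of deterministic DAG evaluation (not Morphogen's API).
def evaluate(dag, node, cache=None, trace=None):
    # dag maps node name -> (function, [dependency node names])
    cache = {} if cache is None else cache
    trace = [] if trace is None else trace
    if node in cache:
        return cache[node]          # each node computed exactly once
    fn, deps = dag[node]
    args = [evaluate(dag, d, cache, trace) for d in deps]
    cache[node] = fn(*args)
    trace.append(node)              # dependency-tracked evaluation order
    return cache[node]

dag = {
    "a":   (lambda: 2, []),
    "b":   (lambda: 3, []),
    "sum": (lambda x, y: x + y, ["a", "b"]),
    "sq":  (lambda s: s * s, ["sum"]),
}
print(evaluate(dag, "sq"))  # 25, every time: same input, same output
```

Because nodes are pure functions of their declared dependencies, re-running the graph reproduces the result exactly, and the trace doubles as a provenance record.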

Proof points:

  • 1,600+ tests passing (comprehensive test suite)
  • Deterministic evaluation with full dependency tracking
  • USIR type system in production (demonstrates semantic composition)
  • Provenance graphs showing complete computation history

Gateway value: Shows that semantic operations can compose correctly. Proves that deterministic, reproducible AI computation is feasible at scale.

Use cases:

  • Scientific simulations requiring reproducibility
  • Cross-domain modeling (fluid dynamics + biology + CAD)
  • Auditable computation pipelines

Repository: github.com/Semantic-Infrastructure-Lab/morphogen


TiaCAD (v3.1.2)

Declarative parametric CAD with semantic observability

What it does: CAD operations as composable semantic transformations. Design as code, not black-box GUI clicks. Every geometry operation preserves type information and tracks provenance.
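The "design as code" idea can be sketched as follows. The class and method names are hypothetical, invented for this example, and are not TiaCAD's actual classes: each operation returns the same type and appends to a provenance record.

```python
from dataclasses import dataclass

# Hypothetical sketch of type-preserving, provenance-tracking geometry.
@dataclass(frozen=True)
class Box:
    w: float
    h: float
    d: float
    history: tuple = ("box",)   # every operation appends a step

    def scale(self, k: float) -> "Box":
        # Type-preserving: a Box in, a Box out, with the step recorded.
        return Box(self.w * k, self.h * k, self.d * k,
                   self.history + (f"scale({k})",))

    def volume(self) -> float:
        return self.w * self.h * self.d

part = Box(2, 3, 4).scale(2)
print(part.volume())   # 192
print(part.history)    # ('box', 'scale(2)')
```

Because operations chain and the history travels with the value, any geometry can answer "how was I made?", which is what semantic observability means here.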

Proof points:

  • 1,027 tests passing (geometry operations, constraints, transformations)
  • Type-preserving operations (semantic invariants maintained)
  • Composable geometry (operations chain correctly)
  • Semantic observability (inspect any value’s computational history)

Gateway value: Demonstrates domain-specific semantic infrastructure. Shows how to build on USIR for specialized domains while maintaining semantic guarantees.

Use cases:

  • Parametric design automation
  • Engineering documentation with provenance
  • Reproducible manufacturing specifications

Repository: github.com/Semantic-Infrastructure-Lab/tiacad


GenesisGraph (v0.3.0)

Cryptographic provenance verification for computational results

What it does: Every result carries cryptographic proof of its computation history. You can verify “where did this come from?” with mathematical certainty.
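A hash-chained provenance record illustrates the verification idea. This toy is far simpler than GenesisGraph's actual protocol: each entry commits to its step and to the previous entry's hash, so any edit to history breaks verification.

```python
import hashlib
import json

# Toy hash-chained provenance (illustrative, not GenesisGraph's protocol).
def record(step: dict, prev_hash: str = "") -> dict:
    body = {"step": step, "prev": prev_hash}
    body["hash"] = hashlib.sha256(
        json.dumps({"step": step, "prev": prev_hash}, sort_keys=True).encode()
    ).hexdigest()
    return body

def verify(chain: list) -> bool:
    prev = ""
    for entry in chain:
        expected = hashlib.sha256(
            json.dumps({"step": entry["step"], "prev": prev},
                       sort_keys=True).encode()
        ).hexdigest()
        if entry["hash"] != expected or entry["prev"] != prev:
            return False    # any tampering breaks the chain
        prev = entry["hash"]
    return True

chain = [record({"op": "load", "who": "alice"})]
chain.append(record({"op": "train", "who": "alice"}, chain[-1]["hash"]))
print(verify(chain))   # True
chain[0]["step"]["who"] = "mallory"
print(verify(chain))   # False: edited history no longer verifies
```

"Mathematical certainty" here means exactly this: forging a plausible history requires producing hash collisions, not just editing a log file.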

Proof points:

  • Cryptographic verification of computation chains
  • Trust assertion protocol (who computed what, when, how)
  • Provenance as first-class concern (not an afterthought)
  • Foundation for auditable AI systems

Gateway value: Enables answering “How do I know this is true?” for AI-generated results. Essential for high-stakes domains (medical, legal, scientific, financial).

Use cases:

  • Scientific reproducibility (verify research results)
  • Auditable AI decisions (regulatory compliance)
  • Deepfake detection (distinguish synthetic from real)
  • Supply chain verification (track computational provenance)

Repository: github.com/Semantic-Infrastructure-Lab/genesisgraph


Research Focus: Agent Ether (Year 1)

The next major milestone is Agent Ether: a coordination substrate for multi-agent systems.

The problem: Today’s AI agents don’t compose. They’re isolated LLM calls with ad-hoc tool access. No shared memory, no provenance, no semantic grounding.

Our approach: Agent Ether provides:

  • Shared semantic memory (GenesisGraph-backed)
  • USIR-based communication (agents speak same semantic language)
  • Deterministic orchestration (Morphogen-style computation)
  • Progressive disclosure (Reveal-style interaction)
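Since Agent Ether does not exist yet, the following is a highly speculative sketch of just the shared-memory idea, with every name invented for illustration: agents publish claims with authorship and sources attached, so other agents can inspect provenance, not just content.

```python
# Speculative sketch only: invented names, not an Agent Ether design.
class SharedMemory:
    def __init__(self):
        self.facts = []   # every entry carries its author and sources

    def assert_fact(self, agent: str, claim: str, sources=()):
        self.facts.append({"agent": agent, "claim": claim,
                           "sources": tuple(sources)})

    def query(self, keyword: str):
        return [f for f in self.facts if keyword in f["claim"]]

mem = SharedMemory()
mem.assert_fact("parser", "doc-7 mentions fatigue cracking",
                sources=["doc-7"])
hits = mem.query("fatigue")
# A second agent sees not just the claim but who made it and from what.
print(hits[0]["agent"], hits[0]["sources"])
```

The contrast with today's isolated LLM calls is the point: claims are shared, attributed, and grounded rather than vanishing when a single agent's context ends.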

Target: Research prototype by Q4 2026, production-ready by 2027.


Publications & Documentation

We publish our work continuously:

Technical documentation:

Strategic documentation:

  • Funding Strategy
  • SIF Founding Team Brief (coming soon)
  • Governance Model (coming soon)

Philosophy:

  • The Fork: Two Futures for AI (see homepage for overview)
  • Timeline A vs Timeline B (see homepage)

Research Methodology

How we work:

  1. Build working systems first — Theory follows practice
  2. Comprehensive testing — Every system has extensive test suites
  3. Production deployment — Tools must work in real workflows
  4. Open documentation — Publish patterns, publish failures
  5. Progressive disclosure — Start simple, reveal complexity as needed

Anti-patterns we avoid:

  • Vaporware (no demos without real code)
  • Overfitting to benchmarks (build for real problems)
  • Black-box complexity (glass-box transparency always)
  • Premature optimization (correctness before speed)

Success Metrics

Technical:

  • Test coverage >90% across all systems
  • Zero data loss in semantic memory operations
  • Deterministic reproduction of all computation
  • Progressive disclosure achieving 10-100x token efficiency

Adoption:

  • Major AI labs integrate USIR as IR layer
  • Research institutions adopt GenesisGraph for reproducibility
  • Regulatory frameworks reference provenance standards
  • Multi-agent systems use Agent Ether in production

Impact:

  • Scientific replication rates improve 20%+
  • AI energy efficiency improves 10x via progressive disclosure
  • Public trust in AI governance measurably increases
  • Timeline B momentum becomes visible

Open Questions

Research areas we’re actively exploring:

Semantic Memory:

  • What’s the optimal granularity for provenance tracking?
  • How do we handle privacy in cryptographic provenance?
  • Can we achieve web-scale semantic memory?

USIR:

  • What’s the minimal viable semantic type system?
  • How do we balance expressiveness vs simplicity?
  • Can natural language map cleanly to USIR?

Agent Coordination:

  • What’s the right abstraction for multi-agent memory?
  • How do we prevent semantic drift in long-running systems?
  • Can we guarantee composition without centralization?

Collaborate With Us

For researchers: We’re interested in collaborations on semantic memory, provenance systems, and multi-agent coordination. Reach out if you’re working on related problems.

For engineers: All four systems are open source. Contributions welcome. We need help with scaling, optimization, and ecosystem integration.

For organizations: Interested in deploying semantic infrastructure? We can advise on architecture, integration, and best practices.

Contact: See our contact page


The Long Game

These systems aren’t the end goal—they’re proof that the goal is achievable.

Reveal proves progressive disclosure works. Morphogen proves deterministic semantic computation scales. TiaCAD proves domain modules can maintain semantic invariants. GenesisGraph proves cryptographic provenance is feasible.

Together, they prove Timeline B is possible.

Now we build the rest of the Semantic OS. Join us.