Research & Working Systems
SIF funds research conducted by the Semantic Infrastructure Lab (SIL), our research division. SIL builds working systems that demonstrate semantic infrastructure in practice.
For technical deep-dives: visit semanticinfrastructurelab.org for tool documentation, technical essays, and research papers.
Research Focus Areas
SIL’s research spans four interconnected pillars:
```mermaid
mindmap
  root((SIL Research))
    Progressive Disclosure
      Token Efficiency
      Agent-Help Standard
      Reveal Implementation
    Semantic Infrastructure
      Provenance Systems
      Trust Protocols
      Universal IR
    Multi-Agent Systems
      Hierarchical Agency
      Protocol Principles
      Coordination Patterns
    Deterministic Computation
      Morphogen Domains
      Multirate Execution
      Physical Units
```
The Semantic OS Vision
Our north star is the Semantic Operating System: a 7-layer architecture providing semantic infrastructure the way Unix provided OS infrastructure.
The Provenance-First Layer Model
Layer 0: Provenance (Everything Has Lineage) Every artifact has cryptographic lineage. No orphaned data, no mysterious origins.
Layer 1: Meaning (Semantic Understanding) Embeddings, type systems, similarity. Making meaning computable and inspectable.
Layer 2: Trust (Who Can Do What) Explicit authorization and identity. Trust is architectural, not implicit.
Layer 3: Intent (What We’re Accomplishing) Contracts declare what agents aim to do. Glass-box reasoning over black-box behavior.
Layer 4: Composition (Cross-Domain Integration) Universal semantic IR (Pantheon) enables cross-domain knowledge composition.
Layer 5: Execution (Doing Work Under Constraints) Deterministic engines (Morphogen) orchestrate reproducible computation.
Layer 6: Reflection (Learning From Execution) Observability and feedback loops. Systems that learn from what they’ve done.
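The layer model above is "provenance-first": each layer assumes the ones beneath it. A hypothetical sketch of that ordering, where an artifact can only gain a layer's metadata once every lower layer is present (the names mirror the model above; none of this is an actual Semantic OS API):

```python
from dataclasses import dataclass, field

# Hypothetical sketch: the seven layers as an ordered, bottom-up checklist.
LAYERS = [
    "provenance",   # Layer 0: cryptographic lineage
    "meaning",      # Layer 1: embeddings, types, similarity
    "trust",        # Layer 2: authorization and identity
    "intent",       # Layer 3: declared contracts
    "composition",  # Layer 4: universal semantic IR
    "execution",    # Layer 5: deterministic computation
    "reflection",   # Layer 6: observability and feedback
]

@dataclass
class Artifact:
    name: str
    layers: dict = field(default_factory=dict)

    def attach(self, layer: str, payload):
        # Enforce provenance-first ordering: a layer can only be attached
        # once every layer below it is already present.
        idx = LAYERS.index(layer)
        missing = [l for l in LAYERS[:idx] if l not in self.layers]
        if missing:
            raise ValueError(f"cannot attach {layer!r}; missing {missing}")
        self.layers[layer] = payload
        return self

a = Artifact("result.csv")
a.attach("provenance", {"parent": "raw.csv"})
a.attach("meaning", {"type": "Table[float]"})
# a.attach("intent", ...)  # would raise: "trust" not attached yet
```

The point of the sketch is that provenance is not optional metadata bolted on at the end; everything else is refused until lineage exists.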
Read more: Semantic OS Architecture — Complete technical specification
Working Systems (Shipped)
These aren’t demos. They’re working systems with comprehensive test suites, validated engineering, and real-world usage.
Reveal (v0.24.0, v0.26+ planned)
Progressive code exploration for developers and AI agents
What it does: Reveal shows code structure before detail. Instead of reading 10,000 tokens to find one function, you see the outline first (100 tokens), then extract what you need (50 tokens).
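The arithmetic behind that workflow, using the token counts from the example above (real savings depend on the file and the query):

```python
# Illustrative arithmetic for the structure-first workflow described above.
full_read = 10_000   # tokens to read the whole file
outline   = 100      # structure overview first
extract   = 50       # then pull just the needed function

progressive = outline + extract
reduction = full_read / progressive
print(f"{reduction:.0f}x fewer tokens")  # ≈ 67x, within the claimed 10-150x range
```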
Proof points:
- 10-150x token reduction via structure-first exploration
- 11K+ total downloads (3K+/month, 100% organic) on PyPI
- URI-based adapters (`python://`, `ast://`, `json://`) for semantic navigation
- Code quality checks (bugs, security, complexity analysis)
- Progressive configuration (v0.26+): Three-level system from zero-config to custom extensions
Gateway value: Developers use Reveal → see structure-first thinking → ask “How does this work?” → leads to USIR and Semantic OS concepts.
New research (Dec 2025):
- Configuration as Semantic Contract: Config files should declare project semantics, not just tune behavior
- Progressive Configuration Pattern: Solves zero-config vs configure-everything dilemma through three levels
Try it:
```shell
pip install reveal-cli
reveal src/mycode.py              # Structure overview
reveal src/mycode.py my_function  # Extract specific function
```
Repository: github.com/Semantic-Infrastructure-Lab/reveal
Morphogen (v0.11)
Deterministic cross-domain computation framework
What it does: Morphogen evaluates computational DAGs (directed acyclic graphs) deterministically. Same input → same output, every time. No hidden state, no surprises.
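A minimal sketch of what deterministic DAG evaluation means, in the spirit of the description above. The node names and functions are invented for illustration; Morphogen's real engine also tracks types, units, and provenance:

```python
from graphlib import TopologicalSorter

def evaluate(dag, funcs, inputs):
    """Deterministically evaluate a DAG of pure functions.

    dag: node -> set of dependency nodes; funcs: node -> pure function.
    Leaves are supplied via `inputs`; same inputs always yield same outputs.
    """
    values = dict(inputs)
    for node in TopologicalSorter(dag).static_order():
        if node not in values:
            # Sort dependency names so argument order is itself deterministic.
            args = [values[d] for d in sorted(dag[node])]
            values[node] = funcs[node](*args)
    return values

dag = {"x": set(), "y": set(), "sum": {"x", "y"}, "sq": {"sum"}}
funcs = {"sum": lambda a, b: a + b, "sq": lambda s: s * s}
out = evaluate(dag, funcs, {"x": 2, "y": 3})
print(out["sq"])  # 25, every time
```

Because every node is a pure function of its dependencies, rerunning the pipeline reproduces the result exactly, and the DAG itself doubles as a dependency-tracking record.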
Proof points:
- 1,600+ tests passing (comprehensive test suite)
- Deterministic evaluation with full dependency tracking
- USIR type system in production (demonstrates semantic composition)
- Provenance graphs showing complete computation history
Gateway value: Shows that semantic operations can compose correctly. Proves that deterministic, reproducible AI computation is feasible at scale.
Use cases:
- Scientific simulations requiring reproducibility
- Cross-domain modeling (fluid dynamics + biology + CAD)
- Auditable computation pipelines
Repository: github.com/Semantic-Infrastructure-Lab/morphogen
TiaCAD (v3.1.2)
Declarative parametric CAD with semantic observability
What it does: CAD operations as composable semantic transformations. Design as code, not black-box GUI clicks. Every geometry operation preserves type information and tracks provenance.
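A hypothetical sketch of the "design as code" pattern described above: each operation is type-preserving and appends to a provenance trail. TiaCAD's actual types and operations differ; this shows only the shape of the idea:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Box:
    w: float
    h: float
    d: float
    history: tuple = ("Box",)  # computational history of this value

def scale(box: Box, k: float) -> Box:
    # Type-preserving: a Box in, a Box out, with the step recorded.
    return Box(box.w * k, box.h * k, box.d * k,
               box.history + (f"scale({k})",))

part = scale(scale(Box(1, 2, 3), 2), 0.5)
print(part.history)  # ('Box', 'scale(2)', 'scale(0.5)')
print(part.w)        # 1.0 — operations composed correctly
```

Because values are immutable and carry their history, "inspect any value's computational history" reduces to reading a field rather than replaying a GUI session.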
Proof points:
- 1,027 tests passing (geometry operations, constraints, transformations)
- Type-preserving operations (semantic invariants maintained)
- Composable geometry (operations chain correctly)
- Semantic observability (inspect any value’s computational history)
Gateway value: Demonstrates domain-specific semantic infrastructure. Shows how to build on USIR for specialized domains while maintaining semantic guarantees.
Use cases:
- Parametric design automation
- Engineering documentation with provenance
- Reproducible manufacturing specifications
Repository: github.com/Semantic-Infrastructure-Lab/tiacad
GenesisGraph (v0.3.0)
Cryptographic provenance verification for computational results
What it does: Every result carries cryptographic proof of its computation history. You can verify “where did this come from?” with mathematical certainty.
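A minimal sketch of hash-chained provenance, illustrating the idea described above. GenesisGraph's actual protocol (signatures, trust assertions) is richer; this shows only how each record can commit to its predecessor so that tampering is detectable:

```python
import hashlib
import json

def record(step: dict, parent_hash: str) -> dict:
    # Each record commits to its step and its parent's hash.
    body = {"step": step, "parent": parent_hash}
    digest = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()).hexdigest()
    return {**body, "hash": digest}

def verify(chain) -> bool:
    # Walk the chain, recomputing each hash and checking linkage.
    parent = "genesis"
    for rec in chain:
        body = {"step": rec["step"], "parent": rec["parent"]}
        expected = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()).hexdigest()
        if rec["parent"] != parent or rec["hash"] != expected:
            return False
        parent = rec["hash"]
    return True

r1 = record({"op": "load", "src": "data.csv"}, "genesis")
r2 = record({"op": "normalize"}, r1["hash"])
print(verify([r1, r2]))         # True
r2["step"]["op"] = "tampered"
print(verify([r1, r2]))         # False — tampering breaks the chain
```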
Proof points:
- Cryptographic verification of computation chains
- Trust assertion protocol (who computed what, when, how)
- Provenance as first-class concern (not an afterthought)
- Foundation for auditable AI systems
Gateway value: Enables answering “How do I know this is true?” for AI-generated results. Essential for high-stakes domains (medical, legal, scientific, financial).
Use cases:
- Scientific reproducibility (verify research results)
- Auditable AI decisions (regulatory compliance)
- Deepfake detection (distinguish synthetic from real)
- Supply chain verification (track computational provenance)
Repository: github.com/Semantic-Infrastructure-Lab/genesisgraph
Research Focus: Agent Ether (Year 1)
The next major milestone is Agent Ether: a coordination substrate for multi-agent systems.
The problem: Today’s AI agents don’t compose. They’re isolated LLM calls with ad-hoc tool access. No shared memory, no provenance, no semantic grounding.
Our approach: Agent Ether provides:
- Shared semantic memory (GenesisGraph-backed)
- USIR-based communication (agents speak same semantic language)
- Deterministic orchestration (Morphogen-style computation)
- Progressive disclosure (Reveal-style interaction)
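To make the four bullets above concrete, here is a purely speculative interface sketch. Agent Ether does not exist yet; every name below is invented for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class SemanticMemory:
    # Stand-in for shared, provenance-tracked memory
    # (GenesisGraph-backed in the vision above).
    facts: list = field(default_factory=list)

    def assert_fact(self, agent: str, claim: str):
        self.facts.append({"agent": agent, "claim": claim})

@dataclass
class Agent:
    name: str
    memory: SemanticMemory

    def act(self, claim: str):
        # Agents coordinate by writing claims to shared memory with
        # attribution, rather than exchanging opaque strings.
        self.memory.assert_fact(self.name, claim)

mem = SemanticMemory()
planner, executor = Agent("planner", mem), Agent("executor", mem)
planner.act("goal: summarize dataset D")
executor.act("done: summary stored at node N")
print(len(mem.facts))  # 2 — both agents share one attributed trail
```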
Target: Research prototype by Q4 2026, production-ready by 2027.
Publications & Documentation
We publish our work continuously:
Technical documentation:
Strategic documentation:
- Funding Strategy
- SIF Founding Team Brief (coming soon)
- Governance Model (coming soon)
Philosophy:
Research Methodology
How we work:
- Build working systems first — Theory follows practice
- Comprehensive testing — Every system has extensive test suites
- Production deployment — Tools must work in real workflows
- Open documentation — Publish patterns, publish failures
- Progressive disclosure — Start simple, reveal complexity as needed
Anti-patterns we avoid:
- Vaporware (no demos without real code)
- Overfitting to benchmarks (build for real problems)
- Black-box complexity (glass-box transparency always)
- Premature optimization (correctness before speed)
Success Metrics
Technical:
- Test coverage >90% across all systems
- Zero data loss in semantic memory operations
- Deterministic reproduction of all computation
- Progressive disclosure achieving 10-100x token efficiency
Adoption:
- Major AI labs integrate USIR as IR layer
- Research institutions adopt GenesisGraph for reproducibility
- Regulatory frameworks reference provenance standards
- Multi-agent systems use Agent Ether in production
Impact:
- Scientific replication rates improve 20%+
- AI energy efficiency improves 10x via progressive disclosure
- Public trust in AI governance measurably increases
- Timeline B momentum becomes visible
Open Questions
Research areas we’re actively exploring:
Semantic Memory:
- What’s the optimal granularity for provenance tracking?
- How do we handle privacy in cryptographic provenance?
- Can we achieve web-scale semantic memory?
USIR:
- What’s the minimal viable semantic type system?
- How do we balance expressiveness vs simplicity?
- Can natural language map cleanly to USIR?
Agent Coordination:
- What’s the right abstraction for multi-agent memory?
- How do we prevent semantic drift in long-running systems?
- Can we guarantee composition without centralization?
Collaborate With Us
For researchers: We’re interested in collaborations on semantic memory, provenance systems, and multi-agent coordination. Reach out if you’re working on related problems.
For engineers: All four systems are open source. Contributions welcome. We need help with scaling, optimization, and ecosystem integration.
For organizations: Interested in deploying semantic infrastructure? We can advise on architecture, integration, and best practices.
Contact: See our contact page
The Long Game
These systems aren’t the end goal—they’re proof that the goal is achievable.
Reveal proves progressive disclosure works. Morphogen proves deterministic semantic computation scales. TiaCAD proves domain modules can maintain semantic invariants. GenesisGraph proves cryptographic provenance is feasible.
Together, they prove Timeline B is possible.
Now we build the rest of the Semantic OS. Join us.