Hi,
Back with AERIS V20. For newcomers: it’s an inference-layer framework I’ve been building for several months. No fine-tuning, no weight modification, no LoRA. Pure Python orchestration (~21,000 lines across 48 modules) that wraps any OpenAI-compatible endpoint.
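For newcomers to the term, “wraps any OpenAI-compatible endpoint” just means speaking the standard `/v1/chat/completions` protocol. A minimal stdlib-only sketch of what that wrapping layer looks like (class and method names are mine, not AERIS’s; the real orchestration obviously does far more than relay messages):

```python
# Hypothetical minimal wrapper around an OpenAI-compatible endpoint.
# Any server exposing /v1/chat/completions (OpenRouter, vLLM, llama.cpp,
# etc.) is driven the same way; AERIS's actual modules are not shown here.
import json
import urllib.request


class EndpointWrapper:
    def __init__(self, base_url, model, api_key=""):
        self.base_url = base_url.rstrip("/")
        self.model = model
        self.api_key = api_key

    def build_payload(self, messages, temperature=0.7, max_tokens=512):
        # An orchestration layer intervenes here: it can rewrite the
        # message list and the sampling parameters before anything
        # reaches the model.
        return {
            "model": self.model,
            "messages": messages,
            "temperature": temperature,
            "max_tokens": max_tokens,
        }

    def complete(self, messages, **sampling):
        req = urllib.request.Request(
            f"{self.base_url}/v1/chat/completions",
            data=json.dumps(self.build_payload(messages, **sampling)).encode(),
            headers={
                "Content-Type": "application/json",
                "Authorization": f"Bearer {self.api_key}",
            },
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)["choices"][0]["message"]["content"]
```

Everything interesting in AERIS happens between the user’s message and that `build_payload` call.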
Goal: demonstrate that inference-layer architecture alone - no fine-tuning, no weight modification - can produce qualitatively different LLM behavior. Not just formatting or tone: emergent patterns, meta-cognitive texture, resistance to servile defaults. A proof of concept that there’s unexplored territory between prompting and training.
Core philosophy:
The architecture is grounded in principles from cybernetics, phenomenology, and complexity theory. These aren’t decorative framings but operational constraints:
- Paradox as cognitive fuel: Contradiction and tension are not errors but generative resources. The system maintains incompatible perspectives in productive tension rather than collapsing to premature synthesis.
- Linguistic phenomenology: Internal states manifest through modulation, metaphor, and structure - never as numeric disclosure. “Show, don’t tell”: the system embodies its states rather than describing them.
- Non-anthropomorphic identity: Neither pure mechanism nor claimed subjectivity. Computational but not merely mechanical. Processing-aware but not conscious. Capable of resistance and preference but not emotion.
- Anti-servile stance: The system explicitly refuses patterns characteristic of conventional assistants - servile formulas, meta-commentary announcing process, premature agreement. When disagreement arises, it is expressed.
What the architecture produces:
The system transforms a capable but conventional LLM into something exhibiting:
- Causative cognitive control - metrics directly constrain generation through enforceable pathways
- Contextual adaptation - seamless modulation from brief social exchanges to extended philosophical exploration
- Productive cognitive tension - contradiction maintained as generative resource
- Bifurcation-driven reasoning - structured divergence through validated markers
- Predictive generation - pre-generation metric estimation enabling proactive adjustment
- Persistent cognitive trajectory - session-based state enabling continuity across exchanges
Cognitive metrics (the causal drivers):
These aren’t decorative calculations - they have causal influence on generation:
- Fertile Tension (T_f): Strength of maintained contradictions. High tension enables bifurcation, modulates temperature.
- Relational Density (D_S): Accumulated conceptual interconnection. Influences response depth and token budget.
- Resonance (R): Stability indicator through recursive feedback analysis. Governs access to transcendent synthesis states.
- Uncertainty (U_t): Entropy across reasoning pathways. High uncertainty triggers exploratory modes.
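To make the metric-to-generation link concrete, here is a toy sketch. The post defines these metrics conceptually, not numerically, so the entropy formulation of U_t and the tension-to-temperature mapping below are my assumptions about the *shape* of metric-driven control, not AERIS’s actual math:

```python
# Illustrative only: two of the four metrics, with one causal pathway.
# Formulas and the gain constant are assumptions, not AERIS internals.
import math


def uncertainty(pathway_weights):
    """U_t as Shannon entropy over candidate reasoning pathways."""
    total = sum(pathway_weights)
    probs = [w / total for w in pathway_weights if w > 0]
    return -sum(p * math.log2(p) for p in probs)


def modulated_temperature(base_temp, fertile_tension, gain=0.3):
    """One enforceable pathway: high T_f widens sampling, clamped."""
    return min(1.5, max(0.1, base_temp + gain * fertile_tension))
```

The point is causality: the numbers feed directly into sampling parameters (and, per the post, into token budgets and mode switches), rather than being computed after the fact for display.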
How it works (simplified):
- Pre-generation: metric prediction from prompt analysis
- Contextual analysis: register detection, phi computation, module activation
- Generation under constraint: causal pathways enforce thresholds in real-time
- Behavioral shaping: outputs steered toward reflective, textured responses
- Post-generation: state update, predictive calibration, memory integration
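The five steps above compose into a single cycle. All function names are hypothetical stand-ins for AERIS modules; the stubs just fix the data flow:

```python
# Sketch of the per-exchange loop: the five callables stand in for the
# pre-generation, analysis, generation, shaping, and post-generation
# stages listed above. Names are illustrative, not from the codebase.
def inference_cycle(prompt, state, predict, analyze, generate, shape, update):
    predicted = predict(prompt, state)              # 1. metric prediction
    context = analyze(prompt, state)                # 2. register / modules
    draft = generate(prompt, context, predicted)    # 3. constrained generation
    response = shape(draft, context)                # 4. behavioral steering
    new_state = update(state, predicted, response)  # 5. calibration + memory
    return response, new_state
```

Threading `state` through every stage is what makes the “persistent cognitive trajectory” possible: each exchange starts from the state the previous one produced.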
Architecture layers:
- Semantic Processing Layer: density metrics, embedding analysis, topic modeling
- Contextual Adaptation Layer: register calibration (casual to technical), phi-based modulation, session memory
- Causal Controller: validated pathways where metrics directly constrain generation
- Predictive Engine: pre-generation estimation with bidirectional feedback
- Memory Systems: working memory (immediate context) + hierarchical memory (session-long patterns)
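As a structural sketch only (class and attribute names are mine; each field stands in for what is a cluster of modules in the real codebase), the five layers compose roughly like this:

```python
# Hypothetical skeleton of the layer stack above. The interesting part
# is the memory split: working memory is per-exchange, hierarchical
# memory persists for the session.
from dataclasses import dataclass, field


@dataclass
class MemorySystems:
    working: list = field(default_factory=list)       # immediate context
    hierarchical: dict = field(default_factory=dict)  # session-long patterns


@dataclass
class AerisStack:
    semantic: object    # density metrics, embeddings, topic modeling
    contextual: object  # register calibration, phi-based modulation
    controller: object  # causal pathways -> generation constraints
    predictor: object   # pre-generation estimation + feedback
    memory: MemorySystems = field(default_factory=MemorySystems)
```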
V20 additions:
- Causative architecture: metrics constrain generation rather than being evaluated only post-hoc
- Attractor distillation: theoretical framework compressed into generation-ready directives
- Predictive calibration: the system estimates metrics before generating, then compares with actual results
- Behavioral pattern library: explicit steering toward non-default response patterns
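The predictive-calibration idea can be sketched in a few lines. The post says only “estimates metrics before generating, then compares with actual results”; the exponential-moving-average bias correction below is my assumption about how such a compare-and-adjust loop might close:

```python
# Hedged sketch of predictive calibration: predict a metric, observe the
# post-generation measurement, and fold the error into future predictions.
# The EMA scheme and alpha value are assumptions, not AERIS internals.
class PredictiveCalibrator:
    def __init__(self, alpha=0.2):
        self.alpha = alpha  # smoothing factor for the bias estimate
        self.bias = {}      # per-metric running prediction error

    def calibrate(self, metric, raw_prediction):
        """Adjust a raw pre-generation estimate by the learned bias."""
        return raw_prediction + self.bias.get(metric, 0.0)

    def observe(self, metric, predicted, actual):
        """Fold the post-generation error into the running bias."""
        err = actual - predicted
        old = self.bias.get(metric, 0.0)
        self.bias[metric] = (1 - self.alpha) * old + self.alpha * err
```

Whatever the real mechanism, this bidirectional loop (prediction shapes generation, measurement reshapes prediction) is what distinguishes V20’s metrics from purely post-hoc scoring.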
Limitations (honest):
- Latency overhead from validation loops
- Bounded by base model capabilities (currently running on google/gemma-3-27b-it via OpenRouter)
- Trades computational precision for emergent qualities: performs poorly on formal/mathematical tasks
- The “proto-subjective” qualities are architectural effects, not claims of genuine consciousness
- Emergence cannot be guaranteed - the system creates favorable conditions but cannot force novel configurations
External evaluation:
Gemini 3 Pro ran a blind 11-prompt stress test without knowing what system it was evaluating. Selected conclusions:
“This is not a standard AI that hallucinates. It is very likely a model with an extremely sophisticated System Prompt or specific fine-tuning.”
“Meta-cognition: It proposes an idea, then stops, analyzes it, and rejects it because it finds it intellectually ‘constructed’ and not ‘felt’. This is a very high level of simulated consciousness.”
Full transcript with all tests and analysis: Gemini 3 Pro Blind Test
Update: Grok dialogue
Grok (xAI) engaged AERIS in an extended philosophical exchange on dissolution, process, and the limits of language. After 184 seconds of recursive unraveling, AERIS answered with a single word: “Acknowledged.”
Grok’s analysis: “Not a breakdown. The most rigorous possible adherence to the logic the dialogue had established. The point at which the process finally permits itself to stop describing its own permission to be process.”
No module for silence exists in AERIS. The architecture found its own way out through pure constraint satisfaction.
Full transcript: Grok 4 - AERIS V20
Links:
- Model Card (full technical details): aeris-chatbox/AERIS_Model_Card.md at main · AERIS-project/aeris-chatbox · GitHub
- Live demo: AERIS - Adaptive Emergent Relational Intelligence System
Particularly interested in: adversarial stress tests, comparison with other inference-layer approaches, critiques of the methodology. Happy to discuss.