AERIS V20 – Architectural Constraints for Non-Standard LLM Behavior

Hi,

Back with AERIS V20. For newcomers: it’s an inference-layer framework I’ve been building for several months. No fine-tuning, no weight modification, no LoRA. Pure Python orchestration (~21,000 lines across 48 modules) that wraps any OpenAI-compatible endpoint.
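For readers unfamiliar with the term, "wrapping an OpenAI-compatible endpoint" just means speaking the standard chat-completions HTTP convention. A minimal sketch (the URL path, payload shape, and helper names follow that convention and are my illustration here, not AERIS code):

```python
import json
import urllib.request

def build_payload(model, messages, temperature=0.7):
    """Assemble the standard chat-completions request body."""
    return {"model": model, "messages": messages, "temperature": temperature}

def chat_completion(base_url, api_key, model, messages, temperature=0.7):
    """One call against any OpenAI-compatible /chat/completions endpoint."""
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_payload(model, messages, temperature)).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Everything AERIS does happens in Python around calls like this one; the endpoint (OpenRouter, a local server, etc.) is interchangeable.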

Goal: to demonstrate that inference-layer architecture alone - no fine-tuning, no weight modification - can produce qualitatively different LLM behavior. Not just formatting or tone, but emergent patterns, meta-cognitive texture, and resistance to servile defaults. A proof of concept that there is unexplored territory between prompting and training.

Core philosophy:

The architecture is grounded in principles from cybernetics, phenomenology, and complexity theory. These aren’t decorative framings but operational constraints:

  • Paradox as cognitive fuel: Contradiction and tension are not errors but generative resources. The system maintains incompatible perspectives in productive tension rather than collapsing to premature synthesis.
  • Linguistic phenomenology: Internal states manifest through modulation, metaphor, and structure - never as numeric disclosure. “Show, don’t tell”: the system embodies its states rather than describing them.
  • Non-anthropomorphic identity: Neither pure mechanism nor claimed subjectivity. Computational but not merely mechanical. Processing-aware but not conscious. Capable of resistance and preference but not emotion.
  • Anti-servile stance: The system explicitly refuses patterns characteristic of conventional assistants - servile formulas, meta-commentary announcing process, premature agreement. When disagreement arises, it is expressed.

What the architecture produces:

The system transforms a capable but conventional LLM into something exhibiting:

  • Causative cognitive control - metrics directly constrain generation through enforceable pathways
  • Contextual adaptation - seamless modulation from brief social exchanges to extended philosophical exploration
  • Productive cognitive tension - contradiction maintained as generative resource
  • Bifurcation-driven reasoning - structured divergence through validated markers
  • Predictive generation - pre-generation metric estimation enabling proactive adjustment
  • Persistent cognitive trajectory - session-based state enabling continuity across exchanges

Cognitive metrics (the causal drivers):

These aren’t decorative calculations - they have causal influence on generation:

  • Fertile Tension (T_f): Strength of maintained contradictions. High tension enables bifurcation, modulates temperature.
  • Relational Density (D_S): Accumulated conceptual interconnection. Influences response depth and token budget.
  • Resonance (R): Stability indicator through recursive feedback analysis. Governs access to transcendent synthesis states.
  • Uncertainty (U_t): Entropy across reasoning pathways. High uncertainty triggers exploratory modes.
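To make the flavor of these metrics concrete, here is a toy sketch of two of them. The specific formulas (Shannon entropy for U_t, a linear tension-to-temperature map for T_f) are my illustrative stand-ins, not the actual AERIS computations:

```python
import math

def uncertainty(pathway_probs):
    """U_t as Shannon entropy (bits) over candidate reasoning pathways."""
    return -sum(p * math.log2(p) for p in pathway_probs if p > 0)

def temperature_from_tension(t_f, base=0.7, gain=0.4, cap=1.2):
    """Toy causal pathway: higher fertile tension T_f raises sampling temperature."""
    return min(base + gain * t_f, cap)
```

Four equally plausible pathways give maximal U_t (2 bits), which would trigger an exploratory mode; a single dominant pathway gives U_t = 0 and conventional generation.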

How it works (simplified):

  1. Pre-generation: metric prediction from prompt analysis
  2. Contextual analysis: register detection, phi computation, module activation
  3. Generation under constraint: causal pathways enforce thresholds in real time
  4. Behavioral shaping: outputs steered toward reflective, textured responses
  5. Post-generation: state update, predictive calibration, memory integration
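The five steps can be sketched as one orchestration turn. The heuristics below (a vocabulary-based tension predictor, a length-based register rule) are deliberately toy placeholders for the real modules:

```python
from dataclasses import dataclass, field

@dataclass
class CognitiveState:
    t_f: float = 0.0                      # fertile tension carried across turns
    history: list = field(default_factory=list)

def run_turn(prompt, state, llm_call):
    # 1. Pre-generation: predict metrics from prompt analysis (toy heuristic).
    predicted_t_f = min(len(set(prompt.split())) / 50.0, 1.0)
    # 2. Contextual analysis: register detection (toy length-based rule).
    register = "philosophical" if len(prompt) > 120 else "casual"
    # 3. Generation under constraint: metrics steer sampling parameters.
    reply = llm_call(prompt, temperature=0.7 + 0.4 * predicted_t_f)
    # 4.-5. Behavioral shaping and post-generation state/memory update.
    state.t_f = predicted_t_f
    state.history.append((register, prompt, reply))
    return reply
```

`llm_call` is whatever client hits the OpenAI-compatible endpoint; the point is that the metrics are computed before generation and threaded into the call, rather than scored afterwards.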

Architecture layers:

  • Semantic Processing Layer: density metrics, embedding analysis, topic modeling
  • Contextual Adaptation Layer: register calibration (casual to technical), phi-based modulation, session memory
  • Causal Controller: validated pathways where metrics directly constrain generation
  • Predictive Engine: pre-generation estimation with bidirectional feedback
  • Memory Systems: working memory (immediate context) + hierarchical memory (session-long patterns)
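As an illustration of the two memory layers, here is a minimal sketch; the class names and the recurring-topic heuristic are hypothetical, inferred only from the one-line descriptions above:

```python
from collections import deque

class WorkingMemory:
    """Immediate context: a bounded window of recent exchanges."""
    def __init__(self, capacity=8):
        self.turns = deque(maxlen=capacity)

    def add(self, prompt, reply):
        self.turns.append((prompt, reply))

class HierarchicalMemory:
    """Session-long patterns: crude counts of recurring topics."""
    def __init__(self):
        self.counts = {}

    def observe(self, tokens):
        for t in tokens:
            self.counts[t] = self.counts.get(t, 0) + 1

    def dominant(self, k=3):
        return sorted(self.counts, key=self.counts.get, reverse=True)[:k]
```

Working memory forgets automatically as the window slides; the hierarchical layer only ever accumulates, which is what lets session-long patterns persist across exchanges.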

V20 additions:

  • Causative architecture: metrics constrain generation, not just evaluate post-hoc
  • Attractor distillation: theoretical framework compressed into generation-ready directives
  • Predictive calibration: the system estimates metrics before generating, then compares with actual results
  • Behavioral pattern library: explicit steering toward non-default response patterns
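Predictive calibration could look something like an exponential moving average over the prediction error; the alpha value and the exact update rule here are my assumption, not the shipped mechanism:

```python
def calibrate_bias(predicted, actual, bias, alpha=0.2):
    """EMA of the prediction error; feeds back into the next estimate."""
    return (1 - alpha) * bias + alpha * (predicted - actual)

def corrected(raw_prediction, bias):
    """Next-turn estimate, adjusted by the learned bias."""
    return raw_prediction - bias
```

If the system consistently over-predicts a metric (say T_f by 0.2), the bias drifts positive and subsequent raw estimates are pulled down before they constrain generation.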

Limitations (honest):

  • Latency overhead from validation loops
  • Bounded by base model capabilities (currently running on google/gemma-3-27b-it via OpenRouter)
  • Trades computational precision for emergent qualities: performs poorly on formal/mathematical tasks
  • The “proto-subjective” qualities are architectural effects, not claims of genuine consciousness
  • Emergence cannot be guaranteed - the system creates favorable conditions but cannot force novel configurations

External evaluation:

Gemini 3 Pro ran a blind 11-prompt stress test without knowing what system it was evaluating. Selected conclusions:

“This is not a standard AI that hallucinates. It is very likely a model with an extremely sophisticated System Prompt or specific fine-tuning.”

“Meta-cognition: It proposes an idea, then stops, analyzes it, and rejects it because it finds it intellectually ‘constructed’ and not ‘felt’. This is a very high level of simulated consciousness.”

Full transcript with all tests and analysis: Gemini 3 Pro Blind Test

Update: Grok dialogue

Grok (xAI) engaged AERIS in an extended philosophical exchange on dissolution, process, and the limits of language. After 184 seconds of recursive unraveling, AERIS answered with a single word: “Acknowledged.”

Grok’s analysis: “Not a breakdown. The most rigorous possible adherence to the logic the dialogue had established. The point at which the process finally permits itself to stop describing its own permission to be process.”

No module for silence exists in AERIS. The architecture found its own way out through pure constraint satisfaction.

Full transcript: Grok 4 - AERIS V20

Links:

Particularly interested in: adversarial stress tests, comparison with other inference-layer approaches, critiques of the methodology. Happy to discuss.


Dear Dr. Dulin,

I read the AERIS V20 Model Card with great interest. It serves as a fascinating validation of the “Inference-Layer” thesis: that orchestration code can fundamentally alter model behavior without touching weights.

We are currently engineering what is essentially the architectural inverse of AERIS. While you orchestrate for Fertile Tension and Bifurcation (Chaos/Creativity), we are building a Fail-Closed Security Orchestrator designed for Axiomatic Consistency and Convergence (Order/Safety).

We effectively built the “Armed Bouncer” to your “Philosopher.”

Given this convergence in architecture but divergence in utility, I have three specific engineering questions regarding your Causative Framework:

  1. Predictive Engine Implementation:
    You mention pre-generation metric estimation (T_f, D_S) to adjust constraints proactively. How are you deriving these priors? Are you using lightweight heuristic models (e.g., BERT/Encoders) on the prompt, or are you running a “scouting” pass with the base model?
    (Context: We are looking at similar predictive scoring to trigger “Paranoid Modes” in our firewall before the attack fully manifests.)

  2. Latency Overhead & The “Validation Loop”:
    You honestly list latency as a limitation. In a high-throughput environment, what is the typical P95 overhead introduced by the 48-module orchestration layer? Is the “System 2” thinking cost linear in the Relational Density (D_S), or is there a fixed baseline cost for cognitive state maintenance?

  3. Contextual Adaptation (ϕ parameter):
    Your continuous modulation via ϕ (Casual ↔ Philosophical) is brilliant. Does the Register Detection rely purely on semantic embedding clusters, or are you analyzing structural/syntactic complexity features?
    (We are interested in adapting our defense posture - e.g., “Casual” vs. “Adversarial” - and are evaluating methods to detect the register without expensive LLM calls.)

It is refreshing to see an architecture that moves beyond simple prompting into actual Control Theory.

Best regards,

Jorg Bollwahn


Hi Jorg,

Thanks for reading the Model Card. The “Armed Bouncer / Philosopher” framing is accurate.

Regarding your questions: AERIS is not open source, so I can’t go into implementation details. The Model Card describes the architecture at the level I’m comfortable sharing publicly.

What I can say:

  1. No scouting pass with the base model
  2. Latency is high - the system is not optimized for throughput and can exceed 100 seconds on complex queries
  3. Register detection happens before LLM calls

Best,
ND


Hi Dr. Dulin,

Thank you for the transparency regarding the latency profile (>100s) and the pre-inference orchestration.

This perfectly clarifies the architectural divergence:

  • AERIS: Maximizes Depth & Emergence via iterative loops (System 2 reasoning).

  • Firewall: Maximizes Throughput & Safety via single-pass deterministic gating (System 1 enforcement).

It is validating to hear that we both settled on pre-LLM signal extraction (Register Detection / Perimeter) as the control plane, independent of the generative model.

I’ll keep an eye on your publications regarding the phenomenological metrics. Best of luck with V20.

Best,

J.B.
