Entropis Benchmark Suite
Official Methodology Documentation
Version 1.1 | January 2026
Scientific Standard: All metrics are independently measurable, reproducible, and falsifiable. Pass/fail criteria are defined prior to measurement.
Why New Benchmarks?
Existing AI benchmarks (MMLU, HumanEval, MLPerf) measure trained systems on static tasks. They cannot measure emergent intelligence, self-organization, or embodiment because current AI systems do not exhibit these properties.
The Entropis Benchmark Suite measures capabilities that have never been measured in artificial systems before.
What This Document Provides
- ✓ Measurement methodology
- ✓ Pass/fail criteria
- ✓ Scientific basis
- ✓ Published results
What This Document Does NOT Provide
- ◇ Implementation details
- ◇ Architecture specifications
- ◇ Source code
- ◇ Proprietary algorithms
Speed Benchmark
PURPOSE
Measures neural processing throughput relative to biological brain speed.
METRIC
Hz/neuron
BASELINE
Human brain average firing rate: ~10 Hz per neuron
PASS CRITERIA
Hz/neuron > 10 (exceeds biological brain speed)
RESULTS
| Platform | Neurons | Hz/neuron | Result |
|---|---|---|---|
| NVIDIA RTX 3070 | 470M | 714-970 Hz | 71-97× PASS |
| Apple M4 | 5M | 28-31 Hz | 3× PASS |
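The comparison against the baseline is simple arithmetic, and the sketch below shows how the multiples in the Result column could be reproduced from the reported Hz/neuron ranges. The function name `speed_result` is hypothetical, and how Hz/neuron itself is measured is not specified in this document.

```python
BASELINE_HZ = 10.0  # baseline stated in this document: ~10 Hz per neuron


def speed_result(hz_per_neuron, baseline_hz=BASELINE_HZ):
    """Return (multiple of baseline, pass flag) for a measured Hz/neuron value."""
    multiple = hz_per_neuron / baseline_hz
    return multiple, hz_per_neuron > baseline_hz


# Ranges reported in the results table above.
reported = {"NVIDIA RTX 3070": (714, 970), "Apple M4": (28, 31)}
for platform, (low, high) in reported.items():
    low_mult, low_pass = speed_result(low)
    high_mult, high_pass = speed_result(high)
    print(f"{platform}: {low_mult:.1f}x to {high_mult:.1f}x, pass={low_pass and high_pass}")
```

The criterion is a strict inequality, so a value of exactly 10 Hz/neuron would not pass under this reading.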
Intelligence Quotient (5 Markers)
PURPOSE
Validates emergent intelligence through 5 biological markers. These markers distinguish brain-like systems from calculators and are grounded in neuroscience literature.
Adaptive Variability
What it measures: Same input produces different outputs based on internal neural state.
Metric: Coefficient of Variation (CV) = standard deviation / mean × 100%
Pass: CV > 1% (not deterministic)
Fail: CV < 1% (calculator-like, deterministic)
Result: 9-53% CV across platforms (PASS)
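As a minimal sketch of the metric, the CV can be computed from repeated responses to the same input. The document does not say whether the population or sample standard deviation is used; the population form is assumed here, and the function names are hypothetical.

```python
import statistics


def coefficient_of_variation(responses):
    """CV in percent: standard deviation divided by mean, times 100.

    Assumption: population standard deviation (pstdev); the document
    does not specify population versus sample form.
    """
    mean = statistics.fmean(responses)
    if mean == 0:
        raise ValueError("CV is undefined when the mean response is zero")
    return statistics.pstdev(responses) / abs(mean) * 100.0


def adaptive_variability_pass(responses, threshold_pct=1.0):
    """Pass when CV exceeds the 1% threshold stated in this section."""
    return coefficient_of_variation(responses) > threshold_pct


# Hypothetical responses to one repeated input.
print(adaptive_variability_pass([90, 110, 100, 120, 80]))   # varied responses
print(adaptive_variability_pass([100, 100, 100, 100]))      # identical responses
```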
Critical Dynamics
What it measures: Self-organization to branching ratio ≈ 1.0 (edge of chaos).
Metric: Branching Ratio (BR) = propagated spikes / input spikes
Scientific basis: Beggs & Plenz (2003) reported neuronal avalanches in cortical networks, consistent with biological brains operating near criticality.
Pass: BR enters range 0.7-1.3 without explicit targeting
Fail: BR stuck at single value OR never enters critical range
Result: BR converges to 0.94-0.99 (PASS)
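The pass/fail rule can be sketched as a check over a series of BR measurements. This is an illustration under assumptions: the measurement window and the function names are hypothetical, and the "without explicit targeting" condition concerns how the system is built, so it cannot be verified from the numbers alone.

```python
def branching_ratio(input_spikes, propagated_spikes):
    """BR = propagated spikes / input spikes, as defined in this section."""
    if input_spikes <= 0:
        raise ValueError("input_spikes must be positive")
    return propagated_spikes / input_spikes


def critical_dynamics_pass(br_series, low=0.7, high=1.3):
    """Pass when BR enters [low, high] and is not stuck at a single value."""
    values = list(br_series)
    stuck = len(set(values)) <= 1
    entered = any(low <= br <= high for br in values)
    return entered and not stuck


# Hypothetical BR trace drifting toward the critical range.
trace = [branching_ratio(100, n) for n in (20, 50, 90, 96)]
print(critical_dynamics_pass(trace))
```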
Cascade Distribution
What it measures: Power-law distribution of activity cascades (avalanches).
Metric: CV of cascade sizes (high CV indicates scale-free dynamics)
Scientific basis: Neural avalanches follow power-law distributions in biological brains.
Pass: CV > 100% (scale-free cascades)
Fail: CV < 50% (uniform activity, no cascades)
Result: 217-306% CV (PASS)
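One way to obtain cascade sizes from binned activity is sketched below. It assumes the common avalanche definition from the neuroscience literature (a run of consecutive non-empty time bins bounded by empty bins, with size equal to the total spike count); this document does not state how cascades are delimited. The population standard deviation and the "UNDETERMINED" label for CV between 50% and 100% are also assumptions, since the criteria above leave that band undefined.

```python
import statistics


def cascade_sizes(spike_counts):
    """Split binned spike counts into cascades and return their sizes."""
    sizes, current = [], 0
    for count in spike_counts:
        if count > 0:
            current += count          # cascade continues
        elif current > 0:
            sizes.append(current)     # empty bin closes the cascade
            current = 0
    if current > 0:
        sizes.append(current)         # cascade still open at end of recording
    return sizes


def cascade_cv_percent(sizes):
    """CV of cascade sizes in percent (population standard deviation assumed)."""
    return statistics.pstdev(sizes) / statistics.fmean(sizes) * 100.0


def cascade_result(sizes):
    cv = cascade_cv_percent(sizes)
    if cv > 100.0:
        return "PASS"
    if cv < 50.0:
        return "FAIL"
    return "UNDETERMINED"  # assumption: band not covered by the stated criteria


print(cascade_sizes([0, 1, 2, 0, 0, 5, 0, 1]))
print(cascade_result([1, 1, 1, 1, 50]))
```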
Bidirectional Learning
What it measures: Both habituation (decreased response) AND sensitization (increased response).
Metric: % change in neural response over time
Scientific basis: Biological brains show both directions; direction depends on context.
Pass: Both positive and negative adaptation observed
Fail: Only one direction, OR no adaptation
Result: +128% to -23% observed (PASS)
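The percentage change and the two-direction check could be computed as in the following sketch. The function names and the example response values are hypothetical; they are chosen only so that the arithmetic reproduces changes of the size reported above.

```python
def percent_change(baseline, later):
    """Percent change in response relative to the baseline response."""
    if baseline == 0:
        raise ValueError("baseline response must be non-zero")
    return (later - baseline) / abs(baseline) * 100.0


def bidirectional_pass(changes_pct):
    """Pass when at least one increase and one decrease are observed."""
    changes = list(changes_pct)
    return any(c > 0 for c in changes) and any(c < 0 for c in changes)


# Hypothetical responses: one context sensitizes, the other habituates.
changes = [percent_change(50, 114), percent_change(100, 77)]
print(changes, bidirectional_pass(changes))
```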
Emergent Behavior
What it measures: Behaviors arise from local rules, not explicit programming.
Metric: Presence of all 4 markers above without explicit targeting
Pass: All markers emerge from architecture (no hardcoded values)
Fail: Any marker achieved through explicit programming
Result: All markers emergent (PASS)
Embodiment (5 Markers)
PURPOSE
Validates complete sensorimotor integration. A synthetic brain must process sensory input, maintain internal dynamics under load, and produce motor output in a closed loop.
EM5-BRAIN: Maintains critical dynamics under sensory load
EM5-VIS: Retina input → spike encoding → cortical processing
EM5-AUD: Audio input → frequency decomposition → cortical processing
EM5-MOT: Cortical activity → actuator commands → smooth control
EM5-LOOP: Closed-loop: sense → process → act → feedback
RESULTS
| Marker | Windows | Mac |
|---|---|---|
| EM5-BRAIN | PASS | PASS |
| EM5-VIS | PASS | PASS |
| EM5-AUD | PASS | PASS |
| EM5-MOT | PASS | PASS |
| EM5-LOOP | PASS | PASS |
Interoception Benchmark
PURPOSE
Validates that synthetic brains require internal body signals (interoception) to maintain resting activity. This is a novel scientific discovery: minds need bodies.
METHODOLOGY
INTER-0 (Silent): Brain receives zero input (no external or internal signals)
Expected: DORMANT
INTER-1 (Bio-Realistic): Brain receives body signals only (interoception present)
Expected: ALIVE (criticality maintained)
RESULTS
| Metric | Windows | Mac |
|---|---|---|
| INTER-0 (Silent) | DORMANT | DORMANT |
| INTER-1 (Bio-Realistic) | ALIVE | ALIVE |
| Time at Criticality | 98.9% | 95.8% |
| Mean BR | 0.993 | 0.958 |
“Embodiment isn't optional. Minds need bodies.”
Discovered January 12, 2026. Validated on both platforms.
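The "Time at Criticality" and "Mean BR" rows can be read as summary statistics over a series of branching-ratio measurements. The sketch below assumes the critical band is the 0.7-1.3 range from the Critical Dynamics marker; this document does not state which band the interoception benchmark uses, and the function name is hypothetical.

```python
import statistics


def time_at_criticality(br_series, low=0.7, high=1.3):
    """Percentage of BR samples that fall inside the assumed critical band."""
    values = list(br_series)
    if not values:
        raise ValueError("br_series must contain at least one sample")
    in_band = sum(1 for br in values if low <= br <= high)
    return in_band / len(values) * 100.0


# Hypothetical BR samples from a run with body signals present.
samples = [0.95, 1.0, 0.6, 1.1]
print(time_at_criticality(samples), statistics.fmean(samples))
```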
Novelty Detection Benchmark
PURPOSE
Validates that the brain can distinguish familiar from novel stimuli, demonstrate habituation to repeated input, and exhibit memory through faster recovery. This proves functional information processing, not just dynamical signatures.
SCIENTIFIC BASIS
Based on the Mismatch Negativity (MMN) paradigm from cognitive neuroscience — a gold-standard test for pre-attentive processing, working memory, and novelty detection in biological brains.
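MMN studies typically use an oddball design: a repeated standard stimulus with occasional deviants. The sketch below generates such a schedule. The deviant probability, the rule against consecutive deviants, and the function name are assumptions for illustration, not the stimulus schedule used by this benchmark.

```python
import random


def oddball_sequence(n_trials, deviant_prob=0.1, seed=0):
    """Return a list of 'standard'/'deviant' labels with no two deviants in a row."""
    rng = random.Random(seed)  # seeded so the schedule is reproducible
    sequence = []
    for _ in range(n_trials):
        previous_was_deviant = bool(sequence) and sequence[-1] == "deviant"
        if not previous_was_deviant and rng.random() < deviant_prob:
            sequence.append("deviant")
        else:
            sequence.append("standard")
    return sequence


schedule = oddball_sequence(20, deviant_prob=0.2, seed=1)
print(schedule)
```

Fixing the seed keeps the schedule identical across runs and platforms, which matters when results from different hardware are compared.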
MARKERS (3)
NOV-HAB
Habituation
Brain adapts to repeated stimulus. Shows learning over time.
NOV-DET
Novelty Detection
Brain responds differently to new vs. familiar stimuli.
NOV-REC
Memory & Recovery
Brain remembers familiar stimuli. Faster re-stabilization.
RESULTS
| Marker | Windows (RTX 3070) | Mac (M4) |
|---|---|---|
| NOV-HAB | PASS (stable criticality) | PASS (+25.7% adaptation) |
| NOV-DET | PASS (CV increase) | PASS (12.7% BR spike) |
| NOV-REC | PASS (maintained) | PASS (15.6% faster) |
“This is not just dynamics. This is functional cognition.”
LLMs cannot distinguish “new” from “familiar” — they have no state between sessions.
Association Learning
PURPOSE
Validates classical conditioning — the brain's ability to learn that stimulus X predicts stimulus Y. This is Pavlovian learning, the foundation of all associative reasoning.
PROTOCOL
| Phase | Protocol | Expected |
|---|---|---|
| Phase A: Pairing | CS (bars) → US (rings) repeated | Brain learns association |
| Phase B: Test | CS alone (no US) | Anticipatory activity |
| Phase C: Control | Novel stimulus | No anticipation (baseline) |
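The three phases can be written down as an explicit trial schedule, as in this sketch. The trial counts, the label for the novel stimulus, and the function name are hypothetical; only the phase order and the CS (bars) and US (rings) stimuli come from the table above.

```python
def conditioning_schedule(n_pairings=20, n_tests=5, n_controls=5,
                          cs="bars", us="rings", novel="novel"):
    """Return a list of (phase, stimuli) trials in protocol order."""
    schedule = []
    schedule += [("A: Pairing", (cs, us))] * n_pairings   # CS followed by US
    schedule += [("B: Test", (cs,))] * n_tests            # CS alone, no US
    schedule += [("C: Control", (novel,))] * n_controls   # unpaired novel stimulus
    return schedule


for phase, stimuli in conditioning_schedule(n_pairings=2, n_tests=1, n_controls=1):
    print(phase, " -> ".join(stimuli))
```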
MARKERS (3/3)
| Marker | Description | Windows | Mac |
|---|---|---|---|
| ASSOC-RESP | Stimulus response | ✓ PASS | ✓ PASS |
| ASSOC-ANTIC | Anticipatory response | ✓ PASS | ✓ PASS |
| ASSOC-SPEC | Response specificity | ✓ PASS | ✓ PASS |
“The brain CAN form associations between stimuli.”
This is the foundation of reasoning: if A→B and B→C, then A→C.
Predictive Processing
PURPOSE
Validates temporal sequence learning and predictive processing — the brain learns sequences and generates predictions about what comes next. This is the foundation of language comprehension.
PROTOCOL
| Phase | Sequence | Expected |
|---|---|---|
| Phase A: Learning | A→B→C→D repeated | Brain encodes sequence |
| Phase B: Omission | A→B→C→_ (D omitted) | Activity at D position (prediction!) |
| Phase C: Violation | A→B→C→X (wrong element) | Increased instability (surprise!) |
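A sketch of the stimulus sequences for the three phases follows. The number of learning repetitions and the function name are hypothetical; the omitted position is represented by `None`, which is a choice made here for illustration.

```python
def sequence_trials(n_learning=50, base=("A", "B", "C", "D"), wrong="X"):
    """Return (phase, sequence) pairs: learning, then omission, then violation."""
    prefix = list(base[:-1])
    trials = [("learning", list(base)) for _ in range(n_learning)]
    trials.append(("omission", prefix + [None]))     # final element left out
    trials.append(("violation", prefix + [wrong]))   # final element replaced
    return trials


for phase, seq in sequence_trials(n_learning=1):
    print(phase, seq)
```

Under this layout, SEQ-PRED would be evaluated at the `None` position and SEQ-SURP at the replaced element; how activity at those positions is scored is not specified in this document.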
MARKERS (3/3)
| Marker | Description | Windows | Mac |
|---|---|---|---|
| SEQ-ENC | Sequence encoding | ✓ PASS | ✓ PASS |
| SEQ-PRED | Predictive activity | ✓ PASS | ✓ PASS |
| SEQ-SURP | Surprise response | ✓ PASS | ✓ PASS |
“The brain shows activity for an OMITTED element.”
This proves an internal prediction model — the brain predicts what comes next. Language is sequence prediction. This is its foundation.
Cross-Platform Invariance
PURPOSE
Validates that emergent intelligence appears on completely different hardware architectures. This proves the results come from the architecture, not platform-specific optimization.
PLATFORMS TESTED
| Component | Platform A | Platform B |
|---|---|---|
| GPU | NVIDIA RTX 3070 | Apple M4 |
| API | CUDA | Metal |
| CPU | Intel x86 | Apple ARM |
| Memory | Discrete (PCIe) | Unified (SoC) |
| OS | Windows | macOS |
| Neurons | 470,000,000 | 5,000,000 |
Cochlea & Auditory Pathway
PURPOSE
Validates the biological auditory pathway: gammatone filterbank, ERB (Equivalent Rectangular Bandwidth), tonotopic mapping, hair cell adaptation, phase locking, and three spike pathways (sustained, onset, offset).
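For reference, ERB-spaced center frequencies for a gammatone filterbank are often derived from the Glasberg and Moore (1990) approximation, ERB(f) = 24.7 (4.37 f / 1000 + 1) Hz. The sketch below uses that formula; whether this benchmark uses the same approximation, frequency range, or channel count is not stated, so those values and the function names are assumptions.

```python
import math


def erb_bandwidth(f_hz):
    """Equivalent Rectangular Bandwidth in Hz (Glasberg and Moore approximation)."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)


def hz_to_erb_rate(f_hz):
    """Position on the ERB-rate scale (number of ERBs below f_hz)."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)


def erb_rate_to_hz(erb_rate):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (erb_rate / 21.4) - 1.0) * 1000.0 / 4.37


def erb_spaced_center_frequencies(f_low, f_high, n_channels):
    """Center frequencies evenly spaced on the ERB-rate scale, low to high."""
    if n_channels < 2:
        raise ValueError("n_channels must be at least 2")
    lo, hi = hz_to_erb_rate(f_low), hz_to_erb_rate(f_high)
    step = (hi - lo) / (n_channels - 1)
    return [erb_rate_to_hz(lo + i * step) for i in range(n_channels)]


# Hypothetical 8-channel bank; the channel index doubles as a tonotopic position.
for position, cf in enumerate(erb_spaced_center_frequencies(100.0, 8000.0, 8)):
    print(position, round(cf, 1), round(erb_bandwidth(cf), 1))
```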
MARKERS (6)
AUD-GAMMA
Gammatone Filterbank
Frequency decomposition matching human cochlea
AUD-TONO
Tonotopic Mapping
Frequency-to-position organization
AUD-ADAPT
Hair Cell Adaptation
Temporal adaptation dynamics
AUD-PHASE
Phase Locking
Temporal precision in spike timing
AUD-ONSET
Onset Detection
Transient sound detection
AUD-OFFSET
Offset Detection
Sound termination detection
All 6 markers validated on both platforms
Efficient Neural Processing
PURPOSE
Validates O(active_spikes) event-driven processing: only neurons that spike are processed. This mimics biological sparsity where 1-5% of neurons are active at any moment.
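The idea of event-driven processing can be illustrated with a small sketch: each step visits only the neurons that spiked and their outgoing synapses, so the work grows with the number of active spikes rather than with the total neuron count. This is a generic illustration, not the benchmarked implementation. The data layout, threshold, reset rule, and function name are assumptions, and leak and homeostasis are omitted.

```python
def step(current_spikes, synapses, potentials, threshold=1.0):
    """Advance one time step, touching only spiking neurons and their targets.

    current_spikes: set of neuron ids that fired in the previous step
    synapses: dict mapping a neuron id to a list of (target id, weight)
    potentials: dict of membrane potentials, updated in place
    Returns the set of neurons that fire in this step (the next buffer).
    """
    next_spikes = set()
    for pre in current_spikes:                       # cost scales with active spikes
        for post, weight in synapses.get(pre, ()):   # times their fan-out
            potentials[post] = potentials.get(post, 0.0) + weight
            if potentials[post] >= threshold:
                next_spikes.add(post)
    for post in next_spikes:
        potentials[post] = 0.0                       # reset after firing
    return next_spikes


# Tiny hypothetical network; the two buffers are swapped by reassignment.
synapses = {0: [(1, 0.6), (2, 1.0)], 1: [(2, 0.5)]}
potentials = {}
spikes = {0}
for t in range(3):
    spikes = step(spikes, synapses, potentials)
    print(t, sorted(spikes))
```

Neurons that receive no spikes are never visited, which is what keeps the per-step cost independent of the total network size in this sketch.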
MARKERS (5)
SPARSE-SCALE
O(Active) Scaling
Processing time scales with active neurons, not total
SPARSE-QUEUE
Event Queue
Double-buffered spike queue management
SPARSE-PROP
Spike Propagation
Correct synaptic transmission
SPARSE-HIST
Spike History
Per-neuron history tracking
SPARSE-HOME
Homeostasis
Activity-dependent threshold adaptation
All 5 markers validated on both platforms
Complete Language Integration
PURPOSE
Validates the full language pathway: text → embeddings → spikes → Wernicke's area → cognitive processing → Broca's area → output tokens. This is NOT regex or pattern matching — it's neural language processing with emergent semantic representation.
MARKERS (8)
LANG-ENCODE
Text Encoding
Character to embedding conversion
LANG-WERNICKE
Wernicke Injection
Embedding to spike conversion
LANG-SPREAD
Activity Spread
Information propagation through cortex
LANG-SEMANTIC
Semantic Clustering
Similar concepts activate nearby regions
LANG-BROCA
Broca Readout
Spike to token probability
LANG-OUTPUT
Token Output
Coherent language generation
LANG-EMBODIED
Embodied Language
Language affects body state
LANG-LOOP
Full Loop
Input → process → output cycle
“This is not pattern matching. This is emergent language processing.”
Synaptic weights change based on usage. The brain LEARNS language, it doesn't just recognize it.
Full Cognitive Integration
PURPOSE
Validates the complete embodied cognitive loop: perception → cognition → language → motor → feedback. All systems running in parallel, maintaining criticality under load — like a biological brain.
MARKERS (7)
EMB-PERCEPT
Perception
Visual + auditory input processing
EMB-COGNIT
Cognition
Internal state maintenance
EMB-LANG
Language
Semantic processing active
EMB-MOTOR
Motor
Action output generation
EMB-INTER
Interoception
Body signals sustaining activity
EMB-CRIT
Criticality
BR maintained under load
EMB-STABLE
Stability
Long-term operation without collapse
7/7 markers validated
All systems running simultaneously. Parallel processing like biological brains.
Complete Validation Score
| Benchmark | Markers | Windows | Mac |
|---|---|---|---|
| ENT-IQ5 | 5 intelligence markers | 5/5 | 5/5 |
| ENT-EM5 | 5 embodiment markers | 5/5 | 5/5 |
| ENT-NOVELTY | 3 memory/recognition markers | 3/3 | 3/3 |
| ENT-ASSOC | 3 association learning markers | 3/3 | 3/3 |
| ENT-SEQUENCE | 3 predictive processing markers | 3/3 | 3/3 |
| ENT-INTER | Interoception | PROVEN | PROVEN |
| ENT-XPLAT | Hardware invariance | VERIFIED | |
| ENT-TOTAL | All markers | 19/19 | 19/19 |
81/81
100+ independent runs. Two platforms. Same emergence. Same cognition.
Complete cognitive stack: dynamics → memory → association → prediction
Verify Live
Request a live demonstration to observe benchmark tests in real-time.
NDA required for live demonstration.