IIT Emergence

Planned
IIT · Network Topology · Information Theory

Hypothesis

Neural network architectures with higher integrated information (Phi) will exhibit qualitatively different learning dynamics, representations, and failure modes compared to architectures with equivalent parameter counts but lower Phi — even when both achieve similar task performance.

Overview

This is our most direct test of Integrated Information Theory's predictions. IIT claims that consciousness corresponds to a system's capacity for integrated information — a mathematical quantity (Phi) that depends on the causal structure of the system, not just its function.

A strong version of this claim predicts that two systems with identical input-output behavior but different internal causal structures will have different levels of consciousness. This is controversial — it means that functional equivalence doesn't imply experiential equivalence.

We can test a weaker but still informative version: do high-Phi architectures *behave* differently from low-Phi architectures, even when matched for capacity? IIT doesn't strictly predict this — but if consciousness is computationally relevant (our core hypothesis), then architectures with more integrated information processing should show qualitative differences in how they learn, represent, and generalize.

This experiment bridges the gap between IIT's mathematical formalism and empirical AI research, providing data points that both communities need.

Methodology

  1. Design pairs of neural architectures matched for parameter count but differing in information integration topology (e.g., modular vs. globally integrated, feedforward vs. recurrent).
  2. Compute theoretical Phi for each architecture's topology before training. Since exact Phi is computationally intractable for all but very small systems, this will in practice mean a tractable approximation or proxy measure.
  3. Train all architectures on identical tasks and compare: learning curves, internal representations (via probing), generalization behavior, and failure modes.
  4. Test the prediction that high-Phi architectures will generalize differently — more robustly to distributional shift, more flexibly to novel task variants.
  5. Analyze whether high-Phi architectures develop more compositional, structured internal representations compared to low-Phi architectures that achieve equivalent performance.
  6. Document cases where high-Phi and low-Phi architectures achieve the same accuracy but through qualitatively different computational strategies.
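To make steps 1–2 concrete, here is a minimal sketch of the kind of comparison involved. The helper names (`phi_proxy`, `make_adj`) and the specific proxy are illustrative assumptions, not the experiment's actual protocol: instead of full IIT Phi, the sketch uses a cheap graph-theoretic stand-in — the weakest normalized cross-partition coupling over all bipartitions, echoing the "minimum information partition" idea — applied to two topologies matched for edge count.

```python
import itertools
import numpy as np

def phi_proxy(adj):
    """Toy integration proxy (NOT full IIT Phi): the minimum, over all
    bipartitions, of cross-partition coupling normalized by the smaller
    part's size. Low values mean the network nearly decomposes."""
    n = adj.shape[0]
    best = float("inf")
    for k in range(1, n // 2 + 1):
        for part in itertools.combinations(range(n), k):
            a = list(part)
            b = [i for i in range(n) if i not in part]
            # coupling crossing the partition, counted in both directions
            cross = adj[np.ix_(a, b)].sum() + adj[np.ix_(b, a)].sum()
            best = min(best, cross / min(len(a), len(b)))
    return best

def make_adj(n, edges):
    """Symmetric 0/1 adjacency matrix from an undirected edge list."""
    adj = np.zeros((n, n))
    for i, j in edges:
        adj[i, j] = adj[j, i] = 1.0
    return adj

# Modular topology: two 4-node cliques joined by one bridge (13 edges).
modular_edges = (
    [(i, j) for i in range(4) for j in range(i + 1, 4)]
    + [(i, j) for i in range(4, 8) for j in range(i + 1, 8)]
    + [(3, 4)]
)

# Integrated topology: ring plus global shortcuts, same count (13 edges).
integrated_edges = (
    [(i, (i + 1) % 8) for i in range(8)]        # ring
    + [(0, 4), (1, 5), (2, 6), (3, 7), (0, 2)]  # cross links
)

mod = make_adj(8, modular_edges)
itg = make_adj(8, integrated_edges)
assert len(modular_edges) == len(integrated_edges) == 13

print(phi_proxy(mod))  # bridge bipartition dominates: low integration
print(phi_proxy(itg))  # no weak bipartition exists: higher integration
```

Despite identical node and edge counts, the modular network scores far lower than the globally wired one, because cutting its single bridge nearly separates it into independent halves. The same matched-budget logic, applied to weight matrices and recurrence patterns of trained networks rather than toy graphs, is what steps 1–2 require.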

Status

Planned

This experiment is in the design phase. We are currently developing the detailed experimental protocol, identifying collaborators, and securing compute resources.
