Research notebook · Project 138

Toward a computable measure of semantic content.

Shannon measured information. He explicitly excluded meaning. This lab is one attempt to close that gap: empirical, adversarial, in public.

"These semantic aspects of communication are irrelevant to the engineering problem."
— Claude Shannon, 1948

28,000+ data points
7 experiments
4 critic reviews
1 metric library
The lineage

From cryptanalysis, to computation, to meaning.

Eleven centuries of people who tried to formalize what messages mean. Each node is a problem-of-its-time finally written down. The last node is the gap we are trying to fill.

Al-Kindi (850)
Llull (1275)
Trithemius (1499)
Dee (1581)
Leibniz (1679)
Babbage (1830s)
Turing (1940s)
Shannon (1948)
The gap (2026)
Experiments

Active research threads.

Each card is a falsifiable question with concrete numbers. Status updates as data arrives — no mock results, no hand-waving.

Experiment 01 / 01B

Semantic entropy

Complete

Can Shannon entropy distinguish meaning?

Pairs tested (PAWS scale-up): 8,000
Shannon AUC: 0.48
Semantic AUC: 0.63
Key finding: Shannon entropy performs at chance level for meaning detection (AUC 0.48). Semantic embeddings do better at AUC 0.63. The result supports the hypothesis that entropy does not measure meaning.
Figure: Shannon vs semantic AUC across 8 PAWS bins
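
A minimal sketch of why an entropy-only score can sit at chance on PAWS-style pairs. It assumes word-level entropy and an "entropy gap" score; the lab's actual tokenization and scoring rule are not stated here, and the four pairs below are hypothetical stand-ins, not PAWS data. The point it illustrates: reordering words changes the meaning but leaves the token distribution, and therefore the entropy, untouched.

```python
import math
from collections import Counter

def shannon_entropy(text: str) -> float:
    """Entropy (bits per token) of the empirical word distribution of `text`."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = len(tokens)
    # Sort the counts so that permuted sentences sum in the same order
    # and produce bit-identical floats.
    return -sum((c / total) * math.log2(c / total) for c in sorted(counts.values()))

def auc(scores, labels):
    """ROC AUC as P(positive score > negative score); ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical pairs: (sentence_a, sentence_b, is_paraphrase)
pairs = [
    ("flights from new york to florida", "flights from florida to new york", 0),
    ("the cat chased the dog", "the dog chased the cat", 0),
    ("she gave him the book yesterday", "yesterday she gave him the book", 1),
    ("the river runs north of the town", "north of the town the river runs", 1),
]

# Score each pair by the negative entropy gap: higher means "more alike".
scores = [-abs(shannon_entropy(a) - shannon_entropy(b)) for a, b, _ in pairs]
labels = [y for _, _, y in pairs]

entropy_auc = auc(scores, labels)
print(entropy_auc)  # 0.5: every pair has a zero entropy gap, so the score cannot rank them
```

Every pair here is a pure reordering, so all scores tie and the AUC is exactly 0.5. Real PAWS pairs are not all exact permutations, which is why a measured value such as 0.48 can drift slightly from 0.5.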
Experiment 02 / 02B

Compression-meaning correlation

Complete

Is understanding the same thing as compression?

Samples tested: 525
Compression vs magnitude: r = 0.63
Entropy vs uniqueness: r = 0.36
Best predictor: Compression ratio
Key finding: Compression predicts meaning magnitude but not direction. Poetry compresses worse but scores higher on meaning. Compression is a proxy for complexity, not semantics.
Figure: Compression r vs entropy r across sample bins
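
A sketch of the two ingredients behind a compression-vs-meaning correlation, under stated assumptions: `zlib` stands in for whatever compressor the lab used, and `compression_ratio` and `pearson_r` are illustrative names. No meaning scores are reproduced here; the reported r = 0.63 would come from calling `pearson_r` on the lab's own per-sample ratios and meaning magnitudes.

```python
import math
import zlib

def compression_ratio(text: str) -> float:
    """Compressed size divided by raw size; lower means more compressible."""
    raw = text.encode("utf-8")
    return len(zlib.compress(raw, 9)) / len(raw)

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Two illustrative inputs: a highly repetitive string and a short varied one.
repetitive = "the rain falls and the rain falls and " * 20
varied = "Quartz sphinx, judge my vow; black jinxed wizards pluck ivy from the big quilt."

ratio_repetitive = compression_ratio(repetitive)
ratio_varied = compression_ratio(varied)
print(ratio_repetitive < ratio_varied)  # True: repetition compresses far better

# Sanity check of the correlation helper on perfectly linear data.
r_linear = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

The ratio only reflects redundancy in the byte stream, which fits the card's reading that compression tracks complexity rather than semantics.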
Experiment 03 / 03B

Geometric consistency

Complete

Do meaning vectors share geometry across different AI architectures?

Analogies tested: 19,544
Procrustes alignment: 96.6%
Universal analogies: 7 / 10
Architectures: GPT / Claude / Llama / Mistral
Key finding: Procrustes alignment reaches 96.6% across 4 architectures, which suggests meaning geometry is shared rather than arbitrary. 7 of 10 analogies hold across all models.
Figure: Pairwise Procrustes scores across 4 models
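
A toy illustration of Procrustes alignment, assuming a 2-D, rotation-only setting where the optimal angle has a closed form. Real embeddings are high-dimensional and are normally aligned with an SVD-based solver (for example `scipy.linalg.orthogonal_procrustes`). The score defined below (1 minus half the residual) is an assumption for this sketch, not necessarily how the 96.6% figure was computed, and the points are hypothetical.

```python
import math

def normalize(points):
    """Center a 2-D point set at the origin and scale it to unit Frobenius norm."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    centered = [(x - cx, y - cy) for x, y in points]
    norm = math.sqrt(sum(x * x + y * y for x, y in centered))
    return [(x / norm, y / norm) for x, y in centered]

def procrustes_2d(source, target):
    """Best rotation of `source` onto `target`; returns (angle, alignment score in [0, 1])."""
    src, tgt = normalize(source), normalize(target)
    # Maximizing sum(t . R s) over rotations R gives theta = atan2(b, a).
    a = sum(sx * tx + sy * ty for (sx, sy), (tx, ty) in zip(src, tgt))
    b = sum(sx * ty - sy * tx for (sx, sy), (tx, ty) in zip(src, tgt))
    theta = math.atan2(b, a)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    rotated = [(cos_t * x - sin_t * y, sin_t * x + cos_t * y) for x, y in src]
    residual = sum((rx - tx) ** 2 + (ry - ty) ** 2
                   for (rx, ry), (tx, ty) in zip(rotated, tgt))
    return theta, 1.0 - residual / 2.0

# Hypothetical "model A" layout of four concepts.
model_a = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.5, 3.0)]

# "Model B": the same layout rotated by 30 degrees and shifted.
angle = math.radians(30)
model_b = [(math.cos(angle) * x - math.sin(angle) * y + 5.0,
            math.sin(angle) * x + math.cos(angle) * y - 2.0) for x, y in model_a]

theta, score = procrustes_2d(model_a, model_b)
print(round(math.degrees(theta), 6), round(score, 6))
```

When two spaces differ only by rotation, translation, and scale, the score comes out at 1; distortion that no rotation can undo pulls it below 1.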
Experiment 04

Consciousness spectrum

Queued

Where on the model-size axis does the phase transition occur?

Hypothesis: M(x) threshold = awareness
Models: 1B → 400B params
Status: Awaiting results
Experiment 05

Grounding test

Complete

Does physical world-contact change how meaning is organized?

Grounding advantage: −0.16
CLIP color test accuracy: 90%
Statistical significance: p = 0.008
Comparison: Text-only vs multimodal
Key finding: Grounding is qualitative, not quantitative. Multimodal models don't just "know more"; they organize their representations differently. The grounding advantage is negative (−0.16) because grounded models compress differently.
Figure: Text vs multimodal across 6 task families
Experiment 06

Metric validation

Complete

Does M(x,C) correlate with existing meaning measures?

Cosine similarity: r = 0.82
BERTScore: r = 0.58
M(x,C) metric: r = 0.58
Shannon entropy: r = 0.16
Key finding: Cosine similarity dominates at r = 0.82. M(x,C) matches BERTScore at r = 0.58. Shannon entropy is nearly uncorrelated at r = 0.16, consistent with it measuring something fundamentally different from meaning.
Figure: Cosine · BERTScore · M(x,C) · Shannon
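
For reference, the cosine-similarity baseline that tops this comparison reduces to a few lines. This is a generic sketch with hypothetical 3-d vectors; the embedding model behind the r = 0.82 figure is not specified here.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical embeddings: one parallel pair and one orthogonal pair.
parallel = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
print(round(parallel, 6), orthogonal)
```

The score depends only on direction, not vector length, so scaling an embedding leaves it unchanged.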
The frontier map

What's solved, what's in progress, where breakthroughs are waiting.

Three columns. The left is settled science: don't reinvent. The middle is partial progress, with opportunities to contribute. The right is genuinely open, the breakthrough territory we're aiming at.

9 Solved
8 Active
8 Unsolved
0/8 Our progress
Solved
Don't reinvent
Shannon entropy ≠ meaning
1948 — Shannon himself said this
Semantic information theory
1953 — Bar-Hillel & Carnap formalized it
Knowledge and information flow
1981 — Dretske bridged info to meaning
Strongly semantic information
2004 — Floridi added truthfulness requirement
Word embeddings capture similarity
2013 — Mikolov, word2vec
Bayesian surprise as meaning proxy
2009 — Itti & Baldi
Free Energy / predictive processing
2010 — Friston
Sentence embeddings + cosine similarity
2019 — Reimers & Gurevych
Cross-architecture convergence
2024 — Huh et al., Platonic Representation Hypothesis
Active
Partial progress
Mechanistic interpretability
What circuits do inside models — Anthropic, DeepMind
Sparse autoencoders for feature discovery
Anthropic 2023–2026
Hallucination detection from internal states
Multiple labs, not solved
LLM understanding vs pattern matching
Bender & Koller 2020 debate
Rate-distortion theory for semantics
Zaslavsky et al., 2018
Normalized compression distance for meaning
Cilibrasi & Vitanyi, 2005
Semantic information via physics
Kolchinsky & Wolpert, 2018
Multimodal alignment safety transfer
Open problem
Unsolved
Breakthrough territory
Hard problem of consciousness
What IS subjective experience?
Queued
Deception detection from representation geometry
Can you catch an AI lying by its embeddings?
Queued
Cross-lingual meaning universals
Same geometry in Chinese, Finnish, Arabic?
Queued
The phase transition
Where does processing become meaning?
Queued
Grounded vs ungrounded meaning geometry
How does grounding restructure representations?
Queued
Meaning without a subject
Can M(x,C) work without consciousness?
Queued
Temporal dynamics of meaning
How does meaning change during conversation?
Queued
Adversarial robustness of semantic geometry
Can an attacker fool meaning metrics?
Queued
The equation

A proposed computable measure of semantic content.

M(x,C) is the proposed function: the meaning of expression x in context C, computed as a weighted sum over propositions p of how far x shifts the probability of p from its prior P(p|C) to its posterior P(p|C,x). Each proposition is weighted by its salience w(p).

The meaning function
$$ M(x, C) = \sum_{p} w(p) \cdot \left| P(p \mid C, x) - P(p \mid C) \right| $$
M(x, C): Meaning of expression x in context C.
x: Input expression (text, symbol, signal).
w(p): Weight of proposition p (its salience).
P(p | C, x): Posterior probability of p given x and C.
P(p | C): Prior probability of p in context C alone.
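
A worked sketch of the meaning function as a weighted sum of absolute posterior-minus-prior shifts. Everything concrete here is an assumption for illustration: the proposition names, the probabilities, and the weights are hypothetical, and how the lab estimates P(p | C) and P(p | C, x) in practice is not described in this section.

```python
def meaning(prior, posterior, weights):
    """M(x, C): sum over propositions p of w(p) * |P(p | C, x) - P(p | C)|."""
    return sum(weights[p] * abs(posterior[p] - prior[p]) for p in prior)

# Hypothetical context C: three propositions with prior beliefs P(p | C).
prior = {"it_is_raining": 0.30, "road_is_wet": 0.40, "match_is_cancelled": 0.10}

# Hypothetical posteriors P(p | C, x) after hearing x = "bring an umbrella".
posterior = {"it_is_raining": 0.90, "road_is_wet": 0.85, "match_is_cancelled": 0.25}

# Hypothetical salience weights w(p).
weights = {"it_is_raining": 1.0, "road_is_wet": 0.5, "match_is_cancelled": 2.0}

m_informative = meaning(prior, posterior, weights)
m_uninformative = meaning(prior, prior, weights)  # an expression that shifts nothing

print(round(m_informative, 3))  # 1.0*0.60 + 0.5*0.45 + 2.0*0.15 = 1.125
print(m_uninformative)          # 0.0
```

An expression that leaves every belief where it was scores zero, whatever its length or entropy. The absolute value means a shift toward or away from a proposition counts equally.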
Research log

Live experiment feed.

Append-only event stream from the lab. Every entry is timestamped (CT) and tagged with the experiment that emitted it.
