Research notebook · Project 138

Toward a computable measure of semantic content.

Shannon measured information. He explicitly excluded meaning. This lab is one attempt to close that gap: empirical, adversarial, in public.

"These semantic aspects of communication are irrelevant to the engineering problem."
— Claude Shannon, 1948



28,000+ data points


The lineage

From cryptanalysis, to computation, to meaning.

Eleven centuries of people who tried to formalize what messages mean. Each node is a problem-of-its-time finally written down. The last node is the gap we are trying to fill.

Al-Kindi · 850
Llull · 1275
Trithemius · 1499
Dee · 1581
Leibniz · 1679
Babbage · 1830s
Turing · 1940s
Shannon · 1948
The gap · 2026

Experiments

Active research threads.

Each card is a falsifiable question with concrete numbers. Status updates as data arrives — no mock results, no hand-waving.

Experiment 01 / 01B

Semantic entropy

Complete

Can Shannon entropy distinguish meaning?

Pairs tested (PAWS scale-up): 8,000

Shannon AUC: 0.48

Semantic AUC: 0.63

Key finding: Shannon entropy performs at chance level for meaning detection (AUC 0.48), while semantic embeddings reach a modest AUC of 0.63. The result supports the hypothesis that entropy alone does not capture meaning.

Shannon vs semantic AUC across 8 PAWS bins
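
One way to reproduce this kind of comparison is sketched below. It is a minimal illustration, not the lab's pipeline: `shannon_entropy`, `entropy_similarity`, and `auc` are hypothetical helper names, entropy is taken at the character level, and the pair score (negative absolute entropy difference) is an assumption. It shows why a word-order swap of the kind PAWS contains can leave an entropy-based score unchanged.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Character-level Shannon entropy in bits per character (assumed granularity)."""
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_similarity(a: str, b: str) -> float:
    """Hypothetical pair score: the closer the two entropies, the higher the score."""
    return -abs(shannon_entropy(a) - shannon_entropy(b))


def auc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. Labels are 1 (paraphrase) or 0 (not a paraphrase).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Same characters, different meaning: the entropy score cannot tell them apart.
a = "flights from New York to Florida"
b = "flights from Florida to New York"
print(entropy_similarity(a, b))  # at or extremely close to zero
```

A score that barely moves between paraphrases and non-paraphrases ranks them almost arbitrarily, which is what an AUC near 0.5 reflects. An embedding-based score would be passed to the same `auc` function for comparison.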

Experiment 02 / 02B

Compression-meaning correlation

Complete

Is understanding the same thing as compression?

Samples tested: 525

Compression vs magnitude: r = 0.63

Entropy vs uniqueness: r = 0.36

Best predictor: Compression ratio

Key finding: Compression predicts meaning magnitude but not direction. Poetry compresses worse but scores higher on meaning. Compression is a proxy for complexity, not semantics.

Compression r vs entropy r across sample bins
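
The two quantities in this card can be computed with the standard library, as in the sketch below. The definition of compression ratio (compressed bytes over raw bytes, using zlib) and the helper names are assumptions. The source does not say which compressor was used or how "meaning magnitude" was scored, so any scores passed to `pearson_r` are placeholders.

```python
import math
import zlib


def compression_ratio(text: str) -> float:
    """Compressed size divided by raw size; lower means more compressible."""
    raw = text.encode("utf-8")
    return len(zlib.compress(raw, 9)) / len(raw)


def pearson_r(xs, ys):
    """Pearson correlation coefficient; inputs must be non-constant and equal length."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Repetitive prose compresses far better than a short, dense line of verse.
texts = [
    "the cat sat on the mat " * 8,
    "Shall I compare thee to a summer's day?",
]
ratios = [compression_ratio(t) for t in texts]
print(ratios)
```

Short inputs can have a ratio above 1 because of zlib's fixed overhead, so comparisons are more meaningful on samples of similar length.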

Experiment 03 / 03B

Geometric consistency

Complete

Do meaning vectors share geometry across different AI architectures?

Analogies tested: 19,544

Procrustes alignment: 96.6%

Universal analogies: 7 / 10

Architectures: GPT / Claude / Llama / Mistral

Key finding: Procrustes alignment reaches 96.6% across the 4 architectures, which suggests that meaning geometry is largely shared rather than arbitrary. Of 10 analogies, 7 hold across all models.

Pairwise Procrustes scores across 4 models
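
For readers unfamiliar with Procrustes alignment, the sketch below shows one common formulation using NumPy. It assumes both embedding matrices describe the same items row by row and already share a dimensionality (for example after a projection step), and it reports the sum of singular values as a similarity between 0 and 1. How the lab turns alignment into a percentage is not stated in the source, so `procrustes_alignment` is a hypothetical helper, not the lab's metric.

```python
import numpy as np


def procrustes_alignment(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity in [0, 1] after the best orthogonal map from a onto b.

    a and b are (n_items, dim) matrices with matching rows (assumption).
    """
    # Center each matrix and scale it to unit Frobenius norm.
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    # Orthogonal Procrustes: the optimal map is U @ Vt from the SVD of a.T @ b,
    # and the sum of singular values equals 1 - (squared residual) / 2.
    _, s, _ = np.linalg.svd(a.T @ b)
    return float(s.sum())


rng = np.random.default_rng(0)
emb_a = rng.normal(size=(50, 5))
rotation, _ = np.linalg.qr(rng.normal(size=(5, 5)))  # random orthogonal matrix
emb_b = emb_a @ rotation  # same geometry, different coordinate frame

print(procrustes_alignment(emb_a, emb_b))                       # close to 1
print(procrustes_alignment(emb_a, rng.normal(size=(50, 5))))    # much lower
```

A score near 1 means the two spaces differ only by a rotation or reflection, which is the sense in which geometry can be called shared across models.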

Experiment 04

Consciousness spectrum

Queued

Where on the model-size axis does the phase transition occur?

Hypothesis: M(x,C) threshold = awareness

Models: 1B → 400B params

Status: Queued, awaiting experiment start

Experiment 05

Grounding test

Complete

Does physical world-contact change how meaning is organized?

Grounding advantage: −0.16

CLIP color test accuracy: 90%

Statistical significance: p = 0.008

Comparison: Text-only vs multimodal

Key finding: Grounding is qualitative, not quantitative. Multimodal models do not just "know more"; they organize knowledge differently from text-only models. The grounding advantage is negative (−0.16), which the lab attributes to grounded models compressing differently.

Text vs multimodal across 6 task families

Experiment 06

Metric validation

Complete

Does M(x,C) correlate with existing meaning measures?

Cosine similarity: r = 0.82

BERTScore: r = 0.58

M(x,C) metric: r = 0.58

Shannon entropy: r = 0.16

Key finding: Cosine similarity correlates most strongly at r = 0.82. M(x,C) matches BERTScore at r = 0.58. Shannon entropy is only weakly correlated at r = 0.16, consistent with it measuring something different from meaning.

Cosine · BERTScore · M(x,C) · Shannon

The frontier map

What's solved, what's in progress, where breakthroughs are waiting.

Three columns. The left is settled science: don't reinvent. The middle is partial progress, with opportunities to contribute. The right is genuinely open: the breakthrough territory we're aiming at.

9 Solved · 8 Active · 8 Unsolved · 0/8 Our progress

Solved

Don't reinvent

Shannon entropy ≠ meaning

1948 — Shannon himself said this

Semantic information theory

1953 — Bar-Hillel & Carnap formalized it

Knowledge and information flow

1981 — Dretske bridged info to meaning

Strongly semantic information

2004 — Floridi added truthfulness requirement

Word embeddings capture similarity

2013 — Mikolov, word2vec

Bayesian surprise as meaning proxy

2009 — Itti & Baldi

Free Energy / predictive processing

2010 — Friston

Sentence embeddings + cosine similarity

2019 — Reimers & Gurevych

Cross-architecture convergence

2024 — Huh et al, Platonic Representation Hypothesis

Active

Partial progress

Mechanistic interpretability

What circuits do inside models — Anthropic, DeepMind

Sparse autoencoders for feature discovery

Anthropic 2023–2026

Hallucination detection from internal states

Multiple labs, not solved

LLM understanding vs pattern matching

Bender & Koller 2020 debate

Rate-distortion theory for semantics

Zaslavsky et al., 2018

Normalized compression distance for meaning

Cilibrasi & Vitányi, 2005

Semantic information via physics

Kolchinsky & Wolpert, 2018

Multimodal alignment safety transfer

Open problem

Unsolved

Breakthrough territory

Hard problem of consciousness

What IS subjective experience?

Queued

Deception detection from representation geometry

Can you catch an AI lying by its embeddings?

Queued

Cross-lingual meaning universals

Same geometry in Chinese, Finnish, Arabic?

Queued

The phase transition

Where does processing become meaning?

Queued

Grounded vs ungrounded meaning geometry

How does grounding restructure representations?

Queued

Meaning without a subject

Can M(x,C) work without consciousness?

Queued

Temporal dynamics of meaning

How does meaning change during conversation?

Queued

Adversarial robustness of semantic geometry

Can an attacker fool meaning metrics?

Queued

The equation

A proposed computable measure of semantic content.

M(x,C) is the proposed function: the meaning of expression x in context C, computed as a salience-weighted sum over propositions p of how far x shifts each proposition's probability from its prior P(p|C) to its posterior P(p|C,x).

The meaning function

M(x, C) = Σ_p w(p) · |P(p | C, x) − P(p | C)|

M(x, C): Meaning of expression x in context C.

x: Input expression (text, symbol, signal).

w(p): Weight of proposition p, its salience.

P(p | C, x): Posterior probability of p given x and C.

P(p | C): Prior probability of p in context C alone.
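
The formula translates directly into a few lines of code. The sketch below is illustrative only: the function name `meaning`, the dictionary representation, and all the toy probabilities and weights are assumptions, and the source does not specify how the priors, posteriors, or salience weights are estimated in practice.

```python
from typing import Mapping


def meaning(
    prior: Mapping[str, float],
    posterior: Mapping[str, float],
    weights: Mapping[str, float],
) -> float:
    """M(x, C): sum over propositions p of w(p) * |P(p | C, x) - P(p | C)|."""
    return sum(weights[p] * abs(posterior[p] - prior[p]) for p in prior)


# Toy context C with two propositions (hypothetical numbers).
prior = {"it is raining": 0.2, "the ground is wet": 0.3}
weights = {"it is raining": 1.0, "the ground is wet": 0.5}

# Posterior after hearing x = "I just saw lightning" (hypothetical numbers).
posterior = {"it is raining": 0.7, "the ground is wet": 0.6}

print(meaning(prior, posterior, weights))  # about 0.65 = 1.0 * 0.5 + 0.5 * 0.3
print(meaning(prior, prior, weights))      # an expression that shifts nothing scores 0
```

Because the shift is taken in absolute value, an expression that lowers a proposition's probability contributes as much as one that raises it by the same amount.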

Research log

Live experiment feed.

Append-only event stream from the lab. Every entry is timestamped (CT) and tagged with the experiment that emitted it.
