#ml
4 posts tagged #ml.
-
Attention Is Explainable Because It Is a Kernel
A reading of self-attention through kernel smoothing and reproducing kernel Hilbert spaces (RKHS).
-
Not All Infinities Are Equal
The singularity structure of cross-entropy explains hallucination, the modality gap, and why contrastive losses need such large batches.
-
Opposite Is Not Different
The cosine-similarity scale has three landmarks, not two. Maximum difference is orthogonality, not opposition — and the most influential contrastive losses spent years optimizing for the wrong target.
-
Activations Are Bad for Geometry
Pointwise activations factor into the layer's Jacobian as a diagonal modulation. The same modulation that buys selectivity destroys geometric structure on the data manifold.