The Free Transformer

cs.LG arXiv:2510.17558

Abstract

We propose an extension of the decoder Transformer that conditions its generative process on random latent variables, which are learned without supervision through a variational procedure. Experimental evaluations show that this conditioning yields substantial improvements on downstream tasks.
