Abstract: An orthogonal-gradient measurement matrix construction algorithm is proposed for reducing the maximum and average mutual coherence of the sensing matrix. It shrinks the Gram matrix based on ...
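To make the two quantities named in this abstract concrete, the sketch below computes the maximum and average mutual coherence of a small matrix from the off-diagonal entries of its column-normalized Gram matrix. It is only an illustration: the function name `mutual_coherence` and the example matrix are hypothetical, the "average" is assumed to be the plain mean of the absolute off-diagonal entries (other papers use thresholded variants), and the shrinking step of the proposed algorithm is not reproduced because the excerpt is truncated.

```python
import math
from itertools import combinations

def mutual_coherence(columns):
    """Return (max, mean) of |<a_i, a_j>| over distinct unit-normalized columns."""
    unit = []
    for col in columns:
        norm = math.sqrt(sum(x * x for x in col))
        unit.append([x / norm for x in col])
    # Off-diagonal Gram entries are inner products between distinct columns.
    # The Gram matrix is symmetric, so each unordered pair is visited once.
    entries = [abs(sum(x * y for x, y in zip(a, b)))
               for a, b in combinations(unit, 2)]
    return max(entries), sum(entries) / len(entries)

# Hypothetical 2x3 sensing matrix, given as a list of its columns.
cols = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mu_max, mu_avg = mutual_coherence(cols)
```

Here the first two columns are orthogonal and the third makes a 45-degree angle with each, so the maximum coherence is larger than the average.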
We also provide LogStableMax, which outputs log-probabilities directly. By discarding the parallel component and updating only in directions orthogonal to the current weights, the model is encouraged ...
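The projection described in this excerpt, discarding the part of the gradient parallel to the current weights, can be sketched as follows. This is a minimal illustration under assumptions: the helper name `orthogonal_component` is hypothetical, weights and gradient are treated as flat vectors, and nothing here reflects how the authors' optimizer or LogStableMax is actually implemented.

```python
def orthogonal_component(grad, weights):
    """Remove from grad its projection onto weights, keeping the orthogonal part."""
    dot_gw = sum(g * w for g, w in zip(grad, weights))
    dot_ww = sum(w * w for w in weights)
    if dot_ww == 0.0:
        # No direction to project out when the weights are all zero.
        return list(grad)
    scale = dot_gw / dot_ww
    return [g - scale * w for g, w in zip(grad, weights)]

# Hypothetical weights and gradient for a two-parameter model.
w = [3.0, 4.0]
g = [1.0, 2.0]
g_perp = orthogonal_component(g, w)
```

The resulting `g_perp` has zero inner product with `w` up to rounding, so to first order a step along it leaves the weight norm unchanged.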
In this paper, we propose a novel Intra- and Inter-Head Orthogonal Attention (I2OA) to efficiently improve MA in image captioning by introducing a concise orthogonal regularization on the heads.
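The excerpt does not state the form of the regularizer, so the sketch below shows only one common orthogonality penalty, not necessarily the one used in I2OA: the sum of squared inner products between distinct unit-normalized head vectors. The function name `orthogonality_penalty` and the representation of each head as a single flat vector are assumptions made for illustration.

```python
import math

def orthogonality_penalty(heads):
    """Sum of squared inner products between distinct unit-normalized head vectors."""
    unit = []
    for h in heads:
        norm = math.sqrt(sum(x * x for x in h))
        unit.append([x / norm for x in h])
    penalty = 0.0
    for i, a in enumerate(unit):
        for j, b in enumerate(unit):
            if i != j:
                # Each ordered pair (i, j) contributes, so unordered pairs count twice.
                penalty += sum(x * y for x, y in zip(a, b)) ** 2
    return penalty

# Hypothetical head vectors: an orthogonal pair and a fully redundant pair.
orthogonal_heads = orthogonality_penalty([[1.0, 0.0], [0.0, 1.0]])
redundant_heads = orthogonality_penalty([[1.0, 0.0], [2.0, 0.0]])
```

The penalty is zero for mutually orthogonal heads and grows as heads become more aligned, which is the behavior a regularizer of this kind would add to the training loss.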