TY - UNPB

T1 - LSH Microbatches for Stochastic Gradients: Value in Rearrangement

AU - Buchnik, E.

AU - Cohen, E.

AU - Hassidim, A.

AU - Matias, Y.

PY - 2018/9/28

Y1 - 2018/9/28

N2 - Metric embeddings are immensely useful representations of associations between entities (images, users, search queries, words, and more). Embeddings are learned by optimizing a loss objective of the general form of a sum over example associations. Typically, the optimization uses stochastic gradient updates over minibatches of examples that are arranged independently at random. In this work, we propose the use of structured arrangements through randomized microbatches of examples that are more likely to include similar ones. We make a principled argument for the properties of our arrangements that accelerate the training and present efficient algorithms to generate microbatches that respect the marginal distribution of training examples. Finally, we observe experimentally that our structured arrangements accelerate training by 3-20%. Structured arrangements emerge as a powerful and novel performance knob for SGD that is independent and complementary to other SGD hyperparameters and thus is a candidate for wide deployment.

UR - https://openreview.net/forum?id=r1erRoCqtX

M3 - Preprint

VL - 5389

T3 - arXiv preprint arXiv:1803.

BT - LSH Microbatches for Stochastic Gradients: Value in Rearrangement

ER -