Generic Coreset for Scalable Learning of Monotonic Kernels: Logistic Regression, Sigmoid and more

Elad Tolochinsky, Ibrahim Jubran, Dan Feldman

Research output: Contribution to journal › Conference article › Peer-reviewed

Abstract

A coreset (or core-set) is a small weighted subset Q of an input set P, with respect to a given monotonic function f : ℝ → ℝ, that provably approximates the fitting loss ∑_{p∈P} f(p·x) for every x ∈ ℝ^d. Using Q, we can obtain an approximation of the x that minimizes this loss by running existing optimization algorithms on Q. In this work we provide: (i) a lower bound proving that there are sets with no coreset smaller than n = |P| for general monotonic loss functions; (ii) a proof that, with an additional common regularization term and under a natural assumption that holds e.g. for logistic regression and the sigmoid activation function, a small coreset exists for any input P; (iii) a generic coreset construction algorithm that computes such a small coreset Q in O(nd + n log n) time; and (iv) experimental results with open-source code which demonstrate that our coresets are effective and, in practice, much smaller than the theory predicts.
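The construction described in the abstract belongs to the sensitivity (importance) sampling family of coreset algorithms. The sketch below illustrates that general framework only, for regularized logistic loss: the sensitivity proxy s_i = 1/n + ‖p_i‖/∑_j‖p_j‖, the function names, and the sample size are illustrative assumptions, not the paper's actual algorithm or its proven bounds.

```python
import numpy as np

def logistic_loss(P, w, x, lam):
    """Weighted regularized logistic loss: sum_i w_i * log(1 + exp(-p_i . x)) + lam * ||x||^2."""
    z = P @ x
    return float(np.sum(w * np.log1p(np.exp(-z))) + lam * np.dot(x, x))

def sensitivity_sampling_coreset(P, m, rng=None):
    """Minimal sensitivity-sampling sketch (NOT the paper's exact algorithm).

    Each point is sampled with probability proportional to a crude
    sensitivity proxy s_i = 1/n + ||p_i|| / sum_j ||p_j||, mixing uniform
    and norm-proportional sampling; the paper derives tighter bounds for
    regularized monotonic losses. Sampled points are reweighted by
    1/(m * q_i) so the weighted data loss on Q is an unbiased estimator
    of the data loss on P.
    """
    rng = np.random.default_rng(rng)
    n = len(P)
    norms = np.linalg.norm(P, axis=1)
    s = 1.0 / n + norms / max(norms.sum(), 1e-12)
    q = s / s.sum()                      # sampling distribution over P
    idx = rng.choice(n, size=m, replace=True, p=q)
    weights = 1.0 / (m * q[idx])         # importance weights
    return P[idx], weights

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    P = rng.normal(size=(100_000, 5))    # synthetic input set
    x = rng.normal(size=5)               # an arbitrary query point
    Q, w = sensitivity_sampling_coreset(P, m=2_000, rng=1)
    full = logistic_loss(P, np.ones(len(P)), x, lam=1.0)
    core = logistic_loss(Q, w, x, lam=1.0)
    print(f"full loss ≈ {full:.1f}, coreset estimate ≈ {core:.1f}")
```

Because the weighted coreset loss approximates the full loss for every candidate x, any off-the-shelf solver run on (Q, w) yields an approximate minimizer for P, which is the speedup the abstract refers to.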

Original language: English
Pages (from-to): 21520-21547
Number of pages: 28
Journal: Proceedings of Machine Learning Research
Volume: 162
Publication status: Published - 2022
Event: 39th International Conference on Machine Learning, ICML 2022 - Baltimore, United States
Duration: 17 July 2022 → 23 July 2022
https://proceedings.mlr.press/v162/

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability
