TY - GEN
T1 - Complexity theoretic limitations on learning halfspaces
AU - Daniely, Amit
N1 - Publisher Copyright: © 2016 ACM.
PY - 2016/6/19
Y1 - 2016/6/19
N2 - We study the problem of agnostically learning halfspaces, which is defined by a fixed but unknown distribution D on ℚ^n × {±1}. We define Err_HALF(D) as the least error of a halfspace classifier for D. A learner who can access D has to return a hypothesis whose error is small compared to Err_HALF(D). Using the recently developed method of Daniely, Linial and Shalev-Shwartz, we prove hardness-of-learning results assuming that random K-XOR formulas are hard to (strongly) refute. We show that no efficient learning algorithm has nontrivial worst-case performance even under the guarantees that Err_HALF(D) ≤ η for an arbitrarily small constant η > 0, and that D is supported in {±1}^n × {±1}. Namely, even under these favorable conditions, and for every c > 0, it is hard to return a hypothesis with error ≤ 1/2 − 1/n^c. In particular, no efficient algorithm can achieve a constant approximation ratio. Under a stronger version of the assumption (where K can be poly-logarithmic in n), we can take η = 2^(−log^(1−ν)(n)) for an arbitrarily small ν > 0. These results substantially improve on previously known results, which only show hardness of exact learning.
AB - We study the problem of agnostically learning halfspaces, which is defined by a fixed but unknown distribution D on ℚ^n × {±1}. We define Err_HALF(D) as the least error of a halfspace classifier for D. A learner who can access D has to return a hypothesis whose error is small compared to Err_HALF(D). Using the recently developed method of Daniely, Linial and Shalev-Shwartz, we prove hardness-of-learning results assuming that random K-XOR formulas are hard to (strongly) refute. We show that no efficient learning algorithm has nontrivial worst-case performance even under the guarantees that Err_HALF(D) ≤ η for an arbitrarily small constant η > 0, and that D is supported in {±1}^n × {±1}. Namely, even under these favorable conditions, and for every c > 0, it is hard to return a hypothesis with error ≤ 1/2 − 1/n^c. In particular, no efficient algorithm can achieve a constant approximation ratio. Under a stronger version of the assumption (where K can be poly-logarithmic in n), we can take η = 2^(−log^(1−ν)(n)) for an arbitrarily small ν > 0. These results substantially improve on previously known results, which only show hardness of exact learning.
KW - Halfspaces
KW - Hardness of learning
KW - Random XOR
UR - http://www.scopus.com/inward/record.url?scp=84979225371&partnerID=8YFLogxK
U2 - 10.1145/2897518.2897520
DO - 10.1145/2897518.2897520
M3 - Conference contribution
T3 - Proceedings of the Annual ACM Symposium on Theory of Computing
SP - 105
EP - 117
BT - STOC 2016 - Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing
A2 - Mansour, Yishay
A2 - Wichs, Daniel
T2 - 48th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2016
Y2 - 19 June 2016 through 21 June 2016
ER -