TY - GEN
T1 - Privacy-Preserving Decision Trees Training and Prediction
AU - Akavia, Adi
AU - Leibovich, Max
AU - Resheff, Yehezkel S.
AU - Ron, Roey
AU - Shahar, Moni
AU - Vald, Margarita
N1 - Publisher Copyright: © 2021, Springer Nature Switzerland AG.
PY - 2021
Y1 - 2021
N2 - In the era of cloud computing and machine learning, data has become a highly valuable resource. Recent history has shown that the benefits brought forth by this data driven culture come at a cost of potential data leakage. Such breaches have a devastating impact on individuals and industry, and lead the community to seek privacy preserving solutions. A promising approach is to utilize Fully Homomorphic Encryption (FHE ) to enable machine learning over encrypted data, thus providing resiliency against information leakage. However, computing over encrypted data incurs a high computational overhead, thus requiring the redesign of algorithms, in an “ FHE -friendly” manner, to maintain their practicality. In this work we focus on the ever-popular tree based methods (e.g., boosting, random forests), and propose a new privacy-preserving solution to training and prediction for trees. Our solution employs a low-degree approximation for the step-function together with a lightweight interactive protocol, to replace components of the vanilla algorithm that are costly over encrypted data. Our protocols for decision trees achieve practical usability demonstrated on standard UCI datasets, encrypted with fully homomorphic encryption. In addition, the communication complexity of our protocols is independent of the tree size and dataset size in prediction and training, respectively, which significantly improves on prior works.
AB - In the era of cloud computing and machine learning, data has become a highly valuable resource. Recent history has shown that the benefits brought forth by this data driven culture come at a cost of potential data leakage. Such breaches have a devastating impact on individuals and industry, and lead the community to seek privacy preserving solutions. A promising approach is to utilize Fully Homomorphic Encryption (FHE ) to enable machine learning over encrypted data, thus providing resiliency against information leakage. However, computing over encrypted data incurs a high computational overhead, thus requiring the redesign of algorithms, in an “ FHE -friendly” manner, to maintain their practicality. In this work we focus on the ever-popular tree based methods (e.g., boosting, random forests), and propose a new privacy-preserving solution to training and prediction for trees. Our solution employs a low-degree approximation for the step-function together with a lightweight interactive protocol, to replace components of the vanilla algorithm that are costly over encrypted data. Our protocols for decision trees achieve practical usability demonstrated on standard UCI datasets, encrypted with fully homomorphic encryption. In addition, the communication complexity of our protocols is independent of the tree size and dataset size in prediction and training, respectively, which significantly improves on prior works.
KW - Decision trees
KW - Fully homomorphic encryption
KW - Prediction
KW - Privacy preserving machine learning
KW - Training
UR - http://www.scopus.com/inward/record.url?scp=85103232247&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-67658-2_9
DO - 10.1007/978-3-030-67658-2_9
M3 - Conference contribution
SN - 9783030676575
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 145
EP - 161
BT - Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2020, Proceedings
A2 - Hutter, Frank
A2 - Kersting, Kristian
A2 - Lijffijt, Jefrey
A2 - Valera, Isabel
PB - Springer Science and Business Media Deutschland GmbH
T2 - European Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2020
Y2 - 14 September 2020 through 18 September 2020
ER -