Abstract
Subgroup discovery (SGD) is presented here as a data-mining approach to help find interpretable local patterns, correlations, and descriptors of a target property in materials-science data. Specifically, we consider data generated by density-functional theory calculations. First, we demonstrate that SGD can identify physically meaningful models that classify the crystal structures of 82 octet binary (OB) semiconductors as either rocksalt or zincblende. SGD identifies an interpretable two-dimensional model, derived from only the atomic radii of valence s and p orbitals, that correctly classifies the crystal structures of 79 of the 82 OB semiconductors. The SGD framework is subsequently applied to 24 400 configurations of neutral gas-phase gold clusters with 5 to 14 atoms to discern general patterns between geometrical and physicochemical properties. For example, SGD helps find that van der Waals interactions within gold clusters are linearly correlated with their radius of gyration and are weaker for planar clusters than for nonplanar clusters. SGD also finds a descriptor that predicts a local linear correlation between the chemical hardness and the cluster isomer stability for the even-sized gold clusters.
Original language | American English |
---|---|
Article number | 013031 |
Journal | New Journal of Physics |
Volume | 19 |
Issue number | 1 |
State | Published - Jan 2017 |
Externally published | Yes |
Keywords
- big-data analytics
- data mining
- gold clusters
- machine learning
- octet binary semiconductors
- pattern discovery
All Science Journal Classification (ASJC) codes
- General Physics and Astronomy