Abstract
The shortage of high-quality clinical databases restricts downstream medical diagnostics. Clinical databases are often limited to controlled, non-natural environments, are restricted by privacy limitations, and require complex scoring procedures that ultimately result in rater bias. Social media includes massive amounts of information on subjects through streams of text, audio, and video data that are accessible and currently underutilized for medical research. In this work, we propose a method for utilizing this information by constructing databases for medical condition assessment. To this end, we have created SMDC (Social Medical Data Constructor), a utility based on medical expert requirements. Data features and non-confidential demographic information are extracted online, and labels are derived using data mining techniques. We examine the feasibility of the suggested technology with ADHD recognition from a database extracted from YouTube clips, using self-tagging as ADHD labels. The database complies with privacy and copyright limitations, and no personal identification is provided. To validate the database, we show a high correlation of the model labels with expert labeling (r = 0.68) and agreement between six known ADHD motor biomarker features of hyperactivity and those derived using our database. We extracted kinematic features from the video clips and reached ADHD recognition accuracies of 83% and 81% for females and males, respectively. The suggested technology has the potential to assess natural, real-life behavioral properties of a medical condition and to be used for pre-training the medical condition prediction model, consequently reducing the clinical dataset size required for model fine-tuning and clinical verification.
| Original language | English |
|---|---|
| Pages (from-to) | 164725-164736 |
| Number of pages | 12 |
| Journal | IEEE Access |
| Volume | 12 |
| State | Published - 2024 |
Keywords
- ADHD
- databases
- machine learning
- medical diagnosis
- social networks
All Science Journal Classification (ASJC) codes
- General Computer Science
- General Materials Science
- General Engineering