Abstract
Children with special needs may struggle to identify uncomfortable and unsafe situations. In this study, we aimed to develop an automated system that detects such situations from audio and text cues, in order to promote children’s safety and prevent violence toward them. We compiled a text and audio database of over 1891 sentences extracted from videos presenting real-world situations and categorized them into three classes: neutral sentences, insulting sentences, and sentences indicating unsafe conditions. We compared the ability of various machine-learning methods to detect insulting and unsafe sentences. In particular, we found that a deep neural network that takes as input the text embedding vectors of bidirectional encoder representations from transformers (BERT) and the audio embedding vectors of Wav2Vec attains the highest accuracy in detecting unsafe and insulting situations. Our results indicate that it may be feasible to build an automated agent that detects unsafe and unpleasant situations that children with special needs may encounter, given the context of the dialogues conducted with these children.
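As a rough illustration of the kind of model the abstract describes, the sketch below feeds a text embedding and an audio embedding into a small feed-forward network that outputs probabilities for the three classes. It is not the authors' implementation. Joining the two embeddings by concatenation, the embedding sizes of 768, the single hidden layer of 32 units, and all function names are assumptions made for this example. The weights are random and untrained, and random vectors stand in for real BERT and Wav2Vec outputs.

```python
import math
import random

TEXT_DIM = 768   # assumed size of a BERT-base sentence embedding
AUDIO_DIM = 768  # assumed size of a pooled Wav2Vec embedding
CLASSES = ["neutral", "insulting", "unsafe"]  # the three classes named in the abstract


def init_layer(rng, n_in, n_out):
    """Return (weights, bias) for a dense layer with small random weights."""
    weights = [[rng.gauss(0.0, 0.02) for _ in range(n_out)] for _ in range(n_in)]
    return weights, [0.0] * n_out


def dense(x, weights, bias):
    """Compute x @ weights + bias for a single input vector."""
    return [
        sum(x[i] * weights[i][j] for i in range(len(x))) + bias[j]
        for j in range(len(bias))
    ]


def softmax(logits):
    """Convert logits to probabilities, shifting by the max for stability."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(params, text_emb, audio_emb):
    """Concatenate both embeddings (assumed fusion) and classify."""
    x = list(text_emb) + list(audio_emb)
    (w1, b1), (w2, b2) = params
    hidden = [max(0.0, v) for v in dense(x, w1, b1)]  # ReLU activation
    return softmax(dense(hidden, w2, b2))


rng = random.Random(0)
params = (
    init_layer(rng, TEXT_DIM + AUDIO_DIM, 32),
    init_layer(rng, 32, len(CLASSES)),
)

# Random stand-ins for the embeddings of one sentence and its audio clip.
text_emb = [rng.gauss(0.0, 1.0) for _ in range(TEXT_DIM)]
audio_emb = [rng.gauss(0.0, 1.0) for _ in range(AUDIO_DIM)]

probs = predict_proba(params, text_emb, audio_emb)
label = CLASSES[probs.index(max(probs))]
```

In practice the stand-in vectors would be replaced by embeddings from pretrained models, and the classifier weights would be learned from the labeled sentences rather than drawn at random.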
Original language | English |
---|---|
Article number | 3927 |
Journal | Applied Sciences (Switzerland) |
Volume | 13 |
Issue number | 6 |
State | Published - Mar 2023 |
Keywords
- assistive technologies for persons with disabilities
- audio classification
- bullying
- children’s safety
- machine learning
- pretrained models
- text classification
All Science Journal Classification (ASJC) codes
- General Engineering
- Instrumentation
- Fluid Flow and Transfer Processes
- Process Chemistry and Technology
- General Materials Science
- Computer Science Applications