AI-Based Research Tool for Large Genealogical Corpora: The Case of Jewish Communities Worldwide

Research output: Contribution to journal, Article, peer-review


This paper presents a new methodology for AI-based research and exploration of large genealogical corpora. The proposed approach is based on an automatic quantitative question-answering (QA) system that enables researchers to ask questions in natural language and learn about trends related to individuals, families, and communities in the corpus of the study. The proposed methodology includes: 1) an automatic method for training dataset generation, 2) a transformer-based table selection method, and 3) an optimized transformer-based quantitative QA model. The findings indicate that the proposed architecture outperforms the state-of-the-art models by achieving 87% accuracy on the large corpus of Jewish genealogical data. This study may have practical implications for genealogical information centers and museums, making genealogical data research easy and scalable for experts as well as the general public.

Original language: English
Pages (from-to): 733-737
Number of pages: 5
Journal: Proceedings of the Association for Information Science and Technology
Issue number: 1
State: Published - Oct 2023


  • Question-answering
  • cultural heritage
  • deep learning
  • digital humanities
  • genealogical data

All Science Journal Classification (ASJC) codes

  • General Computer Science
  • Library and Information Sciences

