The purpose of this paper is to evaluate the quality of Google's question answering (QA).
Given the wide variety and complexity of Google answer boxes in search result pages, existing evaluation criteria for both search engines and QA systems seemed unsuitable. This study therefore developed a criteria system for evaluating Google QA quality by coding and analyzing the search results for questions drawn from a representative question set. Using the newly developed criteria system, the study then evaluated Google’s overall QA quality, as well as QA quality across four target types and six question types. ANOVA and Tukey tests were used to compare QA quality among the target types and among the question types.
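The group comparison described above can be sketched in a few lines. This is a minimal illustration of the one-way ANOVA F statistic only, not the study's actual analysis: the function name `one_way_anova_f`, the score scale and every score value below are hypothetical and invented for the example. The Tukey post hoc step is omitted because it needs the studentized range distribution, which the Python standard library does not provide.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over lists of scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical QA quality scores per target type (made up for illustration only).
scores = {
    "person": [4, 5, 5, 4],
    "thing": [3, 3, 4, 2],
    "event": [2, 3, 3, 2],
    "organization": [3, 2, 3, 4],
}

f_stat, df_b, df_w = one_way_anova_f(list(scores.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F value relative to the F distribution with the given degrees of freedom suggests that at least one group mean differs; a post hoc test such as Tukey's would then identify which pairs of groups differ.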
It was found that Google provided significantly higher-quality answers to person-related questions than to thing-related, event-related and organization-related questions. Google also provided significantly higher-quality answers to where-questions than to who-, what- and how-questions. The more specific the question, the higher the QA quality.
Suggestions for both search engine users and designers are presented to help enhance user experience and QA quality.
The newly developed evaluation criteria system is particularly suitable for analyzing search engine QA quality, and it expands and enriches the assessment metrics for both search engines and QA systems.
