Evaluation in Visualization
The clash of cultures between the correlation coefficients and the trained ears revolves around the issue of evaluation. Although ICAD claims to be rooted in scientific fields, a review of 495 sonifications found that only 6.1% had ever been evaluated (Dubus and Bresin 2013). This seems contradictory, but the level of evaluation is just as marginal in the more mature field of data visualization. Visualization researchers Geoffrey Ellis and Alan Dixon surveyed 65 visualization papers and found that only 12 included an evaluation, of which only 2 were valid or useful (Ellis and Dixon 2006). They observe that evaluation is difficult for visualizations because the many stages of data processing, task analysis, and graphic design each involve a string of decisions. The resulting combinatorial explosion of decision variables makes the scientific evaluation of any specific design impractical and the generalization of its results methodologically unsound. In this situation visualization researchers often resort to summative comparisons between two or three designs, aiming to obtain a seal of approval for one of them, but such results make for weak science at best (Ellis and Dixon 2006). Often the most useful outcomes of these evaluation experiments are accidental and unexpected results.

Ellis and Dixon propose instead a method of “experimental evaluation” that focuses on understanding fundamental concepts rather than specific designs. Experimental evaluation may include the investigation of boundary conditions or of cases known to produce poor outcomes. This is not a search for an optimal or universal solution but a way to understand the fundamental characteristics of a large, multidimensional design space.
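To give a rough sense of how quickly such a design space grows, the short Python sketch below simply multiplies the option counts of a handful of design decisions; the decision points and their counts are hypothetical illustrations, not figures taken from either survey.

# A minimal sketch of the combinatorial explosion of design decisions.
# The decision points and option counts are hypothetical, chosen only
# to illustrate the scale of the problem.
from math import prod

design_decisions = {
    "data transformation": 4,   # e.g. raw, normalized, log-scaled, binned
    "visual encoding": 6,       # e.g. position, length, color, size, shape, angle
    "color scheme": 5,
    "layout": 4,
    "interaction technique": 3,
    "annotation style": 3,
}

candidate_designs = prod(design_decisions.values())
print(f"{len(design_decisions)} decisions yield {candidate_designs} distinct designs")
# 4 * 6 * 5 * 4 * 3 * 3 = 4320 candidate designs, far too many to compare
# experimentally, even before considering the data and tasks involved.

Even this toy count of six decision points produces thousands of candidate designs, which is why summative comparisons of two or three of them say so little about the space as a whole.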