Thursday, March 18, 2010

The Method of Triadic Comparisons

(Application of multidimensional scaling to subjective evaluation of coded speech, Joseph L. Hall)

A problem with metric multidimensional scaling in human listening experiments is that listeners are required to associate numbers with dissimilarities. Different listeners use numbers differently, and it is difficult for a listener to use numbers consistently throughout the course of a listening test. A test in which listeners are required to rank order dissimilarities, so that judgments are of the form "greater than" or "less than," is more satisfactory in this respect. Tests of this sort can be analyzed by nonmetric multidimensional scaling, in which the numbers in the dissimilarity matrix are a rank ordering of the distances between objects. In our example, with 10 cities there are 45 intercity distances: the two cities that are closest together receive a rank of 1, and the two cities that are farthest apart receive a rank of 45. The criterion used by the multidimensional scaling program is that the rank ordering of distances in the stimulus space agree with the rank ordering of input dissimilarities. According to Young and Harris (op. cit., p. 127), the nonmetric minimization problem is much more difficult than the metric problem and requires an iterative solution. In our airline mileage example, the result of applying nonmetric classical multidimensional scaling to this dissimilarity matrix is an object space that differs only slightly from the first one.
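
The rank-ordering step described above can be sketched in a few lines of Python. This is only an illustration: the coordinates are hypothetical stand-ins (they are not the airline mileages from the text), and the function name `rank_dissimilarities` is an assumption made for this example.

```python
from itertools import combinations
from math import dist

# Hypothetical 2-D positions for 10 objects (NOT the airline data from the text).
points = [(0, 0), (1, 5), (4, 1), (7, 3), (2, 8),
          (9, 9), (6, 6), (3, 4), (8, 0), (5, 10)]

def rank_dissimilarities(points):
    """Map each pair (i, j) to its rank: 1 = closest pair, n(n-1)/2 = farthest."""
    pairs = list(combinations(range(len(points)), 2))
    # Sort pairs by distance; ties are broken by pair index so the result is deterministic.
    ordered = sorted(pairs, key=lambda p: (dist(points[p[0]], points[p[1]]), p))
    return {pair: rank for rank, pair in enumerate(ordered, start=1)}

ranks = rank_dissimilarities(points)
print(len(ranks))  # number of pairs for 10 objects
```

Only the ranks, not the distances themselves, would then be passed to a nonmetric scaling program.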

Rank ordering the intercity mileages is easy; it involves simply sorting 45 numbers. In practice, when a listener is asked to make judgments about the relative dissimilarities of auditory stimuli, rank ordering more than a few stimuli becomes very time consuming and puts unacceptable demands on the listener's memory. A technique that has been developed to solve this problem is the method of triadic comparisons. Rather than being presented with all $n$ stimuli and being asked to rank order their dissimilarities, the subject is presented with stimuli three at a time and is asked to judge which two of the three are most dissimilar and which two of the three are most similar. The most dissimilar pair is given a score of two, the most similar pair is given a score of zero, and the remaining pair is given a score of one. This process is repeated for all $\binom{n}{3}$ triads of the $n$ stimuli taken three at a time, and the scores resulting from each triad are added up to obtain the dissimilarity matrix. An advantage of this method is that each trial is completely self-contained. The subject's judgment on a given trial is based only on the three stimuli that are presented on that trial. This differs from MOS testing, in which the subject's judgment on a given trial is influenced in an uncontrolled manner by stimuli presented on other trials. The method of triadic comparisons does not allow for direct comparison of all $\binom{n}{2}$ stimulus pairs, so that with sparsely populated stimulus spaces some distortion is possible. In our flying-mileages example, the stimulus space resulting from the method of triadic comparisons differs from the original stimulus space; but in spite of the drastically modified task, the basic structure remains unchanged.
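
The scoring rule above (2 for the most dissimilar pair, 0 for the most similar pair, 1 for the remaining pair, summed over all triads) can be sketched as follows. This is a minimal illustration, not the procedure used in the paper's experiments: the function names, the `judge` callback, and the simulated "ideal listener" that answers from hypothetical hidden coordinates are all assumptions made for the example.

```python
from itertools import combinations
from math import dist

def triadic_dissimilarity(n, judge):
    """Accumulate triadic-comparison scores into an n x n symmetric matrix.

    judge(i, j, k) must return (most_similar_pair, most_dissimilar_pair),
    each a pair of stimulus indices drawn from {i, j, k}.
    """
    d = [[0] * n for _ in range(n)]
    for triad in combinations(range(n), 3):
        similar, dissimilar = judge(*triad)
        similar, dissimilar = frozenset(similar), frozenset(dissimilar)
        for a, b in combinations(triad, 2):
            pair = frozenset((a, b))
            # 2 = most dissimilar pair, 0 = most similar pair, 1 = remaining pair
            score = 2 if pair == dissimilar else 0 if pair == similar else 1
            d[a][b] += score
            d[b][a] += score
    return d

# Hypothetical stimulus positions, used only to simulate a consistent listener.
points = [(0, 0), (1, 5), (4, 1), (7, 3), (2, 8),
          (9, 9), (6, 6), (3, 4), (8, 0), (5, 10)]

def ideal_judge(i, j, k):
    """Simulated listener who judges each triad from the hidden distances."""
    pairs = sorted(combinations((i, j, k), 2),
                   key=lambda p: dist(points[p[0]], points[p[1]]))
    return pairs[0], pairs[-1]

matrix = triadic_dissimilarity(len(points), ideal_judge)
```

Each pair of stimuli appears in $n - 2$ triads, so every off-diagonal entry lies between 0 and $2(n - 2)$. With $n = 10$ there are $\binom{10}{3} = 120$ triads, and the largest possible entry is 16.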


The solution space generated by weighted multidimensional scaling is not rotatable: the dimensions of the stimulus space and the subject space are dictated by the data and are not arbitrary. This is an extremely important property. It means that we can ask listeners to make judgments about similarities or differences among speech samples without specifying what features they should base these judgments on; and, to the extent that the assumptions of the model are valid, we can determine the relevant features from the experimental results.
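
The non-rotatability can be illustrated with the weighted Euclidean distance commonly used in weighted (individual-differences) multidimensional scaling. The excerpt does not state the formula, so the model and all symbols here are assumptions: $x_{ia}$ is the coordinate of stimulus $i$ on dimension $a$, $w_{ka} \ge 0$ is the weight that subject $k$ gives to dimension $a$, and $r$ is the number of dimensions.

$$
d_{ij}^{(k)} = \sqrt{\sum_{a=1}^{r} w_{ka}\,\bigl(x_{ia} - x_{ja}\bigr)^{2}}
$$

In matrix form, with $W_k = \mathrm{diag}(w_{k1}, \ldots, w_{kr})$ and $x_i$ a row vector, the squared distance is $(x_i - x_j)\,W_k\,(x_i - x_j)^{\top}$. Rotating the stimulus space by an orthogonal matrix $T$ replaces $W_k$ with $T\,W_k\,T^{\top}$, which in general is no longer diagonal for every subject unless $T$ merely permutes or reflects the axes. Under this assumed model, a rotated solution therefore no longer fits the data equally well, which is why the axes are fixed by the data.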
