Factors Affecting Audiovisual Speech Perception as Measured by the McGurk Effect
Basu Mallick, Debshila
Beauchamp, Michael S; Dannemiller, James L
Doctor of Philosophy
Multisensory speech perception occurs when an individual integrates the spoken sounds and mouth movements of a talker into a coherent percept, e.g., during face-to-face conversation. Under usual circumstances, spoken sounds and mouth movements match. However, when they are mismatched, individuals sometimes report a “fused” percept that differs from the constituent auditory and visual information. This phenomenon, known as the McGurk effect, has been used in thousands of papers as a measure of audiovisual integration in speech. In my dissertation, I sought to extend the findings of my previous work by investigating the sources of interindividual and interstimulus differences in the McGurk effect. In the first experiment, I investigated the influence of response type on individuals’ perception of the McGurk effect. Studies of the McGurk effect have predominantly adopted either an open-choice or a forced-choice response format to record participants’ responses; I compared the two formats in two groups of participants. To collect data from large numbers of subjects, I developed an experimental toolkit built on the web-based crowdsourcing platform Amazon Mechanical Turk (MTurk), along with methods for collecting and analyzing data using MTurk. I collected data from 110 participants in the open-choice condition and 117 in the forced-choice condition. Participants in the forced-choice condition were more likely to report the McGurk effect than those in the open-choice condition (69% vs. 42%, p = 10⁻⁷). This increase was consistent across all 8 stimuli. McGurk responses varied widely across subjects and stimuli in both the open-choice and forced-choice conditions, ranging from 0% to 100% across subjects and from 30% to 80% across stimuli. In the second experiment, I attempted to influence the efficacy of McGurk stimuli by changing the speed of video playback.
As technology becomes increasingly geared toward audiovisual communication (e.g., videos on YouTube and Coursera), individuals now have the option of slowing information down or speeding it up to accommodate their information-processing needs. I modified the playback rate so that the stimuli were presented at 0.5x, 1x, and 2x speeds (slow, normal, and fast) to two groups of participants (58 in one group and 60 in the other) recruited using MTurk. I found that playback rate does affect the frequency of McGurk responses. At the slow speed, McGurk responses dropped (by an estimated 11%) while visual responses increased (by 12%), whereas speeding the video up to 2x did not produce responses different from those at normal speed (0.7%). The drop in McGurk responses in the slow condition may be explained by an increase in onset asynchrony between the visual and auditory cues.
Audiovisual; speech perception; McGurk effect