{"id":232,"date":"2014-10-11T02:37:19","date_gmt":"2014-10-11T02:37:19","guid":{"rendered":"http:\/\/sonny.cslu.ohsu.edu\/~gormanky\/blog\/?p=232"},"modified":"2014-10-11T02:37:19","modified_gmt":"2014-10-11T02:37:19","slug":"how-uh-and-um-differ","status":"publish","type":"post","link":"https:\/\/www.wellformedness.com\/blog\/how-uh-and-um-differ\/","title":{"rendered":"How &quot;uh&quot; and &quot;um&quot; differ"},"content":{"rendered":"<p>If you&#8217;ve been following the recent\u00a0discussions on <a title=\"Language Log\" href=\"http:\/\/languagelog.ldc.upenn.edu\/\">Language Log<\/a>, then you know that there is a great deal of inter-speaker variation in the use of the fillers\u00a0<em>uh<\/em> and\u00a0<em>um<\/em>, despite their superficial similarity. In this post, I&#8217;ll discuss some published results, summarize some of the Language Log findings (with the obvious caveat that none of it has been subject to any sort of peer review) and explain what I\u00a0think it all\u00a0means for our understanding of the contrast between\u00a0<em>uh<\/em> and\u00a0<em>um<\/em>.<\/p>\n<h1>The function of\u00a0<em>uh<\/em> and\u00a0<em>um<\/em><\/h1>\n<p>The vast majority of work on disfluencies (which include fillers like\u00a0<em>uh <\/em>and\u00a0<em>um<\/em> as well as repetitions, revisions, and false starts) assumes that\u00a0<em>uh<\/em> and\u00a0<em>um<\/em> are functionally equivalent, substitutable forms. But Clark and Fox Tree (2002) argue that they are subtly different. They claim that\u00a0<em>uh<\/em> serves to signal minor delays and\u00a0<em>um<\/em> signals major delays. 
The\u00a0evidence for this is straightforward:<\/p>\n<ul>\n<li><em>Um<\/em> is more often followed by a pause than <em>uh<\/em>.<\/li>\n<li>Pauses after <em>um<\/em>s tend to be longer than those occurring after <em>uh<\/em>s (though <a href=\"http:\/\/languagelog.ldc.upenn.edu\/nll\/?p=14991\">Mark has failed to replicate this<\/a>\u00a0in a much larger corpus,\u00a0and I am inclined to defer to him).<\/li>\n<li><em>Um<\/em> is more common than\u00a0<em>uh <\/em>in utterance-initial position, the point at which speech\u00a0planning demands\u00a0are presumably at their greatest. [1]<\/li>\n<\/ul>\n<p>From these results, though, it is not obvious that\u00a0<em>uh<\/em> and\u00a0<em>um<\/em> are <strong>qualitatively<\/strong> different. This has not prevented people (myself included) from making this jump. For example, <a href=\"http:\/\/www.theatlantic.com\/health\/archive\/2014\/08\/men-say-uh-and-women-say-um\/375729\/\">Mark speculated a bit about this for the Atlantic<\/a>: &#8220;People tend to use UM when they&#8217;re trying to decide what to say, and UH when they&#8217;re trying to decide how to say it.&#8221; This is plausible, but the\u00a0evidence for differential functions of <em>uh<\/em> and\u00a0<em>um<\/em>\u00a0is\u00a0lacking.<\/p>\n<h1>Interspeaker differences in\u00a0<em>uh<\/em> and\u00a0<em>um<\/em><\/h1>\n<h2>Gender effects<\/h2>\n<p>The first\u2014and probably most robust\u2014finding is that female speakers have a higher average\u00a0<em>um<\/em>\/<em>uh<\/em> ratio than males. This pattern was found in several corpora of American English available from the LDC (<a href=\"http:\/\/itre.cis.upenn.edu\/~myl\/languagelog\/archives\/002629.html\">1<\/a>\u00a0<a href=\"http:\/\/languagelog.ldc.upenn.edu\/nll\/?p=13713\">2<\/a>\u00a0<a href=\"http:\/\/languagelog.ldc.upenn.edu\/nll\/?p=13805\">3<\/a>). It is also reported in a recent paper by Acton (2011), who looks at two American English corpora. 
A higher\u00a0<em>um\/uh<\/em> ratio in females was also found in two corpora of British English. <a href=\"http:\/\/languagelog.ldc.upenn.edu\/nll\/?p=14093\">The first analysis looks at data from the HCRC map task<\/a>\u00a0and the <a href=\"http:\/\/languagelog.ldc.upenn.edu\/nll\/?p=14143\">second at the conversational portion of the British National Corpus<\/a>\u00a0(BNC). The latter was earlier\u00a0the\u00a0subject of a study by Rayson et al. (1997), who found that\u00a0<em>er\u00a0<\/em>(the British equivalent of\u00a0<em>uh<\/em>)\u00a0[2]\u00a0was one of the\u00a0words most strongly associated with male (rather than female) speakers; the only word more &#8220;masculine&#8221; than\u00a0<em>er<\/em>\u00a0was the expletive\u00a0<em>fucking<\/em>.<\/p>\n<h2>Social class effects<\/h2>\n<p>The second\u00a0finding is that the <em>um<\/em>\/<em>uh<\/em>\u00a0ratio is correlated with social class: higher status speakers have a higher\u00a0<em>um\/uh<\/em> ratio.\u00a0Once again, this was first reported by Rayson et al., who found that\u00a0<i>erm<\/i>\u00a0is more common in speakers with high-status occupations. <a href=\"http:\/\/languagelog.ldc.upenn.edu\/nll\/?p=13990\">Mark found a similar pattern in American English<\/a>\u00a0using educational attainment\u2014rather than occupation\u2014as a measure of social class.<\/p>\n<h2>Age effects<\/h2>\n<p>The third\u00a0finding\u00a0is that younger speakers have a higher\u00a0<em>um<\/em>\/<em>uh<\/em> ratio than older speakers. This was first reported by Rayson et al. (once again, studying the conversational portion of the BNC), who found that\u00a0<em>er\u00a0<\/em>is much more common in speakers over the age of 35. 
Similar patterns are reported by Acton, and several Language Log correspondents (<a href=\"http:\/\/itre.cis.upenn.edu\/~myl\/languagelog\/archives\/002629.html\">1<\/a>\u00a0<a href=\"http:\/\/languagelog.ldc.upenn.edu\/nll\/?p=13713\">2<\/a>\u00a0<a href=\"http:\/\/languagelog.ldc.upenn.edu\/nll\/?p=14058\">3<\/a>\u00a0<a href=\"http:\/\/languagelog.ldc.upenn.edu\/nll\/?p=14143\">4<\/a>).<\/p>\n<h2>Geographic effects<\/h2>\n<p>Finally, <a href=\"http:\/\/languagelog.ldc.upenn.edu\/nll\/?p=14015\">Jack Grieve looked at <em>um<\/em>\/<em>uh<\/em> ratio geographically<\/a>, and found that\u00a0<em>um<\/em> was more common in the Midlands and the central southwest. I see two issues with this result, however. First, I don&#8217;t observe any\u00a0geographic patterns in the raw data (<em>ibid.<\/em>, in the comments section of that post); to my eye, the geographic patterns\u00a0only emerge after aggressive smoothing; this may just be another case of\u00a0Smoothers Gone Wild. Secondly, the data was taken from geocoded Twitter posts, not speech. As\u00a0commenter &#8220;BK&#8221; asks: &#8220;do we have any reason to believe that writing &#8216;UM&#8217; vs &#8216;UH&#8217; in a tweet is at all correlated with the use of &#8216;UM&#8217; vs &#8216;UH&#8217; in speech?&#8221; Regrettably, I suspect the answer is no,\u00a0but there still is probably something to be gleaned\u00a0from tweeters&#8217; stylistic use of these fillers.<\/p>\n<h2><em>Uh<\/em> and <i>um<\/i> in children with autism<\/h2>\n<p>Our\u00a0recent work on filler use in children with autism spectrum disorders (ASD) might provide us another way to get at the functional differentiation between\u00a0<em>uh<\/em> and\u00a0<em>um<\/em>. We [3] used a semi-structured corpus of diagnostic interviews of children ages 4-8, and find that children with ASD produce a much lower\u00a0<em>um<\/em>&#8211;<em>uh<\/em> ratio than typically developing children matched for age and intelligence. 
Children with specific language impairment\u2014a neurodevelopmental disorder characterized by language delays or deficits in the absence of other developmental or sensory impairments\u2014have an\u00a0<em>um<\/em>&#8211;<em>uh\u00a0<\/em>ratio much closer to that of the typical children; this tells us it&#8217;s not about language impairment (something which is relatively common in\u2014but not specific to\u2014children with ASD). We also find that the\u00a0<em>um<\/em>&#8211;<em>uh<\/em> ratio is correlated with the Communication Total Score of the Social Communication Questionnaire, a parent-reported measure of communication ability. At the very least, individuals who use more\u00a0<em>um<\/em> are perceived to have better\u00a0communication abilities by their parents. At best, use of\u00a0<em>um\u00a0<\/em>itself\u00a0contributes to these perceptions.<\/p>\n<h1>How\u00a0<em>uh<\/em> and <i>um<\/i> differ<\/h1>\n<p>To the sociolinguistic eye, the effects of gender, class, and age just described tell us\u00a0a lot about\u00a0<em>uh<\/em>\u00a0and\u00a0<em>um<\/em>. Given that\u00a0women have a higher\u00a0<em>um\/uh\u00a0<\/em>ratio than\u00a0men, we expect that\u00a0<em>um <\/em>is either the more prestigious variant, or the incoming variant, or both.\u00a0This is what Labov calls\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Gender_paradox\">the gender paradox<\/a>: women consistently lead men in the use of prestige variants, and lead men in the adoption of innovative variants.\u00a0Further evidence that\u00a0<em>um<\/em> is the prestige variant comes from social class: higher status individuals have a higher\u00a0<em>um<\/em>\/<em>uh<\/em> ratio. Younger speakers have a higher\u00a0<em>um<\/em>\/<em>uh <\/em>ratio, suggesting\u00a0that\u00a0<em>um<\/em> is also the incoming variant. 
This is not the only possible interpretation, however; it may be that the\u00a0variants are subject to <a href=\"http:\/\/en.wikipedia.org\/wiki\/Age-graded_variation\"><em>age grading<\/em><\/a>\u2014meaning that speakers change their use of\u00a0<em>uh<\/em> and\u00a0<em>um<\/em> as they age\u2014which does not entail that there is any change in progress. Given a change in\u00a0<em>apparent time<\/em>\u2014meaning that younger and older speakers\u00a0use the variants at different rates\u2014the only way to tell whether there is change in progress is to look at data collected at multiple time points. While the evidence is limited, it looks like\u00a0<a href=\"http:\/\/languagelog.ldc.upenn.edu\/nll\/?p=14058\">both age grading and change in progress are occurring<\/a>\u2014they are not mutually exclusive, after all.<\/p>\n<p>Unfortunately, some evidence from\u00a0style shifting problematizes\u00a0this\u00a0view of\u00a0<em>um<\/em> as a prestige\u00a0variant.\u00a0O&#8217;Connell and Kowal (2005) look at\u00a0<em>uh<\/em> and\u00a0<em>um<\/em>\u00a0by analyzing the speech of\u00a0professional TV\u00a0and radio personalities\u00a0interviewing Hillary Clinton. If\u00a0<em>um<\/em> is the more prestigious variant, then we would expect a higher\u00a0<em>um<\/em>\/<em>uh<\/em> ratio in this formal context compared to the more casual styles recorded in other\u00a0corpora. But in fact these experienced public speakers have a particularly\u00a0low\u00a0<em>um\/uh<\/em> ratio. Hillary Clinton produced 640\u00a0<em>uh<\/em>s and\u00a0160 <em>um<\/em>s, for an\u00a0<em>um<\/em>\/<em>uh<\/em> ratio of 0.250; in contrast, Mark found that on average,\u00a0<a href=\"Mark%20found that the average um\/uh ratio for female speaker in the Fisher corpora\">female speakers in the Fisher corpora favored <em>um<\/em>\u00a0more than 2-to-1.<\/a><\/p>\n<p>So why is Hillary Clinton hating on\u00a0<em>um<\/em>? 
Can an incoming variant be associated with women and the upper classes yet still avoided in formal contexts?\u00a0Or are we simply wrong to think of\u00a0<em>uh<\/em> and\u00a0<em>um<\/em> as variants of a single variable? Is it possible that, given\u00a0our limited\u00a0understanding of the functional differences between\u00a0<em>uh<\/em> and\u00a0<em>um<\/em>, we have failed to account for associations\u00a0between discourse demands and\u00a0social groups (or speech styles)? Perhaps Clinton just needs\u00a0<em>uh<\/em> more than we could ever know.<\/p>\n<h1>Endnotes<\/h1>\n<p>[1] This finding is so robust, it even holds in Dutch, which has very similar fillers to those of English (Swerts 1998).<br \/>\n[2] Note that, at least according to the <em>Oxford English Dictionary<\/em>, British\u00a0<em>er<\/em> and\u00a0<em>erm\u00a0<\/em>are just orthographic variants of\u00a0<em>uh<\/em> and\u00a0<em>um<\/em>, respectively. That&#8217;s not to say that they&#8217;re pronounced identically, just that they are functionally equivalent.<br \/>\n[3] Early studies geared toward speech researchers\u00a0were conducted by <a href=\"http:\/\/cslu.ohsu.edu\/~heeman\/\">Peter Heeman<\/a> and Rebecca Lunsford. Other coauthors include Lindsay Olson, Alison Presmanes Hill, and Jan van Santen.<\/p>\n<h1>References<\/h1>\n<p>E.K. Acton. 2011. On gender differences in the distribution of\u00a0<em>um<\/em> and\u00a0<em>uh<\/em>.\u00a0<em>Penn Working Papers in Linguistics<\/em> 17(2): 1-9.<br \/>\nH.H. Clark and J.E. Fox Tree. 2002. Using\u00a0<em>uh\u00a0<\/em>and\u00a0<em>um<\/em> in spontaneous speaking.\u00a0<em>Cognition<\/em> 84(1): 73-111.<br \/>\nD.C. O&#8217;Connell and S. Kowal. 2005.\u00a0<em>Uh<\/em> and\u00a0<em>um<\/em> revisited: Are they interjections for signaling delay?\u00a0<em>Journal of Psycholinguistic Research\u00a0<\/em>34(6): 555-576.<br \/>\nP. Rayson, G. Leech, and M. Hodges. 1997. 
Social differentiation in the use of English vocabulary: Some analyses of the conversational component of the British National Corpus.\u00a0<em>International Journal of Corpus Linguistics<\/em> 2(1): 133-152.<br \/>\nM. Swerts. 1998. Filled pauses as markers of discourse structure.\u00a0<em>Journal of Pragmatics<\/em> 30(4): 485-496.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>If you&#8217;ve been following the recent\u00a0discussions on Language Log, then you know that there is a great deal of inter-speaker variation in the use of the fillers\u00a0uh and\u00a0um, despite their superficial similarity. In this post, I&#8217;ll discuss some published results, summarize some of the Language Log findings (with the obvious caveat that none of it &hellip; <a href=\"https:\/\/www.wellformedness.com\/blog\/how-uh-and-um-differ\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;How &quot;uh&quot; and &quot;um&quot; differ&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2,4,9],"tags":[],"class_list":["post-232","post","type-post","status-publish","format-standard","hentry","category-autism","category-language","category-sociolinguistics"],"_links":{"self":[{"href":"https:\/\/www.wellformedness.com\/blog\/wp-json\/wp\/v2\/posts\/232","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.wellformedness.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.wellformedness.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.wellformedness.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.wellformedness.com\/blog\/wp-json\/wp\/v2\/comments?post=232"}],"version-history":[{"count":0,"href":"https:\/\/www.wellformedness.com\/blog\/wp-json\/wp\/v2\/posts\/232\/revisions"}],
"wp:attachment":[{"href":"https:\/\/www.wellformedness.com\/blog\/wp-json\/wp\/v2\/media?parent=232"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.wellformedness.com\/blog\/wp-json\/wp\/v2\/categories?post=232"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.wellformedness.com\/blog\/wp-json\/wp\/v2\/tags?post=232"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}