TalkDrill Team
English Learning Experts
You say "tree" when you mean "three." You say "dat" when you mean "that." Nobody corrects you because they understand what you mean, but every mispronounced TH, swapped V/W, or rolled R quietly signals that English isn't your comfort zone. And in interviews, client calls, or IELTS speaking tests, these small errors add up fast.
These three sounds are the most common pronunciation challenges for Indian English speakers. Research published in the Journal of Phonetics found that dental fricatives (the TH sounds) are among the most frequently substituted consonants in varieties of English influenced by Indian languages such as Gujarati and Tamil (Wiltshire & Harnsberger, Journal of Phonetics, 2006). The reason isn't laziness or lack of effort. It's that your mother tongue genuinely doesn't have these sounds, so your mouth never learned the positions.
This guide gives you the exact tongue and lip placements, common mistakes to avoid, and drills you can practice alone. No vague advice. Just step-by-step phonetics explained in plain language.
Key Takeaways
The core issue is phonological transfer. A study on Indian English phonology found that speakers systematically replace unfamiliar English sounds with the closest equivalent from their native language (Sailaja, Indian English, 2009). Your brain picks the nearest match from Hindi, Tamil, Telugu, or Bengali, and your mouth follows.
Here's what happens with each sound.
Hindi, Tamil, Telugu, Kannada, and Bengali have no dental fricative. They do have dental stops (the "soft" त and द sounds where the tongue taps the teeth). So when Indian speakers see "TH," the brain maps it to that familiar dental stop. "Think" becomes "tink." "That" becomes "dat."
Hindi has a sound written as "व" that sits somewhere between English V and W. Linguists call it a labiodental approximant. It's softer than a true V but not as rounded as a W. So Hindi speakers often use this single sound for both, making "vine" and "wine" sound identical.
Indian languages use a retroflex R (tongue curled back and tapped against the hard palate) or a trilled/tapped R. English uses an approximant R where the tongue curls slightly but never touches anything. The difference is subtle but audible.
These aren't random errors. They're predictable patterns based on your first language's sound system. Once you understand why you make each substitution, fixing it becomes much simpler.
Indian English speakers systematically replace dental fricatives (TH sounds) with dental stops because Hindi, Tamil, and other Indian languages lack fricative consonants at the dental place of articulation, a pattern documented across multiple Indian language backgrounds (Wiltshire & Harnsberger, Journal of Phonetics, 2006).
English has two TH sounds, and most learners don't realize they're different. The voiceless TH (as in "think") appears in roughly 4.7% of English words, while the voiced TH (as in "the") is one of the 10 most frequent sounds in spoken English (Mines et al., Journal of the Acoustical Society of America, 1978). Getting these right dramatically improves how natural your English sounds.
Mouth position: Place the tip of your tongue lightly between your upper and lower front teeth. Not behind the teeth. Between them. About 2-3 millimeters of your tongue tip should be visible if you looked in a mirror.
The action: Blow air gently over your tongue. You should feel a soft, airy friction. There's no vibration in your throat. Put your fingers on your Adam's apple. You should feel nothing.
Common mistake: Pulling your tongue back and saying "tink" instead of "think." If your tongue touches the ridge behind your upper teeth (the alveolar ridge), you're making a T sound, not a TH.
Practice words:
Test yourself: Say "three trees." If both words sound the same, your TH needs work. "Three" should start with air flowing over your tongue. "Trees" should start with a crisp stop.
Mouth position: Identical to the voiceless TH. Tongue tip between the teeth, light contact.
The difference: This time your vocal cords vibrate. Place your fingers on your throat and hum. Feel that buzz? Keep it going while your tongue is between your teeth. That's the voiced TH.
Common mistake: Saying "da" instead of "the," or "dat" instead of "that." The D sound is made with your tongue behind your teeth, tapping the alveolar ridge. The voiced TH keeps the tongue between the teeth with continuous airflow, no tapping.
Practice words:
A helpful trick: start by saying a long "zzzzz" sound. Now, while keeping that buzzing going, slowly push your tongue forward between your teeth. The sound will shift from Z to the voiced TH. That's the exact position you need.
Say these pairs slowly, then faster. If they sound identical, you're substituting.
| Correct TH | Common Substitution |
|---|---|
| think | tink |
| three | tree |
| thin | tin |
| the | da |
| that | dat |
| then | den |
| math | mat |
| bath | bat |
| breath | bret |
| tooth | toot |
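If you keep a written transcript of your recordings, the pairs in the table can double as a rough checklist. The sketch below is only an illustration: the function name `flag_substitutions` is hypothetical, it assumes you (or a listener) have typed out what was actually heard, and it compares the two sentences word by word, so it only works when both have the same number of words.

```python
# Pairs taken from the table above: correct TH word -> common substitution.
TH_SUBSTITUTIONS = {
    "think": "tink",
    "three": "tree",
    "thin": "tin",
    "the": "da",
    "that": "dat",
    "then": "den",
    "math": "mat",
    "bath": "bat",
    "breath": "bret",
    "tooth": "toot",
}


def flag_substitutions(target, heard):
    """Return (expected, heard) pairs where a TH word was replaced
    by the substitution listed for it in the table."""
    flagged = []
    # Assumption: both sentences have the same words in the same order.
    for expected, actual in zip(target.lower().split(), heard.lower().split()):
        if TH_SUBSTITUTIONS.get(expected) == actual:
            flagged.append((expected, actual))
    return flagged


if __name__ == "__main__":
    # Example: two of the three TH words were substituted.
    print(flag_substitutions("I think that three is enough",
                             "I tink dat three is enough"))
```

Counting the flagged pairs from one week to the next gives you a simple, if crude, measure of progress.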
The V and W distinction trips up speakers of Hindi, Urdu, Punjabi, and several other Indian languages because these languages use a single sound that falls between the two English consonants. Research on Hindi-English bilinguals showed that Hindi speakers often produce a labiodental approximant (softer than V, less rounded than W) for both sounds (Ghosh, Laboratory Phonology, 2018). Fixing this requires learning two distinct lip positions.
Mouth position: Lightly press your upper front teeth against your lower lip. Your teeth should rest on the wet-dry border of your lip.
The action: Push air through the narrow gap between your teeth and lip. This creates audible friction. Your vocal cords vibrate (V is voiced). You should feel a slight buzzing on your lower lip.
Key point: V is a fricative. There's continuous friction, like a soft buzz. It's not a gentle tap. Hold the sound: "vvvvvv." You can sustain it as long as you have breath.
Mouth position: Round your lips into a tight circle, as if you're about to whistle or say "oo." Your teeth are NOT involved at all. No teeth touching lips.
The action: Start from the rounded position and quickly glide into the vowel that follows. W is a glide (semivowel), not a fricative like V. It's the movement from "oo" to the next vowel that creates the sound.
Key point: W cannot be sustained. Try saying "wwwww" by itself. You can't hold it. It only exists as a transition into a vowel. If you can hold the sound, you're probably making a V.
Hold your hand in front of your mouth.
If both feel the same, you're blending them.
| V Word | W Word |
|---|---|
| vine | wine |
| vest | west |
| veil | wail |
| vent | went |
| verse | worse |
| vow | wow |
| very | wary |
| vigor | wiggle |
Practice each pair back to back. Exaggerate the difference at first. For V, really press your teeth into your lip. For W, really round those lips. Over time, the positions become automatic.
Hindi speakers commonly produce a labiodental approximant for both V and W, a single sound that splits into two distinct consonants in English. The V requires teeth-on-lip friction, while the W requires rounded lips with no dental contact, a distinction absent in most Indian languages (Ghosh, Laboratory Phonology, 2018).
The English R is one of the hardest consonants for non-native speakers worldwide. A phonetic analysis of Indian English found that speakers from Hindi and Dravidian language backgrounds (such as Tamil) consistently use a retroflex flap or tap instead of the English approximant R, giving their speech a distinctly "Indian" quality (Maxwell & Fletcher, Clinical Linguistics & Phonetics, 2009). The fix requires retraining a deeply automatic tongue movement.
In Hindi, your tongue curls back and quickly taps the hard palate (the roof of your mouth, slightly behind the alveolar ridge). This is a retroflex flap. In Tamil and other Dravidian languages, you might use a similar tap or even a trill (a rolled R like in Spanish).
The key feature: your tongue makes contact with something.
Mouth position: Curl your tongue tip slightly upward and backward, but do NOT touch anything. Your tongue hovers in the middle of your mouth. The sides of your tongue should lightly touch your upper back teeth (molars).
The action: Voice the sound while keeping your tongue suspended. Round your lips slightly. The sound comes from the shape of the space inside your mouth, not from any contact or tap.
Common mistake: Tapping or flapping. If your tongue touches the roof of your mouth at any point, it's not the English R. Think of it this way: the English R is like pointing at the roof of your mouth without touching it.
Here's something most pronunciation guides miss. When R comes after a vowel (as in "car," "four," "better"), many Indian speakers actually pronounce it more clearly than native British English speakers (who often drop it entirely). So in post-vocalic positions, the Indian R may actually be closer to American English. Focus your correction energy on R at the beginning of words and in consonant clusters like "three," "break," and "through."
Consistent short practice beats occasional long sessions. Research in motor learning shows that distributed practice (10-15 minutes daily) produces better skill retention than massed practice (one hour weekly) for articulation skills (Maas et al., Journal of Speech, Language, and Hearing Research, 2008). Use these drills daily for two weeks and you'll notice real change.
Read each sentence slowly. Focus on correct tongue and lip position for every bolded sound.
Minutes 1-3: TH warm-up. Alternate between "think-the-think-the" slowly. Feel your tongue move between the teeth each time.
Minutes 4-6: V/W contrast. Say each minimal pair from the list above five times. Exaggerate the lip positions.
Minutes 7-9: R practice. Say "right, really, run, around, world" five times each. Check that your tongue never taps the roof.
Minute 10: Read two sentences from the practice list above at natural speed. Record yourself and play it back.
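If you like to keep your practice on a timer, the ten-minute routine above can be written down as a small lookup table. This is a minimal sketch, not part of any TalkDrill tool: the names `DRILL_PLAN`, `activity_for_minute`, and `total_minutes` are made up for illustration, and the activity text is paraphrased from the steps above.

```python
# Each entry: (first minute, last minute, what to practice).
DRILL_PLAN = [
    (1, 3, "TH warm-up: alternate 'think' and 'the' slowly"),
    (4, 6, "V/W contrast: say each word pair five times"),
    (7, 9, "R practice: right, really, run, around, world, five times each"),
    (10, 10, "Read two practice sentences at natural speed and record yourself"),
]


def activity_for_minute(minute):
    """Return the activity scheduled for a given minute of the drill."""
    for start, end, activity in DRILL_PLAN:
        if start <= minute <= end:
            return activity
    raise ValueError("the drill only covers minutes 1 to 10")


def total_minutes():
    """Add up the length of every block in the plan."""
    return sum(end - start + 1 for start, end, _ in DRILL_PLAN)


if __name__ == "__main__":
    for minute in range(1, total_minutes() + 1):
        print(f"Minute {minute}: {activity_for_minute(minute)}")
```

Keeping the plan as data makes it easy to shift time toward whichever sound you are focusing on in a given fortnight without changing the rest of the script.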
Distributed articulation practice of 10-15 minutes daily produces significantly better retention of new speech motor patterns than equivalent total time spent in longer, less frequent sessions, according to motor learning research applied to speech therapy (Maas et al., Journal of Speech, Language, and Hearing Research, 2008).
Most adults can produce an unfamiliar sound correctly in isolation within one to two practice sessions. The real challenge is using it consistently in conversation. A longitudinal study of pronunciation training found that learners who practiced 15 minutes daily showed statistically significant improvement in target sounds within 6-8 weeks (Thomson, Studies in Second Language Acquisition, 2011). Automaticity, making the sound without thinking, takes longer.
Weeks 1-2: You can produce each sound correctly when you focus on it. Conversational speech still defaults to old habits. This is normal.
Weeks 3-4: You start catching yourself mid-word. You'll say "da-THE" as you self-correct. This awkward stage means your monitoring system is improving.
Weeks 5-8: Correct sounds appear automatically in slow, careful speech. Fast or emotional speech still triggers old patterns occasionally.
Months 3-6: New sounds become the default for most words. Some high-frequency words like "the" and "that" might still slip, especially under pressure.
What speeds this up? Immediate feedback. If someone (or something) tells you the moment you substitute a D for TH, your brain recalibrates faster than if you practice alone without correction.
The biggest mistake learners make is trying to fix all three sounds at once. Pick one sound. Spend two focused weeks on it. Then move to the next. TH is the best starting point because it appears in the most common English words ("the," "this," "that," "they"), so you get hundreds of natural practice repetitions every day.
None of this means an Indian accent is wrong. Indian English is a recognized variety with its own valid phonological patterns. The goal isn't to erase your accent. It's to expand your range. In contexts where intelligibility matters, like international calls, IELTS, or interviews with non-Indian speakers, standard TH, V/W, and R pronunciations reduce miscommunication. About 26% of communication breakdowns in cross-cultural business settings involve pronunciation issues (Jenkins, The Phonology of English as an International Language, 2000).
You can fix these sounds on your own, but it takes longer. The tongue and lip positions described here are enough to produce the sounds correctly. Self-recording and playback help you self-correct. However, studies show that real-time feedback accelerates pronunciation improvement by helping learners notice errors they would otherwise miss (Saito, Language Learning, 2013). AI-based tools can provide this feedback at scale.
You don't have to aim specifically for a British or an American accent. Focus on clarity, not accent. The TH sounds are consistent across all major English varieties. The R sound differs (British English often drops post-vocalic R, while American English keeps it), but producing a clean approximant R works everywhere. Choose one model and be consistent.
Some Indian speakers produce these sounds without effort, usually because of early exposure. Children who attend English-medium schools from age 4-5 often acquire these sounds naturally through immersion. Adult learners must learn them consciously because the critical period for effortless phoneme acquisition is roughly the first 6-8 years of life (Flege, Second Language Speech Learning, 1995). But conscious learning absolutely works. It just takes practice.
Start with TH. It appears in "the," "this," "that," "they," "them," "their," "there," and "then," some of the most common words in English. You'll practice it hundreds of times daily in normal conversation. Once TH feels natural, move to V/W, then R.
Three sounds. That's all that stands between your current pronunciation and a noticeably clearer English accent. The TH requires your tongue between your teeth. The V requires teeth-on-lip friction while the W requires rounded lips. The R requires a hovering tongue that never taps.
None of these positions are difficult. They're just unfamiliar. Your mouth can make every one of these sounds right now. It just needs practice, repetition, and feedback.
Start with the 10-minute daily drill outlined above. Record yourself reading the practice sentences. Compare your recordings week over week. You'll hear the difference before anyone else points it out.
And if you want real-time feedback on every TH, V, W, and R you produce, TalkDrill's AI catches these exact pronunciation errors as you speak. Practice conversations, get instant corrections, and repeat until these sounds feel as natural as your mother tongue.
References:
Wiltshire, C. R., & Harnsberger, J. D. (2006). The influence of Gujarati and Tamil L1s on Indian English. Journal of Phonetics, 34(2), 227-243. https://doi.org/10.1016/j.wocn.2005.06.002
Sailaja, P. (2009). Indian English. Edinburgh University Press. https://doi.org/10.3366/edinburgh/9780748625598.001.0001
Maas, E., et al. (2008). Principles of motor learning in treatment of motor speech disorders. Journal of Speech, Language, and Hearing Research, 51(2), S240-S258. https://doi.org/10.1044/1092-4388(2008/007)