ARPABET (also spelled ARPAbet) is a set of phonetic transcription codes developed by the Advanced Research Projects Agency (ARPA) as part of its Speech Understanding Research project in the 1970s. It represents phonemes and allophones of General American English with distinct sequences of ASCII characters. Two systems were devised: one represents each segment with a single character (alternating upper- and lower-case letters), and the other with one or two case-insensitive characters. The latter has been far more widely adopted.[1]
ARPABET has been used in several speech synthesizers, including Computalker for the S-100 system, SAM for the Commodore 64, SAY for the Amiga, TextAssist for the PC, and Speakeasy from Intelligent Artefacts, which used the Votrax SC-01 speech synthesiser IC. It is also used in the CMU Pronouncing Dictionary. A revised version of ARPABET is used in the TIMIT corpus.
Stress is indicated by a digit immediately following a vowel. Auxiliary symbols are identical in 1- and 2-letter codes. In 2-letter notation, segments are separated by a space.
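The 2-letter conventions just described (space-separated segments, with an optional stress digit directly after a vowel code) can be illustrated with a minimal Python sketch. The function name `parse_arpabet` and the sample transcription are hypothetical choices made for this illustration, not part of any ARPABET specification. The sketch handles only plain letter codes and does not cover the auxiliary or TIMIT-specific symbols.

```python
import re

# One 2-letter-notation segment: a run of letters, optionally followed
# by a single stress digit (assumption: at most one digit per segment).
_SEGMENT = re.compile(r"([A-Z]+)([0-9])?")


def parse_arpabet(transcription):
    """Split a space-separated transcription into (code, stress) pairs.

    `stress` is an int when the segment carries a stress digit and
    None otherwise. Input is upper-cased first, because the 2-letter
    notation is case-insensitive.
    """
    segments = []
    for token in transcription.upper().split():
        match = _SEGMENT.fullmatch(token)
        if match is None:
            raise ValueError(f"unexpected segment: {token!r}")
        code, digit = match.groups()
        segments.append((code, int(digit) if digit is not None else None))
    return segments


# Illustrative transcription of "butter" with stress digits on the vowels.
print(parse_arpabet("B AH1 T ER0"))
# [('B', None), ('AH', 1), ('T', None), ('ER', 0)]
```

The sketch does not check that a stress digit follows a vowel code. A stricter version could validate each code against the vowel table.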
| ARPABET (1-letter) | ARPABET (2-letter) | IPA | Example(s) |
|---|---|---|---|
| @ | AA | ɑ~ɒ | balm, bot (with father–bother merger) |
| a | AE | æ | bat |
| A | AH | ʌ | butt |
| c | AO | ɔ | caught, story |
| W | AW | aʊ | bout |
| x | AX | ə | comma |
| | AXR[2] | ɚ | letter, forward |
| Y | AY | aɪ | bite |
| E | EH | ɛ | bet |
| R | ER | ɝ | bird, foreword |
| e | EY | eɪ | bait |
| I | IH | ɪ | bit |
| X | IX | ɨ | roses, rabbit |
| i | IY | i | beat |
| o | OW | oʊ | boat |
| O | OY | ɔɪ | boy |
| U | UH | ʊ | book |
| u | UW | u | boot |
| | UX | ʉ | dude |

| ARPABET (1-letter) | ARPABET (2-letter) | IPA | Example |
|---|---|---|---|
| b | B | b | buy |
| C | CH | tʃ | China |
| d | D | d | die |
| D | DH | ð | thy |
| F | DX | ɾ | butter |
| L | EL | l̩ | bottle |
| M | EM | m̩ | rhythm |
| N | EN | n̩ | button |
| f | F | f | fight |
| g | G | ɡ | guy |
| h | HH or H | h | high |
| J | JH | dʒ | jive |
| k | K | k | kite |
| l | L | l | lie |
| m | M | m | my |
| n | N | n | nigh |
| G | NX or NG | ŋ | sing |
| | NX | ɾ̃ | winner |
| p | P | p | pie |
| Q | Q | ʔ | uh-oh |
| r | R | ɹ | rye |
| s | S | s | sigh |
| S | SH | ʃ | shy |
| t | T | t | tie |
| T | TH | θ | thigh |
| v | V | v | vie |
| w | W | w | wise |
| H | WH | ʍ | why (without wine–whine merger) |
| y | Y | j | yacht |
| z | Z | z | zoo |
| Z | ZH | ʒ | pleasure |

| Symbol | Description |
|---|---|
| 0 | No stress |
| 1 | Primary stress |
| 2 | Secondary stress |
| 3... | Tertiary and further stress |
| - | Silence |
| | Non-speech segment |
| + | Morpheme boundary |
| / | Word boundary |
| | Utterance boundary |
| | Tone group boundary |
| 1 or . | Falling or declining juncture |
| 2 or ? | Rising or internal juncture |
| 3 or . | Fall-rise or non-terminal juncture |

In TIMIT, the following symbols are used in addition to the ones listed above:[3]
| Symbol | IPA | Example | Description |
|---|---|---|---|
| AX-H | ə̥ | suspect | Devoiced /ə/ |
| BCL | b̚ | obtain | [b] closure |
| DCL | d̚ | width | [d] closure |
| ENG | ŋ̍ | Washington | Syllabic [ŋ] |
| GCL | ɡ̚ | dogtooth | [ɡ] closure |
| HV | ɦ | ahead | Voiced /h/ |
| KCL | k̚ | doctor | [k] closure |
| PCL | p̚ | accept | [p] closure |
| TCL | t̚ | catnip | [t] closure |
| PAU | | | Pause |
| EPI | | | Epenthetic silence |
| H# | | | Begin/end marker |
Jurafsky, Daniel; Martin, James H. (2000). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Prentice Hall. pp. 94–5. ISBN 0-1309-5069-6.