Problems in generating the dict and monophones1 file
User: prithviraj
Date: 3/29/2018 1:17 am

Hi,

I am working on the Odia language. I followed Step 2 of the VoxForge tutorial to generate the dict and monophones1 files, using a lexicon file that I created myself. Running the command given in the tutorial does not produce any error, but the contents of the dict and monophones1 files are not in Odia script.

voxforge_lexicon.txt

SENT-END        []              sil

SENT-START      []              sil

ଆଠ [ଆଠ] ଆ ଠ୍ ଅ 

ଏକ [ଏକ]            ଏ କ୍ ଅ

ଚାରି [ଚାରି]            ଚ ଆ ର୍ ଇ

ଛଅ [ଛଅ]            ଛ୍ ଅ ଅ

ତିନି [ତିନି]            ତ୍ ଇ ନ୍ ଇ

ଦୁଇ [ଦୁଇ]            ଦ୍ ଉ ଇ

ନଅ [ନଅ]            ନ୍ ଅ ଅ

ପାଞ୍ଚ [ପାଞ୍ଚ]            ପ୍ ଆ ଞ ଚ୍ ଅ

ଶୂନ [ଶୂନ]            ଶ୍ ଊ ନ୍ ଅ

ସାତ [ସାତ]            ସ୍ ଆ ତ୍ ଅ

wlist
SENT-END
SENT-START
ଆଠ
ଏକ
ଚାରି
ଛଅ
ତିନି
ଦୁଇ
ନଅ
ପାଞ୍ଚ
ଶୂନ
ସାତ
I used the command:
HDMan -A -D -T 1 -m -w wlist -n monophones1 -i -l dlog dict ../lexicon/voxforge_lexicon.txt
The content of the dict file that I obtained is:
SENT-END        []              sil
SENT-START      []              sil
\340\254\206\340\254\240 [\340\254\206\340\254\240] \340\254\206 \340\254\240\340\255\215 \340\254\205 sp
\340\254\217\340\254\225 [\340\254\217\340\254\225] \340\254\217 \340\254\225\340\255\215 \340\254\205 sp
\340\254\232\340\254\276\340\254\260\340\254\277 [\340\254\232\340\254\276\340\254\260\340\254\277] \340\254\232 \340\254\206 \340\254\260\340\255\215 \340\254\207 sp
\340\254\233\340\254\205 [\340\254\233\340\254\205] \340\254\233\340\255\215 \340\254\205 \340\254\205 sp
\340\254\244\340\254\277\340\254\250\340\254\277 [\340\254\244\340\254\277\340\254\250\340\254\277] \340\254\244\340\255\215 \340\254\207 \340\254\250\340\255\215 \340\254\207 sp
\340\254\246\340\255\201\340\254\207 [\340\254\246\340\255\201\340\254\207] \340\254\246\340\255\215 \340\254\211 \340\254\207 sp
\340\254\250\340\254\205 [\340\254\250\340\254\205] \340\254\250\340\255\215 \340\254\205 \340\254\205 sp
\340\254\252\340\254\276\340\254\236\340\255\215\340\254\232 [\340\254\252\340\254\276\340\254\236\340\255\215\340\254\232] \340\254\252\340\255\215 \340\254\206 \340\254\236 \340\254\232\340\255\215 \340\254\205 sp
\340\254\266\340\255\202\340\254\250 [\340\254\266\340\255\202\340\254\250] \340\254\266\340\255\215 \340\254\212 \340\254\250\340\255\215 \340\254\205 sp
\340\254\270\340\254\276\340\254\244 [\340\254\270\340\254\276\340\254\244] \340\254\270\340\255\215 \340\254\206 \340\254\244\340\255\215 \340\254\205 sp
and the monophones1 file is:
sil
\340\254\206
\340\254\240\340\255\215
\340\254\205
sp
\340\254\217
\340\254\225\340\255\215
\340\254\232
\340\254\260\340\255\215
\340\254\207
\340\254\233\340\255\215
\340\254\244\340\255\215
\340\254\250\340\255\215
\340\254\246\340\255\215
\340\254\211
\340\254\252\340\255\215
\340\254\236
\340\254\232\340\255\215
\340\254\266\340\255\215
\340\254\212
\340\254\270\340\255\215
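The backslash sequences in dict and monophones1 look like three-digit octal escapes of the UTF-8 bytes of the Odia characters (for example, `\340\254\206` is the byte sequence E0 AC 86, which is U+0B06, ଆ). A minimal Python sketch to check this, assuming every `\ooo` group stands for exactly one byte; the helper name `decode_octal_escapes` is hypothetical and not part of HTK:

```python
import re

# Matches a backslash followed by three octal digits in the byte range 000-377.
_OCTAL_ESCAPE = re.compile(rb"\\([0-3][0-7]{2})")

def decode_octal_escapes(line: str) -> str:
    """Turn \\ooo octal byte escapes back into UTF-8 text.

    Assumption: each escape represents one raw byte of a UTF-8 sequence.
    """
    raw = line.encode("utf-8")
    # Replace every escape with the single byte it names.
    unescaped = _OCTAL_ESCAPE.sub(
        lambda m: bytes([int(m.group(1).decode("ascii"), 8)]), raw
    )
    return unescaped.decode("utf-8")

# First Odia entry of the dict file shown above.
entry = r"\340\254\206\340\254\240 [\340\254\206\340\254\240] \340\254\206 \340\254\240\340\255\215 \340\254\205 sp"
print(decode_octal_escapes(entry))
# ଆଠ [ଆଠ] ଆ ଠ୍ ଅ sp
```

If the decoded lines match the lexicon entries, as this one does, the files carry the same Odia text in escaped form rather than corrupted data.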
Content of the dlog file:
WARNING: no script file ../lexicon/voxforge_lexicon.txt.ded
Dictionary Usage Statistics
---------------------------
  Dictionary    TotalWords WordsUsed  TotalProns PronsUsed
voxforge_lex        12         12         12         12
       dict        12         12         12         12
12 words required, 0 missing
New Phone Usage Counts
---------------------
  1. sil   :     2
  2. ଆ   :     4
  3. ଠ୍ :     1
  4. ଅ   :     9
  5. sp    :    10
  6. ଏ   :     1
  7. କ୍ :     1
  8. ଚ   :     1
  9. ର୍ :     1
 10. ଇ   :     4
 11. ଛ୍ :     1
 12. ତ୍ :     2
 13. ନ୍ :     3
 14. ଦ୍ :     1
 15. ଉ   :     1
 16. ପ୍ :     1
 17. ଞ   :     1
 18. ଚ୍ :     1
 19. ଶ୍ :     1
 20. ଊ   :     1
 21. ସ୍ :     1
Dictionary dict created
I have set up the global.ded file exactly as it is given in the tutorial.
I am using Notepad++, which supports Unicode, to write these Odia words. I think the problem may be with Unicode support, so please suggest a solution.
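Since the editor's encoding setting is under suspicion, it may be worth ruling out a byte-order mark or a UTF-16 save in the lexicon file. The following is a minimal sketch, not a definitive diagnosis: the helper name `describe_encoding` is hypothetical, and whether HTK is affected by a BOM is an assumption that is not confirmed here.

```python
import codecs

def describe_encoding(data: bytes) -> str:
    """Classify raw file bytes by BOM and UTF-8 validity."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8 with BOM"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "not utf-8"
    return "utf-8 without BOM"

# In-memory stand-in for one lexicon line; for a real file, read its bytes with
# open(path, "rb").read() and pass them in instead.
sample = "ଆଠ [ଆଠ] ଆ ଠ୍ ଅ\n".encode("utf-8")
print(describe_encoding(sample))
# utf-8 without BOM
```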
Prithviraj