|
Speech Synth! |
Paul Pridham
Member #250
April 2000
|
Heh, it's cool to see that there are more of us out there. ---- |
Matt Smith
Member #783
November 2000
|
I've jiggered them a bit. They sound less gimpy now (I blocked the gap between my front teeth for dh1). Some of them still aren't quite right yet, but that can wait until I have a better editor. I'm surprised how much it sounds like me. This suggests that everyone should record their own set and get their gf to make one too. I hacked at the speech string too; it's more intelligible with either allophone set:
char *str =
    "hh1 eh ll ow pa3 pp iy pp el pa3 ax tt1 pa3 ax ll ll eh pa2 gg2 rr2 ow pa3 dd2 ao pa2 tt2 pa3 ss iy pa3 ss iy pa5 pa5 "
    "dh1 ih ss pa3 ih zz pa3 dh1 ax pa3 vv oy ss pa3 ax vv pa3 yy2 or pa3 nn1 uw2 pa3 mm aa ss pa2 tt2 er2 pa5 pa5 "
    "bb2 aw pa3 dd2 aw nn1 pa3 bb2 iy ff or pa3 mm iy pa5 ae nn1 dd1 pa3 ay pa3 ww ih ll pa3 ss pa2 pp xr pa3 "
    "yy2 or pa3 ll ay vv zz pa5 pa5 dd2 iy ff ay pa3 mm iy pa5 ae nn1 dd1 pa3 mm ay pa3 rr1 ow bb2 ao pa2 tt2 pa3 "
    "ar mm iy zz pa3 ww ih ll pa3 dd2 iy ss pa2 tt2 rr2 oy pa3 yy2 uw2 pa5 pa5 "
    "hh1 ao hh1 ao hh1 eh hh1 eh hh1 eh hh1 iy hh1 ae hh1 aa hh1 aa aa hh1 ao";
Some of the "phonemes" in the SP-0256 set are definitely diphones. Is this why it's called an allophone set rather than phonemes? |
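For anyone wondering how a string like the one above might be turned into something playable, here is a minimal sketch in C. It only splits the space-separated names and looks each one up in a table. The table contents, `allophone_index` and `parse_allophones` are hypothetical names made up for illustration; they are not taken from the Speechy source, and a real table would list every allophone in the order its sample is loaded.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical subset of allophone names, for illustration only. */
static const char *const allophone_names[] = {
    "pa1", "pa2", "pa3", "pa4", "pa5", "hh1", "eh", "ll", "ow"
};
#define ALLOPHONE_COUNT (sizeof allophone_names / sizeof allophone_names[0])

/* Return the table index of a name with the given length, or -1 if unknown. */
int allophone_index(const char *name, size_t len)
{
    size_t i;
    for (i = 0; i < ALLOPHONE_COUNT; i++) {
        if (strlen(allophone_names[i]) == len &&
            memcmp(allophone_names[i], name, len) == 0)
            return (int)i;
    }
    return -1;
}

/* Split a space-separated allophone string into table indices.
   Writes at most max_out entries and returns how many were written.
   Unknown names are stored as -1 so the caller can report them. */
size_t parse_allophones(const char *str, int *out, size_t max_out)
{
    size_t n = 0;
    while (*str != '\0' && n < max_out) {
        size_t len;
        str += strspn(str, " \t");   /* skip separators */
        len = strcspn(str, " \t");   /* length of the next name */
        if (len == 0)
            break;                   /* only trailing whitespace was left */
        out[n++] = allophone_index(str, len);
        str += len;
    }
    return n;
}
```

Calling `parse_allophones(str, indices, count)` on the speech string would give one index per allophone, which a player could then use to pick the matching sample.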
Cheradenine Zakalwe
Member #2,046
March 2002
|
Matt: I downloaded your first set of phonemes... You manage to make "F***-Off" sound very convincing!!! |
Paul Pridham
Member #250
April 2000
|
Well, I think they are called allophones because they aren't really true diphones, but special phoneme cases... like the front/back vowel related transitions for a particular phoneme. By the way, I made a somewhat "nicer" front end for this; I'll post it all up later. I just need to make a few changes so that you can specify which allophone set to use. ---- |
piccolo
Member #3,163
January 2003
|
Come on, guys, tell me how to do the link thing. Where do I upload? I want to show off my code too, you know. wow |
Paul Pridham
Member #250
April 2000
|
Well... you have to have somewhere to upload it to! I'm using my internet account's measly 5MB of webspace to store stuff. I think you may be able to start up a page on Geocities or somewhere and put files online as well, though I've never tried those free webspace providers. ---- |
piccolo
Member #3,163
January 2003
|
Thanks, I'll try it. Yahoo should be good. wow |
Paul Pridham
Member #250
April 2000
|
OK, I've made a little front-end demo for "Speechy," and you can specify an allophone set to use from the command line. You type in allophones, then press ENTER to play them, TAB to save the speech to "out.raw", DEL to clear the whole line, and ESC to quit. I've included the allophone sets I've made. Matt's set should work with this as well: copy it into its own folder under the "allophones" folder like the other sets, make a copy of spo256.txt under a different name for Matt's set, and change the name inside that copy to the directory you placed the new allophones under. Anyway, here she blows: http://www3.sympatico.ca/ppridham/misc/sounds/speechy.zip If you want to convert the out.raw to a WAV, load it up in GoldWave or some such thing and save it back as a WAV. Make sure that the sampling rate you use matches that in the associated .txt file. One thing I have noticed is that I don't think a full diphone set is needed to make the speech intelligible. Certain "fricatives" sound pretty intelligible when mixed with the various vowel and diphthong sounds, for instance V, DH, Z, CH, SH, S, etc. Basically, any unvoiced sounds seem to stand well alone. Diphones would need to be made for most vowel-to-vowel and voiced consonant-to-vowel transitions, although I think that many of these could be made into specialized allophones or generic diphones, rather than a plethora of every possible diphone transition. ---- |
Matt Smith
Member #783
November 2000
|
I'm thinking that separating the voiced and fricative parts into separate samples would help in various ways. It would probably double the size of a phoneme/allophone set but would make a diphone set much smaller because of all the duplicates. It would also let the two parts be mixed and matched for greater variety of voices. I'm working on my general purpose animation editor now, as that makes a good basis for a "voice tracker" too. All the allophones will need loop points so they can be synced to frame rates. |
Cheradenine Zakalwe
Member #2,046
March 2002
|
Quote: I'm working on my general purpose animation editor now, as that makes a good basis for a "voice tracker" too. All the allophones will need loop points so they can be synced to frame rates. I noticed you're using Shockwave Flash on parts of your site, Matt (i.e. the News link). Would this be useful in those sorts of situations?? |
Matt Smith
Member #783
November 2000
|
I'm so transparent. The Flash editor is nearly good enough to use but would be a pain, as you would have to manually create a key frame for each allophone and drag each one into place. Ideally I'd like to generate FLA files from my editor for post-production in Flash, but only SWF is an open format, so I'll have to write them directly. |
Thomas Fjellstrom
Member #476
June 2000
|
Quote: I'm so transparent Funny you should say that, Matt. Most of your head disappears when your eyes glow -- |
Cheradenine Zakalwe
Member #2,046
March 2002
|
Quote: I'm so transparent Naah! Just a case of putting Pi and root 2 together and getting... err.. Quote: The Flash Editor is nearly good enough to use but would be a pain as you would have to manually create a key frame for each allophone and drag each one into place. Right! Never used shockwave myself but I get what you mean.. look forward to seeing what you come up with... |
dudaskank
Member #561
July 2000
|
Quote: If you want to convert the out.raw to a WAV Why not save directly to WAV? Just change this piece of code in main2.c ^__^
^__^ Toque a balada do amor inabalável, eterna love song de nós dois |
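As a rough illustration of what saving directly to WAV involves (this is not the change to main2.c mentioned above), a WAV file is just the raw PCM data with a 44-byte RIFF header in front. The sketch below builds that header and writes the file. The function names `wav_build_header` and `wav_write` are hypothetical, and the sample rate, channel count and bit depth are left as parameters because the format of out.raw is not stated in the thread; they would have to match the sampling rate in the allophone set's .txt file.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* RIFF stores all numbers in little-endian byte order. */
static void put_u16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)(v >> 8);
}

static void put_u32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)((v >> 8) & 0xFF);
    p[2] = (uint8_t)((v >> 16) & 0xFF);
    p[3] = (uint8_t)((v >> 24) & 0xFF);
}

/* Fill in the canonical 44-byte header for uncompressed PCM data. */
void wav_build_header(uint8_t hdr[44], uint32_t data_bytes,
                      uint32_t sample_rate, uint16_t channels,
                      uint16_t bits_per_sample)
{
    uint16_t block_align = (uint16_t)(channels * (bits_per_sample / 8));
    uint32_t pad = data_bytes & 1u;          /* chunks are padded to even size */

    memcpy(hdr, "RIFF", 4);
    put_u32(hdr + 4, 36 + data_bytes + pad); /* file size minus first 8 bytes */
    memcpy(hdr + 8, "WAVE", 4);
    memcpy(hdr + 12, "fmt ", 4);
    put_u32(hdr + 16, 16);                   /* size of the fmt chunk */
    put_u16(hdr + 20, 1);                    /* format tag 1 = PCM */
    put_u16(hdr + 22, channels);
    put_u32(hdr + 24, sample_rate);
    put_u32(hdr + 28, sample_rate * block_align);  /* bytes per second */
    put_u16(hdr + 32, block_align);
    put_u16(hdr + 34, bits_per_sample);
    memcpy(hdr + 36, "data", 4);
    put_u32(hdr + 40, data_bytes);
}

/* Write header plus samples to path. Returns 0 on success, -1 on error. */
int wav_write(const char *path, const void *samples, uint32_t data_bytes,
              uint32_t sample_rate, uint16_t channels, uint16_t bits_per_sample)
{
    uint8_t hdr[44];
    int ok;
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;

    wav_build_header(hdr, data_bytes, sample_rate, channels, bits_per_sample);
    ok = fwrite(hdr, 1, sizeof hdr, f) == sizeof hdr;
    if (ok && data_bytes > 0)
        ok = fwrite(samples, 1, data_bytes, f) == data_bytes;
    if (ok && (data_bytes & 1u))
        ok = fputc(0, f) != EOF;             /* pad byte for odd-sized data */
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

One thing to watch: 8-bit WAV samples are unsigned and 16-bit samples are signed little-endian, so the raw buffer has to already be in the matching form.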
Paul Pridham
Member #250
April 2000
|
Go right ahead. You've got the source code. ---- |
dudaskank
Member #561
July 2000
|
My copy is changed ^__^ Is it possible to "translate" a real string, like hello world, into the allophones? I mean, you type hello world, and not h eh l l oh pa4.... ^__^ Toque a balada do amor inabalável, eterna love song de nós dois |
Paul Pridham
Member #250
April 2000
|
Yes, there is public domain code available to convert text to speech. Some code that should be easy to adapt can be found here: http://www.wps.com/products/Story-Teller/technical/T2A/ ---- |
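To give a feel for the idea, here is a much simpler stand-in than the linked code: a plain dictionary lookup, which only handles words it already knows. The allophone spellings are my reading of the first words of Matt's speech string earlier in the thread, and the "pa3" pause between words follows the same string. The names `word_entry`, `lookup_word` and `text_to_allophones` are hypothetical and exist only for this sketch.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct word_entry {
    const char *word;
    const char *allophones;
};

/* Tiny illustrative dictionary; a usable one would need far more words
   or letter-to-sound rules for anything not listed. */
static const struct word_entry dictionary[] = {
    { "hello",  "hh1 eh ll ow" },
    { "people", "pp iy pp el" },
    { "this",   "dh1 ih ss" },
    { "is",     "ih zz" },
    { "the",    "dh1 ax" },
    { "voice",  "vv oy ss" }
};

/* Return the allophone spelling for a lower-case word, or NULL if unknown. */
const char *lookup_word(const char *word)
{
    size_t i;
    for (i = 0; i < sizeof dictionary / sizeof dictionary[0]; i++) {
        if (strcmp(dictionary[i].word, word) == 0)
            return dictionary[i].allophones;
    }
    return NULL;
}

/* Convert text to an allophone string, joining words with a "pa3" pause.
   Returns the number of unknown (skipped) words, or -1 if out is too small. */
int text_to_allophones(const char *text, char *out, size_t out_size)
{
    int unknown = 0;
    size_t used = 0;

    if (out_size == 0)
        return -1;
    out[0] = '\0';

    while (*text != '\0') {
        char word[32];
        size_t len = 0;
        const char *spelling;
        int n;

        /* Skip spaces and punctuation between words. */
        while (*text != '\0' && !isalpha((unsigned char)*text))
            text++;
        if (*text == '\0')
            break;

        /* Copy the word in lower case; overlong words are truncated. */
        while (isalpha((unsigned char)*text)) {
            if (len < sizeof word - 1)
                word[len++] = (char)tolower((unsigned char)*text);
            text++;
        }
        word[len] = '\0';

        spelling = lookup_word(word);
        if (spelling == NULL) {
            unknown++;
            continue;
        }

        n = snprintf(out + used, out_size - used, "%s%s",
                     used > 0 ? " pa3 " : "", spelling);
        if (n < 0 || (size_t)n >= out_size - used) {
            out[used] = '\0';   /* undo the partial write */
            return -1;
        }
        used += (size_t)n;
    }
    return unknown;
}
```

With this table, "Hello, people!" would come out as "hh1 eh ll ow pa3 pp iy pp el", while a word like "world" is simply counted as unknown, which is exactly the gap that rule-based text-to-speech code fills.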
Anomalous
Member #3,112
January 2003
|
There is also the Microsoft Speech API... really quite versatile: text-to-speech, speech-to-text, and it integrates easily with TAPI. Good stuff.
_____________________________________________________________ |
Matt Smith
Member #783
November 2000
|
Is it versatile enough to work in Linux? |
Cheradenine Zakalwe
Member #2,046
March 2002
|
Well I'll give you ONE guess! |
Matt Smith
Member #783
November 2000
|
I bet it doesn't have stuff like THIS! These have a 1-to-1 relationship with the SP-0256 allophone set. This could plainly be improved by making the di/triphones use multiple frames, but it's a start. Unzip into the speechy dir and build with:
gcc -o mouthdemo.exe mouthdemo.c speechy.c voice_mgr.c -lalleg |
Paul Pridham
Member #250
April 2000
|
Heh, whoa... that's awesome. Can't wait to see it. Just promise you won't be pasting those luscious red lips over that hairy Matt Smith avatar. ---- |
CGamesPlay
Member #2,559
July 2002
|
MattSmith: [edit] cl mouthtest.c speechy.c voice_mgr.c /link alleg.lib -- Ryan Patterson - <http://cgamesplay.com/> |
Matt Smith
Member #783
November 2000
|
Aha, you need to either make an allophones/spo256-2/ dir and unzip matt-0256-2.zip in there, or edit mouthtest.c and change spo256-2.txt to spo256.txt, because Paul has removed my allophone set from his download. |
Paul Pridham
Member #250
April 2000
|
Err... sorry, I never added your allophone set, Matt, because they're not mine and I didn't want to make any assumptions. Should be a simple fix-up though. ---- |
|
|