Speech Synth!


Paul Pridham

Member #250

April 2000

Heh, it's cool to see that there are more of us out there.

----
<:=
The Sword of Fargoal -- Saucelifter -- www.saucelifter.com

Matt Smith

Member #783

November 2000

I've jiggered them a bit. They sound less gimpy now (I blocked the gap between my front teeth for dh1). Some of them still aren't quite right, but that can wait until I have a better editor.

matt-0256-2.zip

I'm surprised how much it sounds like me. This suggests that everyone should record their own set and get their gf to make one too.

I hacked at the speech string too. It's more intelligible with either allophone set:

  char *str="hh1 eh ll ow pa3 pp iy pp el pa3 ax tt1 pa3 ax ll ll eh pa2 gg2 rr2 ow pa3 dd2 ao pa2 tt2 pa3 ss iy pa3 ss iy pa5 pa5 "
    "dh1 ih ss pa3 ih zz pa3 dh1 ax pa3 vv oy ss pa3 ax vv pa3 yy2 or pa3 nn1 uw2 pa3 mm aa ss pa2 tt2 er2 pa5 pa5 "
    "bb2 aw pa3 dd2 aw nn1 pa3 bb2 iy ff or pa3 mm iy pa5 ae nn1 dd1 pa3 ay pa3 ww ih ll pa3 ss pa2 pp xr pa3 "
    "yy2 or pa3 ll ay vv zz pa5 pa5 dd2 iy ff ay pa3 mm iy pa5 ae nn1 dd1 pa3 mm ay pa3 rr1 ow bb2 ao pa2 tt2 pa3 "
    "ar mm iy zz pa3 ww ih ll pa3 dd2 iy ss pa2 tt2 rr2 oy pa3 yy2 uw2 pa5 pa5 "
    "hh1 ao hh1 ao hh1 eh hh1 eh hh1 eh hh1 iy hh1 ae hh1 aa hh1 aa aa hh1 ao";

Some of the "phonemes" in the SP-0256 set are definitely diphones. Is this why it's called an allophone set rather than a phoneme set?
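For anyone wanting to experiment with strings like the one above, here is a minimal sketch of splitting a speech string into individual allophone names. The helper names `next_allophone` and `count_allophones` are made up for illustration and are not part of Speechy; the only assumption taken from the thread is that names are separated by spaces.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Speechy: copies the next
   space-separated allophone name from *cursor into buf and advances
   *cursor past it. Returns 1 if a name was read, 0 at end of string. */
int next_allophone(const char **cursor, char *buf, size_t bufsize)
{
    const char *p;
    size_t n;

    assert(cursor != NULL && *cursor != NULL && buf != NULL && bufsize > 0);

    p = *cursor;
    while (*p == ' ')
        p++;                      /* skip separators */
    if (*p == '\0') {
        *cursor = p;
        return 0;
    }
    n = strcspn(p, " ");          /* length of this name */
    *cursor = p + n;
    if (n >= bufsize)
        n = bufsize - 1;          /* truncate over-long names */
    memcpy(buf, p, n);
    buf[n] = '\0';
    return 1;
}

/* Counts the allophone names in a speech string. */
int count_allophones(const char *str)
{
    char name[8];
    int count = 0;

    while (next_allophone(&str, name, sizeof name))
        count++;
    return count;
}
```

Each name returned this way could then be looked up in whichever allophone set is loaded. Pauses such as pa3 come back as ordinary names, so the caller decides how to treat them.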

Cheradenine Zakalwe

Member #2,046

March 2002

Matt: I downloaded your first set of phonemes... You manage to make "F***-Off" sound very convincing!!!

Paul Pridham

Member #250

April 2000

Well, I think they are called allophones because they aren't really true diphones, but special phoneme cases... like the front/back vowel related transitions for a particular phoneme.

By the way, I made a bit "nicer" of a front end for this, I'll post it all up later. I just need to make a few changes so that you can specify which allophone set to use.


piccolo

Member #3,163

January 2003

Come on guys, tell me how to do the link thing. Where do I upload? :'( I want to show off my code too, you know.

wow
-------------------------------
i am who you are not am i

Paul Pridham

Member #250

April 2000

Well... you have to have somewhere to upload it to! I'm using my internet account's measly 5MB of webspace to store stuff. I think you may be able to start up a page on Geocities or somewhere and put files online as well, though I've never tried those free webspace providers.


piccolo

Member #3,163

January 2003

Thanks, I'll try it. Yahoo should be good 8-)


Paul Pridham

Member #250

April 2000

OK, I've made a little front-end demo for "Speechy," and you can specify an allophone set to use from the command line. You get to type in allophones, press ENTER to play them, TAB to save the speech to "out.raw", DEL to clear the whole line, and ESC to quit. I've included the allophone sets I've made. Matt's set should work with this too: copy it into a folder under the "allophones" folder like the other sets, make a copy of spo256.txt under a different name for Matt's set, and change the name inside that copy to the directory you placed the new allophones in.

Anyway, here she blows: http://www3.sympatico.ca/ppridham/misc/sounds/speechy.zip

If you want to convert the out.raw to a WAV, load it up in Goldwave or somesuch thing and save it back as a WAV. Make sure that the sampling rate you use matches that in the associated .txt file.

One thing I have noticed is that I don't think a full diphone set is needed to make the speech intelligible. Certain "fricatives" sound pretty intelligible when mixed with the various vowel and diphthong sounds... for instance: V, DH, Z, CH, SH, S, etc. Basically, any unvoiced sounds seem to stand well alone.

Diphones would need to be made for most vowel-to-vowel and voiced consonant-to-vowel transitions, although I think that many of these could be made into specialized allophones or generic diphones, rather than a plethora of every possible diphone transition.


Matt Smith

Member #783

November 2000

I'm thinking that separating the voiced and fricative parts into separate samples would help in various ways. It would probably double the size of a phoneme/allophone set but would make a diphone set much smaller because of all the duplicates. It would also let the two parts be mixed and matched for greater variety of voices.

I'm working on my general purpose animation editor now, as that makes a good basis for a "voice tracker" too. All the allophones will need loop points so they can be synced to frame rates.
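As a rough illustration of the frame-sync arithmetic, here is a sketch of how many animation frames an allophone of a given length would span, and how long a looped allophone would have to be to end on a frame boundary. The function names are hypothetical, and the sample rate and frame rate are whatever the caller supplies; nothing here comes from the editor itself.

```c
#include <assert.h>

/* Number of animation frames needed to cover `len` samples played at
   `freq` Hz when the animation runs at `fps` frames per second.
   Rounds up so the mouth shape never ends before the sound does. */
long frames_for_allophone(long len, long freq, long fps)
{
    assert(len >= 0 && freq > 0 && fps > 0);
    /* frames = ceil(len * fps / freq), in integer arithmetic */
    return (len * fps + freq - 1) / freq;
}

/* Length in samples that a looped allophone would be stretched to so
   that it ends on a frame boundary (rounded down to a whole sample). */
long padded_length(long len, long freq, long fps)
{
    return frames_for_allophone(len, freq, fps) * freq / fps;
}
```

The difference between `padded_length` and the original length is how much of the loop section would have to be repeated to fill out the last frame.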

Cheradenine Zakalwe

Member #2,046

March 2002

Quote:

I'm working on my general purpose animation editor now, as that makes a good basis for a "voice tracker" too. All the allophones will need loop points so they can be synced to frame rates.

I noticed you using Shockwave Flash on parts of your site, Matt (i.e. the News link). Would this be useful in those sorts of situations?

Matt Smith

Member #783

November 2000

I'm so transparent

The Flash Editor is nearly good enough to use but would be a pain as you would have to manually create a key frame for each allophone and drag each one into place.

Ideally I'd like to generate FLA files from my editor for post-production in Flash, but only SWF is an open format, so I'll have to write them directly.

Thomas Fjellstrom

Member #476

June 2000

Quote:

I'm so transparent

Funny you should say that, Matt: most of your head disappears when your eyes glow.

--
Thomas Fjellstrom - [website] - [email] - [Allegro Wiki] - [Allegro TODO]
"If you can't think of a better solution, don't try to make a better solution." -- weapon_S
"The less evidence we have for what we believe is certain, the more violently we defend beliefs against those who don't agree" -- https://twitter.com/neiltyson/status/592870205409353730

Cheradenine Zakalwe

Member #2,046

March 2002

Quote:

I'm so transparent

Naah! Just a case of putting Pi and root 2 together and getting... err..

Quote:

The Flash Editor is nearly good enough to use but would be a pain as you would have to manually create a key frame for each allophone and drag each one into place.

Right! Never used shockwave myself but I get what you mean.. look forward to seeing what you come up with...

dudaskank

Member #561

July 2000

Quote:

If you want to convert the out.raw to a WAV

Why not save directly to WAV? Just change this piece of code in main2.c ^__^

if(save)
{
  PACKFILE *pfp;
  /* bytes per sample frame: bytes per sample times channel count */
  int bps = speak->bits/8 * ((speak->stereo) ? 2 : 1);
  int i, s;
  pfp = pack_fopen("out.wav", F_WRITE);
  pack_fputs("RIFF", pfp);                      /* RIFF header */
  pack_iputl(36+length2, pfp);                  /* size of RIFF chunk */
  pack_fputs("WAVE", pfp);                      /* WAV definition */
  pack_fputs("fmt ", pfp);                      /* format chunk */
  pack_iputl(16, pfp);                          /* size of format chunk */
  pack_iputw(1, pfp);                           /* PCM data */
  pack_iputw((speak->stereo) ? 2 : 1, pfp);     /* mono/stereo data */
  pack_iputl(speak->freq, pfp);                 /* sample frequency */
  pack_iputl(speak->freq*bps, pfp);             /* avg. bytes per sec */
  pack_iputw(bps, pfp);                         /* block alignment */
  pack_iputw(speak->bits, pfp);                 /* bits per sample */
  pack_fputs("data", pfp);                      /* data chunk */
  pack_iputl(length2, pfp);                     /* actual data length */
  if (speak->bits == 8) {
    pack_fwrite(speak->data, length2, pfp);     /* write the data */
  }
  else {
    for (i=0; i < (int)speak->len * ((speak->stereo) ? 2 : 1); i++) {
      s = ((signed short *)speak->data)[i];
      /* flip the top bit: Allegro keeps 16-bit samples unsigned, WAV wants signed */
      pack_iputw(s^0x8000, pfp);
    }
  }
  pack_fclose(pfp);
  save=FALSE;
}

^__^
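If you would rather convert out.raw outside of Allegro, the same 44-byte header can be written with plain stdio. This is only a sketch: `write_wav` is a made-up helper, it assumes the data is already in WAV layout (unsigned 8-bit, or signed little-endian 16-bit), and the sample rate has to be the one from the allophone set's .txt file.

```c
#include <assert.h>
#include <stdio.h>

/* RIFF stores all numbers little-endian, whatever the host byte order. */
static void put_le16(FILE *f, unsigned v)
{
    fputc((int)(v & 0xFF), f);
    fputc((int)((v >> 8) & 0xFF), f);
}

static void put_le32(FILE *f, unsigned long v)
{
    put_le16(f, (unsigned)(v & 0xFFFF));
    put_le16(f, (unsigned)((v >> 16) & 0xFFFF));
}

/* Hypothetical helper: saves nbytes of PCM data as a WAV file.
   Returns 0 on success, -1 on failure. */
int write_wav(const char *path, const unsigned char *data,
              unsigned long nbytes, unsigned long freq,
              unsigned bits, unsigned channels)
{
    unsigned block_align = bits / 8 * channels;   /* bytes per sample frame */
    FILE *f;
    int ok;

    assert(path != NULL && data != NULL);
    assert(bits == 8 || bits == 16);
    assert(channels == 1 || channels == 2);

    f = fopen(path, "wb");
    if (f == NULL)
        return -1;

    fwrite("RIFF", 1, 4, f);
    put_le32(f, 36 + nbytes);          /* size of everything after this field */
    fwrite("WAVE", 1, 4, f);
    fwrite("fmt ", 1, 4, f);
    put_le32(f, 16);                   /* size of format chunk */
    put_le16(f, 1);                    /* PCM */
    put_le16(f, channels);
    put_le32(f, freq);                 /* sample frequency */
    put_le32(f, freq * block_align);   /* bytes per second */
    put_le16(f, block_align);
    put_le16(f, bits);
    fwrite("data", 1, 4, f);
    put_le32(f, nbytes);               /* actual data length */

    ok = fwrite(data, 1, nbytes, f) == nbytes;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Writing the fields byte by byte avoids depending on struct packing or on the byte order of the machine doing the conversion.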

Toque a balada do amor inabalável, eterna love song de nós dois
Eduardo "Dudaskank"
[ Home Page (ptbr) | Blog (ptbr) | Tetris 1.1 (ptbr) | Resta Um (ptbr) | MJpgAlleg 2.3 ]

Paul Pridham

Member #250

April 2000

Go right ahead. You've got the source code.


dudaskank

Member #561

July 2000

My copy is changed ^__^

Is it possible to "translate" a real string, like hello world, into the allophones? I mean, you type hello world, and not h eh l l oh pa4....

^__^


Paul Pridham

Member #250

April 2000

Yes, there is public domain code available to convert text to speech. Some code that should be easy to adapt can be found here: http://www.wps.com/products/Story-Teller/technical/T2A/
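The simplest possible version of this is a plain word lookup table, sketched below. To be clear, this is not how the code at the link above works (I have not reproduced it here); it is a dictionary approach, and `lookup_word` and `text_to_allophones` are hypothetical names. The spellings in the table are lifted from Matt's speech string earlier in the thread, and the pa3 pause between words follows the same string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct word_entry {
    const char *word;
    const char *allophones;
};

/* Tiny illustrative dictionary; a real one would need far more words. */
static const struct word_entry dictionary[] = {
    { "hello",  "hh1 eh ll ow" },
    { "people", "pp iy pp el" },
    { "this",   "dh1 ih ss" },
    { "is",     "ih zz" },
    { "voice",  "vv oy ss" },
    { "will",   "ww ih ll" },
};

/* Returns the allophone spelling of a lowercase word, or NULL if unknown. */
const char *lookup_word(const char *word)
{
    size_t i;

    assert(word != NULL);
    for (i = 0; i < sizeof dictionary / sizeof dictionary[0]; i++)
        if (strcmp(dictionary[i].word, word) == 0)
            return dictionary[i].allophones;
    return NULL;
}

/* Translates lowercase, space-separated text into out, following each
   known word with a pa3 pause. Returns the number of unknown words that
   were skipped, or -1 if out is too small. */
int text_to_allophones(const char *text, char *out, size_t outsize)
{
    int unknown = 0;
    size_t used = 0;

    assert(text != NULL && out != NULL && outsize > 0);
    out[0] = '\0';
    while (*text != '\0') {
        char word[32];
        const char *allo = NULL;
        size_t n = strcspn(text, " ");

        if (n == 0) {                 /* skip a separator */
            text++;
            continue;
        }
        if (n < sizeof word) {        /* longer words cannot be in the table */
            memcpy(word, text, n);
            word[n] = '\0';
            allo = lookup_word(word);
        }
        text += n;
        if (allo == NULL) {
            unknown++;
            continue;
        }
        /* need room for the allophones, " pa3 " and the terminator */
        if (used + strlen(allo) + 5 >= outsize)
            return -1;
        strcpy(out + used, allo);
        used += strlen(allo);
        strcpy(out + used, " pa3 ");
        used += 5;
    }
    return unknown;
}
```

A table like this falls over on any word it has not seen, which is exactly why letter-to-sound rules are the usual next step.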


Anomalous

Member #3,112

January 2003

There is also the Microsoft Speech API... really quite versatile: text-to-speech, speech-to-text, and it integrates easily with TAPI. Good stuff.

_____________________________________________________________
(EDIT - spelling/grammar/presentation/emoticons/content)

Matt Smith

Member #783

November 2000

Is it versatile enough to work in Linux? ::)

Cheradenine Zakalwe

Member #2,046

March 2002

Well I'll give you ONE guess!

Matt Smith

Member #783

November 2000

I bet it doesn't have stuff like THIS!

These have a 1-to-1 relationship with the SP-0256 allophone set. This could plainly be improved by making the di/triphones use multiple frames, but it's a start.

http://www.the-good-stuff.freeserve.co.uk/allegro/speech/mouths1-0256.png

download the demo

Unzip into speechy dir

gcc -o mouthdemo.exe mouthdemo.c speechy.c voice_mgr.c -lalleg

Paul Pridham

Member #250

April 2000

Heh, whoa... that's awesome. Can't wait to see it. Just promise you won't be pasting those luscious red lips over that hairy Matt Smith avatar.


CGamesPlay

Member #2,559

July 2002

MattSmith:
The demo is missing allophones/spo256-2.txt

[edit]
I copied the original spo256.txt file, but I get some kinda error when I try to kill the program.

cl mouthtest.c speechy.c voice_mgr.c /link alleg.lib

--
Tomasu: Every time you read this: hugging!

Ryan Patterson - <http://cgamesplay.com/>

Matt Smith

Member #783

November 2000

aha, you need to either

make an allophones/spo256-2/ dir and unzip matt-0256-2.zip in there

or edit mouthtest.c and change spo256-2.txt to spo256.txt

because Paul has removed my allophone set from his download.

Paul Pridham

Member #250

April 2000

Err... sorry, I never added your allophone set Matt, because they're not mine and didn't want to make any assumptions. Should be a simple fix-up though.
