[asterisk-users] cepstral vs festival (MRCP)

John Todd jtodd at digium.com
Wed Dec 3 10:03:49 CST 2008


On Dec 2, 2008, at 6:55 PM, Erik (Caneris) wrote:

>>
>> Erik -
>>   Have you found RealSpeak to be worth the cost?
>>
> Actually my last note was probably a bit misleading because in the  
> particular cases I mentioned RealSpeak, the platform wasn't Asterisk  
> and Cepstral wasn't even on the radar.
>
>> Can Cepstral, with
>> the hourly $ spent on tuning, be made to be a reasonable substitute?
> Nuance would say "no" :)

Of course, and perhaps they're right in some circumstances.   But I  
don't think I'd be able to predict in what percentage of cases that's  
true.

> I'd say "maybe". Call up +14164854854, it's a recent project we did  
> for a client using Asterisk, Cepstral, and a lot of custom code.  
> It's a free phone-in service that allows folks to get local traffic,  
> weather, news, commuter transit, border crossing wait times, and  
> more. There's obviously quite a bit of domain-specific, dynamic,  
> constantly changing text, so this is certainly an example of pushing  
> it to the max. Just think of all the street names it has the  
> potential to mispronounce.
> It's a work in progress, but it's very promising. Definitely an  
> example of a lot of "hourly $ spent on tuning" as you put it.

Sounds decent.  Some inter-word delays might be in order, but I'm sure  
that's how you're earning your keep.

>> My results: The RealSpeak sample was more clear than the Cepstral.
> Depends on what you mean by "more clear". As Brent Davidson
> mentions, make sure you're comparing 8 kHz to 8 kHz, or similar. If
> you mean it pronounces things better, then I agree.

Of course, my test was hardly scientific.  But I re-tested at 8 kHz for
both voices, and neither I nor someone else in the room (a
"non-expert") was overwhelmed by the quality difference between the
two voices.  Totally subjective, but an apples-to-apples comparison.

>> That being said, I'd really be interested in hearing if anyone has
>> done a RealSpeak-to-Asterisk conduit.  I wasn't able to quickly
>> uncover how they interact with third-party systems - is it VoIP?  A C
>> library?  Some sort of HTTP socket?  The more methods we can get
>> working with Asterisk, the better, because not every implementation
>> of a voice system has the same requirements...
>>
> MRCP is the standard for interfacing with ASR and TTS engines  
> (including RealSpeak) in other platforms. Brief Googling reveals a  
> previous flame war on asterisk-dev regarding MRCP. I have no idea if  
> it's implemented in Asterisk now.


No, it is not currently implemented.  Note, though, that someone in  
another post mentioned that they had built an app_realspeak, and I'll  
try to follow up with that.

However, that doesn't mean that it shouldn't be implemented.  This is
an area in which I think there is a disproportionate amount of
"non-discussion", since many people who would use or be interested in
MRCP simply don't participate in the Asterisk project because it
doesn't meet their needs out of the gate.  Therefore, we see few
people asking for it, in a self-fulfilling loop.

Is MRCP something that is significantly lacking in Asterisk?  Is it a  
difficult protocol to implement?  Is there anyone here on -dev with  
the experience to do it?
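
To give a rough sense of the wire format, here is a minimal Python sketch
that frames a single MRCPv2 SPEAK request. It assumes MRCPv2 (an IETF
draft at the time of this thread, later published as RFC 6787) rather
than the RTSP-based MRCPv1 of RFC 4463. The function name, the channel
identifier, and the SSML text are made up for illustration, and the
sketch leaves out the SIP/SDP session setup and the TCP control channel
that a real client would need before it could send anything.

```python
def build_speak_request(request_id: int, channel_id: str, ssml: str) -> bytes:
    """Frame one MRCPv2 SPEAK request (hypothetical helper, not a real client)."""
    body = ssml.encode("utf-8")
    headers = (
        f"Channel-Identifier: {channel_id}\r\n"
        "Content-Type: application/ssml+xml\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("ascii")

    # The message-length field in the start line counts the whole message,
    # including the start line that carries it, so iterate until it is stable.
    length = 0
    while True:
        start = f"MRCP/2.0 {length} SPEAK {request_id}\r\n".encode("ascii")
        total = len(start) + len(headers) + len(body)
        if total == length:
            return start + headers + body
        length = total


if __name__ == "__main__":
    # Placeholder channel identifier; a real one is assigned by the server
    # during session setup.
    msg = build_speak_request(1, "32AECB23433802@speechsynth",
                              "<speak>Traffic is light on Main Street.</speak>")
    print(msg.decode("utf-8"))
```

Framing a request is the easy part. Most of the work in a real
implementation would likely be in the session handling and media
plumbing around it.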

JT

---
John Todd
jtodd at digium.com        +1-256-428-6083
Asterisk Open Source Community Director
