Ask Slashdot: Linux and Telephony 153

This one is a doozy. I've received various submissions from people who were looking for information on how to make their Linux box into an answering machine. I've also received submissions asking about Voice Synthesis and Speech-To-Text. I have to admit I haven't found much information on either while browsing on the net, so I'm turning the question over to you folks. However, I wonder if there isn't an issue hidden here? Can Linux be used as an Interactive Voice Response (IVR) platform? If not, why not?
First off, let's NOT forget the actual questions:

Metiu and Sri both want to know if a Linux box with a voice modem can be used as an answering machine.

Gextyr is looking for information on Voice Synthesis packages that are available for Linux.

This Clan AC Member wants to know if there are any applications or APIs for Linux that deal with Speech-To-Text or Text-To-Speech.

Lastly, there have been quite a few submissions asking whether or not Linux can be used as a demand fax server. Can it?

If Linux can be used for all of the things above, what's stopping it from performing as an IVR system? IVR systems are simply systems designed to use a telephone as the computer interface (using both touch tones and voice). IVR systems are used everywhere, from your voice mail, to ordering systems, and corporations are adopting more and more IVR systems for various tasks.

I've seen IVR implemented on DOS systems but most of these have moved to NT. What's preventing Linux from operating in this market? Are there existing IVR projects in progress, or is this another area where Linux falls behind?

  • by Anonymous Coward
    I set up my Linux box as my answering machine several months ago, and it works just fine. If you're using redhat the tools are already there. Just edit voice.conf in /usr/lib/mgetty+vgetty. You'll also need to start the vgetty process on that com port. I have it set to receive faxes and voice. You can also have it do data connections. I didn't need this since I have a cable modem. I also set up a script to convert incoming messages to wav format, and put them on a password protected web page, so I can check my voice messages from anywhere on the net.

    If you want a full voice mail system, search for mvm, which is a bunch of scripts that sit on top of vgetty.

    I use a USR Sportster Voice, and the sound quality is not great (8-bit, 8 kHz). Check the vgetty FAQ for better quality modems.

    Linux, it's my router, it's my firewall, it's my web server, it's my development platform, it's my answering machine, it's the brain of my recording studio.

    It doesn't make my coffee, yet...
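The message-conversion step this comment describes (turning vgetty recordings into WAV files for a web page) could look roughly like the sketch below. It assumes the `rmdtopvf` and `pvftowav` filters from the pvftools that ship with mgetty+vgetty are on the PATH and that both read stdin and write stdout; the `.rmd` suffix and the directory names in the usage note are placeholders, not taken from the post.

```python
import subprocess
from pathlib import Path


def wav_path(rmd_path, wav_dir):
    """Name the WAV file after the recording, inside the web directory."""
    return Path(wav_dir) / (Path(rmd_path).stem + ".wav")


def convert(rmd_path, wav_dir):
    """Run `rmdtopvf < recording | pvftowav > file.wav` for one message."""
    wav = wav_path(rmd_path, wav_dir)
    with open(rmd_path, "rb") as src, open(wav, "wb") as dst:
        to_pvf = subprocess.Popen(["rmdtopvf"], stdin=src,
                                  stdout=subprocess.PIPE)
        to_wav = subprocess.Popen(["pvftowav"], stdin=to_pvf.stdout,
                                  stdout=dst)
        to_pvf.stdout.close()  # so rmdtopvf notices if pvftowav exits early
        wav_status = to_wav.wait()
        pvf_status = to_pvf.wait()
    if wav_status != 0 or pvf_status != 0:
        raise RuntimeError(f"conversion failed for {rmd_path}")
    return wav
```

Something like `convert("/var/spool/voice/incoming/v-1.rmd", "/srv/www/voicemail")` would then be run for each new message; protecting the resulting directory with a password is left to the web server configuration.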
  • It's very easy to implement an answering machine
    using ISDN and vbox. Vbox can be found at
    http://www.mayn.de/michael/vbox/
    The thing can be configured to listen to arbitrary
    numbers and respond automatically after a certain
    number of rings.
  • There is `Festival' which does turn text into pretty normal sounding voice. You can plug in different voices. And there is `say' (part of `rsynth') which turns your multi-kilo-buck PC into a Speak-and-Spell. Don't have URLs but both are Debian packages.

    Browsing Debian's packages also turns up a few others, so this, like all Linux solutions, begins with: ``well, first install Debian...''.

    -Brett.

  • If you are looking to get the same results you could use xringd. It runs commands based on incoming ring patterns to your modem. xringd [unc.edu]
  • There's been a Mini-HOWTO on the subject for quite some time... shame on you for not reading your documentation!

    http://metalab.unc.edu/LDP/HOWTO/mini/Coffee.html
  • Posted by hersh:

    I heard recently that some people at CMU will be porting the Sphinx 2 speech recognition system (developed there) to Linux. Not sure about licensing though.
  • Posted by Aven:

    I had been wanting to set up a house answering machine with a WWW gateway for some time. I started to write something myself, but found that there are a few utilities out there that do the job for you. I haven't played with mvm too much, but it looks promising. vgetty seems to work pretty well despite having to write the frontend in shell scripts.

    http://alpha.greenie.net/mgetty/
    http://www-internal.alphanet.ch/~schaefer/mvm/

    You'll also need rsynth or some type of text-to-speech package for mvm to work. Good luck.
  • Posted by RichDrewes:

    I have hacked together an IVR/answering machine that uses an ordinary modem (not Zyxel type voice modem) and a SoundBlaster type soundcard in Linux. This requires construction of a simple circuit ($5 in parts from Radio Shack) to interface the sound card to the phone line, and a bit of software. I have coded a fast fourier transform DTMF (touchtone) recognizer and I use an 'expect' script for the call flow. If there is sufficient interest I can make a web page with a circuit description and post the code.
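For readers wondering what a software touch-tone recognizer involves, here is a minimal sketch. It is not the poster's code: it swaps the FFT he mentions for the Goertzel recurrence, which measures power at just the eight DTMF frequencies. The 8000 Hz sample rate and 50 ms window are assumptions, and there is no silence or noise rejection, so it always returns some key.

```python
import math

LOW = (697, 770, 852, 941)        # row frequencies in Hz
HIGH = (1209, 1336, 1477, 1633)   # column frequencies in Hz
KEYS = ("123A", "456B", "789C", "*0#D")


def goertzel_power(samples, freq, rate):
    """Signal power near `freq`, computed with the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / rate)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def detect_key(samples, rate=8000):
    """Pick the row and column tones that carry the most power."""
    row = max(range(4), key=lambda i: goertzel_power(samples, LOW[i], rate))
    col = max(range(4), key=lambda i: goertzel_power(samples, HIGH[i], rate))
    return KEYS[row][col]


def tone(key, rate=8000, seconds=0.05):
    """Synthesize a clean two-tone burst for `key` (test helper only)."""
    for r, line in enumerate(KEYS):
        if key in line:
            f1, f2 = LOW[r], HIGH[line.index(key)]
    count = int(rate * seconds)
    return [math.sin(2 * math.pi * f1 * t / rate) +
            math.sin(2 * math.pi * f2 * t / rate) for t in range(count)]
```

Feeding `detect_key` successive 50 ms windows from the sound card, and only accepting a key once it repeats across a couple of windows, is one plausible way to hook this into a call flow script.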
  • Posted by smich:

    Wasn't there an article in LJ a year or two ago about a setup like this? Maybe the guy was just using Linux to control his phone system, but I remember something about voice synthesis.

    If you knew my memory...
  • I'm amazed that so many people here say they were able to get vgetty working; you guys must all lead charmed lives, because I spent months fighting with it, and couldn't ever get it to correctly answer the phone and record a message twice in a row.

    I settled for having my machine simply use the modem to listen to the caller ID info, and pop up a big old dialog box telling me who's calling, then let my real answering machine take the call.

    Features:

    • I can read the dialog box from across the room;
    • It also checks for the incoming number in my address book ( BBDB [jwz.org]);
    • The phone doesn't ring if it's late at night or early in the morning and the screen saver [jwz.org] is active;
    • It securely logs calls to my web server, so if I'm at another site, I get asynchronous notification that someone has called me at home, and I know to call in and check my messages!

    Works pretty good. Get the code and read all about it [jwz.org].
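The caller ID part of this setup can be sketched as follows. This is not the linked code. It assumes the modem reports caller ID between rings as `KEY = value` lines (`DATE`, `TIME`, `NMBR`, `NAME`), which is common but varies by modem, and it uses a plain dictionary as a stand-in for the address book. The quiet-hours window is likewise made up for illustration.

```python
import re
from datetime import time

FIELD = re.compile(r"^\s*(DATE|TIME|NMBR|NAME)\s*=\s*(.*?)\s*$")


def parse_caller_id(lines):
    """Collect the KEY = value fields the modem prints between rings."""
    info = {}
    for line in lines:
        match = FIELD.match(line)
        if match:
            info[match.group(1)] = match.group(2)
    return info


def describe(info, address_book):
    """Prefer the address book name, then the network name, then a default."""
    number = info.get("NMBR", "unknown number")
    name = address_book.get(number) or info.get("NAME") or "unknown caller"
    return f"{name} ({number})"


def is_quiet_hour(now, start=time(23, 0), end=time(8, 0)):
    """True late at night or early in the morning (window wraps midnight)."""
    return now >= start or now < end
```

The string from `describe` is what would go into the big dialog box, and `is_quiet_hour` is the sort of check that could decide whether to stay silent.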

  • Why bother using a computer as an answering machine at all? Just buy a $30.00 digital answering machine instead. Spend a little more ($40-50) and you'll get multiple mailboxes on some of these things along with caller ID...
  • You're right. Who really needs TAPI? Just buy a $30.00 digital answering machine and free up your computer....
  • by robin ( 1321 )
    Look into xringd. This can recognise complex sequences of rings.
    --
    W.A.S.T.E.
  • http://www.linuxtelephony.org/ [linuxtelephony.org] and
    http://www.opentelecom.org/ [opentelecom.org]

    Neither of them answers *my* basic question, which is how to add touch tone response to a web based application I'm working on.
  • I'm writing an application where people enter stuff either on the web or through a touch tone phone. The connection is through a database that either application can update.
  • What I need is something that can put up an audio menu along the lines of "press 1 for foo, 2 for bar, 3 for qux", and call the appropriate C/perl functions or shell scripts and put up another audio menu. Is this what they mean by IVR? I don't need to recognize verbal or non-touch-tone responses, just touch-tone.
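That is essentially a touch-tone-only IVR, and the control flow is small. Below is a hedged sketch of the menu loop alone. The `play` and `read_digit` callables are hypothetical hooks for whatever actually speaks a prompt and decodes a keypress (a voice modem, a telephony card, and so on); each menu action returns the next menu, or `None` to end the call.

```python
def make_menu(prompt, choices):
    """choices maps a digit to (label, action); action returns a menu or None."""
    return {"prompt": prompt, "choices": choices}


def run_menu(menu, read_digit, play):
    """Play prompts and dispatch on digits until an action returns None."""
    while menu is not None:
        play(menu["prompt"])
        digit = read_digit()
        if digit is None:          # caller hung up or timed out
            return
        choice = menu["choices"].get(digit)
        if choice is None:
            play("Sorry, that is not a valid option.")
            continue               # replay the same menu
        _label, action = choice
        menu = action()
```

An action can be any callable, so a C or Perl program or a shell script can be wrapped with `subprocess.run` inside it, and the web side can share state with the phone side through the database as described.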
  • by Hans ( 1533 )
    The Voxilla Project (http://www.voxilla.org) is working on some stuff, but the web pages are a bit outdated unfortunately...
  • Really? That's good news. The last time I checked, Dialogic was refusing to support Linux at all. I used to work for West Interactive, which is probably the largest VRU company around. They had several hundred VRUs with a couple of single T1 Dialogic boards apiece, and they've only gotten larger since I left. They were using SCO on all of their systems, and their own in-house software to drive the calls. At one point I was looking into Linux support for the Dialogic cards because I was toying around with the idea of setting up a few VRUs of my own and going into business dealing with some of the smaller customers that West didn't like to handle because of their size. SCO would have been all right, but I saw Linux as an advantage, particularly for a small company: being open and easily customizable, it would make it easier to have the kind of flexibility that you often need as the smaller company. But Dialogic had no Linux support at all at the time. Ended up deciding against the whole idea anyway, but it's nice to know that Dialogic came around.
  • Well, SOMETHING that I know a bit about...

    First off.. IVR CANNOT be handled by a Voice Modem. Now before you startup the flamethrowers give me a sec... I am dealing with MINIMUM 1 incoming T1, and we have a couple boxes with 8-10 T's... That is up to 230 incoming calls at once, and not one line at a time.

    The current KING of the hill (in hardware) is DIALOGIC. BUT Dialogic is DEEPLY in bed with Microsoft. People have been pressuring Dialogic for about 6 years to come out with drivers for Linux, but nothing yet. There have been a couple of times when rumors were flying about on the Dialogic mailing list (the last one was that something was supposed to happen March 25th), but again NOTHING!

    There are a couple companies that seem to be embracing Linux... Pika (Ya! Canadian!), Acculab, and NMS. I suspect that Rhetorex will be the next one to throw their hat into the ring, and the only one missing is Dialogic.

    Now for the next important thing... IVR Software. Currently there is nothing out there to handle the IVR back end. BUT there are ample open source languages that could be extended. Database support used to be an issue; now, with pretty much every major database on Linux, that is not a problem anymore.

    Once you have your basic "Interface" card working, there are a whole bunch of other cards you can add in to get the extended support that you want. SCSA is pretty much the standard for connecting cards. Once you have the card, you need the firmware.. Hopefully they will port it over!

    Anyway, enough ranting... summary:

    ACCULAB, PIKA, and NMS have support for Linux. Dialogic needs to be beaten over the head with a wet fish and have some sense beat into them!
  • now, now. there are drivers for the pika line of telephony cards (http://www.pika.ca) that support fax (without artifacts like modems), voice, switching, tone detection, caller id, etc. You can find a Linux based phone system that can be used with these cards on http://www.tycho.com. It's called ACS.

    You can also use the AT+V command set with any voice modem card. I have the chase PCI-RAS4 at home running under linux and it works quite well.

    The reason I haven't had time to work on voxilla is that I have been working at VA Research for the last 6 months and they keep people busy. I'll update the site tonight with a bunch of links to the developments that have been going on all over the place that almost no-one has asked about.
  • Pika, Aculab, and NMS all have cards that support switching...but they are also all in early beta as far as driver development goes. The best you can do about switching at the moment is to lobby pika and aculab to open source their drivers or work with nms on porting their open source driver to one of their cards that does support switching.
  • I wrote some audio widget software that
    uses Festival for text-to-speech and some
    custom hardware for DTMF decoding. It is
    in Perl and handles menus quite nicely. Right
    now I have not released it, only because I
    wanted to make it work with a voice modem
    instead of with my sound card and custom
    hardware, but I do not have a voice modem,
    so it hasn't happened. I would gladly give it
    out under the GPL if anyone was interested.
  • AT&T (was Olivetti) Research Labs in Cambridge used news broadcasts with simultaneous captions to train their voice mail search program. ISTR they found it the cheapest way of getting both text and speech.
  • Wish I'd thought of that sooner. You know what the freqs are?

    /dev
  • You ever used the UPS tracking number system over the phone? It's really really good. You can rattle off the number in a natural voice and as fast as you like and it gets it damn near every time. Granted it's going character by character, but I think it demonstrates what you can do with the BW that the phone allows.

    BTW, the IBM ViaVoice for Linux beta SDK is out for free. I think they said it ships with RH6.0 on the app disc. I downloaded it from IBM, but at 40 MB it's quite a hit for a modem.

    /dev
  • Where do you buy the hardware and how much does it cost?
  • by vinn ( 4370 )
    A while ago I needed to do some crazy stuff like this. I put some pieces together that actually worked fairly well.

    I used a Multitech 5600 ZDXV voice modem. I
    highly recommend Multitech's if for no reason
    other than their no-hassle/no-RMA return policy
    and 10 year warranty.

    Then I used vgetty to handle answering the phone,
    recording stuff, and decoding DTMF tones.

    It all works, make sure you have ALL the vgetty
    patches. Maybe join the vgetty mail list. If
    you don't like hacking scripts and crap together
    you won't enjoy setting this up. vgetty is
    HIGHLY undocumented.
  • It doesn't make my coffee, yet...

    Well, slacker, get cracking! You could probably wire a soft power switch to a serial port or something and use your carrier to indicate whether the switch should be on or off...

    0 8 * * 1-5 /home/me/coffeemaker -1
    0 10 * * 1-5 /home/me/coffeemaker -0


    Or go for the whole thing and wire your sockets with X10..
  • Linux has proved itself to be a very stable OS. That's the main reason I have chosen it as my primary OS. At work, however, we develop our telephony software for Windows NT. This is because major players in the IVR hardware industry have chosen to ignore Linux as a possible platform for their buyers' systems. I would _love_ to be developing for Linux and getting paid, but Dialogic (probably the biggest telephony hardware producer) will not write drivers for Linux nor will they release hardware specs. Not even under an NDA. This has kept Linux out of a market where it has much potential. IVR hotline systems are very commonly mission-critical machines and, as most of you are aware, NT isn't the most stable platform in existence. I have had to develop complicated startup and shutdown processes for our machines in the field so that they may babysit themselves. We also require pcAnywhere to be installed on the systems in order to fix them when NT decides to get ugly. VNC would fill that requirement on a Linux box. Now all we need is the help of some complacent hardware companies and the Linux community will be on its way!

    (Somehow, I'm not too hopeful. Let's see the slashdot effect in action...)
  • That is cool. Yeah!
  • At least Dialogic has started to develop their CTI products for SCO Unixware - can Linux be far behind?
  • You can get a box which will emit those three whiny, annoying tones that tell a dialling computer that the phone line doesn't exist whenever you pick up the phone. The systems then take you off of their list.


    Orcslicer
  • The URL should have read http://www.cis.ohio-state.edu/hypertext/faq/usenet/fax-faq/mgetty+sendfax+vgetty/faq.html
  • There also is a nice graphical frontend for vgetty/sendfax installations named PalMail. It hasn't been developed for ages but it's pretty configurable and it worked for me for years.


    Fionn

  • Using Dialogic's D/41E, Linux can use the (slightly modified) Dialogic APIs to create a fully functional IVR/phone system... While the boards are not cheap, they do offer performance...
  • I've asked about this before. I wouldn't mind donating some cycles to the project. I have a contract that would work very well with Speech input. Mostly I need a system that will record/recognize stuff for data capture in real time with offline verification/correction. Basically a voice dump/sink. I have also heard that Dictaphone was supporting an offline/after the fact server for speech to text.

    Will just wants to yell at his robots 8^)
  • Are there notes made available from these meetings? I have a class in central mass tonight that I need to go to. Is there additional info available on BVU?
  • I was thinking of putting it out on the Net, but due to creeping featurism the code is embarrassing. :) (That and I find little time to fix things that aren't quite broken)


    Isn't that the point of the open source initiative? To release the code so others can have a crack at it, to iron out all these bugs? I say 'release the code!'
  • by dario ( 9486 )
    You could avoid DTMF by counting "RING"s from
    the modem. A script which starts ppp when it
    notices, e.g. three rings, pause, four rings,
    would do nicely. That way you can even call from
    another country and not spend a penny for the
    phone call (assuming that the modem will see as
    many "RING"s as you hear on your end).

    --d
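The ring-counting idea can be sketched as below. It assumes you have already timestamped each "RING" line from the modem (in seconds) and that rings within one call arrive at most about 8 seconds apart; both the gap and the 3-then-4 pattern are illustrative values. xringd, mentioned elsewhere in this discussion, is an existing tool for this job.

```python
def ring_groups(timestamps, max_gap=8.0):
    """Split RING timestamps into bursts; a longer gap starts a new burst."""
    groups = []
    previous = None
    for stamp in timestamps:
        if previous is None or stamp - previous > max_gap:
            groups.append(0)
        groups[-1] += 1
        previous = stamp
    return groups


def matches(timestamps, pattern=(3, 4), max_gap=8.0):
    """True when the bursts are exactly `pattern`, e.g. 3 rings then 4."""
    return tuple(ring_groups(timestamps, max_gap)) == tuple(pattern)
```

A watcher script would call `matches` on the recent ring times and start ppp on success. As the comment notes, this relies on the modem seeing as many rings as the caller hears, which is not guaranteed.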
  • Even when my questions get answered, they don't get answered. A while back, I asked "Ask Slashdot" about a similar application, but all I wanted it to do was to read DTMF codes and complete a set of commands based on those codes.

    Example:
    I call up my linux box (which I can't keep dialed in 24-7)
    It picks up and plays some random audio file, piped into the phone line.
    "Press 1 to start ppp interface"
    I press 1.
    It hangs up. I do the same.
    I wait a minute, and then ssh into my machine, thanks to the wonders of a dynamic DNS service and a remotely mailed ifconfig dump.

    If this is possible, I'd love to know how.

    CK
  • I've been using mgetty+sendfax/vgetty for about three years. There has been a flurry of new code recently; the new version of vgetty is highly scriptable, and I haven't even scratched its potential. Look up:

    http://alpha.greenie.net/mgetty/index.html
    http://www.leo.org/~doering/mgetty/index.html

    I've been using 1.1.20, with the ZyXel 1496E+ .
    I use it for voicemail, incoming logins (i.e. data) and faxes (in and out). Since I work at two locations connected at T1 speed, I like being able to hear my voice messages at each place. And I can also dial it up like a standard answering machine to hear my messages (that I have to re-configure).
    I have a number of users who dial in to read their e-mail and do some (slow...) surfing; ppp works
    well. The current vgetty appears to be quite stable, and doesn't slip into recording white noise instead of voice as much. My PII-350 box runs Slackware 3.6.

    What I would REALLY like to see is a voice-mail system which can connect a voice channel over the LAN to another CPU selected according to the tones punched in by a caller. That sort of thing has been developed for MS ...
  • For those of you looking for a commercial application try: http://www.entropic.com/ they specialize in EXACTLY what you're looking for, not only speech recognition software but for TELEPHONY systems!

    A nice Open Source option is EARS, one of the few, and probably one of the first, UNIX speech -> text applications, although I personally have had some difficulty compiling it under Linux. http://www.tmt.de/~stephan/ears.html


    Also, there are many output programs. The best one available is called festival and can be found in above comments and on freshmeat, but it is somewhat bloated. Alternatives may be found in the "say" program, which is distributed with "GxEdit" (find it on freshmeat as well), Emacs text->speech: http://www.cs.vassar.edu/mirror/emacspeak/emacspeak.html, and DECtalk at: http://www.ultranet.com/~rongemma/indext.htm


    If none of these are suited for your needs try:
    http://www.bright.net/~dlphilp/linux_soundapps.html#speech

    The Linux sound/midi page is a valuable tool ! :)

  • FYI, there's a mailing list about
    speech recognition on Linux. The
    home page for the list is
    http://leb.net/ddlinux/

    From the "Current Status":

    Discussion on ddlinux is currently on hiatus while we wait for various open-source speech recognition engines to mature (a process which is likely to
    result in something useful to us in mid-1999). Ddlinux therefore operates as an announcement list, so traffic is very low.

    They have a few useful links and discussions.

    Emre |=^)
  • vgetty can do this. Just add to your voice.conf:

    dtmf_program /usr/local/bin/dtmf.sh

    /usr/local/bin/dtmf.sh will get called with all the numbers you typed.
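What such a handler does with the digits is up to you. The sketch below shows one shape it could take, written in Python for illustration even though the comment uses a shell script. It assumes the digits arrive as the first command-line argument (check the vgetty documentation for the exact calling convention), and the action table is hypothetical: `provider` is a placeholder pppd peer name.

```python
import subprocess

# Hypothetical mapping from dialled digits to commands to run.
ACTIONS = {
    "1": ["pppd", "call", "provider"],  # "provider" is a placeholder peer
}


def command_for(digits):
    """Return the argv for the digits the caller typed, or None."""
    return ACTIONS.get(digits.strip())


def main(argv):
    """Exit status: 0 started a command, 1 unknown digits, 2 no digits."""
    if len(argv) < 2:
        return 2
    command = command_for(argv[1])
    if command is None:
        return 1
    # Start in the background so the handler returns and the line can drop.
    subprocess.Popen(command)
    return 0
```

In a real script the last line would be `sys.exit(main(sys.argv))`. This covers the "press 1 to start ppp" scenario asked about earlier in the discussion.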
  • A group of medium to big players in the telecommunications business have released some of their software with open source licensing. You still have to buy the hardware though. NMS (Natural MicroSystems) has very recently released software and drivers for their hardware for Linux. For more information please check http://www.opentelecom.org/

  • Anybody know what's going on with speech recognition? There are now several good, cheap speech rec packages in the commercial shrink-wrap world (at least they claim to be good, I haven't tried any of them myself) in the sub-$100 range.

    Last I heard, the only open-source thing of this sort was something called 'ears'. I've never heard of anybody actually using it, which suggests it might perform underwhelmingly (tho, again, I've never tried it myself so this is just a guess).

    A good OSS speech-rec program would have all kinds of uses. It would be a no-brainer for the wearable-computer people. It would be great for any PDA with a microphone. Maybe speech recognition could ease some peoples' fears of the command line, or avoid wrist injuries.

    There is some relevant stuff in the FAQ for comp.speech. A web search for "speech recognition", "phoneme", and "hidden markov model" turned up a lot of interesting hits.

  • I've had pretty good luck with Festival for Text to Speech. I think I had to hack an include file here and there to get it to compile on SuSE, but I was able to get it to compile nonetheless. I have it read a fortune from my fortune file whenever I open a shell or log in. It sounds a little like an ultra-advanced Speak and Spell with a slight Scottish lilt. Loads of fun.

    One of these days I'm going to see if I can get it to work from my Netwinder.
  • http://www.gel.usherb.ca/grpetudiants/speechi/
  • Isn't stepping on big toes what Linux is all about?
  • Not that this answers any of the questions, but we recently looked at adding more storage to our voicemail system. The vendor wanted $6,000 to add what was essentially a 500 MB hard disk.

    We then looked at a PC-based voicemail system that would integrate with our LAN mail package, and the stumbling block there wasn't storage as disk space is so cheap when you're buying computer disks, but all the proprietary interface cards for interconnecting our Nortel switch. We would have needed something like $5k worth of boards (all ISA, no PCI available) to do the interfacing.

    I'm guessing this is going to be a stumbling block for any linux-based system. Most PBX systems are pretty proprietary and so are the interfaces, which means that not only are you single-sourcing interfaces but you're paying a fortune for them as well.

    I don't know how voice modems get the voice part into the computer -- through the serial port or through the sound card -- but if it's through the serial port, you could probably do a standalone voicemail-only system with a multiport serial card and a bunch of voice modems.

    You'd still be screwed when it came to interfacing with PBXs, though. You could do the voicemail, but there would be no access to the PBX itself, which means you could only leave messages. No rollover to operators, calling back into the PBX, etc.

    Most organizations that have PBXs have also sunk a ton of money into them, so convincing people that the low-cost linux voicemail system would be a good investment when they'd throw out PBX features might be a tough sell.


  • Take a look at The MBROLA Project [fpms.ac.be].. I played around with text-to-speech a few months ago and this was one of, if not the best I found. "Freely Available multilingual synthesizer!"
  • Yes, please do so. I would love more info on how you did this.
  • If you can put together a reliable Fax on Demand system under linux for less than $3000 up front and less than $1200 a year in support contracts then you have a viable chance to start your own company. And I'm talking about SOFTWARE prices only; this excludes the $4000 to $15,000 in hardware costs!!

    Reliable is defined as your software crashes less often than Windows-NT.
  • I played with Abbott Demo a while back. It was only a technology demonstration. It was more amusing than useful. About 20% errors with no speech training.
  • by Jose ( 15075 )
    Can anyone point me to a FAQ or a HOWTO on how to set up my box to act as an ISP? What I want to do is to be able to dial in to home, and connect to the net from my box. I have an ethernet connection to the net.
  • I worked for Dialogic HQ a few months back as a co-op. The idea of Linux drivers was always being thrown about the table. There is a small group of people currently working on porting Dialogic's current UNIX drivers over to Linux. I have no idea of the status of this project. (I left while it was still a twinkle in Howard Bubb's eye.) I know, however, that some people have gotten Dialogic cards (mostly the low-end ones: D41E, D41D, etc.) up and running. You may want to toss the question up onto the discussion board on the support web site. http://support.dialogic.com

  • Just for the record, Dialogic was not "gobbled" up by MS. They were commissioned by them to write the new TAPI API.

    -Yuk
  • Dialogic has had several UNIX drivers for quite some time now. Unfortunately they seem to be dragging their feet when it comes to Linux.

    -Yuk
  • You might want to start with the comp.speech FAQ:

    http://www.speech.cs.cmu.edu/comp.speech/ [cmu.edu]

    In particular, take a look at:

    http://www.speech.cs.cmu.edu/comp.speech/Section5/Q5.5.html [cmu.edu]


    Two speech synthesis programs I have played with are:

    rsynth: ftp://svr-ftp.eng.cam.ac.uk/pub/comp.speech/synthesis/ [cam.ac.uk]

    Festival: http://www.cstr.ed.ac.uk/projects/festival.html [ed.ac.uk]
  • One can find the tarball "rsynth-2.0.tgz" at the Metalab Linux archives in the directory apps/sound/speech. It "tries" to speak through your existing sound card. I say "tries" because I'm getting buffer overruns that the program isn't accounting for. Those with the Cheapbytes Linux Archive CD set for Winter 1999, the tarball is on the second disk under the same directory.

    For those with the DECtalk card, one might consider "emacspeak" available from the Debian or Slackware distributions.
  • The June 1999 Popular Electronics magazine has an article which shows how to build a Tele-computer Controller. It gives you ring detection and DTMF detection for only a $57 kit, probably around $20 in parts, and it hooks up to a parallel port. I'm going to be using a modified circuit to build a mini PBX system for my house. http://www.geocities.com/siliconvalley/foothills/1897/ [geocities.com]
  • If you do "festival --tts" or "festival --tts filename" it bypasses all that lisp stuff.
    It will read out the contents of filename. I have
    set it up with a female voice and use netcat to
    attach it to a port on my client machine, then
    other computers on the network can telnet to it
    (using expect) and send ascii words that get spoken on my workstation. E.g., "I cannot ping
    the mailserver at so and so". Well, I think it
    is interesting. tts of this quality used to cost
    thousands.
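The same idea (a network port that speaks whatever text is sent to it) can be sketched without netcat. This is a rough stand-in, not the poster's setup: it assumes `festival --tts` reads text from standard input as described above, and the host, port, and one-message-per-line protocol are made up for illustration.

```python
import socketserver
import subprocess

TTS_COMMAND = ["festival", "--tts"]  # speaks the text it reads on stdin


def speak(text, command=TTS_COMMAND):
    """Pipe `text` to the TTS command and return its exit status."""
    return subprocess.run(command, input=text.encode("utf-8")).returncode


class SpeakHandler(socketserver.StreamRequestHandler):
    """Speak each non-empty line a client sends."""

    def handle(self):
        for raw in self.rfile:
            line = raw.decode("utf-8", errors="replace").strip()
            if line:
                speak(line)


def serve(host="127.0.0.1", port=5005):
    """Listen until interrupted; host and port are placeholder values."""
    with socketserver.TCPServer((host, port), SpeakHandler) as server:
        server.serve_forever()
```

Binding to 127.0.0.1 keeps the sketch local. To let other machines on the network announce things like "I cannot ping the mailserver", you would bind to the LAN address instead and think about who is allowed to connect.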

  • While IBM is making very aggressive moves to fix this gap, Linux currently lacks the software which would connect it to Big Blue enterprise systems. A large bank, for example, might need to make a query using LU 6.2 (SNA) into its mainframe. Many airlines run their CRS (Customer Reservation Systems) on Unisys mainframes, which use some more obscure protocols (even if they are IP based) to connect into their boxes.

    The second problem is that a lot of the IVR systems are custom built by system integrators, who are usually more familiar with NT or OS/2 (large in the IVR market) than with Linux.

    Are there any moves being made by an existing IVR vendor to port their systems over to linux?

  • Very cool. I just tried xringd, and it seems to work pretty well. The source is pretty simple, too.
  • I emailed a question to AskSlashdot which hasn't gotten posted. The question goes: are there any combination hardware-API packages available for developing CTI on Linux?

    Our choice platform used to be Dialogic with their nice hardware or Talking Technologies, Inc. with their Powerline II card on MS-DOS (the Windows NT platform was never stable enough and we used to get complaints about that), but I can't seem to find an API/lib for Linux.

    Is there anybody out there doing CTI/Linux right now? Our company is willing to try any hardware that supports a stable and free OS such as Linux.
  • Dialogic [dialogic.com] has one of the better cards for CTI development. I asked them [mailto] casually several weeks back if they would support Linux and they said they were "looking into it". I then asked a friend who worked very closely with Dialogic on MS-DOS and NT (I should mention he tries to avoid this because it necessitates a reboot once a week), and he showed me an email from Dialogic stating they were NOT yet supporting Linux.

    There's a survey at http://www.dialogic.com/uk/forms/ossurvey.htm [dialogic.com] which asks what OSes we use (and perhaps would like to use CTI on). It says UK on the URL and Dialogic customer on the page, but I'm sure they won't mind if we show them our support for our favorite OS. If you would like to see a Dialogic SDK for Linux, please sign up!

  • I am impressed with Festival [ed.ac.uk]. RedHat RPMs and Debian packages are available.

    It comes with several British voices. Several American voices, a Mexican Spanish voice and a German voice are available from the Oregon Graduate Institute [ogi.edu].

    Call me a nerd but I like to hear the original voices quote my favorite lines from Monty Python.
  • Dual tone, multi-frequency
  • A roommate and I took an external Cardinal voice modem and hacked an answering machine server for it. The thing could handle caller ID as well. We had great fun recording individual messages for everyone the box could resolve. Multiple mailboxes... Course the playback and interface was pretty primitive, some perl scripts and sox.
    I was thinking of putting it out on the Net, but due to creeping featurism the code is embarrassing. :) (That and I find little time to fix things that aren't quite broken) I would imagine that each modem would have its own little method for all the functions. Everything on the Cardinal was AT commands. I'd recommend the box to anyone.
  • Touch tone and web based application? What's the connection? There are DTMF to serial converters available that would allow you to pull touch tone info off the phone line, and some modems ought to be able to do this also. But why web?
  • I had a reasonable text -> speech app, But I can't get hold of the name right now because I'm in combat with my partition table and the 1024Cyls limit :-(.

    It was somewhere in the sound section of metalab.unc.edu.

    QuMa
  • i'm writing a complete voice-mail and fax system for linux which uses vgetty. the only problem is that your modem needs to be supported by vgetty.

    currently, vgetty uses hard-coded modem support, and is a little difficult to hack to get working with modems that aren't supported in the release. so if you don't have a modem that vgetty currently supports, it takes a little hacking to get it working. this could be overcome if vgetty moved to a plain-text file for modem configuration, sort of like the (eek) .inf files in win because essentially it is a matter of sending AT commands to the modem to put it into different states (record, playback, etc...) and those could easily be inserted in a text file, like say "Play AT+TX" to play a file, for example (probably not the right command. ;)

    anyways... there are programs out there. mvm, and other vgetty scripts that sit on top of vgetty.
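The plain-text modem description suggested above could look something like the following. This is only a sketch of the idea: the file format, the action names, and the helper functions are all made up for illustration, vgetty does not read such a file, and the `AT#VTX` / `AT#VRX` strings are Rockwell-style voice commands used as placeholders.

```python
# Hypothetical plain-text modem description: one "action command" pair per line.
# The format and the action names are invented for illustration only.
SAMPLE_CONFIG = """
; action   AT command
play       AT#VTX
record     AT#VRX
"""


def parse_modem_config(text):
    """Map each action name to the AT command string that triggers it."""
    commands = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and ';' comments. '#' is left free because
        # Rockwell-style voice commands contain it.
        if not line or line.startswith(";"):
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            raise ValueError(f"expected 'action command', got: {line!r}")
        action, command = parts
        commands[action.lower()] = command.strip()
    return commands


def command_bytes(commands, action):
    """Return the bytes to write to the serial port for a given action."""
    return (commands[action] + "\r").encode("ascii")


if __name__ == "__main__":
    table = parse_modem_config(SAMPLE_CONFIG)
    print(command_bytes(table, "play"))
```

Supporting a new modem would then mean shipping a new text file rather than patching and recompiling the driver code.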
  • Right up front I'll note I'm not unbiased, since I'm an engineer working for Aculab [aculab.com] (not Acculab, that's a company that makes cute little balances and scales).

    The best resource I've found for finding out about Linux and Computer Telephony is www.linuxtelephony.org [linuxtelephony.org]. They even leaked about Aculab's Linux support policy last year (it was only announced in February officially).

    As an engineer working in the CT industry, I'm seeing more and more companies considering real, money-earning projects using Linux. We're starting to see support for Linux from other CT vendors, and I wouldn't be surprised if Microsoft's investment in Dialogic is a reaction to that. My personal prediction is that we'll see Linux and Solaris grabbing an increasing share of these types of applications, so it should get easier and easier to get Linux drivers and software for CT cards. Aculab certainly plans to expand on Linux support from what I can see.

    Solid Compact PCI and hot-swap support under Linux would help a lot in this application area though.

    --- Calum

  • Most of the hardware manufacturers I've spoken to recently are finally (reluctantly?) acknowledging Linux's popularity and are planning Linux releases.

    My sources at Dialogic [dialogic.com] say they'll probably have a Linux port of their drivers out sometime early next year, while the good folks at Aculab [aculab.com] already have Linux drivers for some of their products and will release more Linux drivers in the coming months.

    Aculab is also pretty good about releasing hardware specs if you're really interested in doing a port yourself. Dialogic has never quite understood that they could boost sales of their products by releasing enough information to allow outside parties to write more drivers.

  • This ISN'T Linux, it's SCO, but read on.

    http://www.nexpath.com [nexpath.com]

    This is a SCO based phone system which we happen to use here at work and it works quite well. I was browsing their FTP site one day and happened to notice some Linux files...some headers and what looked like a lowlevel driver of some sort. I didn't look into it much. (in /pub/linux) I'm suspecting that they are working on porting their stuff over to Linux to cut down on licensing costs.
  • Slowly it is happening. I develop IVRs and have started to see a few packages forming for Linux telephony... If you have the bucks for a decent telephony card, Natural Microsystems (www.nmss.com) sells some fairly decent stuff. They also recently released their CTAccess SDK source code for Linux, pretty cool! There is also a site (www.linuxtelephony.com) where you can read all about this stuff, and I think the CTAccess source is on there.
  • Not much help on the telephony-specific angle, but for general purpose speech synthesis and analysis info for Linux, check out the Linux Audio Developers' Resource Page [bright.net].

    Why does it seem like suggesting this link is my answer to many Ask Slashdot questions? Maybe we need a FAQdot!

    Div.


    --
    But my grandest creation,
    As history will tell,
    Was Firefrorefiddle,

  • I've been wanting to do something like this, but finding inexpensive hardware to do 2 lines hasn't been easy. I know Dialogic is supposed to be the stuff, but about two years back, I saw an ad for some vendors making inexpensive Dialogic-compatible hardware. Around $400 for a two-line board. I'd love to be able to conference two lines together, do voicemail, etc, but need more than one line, and I'm not wild about the voicemodem approach. I think it needs to be a DSP board!
  • Natural Microsystems announced Linux drivers for their AG cards last week. These include a range of cards from an 8-line loop-start to a dual T1 card. You can read more at http://www.nmss.com/nmss/nmsweb.nsf/new/linux [nmss.com]
    You might also want to check out http://www.opentelecom.org [opentelecom.org]

    One thing that I found particularly interesting about Natural Microsystems is that rather than developing Linux drivers, they open-sourced their existing drivers and, of course, Linux drivers soon followed.
  • Of course you mentioned emacs. It does everything. :))

    Also be sure to have a look through the LDP HOWTOs -- there is an Emacspeak HOWTO. Should be handy.
  • Any chance that your setup is/will/would be available as a
    HOW-TO ... (hint, hint)

  • Another link of interest:

    http://www.openH323.org/ [openh323.org]

    Juan

  • by whoop ( 194 )
    ringconnectd will allow a simple setup, but not (that I know of, a little tweaking could allow it) the sort of dual counts you suggest.

    I used it for a long time with vgetty to act as an answering machine. If the phone rang once, it dialed up ppp. If it went for 4 or 5, I forget now, vgetty would pick up and record a message.

    My beef with vgetty was that it would not play any message to greet callers. So only family/friends knew that when it beeped (it was quite a loud beep too), just start talking. The many times I tried, it either left the phone on hook until I went back home to reset it, or would just play an empty hiss for the length of the sound file.
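The ring-count split described above (one ring brings up ppp, four or five rings let vgetty answer) boils down to a small decision function. A rough sketch, not ringconnectd's actual logic: the function name, the action labels, and the default threshold of 4 are assumptions made for illustration.

```python
def action_for_call(ring_count, answer_after=4):
    """Decide what to do with a call, given how many times the phone rang.

    ring_count   -- rings counted before the caller hung up or we picked up
    answer_after -- ring count at which the answering machine takes the call
                    (assumed to be 4 here; the poster recalls "4 or 5")
    """
    if ring_count >= answer_after:
        return "answer"      # let the voice getty pick up and record
    if ring_count == 1:
        return "dial_ppp"    # a single ring is the signal to go online
    return "ignore"          # 2-3 rings: someone gave up, do nothing


if __name__ == "__main__":
    for rings in (1, 2, 4, 5):
        print(rings, action_for_call(rings))
```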
  • The sound quality coming out of voice modems sucks. At least on my modem, using some voice modem package that was around sunsite in 1997, the modem playback was unintelligible. The modem requires some horrible variation of ulaw compression. If the sound quality on modems was usable, voice modems would make a great touch tone interface.

    To get really intelligible sound, you need some kind of dedicated, expensive, phone hardware.
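For anyone wondering what ulaw (mu-law) compression does to a sample, here is a minimal sketch of the textbook companding curve with mu = 255. It is the continuous formula only, not the segmented G.711 encoder and certainly not whatever variation a particular modem expects; the function names are made up for the example.

```python
import math

MU = 255  # companding constant used by North American / Japanese telephony


def mulaw_compress(x):
    """Compress a sample in [-1.0, 1.0]; quiet samples get more resolution."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mulaw_expand(y):
    """Inverse of mulaw_compress."""
    return math.copysign(((1 + MU) ** abs(y) - 1) / MU, y)


def to_8bit(y):
    """Quantize a compressed value in [-1.0, 1.0] to an integer 0..255."""
    return round((y + 1) / 2 * 255)


if __name__ == "__main__":
    for sample in (0.0, 0.01, 0.1, 0.5, 1.0):
        compressed = mulaw_compress(sample)
        print(sample, round(compressed, 3), to_8bit(compressed))
```

The point of the curve is that small amplitudes are stretched before quantizing to 8 bits, so quiet speech keeps more detail than it would with linear 8-bit samples.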
  • by Matts ( 1628 )
    KVoice handles my voice mail, although it's a bit unstable, and the pickup feature is crap (there's no automatic pickup - you have to load kvoice and click on pickup - which is impossible to do in time if you're not logged on!).

    You can do demand fax serving with HylaFax.

    Other than that, I don't know of any text->speech or speech->text projects. Unfortunately it's not something that can be done very easily for free - it requires a huge investment of time, hence why these speech->text systems were originally hugely expensive.

    Matt.

  • I was very impressed with the Festival software. Anyone looking for speech synthesis should definitely take a peek at it. Text-to-speech isn't quite as nice in it as the speech synthesis itself, but it's not bad.

    It's a system hog though. I tried to use it to read e-mails to me through my voice system (see my other posting in here about it), but I found it took several minutes per message to put the audio together... Hardly worth it. Hell, my system is so slow, even using say to generate timestamps is too slow. :)
  • by tgd ( 2822 )
    Yes, it's possible. Pretty easy to set up too, once you've got vgetty working with your voicemodem. You need a voicemodem that works with Linux and vgetty though (most voicemodems these days seem to be winmodems...)

    I shied away from dynamic DNS and just e-mail the number to my pcs phone.

    One tip -- make sure you have an activity timeout on it, so that if you dial it up accidentally, or (for whatever reason) the dynamic DNS doesn't update or the e-mail never arrives, it will still disconnect on its own.

    Throw a secure webserver on there, and just make some simple CGI's to trigger a delay to bring the machine back off the network.

    On my system I've got an X10 automation setup too, so I can remotely turn on other systems in my apartment. (Useful if I'm a bonehead and leave a file I need at home...)
  • Having worked on this for a while...

    The main problem with speech recognition over the telephone is that the digital standard currently used by the PSTN samples voice at 8 kHz, with each sample being 8 bits wide. As a result, the speech recognition engine just doesn't have a whole lot of data to play with -- speech recognition algorithms typically use a lot of statistics to determine how well a given chunk of speech matches a word stored in its vocabulary. The less data in the incoming speech, the harder it is to be accurate with a match. In fact, it actually gets harder, as many cell phones use various encoders to further reduce the data rate. Add that to interference and background noise, and ASR over the phone is decidedly not easy.
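To put numbers on that, here is a small back-of-the-envelope calculation. The 8 kHz / 8-bit figures are the ones given above; the 16 kHz / 16-bit "desktop microphone" figures are only an assumed example for comparison, not a claim about any particular product.

```python
def bit_rate(sample_rate_hz, bits_per_sample):
    """Raw (uncompressed) data rate of a mono audio stream, in bits/second."""
    return sample_rate_hz * bits_per_sample


def max_frequency(sample_rate_hz):
    """Highest frequency a given sample rate can represent (Nyquist limit)."""
    return sample_rate_hz / 2


PSTN_RATE = bit_rate(8000, 8)      # telephone channel, figures from the post
MIC_RATE = bit_rate(16000, 16)     # assumed desktop microphone setup

if __name__ == "__main__":
    print("PSTN bits/s:", PSTN_RATE)
    print("Mic  bits/s:", MIC_RATE)
    print("Ratio:", MIC_RATE / PSTN_RATE)
    print("PSTN top frequency (Hz):", max_frequency(8000))
```

Under those assumptions the recognizer sees a quarter of the raw data over the phone, and nothing above 4 kHz, before any cell phone codec or line noise is taken into account.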

    Many of the shrink-wrapped ASR applications that you see are designed to work through the microphone jack on a computer, which provides much more data than is available over the phone network. IBM, L&H and Dragon are the vendors I'm aware of.

    Now, there are various vendors out there who do ASR for phone applications. Nortel (my employer, but not my project) has one, as does VCS, Nuance Communications and several others. These, however, are not generally priced for the consumer market. In addition, many of these solutions run on Digital Signal Processors, which require additional cards....

    OSS speech rec would be a good thing, but I'm afraid that it's going to be a while before it comes to pass, just because of the complexity of the statistics and the specific knowledge required. Those reasons also mean that it'll probably be a while before a PDA has the juice for it.

    (There's the urban legend of the guy presenting ASR control of his computer at a voice conference, when a voice from the back of the room shouts "Format c: Return" and somebody else chimes in "Yes Return")
  • There are two major sites corralling telephony projects for Linux:

    linuxtelephony.com is an omnibus site, which seems not to have had any updates recently, and

    opentelecom.org which, well, has. :-) These folks are supported by Natural Microsystems, who have released a bunch of their code as open source under some license or another. I mean internal switching and driver code and like that.

    On a lower level front, it's possible to use mgetty+sendfax and Gert Doering's vgetty to build answering machine type stuff and also, possibly, 2-call fax response. I'm not sure about 1 call; switching modes can be messy.

    This stuff works with the old Zyxel 1496+ modems, among others, and _maybe_ with the Rockwell voice chips, but I'm not sure; the Zyxels ought to be, roughly, free, by now.


    Cheers,
  • Ohio-state [ohio-state.edu] has a FAQ on using [mv]getty for voice mail.
  • Natural Microsystems has better boards, especially for industrial applications. They have released Linux drivers as well as source to their API at http://www.opentelecom.org.
  • Just what I need. A reliable system to increase the number of unsolicited calls I get every evening when I'm eating dinner.

    I wonder how long it will be before that happens? I'm not sure what systems are used now, but they can't be cheap.

    Maybe I can set up my box to call them back? Or at least filter out the unsolicited calls or maybe even have preprogrammed answers to use up their time. Now there are some ideas. :)

    ~afniv
    "Man könnte froh sein, wenn die Luft so rein wäre wie das Bier"
  • Reveal's VM100 Telesound ($59 list) plugs into a serial port, phone line, and sound card. It is basically just a ring detector, on/off relay, and interface between phone line and sound card. I sometimes see them at electronics sales.

    Some VM100 FreeBSD code here. [freebsd.org]

    A press mention of the VM100 in Byte [byte.com]

  • That's nice. Wish they had not said no two years ago when I could have used it. Too late now for that project.
  • I use some source I ported over from NeXTSTEP called am. It drives a Zyxel modem, and allows callers to page me, leave a message, or receive a fax. When a fax or voice mail arrives, the caller ID number is sent to my pager via an email-to-pager gateway. I then forward the voice and fax mails to myself via email, so that I can get them and store them on my notebook on the road.

    I'm also in the middle of using this technology to provide a replacement for an old VRU (Voice Response Unit) from IBM. It grabs data from an AS/400 and provides information to customers on current shipments etc.

    Very easy to write. My next project involved with this is to use ears, or something like it to convert the voice to text (and then send it to my pager)

  • At PIKA we already have the API in beta. See www.pika.ca.
  • At PIKA we have a beta version of our API running on Linux. Supports all basic telephony and fax.
    No text to speech or voice recognition.

    For more on Linux telephony see:
    http://www.linuxtelephony.org/
  • by schwantz ( 5095 ) on Monday April 12, 1999 @04:09PM (#1937507) Homepage
    There are AT commands to do all this stuff, if you want to roll your own software. You'd have to do the system side (sound, etc) yourself. Rockwell (now Conexant) supports this through the use of what they call "business audio," which uses half-duplex digital PCM audio data from your computer (over the serial port/ISA slot). They also have an analog path to and from the chip, but that would be trickier, as unless you have a speakerphone version, the mic from your PC is probably not hooked up to your modem. Here are a few Rockwell (they're the MOST common modem chipset manufacturer) AT commands (including fax and CLID) to get you started:

    7.5 CALLER ID COMMANDS
    #CID=0 Disable Caller ID.
    #CID=1 Enable Caller ID with formatted presentation.
    #CID=2 Enable Caller ID with unformatted presentation.
    7.6 FAX CLASS 1 COMMANDS
    +FCLASS=n Service class.
    +FAE=n Data/fax auto answer
    +FRH=n Receive data with HDLC framing.
    +FRM=n Receive data.
    +FRS=n Receive silence.
    +FTH=n Transmit data with HDLC framing.
    +FTM=n Transmit data.
    +FTS=n Stop transmission and wait.
    7.7 FAX CLASS 2 COMMANDS
    +FCLASS=n Service class.
    +FAA=n Adaptive answer.
    +FAXERR Fax error value.
    +FBOR Phase C data bit order.
    +FBUF? Buffer size (read only).
    +FCFR Indicate confirmation to receive.
    +FCLASS= Service class.
    +FCON Facsimile connection response.
    +FCIG Set the polled station identification.
    +FCIG: Report the polled station identification.
    +FCR Capability to receive.
    +FCR= Capability to receive.
    +FCSI: Report the called station ID.
    +FDCC= DCE capabilities parameters.
    +FDCS: Report current session.
    +FDCS= Current session results.
    +FDIS: Report remote capabilities.
    +FDIS= Current sessions parameters.
    +FDR Begin or continue phase C receive data.
    +FDT= Data transmission.
    +FDTC: Report the polled station capabilities.
    +FET: Post page message response.
    +FET=N Transmit page punctuation.
    +FHNG Call termination with status.
    +FK Session termination.
    +FLID= Local ID string.
    +FLPL Document for polling.
    +FMDL? Identify model.
    +FMFR? Identify manufacturer.
    +FPHCTO Phase C time out.
    +FPOLL Indicates polling request.
    +FPTS: Page transfer status.
    +FPTS= Page transfer status.
    +FREV? Identify revision.
    +FSPL Enable polling
    +FTSI: Report the transmit station ID.
    7.8 VOICE COMMANDS
    #BDR Select baud rate (turn off autobaud).
    #CLS Select data, fax, or voice.
    #MDL? Identify model.
    #MFR? Identify manufacturer.
    #REV? Identify revision level.
    #TL Audio output transmit level.
    #VBQ? Query buffer size.
    #VBS Bits per sample.
    #VBT Beep tone timer.
    #VCI? Identify compression method.
    #VGT Set playback volume in the command state.
    #VLS Voice line select.
    #VRA Ringback goes away timer (originate).
    #VRN Ringback never came timer (originate).
    #VRX Voice receive mode.
    #VSD Enable silence deletion (no function, command response only).
    #VSK Buffer skid setting.
    #VSP Silence detection period (voice receive).
    #VSR Sampling rate selection.
    #VSS Silence detection tuner (voice receive).
    #VTD DTMF/tone reporting.
    #VTM Enable timing mark placement.
    #VTS Generate tone signals.
    #VTX Voice transmit mode.
    7.9 VOICEVIEW COMMANDS
    +FCLASS=n Service class
    -SVV Originate VoiceView data mode
    -SAC Accept data mode request
    -SIP Initialize VoiceView parameters
    -SIC Reset capabilities data to default setting
    -SSQ Initiate capabilities query
    -SDA Originate modem data mode
    -SFX Originate FAX data mode
    -SMT Mute telephone
    -SDS Disable switchhook status monitoring
    -SQR Capabilities query response control
    -SCD Capabilities data
    -SER? Error status (read only)
    -DTP VoiceView transmission speed
    -SSR Start sequence response control
    +FLO Flow control select
    +FPR Serial port rate control
    -SSV VoiceView data mode start sequence event
    -SFA Facsimile data node start sequence event
    -SMD Modem data mode start sequence event
    -SRA Receive ADSI response event
    -SRQ Receive capabilities query event
    -SRC: Receive capabilities information event
    -STO Talk-off event
    7.10 DSVD COMMANDS
    -SSE=1 Enable DSVD
    -SSE=0 Disable DSVD
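If you do roll your own, the software side of the caller ID commands could start out like this. A sketch only: it builds the command bytes and parses the modem's report without touching a serial port, and it assumes the formatted presentation arrives as `KEY = value` lines (for example `NMBR = ...`), which is how formatted output commonly looks but should be checked against your own modem. The function names and the sample values are made up.

```python
def at_command(body):
    """Build the bytes for one AT command, e.g. at_command('#CID=1')."""
    return ("AT" + body + "\r").encode("ascii")


def parse_caller_id(lines):
    """Collect 'KEY = value' lines from the modem into a dict.

    Lines without '=' (such as RING) are ignored. The field names are
    whatever the modem sends; DATE, TIME, NMBR and NAME are assumed here.
    """
    fields = {}
    for line in lines:
        key, sep, value = line.partition("=")
        if sep:
            fields[key.strip().upper()] = value.strip()
    return fields


if __name__ == "__main__":
    print(at_command("#CID=1"))          # enable formatted caller ID
    sample = ["RING", "DATE = 0412", "TIME = 1609", "NMBR = 5551234", "RING"]
    print(parse_caller_id(sample))
```

In a real program the bytes from `at_command` would be written to the modem's serial device and the lines read back between the first and second RING.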

  • by tgd ( 2822 ) on Monday April 12, 1999 @03:57PM (#1937508)
    It's very possible.

    I've currently got an old 486/50 DX running Linux 2.2.5 at home that handles voicemail for me using mgetty and some custom shell scripts. (Unfortunately I was never able to get the vgetty perl module working... it's very old and there's almost no docs for it...)

    It's pretty slick. People calling can leave voice messages or faxes. I've got it set up so either one gets packaged up in a mime attachment to my e-mail and queued to send to me. Next time the system is online it sends them off. If they sit there more than two hours it'll dial itself up and send them and get back offline. Also archives them so I can get them through a web browser on any systems in my apartment, or I can just hit the reset switch on the front of the system (which is plugged into the parallel port) and it plays any new messages for me. The turbo light blinks when I've got new messages.
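The "dial up if anything has been waiting more than two hours" rule is easy to express in code. A minimal sketch, not the poster's shell scripts: the function name, the list of queue timestamps, and the `online` flag are assumptions for illustration.

```python
from datetime import datetime, timedelta

MAX_WAIT = timedelta(hours=2)  # the two-hour limit mentioned in the post


def should_dial_out(queued_at, now, online=False):
    """Return True if the box should dial up just to flush the mail queue.

    queued_at -- list of datetimes at which pending messages were queued
    now       -- current time
    online    -- True if the link is already up (mail goes out anyway)
    """
    if online or not queued_at:
        return False
    # Only the oldest waiting message matters for the deadline.
    return now - min(queued_at) > MAX_WAIT


if __name__ == "__main__":
    now = datetime(1999, 4, 12, 16, 0)
    queue = [now - timedelta(hours=3), now - timedelta(minutes=10)]
    print(should_dial_out(queue, now))
```

Something like this would be run periodically (from cron, say); after the messages are sent, the link is dropped again.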

    I can also control all the X10 stuff in my apartment (mostly useful for options #1 -- turn off all the halogen lights, and #2 -- turn off coffee pot, both reducing the chances that my spacing out one morning will result in my apartment burning down) ;)

    Last thing I can do is use it to cause my network to dial up. The system handles my masquerading and internet access as well as voicemail, so when it dials up my entire network is online, then it e-mails the IP address it got to my PCS phone. Secure SSL webpage on that IP address lets me control all those devices directly (especially turning on other PCs), check my messages, or disconnect the network...

    The real limiting factor I'd see in using it as an IVR system is more limited support of multi-line voice products, and the poor documentation and difficult programming for vgetty. I'm not sure there are any options other than vgetty.

    Using vgetty in combination with packages like HylaFAX gives you easy ability to do fax-on-demand and other services like that.

    I also used a system with three 14.4k voicemodems and vgetty as a way of validating information on a system that required the user give their true phone number. User was e-mailed a code to punch in after storing their supposed phone number and that code in a database. The voice system would use caller id and compare the code they entered with the code matching that number in the database. Match? Voila! Flag is set, account is activated.

    Worked great, client never used it though. C'est la vie.
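The matching step of that phone-number validation scheme can be sketched in a few lines. This is a guess at the shape of the logic, not the code that ran on that system: the in-memory dict stands in for the database, and the function and variable names are invented for the example.

```python
def verify_call(pending, activated, caller_number, entered_code):
    """Activate an account if the caller-ID number and keyed-in code match.

    pending       -- dict: claimed phone number -> code e-mailed to the user
    activated     -- set of phone numbers whose accounts are active
    caller_number -- number reported by caller ID for this call
    entered_code  -- digits the caller punched in via touch tones
    """
    expected = pending.get(caller_number)
    if expected is None or entered_code != expected:
        return False              # unknown number or wrong code
    activated.add(caller_number)  # "Flag is set, account is activated"
    del pending[caller_number]    # each code can be used only once
    return True


if __name__ == "__main__":
    pending = {"5551234": "8841"}
    activated = set()
    print(verify_call(pending, activated, "5551234", "8841"))
    print(activated)
```

The check only works because the number comes from caller ID rather than from the user, so entering a valid code from someone else's line does not match.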
