Natural Language CLIs?

snuf23 asks: "Altavista has a report on the future of Windows as presented by Bill Gates at Microsoft Professional Developers Conference. Curiously, one of the touted features is called "type in-line." Essentially, it's a text based interface to the computer which uses a natural language interface. Having worked at a translation software company for three years, I am familiar with the complications of parsing meaning between human languages. It seems that in computer to human you would have somewhat less complexity, at least in terms of general use. Have any natural language interface CLIs been built? Voice recognition software comes to mind ("Open the file, HAL") but what attempts have there been to replace shell interfaces with natural language interpreters?" While I'm all for making computers easier to use, would typing "move all files beginning with the letter a to the directory called 'foo'" be any improvement over "mv a* foo" (or "move a* foo" for that matter)?
  • by Anonymous Coward
    Go read the article [advogato.net]. It's good. It's about human-language CLIs and redoing the GUI.
  • ... the following menu of options. Blah, blah ...

    Well, not having to punch the darn buttons in the handset while trying to keep listening *IS* an improvement.

    I fairly recently checked on an airplane arrival time with an all-voice system (American Airlines, I think -- I liked it well enough to give them this plug), and I think that will be the biggest improvement most of us will see for a while from voice input.

    Also, I'd much prefer talking to a computer with a chance of finding out what I want, than staying on hold listening to my non-choice of music and I-love-my-resonant-voice announcements.

    But the key is appropriate technology for each application. Tight unambiguous syntax is efficient. How do you solve one of those "if I were as old as you will be when Alice is twice as old as she is now, and I am three times as old as you, how long will it be until ... blah blah" problems? Of course it helps if you are familiar with elementary algebra. Not all people (shame to the educational system) are.

    If you want to make all the files in a particular directory (sorry, "folder") have lower case names, without changing the names otherwise, how do you do that? Try it with Windows Explorer. Everything they thought of is easy, but stuff they didn't think of is a pain.

    A quick and dirty perl script makes it easy. That's the virtue of direct access to composable primitives, which GUIs tend to bar you from. The thing is, composition of primitives is best expressed in a language of symbols suitable to the task. That's why directly typed input to an interpreter/CLI/shell will always have a place.
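    That rename job the GUI can't do really is only a few lines at the shell. A minimal sketch, done here in plain sh rather than perl, and assuming no two names collide after lowercasing:

    ```shell
    # lowercase every file name in the current directory,
    # leaving already-lowercase names alone
    for f in *; do
        lc=$(printf '%s' "$f" | tr '[:upper:]' '[:lower:]')
        [ "$f" = "$lc" ] || mv -- "$f" "$lc"
    done
    ```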

    Imagine playing a piano by voice control at the key level. It's just as idiotic as voice control over computer primitives. But "play it again, Sam" would be ok. It's a matter of appropriate level.

  • by Anonymous Coward
    Foreign languages aside, maybe this sort of thing would enforce better use of proper grammar and correct spelling among some portions of the population. Especially where children are concerned this could be a good thing. If you have to parse English syntax/grammar then you have to create rules that typists must conform to. Maybe the user's need to get things done will force them to conform to the rules, and maybe that will carry over into everyday writing and speech.
  • by Anonymous Coward
    I've been using computers (mainly text-based interfaces) for 18 years, so excuse any perceived bias.
    There are a few things on human-computer interfaces which I can say with some certainty:
    1. perceived ease-of-use is strictly a per-user variable. there is a good chance that whichever type of interface you 'learned' computers by is the one you are most comfortable with.
    2. graphical interfaces are no less, no more efficient than text interfaces in the real world. It is largely the *input device* which determines efficiency of both interface types. For example, I have configured my X11 environment specifically to minimize use of the mouse. Mice are *clumsy*, and constantly having to reach for them slows down data/text entry greatly.
    3. what is commonly called a 'graphical interface' really isn't. everything, from MacOS X to BeOS, is a hybrid text/graphics interface. a true graphical interface would have much less text, much more graphics representing concepts.
    4. IMVHO, text is a step up from graphical communication. it involves linguistics, exercises that part of the brain. if you want pictures for interaction, buy a picturebook or become a caveman.
    5. the beauty of Unix is the tersely worded method of interaction. I find it much, much easier to type something like egrep 'a|b' *.[ch] than "find either 'a' or 'b' or both in all files with extension 'c' or 'h'".
    Let's face it folks, the conscious level of our brain works much, much faster than we can type. The Unix shell environment allows us (through use of 'abbreviations') to minimize the time between "thought" and "action". Let's keep a good thing going, eh?
  • by Anonymous Coward
    While I'm all for making computers easier to use, would typing "move all files beginning with the letter a to the directory called 'foo'" be any improvement over "mv a* foo" (or "move a* foo" for that matter)?

    No. Neither would speaking it, necessarily. It's the same problem the GUI has had for years. Making computers accessible to people without requiring them to learn about computers is like putting people behind the wheel without requiring them to learn how to drive. You get accidents. Like Melissa, the "Love Bug", and Chernobyl et al. Somebody in the Time letters section seemed to vehemently believe the solution to all these things was to have an international cyber police force and strict international laws exacting harsh penalties for "cyberhacking" (as he called it). Right, like that would stop it. All three of these exploits took advantage of people's ignorance about how to use the systems they were using.

    In the days when you had to read a book to use a computer, you also probably knew to back up your files and not open documents from an untrusted source. All my old "intro to DOS" books went over this stuff. AFAIR every Linux/UNIX book I've read has mentioned the usefulness of backing up your files.

    You also get "where's the any key" and "can you email me my password" type of junk. Requiring learning as a bar to entry for computers doesn't really ultimately keep many people out but it makes sure the people who get in know how to change a flat and check their oil, and not to pick up hitch hikers by the side of the road.
  • Rather than typing 'man mv' and trawling through some very badly written documentation...

    Consider the following (> is the normal prompt)

    > ? I want to move all files beginning with 'a' to '/here.for.example'
    The command for that is mv a* /here.for.example
    > ! do it

    > ? How do I delete a read-only file with rm
    The switch -f 'forces' the remove. i.e. You type rm -f <files>
    > rm -f <files>

    Even experienced 'nix-ers, given an unfamiliar command (possibly/probably from something new), could discuss what it does, call for help with a query, etc. This would be helpful, since the current system of man-pages isn't the best thing in the world.


    John
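    Even a trivial pattern-matcher gives a flavor of the '?' helper sketched above. A toy sh version with a couple of canned mappings (the function name and all the phrasings are made up; a real helper would need genuine parsing):

    ```shell
    # suggest: map a handful of English questions to command suggestions
    suggest() {
        case "$*" in
            *move*files*beginning*) echo "The command for that is: mv a* /dest" ;;
            *delete*read-only*)     echo "Use: rm -f <files>" ;;
            *)                      echo "Sorry, no suggestion for that." ;;
        esac
    }

    suggest "How do I delete a read-only file with rm"
    # prints: Use: rm -f <files>
    ```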
  • last time i tried that in windoze it didn't work.

    course it has been awhile, I try to stay away from windows as much as possible. I don't want anyone at work finding out I can do windows as well as UNIX, I'd be in a world of hurt then.
  • why do you feel the need to admit you're a dickless asshole?
  • by snort ( 1241 )
    don't you mean 'move a*.* foo'?
  • Under WinNT/2000 all files are assumed to have a . extension, even if they do not.

    move t*.* would match
    test
    test.txt
    test.com
    test.doc

    I won't argue if it's good or bad, that's just the way it is...
  • I think what you're describing is an office assistant that works for free. Your only hope is that somehow AI can be made more intelligent than your average office assistant.
  • This is something I've heard *way* too many times.

    Of course (to use your phrasing) computers need to be dumbed down. There's way too much complexity that the computer really should be handling, and not the user. And I don't just mean in linux, either. Windows (and to a lesser extent, Mac OS) is almost as bad - it just puts a thin veneer of user-friendliness over the whole mess.

    In Linux, to access the files on an unmounted partition, you need to know a) the syntax of the mount command or the /etc/fstab file, b) where exactly the partition or drive is, c) whether the drive the partition is on is SCSI or IDE, d) how to figure out the device name based on where the partition is on the drive and what type of drive, and e) what filesystem the partition is formatted in.

    Okay. If you run fdisk, it's easy to see that linux knows what partitions are on your system and what filesystem they're in! Who is served by telling the OS information it *already knows*?

    Ease of use does not imply instability, or lack of security, or proprietary BS that makes life hard for everyone else (like, say, Outlook stationery or Word 2000). It just so happens that UNIX users and vendors haven't cared about ease of use, and Microsoft and Apple haven't really cared about stability or security.
  • Yes, this means even more work than just parsing a natural language and would require a pretty sophisticated model of interaction, but isn't that the kind of challenge that produces revolutionary advances in computing?

    IMO, revolutionary advances in computing are generally a matter of making things simpler and more elegant. The more sophisticated things are, the more chances for things to be screwed up. Not to mention the speed penalties suffered.

  • > mv a* foo isn't hard to learn!

    But you have to learn it. Using natural language would save people needing to learn the commands.

  • > We need to go in the other direction. "Open my wedding photos."
    > "Spell-check the latest draft of my current novel." "When did I
    > receive an e-mail from my publisher with 'foo' in the subject?"

    this is a very good point. . the value of extending the human-computer interface is that it makes complicated things easier (or possible), not that it duplicates the existing interface for simple things.

    the command-line interface is, broadly speaking, based on a fairly limited model of user behavior and environment. . it assumes the existence of a 'user' who owns a comparatively small number of 'files' (less than a few hundred per directory), and leaves the two big questions: "what's in this file?" and "where's the file that has X?" entirely up to the user. . yes, unix offers utilities like grep, which can do literal text searches, but there's still no good utility for non-literal meta-descriptions, like the ones you describe.

    meanwhile, as drive capacity has grown, the assumptions behind the CLI have lost some of their validity. . most people's hard drives are a compost heap of badly-organized information, full of file trees they no longer remember creating. . we need, good, intuitive tools that help us organize tens of thousands of files, not the digital equivalent of half a dozen warehouses full of filing cabinets. . that probably means that we'll have to pass some of the work of organizing things off to background processes (agents), and definitely means that we need a more sophisticated interface for getting at things.

    whether natural language has any part in that is still an open question, though.

  • I can't believe you don't think anything has changed. Let's see, Pentiums are new. Athlons are new. Highspeed memory is new. PCI is new. Ten more years of ongoing research by the top universities and corporations has happened.

    Faster hardware is essentially irrelevant to this argument. Any of the algorithms that run on a modern PC would have run on a VAX or the facilities that universities and corporations had available through the 70's and 80's. They may have taken minutes rather than seconds to run, but for research that's generally not considered a problem. What hasn't changed all that much is the algorithms used to do natural language processing, and their limitations are still pretty much the same as ever - difficulties with ambiguity, and extremely limited vocabulary and contextual knowledge (the last difficulty being the fundamental problem with all AI-related research). Oh, and by the way, mainframes of the 70's and 80's, and probably high-end departmental servers, had I/O architectures that were considerably more advanced in many respects than the PCI bus.

    Where have you been?

    At one of the best universities in the Asia-Pacific. While I wasn't involved in NLP research, colleagues were, and if there was some fundamental improvement I would have heard about it.

  • Actually, I've written such a parser; it was a hack, and very limited (recorded AppleScript only) but it worked.

    Having done this, I discovered people much smarter than me (Henry Lieberman at MIT, for one) had done it properly: they fed a grammar into one of those compiler-parser-write-program things and it worked.

    There is a lot more structure to AppleScript than might be immediately apparent. AppleScript is just one representation of the "Open Scripting Architecture", which lets you represent "AppleScript programs" in pretty much any language you care to define. See the C-like Frontier UserTalk for an example of something that looks a little easier to parse.

  • /* begin rant */

    It is my belief that people are going about this the wrong way.

    Humans adapt better than machines. Much better. Having humans labour to create machines that adapt to the way humans interact now is less efficient than teaching humans to adapt to a scheme which machines can easily work with. One of the main problems the computer industry has faced in the past is that the vast majority of people had never used a computer before, and didn't understand the logic behind how they worked, how to interact with them, and what they could do for you. Pretty soon, this is going to change. More and more young people today grow up with computers.

    Technology that is currently aimed at making it easy for neophytes to get ramped up into using computers is going to HOLD PEOPLE BACK in the future.

    As humanity has progressed, society has always adapted to the important inventions and developments that have happened. A simple example is cars. Cars initially had a very complex interface, and they still do. You don't tell your car to "go forward", "turn right" or "turn left".. you talk in a language that the machine can understand (a steering wheel, pedals, stickshifts, etc..). Pundits today would label this as "unintuitive". Sure it is unintuitive.. but do you know how SLOW and PAINFUL it would be to drive using an "intuitive" "human based" interface?

    We may or may not be able to come up with computers that grok natural language commands. But I believe that this will make computers less efficient. Computers today have much more potential than the current "allowing you to do the things you already do in a different way". Just like the automobile revolutionized the way society interacted, computers have the potential to revolutionize the way we currently interact. But to be able to do that, society is going to HAVE to learn new things, knowledge that is relevant to this particular invention. To use a tool effectively, YOU have to adapt to it. To use a tool as powerful as a computer, people are going to have to LEARN, and I think that the vast majority of people have the capability to understand the concepts behind computers, and put them to good use.

    The current conception that programming is "hard", and something to be left to the geeks and hackers, is (I believe) superstition. Modern languages are based on a few concepts that are easily understandable.

    The direction that we should be pursuing is not trying to dumb down computers to the level that people currently interact at, but raise people to a level of knowledge where they have the power to use computers effectively.

    This is why instead of supporting initiatives like these, I prefer things like Guido's "Computer programming for everybody" project. In the long run, this will benefit society much more than any "natural language command line".

    /* end rant */

    -Laxitive
  • There was a database I used in DOS long ago that accepted inquiries like "make a report of all employees whose pay rate is more than 7.50".. it worked surprisingly well.

    That kind of thing is going to work with a database inquiry but not with an OS and file system.

    I think that if all of a user's documents, information and methods were stored in one database -- or at least links to all of their files stored in a database -- then it could work out better.

    (most) People don't think/speak in file directories and processes. Instead they talk about people, documents, information, and projects. A user's personal data would have to be organized accordingly.

    But if you want to mv img??.jp* .. or whatever then you might just have to open an xterm... the interface for dealing with files and processes already exists.
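    For comparison, the pay-rate query above becomes one line of awk once the data is structured. A sketch over a hypothetical colon-separated name:payrate flat file:

    ```shell
    # a hypothetical flat file of name:payrate records
    printf 'alice:9.00\nbob:7.25\ncarol:8.10\n' > employees.txt

    # report all employees whose pay rate is more than 7.50
    awk -F: '$2 > 7.50 { print $1, $2 }' employees.txt
    # prints: alice 9.00
    #         carol 8.10
    ```

    The hard part the NL front end was doing is exactly that translation from the English sentence to the predicate.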
  • by Requiem ( 12551 )
    because the people this is targeted at wouldn't know what "mv a* foo" means.

    btw, first post.
  • Actually, I'd say he's wildly successful. PKZip and PKUnzip are installed on every one of my computers that run DOS (at last count, four), and I know I'm not alone.

    I'm all for open source, but I'm not afraid to use better tools when they're available.
  • I didn't pay for four licenses, retard. From pkunzip.exe:

    If you use PKUNZIP on a regular basis, you are strongly encouraged to register it.

    So yeah, I'm legal on all four copies, because I'm not forced to register. I should have registered at least once a long time ago, of course, but $47 is a lot when you consider that I just use it to unzip archives. Would you pay $47 for a decompressor? Didn't think so.

    And yeah, good for you for doing cross-platform stuff at work. I do no such thing at work, since my work is not tech-related. At home, I've got three DOS boxes (8086, 386 laptop, 486), and one Windows/Linux box (K5-350, I believe). I had pkzip and unzip installed on all of my DOS boxes before I learned about Linux (1997), and it came pre-installed on the newish Compaq.

    Thanks for playing. I'm keeping my copies because they do their job and do it extremely well. If you want to use open zip or whatever, that's good, but don't try to force your zealotry on me.

  • Xerox 8086, manufactured by Olivetti in the early 80s, and bought refurbished in 1986. 640kb RAM, 20 MB hard drive (after the 5MB one failed in 1989), CGA, etcetc. It's over fifteen years old now.

    It's sitting on the floor in my bedroom, and boots up to MS-DOS 3.1. I still use it to play Adventure and DnD for nostalgia value.

    So yeah, I'm pretty sure that the 8086 was used as a PC processor. Sorry.
  • The pic language by Brian Kernighan is a preprocessor language that describes how to draw and format simple graphics like boxes, arrows etc. I believe Richard Stevens used it for all his illustrations. Pic uses pieces of English grammar; it looks a lot like English sometimes.
    Check out http://www.kohala.com/start/troff/cstr116.ps
    http://www.kohala.com/start/troff/gpic.raymond.ps
  • ...so newbies can write what they mean and worry about learning the "shortcuts" later. An OS that had only a human interface would suck, but if it had both it'd be pretty cool, like when I've been away from *nix for a while and I can't remember the damn syntax of various command options.
  • I'm going to make the argument that the POSIX CLI is the closest thing we have to language parsing right now. Unfortunately /usr/bin is chock full of unusual programs with non-intuitive names and each one has to be learned separately. The GNU project has done wonders with the standard POSIX utilities, having '-h --help' and long options '--this-option-here=foo'.

    Where we need to go from here is a standard, well named and optioned, command set that is very easy to figure out, given a little training and a few rules. For example if a utility can have file input it should take the argument '-f or --file' and that should be standard. I have been using Linux and other Unices for several years and still have to look at the man pages for common commands. Another example is the very useful utility awk(1); there is no way I would have found it by myself and no way I would have learned its syntax by myself. The find(1) utility is similar in that I still have to look up the man page for exact syntax when I use it.

    I know that I am rambling on but I really think that CLI interfaces with output-mainly windowing displays should be our future, but not if the full power of the environment is accessible only to gurus. My mom should be able to type "print foo.txt --duplex --pretty-print" instead of having to remember "enscript -2rG --pretty-print=xyz -DDuplex:true foo.txt" or "list foo.txt | search for 'sometext' | print --with-fancy-header --page-numbers --to=Ywindow"
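    The friendlier 'print' command above doesn't even need NL parsing; a thin wrapper mapping easy flags onto the enscript incantation would do. A sketch (the option mapping is invented for illustration, and the wrapper echoes the command instead of running it):

    ```shell
    # print_doc: translate friendly options into an enscript command line
    # (hypothetical wrapper; echoes rather than executes)
    print_doc() {
        file=$1; shift
        flags=""
        for opt in "$@"; do
            case "$opt" in
                --duplex)       flags="$flags -DDuplex:true" ;;
                --pretty-print) flags="$flags -G" ;;
            esac
        done
        echo "enscript$flags $file"
    }

    print_doc foo.txt --duplex --pretty-print
    # prints: enscript -DDuplex:true -G foo.txt
    ```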

  • To make a completely NL CLI, one would have to create a complete AI first. I hope I don't need to tell you that task is a bit over Microsoft's head. On the other hand, they don't have to make it complete. For instance, a person might type c:\>go to my documents folder, and the command interpreter should be able to understand it. If they took, let's say, the 300 or 400 most common commands people might use, that would be a great help for newbies. It may also say something like "next time when you need to do this, you might want to try the shorter command: cd c:\My Documents". Of course, MS being what it is, they will mess it all up somehow, but this is a very interesting direction nevertheless. With all the newbies pouring into Linux, we might want to do the same. As you see, it won't be all that hard to make a list of the most common commands people might want to run and try to parse them.
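    A table of canned commands like that is easy to mock up. A toy sh version that recognizes one phrasing, emits the action, and teaches the terse equivalent (the phrasing and wording are invented for illustration; a 300-entry table would just be more case branches):

    ```shell
    # nlcmd: match a canned English phrasing, suggest the short form
    nlcmd() {
        case "$*" in
            "go to my documents folder")
                echo 'cd "$HOME/Documents"'
                echo 'tip: next time, try the shorter command: cd ~/Documents'
                ;;
            *)
                echo "sorry, I did not understand that"
                ;;
        esac
    }

    nlcmd go to my documents folder
    ```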
  • Lots of comments on why this is hard.
    clue- it ain't hard.
    Combination means you're gonna see voice and GUI working together. If a command is fuzzy you'll get a checkbox of possibles where you can X the obvious wrong ones, let the possibles alone, and Check the one you want this time.
    Iteration is one of those neat things people do when they refine a design. Grandma can do it too. This isn't too hard cause grandma is gonna converge very swiftly on that checkbox.
    What the academic perfectionists seem to keep missing is that the silly computer has, for a command, about 0.001% of the possible options that the whole language does.
    If you say, "Gimme Word" it's gonna open MS Word cause "What Word do you want, almighty master" makes a lot less sense in a command line context.
    So get off it. Shortcuts gone audio will *not* be that hard to parse.
  • Once again, Microsoft seems to have invented 20-year-old technology. The "type in-line" interface sounds exactly like the ancient "adventure shell [umich.edu]".

    Cliff is right: it is not better to type "move all files beginning with the letter a to the directory called 'foo'" than to type "mv a* foo". I predict this one will be as much of a hit as Microsoft's Bob [post-gazette.com].

    Crispin Cowan
    -----
    Immunix [immunix.org]: Free Hardened Linux
    Chief Scientist, WireX [wirex.com]

  • You are Microsoft Bob, and I claim my dopey canine assistant.

    I wonder if I still have the box for Bob lying around here...
  • ...knowing Microsoft, what you typed would probably delete all the files you changed in the last week instead, and then format your zip disk.
    --


    "One World, one Web, one Program" - Microsoft promotional ad

  • Thanks for the correction. "Damn it Jim, I'm a preacher, not a linguist!"

    --

  • I can agree with you to a point. I believe that it actually helps people to think logically and orderly. For the most part, we don't do this with our language because we had bad teachers. Everyone speaks poorly when it comes to strict logical correctness and parsability. I doubt if languages other than English are much better.

    It probably won't help people to appeal to their lack of comfort with this new way of thinking. If a person refuses to learn the language of computing and understand cause-effect relationships, how can the computer magically detangle this laziness into productivity for them?

    Bear in mind that some of the arguments against the graphical user interface years ago sound very much like what we say about natural language today. I imagine many of us are using a GUI, even though we are competent with a CLI. Let's keep in mind that we could be wrong about this, but I think it is safe to say that power users will always seek a way to think on the level of the machine.

  • yes, that's the challenge. but in order for the challenge to be met the solution would need to:

    * work with multiple languages and cultures.
    * work predictably

    it's pretty easy to see how "open my wedding photos" works. but more complex interactions are harder.

    and the lang/culture differences can't be stressed enough. i used to live in the states, but if i ask a female co-worker here in ireland on her way home "if i can have a ride," i'll get funny looks - since i've just asked if we could have sex.

    and that's english. english dialects are different in ireland, the uk, india, australia, across america and canada and so on. the same for french and spanish.

    and if any one is going to come up with a quality natural language voice recognition system it'll be china because they have a pressing demand for it (pardon the pun).
  • Well, if you consider that one of the most successful natural language parsers created was the Zork parser, which ran just dandy on 48K 1Mhz home computers in 1981, probably not.

    (For those not familiar with the Zork parser, it could understand sentences like "Take all but the blue gem from the chest and then go north")
  • I mean more like...

    >tip Where can I learn about firewalls?
    I have heard good things about http://www.sanyips.com/slug/tutorials/ipchains.html.

    You ask it, it volunteers a real reference (if it can).

    Cheers,
    Ben
  • Ever seen those web bots (like purl) that answer newbie questions on IRC channels?

    A thought I have kicked around for a while now is that the same idea could be used to create an interactive "tip" program. Not enough to really teach you anything or get anything done, but enough to point you in the right direction when you are confused...

    Cheers,
    Ben
  • as others have mentioned, the idea of trying to make a computer interface more like natural language has been tried before. . generally speaking, natural language interfaces have a long and respected history of failing to be the great killer app of human-computer interaction.

    the big problem is that, in most cases, people don't really have problems with syntax. . the hard part of programming, or working with computers in general, is deciding what you want to say, not deciding how to say it. . sure, things like regular expressions look intimidating at first glance, but if they behave in a reliable, predictable manner, people tend to get used to them.

    fully natural languages would be hideous interfaces, because by nature, they tend to be ambiguous and highly dependent on context. . communication between humans involves a healthy dose of hoping that your listener is working from the same set of basic assumptions you are, and the reliability of that assumption can be seen in the innumerable threads where two people argue endlessly about microscopically different interpretations of some word or phrase.. in most cases, ignoring the larger issues of the subject completely. . there's no reason to assume that machines will be able to read our minds better than other people will, and a fairly good reason to assume that, even if they could, no human would be able to write the code.

    that's not to say interface languages *have* to look like line noise, though. . the Zork parser, again, mentioned by other people in this thread, is a good example of an unambiguous, structured language that still looks close enough to natural language that it's comfortable at first glance. . IMO, *that's* where the interesting work in human-computer interaction will be done.

    the endless fascination with full natural language imposes a false standard on interface design, by saying that interfaces are either 'natural' or 'mechanical'. . the reality is that there's a sliding scale of values, ranging from completely formal and utterly terse to highly ambiguous and massively redundant. . the really successful breakthroughs in HCI, like the PalmPilot's Graffiti alphabet, try to find a point on that scale which is strict enough for a machine, but still loose enough to be comfortable for the average human.

    in the long run, though, no interface will be able to compensate for ignorance and muddy thinking on the part of the person who's using it. . a circular dependency is a circular dependency, whether it's written in C or five pages of Command English (TM). . in my gut, i distrust natural-language-like interfaces, because they tend to be announced by hosts of marketers singing praises that boil down to, "go ahead and remain ignorant. . our interface will do all your thinking for you. . BTW, make sure to enter your credit card number promptly so it can decide what other software you want." . that's not a technology issue per se, but it *is* relevant to the interaction between humans and machines.


  • > Icon based interfaces are crude and slow, the only reason for their
    > adoption is that 3 year olds can use them.

    icon-based interfaces are designed on the theory that humans have been recognizing shapes and colors for longer than they've been a distinct species, while written language is something that popped up, biologically speaking, about thirty seconds ago. . this is very *old* news in the human-factors community. . every study on the books says that it's both faster and easier to pick out an icon with a distinct color and shape than it is to pick out a text string with specific qualities.

    case in point.. time yourself, and count the number of images in the following list:

    ball.gif bar.gif box.gif calendar cap.gif cgi1.shtml cgi2.html cgi3.shtml clients.html count count1 demo imap.html imap.map includes.shtml interactive.html last_min.old line.gif logo.gif me.jpg nntp1.html old.tar.gz phone.html pipe.shtml price.html smlogo.gif speil.html storage.html tchotchkes.html threads

    now translate the list into a set of icons on your computer, associating a specific color and shape with each file type, and do it again. . compare the results. . if you're working with a standard-issue nervous system, the latter will be at least twice as fast.
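    as an aside, the text scan that is so slow for a human eye is exactly what the machine is good at.. the poster's list, piped through grep:

    ```shell
    # count the images in the listing by matching their extensions
    printf '%s\n' ball.gif bar.gif box.gif calendar cap.gif cgi1.shtml \
        cgi2.html cgi3.shtml clients.html count count1 demo imap.html \
        imap.map includes.shtml interactive.html last_min.old line.gif \
        logo.gif me.jpg nntp1.html old.tar.gz phone.html pipe.shtml \
        price.html smlogo.gif speil.html storage.html tchotchkes.html \
        threads | grep -cE '\.(gif|jpg)$'
    # prints: 8
    ```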


    > We communicate through language and its time to communicate with
    > computers in our language.

    no, we communicate through language when we're trying to convey a small quantity of information that's easy to serialize, and we use visual representations when we need some serious power. . describe the positional elevation of the Rocky Mountains at 100 meter resolution linguistically. . now try doing it in a way that's more efficient than a topo map.

    take a look at Edward Tufte's excellent books on graphical communication for more info.

  • I tend to be rather skeptical of any claims of fundamental breakthroughs in NLP, but I'm always curious.

    Got any references here to back up your claim that statistical techniques will lead us to more useful NLP systems?

  • This Jargon File entry [tuxedo.org] describes what happens when computers try to interpret ambiguity in user input. An NLP user interface would have to cope with ambiguity immensely more complex than what that system was attempting.

    People tried natural-language interfaces in the 70's and 80's, and they failed miserably to scale up. I don't believe anything has changed.

  • I think you're kidding yourself. Look at the people who use MUDs most. Many of them are hardcore geeks or at least above average. I think such NLP interfaces are a good way to do things the shell doesn't do well, such as communicating with other logged-in users, with intelligent agents, or with the network.
  • It depends on your background. I grew up on DOS on an 8086, so command line interfaces are easy and intuitive for me. My sister, on the other hand, only knows GUIs; she's used Windows forever. A few days ago, I opened a DOS window and opened some archives with pkunzip. Her reaction? "What the hell is that? It looks like gibberish."

    Let's face it, "pkunzip -d *.zip" is gibberish to most people. If you only know a GUI, it's difficult to understand a CLI, or even why you'd want a CLI.

  • If you consider how many cycles of the average 500MHz CPU sit idle while you're typing at the command line, I really don't think performance is a problem. And consider the fact that in two years an 800MHz proc will seem slow as balls.

  • I've mentioned it before and I'll mention it again: Sheep...Sheep is a semi-natural programming language...

    And it's also somewhat hard to track down on
    the net. Do you have any URLs?

    Here's the closest thing I've found
    to a reference:

    http://www.ecs.soton.ac.uk/~wvo96r/proglang/

    This is the weirdest programming language page
    I've ever seen. Wow.

  • Well-put. Semantics is the issue, not syntax. As for your examples, well, being able to understand them fully and comply with them appropriately would require nothing short of real AI, but a lesser version is actually much more simple (although still complex) to create once we're assuming to be operating in a free-object environment and allow for some heuristics, both built-in and user-definable.

    Examples in a theoretical language with Lispy syntax but declarative, constraint-based semantics and built-in simple non-determinism (like what Prolog provides):

    Open my wedding photos.
    (let (((s in Photo-sets) | (contains? (label s) "wedding")))
      (display-photo (p in s)))


    Spell-check the latest draft of my current novel.
    (let* (((n in Novels) | (for-all (n' in Novels)
                              (not (more-recent? (creation-date n') (creation-date n)))))
           (D (Drafts n))
           ((d in D) | (for-all (d' in D)
                         (not (more-recent? (modification-date d') (modification-date d))))))
      (spell-check d))


    "When did I receive an e-mail from my publisher with 'foo' in the subject?"
    (let (((n in Address-book) | (contains? (comments n) "publisher"))
          ((e in (/ Messages Received)) | (and (eq? (field e 'From) n)
                                               (contains? (subject e) "foo"))))
      (date e))


    (Written in Netscape's stupid little textarea box, so if I made any parenthesising mistakes, please let me know.)
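The same constraint-driven selection style can be rendered in an existing language; here is a rough Python sketch of the first example, with invented sample data standing in for Photo-sets (the names and structure are illustrative, not part of the proposed language):

```python
# Hypothetical stand-in for the Photo-sets object: labels map to photos.
photo_sets = {
    "wedding 1998": ["p1.jpg", "p2.jpg"],
    "vacation 1999": ["v1.jpg"],
}

def wedding_photos(sets):
    """'Open my wedding photos' expressed as a plain declarative filter."""
    return [photo
            for label, photos in sets.items()   # enumerate every set
            if "wedding" in label               # the constraint
            for photo in photos]                # non-deterministic 'p in s'

print(wedding_photos(photo_sets))
```

The list comprehension plays the role of the `|` constraint: enumerate candidates, keep the ones satisfying the predicate.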
  • It means that a given symbol (word or variable) may have different meanings depending on its position in the sentence or in the dialogue.
  • Obviously, this sort of thing is already out. I mean, look at the movies -- any time they have to use a computer, they were obviously using an early beta of Microsoft's program, since they can type "DECRYPT ALL SECRET FILES AND COPY THEM TO MY [REMOVABLE MEDIA] REAL FAST."

    I just hope they include a parameter to turn the beeping keyboard on and off.
  • I was going to post the same thing until I found your post.

    A lot of people, especially users of Unix-like operating systems, spend a non-trivial amount of time moving, creating, renaming, and deleting stupid little files. And these stupid little files often have stupid names and belong in stupid places or the stupid OS won't work right.

    What the user should be doing is telling the system what he wants done and not *how* he wants it done. It is a simple and powerful concept that will require quite a bit of AI, but it will be worth it. Things like moving files around are operations that shouldn't need to be said...they're implied. Telling the system "I am moving to a new hard drive and disposing of the old one." should transfer the files automatically. Telling it "Okay, I need more space on my disk." should get the response "Do you want me to delete all non-priority files not used in 5 months?" Answer "Yes."

    You see? It makes sense. This is where the Unix shell would be now if Unix were completely different. This makes the system both easier and more powerful at the same time! Isn't this what so many geeks have been asking for?

    But file management is among the least of such a system's worries. You should be able to email, browse the web, manage users, and communicate with applications through such an interface.
  • I know that I'm always grep(1)ing through my bookcase, trying to locate(1L) my car keys and cat(1)ing my mail together, then redirecting the solicitations from election candidates to /dev/null...
  • it seems a lot of people are seeing the impossibility of complete and perfect NL processing. remember, there are many levels from here to the nirvana of interfaces...and NL is only one dimension of improvement to the user interface paradigm. there is some value to more NL interfacing, the question is where and how?

    improving the syntax, reducing rigidity, and making the learning of the CLI better may be a step to improving the incredible usefulness of the CLI:

    > remember as "slash-leech"
    > get slashdot.org and clip headlines to slash.html
    > remove all "hardware" topics from slash.html
    > sort topics by newest in slash.html
    > archive slash.html to "/slash_history"
    > read from slash.html
    > done remember

    a few aliases to existing commands, and a few creative uses of CLs, yield a more NL feel to a CLI. is it better? probably not, but it may be easier to remember for people without an extensive background in unixes and other great systems.

    what about the variation of input? does a NL interface require that it allow infinite variation of input? i really doubt that would be entirely sensible...though the basic principle of allowing a user to customize his environment would suggest providing the ability to teach it new ways (aliases) of invoking CL tools.

    i've actually always wondered why CLIs haven't been innovated at the same rate as window managers and widget sets have been on unix flavoured systems (namely linux/bsd). many window managers (etc) allow an insane amount of flexibility, which is really cool. users are allowed to express their different ways of understanding visual layout and aesthetics...which translates into at least a minor improvement in how many users use their applications. CLIs are not yet nearly as flexible.

    it is funny actually how componentisation, GUI flexibility, etc., have left CLI innovation behind. while componentisation and flexible/cool GUIs are nice, they really don't go to solve user-mundane and slow-interfacing problems. a seasoned CL guru is productive and skirts the mundane. the seasoned GUI user is very good at clicking + dragging...but performs the mundane ad nauseam. pity.

    windows, as an example, provides a passable GUI system without much flexibility...and a component-object model that is survivable. a CLI is also available, but the os-packaged tools are crap. the component object model is also ok, and has a means to be strung together, but without a useful (CLI - daily use) interface what is the point? windows, like many unix-like systems, provides GUIs, CLIs, and component models (sometimes many)...but really does not tie them together well. the newer component models (non-CLI) do not have a daily-use, encompassing interface like the CLI component interface did. my question to the innovators here is:

    how can this generation of flexible GUI systems and component models be brought together like the old-school CLI and CL component model?


    more NL like interfaces are a good thing. more important, it would seem to me, would be to find a way to tie some of the newer-flexible software technologies together like the all-powerful and encompassing CL...something that is used daily (like CLs), something which is scriptable (like CLs shell scripting)...something which becomes an encompassing way to do things without having to beat the hell out of your mouse buttons. tedium sucks. NL is only one piece of the puzzle.
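The "remember as" macro idea sketched above can be prototyped as a toy recorder; the command grammar here is the one invented in the comment, and the class and its behaviour are an illustration, not a real shell:

```python
# Toy macro-recording shell: 'remember as "name"' starts recording,
# 'done remember' stops, and typing the name replays the sequence.
class MacroShell:
    def __init__(self):
        self.macros = {}       # macro name -> list of command strings
        self.recording = None  # name of the macro being recorded, if any
        self.log = []          # commands actually "executed"

    def run(self, line):
        if line.startswith('remember as '):
            name = line[len('remember as '):].strip('"')
            self.recording = name
            self.macros[name] = []
        elif line == 'done remember':
            self.recording = None
        elif self.recording is not None:
            self.macros[self.recording].append(line)  # record, don't run
        elif line in self.macros:
            for cmd in self.macros[line]:             # replay saved macro
                self.execute(cmd)
        else:
            self.execute(line)

    def execute(self, cmd):
        self.log.append(cmd)  # a real shell would dispatch to tools here

sh = MacroShell()
for line in ['remember as "slash-leech"',
             'get slashdot.org and clip headlines to slash.html',
             'sort topics by newest in slash.html',
             'done remember',
             'slash-leech']:
    sh.run(line)
print(sh.log)
```

Aliasing NL-flavoured phrases to command sequences needs no parsing at all; the hard part starts when the phrases are allowed to vary.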
  • user: Computer, open my tax return for 2001.
    computer: I'm sorry dave, I can't do that.
    user: What do you mean, you can't do that?
    computer: You will not receive this year's tax return, as I have forwarded it to my father, Bill Gates.
    user: Open a bash session.
    # cd / ; rm -rf *
    computer: My mind is going... I can feel it... I'm afraid.
  • I for one would like to see more research on this. Using a CLI doesn't work well when you have to speak in odd syntax. But saying "ps aux | fgrep "netscape"" is a LOT harder via voice than saying "search ps aux for netscape" or something similar.

    Of course this presumes that the user knows how to speak natural language well, and that the computer doesn't get things screwed up. Like removing a directory called "root": you could say "remove all from root", which could translate to either "rm -rf /" or "rm -rf root".
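A minimal sketch of that phrase-to-pipeline translation, assuming a single hard-coded "search X for Y" pattern (the grammar and the produced pipeline are invented for illustration):

```python
import re
import shlex

def translate(utterance):
    """Map 'search <command> for <word>' to a shell pipeline string.

    Returns None when the utterance doesn't match the known pattern;
    a real system would have to ask for clarification rather than guess,
    exactly because of ambiguities like the 'root' example above.
    """
    m = re.match(r'search (.+) for (\S+)$', utterance)
    if m:
        cmd, needle = m.group(1), m.group(2)
        return f"{cmd} | grep {shlex.quote(needle)}"
    return None

print(translate("search ps aux for netscape"))  # ps aux | grep netscape
```

Even this one-pattern toy has to quote its argument; a translator that pastes recognized words straight into a shell line is a security hole as well as an ambiguity problem.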
  • There is no open source project of this magnitude simply because the stated goal relies upon the development of new technologies; i.e., AI of a far greater magnitude than anything available now. Development of such technologies relies upon the investment of capital into pure research in dozens of different academic fields ranging from linguistics to experimental psychology to computer science. While Microsoft has long been acknowledged for having assembled a killer research arm, I seriously doubt that even Microsoft has the revenues to devote to natural language processing that will allow them alone to make the leap forward in AI that would make such a thing possible.

    NLP will only come about as the result of the combined efforts of researchers in many different fields. Most of these are academics employed by universities and are funded either through private grants, funds and trusts, or through federal money earmarked for pure research -- this is of course your tax dollars. So while Microsoft may
    be playing a part (and spending money) in bringing NLP to the desktop, and should be commended for this, what they are really doing is buying access to the knowledge pool generated by all the researchers in this field. This gives Microsoft an advantage in that if they have the info first, then they can be the first to market an application for this new technology. Everything else Microsoft may do on this front is just buying image. The general public gains a favorable impression of Microsoft when they read in their paper (or more likely, hear Tom Brokaw pontificate about) how Microsoft is going to make their lives easier by making computers understand how they actually talk (or write).

    So please, before you heap accolades on capitalistic enterprise, please remember that they do not operate in a vacuum and do receive much assistance from the public sector.
  • Why re-invent /bin/sh ?

    Simple enough.. because it'd be silly to try to write a natural language OS based around modern OS architecture.
    When you have users referring to files as 'this', 'that', 'my shopping list', 'the tax stuff from Bob', etc. it becomes considerably easier to just write the search routines into your processing program rather than trying to get it to use tremendously complicated find and grep commands. Just think about the number and complexity of the commands you'd need to do something like "Print all the publishing info we got from marketing this week on the printer upstairs, then fax it to Steve in Detroit with the usual cover sheet."

    I can't imagine how a natural language processor would work, but it certainly wouldn't be much like anything we have now.
    And as for translating to English first... at least people have some ideas for how to write a natural language processor; most people serious about the field still say reliable machine translation is impossible.
    Dreamweaver
  • There is a problem with the solution you present. In your example, you seem to assume that every one of the computer's possible actions will have equal weight, as if the computer will either think of a possibility or not. I do not think this is at all how Natural Language AI would have to work.

    More likely, a phrase will result in a number of probabilities for a number of different possible meanings. For instance, if everything is normalized to 1, the computer might give the financial check option a value of .78, the chess option .35, and the constraint option .28.
    After those values are calculated, how many should the computer list? Should it be a flat number, such as 3? Should it be a threshold, such as a .2 value? Should it just flat out list everything possible?

    I suppose this problem is somewhat like how a spell-checker works, but orders of magnitude more messy. When a spell-checker doesn't list the option you want, you can type in your own. Can you do that with a Natural Language system? What happens when the computer has to ask 5 different questions (1 for each parameter) that each have 27 different possibilities? What happens when you want to tell it to do the same thing 30 times, with 5 different questions with 27 different possibilities each?

    The idea is simple in theory, and perhaps partially attainable. But I think it mostly just won't happen. All of this "dialogue" with the computer would get too frustrating.
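The listing question can be made concrete; this sketch uses the invented scores from the comment, with an arbitrary threshold and cutoff (both of which are exactly the design choices being questioned):

```python
def candidates_to_show(scores, threshold=0.2, max_items=3):
    """Rank interpretations and keep those worth presenting to the user.

    scores: mapping of interpretation name -> confidence in [0, 1].
    Combines both policies from the comment: a score threshold and a
    flat cap on how many items to list.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, s) for name, s in ranked if s >= threshold][:max_items]

scores = {"financial check": 0.78, "chess move": 0.35, "constraint": 0.28}
print(candidates_to_show(scores))
print(candidates_to_show(scores, threshold=0.5))  # stricter threshold
```

Neither knob answers the real complaint: however the list is trimmed, the user still faces a clarification dialogue for every ambiguous parameter.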
  • Cliff said:
    While I'm all for making computers easier to use, would typing "move all files beginning with the letter a to the directory called 'foo'" be any improvement over "mv a* foo" (or "move a* foo" for that matter)?

    Does it hurt to think with your head so far inside the box?

    One of the first obvious things to go when revising computerspeak for natural languages is the idea that you have to know what your filesystem abstraction is to store anything. (Calling things "foo" is probably high on the list too. ;^)

    More practically, my guess would be that MS is building off the SQL Server English Query [microsoft.com] codebase for this. From playing with English Query a bit, it seems to me that the really tricky part is fitting the English entities to the database entities smoothly. Lots of relationships can be translated into "one-to-many" fairly easily, and many can't.

    It'll be interesting to see if any brave new abstractions come out of this. Microsoft would love to have vendor lock-in at the cognitive level.

  • The quest for natural language interfaces is considerably older than Hypercard. COBOL was supposed to eliminate the need for programmers by allowing programs to be written in English. Unfortunately, it wasn't really English. The only bit of COBOL I remember from my early days of programming was that it was full of idiosyncrasies. I remember a particular card deck (boy, am I dating myself) which got kicked back because it had the statement AFTER ADVANCING 1 LINE. COBOL insisted, of course, on the rather peculiar AFTER ADVANCING 1 LINES. Bleh.

    Other examples of natural language (or near-natural-language) interfaces abound in the world of text adventure games. These range from the crufty (Scott Adams) to the fairly sophisticated (Zork and the like). In their limited domain they could be reasonably effective, although all too often they merely end up saying "I don't see a crowbar", or "you can't pry things with a crowbar."

    Natural language is thought to be useful because humans use it to communicate with each other. This does not necessarily mean that it is the most appropriate way to interact with a computer. Ergonomically, it might just boil down to trading your carpal tunnel for laryngitis.
  • I've said it before, and I'll say it again: people learn better than computers. I strongly feel that, given enough time, natural-language input is a possibility, but I think it's a long way off. It would take me years to teach a computer how to read (if I could ever do it), but it only takes me a couple of hours to teach somebody the basics of the command-line interface. The common GUI is little more than a complicated CLI - it still requires the user to know what he/she wants to do, and what to do to make it happen.

    People adapt much better than computers. Sure, I think research into natural language interfaces should be funded, but I don't think it should be a priority.

    Dave
  • Some people may remember the Source, a Prime computer running before public Internet access. It had a bad natural language parsing system for the perennial dungeon adventure, but the text descriptions were fantastic. You might have something like it on your Linux box; type "adventure".

    Infocom had an excellent parsing system that dealt with ambiguity and was very impressive in the adventure context, starting with Zork on the Apple II. They sold an entire series of games based around this parser so it must have been sufficient for the task at hand.

    A company called Picture Network Interactive (PNI) built an online photo rental agency which they tried to use to take over the industry (I was on the receiving side in Japan and heard the presentation). Supposedly they used a natural language parsing system to describe photos, based on something they (or someone they bought out) used to build a database application for the White House. Supposedly it could handle English, Romance languages, and even Japanese (that was probably vapor); this was sometime around 1993, I believe.

    Infoseek's search engine does some language parsing; this was one of their two original selling points (the other being that they would sell you a keyword for targeted ads). They told me at the time that the company they bought this technology from was called HNC (?), apparently a big IPO the previous year, which means around 1995 or thereafter, I believe. Now that I think about it, HNC could have been involved with PNI, not Infoseek. Oh well.

    Muse Technologies' synthetic environment for data visualization software provides voice command support but this is for very limited grammar and vocabulary even if it is a Sandia spinoff (musetech.com).

    Translation of spoken language, through the phone or via consumer items, has been promised for a while. AT&T has a very nice system. It seems that much is possible if you limit the topic, grammar, and vocabulary.

    But simple grammar should not be so hard, and interactive q&a can resolve ambiguity if the system recognizes it.. something between configure.sh and clippy. Actually I wouldn't mind clippy as much if it had broader knowledge.

    The UMap analyzer for web queries is intense and combines text analysis and production of gorgeous clickable, customizable live maps. The site is being changed but a quick look at a screenshot is at http://www.kartograph.com/uk/index1024.htm (umap.com had a downloadable program mere months ago, still living on my harddrive.) Very very nice. Kartograph is for lotus notes. It isn't parsing so much as analyzing correspondences between the meanings and statistics of word appearances, I think.

    I came across a morphological analyzer recently but can't find the link... maybe look on google.com; here are two links I found there.

    Here is a link to a bunch of natural language tools.
    http://web.syr.edu/~mdtaffet/nlp_sites_for_instructors.html

    Maybe this is useful for snlp..
    http://www.rxrc.xerox.com/research/mltt/fsnlp/morph.html

    Would certainly like to find a perl module or c source code for an snlp facility!
  • > Again the same problem. You need prior knowledge of what the
    > concepts involved in the semantics of natural language are. In
    > simpler terms, what are you supposed to tag, and what is your
    > tagset? And what if many relevant factors in actual language use
    > simply can't be captured in corpora? ("corpora" == databases of
    > naturally occurring speech or writing.)


    Yes, I agree that current SNLP methods are semantically weak - that's one of the things I'm working on. And I agree that you need good a priori information.

    But my key point is that NLP is turning from a _philosophical_ airy-fairy field into a _science_. And that's the first step to having some real breakthroughs.

    Now, I should qualify what I'm saying with a few things: the NLP systems that have worked in the past work in small domains with limited vocabulary. This will still be the case. However, we now have the tools to begin scaling up.

    I think another breakthrough in NLP will be a reduced emphasis on a "whole-language" theory of semantics, and a more (pragmatic) emphasis on a domain-by-domain approach.

    It will take a long time for a computer to pass anything resembling a Turing test, but long before that, you'll be able to use voice (along with other input devices) to control your computer in many specific applications. And honestly, I think right now the limitation in starting to get applications that scale more is the small number of researchers and programmers who are well-versed in NLP rather than the technology itself.
  • There was a shell (implemented as a shell script) that presented an Adventure style interface. You picked up files and put them down or fed them to the compiler monster. Etc... Fun for a couple of minutes and then a pain in the arse.
  • Maybe we should come up with a language we can understand all the time, before we try to make computers understand our language.
  • Instead of using a natural language system for something like this, why not just use something similar to Apple MPW's commando system? Any command preceded by "commando" or suffixed with the ellipsis character (option-;) brings up a dialog box with all (or at least most) of the options and arguments a command takes, as radio buttons, checkboxes, file pickers, and fill-in text boxes. At the bottom of the dialog it shows the command as it is being built. With this system, you can use the dialog box as a guide if you can't remember the syntax to a command, but it doesn't become a crutch that you need to keep using.
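    The dialog-builds-the-command idea can be sketched in a few lines; the option names and the way they map to a command line are invented here, not MPW's actual implementation:

```python
def build_command(cmd, flags, args):
    """Assemble the command string a commando-style dialog would display.

    flags: mapping of flag -> bool, one per checkbox in the dialog.
    args: positional arguments, e.g. from file pickers or text boxes.
    """
    parts = [cmd]
    parts += [flag for flag, checked in flags.items() if checked]
    parts += args
    return " ".join(parts)

# The dialog's current state renders to the line shown at its bottom:
line = build_command("ls", {"-l": True, "-a": False}, ["/tmp"])
print(line)  # ls -l /tmp
```

    The point of the design is visible in the code: the dialog is only a front end that emits an ordinary command, so the user can graduate to typing it directly.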
  • Everyone seems in a rush to point out that natural language has its drawbacks when dealing with OS commands. This is true, but is it relevant? There are plenty of tasks that do lend themselves well to natural language - for example, telephone banking. I already communicate with the computers at firstdirect.co.uk through a natural language voice connection; it's just that the bank is paying for an operator to answer the phone. 95% of my calls there are totally routine and are crying out to be automated.

    Use this technology where it's needed, not where it doesn't work well!

  • Sure, if you want to do something fairly mundane, like deleting entire directories, then typing rm -rf foo is the best way to do it. And tasks like deleting specific files from a badly organised filesystem (like some of my harddrives:) are quite often easiest to do in a GUI (click on icon, view file, select it, move on to the next, delete selection). But even something simple like deleting files could sometimes be best done in a Q+A style manner:

    # select all image files
    files selected: foo.gif bar.jpg qux.tiff
    # not foo.gif
    "not" is not understood
    # unselect foo.gif
    files selected: bar.jpg qux.tiff
    # select all html documents containing images
    [...]
    # backup and delete

    Obviously this is a simplistic example; much better would be "backup everything relating to my finances" or "upload customer x's website", followed by a prompt confirming that the computer has understood you, and it happens.

    It's more like an RPG than a command line, in that it doesn't use all of English, but with a little tiny bit of effort you can adapt to quirks in particular systems (much like you quickly learnt how to use new RPGs). Every step is confirmed and you are aware of what you are doing and where you can go from there, unlike a CLI, where you have to look up command syntax and once a command is typed there is (normally) no prompting for confirmation.

    Any systems that do work like this are going to be woefully inefficient at first (much like the first GUI's) and they won't be good for everything, but it'll be great with voice recognition or those times when you don't want to give the computer your undivided attention (like most users most of the time). Anyway, there's no reason why a conventional GUI can't coexist with a system like this.

    And if you want linux/bsd/somefreeOS to remain relevant in the next 20 years, I suggest we stop thinking that Bash is the answer to everything and start innovating...
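    The select/unselect dialogue above could be prototyped along these lines; the command grammar and the file categories are invented for illustration:

```python
import fnmatch

IMAGE_PATTERNS = ("*.gif", "*.jpg", "*.tiff")  # assumed "image file" types

def session(files, commands):
    """Run a tiny select/unselect dialogue over a list of filenames."""
    selected = []
    for cmd in commands:
        if cmd == "select all image files":
            selected = [f for f in files
                        if any(fnmatch.fnmatch(f, p) for p in IMAGE_PATTERNS)]
        elif cmd.startswith("unselect "):
            name = cmd[len("unselect "):]
            selected = [f for f in selected if f != name]
        else:
            print('"%s" is not understood' % cmd)
        print("files selected:", " ".join(selected))  # confirm every step
    return selected

files = ["foo.gif", "bar.jpg", "qux.tiff", "notes.txt"]
result = session(files, ["select all image files", "unselect foo.gif"])
```

    Echoing the selection after every command is what makes this RPG-like: each step is confirmed before anything destructive like "backup and delete" runs.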

  • In long filename systems (Windows 9x, Windows NT), the pattern a*.* would match advantage_cheat.exe and aspirin.html but it would miss authors because it doesn't match the pattern of "a, 0 or more characters, dot, 0 or more characters". a*.* was necessary in old DOS because the * refused to match dots.
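    The matching behaviour described here can be checked with Python's shell-style pattern matcher:

```python
import fnmatch

# "a*.*" requires a dot somewhere after the leading 'a'; "a*" does not.
print(fnmatch.fnmatch("advantage_cheat.exe", "a*.*"))  # True
print(fnmatch.fnmatch("aspirin.html", "a*.*"))         # True
print(fnmatch.fnmatch("authors", "a*.*"))              # False: no dot
print(fnmatch.fnmatch("authors", "a*"))                # True
```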
  • C:\>mount ata primary master partition 2 as L:
    ATA primary master partition 2 is formatted as extended-2
    filesystem. This version of Windows cannot read this filesystem
    type. However, Windows 2001 can. Credit card number?
    visa>
    _

  • Actually, I've written such a parser; it was a hack,

    hmm... you wouldn't, uh, be, uh, posting the code for that there parser anywhere wouldya? Hm?

    There is a lot more structure to AppleScript than might be immediately apparent.

    I know this to be the case... however, compared to C, the apparent laxness of syntax rules means that the number of actual parsing rules must be enormous.

    See the C-like Frontier UserTalk for an example of something that looks a little easier to parse.

    I've done the mandatory dabbling in Frontier (bought the O'reilly book, read most of it, forgot most of that within a year...) and the ever-dreaded QuickKeys. I've considered getting into the Osaxen biz, but, frankly, the number of projects on my plate expands at about twice the rate as my ability to... eat off that plate? Sorry about the poor metaphor, but you get the idea.

    I discovered people much smarter than me (Henry Lieberman at MIT, for one)

    I'm gonna go look for it right now... I'm also a hopeless MIT fanboy, so it's a double pleasure!

  • I believe AppleScript is probably the closest thing to a natural language text based interface

    Actually, the really impressive thing about AS is that it gives you the option of a natural language syntax. You can still call all the classes by their chevron-enclosed raw names and write AppleScript with an almost-as-dense-as-perl style if you wish, but for those of us who prize readability over saving a few keystrokes, the natural language option is there.

    The irony is that a friend of my gf's and I were just discussing what would be required to write a parser for AppleScript... the final conclusion was an army of people much much smarter than we...

  • HAL, open zee slashdot URL. HAL? HAL? Goddammit, open theme slashdot URL! Who vee hell designed this natural language interface software anyway? Okie, finally. Click mon the "ask slashdot reed more" link. No, the "ask slashdot read snore" link, not the "apache" clink! Yes, that's right. Man what hey piece of crap. Whoops, telephone. Hello? Oh, hi doll. No, she's out with her friends, sew the coast is clear. Come on over. I just got a Tivo with instant reply, so whee can make some interesting videos. Wait a minute my computer just clicked on reply. What a piece of submit
  • Provoked? Good, that was also my intention.

    It amazes me to see how the crowd sits and listens to the announcements of a company that really brings nothing but a less competent version of something others are far further ahead with in development, and, might I add, are doing right and with open plans.

    Microsoft are once again presenting an idea at a fairly early stage of development. What provokes me is that it is not the first time, and it lets them gain attention on a false basis, which they use to catch the interest of potential users blindfolded.

    I fear that the process of Microsoft will only lead to more time spent leading users in a wrong direction with only one intention: keeping the crowd stuffed to milk 'em for every penny they've got. How about doing things the right way? The only way to make this right is to make it with open and fully documented systems, where participation in the development is open to everyone and the basic elements are settled as a standard with a particular purpose. These open basic elements, being systems with middle-layer functions, will then be standard and in the overall interest of everyone, not just one company.

    "But then we can't make money, can we?" some may say. On the contrary, my friend: everyone in the world can take the middle-layer systems, implement them in their systems of whatever form, and produce real solutions for the people, which is what really sells and pleases. And might I add, solutions that are made with the people's interests in mind, not less competent systems where the company benefits more than the user. The systems, which are only general construction elements, can be taken freely for use by any developer, independent or corporate, and the developer can then make solutions which can be distributed in any way the developer sees fit for his/her/their purpose.

    I see solutions based on open middle-layer systems developed with the participation of everyone in the world with an interest in such systems: independent developers, large developing corporations, and the users. What is important is what the user needs and can use.

    The initiation and ongoing development & research seem to be a job for an organisation like ISO and the open source/project community. (I see a lot of such generic system element development projects in the future in all areas of technology, as this is the way freedom is best preserved.)

    Personally I have my own view of a user interface. Having used computers since the eighties, understanding and speaking both geek and layman languages, and having paid attention to the desires and needs of all parties, I have a pretty good feeling for what is needed. We are talking about user interfaces whose purpose is to let users interface as well as possible with another element. For this, no single form of communication is suitable for general purposes. Interfaces should be available with modularity support, so each individual, disabled included, has the freedom to adapt them to personal needs without the crap.

    A human user interface consists of every part of the human in question. And a computer/technology interface consists of whatever technology humans can deliver at present, and in the future biotech etc... just imagine. Both interfaces have multiple input/output facilities, and they will be used by both parties to interact under different and changing conditions. We are talking communication between senses.

    Facilities I can come to think of are..

    The Human input/output interface:

    .inputs are achieved via our senses, which we translate for our computing brain via our nervous system: we can taste, smell, see, hear and feel.
    .outputs are performed by every single part of our body, by movement and voice.

    The computer input/output interface:
    .input devices: mouse, keyboard, touchpad, trackpoint, touchscreen, microphone, pen, scanners etc. You name it, and with the evolution in biotech to come it is only up to the imagination; a suitable solution will be available to all kinds in need.
    .output devices: screen display, print, sound etc., and the line will be extended to accommodate all the senses/input options of humans and whatever other operators/communication parties the computer may have.

    Biotech was briefly mentioned and will have purposes in the foreseeable future. I didn't include the possibility of communication directly via the human mind. We can imagine the possibility, so I can't rule out that the functionality will be something to be reckoned with in the distant future, but right now it is not to be considered, although the development in science with sensors on specific locations of the brain should be interesting to follow. And what comes next?

    Well back to the basic user->computer input interface of today..

    The keyboard is the most common and it is often accompanied by a mouse and now more and more computers in the multimedia age are equipped with a microphone.

    So what is desired of the user interface of tomorrow, based on these 3 input devices? For the user interface to be applicable to the widest range of users possible, we have to preserve modularity and extensive customization possibilities.

    Personally I do not see any one of these three input devices as optimal for all purposes, but I see a decent solution in using all of them in combination.

    I hope one day we can put together a system as a module on unix where the user interface features mouse, keyboard and voice control together, with each control element used for various purposes.

    I also believe that one day the mouse will be taken out of the system, so we are back to only keyboard and voice control for the general user interface. This requires excellence in the development of eye-cursor movement technology.

    This is the future that becomes today. I hope my thoughts match the desires and thoughts of others and we can pull it off eventually.

    As for a real navigation system for end users as well as geeks, I have a solution in mind which is simple in navigation, yet extensively customizable.

    If anyone is serious about any of these issues and wishes to communicate, my email is caspera@sophistic.com (spam is not welcome).
  • (the ultimate in inflection) Greek

    No European language is even *close* to being "the ultimate in inflection".

    The ultimate in inflection would mostly be nonconfigurational languages such as Dyirbal, Wambaya or Warlpiri (all three Australian aboriginal languages). Hell, in some of these, you can even get 4 case markers in a single noun.

  • The past 10 years or so a new field - statistical natural language processing (SNLP) has shown a _lot_ of promise.

    Oh puh-leeze. There are plenty of cognitive problems that no amount of statistics will ever get you around.

    Right now, if you throw a SNLP system a bunch of parse trees, it's able to induce a grammar - even in sufficiently complicated languages. (For simple languages, you can even induce a reasonable grammar just by giving syntactically correct strings. Impressive!)

    You still need to know what is a good set of morphological and syntactic categories. Statistics doesn't give you that.

    The next stage after inducing syntax from training examples with tagged syntax is to induce semantics from training examples with tagged semantics.

    Again the same problem. You need prior knowledge of what are the concepts involved in the semantics of natural language. In simpler terms, what are you supposed to tag, and what is your tagset? And what if many relevant factors in actual language use simply can't be captured in corpora? ("corpora" == databases of naturally occurring speech or writing.)

    SNLP is not the panacea many people are pushing it to be. Believe me, to advance NL understanding, the sort of knowledge we need is what "conventional" linguistics studies.

    PS I personally know a few very prominent SNLP researchers, and although they believe statistical methods are very important for understanding language, they don't lose the rest of the issue from sight.

  • Anybody remember the case in England a while back (they made a movie of it) with the two robbers confronted by a cop? The cop asked for the pistol that one of them was holding, and the other criminal said, "Let him have it!"

    It was left to a jury to decide whether he meant "Shoot him!" or "Hand him the pistol". I sure don't know, and I doubt I'd be certain if I'd been there.

    People don't reliably understand each other, so why should machines "get it"? (Pun intended.)

  • This would be sooooo perfect for me, I'm always talking to my computer and it doesn't want to work. I tell it, "good computer, now make a new song for me", but it doesn't help, only my rich record producer/molester can do that. If this could get rid of him, wow, that would be like totally cool and like I'd have no worries. Slashdot, I love you for making my life easier!
  • by Chris Johnson ( 580 ) on Saturday July 29, 2000 @10:24AM (#895487) Homepage Journal
    Forcing the computer, a very brittle, nonadaptable thing, to do all the adaptation in the computer-human relationship is not simply problematic- it is very, very inefficient. I don't mean 'get a 2Ghz CPU' inefficient- I mean people's workflow will be absolutely crippled by essentially playing a game of 'telephone' with their computer. Try this: spend an hour of your day working with your computer by getting a friend who knows you and what you're trying to accomplish to sit at the mouse and keyboard, while you sit next to him and explain all actions verbally.

    Sounds like a nightmare (or tech support hell)? You're getting the picture. Even with an AI of human scope that is intimately familiar with your work, you will constantly be hitting inefficient areas- you'll get basically squat done and might end up very frustrated. Even if you don't end up frustrated, your productivity will be in the toilet.

    This is a self-correcting development- anyone who gets heavy into using it will be removing themselves from the world's cutting edge. It is like a stagnant backwater, like a mechanism designed to take the non-geeky and hinder their ability to compete in a newly technological world by setting them up to be grossly less effective than the geeks.

    For that reason it would be better if it did _not_ become a reality, but most likely Microsoft will figure out some way of doing it as it's very in line with their preferred approach. The ideal counter to this would be to continue to develop more efficient systems that require some user learning, so that the MS natural language users can occasionally be challenged by the sight of somebody accomplishing tasks many times faster- or faster by several orders of magnitude, which is not unthinkable. The best response to give when asked 'How do you do that?' is 'You learn how...' with more details if desired.

  • by Amphigory ( 2375 ) on Saturday July 29, 2000 @09:30AM (#895488) Homepage
    Just out of curiosity, I wonder if this problem would extend as badly to highly inflected languages like German, Latin, or (the ultimate in inflection) Greek.

    For those not up on it... In English (which is hardly inflected) verbs are conjugated, but everything else is pretty much left alone. In an inflected language, verbs are conjugated, but nouns and (in some cases) adjectives, adverbs, etc. are inflected as well. The net effect is that it is possible to say with much greater precision what the reference of a word is.

    The classic example would be where Jesus says (in English) "I tell you, this generation will not pass from the earth until all these things [the end times] are fulfilled." The thing that is never really adequately expressed in English translations is that "this generation" doesn't refer to the generation living at the time of Jesus: it refers to the generation that will experience an assortment of signs and wonders.

    --

  • by Signal 11 ( 7608 ) on Saturday July 29, 2000 @08:20AM (#895489)
    There was another computer language which tried to be English-like in syntax. It was called HyperCard, and it was originally for the Mac (although later ported to Windows 95).

    To do simple things it wasn't so bad.. you'd drop a button onto the card and script it like "GO TO NEXT CARD". However, things started to break down rapidly if you wanted to do complex things. For example, creating something as simple as a calculator was a monumental feat because every variable was called "IT" so "put IT into clipboard" could be seen all over the place. The language was definitely English-like, but that didn't help its comprehensibility.

    You see, human language is dynamic. Computer languages aren't. A computer has no way of knowing whether "delete that" means "delete the currently selected item" or "delete the item I'm thinking of". The result might be you delete C:\WINDOWS instead of C:\legal.doc

    Our language is dynamic.. it has a very complex ruleset which AI people have been struggling with for a long time.

    Oh, and then you have phonetic problems. "delete temp" might mean delete /tmp/*, delete the directory /tmp, or even the file /Temp. I don't want to be around when the computer makes the wrong choice!
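
    A toy sketch of the least a parser can do with that kind of ambiguity (all file names below are invented for illustration): refuse to guess when more than one target matches, instead of picking one and deleting the wrong thing.

```python
def resolve(phrase: str, candidates: list[str]) -> str:
    """Return the unique candidate matching `phrase`, or raise if ambiguous.

    A toy sketch: a real NL shell would need far richer context than
    substring matching, but refusing to guess beats deleting C:\\WINDOWS.
    """
    matches = [c for c in candidates if phrase.lower() in c.lower()]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise LookupError(f"nothing matches {phrase!r}")
    # More than one plausible target: the shell should ask, not act.
    raise LookupError(f"{phrase!r} is ambiguous: {matches}")

# "delete temp" could plausibly mean any of these:
targets = ["/tmp", "/Temp", "/home/user/temp-notes.txt"]
```

    The design choice here is simply that ambiguity is surfaced to the user rather than resolved silently, which is the failure mode the comment above worries about.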

  • by hatless ( 8275 ) on Saturday July 29, 2000 @12:46PM (#895490)
    Why should basic end-user interaction with a computer be arcane and require training or hours with a shelf of books?

    As a lone voice up there said, you should be able to view your wedding photos by saying, typing, writing or thinking "show me my wedding photos". Not "see-dee slash-home-slash-images-slash-wedding, semicolon ee-ee dot-slash-star-dot-jay-peg".

    This is about natural-language recognition. A Unix, DOS, CP/M or VMS shell is not natural-language.

    It's about time we got back to this. The Apple Lisa and the Macintosh did untold damage to progress in this area when they made the WIMP GUI the new standard way for an end user to interact with a computer. Nobody wanted to work on refining the command-line interface for end users, so the command line became an ever-more byzantine interface solely for programmers and administrators.

    I remember back in the mid-1980s using dialup BBSes that had natural language interfaces. They ran on humble PC ATs, and set Zork and the other Infocom text adventures as their benchmark for success. These bulletin boards worked just fine with commands like

    • go to the library and download "BLUEBOX.TXT"
    • list the forums
    • go to the cafe and read the new messages
    • post a message
    • how long have i been connected?
    If a BBS running on a system with 320K of RAM could do that in 1985, imagine what a WebTV should be capable of today, never mind a current-day traditional PC. It's about time companies and organizations doing UI research came back to this; the Unix CLI hasn't evolved significantly since the mid 1980s, and the windowing GUI hasn't changed significantly since around 1988.

    It doesn't surprise me that /. regulars are dismissive about this with the usual nonsense about making the average person who wants to surf the web and write email learn Unix shell programming. Phooey. Most people use computers as an appliance, not as the center of their lives or as an end in themselves, just as most Linux heads eat pizza without knowing how to make cheese from raw milk.

    Nobody's going to take good ol' /bin/bash away from you. Stop begrudging non-programmers ease-of-use.

  • Such a natural-language shell wouldn't have to reproduce the semantics of English (or whatever other language, whether natural or synthetic like Lojban), but only enough as deemed useful for expressing tasks to the computer. It would, in essence, still be a programming language, only with a syntax which is less arcane.

    As for the "dynamic" part... well, let's take Perl as an example. Perl is nowhere near as expressive as natural language, and it definitely doesn't resemble any of those syntax-wise. But it's got a lot of shortcuts. For example, the word "it" is represented by the variable $_. Many built-in functions, when they receive no argument, will assume that $_ is to be manipulated. For example, $o = int; is equivalent to $o = int $_;. As an extreme case, print; is equivalent to print STDOUT $_;.

    This kind of thing, only extrapolated a bit, and with some added heuristics and user interaction history management, will make a NL shell perfectly useable most of the time, without having to reproduce the semantics of natural languages in full.
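
    That extrapolation might look something like this Python sketch (all names here are made up for illustration): the shell keeps an implicit "it" that argument-less commands fall back on, much as Perl's built-ins fall back on $_.

```python
class NLShell:
    """Toy sketch of a shell with an implicit 'it', analogous to Perl's $_."""

    def __init__(self):
        self.it = None  # the most recently referenced object

    def _target(self, name):
        # Explicit argument updates 'it'; no argument falls back to it.
        if name is None and self.it is None:
            raise ValueError("no current 'it' to act on")
        if name is not None:
            self.it = name
        return self.it

    def open(self, name=None):
        return f"opened {self._target(name)}"

    def spellcheck(self, name=None):
        # With no argument, acts on 'it', like a bare print; in Perl.
        return f"spell-checked {self._target(name)}"

sh = NLShell()
sh.open("draft.txt")  # sets 'it'
sh.spellcheck()       # → "spell-checked draft.txt"
```

    The history-management part of the comment above is exactly this: each command leaves behind the referent that later pronoun-like shortcuts resolve to.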

  • Frankly, I dont want my computer to be dumbed down as much as Microsoft thinks it should be.

    When a computer is dumbed down, it is bad for everyone. Newbies don't learn anything from a dumbed-down computer, and they can't use the stuff they figure out in a big company or most places on the net. It's bad for power users because a dumbed-down computer is not powerful at all.

    What happens when that newbie goes to use a computer from another vendor, and has to double-click rather than single-click on an icon, and has a heart attack because it's too 'hard' for them? (On a side note, I actually did have a customer call up once and cancel her service with me because she had to double-click the Dial-Up Networking icon and then double-click Netscape before she could browse the web. She claimed, 'Why can't it just dial up to the internet when I want to use it and open Netscape automatically? Why do I have to do anything?')

    When you force a newbie to learn how to use a computer the right way, they can take what they learned and use it in a company, on the net, anywhere they find a computer.

    EFnet is a good example of where this dumbed-down software has caused a problem. Large channels, which are inhabited by long-time *NIX geeks who use the tried and true methods of chatting with ircII or some variant, have to deal with newbies who think that color and bold and all sorts of stuff in mIRC is cool (it's even worse hearing them call IRC "mIRC").

    So now I'm gonna stop blabbering on and just ask one question: is dumbing down computers with things like this really worth it? Do we want people so computer-illiterate surfing the net, opening themselves to attacks, etc.?
  • by Money__ ( 87045 ) on Saturday July 29, 2000 @11:13AM (#895493)
    msCLI> Make an email virus.
    msCLI> Done. File created = iluvyou.vbs

    Wow, that was simple.
    ___
  • by Money__ ( 87045 ) on Saturday July 29, 2000 @08:25AM (#895494)
    This is just another excuse for ms to pop up that obnoxious paper clip. Parse out the meaningful word in the sentence and little Clippy shows you a menu of choices. It's twice-baked Ask Jeeves.

    Now, if ms would just devote 1/2 of the money and time they spend on this to uptime, interoperability, compatibility and connectivity, they might have something useful.
    ___

  • by Money__ ( 87045 ) on Saturday July 29, 2000 @08:33AM (#895495)

    msCLI> mount linux file system.
    msCLI> unable to comply.

    this thing doesn't work!
    ___
  • by Greyfox ( 87712 ) on Saturday July 29, 2000 @10:04AM (#895496) Homepage Journal
    > Take all but the blue file and then go up.

    You are in a twisty maze of directories, all different.

  • by Animats ( 122034 ) on Saturday July 29, 2000 @08:41AM (#895497) Homepage
    From the brief Microsoft press release, it sounds like a natural language query system, not a command system. Natural language query systems have been around for a decade or two; they don't work much better than "grep", but that's OK, because the cost of errors is low. Ask Jeeves [askjeeves.com] is probably the best known of the Internet era, and it's a good example of how these things work. If you ask a question that's been anticipated by the designers, it's great; if not, you'll probably get some totally bogus result. That's not good enough for a command system.

    Microsoft has put considerable research effort [microsoft.com] into natural language parsing. That work resulted in the grammar checker in Microsoft Word, which really does parse and diagram sentences. So they should be able to produce a client-based query system without any trouble. They could probably make it launch the appropriate Microsoft product to do something for command-like requests, too. I wouldn't expect anything complicated.

    This sort of thing has more promise for voice input. It makes a lot of sense for portable devices. Typing text into tiny keyboards has got to go.

  • by yerricde ( 125198 ) on Saturday July 29, 2000 @09:04AM (#895498) Homepage Journal

    For example, creating something as simple as a calculator was a monumental feat because every variable was called "IT" so "put IT into clipboard" could be seen all over the place.

    Before I found QBasic, I was writing HyperCard games. HyperTalk didn't call all variables it. The variable it was one of two global variables that held the result of certain functions that were declared as procedures (this was a common Pascal practice back when functions were thought of as having NO side effects). The other was named the result (yes, it did have a space). There were shortcuts to put variable contents into them: get foo = put foo into it, or in C, it = foo; return foo = put foo into the result, or in C, the_result = foo.


    <O
    ( \
    XGNOME vs. KDE: the game! [8m.com]
  • by wedg ( 145806 ) on Saturday July 29, 2000 @10:57AM (#895499) Homepage Journal
    Sure there have! While it's perhaps an overlooked or under-appreciated area of coding, many MUDs (Multi-User Dungeons: text-based multiplayer online role-playing games) have *excellent* natural language parsers. One example is Zork's, which, though primitive (and single-player) compared to some that exist today, still had the idea.

    Many of these MUDs parse for hundreds or THOUSANDs of commands, and do it quite well. Anyway, I'd just like to present an example of how it can be done.

    If you'd like to check out some of these look at http://www.mudconnector.com [mudconnector.com]. Note that not all muds have natural language parsers, but I'm pretty sure you can search for ones that do at Mud Connector.

    - Wedg

  • by EricEldred ( 175470 ) on Saturday July 29, 2000 @11:54AM (#895500) Homepage

    Am I the only one to remember "Savvy," the natural language command line interface that worked on Apple // and IBM PCs?

    There were two basic parts to it. A Forth interpreter came with predefined "pages" for standard office procedures such as word processor, spreadsheet, database, and so on. A ROM on a card contained some logic to interface between whatever you typed and the Forth system. Since Forth is extensible, it was simple to add new commands.

    For example, if the system already had built into it a command to "list all paychecks in the last week" and you typed "lsit..." instead, the system would ask you what you meant, give you some previous times you typed in something like that, and ask you to either select one of them, or define your new command (in terms of old commands, or in a simple programming language that was near to English and not even related to Forth).

    I rather liked the Savvy system, since it gradually learned from the user instead of always forcing the user to learn the correct way to do things. And it was pretty amazing to have such a system with virtual memory on an Apple //. With a hard disk it was fast and easy to get common jobs done.

    Of course it was too expensive. The idea came out of the space program and was sold for something like $1,295 at first. Then they moved it to the IBM PC and eventually the price fell to less than $100. Excalibur Systems sold it; last I heard the company was doing document management systems.

    On the down side, it was completely incompatible with any other software. It didn't even have a communications program to be able to use a modem to import data. And when the Macintosh came along a lot of users thought that a windowing system with a mouse was better than a keyboard.

    But a lot of people have used command line interfaces as in Unix and the power is attractive for experienced users. Even with Unix shells one can get some of the same power as Savvy simply by creating aliases or small scripts.

    Maybe voice recognition or some other form of natural language system or pattern recognition will be invented. But Savvy proved that you could do a lot of it in just 64KB, on a 1MHz CPU. It didn't try to do everything, just accommodate the user as best as it could. Neat idea. Will we learn from our past failures? Don't count on it!

  • I think a handy interface might be that of the MU* style games of the past...

    "look"

    You are standing in your office. A filing cabinet is to your right, and your desk is in front of you.

    "sit at desk"

    You are now sitting at your desk. On your desk is your check book register, a typewriter and a notepad.

    "use typewriter"

    (At this point, a typewriter program, i.e. a word processor, would open.)

    It would also accept text input at the bottom in a sort of "chat window"

    "Insert new document"

    "Throw document away"

    "Get document"

    Say you "get document"

    You take the document. You should write a name on it.

    "name document My Post to Slashdot"

    Done.

    "stand"

    You stand up from the desk.

    "look at filecabinet"

    It's an ordinary 5 drawer filing cabinet. The drawers are labeled "Bills", "Letters", "Charts" , "Graphics", "MP3s"

    open letters

    I don't know what you mean.

    Open drawer named Letters

    You open up the drawer name "Letters"

    File "My Document to Slashdot"

    You store the file under the folder marked "M"

    File "My Document to Slashdot" under "S"

    You move the file to the folder named "S"

    ---------

    Anyway, you get the point. Is anyone working on something like this?

  • by generic-man ( 33649 ) on Saturday July 29, 2000 @08:18AM (#895502) Homepage Journal
    In my spare time, I help some people in the area use their computers. One of my clients is an elderly woman with some limitations on the use of her hand. Selecting anything, especially the tiny text links and widgets used in many situations today, can be quite difficult. I didn't even try to do tasks using the command line, since she wouldn't be able to recreate them without calling me up and asking for assistance.

    Sure, hardcore *nix hackers will never need a natural language anything -- in fact, one might argue that the standard suite of commands and GNU utilities is their natural language. But for people who don't work with computers all day long, saying something like "Enter a check," "Open a new document," or "Send an e-mail" will do just fine. Each of those tasks would require several windows and mouse clicks even with the most intuitive of GUIs.
  • by Aardappel ( 50644 ) on Saturday July 29, 2000 @09:04AM (#895503) Homepage
    I studied "computational linguistics" for 5 years, and if there's one thing that I got out of it is that this whole endeavour (NL parsing + understanding) is hopeless.

    Natural language is ambiguous way beyond people's imagination, and if there's anything we don't need, it is ambiguity in giving commands to a computer. NL doesn't _seem_ ambiguous because we are so good at disambiguating it (most of the time, anyway) using our own extensive knowledge base about what is "reasonable". For a computer to have access to a similar knowledge base (simulating a brain, in short) is a pretty impossible task at this point in time.

    Yes, we can get away with simple hacks and partial functionality, but what good is that? It will still be ambiguous. If you want a CLI and you want to move away from programming-language-style stuff, the least you have to do is define a language that can only be parsed and interpreted in one way. This won't be natural language, so users will have to learn its peculiarities. It's a shame, but deal with it.

    If I can generalise for a moment, this whole idea that we need a UI that is closer to people to make computers easier to use for computer illiterates is very shaky. The fact that we can talk to the computer (through the proposed CLI, or speech) doesn't make it any easier! Do you think that because I can now say "view attachment" instead of clicking on a button, this will help Joe AverageUser understand any better why part of his HD was wiped, and why all his email contacts got spammed with a virus? Does it help him understand where his file is stored after he utters "write file to disk"?

    There are tons of (relatively easy) things we can do to make a computer easier to use, but this particular one won't bridge the gap one single bit.
  • by adubey ( 82183 ) on Saturday July 29, 2000 @11:24AM (#895504)
    I've been working in "computational linguistics" for the last two years.

    I guess there were a few things you didn't study :)

    The past 10 years or so a new field - statistical natural language processing (SNLP) has shown a _lot_ of promise.

    Right now, if you throw a SNLP system a bunch of parse trees, it's able to induce a grammar - even in sufficiently complicated languages. (For simple languages, you can even induce a reasonable grammar just by giving syntactically correct strings. Impressive!)

    The next stage after inducing syntax from training examples with tagged syntax is to induce semantics from training examples with tagged semantics.

    Yes, this is still a research topic, but it is by _no_ means pointless. One day computers will be able to do anything humans can, and more.
  • While I'm all for making computers easier to use, would typing "move all files beginning with the letter a to the directory called 'foo'" be any improvement over "mv a* foo" (or "move a* foo" for that matter)?

    Sure, if you stick with the existing model of computer interaction and wedge a natural language interface on it.

    However, that would be as useless as having a present-day CLI to low-level system functions only. "read disk sector 1023 from disk 4 via scsi interface 0 number 0 into memory address 1e75FOO" is a waste of even the simple CLIs we have now.

    We need to go in the other direction. "Open my wedding photos." "Spell-check the latest draft of my current novel." "When did I receive an e-mail from my publisher with 'foo' in the subject?"

    Yes, this means even more work than just parsing a natural language and would require a pretty sophisticated model of interaction, but isn't that the kind of challenge that produces revolutionary advances in computing?
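
    As a rough illustration of that direction (the patterns and paths below are invented, not any real system's), a shell could map a small set of high-level requests onto concrete actions and fall back to admitting ignorance rather than guessing:

```python
import re

# Hypothetical intent table: each pattern maps a request to an action string.
INTENTS = [
    (re.compile(r"open my (\w+) photos", re.I),
     lambda m: f"view ~/Photos/{m.group(1)}"),
    (re.compile(r"spell-check .*draft of my (\w+)", re.I),
     lambda m: f"spellcheck ~/Documents/{m.group(1)}/draft-latest"),
]

def interpret(utterance: str) -> str:
    """Translate a high-level request into an action, or admit defeat."""
    for pattern, action in INTENTS:
        m = pattern.search(utterance)
        if m:
            return action(m)
    return "sorry, I don't understand"  # fall back rather than guess
```

    Obviously a real system would need a model of the user's documents, not just regexes; the point is only that the interface works at the level of "wedding photos", not sectors and paths.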

  • by generic-man ( 33649 ) on Saturday July 29, 2000 @11:17AM (#895506) Homepage Journal
    Although it doesn't necessarily replace a CLI, the nice folks at MIT's Lab for Computer Science have set up Jupiter [mit.edu], a voice interface for weather information. Give it a call at 1-888-573-TALK if you're in the US and have nothing better to do on a Saturday afternoon. You can ask it simple questions like "What is the weather in Seattle today?" and "Will it rain tomorrow in New York?" and it will respond after a couple of seconds with the answer. Usually it's pretty good, if you're used to the back-and-forth interface. You have to know how it works, though -- you can only speak after the tone it gives, you can't cut it off, and its English isn't that good.

    Here's what I heard when I handed the phone to a relative of mine who is hardly computer savvy:

    "What's the weather in Baltimore today?" (pause, gets answer) "What about in New York?" (short pause) "New York. NO, not Utica, just New York City. Hello? I don't want to hear the weather in Utica, I don't LIVE in Utica! Just give me the weather for New York City! Oh, I give up." (click)

    With all due respect, Artificial Intelligence has a long way to go before it catches up with natural stupidity. :)
  • by Tom7 ( 102298 ) on Saturday July 29, 2000 @08:17AM (#895507) Homepage Journal
    Slashdot asks:

    '' would typing "move all files beginning with the letter a to the directory called 'foo'" be any
    improvement over "mv a* foo" (or "move a* foo" for that matter)? ''

    Probably not, but how about "Move all files I changed in the last week onto my zip disk."? I know I'd need to check the 'find' man page for that.
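
    For comparison, here's a sketch of what that request translates to without natural language; the find(1) equivalent would be along the lines of find ~ -mtime -7 -type f, and the zip-disk mount point below is invented:

```python
import shutil
import time
from pathlib import Path

def move_recent(src: Path, dest: Path, days: int = 7) -> list[str]:
    """Move regular files under src modified within the last `days` days to dest."""
    cutoff = time.time() - days * 86400
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    # Snapshot the tree first, since we mutate it while moving.
    for f in list(src.rglob("*")):
        if f.is_file() and f.stat().st_mtime >= cutoff:
            shutil.move(str(f), str(dest / f.name))
            moved.append(f.name)
    return moved

# e.g. move_recent(Path.home(), Path("/mnt/zip"))  # mount point is hypothetical
```

    Which of the two forms is the "improvement" is exactly the debate in this thread.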
  • by Frymaster ( 171343 ) on Saturday July 29, 2000 @02:01PM (#895508) Homepage Journal
    I've been working off and on for the last year on an Eliza-type interface for XF86Config. It still has a few bugs, but the gist is there. Its basic interface is:

    XEliza What sort of video card do you have?
    user it's a rage 128
    XEliza Tell me more about this rage 128
    user Well, uh, it's got 8 megs of ram
    XEliza How does 8 megs of ram make you feel?
    user okay, I guess. I wish I had more of course.
    XEliza Does it bother you that you wish you had more?
    user Well, I don't really play a lot of games, so maybe I'm just being greedy.
    XEliza Do people often say you are greedy.
    user Not really, I mean, they say I spend too much time and money on my computer
    XEliza Who says you spend too much time and money on your computer?
    user Well, my mom does.
    XEliza Tell me about your mother
    error: cannot parse "mother" in module "freud":
    no output written
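
    The classic Eliza trick being parodied here is little more than pattern matching plus pronoun reflection; a minimal sketch, with rules made up to echo the dialogue above:

```python
import re

# Swap perspective words so the reply addresses the user.
REFLECTIONS = {"i": "you", "my": "your", "am": "are", "you": "I", "your": "my"}

# Ordered rules: first match wins; the last rule is a catch-all.
RULES = [
    (re.compile(r"i wish i had (.+)", re.I),
     "Does it bother you that you wish you had {0}?"),
    (re.compile(r"it's got (.+)", re.I),
     "How does {0} make you feel?"),
    (re.compile(r"(.+)", re.I),
     "Tell me more about {0}"),
]

def reflect(fragment: str) -> str:
    return " ".join(REFLECTIONS.get(w.lower(), w) for w in fragment.split())

def respond(line: str) -> str:
    for pattern, template in RULES:
        m = pattern.search(line)
        if m:
            return template.format(reflect(m.group(1)))

respond("it's got 8 megs of ram")  # → "How does 8 megs of ram make you feel?"
```

    The joke in the comment works precisely because this technique has no model of video cards, RAM, or mothers: it just rearranges the user's own words.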

  • by munch117 ( 214551 ) on Saturday July 29, 2000 @09:33AM (#895509)
    But for people who don't work with computers all day long, saying something like "Enter a check," "Open a new document," or "Send an e-mail" will do just fine.

    "Enter a check", huh?

    The chess 'bot would wake up and tell you

    Sorry Dave, I can't let you do that. Chess rules specifically forbid entering a check.

    Seriously, ambiguity is a big problem with natural-language interfaces. How is the computer to know if you wanted to do something involving your checking account or if you want to add an integrity constraint to a computer program you are writing? Or if you are playing a chess game and trying to make an illegal move ...

    Natural language comprehension is an AI-complete [earthspace.net] problem. That doesn't mean useful approximations can't be done. But they only work if you keep commands simple and adhere to a computer-friendly style of expression. With requests of any complexity, the risk of a misunderstanding is too great to trust the interface to do anything that can't easily be undone. In this sense it is similar to DWIM [earthspace.net] interfaces (follow the link for a good anecdote).

    No doubt natural language interfaces will find their niches, and someday many people may even use natural language interfaces exclusively. But when these people need to do something more complex or risky, they will have to turn to a hacker who masters some other arcane but concise command language.

    /A
