Open Source Grammar Checkers?
DaveBarr asks: "Maybe I'm more sensitive to this than most, but after continuing to see "it's" instead of "its" and "loose" instead of "lose" everywhere in the media and on web sites of supposedly reputable origin, I began to wonder. Are there any Open Source projects trying to develop a reliable grammar checker -- one that would catch these common foibles? Are all these algorithms proprietary? Are there any University research projects which could be used as a basis for even a halfway-decent grammar checker?"
Slashdot could do it's [sic] part. (Score:1)
is posted *without* any grammar mistakes.
I've often wondered what would happen if
the "preview" function for submitting an
article included something like
s/([iI]t)'s/$1 is/g
Can we force geeks to recognize "it's"
for what it is through technology?
--kyler
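For anyone who doesn't read Perl, here is a rough Python sketch of what that substitution would do to a submission. The function name expand_its and the added \b word boundary are my own assumptions, not anything Slashdot's preview actually runs.

```python
import re

# Rough Python equivalent of the Perl substitution s/([iI]t)'s/$1 is/g.
# The \b anchor is an addition (an assumption) so that words that merely
# end in "it", such as "rabbit's", are left alone.
IT_S = re.compile(r"\b([iI]t)'s")


def expand_its(text: str) -> str:
    """Replace every "it's" or "It's" with "it is" or "It is"."""
    return IT_S.sub(r"\1 is", text)


print(expand_its("Slashdot could do it's part."))
# Slashdot could do it is part.
```

The expanded form makes the misuse obvious in the preview. It would also mangle "it's" where it stands for "it has", so this is a nudge rather than a fix.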
Dr. Bruce E. Wampler and Grammatik (Score:1)
Hmmm, I wonder if Dr. Bruce has any thoughts on designing an Open Source grammar checker? He could probably offer a lot of guidance to any group that wanted to start such a project.
1) Homophone checker: A needed addition to WPs (Score:1)
Building upon spelling checker code, a fairly small dictionary could provide all the data needed to identify most homophones. At the user's choice, each homophone could be flagged, with the alternate spellings shown in a dialog box and a very concise meaning given for each. The user would select the intended meaning.
So far, this idea seems to have generated little interest, but it would help create fewer ridiculous bodies of text.
Far more ambitious would be a lexical analyzer that would try to deduce whether a given homophone seemed appropriate for the meanings of the words (a bottomless pit?) in the surrounding text. (Bloatware, anyone?)
Nicholas Bodley // nbodley@world.std.com
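A minimal sketch of the flagging half of that idea, assuming a tiny hand-made dictionary. The GROUPS table, the glosses, and the function name flag_homophones are all hypothetical; a real checker would load a much larger list and show the choices in a dialog rather than printing them.

```python
import re

# Hypothetical mini-dictionary: each group lists (spelling, concise meaning).
# "lose"/"loose" are not strict homophones, but they are confused often
# enough that they are included here as an assumption.
GROUPS = [
    [("its", "belonging to it"), ("it's", "it is; it has")],
    [("lose", "fail to keep or win"), ("loose", "not tight")],
    [("their", "belonging to them"), ("there", "in that place"),
     ("they're", "they are")],
]

# Map every spelling to its whole group for quick lookup.
LOOKUP = {spelling: group for group in GROUPS for spelling, _ in group}

# A word is a run of letters, optionally followed by an apostrophe suffix.
WORD = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")


def flag_homophones(text):
    """Return (offset, word, alternatives) for each word that has sound-alikes."""
    flags = []
    for match in WORD.finditer(text):
        group = LOOKUP.get(match.group().lower())
        if group:
            flags.append((match.start(), match.group(), group))
    return flags


for offset, word, choices in flag_homophones("The team will loose its lead."):
    options = ", ".join(f"{s} ({meaning})" for s, meaning in choices)
    print(f"{offset}: {word!r} -> {options}")
```

Note that this flags every occurrence, right or wrong, and leaves the decision to the user. Deciding which spelling fits the context is the far more ambitious part.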
Re:Parsing English (or any other language) (Score:1)
"When people try to get computers to learn, the people do and the computers don't" - Alan Perlis
Not open source but... (Score:1)
Koffice? (Score:1)
Consider the following: (Score:1)
Or:
The boy is hungry
The boy is a toad
Or:
The boy carried a sandwich to the playground and ate it. (Ate the playground? Note that pronouns are the most ambiguous words in the English language.)
It's easy for us to tell how to parse those, but a computer would have to maintain a database of the following:
playground is big
sandwich is small
people normally eat small things
when dogs bite, they harm humans
a noun indicating [a] human[s] (squad) would not harm humans.
One can argue that the purpose of learning is to fill in those pieces of knowledge, but:
1) The amount of knowledge that would have to be stored and recalled is *huge*.
2) Even if we have the storage and recall capacity, computers need to be able to interpret everything and know that, among other things, squad can be a group of people, "normally" may not always apply, etc. etc.
void recursion (void)
{
recursion();
}
while(1) printf ("infinite loop");
if (true) printf ("Stupid sig quote");
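To make the "database" in that comment concrete, here is a toy sketch that resolves the "it" in the sandwich sentence from hand-entered facts. The SIZE table, EDIBLE_SIZES, and resolve_eaten are hypothetical names, and the facts are just the ones listed in the comment.

```python
# Hypothetical hand-entered facts, mirroring the list in the comment:
# "playground is big", "sandwich is small".
SIZE = {"playground": "big", "sandwich": "small"}

# "people normally eat small things"
EDIBLE_SIZES = {"small"}


def resolve_eaten(candidates):
    """Keep only the candidate nouns that could plausibly be what was eaten."""
    return [noun for noun in candidates if SIZE.get(noun) in EDIBLE_SIZES]


print(resolve_eaten(["sandwich", "playground"]))  # ['sandwich']
```

Any noun missing from the table yields no answer at all, and "normally" is treated as "always", which is exactly the pair of objections the comment raises.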
WHO WANTS TO WRITE ONE (Score:1)
Perl snippet, translated (Score:1)
Re:Parsing English (or any other language) (Score:1)
The question is, what do you mean by a grammar checker? If you simply mean a program that reads text and tries to find obvious errors, you do not need to be able to parse English completely. To extend the example from above, you do not need to know exactly what "The cow is brown" means, only whether the tenses agree. That program would just need to be able to recognize certain patterns as wrong. That is not impossible.
The other side of it, a program that actually understands what you are writing and figures out the best way to communicate it, is much more complex. It would be a very cool program if it could be completed. Besides, what better than OSS to harness the immense mindshare that would require?
That being said, my grammar is so horrible I would love to see either one working as soon as possible.
Nate Custer
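A rough sketch of the first kind of checker, one that only recognizes certain patterns as wrong and understands nothing. The RULES table and the function name check are illustrative assumptions, not a vetted rule set.

```python
import re

# Hypothetical rule table: (compiled pattern, message).
RULES = [
    # The same word twice in a row, e.g. "the the".
    (re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE), "repeated word"),
    # A few hard-coded plural nouns followed by a singular verb.
    (re.compile(r"\b(?:cows|boys|dogs)\s+is\b", re.IGNORECASE),
     "plural subject with singular verb"),
]


def check(text):
    """Return (offset, matched text, message) for every rule hit, in text order."""
    hits = []
    for pattern, message in RULES:
        for match in pattern.finditer(text):
            hits.append((match.start(), match.group(), message))
    return sorted(hits)


for offset, matched, message in check("The cows is brown."):
    print(f"{offset}: {matched!r}: {message}")
```

Every rule is a surface pattern, so the checker only gets better by adding rules, and each new rule brings its own false positives.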
GNU/FSF has one (Score:2)
This is a GNU program still in development. It's available at:
this link [moria.de]
I've played with diction and it's not bad. Not great, but not bad.
Parsing English (or any other language) (Score:2)
I've even played with coding a C library that reads like English without proper writing mechanics. A natural language interpreter shouldn't be too hard, though it would be time-consuming and would probably not produce a substantial return on investment for a financial sponsor.
I am inclined to think that the problem is ideological. There are so many disagreements among philosophers, linguists, and computer scientists as to the meaning of 'The cows are brown.' that unless one person is sufficiently versed in all three fields, and some other disciplines besides, no consensus will be reached and no plan will ever be implemented.