A false positive, as I understand it, is like when you go to a laboratory to check whether you have AIDS, and after some months the doctors inform you that you are HIV positive. But in fact, the test was wrong and you aren't! Or another example: a woman you don't know claims that you are the father of her baby boy. You go to a clinic for a DNA test to prove otherwise, and the test result wrongly confirms that the baby boy is really yours! Aaargh.
Okay. For a spellchecker, what I mean by false positives is the words my spellchecker lists as misspelled but which are, in fact, spelled correctly. The spellchecker simply doesn't know that these words exist. The goal for a spellchecker would be to have a false positive rate of 0.00%. At the time of this writing (October 2008), the Malagasy online spellchecker was tested on one of the most well-written Malagasy blogs: my test submitted 2145 words, the spellchecker did not recognize 115 of them, and among these 115 words reported as misspelled, 24 were false positives.
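To put those test figures in perspective, here is a small arithmetic sketch in Python. The three numbers come from the test described above; the variable names are my own, and I show two possible ways of expressing the false positives as a percentage, since which one you call "the rate" is a matter of convention.

```python
# Figures from the October 2008 test run described above.
words_submitted = 2145
words_flagged = 115     # words the spellchecker reported as misspelled
false_positives = 24    # flagged words that were in fact spelled correctly

# Flagged words that really were misspelled.
true_positives = words_flagged - false_positives

# Two ways to express the false positives as a percentage (both are
# illustrative conventions, not an official metric of the spellchecker).
share_of_all_words = 100 * false_positives / words_submitted
share_of_flagged = 100 * false_positives / words_flagged

print(f"genuinely misspelled words: {true_positives}")
print(f"false positives per submitted word: {share_of_all_words:.2f}%")
print(f"false positives among flagged words: {share_of_flagged:.2f}%")
```

So roughly one submitted word in a hundred was wrongly flagged, but about one in five of the reported errors was not an error at all.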
For me, the most shameful false positive of this first version of the spellchecker is the word "telomiova". Shameful because I'm involved in a project called telomiova and I know what a telomiova is.
On the other hand, there are also false negatives, which are the misspelled words that the spellchecker couldn't find. The reasons could be:
- The person who established the word list made spelling errors and registered a misspelled word into the database.
- The word is misspelled, but the misspelled form happens to be a correctly spelled word with another meaning. Like when you write "I live a happy wife" instead of "I live a happy life".
For the first kind of false negative, I'll try to have the word list validated by the national Malagasy academy of the language, as soon as it can be done. But for the second kind, I'm afraid I won't be able to do much...
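To show why the second kind is so hard to catch, here is a minimal sketch of a checker that only looks words up in a list. This is not the actual code of the Malagasy spellchecker: the tiny `WORD_LIST` and the `flag_misspelled` function are hypothetical, made up for the illustration.

```python
import re

# Hypothetical, tiny word list standing in for the real database.
WORD_LIST = {"i", "live", "a", "happy", "life", "wife"}


def flag_misspelled(text, word_list):
    """Return the words of `text` that are absent from `word_list`."""
    words = re.findall(r"[a-z']+", text.lower())
    return [word for word in words if word not in word_list]


# A real typo is caught because "hapy" is not in the list.
print(flag_misspelled("I live a hapy life", WORD_LIST))    # ['hapy']

# False negative: "wife" is the wrong word here, but it is a valid word.
print(flag_misspelled("I live a happy wife", WORD_LIST))   # []

# False positive: a correct word that the list simply does not contain.
print(flag_misspelled("telomiova", WORD_LIST))             # ['telomiova']
```

A pure lookup like this has no idea what the sentence means, so any mistake that produces another valid word slips through. Catching it would need some knowledge of context, which is well beyond a word list.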