
Levenshtein distance is an absolute metric (a raw edit count that ignores string length); I think a normalized similarity like Sørensen-Dice would be more useful.
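
To illustrate the difference, here is a rough sketch rather than a reference implementation. It assumes character bigrams for the Sørensen-Dice side (one common choice; other tokenizations exist), and the function names `levenshtein`, `bigrams` and `dice` are made up for this example.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of single-character inserts, deletes, or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def bigrams(s: str) -> list[str]:
    """Overlapping character pairs, e.g. 'cat' -> ['ca', 'at']."""
    return [s[i:i + 2] for i in range(len(s) - 1)]


def dice(a: str, b: str) -> float:
    """Sørensen-Dice coefficient over character bigrams, in the range 0.0 to 1.0."""
    x, y = Counter(bigrams(a)), Counter(bigrams(b))
    total = sum(x.values()) + sum(y.values())
    if total == 0:
        # Both strings are shorter than 2 characters: fall back to exact match.
        return 1.0 if a == b else 0.0
    overlap = sum((x & y).values())  # multiset intersection
    return 2 * overlap / total


# Both pairs differ by exactly one substitution, so Levenshtein cannot tell them apart.
for left, right in [("cat", "cut"), ("keyboard", "keybaard")]:
    print(left, right, levenshtein(left, right), round(dice(left, right), 3))
```

Both pairs have an edit distance of 1, but the bigram Dice score separates them sharply: one changed letter wipes out every bigram in a three-letter word, while the eight-letter pair still shares most of its bigrams. With short keywords that sensitivity cuts both ways, so any threshold would need tuning.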

Regardless, if you take the short keywords and blacklist any that approximately match curse words from several languages, I think it would be really hard to end up with anything at all.


