Well, that was fast

I already don’t like Akismet so I’ve deactivated it. I didn’t even get to the point where it misbehaved in handling a comment. The admin panel is what pisses me off. It tells me there’s 440 pieces of spam it’s flagged when it hasn’t been active long enough. As it turns out, anything marked “spam” in the comments database is fair game. And Akismet tells me that after 15 days it’s going to trash anything marked “spam”. I actually prefer to keep that in the database for historical reasons.

*sigh* Another one bites the dust. I wonder how difficult it would be to hack out the bits that just do an “is it spam” check from the plugin. That’s all I’m really interested in.

8 Responses to “Well, that was fast”

Matt Says:

I was saving my spam comments for “historical reasons” as well, but then it started building up and one day I looked and my wp_comments table was over 15MB. We were also getting a ton of complaints about bloated backup files.

The original reason was for some sort of analysis of the spam for plugins, but it’s been over a year since the feature was added and no one has done that yet. However, Akismet does do that, just in a centralized fashion and on a global scale.

However, if you really want to save the spam, look for the akismet_delete_old() function, deactivate that and you’re set.

Ryan Says:

I don’t actually mind Akismet wiping out the comments, but I want it to be optional for a little while so I can test it out without fear of losing anything. The problem is that I don’t know Akismet yet so I don’t trust it. I’d like it to not mess with anything until I’ve reached a level of trust with the system. Sort of a trial period where I could see what it would flag before deciding that it’s not going to cause too many problems.

Thanks for responding. Maybe I’ll hack up the Akismet plugin a little bit to get what I want. I really do want it to be good, god knows I need the help fighting off these spammy bastards.

Matt Says:

That makes sense. I guess from my point of view the 15 days is enough time to review what it’s caught (you can mark things as “not spam” too and it learns from that) and I’m pretty resistant to adding options on principle.

Perhaps the key misunderstanding was that the 440 comments were things already caught by WordPress before Akismet got there, probably from words in your blacklist or things you’ve marked as spam before. They’ve just been invisible until the Akismet UI exposed them. I’ll give some thought to how to make that easier.

Ryan Says:

Yes, that’s exactly what happened. The 440 spam that showed up were ones I had previously marked as spam dating back to May of this year. I was shocked to see that it had instantly flagged 440 pieces of spam until I finally realized what was going on.

As for the 15 day thing, that totally makes sense. It gives you time to play. That being said, you know on about day 10 that there’s a clock ticking in your system that you have to constantly be aware of if you don’t want to lose anything in your system.

Maybe when I get some time tomorrow I’ll edit the PHP file and comment out the call to remove the old spam and then let the system run a while. I think I ought to be able to change all of the “spam” in my database to something like “oldspam” so Akismet doesn’t see it anymore. That would at least let me start with a clean slate with Akismet while still retaining my old spam.
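
A rough sketch of that relabeling idea, for anyone curious. It runs against a throwaway in-memory SQLite table rather than a real WordPress database, so nothing here touches live data. The table and column names (wp_comments, comment_approved) follow the WordPress schema, but the sample rows and the relabel_spam helper are made up for illustration. It also assumes the column will accept an arbitrary string like “oldspam”; if the column is defined as an enum in your install, the column type would have to change first.

```python
import sqlite3


def relabel_spam(conn, old="spam", new="oldspam"):
    """Relabel every comment whose status equals `old`; return rows changed."""
    cur = conn.execute(
        "UPDATE wp_comments SET comment_approved = ? WHERE comment_approved = ?",
        (new, old),
    )
    conn.commit()
    return cur.rowcount


def demo():
    # Stand-in for the real table: only the columns this sketch needs.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE wp_comments ("
        "comment_ID INTEGER PRIMARY KEY, "
        "comment_content TEXT, "
        "comment_approved TEXT)"
    )
    # Hypothetical sample rows: two old spam comments and one approved one.
    conn.executemany(
        "INSERT INTO wp_comments (comment_content, comment_approved) VALUES (?, ?)",
        [
            ("cheap home mortgages", "spam"),
            ("texas hold 'em", "spam"),
            ("Nice post!", "1"),
        ],
    )
    changed = relabel_spam(conn)
    # Count comments per status after the relabel.
    counts = dict(
        conn.execute(
            "SELECT comment_approved, COUNT(*) FROM wp_comments "
            "GROUP BY comment_approved"
        )
    )
    return changed, counts


if __name__ == "__main__":
    print(demo())
```

On a real site the equivalent would be a single UPDATE statement run after taking a backup, and the old rows would then no longer match anything that looks only for the “spam” status.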

IO ERROR Says:

You’re a strange one. Most people don’t want to keep their spam around. :) Are you secretly a fan of texas hold ‘em and cheap home mortgages? No? How about phentermine and viagra? Now, see, Akismet didn’t flag this post as spam, despite all the nasty keywords, so it’s a little smarter than you think. :)

Most of us testing Akismet are seeing a 0.1% false positive rate and that rate is dropping. That’s easily better than most other spam plugins out there.

Now, why won’t you come to my online gambling site? :)

Ryan Says:

Ha…actually the only reason I’ve kept all my spam around is because I’ve hoped that there might be some sort of bayesian spam plugin at some point, so the corpus of spam would actually be useful.

And Akismet didn’t flag your post because I currently have it disabled. I’ll be reenabling it tonight to give it a second go.

FiReaNG3L Says:

Actually I think that Akismet is bayesian-filter based, so keeping your old spam to train yourself a filter is kind of pointless (as long as their service stays up, anyway). But what happens if they go down / out of business?

Ryan Says:

Yeah, the thing is, back when I was doing this I was hopping from one anti-spam solution to the other. So it was more important to keep that corpus of data in case I opted to switch to some locally-based filter.
