Threativore: An anti-spam automoderator for Lemmy

db0@lemmy.dbzer0.com · 8 months ago

Threativore: An anti-spam automoderator for Lemmy

Snot Flickerman@lemmy.blahaj.zone · 8 months ago

Great work, thank you for making the effort!

bdonvr@thelemmy.club · 8 months ago

I’d really like a spam-signup type detector. If someone signs up and immediately starts posting or commenting way too much, they should get a ban of a few days, and a permanent one if they do it again.
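A rough sketch of what such a check could look like, purely as an illustration. The function name, the thresholds, and the idea of counting actions in a one-hour window are all assumptions made up for this example; none of them come from Threativore itself.

```python
from datetime import datetime, timedelta

# Every threshold here is a made-up illustration, not a Threativore setting.
NEW_ACCOUNT_AGE = timedelta(days=1)   # how long an account counts as "fresh"
WINDOW = timedelta(hours=1)           # period over which activity is counted
MAX_ACTIONS_IN_WINDOW = 20            # posts + comments tolerated per window
TEMP_BAN = timedelta(days=3)          # first-offence ban length


def check_account(created_at, action_times, prior_offences, now):
    """Return None (no action), a timedelta (temporary ban) or "permanent"."""
    if now - created_at > NEW_ACCOUNT_AGE:
        return None  # only freshly created accounts are checked
    # Count posts and comments made inside the recent window.
    recent = [t for t in action_times if timedelta(0) <= now - t <= WINDOW]
    if len(recent) <= MAX_ACTIONS_IN_WINDOW:
        return None
    # Escalate on a repeat offence.
    return "permanent" if prior_offences > 0 else TEMP_BAN
```

A check like this only looks at account age on the local instance, so it cannot tell a spambot from an established user who has just migrated and is catching up.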

db0@lemmy.dbzer0.com · 8 months ago

That could affect power users who just moved instances.

bdonvr@thelemmy.club · 8 months ago

That’s true, but there’s still a difference between a power user and a spambot.

Blaze@reddthat.com · 8 months ago

Great, thank you for this!

FaceDeer@kbin.social · 8 months ago

Another more general property that might be worth looking for would be substantially similar posts that get cross-posted to a wide variety of communities in a short period of time. That’s a pattern that can have legitimate reasons but it’s probably worth raising a flag to draw extra scrutiny.

One idea for making it computationally lightweight but also robust against bots “tweaking” the wording of each post might be to fingerprint each post based on rare word usage. Spam is likely to mention the brand name of whatever product it’s hawking, which is probably not going to be a commonly used word. So if a bunch of posts come along that all use the same rare words all at once, that’s suspicious. I could also easily see situations where this gives false positives, of course - if some product suddenly does something newsworthy you could see a spew of legitimate posts about it in a variety of communities. But no automated spam checker is perfect.
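The rare-word idea above could be sketched roughly like this. It is only an illustration under several assumptions: `background_counts` is a hypothetical word-frequency table built from ordinary posts, the cutoffs are arbitrary, and the caller is assumed to pass in only posts from a short time window.

```python
import re
from collections import defaultdict


def rare_words(text, background_counts, max_background=2, min_length=4):
    """Words in `text` that barely appear in a background corpus of normal posts."""
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    return {
        w for w in words
        if len(w) >= min_length and background_counts.get(w, 0) <= max_background
    }


def flag_suspicious(posts, background_counts, min_communities=3):
    """posts: iterable of (community, text) pairs from a short time window.

    Returns {rare_word: set_of_communities} for every rare word that shows
    up in at least `min_communities` different communities.
    """
    seen = defaultdict(set)
    for community, text in posts:
        for word in rare_words(text, background_counts):
            seen[word].add(community)
    return {w: c for w, c in seen.items() if len(c) >= min_communities}
```

Because it only needs set lookups and a counter, this stays cheap, and a flagged word is a hint for a human or a heavier check rather than proof of spam.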

db0@lemmy.dbzer0.com · edit-2 8 months ago

Feel free to submit a PR for these ideas. For post similarity, machine learning techniques can be used to calculate the “distance” between two posts, but I don’t know whether that would stay computationally feasible as the amount of spam increases. Especially if spammers start using their own generative AI engines.

FaceDeer@kbin.social · 8 months ago

That’s why I was suggesting such a simple approach: it doesn’t require AI or machine learning except in the most basic sense. If you want to try applying fancier stuff, you could use those basic word-based filters as a first pass to reduce the cost.

db0@lemmy.dbzer0.com · 8 months ago

There are likely a lot of anti-spam tactics we can employ. I hope people will help improve it.

GlitterInfection@lemmy.world · 8 months ago

Honestly, my dream Lemmy client would combine posts in my Home and All feeds based solely on the links in the posts, regardless of community or instance, and it would then provide UX to present the rest of the information if I choose to click into it.

Lemmy is designed around a concept that almost requires but definitely invites spamming links. Assuming you have good intentions and want to reach a wider federated audience, you would post your link to a few instances at once.
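As a rough illustration of the link-based grouping described in this comment, a client could normalize each post's URL and bucket posts by the result. This is a minimal sketch, not how any existing Lemmy client works; the post dictionaries and their `"url"` and `"community"` keys are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def normalize_link(url):
    """Reduce trivially different URLs (case, trailing slash, fragment) to one key."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    path = parts.path.rstrip("/")
    # Drop the fragment, but keep the query since it can identify the page.
    return urlunsplit((parts.scheme.lower(), host, path, parts.query, ""))


def group_by_link(posts):
    """posts: iterable of dicts with hypothetical keys "url" and "community"."""
    groups = defaultdict(list)
    for post in posts:
        groups[normalize_link(post["url"])].append(post)
    return dict(groups)
```

Each group could then be shown as a single feed entry, with the individual communities and their comment threads listed once the entry is opened.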

Danterious@lemmy.dbzer0.com · 8 months ago

This is sort of related but do you have any plans on looking for coordinated voting?

db0@lemmy.dbzer0.com · 8 months ago

Not atm. Wouldn’t even begin to know where to look.

GitHub - db0/threativore: A Thrediverse bot fight against spam