mastodon_spam_scanner

Differences

This shows you the differences between two versions of the page.

mastodon_spam_scanner [2023/02/09 03:11]
admin
mastodon_spam_scanner [2023/05/02 10:11] (current)
neil
Line 1: Line 1:
 ====== Mastodon SEO Spam ======
-I occasionally notice some spam accounts being created on my Mastodon instance. If they haven't posted to the timeline then they aren't reported and the only way I can spot them is to manually review new accounts when they sign up. On those rare days that there is a massive spike in signups (we had 11k sign up to https://glasgow.social over a few days in November) it's just not feasible to manually review. I made this script to let me review the accounts after the fact (this also helps catch those spammers who create the account, wait a few days, then modify them).
+{{ :pasted:20230209-031500.png?200|An example of the type of accounts this script finds}}
 +I occasionally notice some spam accounts being created on my Mastodon instance. If they haven't posted to the timeline then they generally aren't spotted/reported and the only way I can see them is to manually review new accounts when they sign up. On those rare days that there is a massive spike in signups (we had 11k sign up to https://glasgow.social over a few days in November) it's just not feasible to manually review. I made this script to let me review the accounts after the fact (this also helps catch those spammers who create the account, wait a few days, then modify them).
  
-I'm still working out how best to identify them. At the moment, I'm looking at the custom fields (called 'attachment' in Mastodon) and counting the URLs there. If there are four URLs then it's often spam.
+I'm still working out how best to identify a spammer. At the moment, I'm just looking at the custom fields (called 'attachment' in Mastodon) and counting the URLs there. If there are four URLs then it's often spam.
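The URL-counting idea can be sketched roughly as follows. This is a minimal illustration, not the script from this page: the function name ''count_field_urls'' and the sample data are made up, and it assumes the account's profile JSON has already been fetched and decoded into an array whose 'attachment' entries have type ''PropertyValue''.

```php
<?php
// Hypothetical helper (not from the original script): count the custom
// profile fields whose value contains a URL.
// Assumes $actor is the decoded profile JSON for one account, with custom
// fields listed under 'attachment' as entries of type 'PropertyValue'.
function count_field_urls(array $actor): int
{
    $count = 0;
    foreach ($actor['attachment'] ?? [] as $field) {
        if (($field['type'] ?? '') !== 'PropertyValue') {
            continue; // skip anything that is not a custom profile field
        }
        // Field values are HTML, so a link shows up as an http(s) URL in it.
        if (preg_match('#https?://#i', (string) ($field['value'] ?? ''))) {
            $count++;
        }
    }
    return $count;
}

// Made-up sample data, for illustration only.
$actor = [
    'attachment' => [
        ['type' => 'PropertyValue', 'name' => 'Site', 'value' => '<a href="https://example.com">example.com</a>'],
        ['type' => 'PropertyValue', 'name' => 'Pronouns', 'value' => 'they/them'],
    ],
];
echo count_field_urls($actor), "\n"; // prints 1
```

An account whose four custom fields all contain links would get a count of 4 here, which matches the "often spam" signal described above.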
  
 First, I get a list of all the local users by connecting to my postgres database:
 
 <code sql>
-copy (select username,suspended_at from accounts where domain is null) to 'users.csv';
+copy (select username,suspended_at from accounts where domain is null) to '/tmp/users.csv' with delimiter ',';
 </code>
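As a rough sketch of how that export might be read back in PHP: the helper name ''load_active_usernames'' is hypothetical, and skipping suspended accounts is my assumption about why ''suspended_at'' is selected. Because the ''copy'' above uses the default text format, a NULL ''suspended_at'' is written as ''\N''.

```php
<?php
// Hypothetical helper (not from the original script): return the usernames
// of accounts that are not suspended, from a "username,suspended_at" export.
function load_active_usernames(string $csv_path): array
{
    $usernames = [];
    $lines = file($csv_path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines === false) {
        return $usernames; // file missing or unreadable
    }
    foreach ($lines as $line) {
        // Split into at most two columns; pad in case the second is absent.
        [$username, $suspended_at] = array_pad(explode(',', $line, 2), 2, '');
        // Postgres text-format COPY writes NULL as \N (backslash, capital N).
        if ($suspended_at === '' || $suspended_at === '\N') {
            $usernames[] = $username;
        }
    }
    return $usernames;
}

// Made-up sample export, for illustration only.
$path = tempnam(sys_get_temp_dir(), 'users');
file_put_contents($path, implode("\n", [
    'alice,\N',
    'bob,2023-01-05 10:00:00.123456',
]) . "\n");
echo implode(', ', load_active_usernames($path)), "\n"; // prints alice
unlink($path);
```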
  
Line 38: Line 39:
 I can then search for these in the moderation interface and review them.
  
-The php code to generate the scores:
+The php code to generate the scores (remember to create a cache directory with ''mkdir cache''):
  
 <code php scan_for_spammers.php>
Line 78: Line 79:
       }
       $percent_complete = number_format(($progress/$total)*100,1);
-      $moderation_link = $mastodon_host."/admin/accounts?origin=local&username=".$username;
+      $moderation_link = "<a href='$mastodon_host/admin/accounts?origin=local&username=".$username."'>mod link</a><br />";
       echo $score."\t$username\t$moderation_link\n";
       // this outputs a progress indicator to stderr
Line 91: Line 92:
 ?>
 </code>
 +
 +I added a moderation link to each line of the output, so I can save the results as HTML and open that file in a browser. For example, to keep only the accounts whose score starts with 4:
 +<code bash>
 +php scan_for_spammers.php | grep -E "^4" > output.html
 +</​code>​
 +
 +To answer a question on Mastodon: you could add a list of spam keywords or suspicious URLs at the top of the file, for example:
 +
 +<code php>
 +$spam_keywords = array('spam_term', 'spamwebsite.com');
 +</​code>​
 +
 +Then add a loop just after the ''foreach($attachment..'' to search the profile text for each URL or keyword. For example, adding this raises the score by one for every keyword that matches:
 +
 +<code php>
 +      foreach($spam_keywords as $keyword) {
 +         // preg_quote() escapes regex metacharacters such as the "." in 'spamwebsite.com'
 +         if(preg_match("/".preg_quote($keyword, "/")."/i", $json['summary']))
 +            $score++;
 +      }
 +
 +</​code>​
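Keywords such as 'spamwebsite.com' contain characters that are special in a regular expression (the dot matches any character), so a plain case-insensitive substring test is another option. The sketch below is only an illustration: ''keyword_score'' is a hypothetical helper, and the profile text is passed in as a string rather than read from ''$json['summary']'' directly.

```php
<?php
// Hypothetical helper (not from the original script): add one point for each
// keyword that appears in the profile text, ignoring case.
// stripos() does a literal substring search, so no regex escaping is needed.
function keyword_score(array $keywords, string $text): int
{
    $score = 0;
    foreach ($keywords as $keyword) {
        if (stripos($text, $keyword) !== false) {
            $score++;
        }
    }
    return $score;
}

$spam_keywords = array('spam_term', 'spamwebsite.com');

// Made-up profile text, for illustration only.
echo keyword_score($spam_keywords, 'Buy now at SpamWebsite.com'), "\n"; // prints 1
echo keyword_score($spam_keywords, 'Visit spamwebsiteXcom'), "\n";      // prints 0
```

In the scanner this could be called as ''$score += keyword_score($spam_keywords, $json['summary'] ?? '');'', assuming the summary is a string when present.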
 +
 +Back to the [[Mastodon]] page.
mastodon_spam_scanner.1675912290.txt.gz · Last modified: 2023/02/09 03:11 by admin