EDIT: I'm putting this up front so it's the FIRST thing you see and read: I WAS WRONG. I ASSUMED (and I know better) that it wasn't possible for me to have 3000 accounts created within a day or two of going live. I ASSUMED the accounts I saw were NOT local. I WAS WRONG. I created a process to remove the bot accounts from my database without crashing my site. I have tested it, and it looks like all functions are working. If you need help because you suddenly have thousands more accounts than you would expect, ask me for the procedure. I'll gladly provide it.
I was able to identify bot accounts by looking at creation times. The accounts are grouped in "batches" where the creation times are within seconds of each other. That's not typically going to happen with random humans creating accounts.
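For anyone who wants to look for the same pattern, here is a rough sketch of a query that buckets local sign-ups by minute and lists the busiest buckets. It is not the exact procedure I used. It assumes PostgreSQL, and it assumes person.published holds the account creation time (check your own schema first). The threshold of 5 is an arbitrary number picked for illustration.

```sql
-- Sketch only. Assumption: person.published is the account creation timestamp.
-- Count local accounts created in the same minute; bot batches show up as large counts.
SELECT date_trunc('minute', p.published) AS signup_minute,
       count(*) AS accounts_created
FROM local_user lu
INNER JOIN person p ON lu.person_id = p.id
GROUP BY signup_minute
HAVING count(*) >= 5          -- arbitrary threshold, tune for your instance
ORDER BY accounts_created DESC;
```

Minute-sized buckets are a coarse stand-in for "within seconds of each other", so a batch that straddles a minute boundary gets split across two rows. Treat the output as a list of candidates to inspect, not as proof.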
I used a tool to see how many users my site had. Once I saw the count was larger than expected, I wondered who these users were. I checked the database table and saw a huge list. I know for a fact that not all of these users are on my instance. I was able to confirm that the database includes their email addresses and password hashes. This SHOULD mean that if someone tries to log in, and their authentication information is sitting in my database, they can log in at my site locally, correct? I only ask because I did not find an entry anywhere that lists a “home” instance for them to log in to. Am I correct in understanding that accounts are distributed like communities are?
SELECT * FROM local_user; returns a list of users that includes a password_encrypted field. That list contains exactly the same accounts as what I get from:
SELECT p.name, p.display_name, a.person_id, a.email, a.email_verified, a.accepted_application FROM local_user a, person p WHERE a.person_id = p.id;
So I can see a person's a.email (email address), a.person_id, and their password_encrypted (hash) by correlating these tables, can I not?
These accounts are NOT ALL local to my server… So I MUST be being passed hashes, right?
Can you add
p.local=false
to that query?
I always assume I’m wrong first, so I may have put that in the wrong spot. Where should I put that in the query? I put it under the SELECT statement.
In the WHERE clause.
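Spelled out, the earlier query with the extra condition would look roughly like this. It uses only the tables and columns already mentioned in this thread, rewritten with an explicit join so the filter sits clearly in the WHERE clause.

```sql
-- Same columns as the earlier query, restricted to people flagged as NOT local.
SELECT p.name, p.display_name, a.person_id, a.email,
       a.email_verified, a.accepted_application
FROM local_user a
INNER JOIN person p ON a.person_id = p.id
WHERE p.local = false;   -- the filter goes here, after the join
```

If local_user only ever holds accounts registered on this instance, this should come back empty. Any rows it does return would be accounts that have login details here but are flagged as remote, which would be worth a closer look.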
I grabbed the first 292 names from that query; there are thousands of results. Then I compared that list to the output of the query that includes “password_encrypted.” There are matches. LOTS of matches. I’ll give you ONE person_id of a result that is in both lists: 38291
That person ID means nothing outside of your instance. The ID is a sequential number, so it's saying "the 38,291st person I've seen" and each instance's exact list of people will be different since they'll see different people in different orders based on what users subscribe to and when.
Here's a query that will tell you the instance of that person:
SELECT i.domain FROM person p INNER JOIN instance i ON p.instance_id=i.id WHERE p.id=38291;
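To check every account that has login details at once instead of one ID at a time, something along these lines might help. It is a sketch built only from the tables and columns already used in this thread, so adjust it if your schema differs.

```sql
-- For every account with a local_user row, show which instance it belongs to
-- and whether the person record is flagged as local.
SELECT i.domain, p.local, count(*) AS accounts
FROM local_user lu
INNER JOIN person p ON lu.person_id = p.id
INNER JOIN instance i ON p.instance_id = i.id
GROUP BY i.domain, p.local
ORDER BY accounts DESC;
```

If everything is as expected, the result should be a single row with your own domain and local set to true.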
I'm sorry, I included an edit on my original post. If I can do anything else to rectify the problem my assumption caused, let me know. Thank you for your help.