[API] Add Bot Activity metric API #3664
HimasreeKolathur24 wants to merge 1 commit into chaoss:main
Signed-off-by: HimasreeKolathur24 <himasrikolathur@gmail.com>
    SELECT
        SUM(
            CASE
                WHEN LOWER(cn.cntrb_login) LIKE '%bot%' THEN 1
I'd be curious whether this is the best way to do bot detection. I know 8Knot has filters for this, so I'd want to compare how that project does bot detection.
For this initial implementation, I used a simple heuristic (LOWER(cntrb_login) LIKE '%bot%') to align with a minimal MVP and patterns I’ve seen in other Augur metrics.
I agree this may not be the most robust approach. I’m not yet familiar with 8Knot’s bot filtering logic, but I’d be happy to review how bot detection is handled there and adjust this metric to better align with that approach if you think it would be preferable here.
Would you recommend:
- reusing a similar filter list / logic from 8Knot, or
- keeping this as a simple first pass and iterating in a follow-up?
Happy to update based on your guidance.
> patterns I’ve seen in other Augur metrics

Which Augur metrics were you looking at for this?
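To make the concern about the `LIKE '%bot%'` heuristic concrete, here is a minimal sketch comparing a plain substring check with a slightly stricter, anchored pattern. The pattern and the function names are hypothetical and chosen only for illustration; they are not taken from 8Knot or Augur. The one fact relied on is that GitHub App accounts use logins ending in `[bot]` (for example `dependabot[bot]`).

```python
import re

# Hypothetical pattern for illustration only; not 8Knot's or Augur's logic.
# Matches logins that end in "[bot]", end in "-bot"/"_bot", or start with "bot-"/"bot_".
BOT_LOGIN_PATTERN = re.compile(r"\[bot\]$|[-_]bot$|^bot[-_]", re.IGNORECASE)


def is_bot_substring(login: str) -> bool:
    """Python equivalent of LOWER(cntrb_login) LIKE '%bot%'."""
    return "bot" in login.lower()


def is_bot_anchored(login: str) -> bool:
    """Stricter check: 'bot' must appear as a delimited prefix or suffix."""
    return BOT_LOGIN_PATTERN.search(login) is not None


# Made-up example logins: two likely bots, three likely humans.
logins = ["dependabot[bot]", "codecov-bot", "abbott", "robotics-fan", "alice"]
substring_hits = [name for name in logins if is_bot_substring(name)]
anchored_hits = [name for name in logins if is_bot_anchored(name)]
```

With these example logins, the substring check also flags `abbott` and `robotics-fan`, while the anchored pattern flags only the first two. The anchored version can still miss bots with unconventional names, so a curated list may be needed on top of it.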
        """)
        params = {"repo_id": repo_id}

    else:
Are these queries mostly the same except for the final join and WHERE statements?
Can we maybe build these queries with SQLAlchemy or string concatenation so the parts that are the same between these two codepaths can be reused?
You’re right. The two queries share most of their structure, with the main differences being the final join and the WHERE clause depending on whether repo_id is provided.
Refactoring the shared parts makes sense. I can:
- extract the common SELECT / FROM portion into a base query, and
- append the repo-specific filtering conditionally (either via string composition or SQLAlchemy constructs).
I initially kept them separate for clarity, but I’m happy to refactor this to reduce duplication if that’s preferred here. Let me know if you have a style preference between SQLAlchemy-based composition vs string concatenation.
I personally prefer SQLAlchemy constructs, but I suspect other maintainers like the string style because they are used to writing queries manually.
Let's go with the SQLAlchemy style, since it's likely going to be easier to implement given the need to have part of the query be dynamically determined with an if statement (and less prone to bugs than trying to string-concatenate a query across multiple branches of the code).
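As a rough illustration of the SQLAlchemy-style composition discussed in this thread, the sketch below builds the shared SELECT once and varies only the join and WHERE clause. It assumes SQLAlchemy 1.4 or newer. The table definitions are hypothetical and heavily simplified (the real Augur schema differs), and both the function name `bot_activity_query` and the assumption that the `else` branch filters by repo group are illustrative guesses, not the PR's actual code.

```python
from sqlalchemy import (
    Column, Integer, MetaData, String, Table,
    case, create_engine, func, select,
)

metadata = MetaData()

# Hypothetical, simplified tables for illustration; not the real Augur schema.
contributors = Table(
    "contributors", metadata,
    Column("cntrb_id", Integer, primary_key=True),
    Column("cntrb_login", String),
)
commits = Table(
    "commits", metadata,
    Column("cmt_id", Integer, primary_key=True),
    Column("repo_id", Integer),
    Column("cntrb_id", Integer),
)
repo = Table(
    "repo", metadata,
    Column("repo_id", Integer, primary_key=True),
    Column("repo_group_id", Integer),
)


def bot_activity_query(repo_group_id, repo_id=None):
    """Build one SELECT; only the join and filter depend on repo_id."""
    # Same heuristic as in the PR: LOWER(cntrb_login) LIKE '%bot%'.
    is_bot = func.lower(contributors.c.cntrb_login).like("%bot%")

    # Shared FROM clause: commits joined to their contributors.
    source = commits.join(
        contributors, commits.c.cntrb_id == contributors.c.cntrb_id
    )

    if repo_id is not None:
        # Repo-level branch: filter directly on the commit's repo.
        condition = commits.c.repo_id == repo_id
    else:
        # Assumed group-level branch: one extra join, different filter.
        source = source.join(repo, repo.c.repo_id == commits.c.repo_id)
        condition = repo.c.repo_group_id == repo_group_id

    return (
        select(
            func.coalesce(func.sum(case((is_bot, 1), else_=0)), 0).label("bot_commits"),
            func.count().label("total_commits"),
        )
        .select_from(source)
        .where(condition)
    )
```

Because the SELECT list and the base join are written once, a change to the bot heuristic only has to be made in one place, and the `if` statement touches nothing but the final join and filter.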
Description
repo_meta.py and is scoped to commit activity for this first version.
This PR fixes #2594
Notes for Reviewers
bot), consistent with existing Augur practices.
Signed commits