cadamsdotcom 39 minutes ago
Ask your agent for ways to do this using code, not more AI.
It might propose - and build! - an embeddings-based system and scraper for your issues & PRs. Using that will burn zero tokens, and you can iterate on it as you think of improvements.
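For anyone curious what that looks like, here's a minimal sketch, assuming a local sentence-transformers model so nothing hits an LLM at all; the "owner/repo" name, the model choice, and the 0.8 cutoff are all placeholders to tune:

    # Sketch only: pull issues/PRs from a hypothetical repo and flag
    # near-duplicates with local embeddings, so no LLM tokens are spent.
    import requests
    from sentence_transformers import SentenceTransformer
    from sklearn.metrics.pairwise import cosine_similarity

    def fetch_issues(repo, pages=5):
        items = []
        for page in range(1, pages + 1):
            r = requests.get(
                f"https://api.github.com/repos/{repo}/issues",
                params={"state": "all", "per_page": 100, "page": page},
            )
            r.raise_for_status()
            batch = r.json()
            if not batch:
                break
            items.extend(batch)  # note: the issues endpoint returns PRs too
        return items

    issues = fetch_issues("owner/repo")  # placeholder repo
    texts = [i["title"] + "\n" + (i.get("body") or "") for i in issues]
    vecs = SentenceTransformer("all-MiniLM-L6-v2").encode(texts)
    sims = cosine_similarity(vecs)

    for a in range(len(issues)):
        for b in range(a + 1, len(issues)):
            if sims[a][b] > 0.8:  # arbitrary cutoff, tune by eyeballing pairs
                print(issues[a]["number"], "~", issues[b]["number"], round(float(sims[a][b]), 2))

Once the scraper half exists, re-running it after each triage pass is basically free, which is the "iterate on it" part.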
CodingJeebus an hour ago
Because there's no money in trying to filter out noise that costs next to nothing to generate. It's like asking why no startup is trying to bring forum moderation to the masses.
ranger_danger an hour ago
Why? There's no reason you need to actually handle that many in a day, right? Pace yourself.
ltbarcly3 an hour ago
Step 1: have it summarize every issue and PR in roughly 100 words. You can have it do this with subagents working on subsets of the tickets so it doesn't take forever.
Step 1a: concatenate all the summary files into one big file.
Step 2: have it check pairs from the summaries that look like duplicates. You may have to force it to read the entire file; for whatever reason, models are trained to avoid just reading things into their context and will reach for grep, write scripts, and whatever else instead.
Step 3: repeat the above until it stops finding dupes.
I think this will probably take about 4 hours: 2 hours to get the process working and 2 hours of letting it loop.
If you don't think the above will work well, please just move along; don't bother arguing with me, because I've done tasks like this over and over and it works great.
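If you'd rather run this as a batch script than drive an interactive agent, the same loop looks roughly like this (my own sketch, not the parent's exact setup): it assumes the tickets were already dumped to local files under issues/*.md, and the model name and prompts are placeholders. The subagent trick is just running the summary calls on chunks in parallel.

    import glob
    from openai import OpenAI

    client = OpenAI()

    def ask(prompt):
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    # Step 1 + 1a: ~100-word summary per ticket, concatenated into one big blob.
    summaries = []
    for path in sorted(glob.glob("issues/*.md")):
        text = open(path, encoding="utf-8").read()
        summary = ask("Summarize this issue in about 100 words:\n\n" + text)
        summaries.append(path + "\n" + summary)
    combined = "\n\n".join(summaries)

    # Step 2: hand over the whole summary file and ask for likely duplicate pairs.
    # Step 3: close the dupes, re-run, and stop when this comes back empty.
    print(ask("Here are short summaries of every issue and PR.\n"
              "List pairs that look like duplicates, with a one-line reason each.\n\n"
              + combined))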
Ways to get better results in general:
- Start by having it write a script to dump all the relevant information you will need up front (rough example below this list). It's much faster at reading files than at making MCP calls, and it's less likely to pretend it read something and then assume it found nothing (happens more than you think).
- Break the problem down into clear steps for the model; don't just give it a vague project. Just paste the steps above and it should work fine.
- Check what it is doing. Don't assume that because it says it read a file it actually read it; it will very often read the first 1,000 bytes, skip the rest, and then assume it read everything. In fact, ChatGPT will complain that the input is truncated when it is the one that chose to read only the first part.
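To make the first bullet concrete, a dump script can be as dumb as this sketch, which assumes the gh CLI is installed and authenticated in the repo; the issues/ directory, the 1000 limit, and the field list are arbitrary choices:

    # Write one markdown file per issue so the model only ever reads local files.
    import json
    import pathlib
    import subprocess

    out = pathlib.Path("issues")
    out.mkdir(exist_ok=True)

    raw = subprocess.run(
        ["gh", "issue", "list", "--state", "all", "--limit", "1000",
         "--json", "number,title,body,labels,state"],
        capture_output=True, text=True, check=True,
    ).stdout

    for issue in json.loads(raw):
        labels = ", ".join(lab["name"] for lab in issue["labels"])
        body = issue["body"] or ""
        text = (f"# {issue['title']} (#{issue['number']}, {issue['state']})\n"
                f"Labels: {labels}\n\n" + body + "\n")
        (out / f"{issue['number']}.md").write_text(text, encoding="utf-8")

    print("wrote", len(list(out.glob("*.md"))), "files to", out)

gh pr list takes the same --json flag if you want the PR side dumped separately.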