I had a simple goal: spend less time on repetitive pull request comments across multiple repos, and more time on actual design and engineering decisions. I thought that it might be possible to automate this.
Then I thought: wait, AI tools don't know what mistakes I make on a regular basis, or what my coding style actually is. They also don't know the organizational coding standards for each domain.
So I started experimenting with connecting AI tools (in my case, Cursor) to different setups, to see how I could stop making the same mistakes again and again.
What started as an experiment quickly became a repeatable workflow:
- connected Cursor to the GitHub CLI
- exposed my PR history from the last two years
- mined reviewer feedback for mistake patterns
- codified those patterns into reusable skills
- enforced them with global rules
- and finally, recorded decisions as a searchable memory bank
In this post, I will walk through the exact setup, why it works, and how you can adapt it for your own stack.
The Problem: “I Keep Repeating Similar Mistakes”
Most of us are not blocked by lack of effort. We are blocked by context switching and consistency drift.
I noticed my review comments clustered around predictable themes:
- in frontend changes: readability, naming, prop and state boundaries, and component responsibility
- in backend changes: naming (again!), error semantics, DTO validation, logging consistency, and separation of transport/business logic
- in tests: naming of tests (once again!), weak assertions, missing negative paths, and flaky setup/cleanup patterns
None of these were unknown to me. The real issue was this:
I was relying on memory under delivery pressure.
So I decided to externalize that memory.
Step 1: Let Cursor See the Right History
In my work setup, I granted Cursor command-line GitHub access and asked it to analyze:
- my PRs across repositories over the last two years
- my coding style in those PRs (naming, spacing, verbosity, etc.)
- my received code review comments
- recurring mistake categories and missing checks
This gave me a personalized quality profile instead of generic lint-like advice.
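Once the PR history and review comments are accessible (for example, via `gh pr list` and `gh api`), mining them for mistake patterns is mostly a tallying exercise. Here is a minimal sketch in Python; the keyword lists and sample comments are illustrative stand-ins for real exported data:

```python
from collections import Counter

# Illustrative keyword -> mistake-category map; tune it from your own review history.
CATEGORIES = {
    "naming": ["rename", "naming", "unclear name"],
    "error handling": ["throw", "error", "exception"],
    "validation": ["validate", "dto", "sanitize"],
    "testing": ["flaky", "negative path", "weak assertion"],
}

def categorize(comment: str) -> list[str]:
    """Return every category whose keywords appear in the comment text."""
    text = comment.lower()
    return [cat for cat, words in CATEGORIES.items()
            if any(w in text for w in words)]

def mine_feedback(comments: list[str]) -> Counter:
    """Tally recurring mistake categories across a batch of review comments."""
    tally: Counter = Counter()
    for comment in comments:
        tally.update(categorize(comment))
    return tally

# Sample data standing in for exported review comments.
sample = [
    "Please rename this variable, the current naming is confusing",
    "Missing validation on this DTO field",
    "This test looks flaky and has no negative path",
    "Consider a clearer error message when you throw here",
]

for category, count in mine_feedback(sample).most_common():
    print(f"{category}: {count}")
```

In practice I let Cursor do this analysis conversationally rather than with a script, but the script captures the idea: the output is a frequency profile of your own recurring mistakes, which is exactly the raw material for the skills in the next step.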
Step 2: Convert Patterns Into Stack-Specific Skills
I created three practical skills based on those findings:
- Frontend (React + TypeScript) skill
- Backend (NestJS + KoaJS) skill
- Testing (Playwright + Jest + RTL) skill
Each skill focused on “what to check” and “what to suggest” in a review, not just style policing.
Example shape of a review skill
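My actual skill files are specific to my codebases, but a minimal sketch (section names and checklist items are illustrative) looks roughly like this:

```markdown
# Skill: Backend Review (NestJS + Koa)

## What to check
- Naming: intention-revealing names for services, DTOs, and handlers
- Error semantics: no swallowed errors; consistent status-code mapping
- DTO validation: every inbound payload validated at the boundary
- Logging: structured logs with consistent levels and context
- Separation: transport concerns kept out of business logic

## What to suggest
- Extract parsing and status-code handling out of service methods
- Prefer early returns over deeply nested conditionals
```

The "what to suggest" section matters: it turns the skill from a linter-like gate into a reviewer that proposes concrete improvements.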
The goal was predictable review quality, even when my own attention was split.
Step 3: Add Global Rules for “Always” Behaviors
Skills are helpful when invoked at the right time. Rules are how you reduce reliance on manual prompting.
I added global rules to guide behavior such as:
- perform a standards-aware review when I ask “do a code review”
- identify stack from changed files
- apply relevant skill(s) automatically
This part feels small, but it is where workflow reliability comes from.
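As a sketch of how this can be wired up: Cursor supports project rules as files under `.cursor/rules/`, and the rule text below is illustrative rather than my exact wording:

```markdown
---
description: Standards-aware code review
alwaysApply: true
---

When I ask for a "code review":
1. Identify the stack from the changed files (frontend, backend, tests).
2. Load the matching review skill(s) for that stack.
3. Report findings against the skill checklists, not generic style advice.
```

Because the rule is always applied, I never have to remember to invoke the right skill manually; the trigger phrase is enough.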
Step 4: Build a Context Diary (The Real Multiplier)
The biggest improvement was adding a strict always rule for post-session journaling.
After each decision-heavy chat session, Cursor writes a diary entry in both Markdown and JSON formats.
Filename convention:
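My exact convention isn't reproduced here; a hypothetical scheme that sorts chronologically and stays greppable would be:

```
2025-06-12__auth-refactor__decision.md
2025-06-12__auth-refactor__decision.json
```

Any convention works as long as the date prefix and topic slug are consistent, since both humans and AI tools search by them.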
Storage (in my macOS setup):
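The real path is specific to my machine; a hypothetical layout:

```
~/dev/context-diary/
├── 2025-06-12__auth-refactor__decision.md
└── 2025-06-12__auth-refactor__decision.json
```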
This turns scattered chat history into an organized memory layer.
Example Markdown diary entry
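The original entry isn't shown here; a hypothetical entry, kept deliberately small, might look like:

```markdown
# 2025-06-12: auth-refactor

## Context
Moving token validation out of route handlers into middleware.

## Decision
Validate JWTs in a shared Koa middleware; handlers assume an authenticated context.

## Rationale
Removes duplicated checks flagged in past reviews; centralizes error semantics.

## Follow-ups
- Add negative-path tests for expired tokens
```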
Example JSON diary entry
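The JSON twin carries the same content in a machine-friendly shape; again, this is a hypothetical example with an illustrative schema:

```json
{
  "date": "2025-06-12",
  "topic": "auth-refactor",
  "decision": "Validate JWTs in a shared Koa middleware",
  "rationale": "Centralizes error semantics; removes duplicated checks",
  "tags": ["backend", "auth", "middleware"],
  "followUps": ["Add negative-path tests for expired tokens"]
}
```

Having both formats means I can skim the Markdown myself and let tools filter or aggregate the JSON.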
Now when I revisit a related change later, I am not guessing. I can look up the prior rationale quickly, both manually and with the help of Cursor or another AI tool.
Why This Workflow Works
The system combines three strengths:
- Personalized learning loop from historical PR feedback
- Standardized execution through reusable skills and always rules
- Long-term recall through structured context journaling
This is not about replacing engineering judgment. It is about preserving that judgment under time pressure and keeping delivery quality consistent.
Practical Suggestions If You Want to Try This
- start with one narrow skill and iterate from real review comments
- separate stack concerns (frontend/backend/tests) to keep prompts focused
- use rules for critical automation, not for every preference
- add all relevant coding conventions to these skills: general, project-specific, team-specific, etc.
- keep diary schema small at first, then extend only if retrieval needs it
- review and prune stale rules monthly
If you also care about clean coding conventions in React + TypeScript, the style principles in this post are a useful companion reference: Clean Code Principles & Code Conventions for React + TypeScript.
Closing Thought
I used to think "AI assistance" meant faster code generation, and that is still how most people I know use these tools. But their power goes well beyond that.
And what actually created leverage for me was different:
fewer repeated review mistakes + better decision memory.
One more takeaway: these skills are portable!
If you define them as clear checklists, review rubrics, and decision templates, you can adapt the same system across Cursor, Claude, Copilot, and other AI coding tools with minimal rewrites.
If you or your team spend too much time rediscovering known standards, this pattern is worth exploring.