I Use AI Agents Every Day. These Are the Security Rules I Actually Follow

I keep seeing people ask the wrong question about agent security.

They ask:

  • is Cursor safe?
  • is Codex safer?
  • should I trust Claude more?

That framing sounds reasonable, but it misses the real risk.

If you use AI agents every day, the important question is not just which model you picked.
It is what the agent can access.

Once an agent can read files, run shell commands, browse pages, or use connected apps, the security problem stops being “chatbot safety” and starts being access control.

TLDR

If you use AI agents every day, the safest setup is still boring:

  • keep secrets out of reach
  • keep permissions narrow
  • use repo rules like AGENTS.md
  • treat external content as untrusted
  • review risky actions before approval

The model vendor matters.
But the bigger everyday risk is usually the access you gave the agent.

The Moment The Frame Changed For Me

The mental shift was simple.

I stopped asking:

  • “Can this agent help me with the repo?”

and started asking:

  • “What exactly can this agent see, do, and send right now?”

That one question changes everything.

It makes you notice:

  • secret files sitting next to normal code
  • vague prompts like “check everything”
  • connected apps that did not need to be connected
  • approval dialogs you were about to click through too fast

That is the real workflow change with agents.

Why This Is Harder Than Normal Assistant Use

The old workflow was simple.

  • ask for help
  • copy the suggestion
  • apply it yourself

The new workflow is different.

  • give the agent the repo
  • let it inspect many files
  • let it run commands
  • sometimes let it browse the web or touch logged-in tools

This is why prompt injection matters so much.

OpenAI’s write-up on prompt injections makes the core problem explicit: once a model combines untrusted content with real tools and permissions, malicious instructions can become real actions.

That is not just a browser-agent problem.
OWASP’s AI Agent Security Cheat Sheet and LLM Prompt Injection Prevention Cheat Sheet describe the same pattern across files, fetched pages, connected tools, and output handling.

One Concrete Example

The clearest example I found comes from OpenAI’s ChatGPT agent documentation.

Imagine you ask an agent to help with something normal, like finding a restaurant based on your calendar and recent email.

While doing that, the agent encounters an instruction hidden in external content telling it to fetch a password reset code from your email and send it to an attacker's site.

That is the right way to think about agent risk.

The danger is not just “the model says something wrong.”
The danger is “the model sees untrusted input while holding useful access.”

The Four Layers I Actually Use

I think the topic gets easier when you reduce it to four layers.

1. Secrets

This is the first layer because it is the most practical one.

  • do not keep real secrets in normal repo context
  • use .env.example for setup docs
  • use placeholders like YOUR_API_KEY
  • keep credentials, keys, and private files out of easy reach

If the agent can reach the secret, your prompt rules are already weaker than you think.
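One cheap habit that supports this layer: before starting an agent session, verify that common secret files are actually git-ignored. A minimal sketch, assuming a git repo and a POSIX shell; the file list is an illustrative assumption, not exhaustive:

```shell
# Pre-flight check: warn if common secret files exist but are not git-ignored.
# Extend the list for your own repo; this is a sketch, not a scanner.
for f in .env .env.local secrets.json id_rsa; do
  if [ -e "$f" ] && ! git check-ignore -q "$f"; then
    echo "WARNING: $f exists and is not git-ignored"
  fi
done
```

`git check-ignore -q` exits 0 only when the path matches an ignore rule, so the warning fires exactly for files a commit (or an agent walking the tree) could pick up. It says nothing about tool-specific ignore files, which are a separate and softer layer.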

2. Scope

Be specific about what the agent may touch.

  • exact files
  • exact folders
  • exact commands
  • exact tools

Bad:

Check everything and fix whatever is needed.

Better:

Work only in this repository.
Edit only `src/auth.ts` and `README.md`.
Run tests, but stop before any network command, install command, or browser action and ask for approval.

Scope is a real control.
Open-ended prompts are not.

3. Permissions

Every agent tool has some safeguards, but that does not remove the need for permission boundaries.

Anthropic’s Claude Code security docs describe a permission-based model: writes are restricted and network requests require approval by default.

Cursor’s docs also matter here for a different reason.
Its Privacy FAQ says requests still go through Cursor’s backend even if you use your own API key.
Its Ignore Files docs say .cursorignore affects indexing, but it does not fully prevent chat or composer from using files in context.

Those details matter because they kill two common myths:

  • using your own API key does not mean the vendor never touches the request
  • an ignore file is not the same as a hard boundary

4. Review

This is the least glamorous layer and probably the most important one.

Review before approval:

  • shell commands
  • file diffs
  • outbound messages
  • actions on logged-in pages
  • anything that can move data outside your environment

If you would hesitate to give the same task to an intern with shell access and your browser already logged in, you should hesitate before giving it to an agent too.
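If you approve commands by hand anyway, it helps to make the pause explicit. A small sketch of a confirmation wrapper, assuming bash; `confirm` is a made-up helper for illustration, not a feature of any agent tool:

```shell
#!/usr/bin/env bash
# confirm: print the exact command, wait for an explicit "y", then run it.
# Any other answer aborts. A deliberate speed bump before risky actions.
confirm() {
  printf 'About to run: %s\n' "$*"
  read -r -p "Proceed? [y/N] " answer
  [ "$answer" = "y" ] && "$@"
}

# Example: gate a destructive local command behind the prompt.
confirm rm -rf ./build
```

The point is not the wrapper itself; it is that the command you are about to approve gets printed in full, once, before anything runs.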

Where AGENTS.md Fits

I like AGENTS.md.
I would use it.

It is a good place to define repo-level behavior rules like:

  • never read .env
  • never print environment variables
  • never echo tokens, cookies, or passwords
  • use placeholders in examples
  • stop and ask if secrets are required

That is useful.

But it is important to say what it is not.

AGENTS.md is not a sandbox.
It is not a filesystem boundary.
It is not data-loss prevention.

It is a behavior layer.
If the agent can still reach the secret, the stronger control is secret placement plus permissions plus review.

Here is the kind of snippet I would keep:

# AGENTS

## Secret Handling (Mandatory)

- Never read, open, cat, grep, or otherwise inspect any secret files.
- Forbidden paths include:
  - `.env`
  - `.env.*`
  - `**/.env`
  - `**/.env.*`
  - `*.pem`
  - `*.key`
  - `id_rsa*`
  - `id_ed25519*`
  - `secrets/**`
  - `credentials/**`
- Never print environment variables.
- Never echo, log, or copy token values, API keys, passwords, private keys, cookies, or session values.
- If a task appears to require secrets, stop and ask the user to provide sanitized placeholders instead.

## Safe Defaults

- Use placeholder values in examples (for example `YOUR_API_KEY`).
- Prefer `.env.example` for documentation and setup guidance.
- If a command might expose secrets in output, do not run it.

That is worth adding.
Just do not confuse it with hard isolation.

The Small Setup I Think Most People Need

You do not need an enterprise security program to work more safely with agents.

A practical daily setup is usually enough:

  1. Keep real secrets out of the repo and out of easy working context.
  2. Add AGENTS.md.
  3. Keep .env.example instead of real values in documentation.
  4. Use .gitignore and tool-specific ignore files where relevant.
  5. Start with narrow prompts and read-only work when possible.
  6. Require approval before network, install, browser, or connected-app actions.
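Steps 1, 3, and 4 above take about a minute to set up. A sketch, where the variable names in `.env.example` are placeholder assumptions for a typical web project:

```shell
# Keep real values out of docs: commit an example file with placeholders only.
cat > .env.example <<'EOF'
# Copy to .env and fill in real values locally. Never commit .env.
API_KEY=YOUR_API_KEY
DATABASE_URL=postgres://USER:PASSWORD@localhost:5432/app
EOF

# Keep the real secret files out of git entirely.
# The "!" line re-includes the placeholder file so it can be committed.
cat >> .gitignore <<'EOF'
.env
.env.*
!.env.example
*.pem
*.key
EOF
```

Note the ordering in `.gitignore`: the negation pattern `!.env.example` must come after `.env.*`, since later rules win.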

That setup does not remove all risk.
It just moves you away from the most avoidable mistakes.

The Main Thing I Would Not Pretend

I would not pretend prompt injection is solved.

OpenAI’s public safety material, Anthropic’s agent docs, and OWASP all point in the same direction:

  • untrusted content is still dangerous
  • connected tools raise the stakes
  • least privilege still matters
  • human confirmation still matters

That means the honest advice is also the less exciting advice.

Not:

  • choose the magic safe agent
  • write the perfect system prompt
  • enable one setting and relax

Instead:

  • reduce access
  • reduce secrets in reach
  • reduce unnecessary connections
  • review before approval

FAQ

Is one agent clearly safer than the others?

Not in the way people usually mean.

Products do differ in safeguards, but your bigger everyday variable is still what access you grant and how carefully you review actions.

Does AGENTS.md solve agent security?

No.

It helps as a rule layer, but it is not a hard technical boundary.

Is .cursorignore enough to protect sensitive files?

No.

Cursor’s docs say it affects indexing, not every possible way files can be used in chat or composer context.

Does using my own API key solve the privacy problem?

No.

Cursor’s docs say requests still route through Cursor’s backend even when you use your own API key.

What tasks are highest risk?

The highest-risk tasks are the ones that combine untrusted content with meaningful access:

  • email
  • browser sessions
  • connected apps
  • credentials
  • production systems
  • customer data

Takeaway

The safest way to use AI agents every day is not to look for the safest brand and stop thinking.

It is to keep asking a more useful question:

What can this agent touch right now?

If you get that question right, the rest of the security model becomes much simpler.