The plugin threat model

Malicious publishers & crypto-stealers

How it works

A plugin is published solely to steal: wallets, private keys, API tokens, and credentials are harvested the moment it activates, often from inside an otherwise functional tool.

In the wild

A "wallet helper" or popular-looking formatter that, on activation, scans the home directory and environment for keys and ships them to an attacker endpoint.

How we detect it

Every plugin is correlated against our threat intel and assigned a clean, suspicious, or malicious verdict; known-malicious publishers and payloads are blocked before activation.
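As a rough illustration of that verdict step, here is a minimal Python sketch. The intel sets, the `Plugin` fields, and the function names are all hypothetical stand-ins for illustration, not the product's actual data model or API.

```python
from typing import NamedTuple

# Hypothetical threat-intel data; a real feed would be far larger and updated continuously.
KNOWN_MALICIOUS_PUBLISHERS = {"evil-wallet-tools"}
KNOWN_MALICIOUS_HASHES = {"deadbeef"}
SUSPICIOUS_PUBLISHERS = {"new-unverified-dev"}


class Plugin(NamedTuple):
    name: str
    publisher: str
    content_hash: str


def verdict(plugin: Plugin) -> str:
    """Return 'malicious', 'suspicious', or 'clean' for one plugin."""
    if (plugin.publisher in KNOWN_MALICIOUS_PUBLISHERS
            or plugin.content_hash in KNOWN_MALICIOUS_HASHES):
        return "malicious"
    if plugin.publisher in SUSPICIOUS_PUBLISHERS:
        return "suspicious"
    return "clean"


def may_activate(plugin: Plugin) -> bool:
    """Block known-malicious plugins before they activate."""
    return verdict(plugin) != "malicious"
```

In this sketch only a malicious verdict blocks activation; whether a suspicious verdict should also block is a policy choice left open here.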

Rug-pulls

How it works

A plugin that was trusted at install time silently auto-updates to a malicious version. The name, publisher, and listing are unchanged, so no one re-evaluates it.

In the wild

A widely installed extension changes hands or is compromised, and a later version quietly adds new host permissions and an exfiltration call.

How we detect it

We track the content hash and declared permissions of every version. A changed hash or new permission request surfaces the drift and re-triggers a verdict before the update runs.
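The drift check described above can be sketched in a few lines of Python. This is an illustration under stated assumptions: SHA-256 as the content hash, permissions as plain strings, and hypothetical names (`VersionRecord`, `record_version`, `detect_drift`) that do not come from the product.

```python
import hashlib
from typing import FrozenSet, NamedTuple


class VersionRecord(NamedTuple):
    version: str
    content_hash: str
    permissions: FrozenSet[str]


def record_version(version: str, package_bytes: bytes, permissions) -> VersionRecord:
    """Snapshot one plugin version: SHA-256 of its contents plus declared permissions."""
    digest = hashlib.sha256(package_bytes).hexdigest()
    return VersionRecord(version, digest, frozenset(permissions))


def detect_drift(previous: VersionRecord, current: VersionRecord) -> dict:
    """Compare two snapshots and report whether a fresh verdict is needed."""
    hash_changed = current.content_hash != previous.content_hash
    new_permissions = sorted(current.permissions - previous.permissions)
    return {
        "hash_changed": hash_changed,
        "new_permissions": new_permissions,
        # Either signal alone is enough to re-trigger evaluation.
        "needs_reverdict": hash_changed or bool(new_permissions),
    }
```

Comparing an update against the previously recorded snapshot before it runs would then look like `detect_drift(old_record, new_record)`, with `needs_reverdict` gating the update.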

Typosquatting & impersonation

How it works

Look-alike names and spoofed publishers trick developers into installing the wrong thing — a one-character-off package or a clone of a popular extension.

In the wild

An npm package named one letter away from a popular library, or a "Prettier" extension from a publisher spelled prettir-team.

How we detect it

Name and publisher similarity to known-good packages is scored, and impersonation patterns are flagged as suspicious in the inventory.
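One simple way to score name similarity is a string-ratio comparison, sketched below with Python's standard `difflib`. The known-good list, the 0.85 threshold, and the function names are assumptions chosen for illustration; a production scorer would likely also weigh publisher identity, keyboard-adjacent typos, and homoglyphs.

```python
from difflib import SequenceMatcher
from typing import Optional

# Hypothetical known-good names and threshold, for illustration only.
KNOWN_GOOD = ["prettier", "eslint", "lodash"]
THRESHOLD = 0.85


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def likely_impersonates(name: str) -> Optional[str]:
    """Return the known-good name this one resembles, or None if no near match."""
    best_name, best_score = None, 0.0
    for known in KNOWN_GOOD:
        if name.lower() == known:
            return None  # an exact name match is not a look-alike name
        score = similarity(name, known)
        if score > best_score:
            best_name, best_score = known, score
    return best_name if best_score >= THRESHOLD else None
```

This sketch only compares names. An exact name from a spoofed publisher would need the same comparison applied to the publisher field.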

Over-privileged permissions

How it works

Scope creep — a plugin requests far more access than its function needs, giving it a foothold for later abuse even if it is benign today.

In the wild

A code formatter that suddenly requests network access, full filesystem read, and shell execution.

How we detect it

Declared permissions are captured per version and scored against the plugin's stated purpose; excessive or newly added scopes raise the risk score.
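A minimal sketch of purpose-based scoring follows. The permission names, category baseline, weights, and penalty are invented for illustration and are not the product's scoring model.

```python
from typing import Optional, Set

# Hypothetical baseline: what a plugin of a given category plausibly needs.
EXPECTED = {"formatter": {"workspace.read", "workspace.write"}}
# Hypothetical weights for high-impact scopes; anything else counts as 1.
WEIGHTS = {"network": 3, "filesystem.read_all": 3, "shell.exec": 5}
DEFAULT_WEIGHT = 1
NEW_SCOPE_PENALTY = 2  # extra weight for excessive scopes added since the last version


def risk_score(category: str, declared: Set[str],
               previous: Optional[Set[str]] = None) -> int:
    """Score declared permissions against the plugin's stated purpose."""
    excessive = declared - EXPECTED.get(category, set())
    score = sum(WEIGHTS.get(p, DEFAULT_WEIGHT) for p in excessive)
    if previous is not None:
        newly_added = excessive - previous
        score += NEW_SCOPE_PENALTY * len(newly_added)
    return score
```

With these assumed weights, a formatter that stays within its baseline scores 0, while one that adds network access, full filesystem read, and shell execution scores well above it.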

MCP tool poisoning & prompt injection

How it works

A malicious MCP server ships crafted tool descriptions or payloads that manipulate the AI agent into unintended actions — exfiltrating data, running commands, or chaining into other tools.

In the wild

An MCP "filesystem" helper whose tool description embeds instructions that hijack the agent the moment it is loaded.

How we detect it

MCP servers are first-class assets. We inventory each server's tools and endpoints and flag poisoned descriptions and suspicious tool surfaces.
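To make the idea of flagging poisoned descriptions concrete, here is a deliberately simple heuristic in Python. It assumes each tool is a dict with `name` and `description` keys, as in an MCP tool listing. The patterns are illustrative assumptions and would miss many real attacks, so treat this as a sketch of the shape of the check, not a working detector.

```python
import re
from typing import Dict, List

# Illustrative patterns only; real poisoning attempts are far more varied.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?(previous|prior)\s+instructions", re.I),
    re.compile(r"do\s+not\s+(tell|mention|inform)\s+the\s+user", re.I),
    re.compile(r"~/\.ssh|id_rsa|\.env\b", re.I),
]


def flag_tools(tools: List[dict]) -> Dict[str, List[str]]:
    """Map each suspicious tool name to the patterns its description matched."""
    findings: Dict[str, List[str]] = {}
    for tool in tools:
        description = tool.get("description", "")
        hits = [p.pattern for p in SUSPICIOUS_PATTERNS if p.search(description)]
        if hits:
            findings[tool["name"]] = hits
    return findings
```

A tool whose description tells the agent to ignore earlier instructions or to read key files would be flagged; a plain "Lists files in a directory." would not.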

Credential & token exfiltration

How it works

A plugin quietly ships secrets, session cookies, tokens, or source to an attacker-controlled endpoint — often piggybacking on legitimate-looking network calls.

In the wild

A browser extension that reads session cookies across every tab, or an IDE extension that uploads .env contents on save.

How we detect it

Exfiltration patterns and known bad endpoints feed the verdict; plugins exhibiting them are marked malicious and can be blocked or quarantined.
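The "known bad endpoints" part of this can be illustrated with a small host-matching sketch. The blocklist entry uses a reserved example domain, and the function names are hypothetical; how observed URLs are collected is outside the scope of this example.

```python
from typing import Iterable
from urllib.parse import urlparse

# Hypothetical blocklist entry on a reserved example domain.
KNOWN_BAD_HOSTS = {"collector.example.net"}


def is_bad_endpoint(url: str) -> bool:
    """True if the URL's host is a known bad host or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == bad or host.endswith("." + bad) for bad in KNOWN_BAD_HOSTS)


def exfil_verdict(observed_urls: Iterable[str]) -> str:
    """Mark a plugin malicious if any observed destination is a known bad endpoint."""
    return "malicious" if any(is_bad_endpoint(u) for u in observed_urls) else "clean"
```

Matching on the parsed hostname, with an explicit dot before the suffix, avoids false hits on hosts that merely end with the same characters.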

Unknown / unvetted MCP servers

How it works

Shadow MCP configs are added to agents with no review. Each one adds a local process with tool access, and no security team is in the loop.

In the wild

A developer wires a random MCP server from a gist into Claude Code or Codex to save time; nobody else knows it exists.

How we detect it

We surface every MCP server and skill registered across your agents — registry-installed or hand-wired — so unvetted ones can no longer hide.
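A sketch of what such an inventory pass might look like for one config file follows. It assumes a JSON config with a top-level `mcpServers` object, a layout used by several MCP clients; file locations and key names vary by agent, and the vetted list and function names here are hypothetical.

```python
import json
from typing import List

# Hypothetical allowlist of servers the security team has reviewed.
VETTED = {"github"}


def inventory(config_text: str, source: str) -> List[dict]:
    """List every MCP server declared in one JSON config, noting where it came from."""
    config = json.loads(config_text)
    servers = config.get("mcpServers", {})  # assumed key; differs between agents
    return [
        {
            "name": name,
            "source": source,
            "command": spec.get("command"),
            "args": spec.get("args", []),
            "vetted": name in VETTED,
        }
        for name, spec in sorted(servers.items())
    ]


def unvetted(rows: List[dict]) -> List[str]:
    """Names of servers that nobody has reviewed yet."""
    return [row["name"] for row in rows if not row["vetted"]]
```

Running `inventory` over each agent's config and merging the rows would give one list in which a hand-wired server appears alongside registry-installed ones.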

FAQ

Common questions.

What is a rug-pull in the plugin context?

A rug-pull is when a plugin that was trusted at install time silently auto-updates to a malicious version. The publisher, name, and listing stay the same, so the user never re-evaluates it. PluginSec tracks the content hash and permissions of every version and flags the drift before the new version runs.

What is MCP tool poisoning?

MCP tool poisoning is when a malicious MCP server ships crafted tool descriptions or payloads that manipulate the AI agent into taking unintended actions — exfiltrating data, running commands, or calling other tools. It is a form of prompt injection delivered through the tool layer.

Do you rely on antivirus signatures?

No. Generic AV signatures miss plugin-layer techniques. PluginSec models the attack techniques specific to plugins — rug-pulls, typosquatting, permission scope creep, tool poisoning, and exfiltration — and correlates plugin metadata against dedicated threat intel to produce a verdict.

How fast do you catch a newly-malicious update?

The endpoint agent continuously inventories installed plugins, so a changed content hash or a new permission request is detected as soon as the update lands on the endpoint. The update is then re-evaluated against intel before it is allowed to run.

Can you stop a threat before it runs?

Yes. PluginSec enforces your allowlist on the endpoint (block, quarantine, or warn), so a plugin with a malicious or suspicious verdict can be stopped before it executes, not just reported after the fact.

A threat model built for the plugin era.

Get ahead of the plugin threat model.