AI quoting for MSPs without vendor lock-in
Most SaaS tools that ship AI features lock you into their LLM and stack a markup on every token. MSPercury's AI runs on your own API key — Anthropic, OpenAI, or any OpenAI-compatible endpoint. Here's why that matters for an MSP.
If your MSP backend tool is going to call an LLM on your behalf, two questions matter:
- Whose key is it calling on?
- What does that cost you per token?
For most “AI-enabled” SaaS in this market, the answer is: their key, with a markup baked into the seat price. You don’t see the token bill; you see a “20% AI surcharge” on the invoice. Switching providers, say from a US-hosted model to one in Frankfurt for compliance, isn’t on the table.
We took the opposite route.
Bring your own model
Every AI feature in MSPercury — and there are several — calls your Anthropic, OpenAI, or OpenAI-compatible endpoint. You drop a key into Workspace Settings, pick a model, and from then on:
- You pay the provider directly. No markup on our side.
- You pick the model. Want Claude Sonnet for executive summaries and a cheap local model via Ollama for catalog matching? Configure both.
- You can swap providers any time without losing your data. The prompts are documented; the tool just routes.
- A self-hosted LLM with an OpenAI-compatible API (vLLM, LiteLLM, your own proxy) works the same as the hosted ones; see the sketch below.
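To make “OpenAI-compatible” concrete: the same client code can talk to a hosted provider or to your own box just by swapping the base URL. Here’s a minimal sketch in Python — the wiring below is illustrative, not MSPercury’s actual code, and the model names and notes string are made up:

```python
# Illustrative only: one OpenAI-style client, many providers, swapped by URL.
from openai import OpenAI

# Hosted provider: your key, billed to your account, no middleman markup.
hosted = OpenAI(api_key="sk-...")

# Self-hosted via Ollama (a vLLM server or LiteLLM proxy works the same way):
local = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="ollama",                      # placeholder; local servers often ignore it
)

def summarize(client: OpenAI, model: str, notes: str) -> str:
    """Same prompt, any OpenAI-compatible backend."""
    resp = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system",
             "content": "Summarize for an executive audience in 2-4 sentences."},
            {"role": "user", "content": notes},
        ],
    )
    return resp.choices[0].message.content

raw_notes = "Backups OK, two servers unpatched, no MFA on the VPN."

# Route per feature: a capable hosted model for customer-facing text,
# a cheap local one for internal matching.
exec_summary = summarize(hosted, "gpt-4o", raw_notes)
rough_match = summarize(local, "llama3.1", raw_notes)
```

Swapping providers is a config change, not a migration.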
If you don’t configure a model, MSPercury runs in its classic mode. Every non-AI feature stays intact.
What the AI actually does
Concretely, today:
- Executive summaries for CheckUp PDF reports (2–4 sentences, no ChatGPT clichés).
- Quote drafts from a public CheckUp’s answers — 3 to 6 services from your catalog with quantities and a tailored summary, dropped straight into the normal draft-quote workflow (sketched below).
- First-contact emails to inbound leads, anchored on their weakest CheckUp categories.
- Catalog matcher that links findings to your service items so the “generate quote from CheckUp” CTA isn’t gated on manual mapping.
- Status-update structurer — three hectic bullet points become a polite customer-facing post with the right category enum.
- Service-report builder that turns “did / noticed / recommended” into clean field-report prose.
- Project-task generator that converts findings into a prioritized plan with hours and rationale.
- Quote-reply drafter — suggests the next operator reply in a quote thread, pre-filled in the composer for review.
None of those auto-send anything. The operator is always the one who clicks the button that puts text in front of the customer. The audit trail (who/when) stays clean.
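What “suggest, don’t send” looks like in practice: a minimal sketch, again illustrative rather than MSPercury’s implementation, of how a quote draft can be pinned to the catalog so the model can’t invent line items. The catalog IDs, field names, and model name here are all made up:

```python
# Illustrative sketch of a catalog-constrained quote draft. Not production code.
import json
from openai import OpenAI

client = OpenAI()  # reads your own key from OPENAI_API_KEY

CATALOG = {
    "svc-backup": "Managed backup & restore",
    "svc-mfa": "MFA rollout",
    "svc-patch": "Patch management",
    "svc-edr": "Endpoint detection & response",
}

def draft_quote(checkup_answers: str) -> dict:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # whatever you configured in Workspace Settings
        response_format={"type": "json_object"},  # JSON mode: reply must parse
        messages=[
            {"role": "system", "content": (
                "Pick 3-6 services for this prospect. Respond as JSON: "
                '{"summary": str, "items": [{"service_id": str, "qty": int}]}. '
                "service_id must be one of: " + ", ".join(CATALOG) + "."
            )},
            {"role": "user", "content": checkup_answers},
        ],
    )
    draft = json.loads(resp.choices[0].message.content)
    # Reject anything outside the catalog before it reaches the draft-quote
    # workflow: the model suggests, the validation and the operator decide.
    assert all(item["service_id"] in CATALOG for item in draft["items"])
    assert 3 <= len(draft["items"]) <= 6
    return draft
```

The operator still reviews the draft in the normal composer; the validation just guarantees that whatever lands there is drawn from the real catalog.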
Why this is the right call
A managed service provider sells trust. Trust is bilateral: your customers trust you, and you need to trust your stack. Putting an opaque LLM intermediary between your operator’s notes and your customer’s inbox is a trust hand-off most MSPs don’t want to make blindly. Letting you choose the provider, see the prompts, and pay the bill directly is the cleanest way to keep that hand-off honest.
For MSPs in regulated markets — Germany, Spain, anywhere with a strict residency requirement — the same architecture lets you point the routing at an EU-hosted provider, or your own GPU. No “AI feature gated to a US-only model” friction.
What’s next
MSPercury is now open for signups. The full feature set is documented at mspercury.com. The Partner Network and the cross-tenant Marketplace land as Pro features after the public beta; for now they’re open to every workspace so we can dogfood them on real partner relationships.
If you run an MSP and you want to try the quoting + audit + customer-portal stack with your own LLM, you can sign up here. Feedback to info@it-flores.de is read by an actual human within 24h.