How to Create an MCP Server from Any Data Source
Most MCP server generators assume you have an OpenAPI spec or want to write TypeScript. But what if your data lives in a GitHub repository, a documentation site, or a collection of markdown files? What if you just want to search your content without building anything from scratch?
vecr.io generates MCP servers from any data source and makes them instantly searchable from Cursor, Claude Desktop, or any MCP-compatible tool. No API specs, no boilerplate, no deployment pipeline.
This guide walks through the entire process, from raw data to a working MCP server, in under five minutes.
What You Get
When you create an MCP server on vecr.io, you get:
- Semantic search over your content using vector embeddings (BGE or Cohere models)
- An SSE-based MCP endpoint compatible with any MCP client
- Token-based authentication with scoped access control
- A built-in playground to test tool calls and inspect responses
The server exposes MCP tools that let AI assistants search and retrieve relevant content from your data. This means Cursor can pull context from your internal docs, Claude Desktop can reference your knowledge base, and any MCP-compatible agent can access your data programmatically.
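As an illustration, an MCP server describes each tool to clients with a JSON definition like the one below. The shape (name, description, inputSchema) comes from the MCP specification; the tool name and parameters here are hypothetical stand-ins, since the exact definitions come from your integration:

```json
{
  "name": "search_knowledge_base",
  "description": "Semantic search over the indexed content",
  "inputSchema": {
    "type": "object",
    "properties": {
      "query": { "type": "string", "description": "Natural-language search query" },
      "limit": { "type": "number", "description": "Maximum number of results" }
    },
    "required": ["query"]
  }
}
```

An MCP client reads these definitions at connect time and decides on its own when to call each tool during a conversation.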
Supported Data Sources
vecr.io supports three types of input:
- GitHub repositories - Index source code, READMEs, documentation, issues
- Websites - Crawl and index any publicly accessible site
- Documents - Upload markdown, text, PDFs, or other files directly
You can mix sources within a single integration, building a unified search layer across your scattered content.
Step 1: Create an Integration
After signing in at vecr.io, navigate to the integration creation page. You need two things:
Name - Something descriptive like "Engineering Docs" or "Product Knowledge Base"
Embedding Model - Choose based on your use case:
| Model | Best For | Languages |
|---|---|---|
| BGE-base (768d) | General English text, code | Primarily English |
| BGE-M3 (1024d) | Multi-language content, long documents | 100+ languages |
| Cohere Embed 4 (1536d) | Mixed content: text, images, tables, code | Multilingual, multimodal |
For most developer use cases, BGE-M3 hits the sweet spot between quality and speed. If your content includes images or diagrams, go with Cohere Embed 4.
Step 2: Add Your Data
Once the integration is created, you land on the files dashboard. Upload your content here:
- Paste URLs to GitHub repos or websites for automatic crawling
- Upload files directly (markdown, text, PDF)
- Use the API for programmatic ingestion
vecr.io chunks your content, generates embeddings, and indexes everything for semantic search. Processing time depends on volume, but most datasets under a few hundred pages finish in seconds.
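Under the hood, semantic search compares embedding vectors with a similarity metric rather than matching keywords: the chunk whose embedding points in the most similar direction to the query embedding ranks first. A minimal sketch of the idea, using toy 3-dimensional vectors in place of real 768- or 1024-dimensional embeddings:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of the
    # vector magnitudes; 1.0 means the vectors point the same way.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_chunks(query_vec, chunk_vecs):
    # Score every indexed chunk against the query embedding and
    # return (score, chunk_index) pairs, most similar first.
    scored = [(cosine_similarity(query_vec, v), i) for i, v in enumerate(chunk_vecs)]
    return sorted(scored, reverse=True)

# Toy data: chunk 0 is nearly parallel to the query, so it ranks first.
query = [1.0, 0.0, 0.0]
chunks = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.1, 0.2, 0.9]]
ranked = rank_chunks(query, chunks)
```

Real systems use approximate nearest-neighbor indexes instead of a linear scan, but the ranking principle is the same.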
Step 3: Generate an API Token
Navigate to the MCP tab in your integration dashboard. Create a new token:
- Click the + button next to the token input
- Give the token a name (e.g., "Cursor Access")
- Set an expiration date
- Click Create Token
The token is automatically filled into the token field. Keep it handy; you will need it to configure your MCP client.
Step 4: Connect from Your MCP Client
Cursor
Add the MCP server to your Cursor configuration. In your project's .cursor/mcp.json (or global MCP settings):
```json
{
  "mcpServers": {
    "my-knowledge-base": {
      "url": "https://mcp.vecr.io/sse/YOUR_INTEGRATION_ID",
      "headers": {
        "Authorization": "Bearer YOUR_TOKEN"
      }
    }
  }
}
```
Replace YOUR_INTEGRATION_ID with your integration UUID and YOUR_TOKEN with the token from Step 3. Cursor will connect automatically and expose the search tools in your AI assistant context.
Claude Desktop
In your Claude Desktop MCP configuration file:
```json
{
  "mcpServers": {
    "my-knowledge-base": {
      "url": "https://mcp.vecr.io/sse/YOUR_INTEGRATION_ID",
      "headers": {
        "Authorization": "Bearer YOUR_TOKEN"
      }
    }
  }
}
```
After restarting Claude Desktop, you can ask Claude to search your indexed content directly.
Any MCP Client
vecr.io uses standard SSE transport. Any client implementing the MCP specification can connect to:
- Endpoint: https://mcp.vecr.io/sse/{integrationId}
- Auth: Bearer token in the Authorization header
- Transport: SSE (Server-Sent Events)
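Over that transport, clients and servers exchange JSON-RPC 2.0 messages as defined by the MCP specification. As a rough sketch of what a client sends, here are the initialize handshake and a tool call; the envelope shapes follow the spec, while the tool name and arguments are hypothetical examples:

```python
import json
from itertools import count

# JSON-RPC request ids must be unique per connection.
_ids = count(1)

def jsonrpc_request(method, params):
    # Build a JSON-RPC 2.0 request envelope with an auto-incremented id.
    return {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}

# First message after connecting: the initialize handshake.
init = jsonrpc_request("initialize", {
    "protocolVersion": "2024-11-05",
    "capabilities": {},
    "clientInfo": {"name": "example-client", "version": "0.1.0"},
})

# Invoking a (hypothetical) search tool exposed by the server.
call = jsonrpc_request("tools/call", {
    "name": "search_knowledge_base",
    "arguments": {"query": "how do I rotate API tokens?"},
})

print(json.dumps(call, indent=2))
```

In practice an MCP SDK handles this framing for you; the sketch only shows what travels over the wire.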
Step 5: Test in the Playground
Before integrating into your workflow, test everything in the built-in playground on the MCP page:
- Click Connect to establish a live connection
- Click List Tools to see available search tools
- Select a tool, fill in a query, and click Call
- Inspect the response in the JSON viewer
The playground also shows connection history, server notifications, and round-trip latency. Use the Ping tab to verify connectivity.
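A successful tool call comes back as a JSON-RPC result whose content structure is defined by the MCP specification; the search hit below is an invented placeholder to show the shape you will see in the JSON viewer:

```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "result": {
    "content": [
      {
        "type": "text",
        "text": "Top match: docs/tokens.md — tokens can be rotated from the MCP tab."
      }
    ],
    "isError": false
  }
}
```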
How It Compares to Building Your Own
Building an MCP server from scratch typically requires:
- Writing a TypeScript/Python server with the MCP SDK
- Implementing tool handlers for each capability
- Setting up vector storage (Pinecone, Weaviate, FAISS, etc.)
- Building an embedding pipeline
- Deploying and managing the server infrastructure
- Implementing authentication and access control
That is a multi-week project for a production-grade setup. vecr.io collapses this into a managed service where you upload data and get a working MCP endpoint.
Trade-offs to consider: A custom server gives you full control over tool definitions, response formatting, and infrastructure. vecr.io gives you fast setup and managed search with less customization. For most "search my content" use cases, the managed approach saves significant engineering time.
How It Compares to Other MCP Generators
Most MCP server generators (Stainless, Speakeasy, liblab) convert OpenAPI specifications into MCP servers. They are excellent if you already have a well-documented REST API.
vecr.io takes a different approach: instead of wrapping an existing API, it creates a search layer over raw content. This means you can generate an MCP server from a GitHub repo, a documentation site, or uploaded files, none of which have an OpenAPI spec.
| Approach | Input | Output | Best For |
|---|---|---|---|
| OpenAPI generators | API specification | MCP tools mirroring API endpoints | Existing REST APIs |
| vecr.io | Any data (repos, sites, docs) | MCP search tools over indexed content | Knowledge bases, docs, codebases |
| Custom build | Your code | Whatever you build | Full control, custom logic |
Use Cases
Developer documentation - Index your internal docs and search them from Cursor while coding. No more switching to a browser to look up an API.
Codebase search - Point vecr.io at a GitHub repo and get semantic code search in your AI assistant. Find relevant functions by describing what you need, not memorizing file paths.
Knowledge bases - Combine multiple data sources (wiki, docs, READMEs) into a single searchable MCP server. Give your team AI-powered access to institutional knowledge.
Customer support content - Index help articles and FAQs. Build support tools that retrieve accurate answers from your actual documentation.
Getting Started
Create your first MCP server at vecr.io. The free tier includes 500 API calls per month and up to 10 pages of indexed content, enough to test the full workflow and validate the approach for your use case.
For larger datasets or production workloads, reach out for usage-based pricing that scales with your needs.
Related Articles
- What is an MCP Server? Everything You Need to Know — Understand the fundamentals of Model Context Protocol
- Cursor MCP: The Complete Guide — Step-by-step setup for MCP servers in Cursor
- Vector Search — How vector embeddings power semantic search
- Knowledge Base Search — Building searchable knowledge bases
