Creating MCP Server for LM Studio and Open WebUI

Understanding the Three-Layer Architecture

When integrating AI applications with existing hardware or services, you don’t need to modify your device code. Instead, you introduce an MCP Server as middleware—a translator between the AI world and your real-world devices. 🤖

The Architecture

1. AI Frontend (LM Studio / Open WebUI) – Knows how to talk to MCP servers
2. MCP Server (Your custom Python/Node script) – The middleman that bridges AI and devices
3. Your Device (Existing service) – Unchanged, continues working as before

Real-World Example: Water Pump Controller

Imagine you have a water pump controller running at http://127.0.0.1:9090 that returns “Alive” when queried.

Your Goal: Type in LM Studio “Check if my water pump controller is running!” and have it automatically query your pump without modifying any device code.
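To follow the rest of this walkthrough without real hardware, a stand-in for the controller can be sketched with Python's standard library. It assumes the device simply answers any GET request with the plain text "Alive"; the names FakePumpHandler and make_fake_pump are made up for illustration and are not part of any real pump firmware.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class FakePumpHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the pump controller: every GET returns 'Alive'."""

    def do_GET(self):
        body = b"Alive"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Suppress the default per-request log lines.
        pass


def make_fake_pump(host="127.0.0.1", port=9090):
    """Build (but do not start) the stand-in server on host:port."""
    return HTTPServer((host, port), FakePumpHandler)
```

Calling make_fake_pump().serve_forever() in a separate terminal keeps the stand-in running on port 9090 until you stop it with Ctrl+C.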

The MCP Server Script

Here’s how you write the middleware in Python:

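from mcp.server.fastmcp import FastMCP
import requests

# The name identifies this server to MCP clients.
mcp = FastMCP("WaterPumpTools")

@mcp.tool()
def check_pump_status() -> str:
    """Checks if the water pump controller is currently running and active."""
    try:
        # Query the unmodified device over plain HTTP.
        response = requests.get("http://127.0.0.1:9090", timeout=5)
        return f"The pump responded with: {response.text}"
    except requests.exceptions.RequestException:
        return "The pump controller is offline or unreachable."

if __name__ == "__main__":
    # Serve over stdio (the default), which is how LM Studio launches local scripts.
    mcp.run()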

The Interaction Flow

1. You type in LM Studio: “Check if my water pump controller is running!”

2. LM Studio passes your prompt to the model together with the tool list it discovered from your MCP server, and the model picks check_pump_status

3. LM Studio delegates to MCP Server: “I need to check the pump status. Please execute this tool.”

4. MCP Server executes: Performs requests.get("http://127.0.0.1:9090", timeout=5) behind the scenes

5. Your pump responds: Returns “Alive” (unchanged behavior)

6. MCP Server responds to LM Studio: “The pump responded with: Alive”

7. LM Studio translates: Converts raw data to natural language: “Yes, I checked your controller and it is fine and alive!”
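Steps 3 to 6 above travel as JSON-RPC 2.0 messages. The following is a simplified sketch of the tools/call request and its response. It leaves out the initialization handshake, tools/list, and transport framing that the SDK handles for you. The dispatcher handle_tools_call and the stand-in fake_check_pump_status are hypothetical names used only to make the sketch runnable without a pump.

```python
import json

# Step 3: the frontend asks the server to run a tool.
call_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "check_pump_status", "arguments": {}},
}


def handle_tools_call(request, tools):
    """Toy dispatcher: look up the named tool, run it, wrap the text result."""
    params = request["params"]
    tool = tools[params["name"]]
    text = tool(**params.get("arguments", {}))
    return {
        "jsonrpc": "2.0",
        "id": request["id"],  # the response echoes the request id
        "result": {"content": [{"type": "text", "text": text}], "isError": False},
    }


def fake_check_pump_status():
    # Stand-in for the real tool so the sketch runs without a pump or network.
    return "The pump responded with: Alive"


# Steps 4 to 6: the server runs the tool and sends the wrapped result back.
response = handle_tools_call(call_request, {"check_pump_status": fake_check_pump_status})
print(json.dumps(response, indent=2))
```

The model only ever sees the text inside result.content, which is why the tool returns a readable sentence rather than a raw status code.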

💡 Key Benefit: Your hardware controllers stay simple and pure—no complex AI protocols needed. The MCP server handles all the translation.

Configuring LM Studio

Step 1: Open the Configuration File

• Open LM Studio
• Click the Plug icon in the right sidebar (or go to Program tab)
• Click Install ➡️ Edit mcp.json

Step 2: Add Your MCP Server

Add your script to the mcpServers object:

{
  "mcpServers": {
    "water-pump-controller": {
      "command": "python3",
      "args": ["/absolute/path/to/your/pump_script.py"]
    }
  }
}

Replace /absolute/path/to/your/pump_script.py with the actual location of your Python file.
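If you are unsure of the absolute path, or your dependencies live in a virtual environment, a small helper can print a ready-to-paste fragment. This is a sketch only: build_mcp_entry is a hypothetical helper, it does not touch LM Studio's own file, and it assumes you run it with the same Python interpreter that has mcp and requests installed.

```python
import json
import sys
from pathlib import Path


def build_mcp_entry(script_path, server_name="water-pump-controller"):
    """Return an mcpServers fragment that launches script_path with this interpreter."""
    return {
        "mcpServers": {
            server_name: {
                # sys.executable is the interpreter running this helper, so the
                # server starts with the same environment and installed packages.
                "command": sys.executable,
                # resolve() turns a relative path into an absolute one.
                "args": [str(Path(script_path).resolve())],
            }
        }
    }


print(json.dumps(build_mcp_entry("pump_script.py"), indent=2))
```

Run it from the folder that contains pump_script.py and copy the printed JSON into mcp.json by hand.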

Step 3: Activate in Chat

• Save the configuration file
• Go back to the Chat view
• Click the Plug icon again
• You’ll see “water-pump-controller” (the key you used in mcp.json) listed as an available server
• Toggle it ON

Your local LLM now knows about check_pump_status and will use it when appropriate!

Configuring Open WebUI

Open WebUI connects to MCP servers over HTTP rather than launching a local script, and it usually runs inside Docker, so your MCP server must run as a background service that Open WebUI can reach over the network.

Step 1: Run Your Script as a Web Service

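Modify the bottom of your Python script to run as a persistent server. With the FastMCP class from the official mcp package (the import used above), the transport is named "streamable-http" and the port is a server setting; the standalone fastmcp package instead accepts mcp.run(transport="http", port=8001).

if __name__ == "__main__":
    # Listen on port 8001; the default endpoint path is /mcp.
    mcp.settings.port = 8001
    mcp.run(transport="streamable-http")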

Run it in your terminal:

python3 pump_script.py

It will stay active and listen on port 8001.
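Before moving on to Open WebUI, it can help to confirm that something is actually listening on that port. The check below is a minimal sketch using only the standard library; port_is_open is a hypothetical helper, and it only verifies that a TCP connection is accepted, not that the service speaks MCP correctly.

```python
import socket


def port_is_open(host="127.0.0.1", port=8001, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


print("MCP server reachable:", port_is_open())
```

If this prints False while the script is running, check that the port in the script matches the one you are testing.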

Step 2: Add to Open WebUI Admin Settings

• Open Open WebUI in your browser
• Log in as Administrator
• Navigate to ⚙️ Admin Settings ➡️ External Tools
• Click + to add a new server
• Set the configuration:

Type: MCP (Streamable HTTP) (not OpenAPI!)
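Server URL: http://localhost:8001/mcp if Open WebUI runs directly on the host, or http://host.docker.internal:8001/mcp if it runs in a container under Docker Desktop (inside a container, localhost refers to the container itself)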
Auth: None

• Click Save

Step 3: Enable for Your Model

• Go to the Models dashboard in Open WebUI
• Click your preferred model (e.g., Gemma 4, Qwen3)
• Scroll to the Tools section
• Find check_pump_status and toggle it ON

Now when you chat with that model, Open WebUI will automatically relay requests to your local MCP server!

Key Takeaways

No Device Modification: Your hardware stays unchanged—it’s still just returning data over HTTP

Clean Separation: MCP acts as a pure translation layer between AI and real-world systems

Scalable: Add multiple tools to your MCP server for different device queries

Local and Private: Everything runs on your machine with no cloud dependency
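The "Scalable" point can be sketched as one shared helper plus one small function per device query. Several things here are assumptions: the /flow endpoint and get_flow_rate are hypothetical, urllib stands in for requests so the sketch runs with the standard library alone, and the @mcp.tool() decorators are left off so it runs without the mcp package. In the real script each of the two functions would carry @mcp.tool().

```python
import urllib.error
import urllib.request

PUMP_URL = "http://127.0.0.1:9090"

# The device is local, so bypass any system-wide proxy settings.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def query_device(path="/", base_url=PUMP_URL, timeout=5):
    """GET base_url + path and return the body text, or None if the request fails."""
    try:
        with _opener.open(base_url + path, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except (urllib.error.URLError, OSError):
        return None


# base_url is a parameter only to make testing easy; a real tool would likely
# drop it so the model cannot point the request at another address.
def check_pump_status(base_url=PUMP_URL):
    """Checks if the water pump controller is currently running and active."""
    body = query_device("/", base_url)
    if body is None:
        return "The pump controller is offline or unreachable."
    return f"The pump responded with: {body}"


def get_flow_rate(base_url=PUMP_URL):
    """Reads the current flow rate (hypothetical /flow endpoint)."""
    body = query_device("/flow", base_url)
    if body is None:
        return "Could not read the flow rate."
    return f"Flow rate reported by the pump: {body}"
```

Keeping the HTTP call in one helper means each new tool is only a docstring and a few lines, and the docstring is what the model reads when deciding which tool to call.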

By using MCP as middleware, you gain the full power of AI-assisted device control while keeping your legacy systems and hardware controllers simple and independent. 🚀