Understanding the Three-Layer Architecture
When integrating AI applications with existing hardware or services, you don’t need to modify your device code. Instead, you introduce an MCP Server as middleware—a translator between the AI world and your real-world devices. 🤖
The Architecture
1. AI Frontend (LM Studio / Open WebUI) – Knows how to talk to MCP servers
2. MCP Server (Your custom Python/Node script) – The middleman that bridges AI and devices
3. Your Device (Existing service) – Unchanged, continues working as before
Real-World Example: Water Pump Controller
Imagine you have a water pump controller running at http://127.0.0.1:9090 that returns “Alive” when queried.
Your Goal: Type in LM Studio “Check if my water pump controller is running!” and have it automatically query your pump without modifying any device code.
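To try the rest of this walkthrough without real hardware, a stand-in for the controller can be sketched with Python's standard library alone. The names `FakePumpHandler`, `start_fake_pump`, and `ask_pump` are hypothetical and exist only for this illustration; a real controller would be whatever service you already run.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class FakePumpHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the pump controller: answers every GET with 'Alive'."""

    def do_GET(self):
        body = b"Alive"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress per-request logging to keep the console quiet


def start_fake_pump(host="127.0.0.1", port=9090):
    """Start the stand-in in a background thread and return the server object."""
    server = HTTPServer((host, port), FakePumpHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def ask_pump(host="127.0.0.1", port=9090):
    """Send one GET / to the controller and return the response body as text."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", "/")
        return conn.getresponse().read().decode()
    finally:
        conn.close()
```

Calling `start_fake_pump()` and keeping the process alive (for example with `input()`) gives you something on port 9090 for the MCP server to query; `ask_pump()` performs the same request by hand.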
The MCP Server Script
Here’s how you write the middleware in Python:
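```python
from mcp.server.fastmcp import FastMCP
import requests

# Name under which this server identifies itself to MCP clients
mcp = FastMCP("WaterPumpTools")


@mcp.tool()
def check_pump_status() -> str:
    """Checks if the water pump controller is currently running and active."""
    try:
        # Query the existing controller; the device itself needs no changes
        response = requests.get("http://127.0.0.1:9090", timeout=5)
        return f"The pump responded with: {response.text}"
    except requests.exceptions.RequestException:
        return "The pump controller is offline or unreachable."


if __name__ == "__main__":
    # The default transport is stdio, which is how LM Studio launches local servers
    mcp.run()
```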
The Interaction Flow
1. You type in LM Studio: “Check if my water pump controller is running!”
2. LM Studio passes your prompt to the model together with the tools advertised by your MCP server, and the model decides to call check_pump_status
3. LM Studio delegates to MCP Server: “I need to check the pump status. Please execute this tool.”
4. MCP Server executes: Performs requests.get("http://127.0.0.1:9090") behind the scenes
5. Your pump responds: Returns “Alive” (unchanged behavior)
6. MCP Server responds to LM Studio: “The pump responded with: Alive”
7. The model translates: LM Studio feeds the tool result back to the model, which turns the raw data into natural language: “Yes, I checked your controller and it is fine and alive!”
💡 Key Benefit: Your hardware controllers stay simple and pure—no complex AI protocols needed. The MCP server handles all the translation.
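Under the hood, the tool call and its answer in the flow above travel as JSON-RPC 2.0 messages. The sketch below builds a `tools/call` request and a matching result by hand, only to show their general shape. The helper names and the `id` value are illustrative assumptions; in practice the frontend and the MCP SDK construct these messages for you.

```python
import json


def make_tool_call(request_id, tool_name, arguments=None):
    """Build the request a client sends to ask the server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments or {}},
    }


def make_tool_result(request_id, text):
    """Build the reply the server sends back, carrying the tool's text output."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,  # must match the id of the request being answered
        "result": {
            "content": [{"type": "text", "text": text}],
            "isError": False,
        },
    }


request = make_tool_call(1, "check_pump_status")
response = make_tool_result(1, "The pump responded with: Alive")

print(json.dumps(request, indent=2))
print(json.dumps(response, indent=2))
```

The string returned by `check_pump_status` ends up in the `text` field of the result, which is what the model later rephrases for the user.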
Configuring LM Studio
Step 1: Open the Configuration File
• Open LM Studio
• Click the Plug icon in the right sidebar (or go to the Program tab)
• Click Install ➡️ Edit mcp.json
Step 2: Add Your MCP Server
Add your script to the mcpServers object:
```json
{
  "mcpServers": {
    "water-pump-controller": {
      "command": "python3",
      "args": ["/absolute/path/to/your/pump_script.py"]
    }
  }
}
```
Replace /absolute/path/to/your/pump_script.py with the actual location of your Python file.
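A wrong or relative script path is an easy mistake to make in this file. As a rough sanity check, the sketch below parses the configuration text and reports script paths that are relative or missing. `check_mcp_config` is a hypothetical helper, not part of LM Studio, and it only covers local-script entries like the one above.

```python
import json
import os


def check_mcp_config(config_text):
    """Return a list of human-readable problems found in an mcp.json document."""
    config = json.loads(config_text)  # raises ValueError on malformed JSON
    servers = config.get("mcpServers")
    if not isinstance(servers, dict) or not servers:
        return ['missing or empty "mcpServers" object']

    problems = []
    for name, entry in servers.items():
        if not isinstance(entry, dict):
            problems.append(f"{name}: entry is not a JSON object")
            continue
        # Only look at arguments that appear to be Python scripts
        for arg in entry.get("args", []):
            if not isinstance(arg, str) or not arg.endswith(".py"):
                continue
            if not os.path.isabs(arg):
                problems.append(f"{name}: script path is not absolute: {arg}")
            elif not os.path.isfile(arg):
                problems.append(f"{name}: script not found: {arg}")
    return problems
```

Reading your mcp.json into a string and passing it to `check_mcp_config` returns an empty list when nothing looks wrong.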
Step 3: Activate in Chat
• Save the configuration file
• Go back to the Chat view
• Click the Plug icon again
• You’ll see your server listed under the name you gave it in mcp.json (“water-pump-controller”)
• Toggle it ON
Your local LLM now knows about check_pump_status and will use it when appropriate!
Configuring Open WebUI
Open WebUI is usually run inside Docker and connects to MCP servers over HTTP, so your MCP server must be a running background service rather than a local script that the frontend launches.
Step 1: Run Your Script as a Web Service
Modify the bottom of your Python script to run as a persistent server:
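```python
if __name__ == "__main__":
    # With mcp.server.fastmcp, run() takes no port argument; set it on settings
    mcp.settings.port = 8001
    mcp.run(transport="streamable-http")
```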
Run it in your terminal:
```shell
python3 pump_script.py
```
It will stay active and listen on port 8001.
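Before moving on to the Open WebUI settings, it can help to confirm that something is actually accepting connections on that port. The following is a minimal sketch using a plain TCP connect; `is_listening` is a hypothetical helper, and a successful connect only shows that the port is open, not that the MCP endpoint behaves correctly.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts
        return False


if __name__ == "__main__":
    print("MCP server reachable:", is_listening("127.0.0.1", 8001))
```

If this reports that the server is unreachable while the script is running, check the port number and whether the server started without errors.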
Step 2: Add to Open WebUI Admin Settings
• Open Open WebUI in your browser
• Log in as Administrator
• Navigate to ⚙️ Admin Settings ➡️ External Tools
• Click + to add a new server
• Set the configuration:
Type: MCP (Streamable HTTP) (not OpenAPI!)
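Server URL: http://localhost:8001/mcp if Open WebUI runs directly on the host, or http://host.docker.internal:8001/mcp if it runs in a container under Docker Desktop (inside a container, localhost refers to the container itself)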
Auth: None
• Click Save
Step 3: Enable for Your Model
• Go to the Models dashboard in Open WebUI
• Click your preferred model (e.g., Gemma 4, Qwen3)
• Scroll to the Tools section
• Find check_pump_status and toggle it ON
Now when you chat with that model, Open WebUI will automatically relay requests to your local MCP server!
Key Takeaways
No Device Modification: Your hardware stays unchanged—it’s still just returning data over HTTP
Clean Separation: MCP acts as a pure translation layer between AI and real-world systems
Scalable: Add multiple tools to your MCP server for different device queries
Local and Private: Everything runs on your machine with no cloud dependency
By using MCP as middleware, you gain the full power of AI-assisted device control while keeping your legacy systems and hardware controllers simple and independent. 🚀