MCP server exposing local Ollama models to Claude Code through a LiteLLM proxy. Tools: query_local_model, review_code, summarize, generate_boilerplate, list_models. Deployed to the k8s ai-inference namespace via ArgoCD.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
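The description maps onto a small Python entrypoint. Below is a minimal sketch of what src/server.py could look like, assuming the official mcp Python SDK (FastMCP), httpx, and the OpenAI-compatible /v1/chat/completions and /v1/models routes that a LiteLLM proxy exposes. The tool names come from the description above; the server name, default model name, and helper structure are illustrative, and only two of the five tools are shown since the rest follow the same pattern.

import os

import httpx
from mcp.server.fastmcp import FastMCP

LITELLM_BASE_URL = os.environ.get("LITELLM_BASE_URL", "http://litellm.ai-inference.svc:4000")
REQUEST_TIMEOUT = float(os.environ.get("REQUEST_TIMEOUT", "120"))

# Illustrative server name; the port matches the Dockerfile's ENV/EXPOSE below.
mcp = FastMCP("ollama-bridge", port=int(os.environ.get("PORT", "8090")))


async def _chat(model: str, prompt: str) -> str:
    # Call the LiteLLM proxy's OpenAI-compatible chat-completions route.
    async with httpx.AsyncClient(timeout=REQUEST_TIMEOUT) as client:
        resp = await client.post(
            f"{LITELLM_BASE_URL}/v1/chat/completions",
            json={"model": model, "messages": [{"role": "user", "content": prompt}]},
        )
        resp.raise_for_status()
        return resp.json()["choices"][0]["message"]["content"]


@mcp.tool()
async def query_local_model(prompt: str, model: str = "ollama/llama3.1") -> str:
    """Send a free-form prompt to a local Ollama model via LiteLLM."""
    return await _chat(model, prompt)


@mcp.tool()
async def list_models() -> list[str]:
    """List the model names the LiteLLM proxy currently serves."""
    async with httpx.AsyncClient(timeout=REQUEST_TIMEOUT) as client:
        resp = await client.get(f"{LITELLM_BASE_URL}/v1/models")
        resp.raise_for_status()
        return [m["id"] for m in resp.json()["data"]]


if __name__ == "__main__":
    # An HTTP-based transport is assumed, since the container health-checks
    # over HTTP on the same port.
    mcp.run(transport="sse")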
Dockerfile · 21 lines · 475 B
FROM python:3.12-slim

WORKDIR /app

# Copy and install dependencies first so layer caching skips the pip
# step when only application code changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY src/ ./src/

ENV PYTHONUNBUFFERED=1 \
    PORT=8090 \
    LITELLM_BASE_URL=http://litellm.ai-inference.svc:4000 \
    REQUEST_TIMEOUT=120

EXPOSE 8090

# Probe /health with the stdlib so the slim image needs no curl or wget;
# --start-period gives the server 15s to boot before failures count.
HEALTHCHECK --interval=30s --timeout=10s --start-period=15s --retries=3 \
    CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8090/health')"

CMD ["python", "src/server.py"]
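The HEALTHCHECK above probes GET /health on port 8090, so src/server.py must answer that route alongside the MCP endpoints. A minimal sketch continuing the server above, assuming FastMCP's custom_route hook for mounting a plain Starlette route (the route body is illustrative):

from starlette.requests import Request
from starlette.responses import JSONResponse


@mcp.custom_route("/health", methods=["GET"])
async def health(request: Request) -> JSONResponse:
    # Pure liveness check: return 200 without calling through to LiteLLM,
    # so a busy or reloading backend model never fails the probe.
    return JSONResponse({"status": "ok"})

Keeping the probe independent of LiteLLM is deliberate: the health check should fail only when the bridge itself is unresponsive, not when a backend model is slow.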