
Local AI Server Deployment

Deploy a RegScale AI Inference Server locally to support RegML functionality. This option is for air-gapped or private infrastructure environments. Our SaaS platform includes this by default for cloud users.

System Requirements

Hardware Requirements

  • GPU: 12GB+ VRAM (16GB preferred)
  • RAM: 16–24GB
  • CPU: 4+ cores
  • Storage: 50GB free

Model Download

We recommend Microsoft’s Phi-4 as the language model, plus a sentence-transformer model (all-MiniLM-L6-v2) for embeddings.

Prerequisites

  • Python 3.9+ (required by current transformers releases)
  • Required packages:
pip install transformers sentence-transformers huggingface_hub torch

Download Models

python model_downloader.py "microsoft/Phi-4" "/path/to/save"
python model_downloader.py "sentence-transformers/all-MiniLM-L6-v2" "/path/to/save"
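The contents of model_downloader.py are not shown here; a minimal sketch of what such a script might do, using the Hugging Face Hub client (an assumption — the script RegScale provides may differ), looks like this:

```python
# Hypothetical sketch of model_downloader.py (assumption: RegScale's actual
# script may differ). Pulls a full model snapshot from the Hugging Face Hub
# into a per-model directory under the given save path.
import sys

def target_dir(repo_id: str, save_root: str) -> str:
    # "microsoft/Phi-4" under "/app/local_models" -> "/app/local_models/Phi-4"
    return f"{save_root.rstrip('/')}/{repo_id.split('/')[-1]}"

def download_model(repo_id: str, save_root: str) -> str:
    # Imported here so the path helper above works without huggingface_hub.
    from huggingface_hub import snapshot_download
    # snapshot_download fetches every repo file: weights, tokenizer, config.
    return snapshot_download(repo_id=repo_id, local_dir=target_dir(repo_id, save_root))

if __name__ == "__main__":
    if len(sys.argv) != 3:
        sys.exit("usage: model_downloader.py <repo_id> <save_path>")
    print(download_model(sys.argv[1], sys.argv[2]))
```

The per-model directory name matters: LANGUAGE_MODEL_NAME in ai.env below expects the model to live under LOCAL_MODELS_PATH by its short name (e.g. Phi-4).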

Setup Overview

Step 1: Generate API Key

Use the provided script from RegScale to generate your INFERENCE_API_KEY.
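If the provided script is unavailable, a key of equivalent strength can be generated with Python's standard secrets module (an assumption: the exact format RegScale's script produces is not documented here, but any high-entropy token works as a bearer key):

```python
# Hedged alternative to RegScale's key script: generate a cryptographically
# random token suitable for the INFERENCE_API_KEY variable.
import secrets

def generate_api_key(n_bytes: int = 32) -> str:
    # token_urlsafe yields a URL- and env-file-safe string
    # (~43 characters for 32 bytes of entropy).
    return secrets.token_urlsafe(n_bytes)
```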

Step 2: Configure Environment Variables

Edit the following files:

ai.env

LOGLEVEL=INFO
LOCAL_MODELS_PATH=/app/local_models
LANGUAGE_MODEL_NAME=Phi-4
INFERENCE_API_KEY=<Your_API_Key>

atlas.env (AI-related variables only)

regmlEnabledEnv=true
regmlModelSelector=AzureAI.OpenAI.gpt-4o
RegmlAiModelRegistry__Models__AzureAI.OpenAI.gpt-4o__regmlInferenceApiKey=<Your_API_Key>
RegmlAiModelRegistry__Models__AzureAI.OpenAI.gpt-4o__regmlInferenceEndpoint=http://reg-ai-inference:8000/regml/query/
CoreSettings__RegmlEmbeddingsApiUrl=http://reg-ai-inference:8000/regml/embeddings/
CoreSettings__RegmlEmbeddingsApiKey=<Your_API_Key>

Note: db.env remains unchanged.
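The same key must appear in ai.env and in all three API-key variables in atlas.env; a mismatch is a common cause of 401 errors. A small sanity-check sketch (assumes plain KEY=VALUE lines with no quoting or export statements):

```python
# Sketch: verify the inference API key matches across ai.env and atlas.env.
# Assumption: simple KEY=VALUE files; quoting or export syntax would need
# a fuller parser.
def parse_env(text: str) -> dict:
    vals = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, value = line.split("=", 1)
            vals[key.strip()] = value.strip()
    return vals

def keys_match(ai_env: str, atlas_env: str) -> bool:
    key = parse_env(ai_env).get("INFERENCE_API_KEY")
    return key is not None and all(
        v == key for k, v in parse_env(atlas_env).items()
        if k.endswith("InferenceApiKey") or k.endswith("EmbeddingsApiKey")
    )
```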


Step 3: Launch

Use the provided Docker Compose file:

sudo docker compose -f docker-compose-regscale-standalone.yml up -d

This will start RegScale, the AI Inference Server, and the SQL database.
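After the containers come up, it helps to confirm the inference server is reachable before testing from the RegScale UI. A minimal probe sketch (assumptions: the endpoint paths come from atlas.env above, the server accepts a bearer token, and any HTTP response, even 404/405, proves the process is up; adjust method and path to the actual API contract):

```python
# Reachability probe for the AI Inference Server after `docker compose up`.
# Assumption: from the host, the service is reachable on its mapped port
# rather than the in-network name reg-ai-inference.
import urllib.error
import urllib.request

def probe(endpoint: str, api_key: str, timeout: float = 10.0) -> int:
    req = urllib.request.Request(
        endpoint, headers={"Authorization": f"Bearer {api_key}"}
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server responded, just not with 2xx
```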


Monitoring

Track:

  • GPU memory (stay below 90%)
  • RAM and CPU load on startup
  • Inference response times
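GPU memory can be polled programmatically; a sketch using nvidia-smi (assumes an NVIDIA GPU and driver on the host) that flags the 90% threshold above:

```python
# Sketch: poll GPU memory via nvidia-smi and warn past the 90% threshold.
# Assumption: NVIDIA GPU/driver present; single-GPU host (first line only).
import subprocess

def parse_usage(csv_line: str) -> float:
    """Parse a 'used, total' MiB line from nvidia-smi CSV output into a fraction."""
    used, total = (float(x) for x in csv_line.split(","))
    return used / total

def gpu_usage() -> float:
    out = subprocess.check_output(
        ["nvidia-smi",
         "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        text=True,
    )
    return parse_usage(out.splitlines()[0])

if __name__ == "__main__":
    frac = gpu_usage()
    print(f"GPU memory: {frac:.0%}" + (" -- over 90%!" if frac > 0.9 else ""))
```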