How do you deploy a machine learning model?
Deploying a machine learning (ML) model means making it available so real users or systems can send data to it and get predictions. Think of it as moving from "training on your laptop" to "running in production."
Here’s a clear, practical flow:
1. Train and Save the Model
You first build and train your model using tools like scikit-learn, TensorFlow, or PyTorch.
Then save it:
- `.pkl` / `.joblib` (scikit-learn)
- `.h5` or SavedModel (TensorFlow)
- `.pt` (PyTorch)
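As a minimal sketch of the save/load step (the `DummyModel` class below is just a stand-in for a real fitted estimator, so the example runs without scikit-learn; in practice you would pickle the trained model itself, or use `joblib` for large NumPy-heavy models):

```python
import pickle

# Stand-in for a trained estimator (e.g. a fitted scikit-learn model).
class DummyModel:
    def predict(self, X):
        # A real model would return learned predictions here.
        return [sum(row) for row in X]

model = DummyModel()

# Save the trained model to disk...
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)

# ...and load it back later, e.g. at API startup.
with open("model.pkl", "rb") as f:
    loaded = pickle.load(f)

print(loaded.predict([[1, 2, 3]]))  # [6]
```

The key point is that training happens once, offline; the serving process only ever loads the saved artifact.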
2. Wrap the Model in an API
You expose the model using a web API so other apps can call it.
Common frameworks:
- Flask (simple)
- FastAPI (fast & production-ready)
Example flow:
- Input → API → Model → Prediction → Response (JSON)
3. Containerize (Optional but Recommended)
Use Docker to package:
- Code
- Model
- Dependencies
This ensures it runs the same everywhere.
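A minimal `Dockerfile` for such a service might look like this (base image, file names, and port are illustrative assumptions, not a prescription):

```dockerfile
# Illustrative sketch — adjust base image, file names, and versions to your project.
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY app.py model.pkl ./
# Serve the FastAPI app with uvicorn on port 8000.
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
```

Because code, model file, and pinned dependencies ship together in one image, the container behaves identically on a laptop, a CI runner, and a production host.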
4. Choose Deployment Type
A. Cloud Deployment
Popular platforms:
- Amazon Web Services (EC2, SageMaker)
- Google Cloud Platform (Vertex AI)
- Microsoft Azure (Azure ML)
B. Server-based Deployment
- Host API on a VM (Linux server)
- Use Nginx + Gunicorn for production
C. Serverless Deployment
- AWS Lambda / Azure Functions
- Good for low-traffic or event-based predictions
D. Edge / On-device
- Convert model (e.g., TensorFlow Lite) for mobile or IoT
5. Handle Scaling & Performance
- Use load balancers
- Add caching (e.g., Redis)
- Batch predictions if needed
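To illustrate the caching idea in isolation (in-process with `functools.lru_cache` rather than Redis, and with a placeholder in place of a real model call):

```python
from functools import lru_cache

# Stand-in for an expensive model call; a real deployment would wrap
# model.predict and typically use a shared cache like Redis instead.
def predict_one(features: tuple) -> float:
    return sum(features) * 0.5  # placeholder "prediction"

@lru_cache(maxsize=1024)
def cached_predict(features: tuple) -> float:
    # Inputs must be hashable (hence a tuple), so repeated identical
    # requests are served from the cache instead of re-running the model.
    return predict_one(features)

print(cached_predict((1.0, 2.0, 3.0)))  # 3.0 — computed
print(cached_predict((1.0, 2.0, 3.0)))  # 3.0 — served from cache
```

Caching only helps when identical inputs recur; for unique inputs, batching requests into a single `model.predict` call is usually the bigger win.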
6. Monitor & Maintain
Track:
- Model accuracy (drift)
- Latency
- Errors
Tools:
- Logging systems
- Monitoring dashboards
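Latency tracking can start as simply as timing each prediction and logging it. This stdlib-only decorator is a sketch of the idea (a real setup would export these numbers to a monitoring dashboard rather than plain logs):

```python
import logging
import time
from functools import wraps

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("ml-api")

def log_latency(fn):
    """Log how long each call takes — a crude stand-in for real monitoring."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            logger.info("%s took %.2f ms", fn.__name__, elapsed_ms)
    return wrapper

@log_latency
def predict(x):
    return x * 2  # placeholder for model.predict

print(predict(21))  # 42
```

Accuracy drift, by contrast, cannot be read off a single request — it requires logging inputs and predictions and periodically comparing them against ground truth.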
7. CI/CD Pipeline (Advanced)
Automate:
- Model retraining
- Testing
- Deployment
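As one illustrative option (a hypothetical GitHub Actions workflow; job names, commands, and file paths are assumptions), the pipeline could run tests and rebuild the Docker image on every push:

```yaml
# .github/workflows/deploy.yml — illustrative sketch only
name: ml-deploy
on: [push]
jobs:
  test-and-deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt
      - run: pytest            # run model/API tests
      - run: docker build -t ml-api .
      # Push the image / trigger retraining in a real pipeline.
```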
Simple Real-World Architecture
```
User → Frontend → API (FastAPI/Flask) → ML Model → Prediction
```
Quick Example (FastAPI)
```python
from fastapi import FastAPI
import joblib

app = FastAPI()

# Load the saved model once at startup, not on every request.
model = joblib.load("model.pkl")

@app.get("/")
def home():
    return {"message": "ML API running"}

@app.post("/predict")
def predict(data: dict):
    # Expects a JSON body like {"input": [feature1, feature2, ...]}.
    prediction = model.predict([data["input"]])
    return {"result": prediction.tolist()}
```

Run it with `uvicorn main:app --reload`, then send a POST request with a JSON body to `/predict`.
In Short
- Deployment = Model + API + Hosting + Scaling + Monitoring