To enable local AI capabilities, an additional server is required with the following specifications:
Minimum (CPU-only):
CPU: 64 cores
RAM: 256GB
Storage: 100GB SSD
Production (GPU):
CPU: 64 cores
RAM: 128GB
GPU: 1x H100 80GB
Storage: 500GB NVMe SSD
We are using a 120-billion-parameter open-source model, which benefits greatly from GPU acceleration. While the system is flexible enough to run any open-source model, smaller models do not perform as well and often struggle with tool calling.
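The sizing above can be sanity-checked with simple arithmetic. The sketch below estimates the weights-only memory footprint of a 120B-parameter model at common precisions; the precision choices and the helper name are illustrative, and the figures exclude KV cache and activation overhead, which add further headroom requirements.

```python
def model_memory_gb(params_billion: float, bytes_per_param: float) -> float:
    """Weights-only memory footprint in GiB; excludes KV cache and activations."""
    return params_billion * 1e9 * bytes_per_param / 1024**3

# A 120B-parameter model at common precisions (weights only):
for label, bpp in [("fp16", 2), ("int8", 1), ("4-bit", 0.5)]:
    print(f"{label}: {model_memory_gb(120, bpp):.0f} GiB")
```

At fp16 the weights alone (~224 GiB) exceed a single H100's 80GB, which is why the CPU-only minimum calls for 256GB of RAM; a 4-bit quantization (~56 GiB) fits within the GPU configuration listed above.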