Thox OS™
Purpose-built for AI inference at the edge with Hybrid TensorRT-LLM
Overview
A custom Linux-based operating system optimized for AI workloads, featuring hybrid Ollama + TensorRT-LLM inference that runs 60-100% faster on large (14B+) models.
Thox OS™ is engineered from the ground up to deliver uncompromising AI performance while maintaining the security, reliability, and developer experience that professionals demand. Every component has been carefully optimized for the unique requirements of edge AI inference.
Key Features
Hybrid AI Inference
Smart routing between Ollama (7B models) and TensorRT-LLM (14B+ models) for optimal performance.
TensorRT-LLM Acceleration
60-100% faster inference on 14B+ models using NVIDIA TensorRT-LLM with INT4/INT8 quantization.
Secure by Design
Hardware-backed security with TPM 2.0 integration and verified boot chain. HIPAA and GDPR ready.
Silent Operation
Intelligent thermal management maintains whisper-quiet operation below 25 dBA. Perfect for any workspace.
Seamless Updates
Over-the-air updates with automatic rollback protection for reliability.
User Friendly
Intuitive web dashboard, easy setup, and full API compatibility for any workflow integration.
Technical Highlights
System Components
Thermal Management
Intelligent cooling algorithms maintain optimal performance while keeping noise levels below 25 dBA during typical workloads.
Security Framework
Hardware-backed security with TPM 2.0 integration, secure boot chain, and encrypted storage for your data and models.
Connectivity Stack
Full support for WiFi 6E, Bluetooth 5.3, 2.5 Gbps Ethernet, and USB 3.2, with drivers optimized for low latency.
Hybrid AI Runtime
Ollama Backend (7B Models)
- 45-72 tokens/s inference speed
- Quick model swapping
- 100+ compatible models
- Port 11434
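Because the 7B path is standard Ollama, it can be queried with Ollama's usual REST API on port 11434. A minimal sketch in Python; the model tag thox-coder:7b is a placeholder, and GET /api/tags lists what is actually installed:

    import requests

    # Query the Ollama backend directly over its native REST API (port 11434).
    # NOTE: the model tag "thox-coder:7b" is a placeholder -- list the tags
    # actually installed on your unit with GET /api/tags.
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "thox-coder:7b",  # assumed tag for the 7B coder model
            "prompt": "Write a Python function that reverses a string.",
            "stream": False,           # return a single JSON object, not a stream
        },
        timeout=120,
    )
    resp.raise_for_status()
    print(resp.json()["response"])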
TensorRT-LLM Backend (14B+ Models)
- 60-100% faster inference
- INT4/INT8 quantization
- Native Jetson execution
- Port 11435
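A back-of-envelope calculation shows why INT4/INT8 quantization is what makes 14B+ models practical on Jetson-class memory (the Orin NX ships with 8 GB or 16 GB of unified memory). The sketch below counts weight bytes only, ignoring KV cache and activations:

    # Rough weight-memory footprint of a 14B-parameter model at different
    # precisions. Back-of-envelope only: weights alone, no KV cache,
    # activations, or runtime overhead.
    PARAMS = 14e9

    for name, bits in [("FP16", 16), ("INT8", 8), ("INT4", 4)]:
        gib = PARAMS * bits / 8 / 2**30
        print(f"{name}: ~{gib:.1f} GiB of weights")

    # FP16: ~26.1 GiB  -> does not fit Jetson-class memory
    # INT8: ~13.0 GiB  -> tight on a 16 GB Orin NX
    # INT4: ~6.5 GiB   -> leaves headroom for KV cache and the OS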
Smart Router (Port 8080)
Automatically routes each request to the optimal backend based on model size: 7B models go to Ollama, 14B+ models to TensorRT-LLM. The OpenAI-compatible API reports which backend served each response, as in the sketch below.
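A minimal client sketch, assuming the router follows the usual OpenAI /v1/chat/completions route; the model names are placeholders for the tags your unit actually reports:

    import requests

    # One endpoint for every model size: the smart router on port 8080 picks
    # the backend. The /v1/chat/completions path follows the OpenAI convention
    # the router advertises; the model names below are placeholders.
    def ask(model: str, prompt: str) -> dict:
        resp = requests.post(
            "http://localhost:8080/v1/chat/completions",
            json={"model": model, "messages": [{"role": "user", "content": prompt}]},
            timeout=300,
        )
        resp.raise_for_status()
        return resp.json()

    # A 7B model should land on Ollama, a 14B model on TensorRT-LLM; the
    # router's response JSON includes backend info you can inspect directly.
    for model in ("thox-coder-7b", "thox-coder-14b"):
        data = ask(model, "Say hello in one sentence.")
        print(model, "->", data["choices"][0]["message"]["content"])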
Pre-installed Models
- Thox.ai Coder 7B (Ollama)
- Thox.ai Coder 14B (TensorRT)
- Thox.ai Coder 32B (TensorRT)
- Model Context Protocol (MCP) support
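To see which model tags a given unit exposes, the standard OpenAI-style listing route is the natural first stop. Whether the router implements /v1/models is an assumption here; Ollama's native GET /api/tags on port 11434 is a fallback for the Ollama-served models:

    import requests

    # List the model IDs the unit exposes. Assumes the router implements the
    # standard OpenAI GET /v1/models route; if it does not, Ollama's native
    # GET http://localhost:11434/api/tags covers the Ollama-served models.
    resp = requests.get("http://localhost:8080/v1/models", timeout=10)
    resp.raise_for_status()
    for model in resp.json()["data"]:
        print(model["id"])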
Developer Tools
- Web dashboard with TensorRT status
- Thox CLI + TensorRT engine builder
- SSH access enabled
- systemd services start automatically at boot
Boot Experience
████████╗██╗  ██╗ ██████╗ ██╗  ██╗    █████╗ ██╗
╚══██╔══╝██║  ██║██╔═══██╗╚██╗██╔╝   ██╔══██╗██║
   ██║   ███████║██║   ██║ ╚███╔╝    ███████║██║
   ██║   ██╔══██║██║   ██║ ██╔██╗    ██╔══██║██║
   ██║   ██║  ██║╚██████╔╝██╔╝ ██╗██╗██║  ██║██║
   ╚═╝   ╚═╝  ╚═╝ ╚═════╝ ╚═╝  ╚═╝╚═╝╚═╝  ╚═╝╚═╝

Thox OS v1.1 - Hybrid AI Inference Engine
Platform: NVIDIA Jetson Orin NX (JetPack 6.x)

Starting services...
  ✓ thox-ollama.service      [11434] Ollama Runtime
  ✓ thox-tensorrt.service    [11435] TensorRT-LLM
  ✓ thox-api-hybrid.service  [8080]  Smart Router

Ready for inference in 8.2 seconds
Thox OS™ boots directly into an optimized AI-ready state with hybrid inference. The smart router automatically selects Ollama (7B) or TensorRT-LLM (14B+) for optimal performance.
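A quick post-boot sanity check, using nothing but TCP connects to the three documented ports:

    import socket

    # Post-boot sanity check: confirm each Thox OS service is listening on
    # its documented port. Plain TCP connects, no API assumptions.
    SERVICES = {
        "Ollama Runtime (thox-ollama.service)":   11434,
        "TensorRT-LLM (thox-tensorrt.service)":   11435,
        "Smart Router (thox-api-hybrid.service)": 8080,
    }

    for name, port in SERVICES.items():
        try:
            with socket.create_connection(("localhost", port), timeout=2):
                print(f"OK   {name} on port {port}")
        except OSError:
            print(f"DOWN {name} on port {port}")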
Legal Notice
Thox OS™ is a trademark of Thox.ai LLC. All rights reserved. © 2026 Thox.ai LLC.
The Thox OS™ software, including its architecture, design, source code, object code, algorithms, data structures, user interface, APIs, and all associated documentation, is proprietary and confidential information of Thox.ai LLC. This software is protected by U.S. and international copyright laws, trade secret laws, and other intellectual property laws.