Model Compatibility

Complete guide to Ollama models compatible with Thox.ai devices. Latest 2024-2025 models with vision, multilingual, and professional capabilities.

Latest Ollama Models (2024-2025)

This catalog features the newest generation of Ollama models optimized for professional use. All models have been evaluated for compatibility with Thox.ai's 16GB RAM architecture and NVIDIA Jetson Orin NX hardware.

  • Vision Models: Native image understanding, medical imaging, and OCR in 32+ languages
  • Multilingual: Support for 12-32+ languages, including English, Spanish, Chinese, and Arabic
  • Performance: Optimized with Q4_K_M/INT4_AWQ quantization for Jetson hardware

Updated: December 28, 2025 - Catalog refreshed with Llama 4 Scout, Ministral-3, Gemma3, and other frontier models

The following seven models are compatible with Thox.ai devices.
Ministral-3 8B (Recommended)

Edge-optimized vision model with 32+ languages. Perfect for single devices with vision needs.

  • Capabilities: Vision, Multilingual, Tools
  • Size: 8B parameters
  • Speed: 40-60 tok/s
  • Memory: 10GB
  • Context: 256K tokens
  • Backend: Ollama
  • Min Devices: 1x
  • Best For: Healthcare, Education, Legal
  • Model tag: ministral-3:8b

Gemma 3 8B

Google's efficient vision model optimized for a single GPU. Excellent balance of performance and capability.

  • Capabilities: Vision, Tools
  • Size: 8B parameters
  • Speed: 38-55 tok/s
  • Memory: 10GB
  • Context: 128K tokens
  • Backend: Ollama
  • Min Devices: 1x
  • Best For: Enterprise, Research, Development
  • Model tag: gemma3:8b
Qwen 3 14B (Recommended)

Advanced reasoning model with vision and multilingual support. Excellent for complex professional tasks.

  • Capabilities: Vision, Multilingual, Tools, Thinking
  • Size: 14B parameters
  • Speed: 30-45 tok/s
  • Memory: 14GB
  • Context: 128K tokens
  • Backend: TensorRT-LLM
  • Min Devices: 1x
  • Best For: Research, Legal, Finance
  • Model tag: qwen3:14b

Phi-4 Mini (3.8B)

Microsoft's compact model with exceptional performance. Multilingual with function calling.

  • Capabilities: Multilingual, Tools
  • Size: 3.8B parameters
  • Speed: 70-95 tok/s
  • Memory: 4GB
  • Context: 128K tokens
  • Backend: Ollama
  • Min Devices: 1x
  • Best For: Education, Business, Development
  • Model tag: phi4:mini

Llama 3.2 8B

Meta's reliable foundation model. Excellent for general professional use.

  • Size: 8B parameters
  • Speed: 42-65 tok/s
  • Memory: 10GB
  • Context: 128K tokens
  • Backend: Ollama
  • Min Devices: 1x
  • Best For: All Industries
  • Model tag: llama3.2:8b

Qwen 2.5 Coder 14B

State-of-the-art coding model with reasoning improvements and 128K context.

  • Capabilities: Tools, Thinking
  • Size: 14B parameters
  • Speed: 28-42 tok/s
  • Memory: 14GB
  • Context: 128K tokens
  • Backend: TensorRT-LLM
  • Min Devices: 1x
  • Best For: Software Development, Enterprise, Technical Teams
  • Model tag: qwen2.5-coder:14b

DeepSeek-Coder-V2 16B

Advanced coding model with MoE architecture. Excellent for software engineering.

  • Capabilities: Tools
  • Size: 16B parameters
  • Speed: 25-38 tok/s
  • Memory: 16GB
  • Context: 64K tokens
  • Backend: TensorRT-LLM
  • Min Devices: 1x
  • Best For: Software Development, Tech Companies
  • Model tag: deepseek-coder-v2:16b
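For scripted model selection, the catalog above can be treated as plain data. A minimal sketch in Python (the tuple fields mirror the spec cards; the `fits` helper is illustrative, not part of Thox OS):

```python
# Catalog entries from the cards above: (model tag, params in B, memory in GB, backend)
CATALOG = [
    ("ministral-3:8b",         8,   10, "Ollama"),
    ("gemma3:8b",              8,   10, "Ollama"),
    ("qwen3:14b",              14,  14, "TensorRT-LLM"),
    ("phi4:mini",              3.8, 4,  "Ollama"),
    ("llama3.2:8b",            8,   10, "Ollama"),
    ("qwen2.5-coder:14b",      14,  14, "TensorRT-LLM"),
    ("deepseek-coder-v2:16b",  16,  16, "TensorRT-LLM"),
]

def fits(ram_gb: float) -> list[str]:
    """Return tags of models whose resident memory fits the given device RAM."""
    return [tag for tag, _params, mem, _backend in CATALOG if mem <= ram_gb]

print(fits(16))  # all seven entries fit a single 16GB device
print(fits(4))   # only the smallest model fits a 4GB budget
```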

Compatibility Guide

Single Device (16GB RAM)

Best for 3-14B parameter models with Q4_K_M or INT4_AWQ quantization:

  • Ministral-3 8B (Vision, 32+ languages)
  • Phi-4 Mini 3.8B (Ultra-fast, multilingual)
  • Qwen 3 14B (Vision, thinking mode)
  • Gemma 3 8B (Vision, single GPU optimized)

MagStack 2x (32GB RAM)

Unlocks frontier models with 10M context and multimodal capabilities:

  • Llama 4 Scout (109B with MoE, 10M context, 12 languages)
  • Qwen 3 32B (Vision, advanced reasoning)
  • DeepSeek-Coder-V2 16B (Code specialist)

MagStack 4x+ (64GB+ RAM)

Enterprise-grade frontier models for professional workflows:

  • Custom 70B+ models for healthcare, legal, finance
  • Llama 4 Maverick (400B with MoE)
  • Enterprise-specific fine-tuned models

Pro Tip: For vision tasks, we recommend Ministral-3 8B (single device) or Llama 4 Scout (2x stack). For coding, try Qwen 2.5 Coder 14B with TensorRT-LLM acceleration.
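The RAM tiers above follow from quantized weight size. A rough sizing sketch (the ~4.5 bits/weight figure for Q4_K_M and the fixed runtime overhead are assumptions for illustration, not measured values):

```python
def approx_vram_gb(params_b: float, bits_per_weight: float = 4.5,
                   overhead_gb: float = 1.5) -> float:
    """Rough memory estimate: weights at the quantized bit-width plus a
    fixed overhead for KV cache and runtime buffers (assumed values)."""
    weights_gb = params_b * bits_per_weight / 8
    return round(weights_gb + overhead_gb, 1)

# An 8B model at ~4.5 bits/weight needs roughly 6GB; the catalog's 10GB
# figures leave headroom for longer contexts and larger KV caches.
print(approx_vram_gb(8))
print(approx_vram_gb(14))
```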

Quick Start Guide

# Pull a model from Ollama
ollama pull ministral-3:8b

# Run the model
ollama run ministral-3:8b

# For vision tasks, include the image path in the prompt
ollama run ministral-3:8b "Analyze this medical image: /path/to/image.jpg"
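Ollama also serves a local HTTP API (default port 11434) that the same models can be reached through programmatically. A minimal Python sketch of a non-streaming /api/generate request body; the prompt text is illustrative, and actually sending it requires a running Ollama instance:

```python
import json
from urllib import request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def generate_payload(model: str, prompt: str) -> bytes:
    """Encode a non-streaming request body for Ollama's /api/generate endpoint."""
    body = {"model": model, "prompt": prompt, "stream": False}
    return json.dumps(body).encode("utf-8")

payload = generate_payload("ministral-3:8b", "Summarize the key findings in two sentences.")
print(payload.decode())

# To send the request (needs Ollama running locally):
# req = request.Request(OLLAMA_URL, data=payload,
#                       headers={"Content-Type": "application/json"})
# print(json.loads(request.urlopen(req).read())["response"])
```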

CONFIDENTIAL AND PROPRIETARY INFORMATION

This documentation is provided for informational and operational purposes only. The specifications and technical details herein are subject to change without notice. Thox.ai LLC reserves all rights in the technologies, methods, and implementations described.

Nothing in this documentation shall be construed as granting any license or right to use any patent, trademark, trade secret, or other intellectual property right of Thox.ai LLC, except as expressly provided in a written agreement.

Patent Protection

The MagStack™ magnetic stacking interface technology, including the magnetic alignment system, automatic cluster formation, NFC-based device discovery, and distributed inference me...

Reverse Engineering Prohibited

You may not reverse engineer, disassemble, decompile, decode, or otherwise attempt to derive the source code, algorithms, data structures, or underlying ideas of any Thox.ai hardwa...

Thox.ai™, Thox OS™, MagStack™, and the Thox.ai logo are trademarks or registered trademarks of Thox.ai LLC in the United States and other countries.

NVIDIA, Jetson, TensorRT, and related marks are trademarks of NVIDIA Corporation. Ollama is a trademark of Ollama, Inc. All other trademarks are the property of their respective owners.

© 2026 Thox.ai LLC. All Rights Reserved.