TritonGPT Large Language Models (LLMs)

As commercial generative AI services continue to evolve, users are increasingly familiar with large language models (LLMs) from providers such as OpenAI. In response, the TritonGPT platform at UC San Diego offers access to multiple LLMs, so users can choose the model best suited to their needs.

To maintain the highest standards of data protection, all models are deployed through secure, institutionally governed infrastructure that ensures compliance, privacy, and robust security controls.

GPT-OSS-120B

TritonGPT's default model is GPT-OSS-120B, OpenAI’s open-weight model designed for strong reasoning, tool use, and self-hosted deployment. It's a good on-prem choice for privacy-first RAG, internal knowledge workflows, and scalable business use cases where local inference matters. Users often view it as an open workhorse: not the flashiest model, but a reliable option for content generation, knowledge management, and operational tasks.

By hosting GPT-OSS-120B within the secure environment of the San Diego Supercomputer Center (SDSC), UC San Diego retains full control over data usage and sharing. All processing occurs entirely within the university's infrastructure, ensuring institutional oversight and protection.

Other Available Models

GPT-5.4: OpenAI's newest flagship model. A strong general-purpose choice for writing, summarizing, analysis, and complex reasoning tasks.
Gemini 3.1 Pro: Google's latest high-capability model, well suited to long documents, research, and tasks that need careful, structured thinking.
Gemini 3.1 Flash: A faster, lighter version of Gemini 3.1 designed for quick answers, drafting, and everyday questions.
Claude Sonnet 4.6: Anthropic's latest balanced model. A good fit for careful writing, editing, and nuanced conversations.
Gemma 4 (26B): An efficient open model option for fast, lightweight tasks.

The OpenAI 4 Series models have been retired and are no longer available. If you previously relied on one of those models, GPT-5.4 is the recommended replacement.

OpenAI Models

The OpenAI models are made available through Microsoft Azure. Data shared with the Azure OpenAI Service remains private and secure. Microsoft operates this service entirely within the Azure AI Foundry environment, ensuring that user inputs and outputs are not shared with OpenAI or any other external entities. Additionally, this data is never used to train or enhance any models. The Azure OpenAI Service operates independently and does not interact with OpenAI-operated services.

Models accessed via Azure AI Foundry are configured to maintain full data residency within the United States.

Commitment to Privacy and Compliance

By using Azure's secure infrastructure for cloud models and SDSC's on-premises hosting for open-weight models, UC San Diego ensures that all data privacy, security, and compliance requirements are fully met.

Model Summaries

Model	Common User Vibe	Often Used For
Gemma 4 (26B)	Local powerhouse	Math, tool-calling, on-prem assistants
GPT-OSS-120B	Open workhorse	Privacy-first content, RAG, knowledge workflows
Claude Sonnet 4.6	Collaborative senior dev	Writing, collaborative coding, human tone
Gemini 3.1 Flash-Lite	Snappy fixer	Fast summaries, quick coding, lightweight tasks
Gemini 3.1 Pro	Multimodal specialist	Image, audio, video, deeper analysis
GPT-5.4	Academic expert	Deep research, error-finding, complex reasoning