TritonAI
- About
- TritonGPT
- Training & Resources
- Developer APIs
- AI Tools
Access enterprise-grade large language models (LLMs) through UC San Diego's secure and centralized API gateway.
The TritonAI Developer API gives UC San Diego faculty, staff, researchers, and campus teams programmatic access to curated large language models through a secure, centralized LLM gateway powered by LiteLLM. The gateway connects approved AI development tools and application stacks to commercial cloud providers and SDSC-hosted open-source models, while preserving campus standards for data protection, authentication, logging, and responsible use.
The program is designed for practical AI development: prototype with approved AI coding tools, connect to the gateway, build with reusable campus patterns, and then move completed artifacts into the right hosting lane for review, support, and long-term operation.
The Developer API is more than model access. It is a campus development path for AI-enabled applications, from early experimentation through production hosting.
ITS owns the shared gateway, templates, baseline risk review, usage tracking, hosting infrastructure, SSO, logs, and the right to remove unsafe or unsupported applications. Departments and project teams own their application logic, dependencies, accessibility and testing, end-user support, and a named technical point of contact.
The TritonAI Developer API provides access to models through multiple hosting environments, giving teams a way to choose the right model path based on data sensitivity, capability needs, and operational requirements.
We offer access to leading commercial models from major cloud providers, including Microsoft Azure, Google Cloud Platform, and Amazon Web Services, operating under UC San Diego's campus-wide enterprise agreements.
These agreements provide enterprise data protection terms that meet UC system requirements, ensure prompts and responses are not used for model training, and include contractual safeguards that individual accounts cannot obtain.
For projects requiring the highest level of data control, TritonAI also provides access to locally hosted open-source and open-weight models running within UC San Diego infrastructure at the San Diego Supercomputer Center.
With self-hosted models, inference processing occurs on campus infrastructure. This makes them appropriate for research involving sensitive data, projects with heightened privacy considerations, or use cases where on-premises processing is preferred or required.
API access can support experimentation, but durable campus tools need an operating model. As projects move from individual development to shared use, they may require scope review, recurring risk review, authentication, logging, support ownership, and migration into a more formal hosting environment.
Small prototypes may remain on a user desktop, laptop, or sandbox. Tools used by a department or repeated audience should move into a reviewed campus application lane. Broad or mission-critical use cases require enterprise architecture, recurring review, support planning, and a named technical owner.
We offer a curated selection of large language models to meet diverse needs and budgets. The model catalog is updated as the AI landscape evolves and includes current availability, pricing, capabilities, and hosting environment.
View Complete Model List & Pricing
The model catalog includes options across multiple providers and capabilities:
Pricing is based on token usage. The model hub displays current per-token pricing, context window sizes, available features, and hosting environment.
The LiteLLM gateway provides an OpenAI-compatible API interface, making it easier to integrate with existing tools, libraries, and codebases. If your application already works with the OpenAI API, it can typically work with TritonAI with minimal modifications.
API access is secured through API keys issued upon approval. Keys should be stored securely and should never be committed to version control or shared publicly.
Default rate limits are designed to support typical development and production workloads. Projects requiring higher throughput can request limit increases.
Approved users receive access to API documentation, code examples, integration guidance, and support resources for common development patterns.
All approved users receive $15 per month in free API credits for use with self-hosted models running on UC San Diego infrastructure. Free credits refresh monthly and are non-transferable.
Usage beyond monthly free credits for self-hosted models, as well as cloud provider model usage through Azure, Google Cloud, and AWS, is billed at pass-through cost to the designated chart string. Recharge rates reflect the actual cost of model access plus the infrastructure fee needed to sustain the service.
When requesting access, you will provide project and task information that maps to your chart string. This supports cost allocation, reporting, and budgeting for grant-funded or department-funded projects.
Usage alerts can notify teams when spending reaches defined thresholds, helping projects stay within budget and avoid surprises.
View the Complete Model List & Pricing or read through Frequently Asked Questions.
To receive the latest announcements and news, subscribe to the TritonAI mailing list.