My users authenticate exclusively with an API key, and that key has an associated config. A user makes a request using the name of a model, and that model is tied to a virtual key for the provider that serves it.
How can I set a limit on a specific model within a virtual key, so that if a rate or budget limit is exceeded, it falls back to another model within the same virtual key or in a different one?
Something similar to what ChatGPT does when a user exceeds the usage limit for O1: it deactivates that model while still allowing access to other models. In my case, I would like to have both possibilities: fallback to another model, whether within the same provider or another, and also something like what ChatGPT does, disabling a specific model after a certain usage threshold within a day, week, or month.
Hi @mrKa, this is currently not possible because we do not have a model management piece on Portkey. You could, however, get this to work with a combo of virtual keys + conditional routing + metadata, but I'd suggest waiting 1-2 weeks, as this is the next big release on the cards for Portkey.
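Roughly, the combo I have in mind looks like the sketch below, using the Python SDK. This is only an illustration: the virtual key slugs, model names, and metadata field are placeholders, and the exact conditional-routing schema may differ slightly from what ships.

```python
# Sketch only: virtual keys, models, and metadata fields are placeholders.
from portkey_ai import Portkey

config = {
    "strategy": {
        "mode": "conditional",
        # Route based on metadata sent with the request (x-portkey-metadata header).
        "conditions": [
            {"query": {"metadata.tier": "over_limit"}, "then": "cheap-target"}
        ],
        "default": "premium-target",
    },
    "targets": [
        {
            "name": "premium-target",
            # Nested fallback: if the first virtual key hits a rate/budget limit
            # (429), the same request is retried on the next target.
            "strategy": {"mode": "fallback", "on_status_codes": [429]},
            "targets": [
                {"virtual_key": "openai-vk-aaa", "override_params": {"model": "gpt-4o"}},
                {"virtual_key": "anthropic-vk-bbb", "override_params": {"model": "claude-3-5-sonnet-20241022"}},
            ],
        },
        {
            "name": "cheap-target",
            "virtual_key": "openai-vk-aaa",
            "override_params": {"model": "gpt-4o-mini"},
        },
    ],
}

client = Portkey(api_key="PORTKEY_API_KEY", config=config)
response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Hello"}],
    model="gpt-4o",  # overridden by override_params of whichever target is picked
)
print(response.choices[0].message.content)
```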
Yes, I can see that working for only a few users by generating duplicate virtual keys and setting up conditional routing with fallbacks; I suppose we could set the fallbacks inside the conditional targets. But, for example, in my LibreChat setup I can't send metadata with each user's model request; instead I send static headers with the Portkey endpoint request (roughly as in the sketch below). So Portkey's model management is a very important feature to wait for.
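For reference, the way my LibreChat endpoint talks to Portkey is essentially an OpenAI-compatible call with a fixed set of headers, something like this sketch (the config slug and metadata values are placeholders):

```python
# Sketch only: static per-endpoint headers, no per-request / per-model metadata.
import json
from openai import OpenAI

client = OpenAI(
    base_url="https://api.portkey.ai/v1",
    api_key="dummy",  # placeholder; provider auth comes from the virtual key in the config
    default_headers={
        "x-portkey-api-key": "PORTKEY_API_KEY",
        "x-portkey-config": "pc-my-conditional-fallback-config",  # placeholder config slug
        # Fixed for every request from this endpoint, so routing can't vary per user/model.
        "x-portkey-metadata": json.dumps({"_user": "librechat-endpoint"}),
    },
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```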