My users authenticate exclusively with an API key, and that key has an associated config. A user makes a request using the name of a model, and that model is tied to a virtual key for the provider that serves it.
How can I set a limit on a specific model within a virtual key, so that if a rate or budget limit is exceeded, it falls back to another model within the same virtual key or in a different one?
Something similar to what ChatGPT does when a user exceeds the usage limit for O1: it deactivates that model while still allowing access to other models. In my case, I would like to have both possibilities: fallback to another model, whether within the same provider or another, and also something like what ChatGPT does, disabling a specific model after a certain usage threshold within a day, week, or month.
Hi @mrKa, this is currently not possible because we do not have a model management piece on Portkey. You could, however, get this to work with a combo of virtual keys + conditional routing + metadata, but I'd suggest waiting 1-2 weeks, as this is the next big release on the cards for Portkey.
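Roughly, the combo I have in mind looks like the sketch below, using the Python SDK. This is only an illustration: the virtual key slugs, model names, and metadata field are placeholders, and the exact conditional-routing schema may differ slightly from what ships.

```python
# Sketch only: virtual keys, models, and metadata fields are placeholders.
from portkey_ai import Portkey

config = {
    "strategy": {
        "mode": "conditional",
        # Route based on metadata sent with the request (x-portkey-metadata header).
        "conditions": [
            {"query": {"metadata.tier": "over_limit"}, "then": "cheap-target"}
        ],
        "default": "premium-target",
    },
    "targets": [
        {
            "name": "premium-target",
            # Nested fallback: if the first virtual key hits a rate/budget limit
            # (429), the same request is retried on the next target.
            "strategy": {"mode": "fallback", "on_status_codes": [429]},
            "targets": [
                {"virtual_key": "openai-vk-aaa", "override_params": {"model": "gpt-4o"}},
                {"virtual_key": "anthropic-vk-bbb", "override_params": {"model": "claude-3-5-sonnet-20241022"}},
            ],
        },
        {
            "name": "cheap-target",
            "virtual_key": "openai-vk-aaa",
            "override_params": {"model": "gpt-4o-mini"},
        },
    ],
}

client = Portkey(api_key="PORTKEY_API_KEY", config=config)
response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Hello"}],
    model="gpt-4o",  # overridden by override_params of whichever target is picked
)
print(response.choices[0].message.content)
```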
Yes, I can see that working for only a few users by generating duplicate virtual keys and setting up conditional routing with fallbacks; I suppose we could set the fallbacks inside the conditional targets. But, for example, in my LibreChat setup I can't send metadata with each user's model request; instead I send static headers with the Portkey endpoint request (roughly as in the sketch below). So Portkey's model management is a very important feature to wait for.
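For reference, the way my LibreChat endpoint talks to Portkey is essentially an OpenAI-compatible call with a fixed set of headers, something like this sketch (the config slug and metadata values are placeholders):

```python
# Sketch only: static per-endpoint headers, no per-request / per-model metadata.
import json
from openai import OpenAI

client = OpenAI(
    base_url="https://api.portkey.ai/v1",
    api_key="dummy",  # placeholder; provider auth comes from the virtual key in the config
    default_headers={
        "x-portkey-api-key": "PORTKEY_API_KEY",
        "x-portkey-config": "pc-my-conditional-fallback-config",  # placeholder config slug
        # Fixed for every request from this endpoint, so routing can't vary per user/model.
        "x-portkey-metadata": json.dumps({"_user": "librechat-endpoint"}),
    },
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```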