mrKa

Setting Usage Limits and Fallback Options for Models within a Virtual Key

My user authenticates exclusively with an API key, and this key has an associated config. My user makes a request using the name of a model that has a virtual key associated with the provider that serves that model.

How can I set a limit on a specific model within a virtual key, so that if a rate or budget limit is exceeded, it falls back to another model within the same virtual key or a different one?

Something similar to what ChatGPT does when a user exceeds the usage limit for o1: it deactivates that model while still allowing access to other models. In my case, I would like both options: falling back to another model, whether from the same provider or a different one, and disabling a specific model once usage passes a threshold within a day, week, or month.
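To make the desired behavior concrete, here is a minimal Python sketch of it, written as plain application logic rather than as a gateway config. It is not Portkey's API: `ModelBudget`, `pick_model`, `record_usage`, and `reset_window` are hypothetical names, the model identifiers are placeholders, and the budgets are assumed to be counted in tokens per window.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ModelBudget:
    """One entry in a fallback chain (hypothetical structure)."""
    model: str        # placeholder model identifier
    max_tokens: int   # assumed token budget per window (day, week, or month)
    used: int = 0     # tokens consumed in the current window

    def has_room(self) -> bool:
        return self.used < self.max_tokens


def pick_model(chain: List[ModelBudget]) -> Optional[str]:
    """Return the first model in the chain that is still under budget.

    Returns None when every model is exhausted, i.e. all are disabled
    until the window resets.
    """
    for entry in chain:
        if entry.has_room():
            return entry.model
    return None


def record_usage(chain: List[ModelBudget], model: str, tokens: int) -> None:
    """Add the tokens consumed by a completed request to that model's counter."""
    for entry in chain:
        if entry.model == model:
            entry.used += tokens
            return
    raise KeyError(f"unknown model: {model}")


def reset_window(chain: List[ModelBudget]) -> None:
    """Start a new window, re-enabling every model."""
    for entry in chain:
        entry.used = 0
```

A caller would run `pick_model` before each request, `record_usage` after it, and `reset_window` on whatever schedule defines the day, week, or month. The order of the list decides the fallback order, so the entries can mix models from the same provider and from different ones.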

3 comments

mmrKa

Conditional routing error

I get this error when using conditional routing to OpenRouter:

"x-clerk-auth-message": "Invalid JWT form. A JWT consists of three parts separated by dots. (reason=token-invalid, token-carrier=header)",
"x-clerk-auth-reason": "token-invalid",
"x-clerk-auth-status": "signed-out",

9 comments

mmrKa

Configuring a hierarchy of models based on token usage limits

How do I create a config to associate a group of models with a specific user, considering that I need to create a hierarchy of models from more advanced to less advanced? The idea is that, after reaching a certain token usage limit for the first model, the system switches to the second model, then to the third, and so on.
For example, if I’m using Groq, I would like the user to be able to use up to 200,000 tokens with DeepSeek R1. If they exceed that limit, they would be switched to the Llama 3.2 model. And if they exceed 500,000 tokens, they would then switch to Gemma 2.
How do I implement this?
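For clarity, the selection rule described above could be sketched in Python as follows. This only illustrates the intended logic and is not a Portkey config. It assumes that both thresholds apply to the user's cumulative token count and that usage exactly at a limit still counts as within it. The model strings are placeholder identifiers, and `model_for_usage` is a hypothetical helper.

```python
from bisect import bisect_left

# Upper limits (cumulative tokens) of every tier except the last.
TIER_LIMITS = [200_000, 500_000]

# Models ordered from most to least advanced; placeholder identifiers.
TIER_MODELS = ["deepseek-r1", "llama-3.2", "gemma-2"]


def model_for_usage(total_tokens_used: int) -> str:
    """Pick the model tier for a user's cumulative token usage.

    bisect_left counts how many limits are strictly below the usage,
    so usage equal to a limit stays on the current tier and only
    exceeding the limit moves the user down one tier.
    """
    if total_tokens_used < 0:
        raise ValueError("token usage cannot be negative")
    return TIER_MODELS[bisect_left(TIER_LIMITS, total_tokens_used)]
```

With these numbers, a user at 150,000 tokens gets the first model, one at 200,001 gets the second, and one above 500,000 gets the third. Adding a tier means adding one limit and one model to the two lists.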

Welcome to Portkey Forum

Gateway UI