How do I create a config to associate a group of models with a specific user, considering that I need to create a hierarchy of models from more advanced to less advanced? The idea is that, after reaching a certain token usage limit for the first model, the system switches to the second model, then to the third, and so on. For example, if I’m using Groq, I would like the user to be able to use up to 200,000 tokens with DeepSeek R1. If they exceed that limit, they would be switched to the Llama 3.2 model. And if they exceed 500,000 tokens, they would then switch to Gemma 2. How do I implement this?
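Not a ready-made config, but the fallback logic you describe is simple to sketch. Below is a minimal, hypothetical Python example: the tier thresholds and model names mirror your Groq example, and the in-memory dict stands in for whatever per-user usage store you actually have.

```python
# Tier-based model routing by cumulative token usage (illustrative sketch).
# (model name, cumulative-token ceiling before falling through to next tier)
TIERS = [
    ("deepseek-r1-distill-llama-70b", 200_000),  # most capable, first 200k tokens
    ("llama-3.2-90b", 500_000),                  # mid tier, up to 500k tokens
    ("gemma2-9b-it", float("inf")),              # final fallback, no ceiling
]

usage = {}  # user_id -> cumulative tokens consumed (stand-in for a real DB)

def pick_model(user_id: str) -> str:
    """Return the model matching this user's current cumulative usage."""
    used = usage.get(user_id, 0)
    for model, ceiling in TIERS:
        if used < ceiling:
            return model
    return TIERS[-1][0]

def record_usage(user_id: str, tokens: int) -> None:
    """Add a request's token count to the user's running total."""
    usage[user_id] = usage.get(user_id, 0) + tokens
```

Usage: call `pick_model(user)` before each request, send the request to that model, then `record_usage(user, tokens_used)` from the response's usage field. A real deployment would persist the counters and probably reset them on a billing cycle.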
First you'd have to count your tokens with a tokenizer (honestly, characters/4 is a decent rough estimate, instead of wiring up a different tokenizer like tiktoken for each model).
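For reference, the usual rule of thumb for English text is roughly 4 characters (about 0.75 words) per token, so a crude estimator is a one-liner:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per token for English text.
    Good enough for quota checks without a model-specific tokenizer."""
    if not text:
        return 0
    return max(1, len(text) // 4)
```

This will drift for code, non-English text, or unusual formatting; for exact counts you'd still reach for the model's own tokenizer (e.g. tiktoken for OpenAI models).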
Thanks. I was looking for a ready-made solution. Then in your link I read exactly about that: "Soon, Portkey will also support routing based on other critical parameters like input character count, input token count, prompt type, tool support, and more."
What about the issue of associating a group of models from a provider with a virtual key? I know that in LiteLLM you can associate more than one provider, each with a specific group of models, under the limits of a single virtual key.
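For the LiteLLM side, a sketch of what that looks like (model names and env-var references here are placeholders, not a tested config): you declare the provider models in the proxy's `config.yaml` under `model_list`, then generate a virtual key restricted to a subset of those model names.

```yaml
# Hypothetical LiteLLM proxy config.yaml fragment
model_list:
  - model_name: groq-tier-1            # alias a virtual key can be scoped to
    litellm_params:
      model: groq/deepseek-r1-distill-llama-70b
      api_key: os.environ/GROQ_API_KEY
  - model_name: groq-tier-2
    litellm_params:
      model: groq/llama-3.2-90b
      api_key: os.environ/GROQ_API_KEY
```

A virtual key scoped to just those aliases can then be created via the proxy's `/key/generate` endpoint by passing the allowed model names in the request (e.g. `"models": ["groq-tier-1", "groq-tier-2"]`); check the LiteLLM proxy docs for the exact request shape, since this is from memory.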