Welcome to Portkey Forum

Updated 2 months ago

Fallback on max length

Hi,
Is there a fallback strategy that lets me fall back to another model when I reach the max length of the first LLM?
Basically, I want to call Llama by default and switch to Claude if the Llama call hits max tokens.
5 comments
We don't have fallbacks specifically for these checks, but since a request fails automatically when it exceeds the max token length, setting up a simple fallback that is triggered on ANY Llama error should work.

If Llama returns a consistent error code when it reaches max length, you can also restrict the fallback to that error code with the on_status_codes array.
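
For reference, here is a rough sketch of what such a fallback config might look like, built as a plain Python dict. Only on_status_codes comes from this thread; the other key names (strategy, mode, targets, virtual_key) are my assumption about the gateway config schema, and the virtual key names and the 400 status code are placeholders, so please check them against the docs and against what your Llama provider actually returns.

```python
import json


def build_fallback_config(primary_key, fallback_key, status_codes=None):
    """Build a fallback config dict: try the primary target, then the fallback.

    Key names other than "on_status_codes" are assumptions, not confirmed
    in this thread.
    """
    strategy = {"mode": "fallback"}
    if status_codes:
        # Only fall back when the primary target returns one of these codes.
        strategy["on_status_codes"] = list(status_codes)
    return {
        "strategy": strategy,
        "targets": [
            {"virtual_key": primary_key},   # e.g. Llama, tried first
            {"virtual_key": fallback_key},  # e.g. Claude, tried on failure
        ],
    }


# Placeholder key names and status code, for illustration only.
config = build_fallback_config("llama-virtual-key", "claude-virtual-key", [400])
print(json.dumps(config, indent=2))
```

Leaving out status_codes gives the "fall back on ANY error" variant.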
No... Sometimes the response itself reaches the maximum allowed length, and in that case the API call does not fail.
Got it! Currently, we don't do routing based on that, but we are soon launching a new feature (before the end of this month) where you'd be able to do something to this effect. (It's very exciting!)

I'll share it with you once we launch!
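
In the meantime, one possible workaround is to do the check on the client side. The sketch below assumes both models return OpenAI-style chat completion dicts, where a truncated answer has finish_reason set to "length". The function names and the two callables are hypothetical, you would wire them to your actual Llama and Claude calls.

```python
from typing import Any, Callable, Dict

Completion = Dict[str, Any]


def was_truncated(response: Completion) -> bool:
    """Return True if the first choice stopped because it hit the token limit."""
    choices = response.get("choices") or []
    return bool(choices) and choices[0].get("finish_reason") == "length"


def complete_with_length_fallback(
    prompt: str,
    primary: Callable[[str], Completion],
    secondary: Callable[[str], Completion],
) -> Completion:
    """Call the primary model; retry on the secondary only if output was cut off."""
    first = primary(prompt)
    if was_truncated(first):
        # The call succeeded but the answer is incomplete, so no status-code
        # fallback would fire. Re-run the same prompt on the second model.
        return secondary(prompt)
    return first
```

Note that this costs a second full request whenever the first response is cut off, so it is worth raising the first model's max tokens as far as it allows before relying on it.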
Awesome! Waiting for it!