Welcome to Portkey Forum

Updated 4 months ago

Fallback on max length

Hi,
Is there any fallback strategy that lets me fall back when I reach the max length of the first LLM?
Basically, I want to call Llama by default and switch to Claude if we hit max tokens on the Llama call.

5 comments

Vrushank | Portkey

We don't have fallbacks specifically for this check, but since the request fails automatically when it exceeds the max token length, setting up a simple fallback that triggers on ANY Llama error should work.

If Llama returns a consistent error code when the max length is reached, you can also restrict the fallback to that error code with the on_status_codes array.
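For anyone who wants to try this, here is a rough sketch of what such a fallback config could look like, built as a Python dict and printed as JSON. Only on_status_codes comes from this thread. The other field names (strategy, mode, targets, virtual_key), the two key names, and the 400 status code are assumptions for illustration, so check them against the Portkey config docs and your provider's actual error code.

```python
import json

# Hypothetical status code: replace with whatever the Llama provider
# actually returns when the max token length is exceeded.
LENGTH_ERROR_STATUS_CODES = [400]


def build_fallback_config(primary_key, secondary_key, status_codes=None):
    """Build a fallback config that tries the primary target first.

    Field names other than on_status_codes are assumed, not confirmed
    in this thread.
    """
    strategy = {"mode": "fallback"}
    if status_codes:
        # Only fall back on these status codes instead of on any error.
        strategy["on_status_codes"] = list(status_codes)
    return {
        "strategy": strategy,
        "targets": [
            {"virtual_key": primary_key},    # tried first (Llama)
            {"virtual_key": secondary_key},  # tried on failure (Claude)
        ],
    }


# Placeholder key names, purely illustrative.
config = build_fallback_config(
    "llama-virtual-key", "claude-virtual-key", LENGTH_ERROR_STATUS_CODES
)
print(json.dumps(config, indent=2))
```

Leaving status_codes out gives the "fall back on ANY error" variant described above.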

Vrushank | Portkey

Hope that helps!

Siddharth Bulia

No... Sometimes the response length reaches the max length allowed, and in that case the API call does not fail.

Vrushank | Portkey

Got it! Currently, we don't do routing based on that, but we are soon launching a new feature (before the end of this month) where you'd be able to do something to this effect. (It's very exciting!)

I'll share it with you when we launch!
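In the meantime, one possible client-side workaround for the case where the call succeeds but the output is cut off is to inspect the response and retry on the second model yourself. The sketch below assumes an OpenAI-style response shape in which a truncated completion carries finish_reason == "length". The function names and the two stub callers are hypothetical stand-ins for real Llama and Claude calls, not part of any Portkey API.

```python
from typing import Any, Callable, Dict

Completion = Dict[str, Any]


def hit_max_length(response: Completion) -> bool:
    """Return True if any choice stopped because it ran out of tokens."""
    choices = response.get("choices") or []
    return any(c.get("finish_reason") == "length" for c in choices)


def complete_with_length_fallback(
    prompt: str,
    primary: Callable[[str], Completion],
    secondary: Callable[[str], Completion],
) -> Completion:
    """Call the primary model; retry on the secondary if output was truncated."""
    response = primary(prompt)
    if hit_max_length(response):
        return secondary(prompt)
    return response


# Stub callers standing in for real Llama and Claude requests.
def fake_llama(prompt: str) -> Completion:
    return {
        "model": "llama",
        "choices": [
            {"finish_reason": "length", "message": {"content": "cut off"}}
        ],
    }


def fake_claude(prompt: str) -> Completion:
    return {
        "model": "claude",
        "choices": [
            {"finish_reason": "stop", "message": {"content": "full answer"}}
        ],
    }


result = complete_with_length_fallback("Summarize this doc", fake_llama, fake_claude)
print(result["model"])  # prints "claude"
```

Note that this pays for the truncated Llama call as well as the Claude call, and the exact field that signals truncation can differ between providers, so verify it for the models you use.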

Siddharth Bulia

Awesomee! Waiting for it!
