curl --request POST \
  --url https://api.portkey.ai/v1/chat/completions \
  --header 'Content-Type: application/json' \
  --header 'x-portkey-api-key: sxxx=' \
  --header 'x-portkey-config: {"retry":{"attempts":3},"cache":{"mode":"simple"},"strategy":{"mode":"loadbalance"},"targets":[{"virtual_key":"open-ai-kxey-","weight":0.7},{"virtual_key":"test-virtual-ke","weight":0.3}]}' \
  --header 'x-portkey-virtual-key: open-ai-key' \
  --data '{
    "model": "gpt-3.5-turbo",
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "Hello!" }
    ]
  }'
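For readability, here is the x-portkey-config header value from the request above, pretty-printed (the virtual key names are the same placeholder values used in the request):

{
  "retry": { "attempts": 3 },
  "cache": { "mode": "simple" },
  "strategy": { "mode": "loadbalance" },
  "targets": [
    { "virtual_key": "open-ai-kxey-", "weight": 0.7 },
    { "virtual_key": "test-virtual-ke", "weight": 0.3 }
  ]
}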
As I can comprehend:

1. x-portkey-virtual-key will be ignored since x-portkey-config is enabled.
2. The weights could be, say, 0.3, 0.4 and 0.3 across three targets? (See the config sketch below.)
3. model is a mandatory property, and gpt-3.5-turbo is the chosen model, so should the loadbalance targets all be the same gpt-3.5-turbo model with different virtual keys? I suppose I am getting this bit wrong in my understanding?
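To make the weights point concrete, here is a minimal sketch of a loadbalance config with three targets weighted 0.3, 0.4 and 0.3 (the virtual key names are hypothetical placeholders, not values from this thread):

{
  "strategy": { "mode": "loadbalance" },
  "targets": [
    { "virtual_key": "openai-key-1", "weight": 0.3 },
    { "virtual_key": "openai-key-2", "weight": 0.4 },
    { "virtual_key": "anyscale-key", "weight": 0.3 }
  ]
}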
Yes, you got it all right. For the last point: if both the virtual keys are OpenAI, then we will loadbalance on the given model name in data, which is gpt-3.5-turbo here. If the second target is not OpenAI, and is, say, a provider like Anyscale, and you haven't given a specific model name with the override_params config, then we will pick a default model for that provider, which is llama-2-7b. In other words, if there are no override_params, we pick the default model for that provider.
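A minimal sketch of what that override could look like, assuming a second Anyscale target that pins its own model via override_params (the virtual key names here are illustrative assumptions, and the model name is the default mentioned above):

{
  "strategy": { "mode": "loadbalance" },
  "targets": [
    { "virtual_key": "open-ai-key", "weight": 0.7 },
    {
      "virtual_key": "anyscale-key",
      "weight": 0.3,
      "override_params": { "model": "llama-2-7b" }
    }
  ]
}

With a config like this, the first target keeps using the model from the request body, while requests routed to the second target use the model set in its override_params.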