Welcome to Portkey Forum

Gijs
Joined December 3, 2024
Hey folks, I had to set up load balancing for Perplexity (good problem to have). It seems to work: the dashboard shows "Loadbalancer active". But could I trouble you to double-check my config?

I have a hunch that I should be able to simplify it and wouldn't need to enter virt_key_1 twice, but I'm not sure how.

{
  "virtual_key": "virt_key_1",
  "cache": {
    "mode": "semantic",
    "max_age": 10000
  },
  "retry": {
    "attempts": 5,
    "on_status_codes": [429]
  },
  "strategy": {
    "mode": "loadbalance"
  },
  "targets": [
    { "virtual_key": "virt_key_1" },
    { "virtual_key": "virt_key_2" }
  ]
}
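For the simplification: since both targets carry their own virtual_key, the top-level virtual_key looks redundant under a loadbalance strategy, which only routes to the targets. A sketch of the simplified config (worth confirming against Portkey's config reference before relying on it):

```json
{
  "cache": {
    "mode": "semantic",
    "max_age": 10000
  },
  "retry": {
    "attempts": 5,
    "on_status_codes": [429]
  },
  "strategy": {
    "mode": "loadbalance"
  },
  "targets": [
    { "virtual_key": "virt_key_1" },
    { "virtual_key": "virt_key_2" }
  ]
}
```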
12 comments
Ayo legends, happy new year almost. I'm trying to reduce latency because I think it's quite high.

I put some timing measurements in my Portkey route, and it comes down to this:

Final timing: {
  auth: 455.3472920060158,
  balanceCheck: 143.57929101586342,
  messageProcessing: 0.00041601061820983887,
  portkeyInit: 1.2397089898586273,
  messagePrep: 0.03700000047683716,
  toolsSetup: 0.0010839998722076416,
  apiCallSetup: 4235.661958009005,
  portkeyCall: 4235.673999994993,
  timeToFirstToken: 4904.3970829844475,
  totalTime: 14430.662541985512
}

It looks like the main bottleneck is after the call to Portkey. Any tips? Am I messing something up in my config?
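For reference, this is roughly how I'm separating time-to-first-token from total stream time, assuming the SDK hands back an async iterable of chunks (the names here are illustrative, not Portkey's actual API):

```javascript
// Measure time-to-first-token vs. total time over a streamed response.
// `stream` is any async iterable of response chunks.
async function measureStream(stream) {
  const start = performance.now();
  let ttft = null;
  let chunks = 0;
  for await (const _chunk of stream) {
    if (ttft === null) ttft = performance.now() - start; // first token lands here
    chunks += 1;
  }
  return { ttft, total: performance.now() - start, chunks };
}
```

If apiCallSetup and portkeyCall are nearly identical like in my numbers, the two timers may be wrapping the same interval, which would overstate the setup cost.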
5 comments
Any known issue, or a different trick, to get the user to show up for image generation? For chat completions I just pass it in the metadata. I tried to do the same.
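What I'm doing for chat completions is passing the metadata via the x-portkey-metadata header, with the special `_user` key feeding the user field. Whether the image-generation route honors it the same way is exactly my open question, so treat this as an assumption to verify:

```javascript
// Build the Portkey metadata header. `_user` is the special key that
// populates the user field in analytics (assumed to apply to all routes).
function buildMetadataHeaders(user, extra = {}) {
  return { "x-portkey-metadata": JSON.stringify({ _user: user, ...extra }) };
}
```

I spread these into the per-request headers of the images.generate call, same as for chat completions.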
9 comments
Hey folks, I'm trying to implement a 'stop' button for streaming chat responses. The Portkey docs don't mention this, but the general approach I've read about is to pass an AbortController's signal with the request and abort it on stop.

I just tried it, but Portkey doesn't seem to handle it.

Any advice for how to handle this?
19 comments
I'm running into an issue with vision messages: every image message hits the semantic cache. Obviously that's really bad, because it will return text that is based on a different image.

How do I get around this?
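The workaround I'm considering is to detect image parts in the outgoing messages and skip the cache for those requests, since semantic caching keys on text and can match a different image. Whether Portkey supports a per-request config override for this is an assumption I'd want confirmed:

```javascript
// Return true if any message carries an image part, so the caller can
// send a cache-free config override for that request.
function hasImageContent(messages) {
  return messages.some(
    (m) =>
      Array.isArray(m.content) &&
      m.content.some((part) => part.type === "image_url")
  );
}
```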
2 comments
Hey folks, the Portkey docs are a bit ambiguous on this: https://portkey.ai/docs/product/ai-gateway/multimodal-capabilities/vision

But I should be able to send images to GPT-4o and GPT-4o-mini through the regular chat completions route, no?
8 comments
I have a question about using vision properly in a chat context.

In ChatGPT, once you have an image in a conversation, you can keep chatting about it. That seems to require sending the image to the LLM on every turn (similar to resending the entire messages array in a chat). First of all, that will use a lot of tokens, but also, from a practical perspective, how would this be done?

Do you just keep passing all the Base64 data (or the image URL) in the messages array, like below?

messages: [
  {
    role: "user",
    content: [
      { type: "text", text: "What's in this image?" },
      {
        type: "image_url",
        // image_url must be an object with a `url` field, not a bare string
        image_url: {
          url: "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
        },
      },
    ],
  },
],
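My current understanding (an assumption, not confirmed): yes, the image part stays in the history and is re-sent and re-billed on every turn, which is why a hosted URL is preferable to inline Base64. The per-turn bookkeeping would be something like:

```javascript
// Append the assistant's reply and the user's follow-up to the history.
// The earlier image message rides along unchanged on every turn.
function nextTurn(history, assistantReply, userFollowUp) {
  return [
    ...history,
    { role: "assistant", content: assistantReply },
    { role: "user", content: userFollowUp },
  ];
}
```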
1 comment