Strange issue with empty responses from completions endpoint

I have a really strange issue where sometimes, with a large enough system message (or at least that's how I was able to reproduce it), I'm receiving a completely empty response from the completions endpoint (streaming).

The even weirder thing is that there's also no log showing up in the Portkey UI, but I do receive a trace ID in the response headers. 46698b1e-479c-4287-a612-c249e20dee22 is such an example.
Thanks for reporting the issue. We will update here accordingly.
thank you! let me know if I can provide more info to debug this
Also @flagbug
Can you please share the org ID this trace belongs to? It will help us a lot to narrow down the issue.
856e990e-b518-4afc-87dc-8d6d4740b8b7
I guess it's this? It's the ID that's part of the 'getting started' page URL 🤔 Not sure where else to find it
yes
This should help
Thanks
We are checking the same
we're hitting this really often by the way
I hope this is a really weird edge case we're hitting here, because this seems like quite a severe issue 😅
Hey we are checking this internally
One more question
Has this happened since the trace ID you shared with us?
It will help us a lot to know
@flagbug are you available for a call? want to understand more on how we can reproduce this ourselves
yes, it happened very often since then
unfortunately I'm having a hard time reproducing this today
basically what I was doing to reproduce it was to send 5000 words of lorem ipsum as the system message and then just tell the LLM to respond with 'hello' in the user message.

After a number of calls, using our fallback config with Google Vertex AI -> Anthropic, a few requests returned a literally empty body
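Roughly what the reproduction looked like (a minimal sketch, not our exact code — the endpoint URL, header names, placeholder key/config values and model name are assumptions):
Python
# Sketch of the reproduction described above; placeholders, not production code.
import requests

PORTKEY_URL = "https://api.portkey.ai/v1/chat/completions"  # assumed OpenAI-compatible route
HEADERS = {
    "x-portkey-api-key": "PORTKEY_API_KEY",    # placeholder
    "x-portkey-config": "FALLBACK_CONFIG_ID",  # placeholder: Vertex AI -> Anthropic fallback
    "Content-Type": "application/json",
}

lorem = "lorem ipsum dolor sit amet " * 1000  # roughly 5000 words of filler for the system message

body = {
    "model": "claude-3-5-sonnet",  # placeholder model name
    "stream": True,
    "messages": [
        {"role": "system", "content": lorem},
        {"role": "user", "content": "Respond with 'hello'."},
    ],
}

for i in range(20):
    resp = requests.post(PORTKEY_URL, headers=HEADERS, json=body, stream=True)
    raw = b"".join(resp.iter_content(chunk_size=None))  # consume the full stream
    print(i, resp.status_code, resp.headers.get("x-portkey-trace-id"), len(raw))
    # failure mode observed: 200 OK, trace ID present, but len(raw) == 0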
is the latency for these requests on the higher side?
also were there any timeouts or connection drops from Portkey for any of these requests?
nope, they are returning very fast
no timeouts that I could observe
always returned with 200 OK, a bunch of headers, but no body
Plain Text
{
  Date: Wed, 12 Feb 2025 17:04:30 GMT
  Transfer-Encoding: chunked
  Connection: keep-alive
  CF-Ray: 910e28711d94c2b1-VIE
  CF-Cache-Status: DYNAMIC
  Cache-Control: no-cache
  Vary: Origin, X-Origin, Referer, accept-encoding
  Alt-Svc: h3=":443"
  request-id: req_vrtx_01XG8c3dux745uYh7Jj8MPiq
  X-Content-Type-Options: nosniff
  X-Frame-Options: SAMEORIGIN
  x-portkey-cache-status: DISABLED
  x-portkey-last-used-option-index: config.targets[0]
  x-portkey-provider: vertex-ai
  x-portkey-retry-attempt-count: 0
  x-portkey-trace-id: 46698b1e-479c-4287-a612-c249e20dee22
  X-XSS-Protection: 0
  Server: cloudflare
  Content-Type: text/event-stream; charset=utf-8
}
this was the header of such a response
are these calls streaming calls?
got it, thanks a lot for this. this will help us narrow down the root cause. will check this on priority.
If it helps you I can start logging the trace IDs for the calls where it happens. So weird that they don't show up in the dashboard 🤔
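Something like this on my side (just a sketch; only the x-portkey-trace-id header is taken from the real response dump above, everything else is illustrative):
Python
# Sketch: record the Portkey trace ID and body size for every streaming call,
# so empty-body responses can be correlated with the dashboard later.
import logging
import requests

def call_and_log(url: str, headers: dict, body: dict) -> bytes:
    resp = requests.post(url, headers=headers, json=body, stream=True)
    raw = b"".join(resp.iter_content(chunk_size=None))        # consume the full stream
    trace_id = resp.headers.get("x-portkey-trace-id", "n/a")  # header from the dump above
    if not raw:
        logging.warning("empty body: status=%s trace_id=%s", resp.status_code, trace_id)
    return raw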
as a gateway, our highest priority is always to add as little latency as possible to the requests. Also, we don't want any LLM requests to fail because of how Portkey works.
so we log the requests asynchronously after the response is sent back to the user. We use the same trace ID that we send in the response header for the recorded log.
In this particular case, it is not coming up in the dashboard because somehow the log is not being captured correctly on our side after the response has been sent to you.
Hope that answers your question
One reason I can think of: do you have any mechanism of closing the stream abruptly once you receive the last chunk?
Btw, my gut feeling tells me it has something to do with my fallback config. Unfortunately I don't have any data on this, but I believe this might have started happening when I introduced a fallback from Google Vertex AI -> Anthropic, where before it was only Anthropic.

Again, only a feeling, not 100% sure, but Vertex AI has been notoriously wonky, so maybe that's a start for the investigation.
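For context, the fallback config is roughly this shape (a sketch written as a Python dict; the virtual key names are placeholders, only the ordering — Vertex AI first, Anthropic as fallback — matters):
Python
# Sketch of the fallback config shape; virtual key names are placeholders.
fallback_config = {
    "strategy": {"mode": "fallback"},
    "targets": [
        {"virtual_key": "vertex-ai-virtual-key"},   # primary: Google Vertex AI
        {"virtual_key": "anthropic-virtual-key"},   # fallback: Anthropic
    ],
}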
One reason I can think of: do you have any mechanism of closing the stream abruptly once you receive the last chunk?

We always consume the full stream
got it. thanks again for all the information you provided. this will definitely help us narrow down why this is happening
While debugging this, I literally looped the whole returned stream into a string, and for these cases the final string was completely empty
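Roughly what "looping the stream into a string" looked like (a sketch assuming SSE-style chunks in the OpenAI-compatible format, not the exact client code):
Python
# Sketch: read every SSE line, collect the content deltas, and return the final string.
import json
import requests

def read_stream_to_string(resp: requests.Response) -> str:
    collected = []
    for line in resp.iter_lines(decode_unicode=True):  # iterate until the server closes the stream
        if not line or not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        choices = chunk.get("choices") or [{}]
        delta = choices[0].get("delta", {}).get("content")
        if delta:
            collected.append(delta)
    # in the failing cases the loop body never runs at all (zero bytes arrive), so this returns ""
    return "".join(collected)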
what I can do is remove the fallback config and always use a single provider to check if it still happens. Not sure if I'll get to it today, but will let you know if I have more information.
Alright, I can pinpoint the issue better now:
  1. It's not related to the fallback config
  2. The issue doesn't occur when targeting the Anthropic provider directly
  3. The issue does occur when targeting Google Cloud Vertex AI
so either something is wrong with Vertex AI, or there is a bug in Portkey when forwarding the Vertex response 🤔
Ok, I've reverted to a state before we were using Portkey and integrated Vertex AI directly, and can't reproduce this issue.

This looks like something between Portkey <-> Vertex 😬
Any news about this? We're currently blocked from using Vertex because of this bug, which means we won't be able to use Portkey in the future and will have to look for something else 🙈
hey @flagbug we will be releasing a fix for it very soon. thanks for your patience, and we don't want you to leave 🥲. But understandable if it's mission/business critical for you
Hey @flagbug, can you please share the code snippet that is being used for reading the stream response? One of the reasons I could think of is that the client is disconnecting before the whole stream is read. Due to this, the logging will not happen for the request.

Will it also be possible to make the request using a client like Postman, so that we can confirm that it has nothing to do with stream handling at the client level?
I can reproduce the exact same thing in Postman. E.g. it just happened to me for trace ID 9b6d2d4c-5861-4a6f-883e-f58f14993075, which doesn't show up in the Portkey dashboard
ok so what's interesting: when I use stream: false, in addition to 429 Too Many Requests I'm also, sometimes, getting 529 Service Overloaded errors from Vertex AI.

I never get a 529 response (only 429) when calling with stream: true. I'm wondering if that's the missing piece, where a request that returns 529 just disappears?
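A sketch of that comparison (illustrative only; the helper name and arguments are assumptions, not the exact calls I made):
Python
# Sketch: replay the same request with streaming disabled to surface the upstream
# status code (429 vs 529) that seems to get swallowed in streaming mode.
import requests

def check_non_streaming(url: str, headers: dict, body: dict) -> None:
    resp = requests.post(url, headers=headers, json={**body, "stream": False})
    print(resp.status_code, resp.headers.get("x-portkey-trace-id"))
    if resp.status_code in (429, 529):
        print("upstream overload details:", resp.text[:200])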
@flagbug are you available for a call?
it looks like we're now hitting the same issue pretty often with Anthropic too, not just Vertex AI. Previously it happened very rarely, so it just didn't occur often enough for us to notice.
well, unfortunately that ends our journey with Portkey, we need a production-ready tool for our launch 🫤
Sorry to hear that! We're actively working to resolve this edge case