Strange issue with empty responses from completions endpoint

I have a really strange issue where sometimes, with a large enough system message (or at least that's how I was able to reproduce it), I'm receiving a completely empty response from the completions endpoint (streaming).

The even weirder thing is that there's also no log showing up in the Portkey UI, but I do receive a trace ID in the response headers. 46698b1e-479c-4287-a612-c249e20dee22 is such an example.
Thanks for reporting the issue. We will update here accordingly.
thank you! let me know if I can provide more info to debug this
Also @flagbug
Can you please share the org ID this trace belongs to? It will help us a lot to narrow down the issue.
856e990e-b518-4afc-87dc-8d6d4740b8b7
I guess it's this? It's the ID that's part of the 'getting started' page URL 🤔 Not sure where else to find it
yes
This should help
Thanks
We are checking the same
we're hitting this really often by the way
I hope this is a really weird edge case we're hitting here, because this seems like quite a severe issue 😅
Hey we are checking this internally
One more question
Has this happened since the trace ID you shared with us?
It will help us a lot to know
@flagbug are you available for a call? want to understand more on how we can reproduce this ourselves
yes, it happened very often since then
unfortunately I'm having a hard time reproducing this today
basically what I was doing to reproduce it was to send 5000 words of lorem ipsum as the system message and then just tell the LLM to respond with 'hello' in the user message.

After a number of calls, using our fallback config with Google Vertex AI -> Anthropic, a few requests returned a literally empty body
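Roughly what the reproduction looked like (a minimal sketch, not our exact code — the endpoint URL, header names, placeholder key/config values and model name are assumptions):
Python
# Sketch of the reproduction described above; placeholders, not production code.
import requests

PORTKEY_URL = "https://api.portkey.ai/v1/chat/completions"  # assumed OpenAI-compatible route
HEADERS = {
    "x-portkey-api-key": "PORTKEY_API_KEY",    # placeholder
    "x-portkey-config": "FALLBACK_CONFIG_ID",  # placeholder: Vertex AI -> Anthropic fallback
    "Content-Type": "application/json",
}

lorem = "lorem ipsum dolor sit amet " * 1000  # roughly 5000 words of filler for the system message

body = {
    "model": "claude-3-5-sonnet",  # placeholder model name
    "stream": True,
    "messages": [
        {"role": "system", "content": lorem},
        {"role": "user", "content": "Respond with 'hello'."},
    ],
}

for i in range(20):
    resp = requests.post(PORTKEY_URL, headers=HEADERS, json=body, stream=True)
    raw = b"".join(resp.iter_content(chunk_size=None))  # consume the full stream
    print(i, resp.status_code, resp.headers.get("x-portkey-trace-id"), len(raw))
    # failure mode observed: 200 OK, trace ID present, but len(raw) == 0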
is the latency for these requests on the higher side?
also were there any timeouts or connection drops from Portkey for any of these requests?
nope, they are returning very fast
no timeouts that I could observe
always returned with 200 OK, a bunch of headers, but no body
Plain Text
{
  Date: Wed, 12 Feb 2025 17:04:30 GMT
  Transfer-Encoding: chunked
  Connection: keep-alive
  CF-Ray: 910e28711d94c2b1-VIE
  CF-Cache-Status: DYNAMIC
  Cache-Control: no-cache
  Vary: Origin, X-Origin, Referer, accept-encoding
  Alt-Svc: h3=":443"
  request-id: req_vrtx_01XG8c3dux745uYh7Jj8MPiq
  X-Content-Type-Options: nosniff
  X-Frame-Options: SAMEORIGIN
  x-portkey-cache-status: DISABLED
  x-portkey-last-used-option-index: config.targets[0]
  x-portkey-provider: vertex-ai
  x-portkey-retry-attempt-count: 0
  x-portkey-trace-id: 46698b1e-479c-4287-a612-c249e20dee22
  X-XSS-Protection: 0
  Server: cloudflare
  Content-Type: text/event-stream; charset=utf-8
}
this was the header of such a response
are these calls streaming calls?
got it, thanks a lot for this. this will help us narrow down the root cause. will check this on priority.
If it helps you I can start logging the trace IDs for the calls where it happens. So weird that they don't show up in the dashboard 🤔
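Something like this on my side (just a sketch; only the x-portkey-trace-id header is taken from the real response dump above, everything else is illustrative):
Python
# Sketch: record the Portkey trace ID and body size for every streaming call,
# so empty-body responses can be correlated with the dashboard later.
import logging
import requests

def call_and_log(url: str, headers: dict, body: dict) -> bytes:
    resp = requests.post(url, headers=headers, json=body, stream=True)
    raw = b"".join(resp.iter_content(chunk_size=None))        # consume the full stream
    trace_id = resp.headers.get("x-portkey-trace-id", "n/a")  # header from the dump above
    if not raw:
        logging.warning("empty body: status=%s trace_id=%s", resp.status_code, trace_id)
    return raw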
as a gateway, our highest priority is always to add as little latency as possible to the requests. Also, we don't want any LLM requests to fail because of how Portkey works.
so we log the requests asynchronously after the response is sent back to the user. We use the same trace ID that we send in the response header for the recorded log.
In this particular case, it is not coming up in the dashboard because somehow the log is not being captured correctly on our side after the response has been sent to you.
Hope that answers your question
One reason I can think of: do you have any mechanism of closing the stream abruptly once you receive the last chunk?
Btw, my gut feeling tells me it has something to do with my fallback config. Unfortunately I don't have any data on this, but I believe this might have started happening when I introduced a fallback from Google Vertex AI -> Anthropic, where before it was only Anthropic.

Again, only a feeling, not 100% sure, but Vertex AI has been notoriously wonky, so maybe that's a start for the investigation.
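For context, the fallback config is roughly this shape (a sketch written as a Python dict; the virtual key names are placeholders, only the ordering — Vertex AI first, Anthropic as fallback — matters):
Python
# Sketch of the fallback config shape; virtual key names are placeholders.
fallback_config = {
    "strategy": {"mode": "fallback"},
    "targets": [
        {"virtual_key": "vertex-ai-virtual-key"},   # primary: Google Vertex AI
        {"virtual_key": "anthropic-virtual-key"},   # fallback: Anthropic
    ],
}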
One reason I can think of: do you have any mechanism of closing the stream abruptly once you receive the last chunk?

We always consume the full stream
got it. thanks again for all the information you provided. this will definitely help us narrow down why this is happening
While debugging this, I literally looped the whole returned stream into a string, and for these cases the final string was completely empty
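Roughly what "looping the stream into a string" looked like (a sketch assuming SSE-style chunks in the OpenAI-compatible format, not the exact client code):
Python
# Sketch: read every SSE line, collect the content deltas, and return the final string.
import json
import requests

def read_stream_to_string(resp: requests.Response) -> str:
    collected = []
    for line in resp.iter_lines(decode_unicode=True):  # iterate until the server closes the stream
        if not line or not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        choices = chunk.get("choices") or [{}]
        delta = choices[0].get("delta", {}).get("content")
        if delta:
            collected.append(delta)
    # in the failing cases the loop body never runs at all (zero bytes arrive), so this returns ""
    return "".join(collected)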
what I can do is remove the fallback config and always use a single provider to check if it still happens. Not sure if I'll get to it today, but will let you know if I have more information.
Alright, I can pinpoint the issue better now:
  1. It's not related to the fallback config
  2. The issue doesn't occur when targeting the Anthropic provider directly
  3. The issue does occur when targeting Google Cloud Vertex AI
so either something is wrong with Vertex AI, or there is a bug in Portkey when forwarding the Vertex response 🤔
Ok, I've reverted to a state before we were using Portkey and integrated Vertex AI directly, and can't reproduce this issue.

This looks like something between Portkey <-> Vertex 😬
Any news about this? We're currently blocked from using Vertex because of this bug, which means we won't be able to use Portkey in the future and will have to look for something else 🙈
hey @flagbug we will be releasing a fix for it very soon. thanks for your patience, and we don't want you to leave 🥲. But understandable if it's mission/business critical for you
Hey @flagbug, can you please share the code snippet that is being used for reading the stream response? One of the reasons I could think of is that the client is disconnecting before the whole stream is read. Due to this, the logging will not happen for the request.

Will it also be possible to make the request using a client like Postman, so that we can confirm that it has nothing to do with stream handling at the client level?
I can reproduce the exact same thing in Postman. E.g. it just happened to me for trace ID 9b6d2d4c-5861-4a6f-883e-f58f14993075, which doesn't show up in the Portkey dashboard
ok so what's interesting: when I use stream: false, in addition to 429 Too Many Requests I'm also, sometimes, getting 529 Service Overloaded errors from Vertex AI.

I never get a 529 response (only 429) when calling with stream: true. I'm wondering if that's the missing piece, where a request that returns 529 just disappears?
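A sketch of that comparison (illustrative only; the helper name and arguments are assumptions, not the exact calls I made):
Python
# Sketch: replay the same request with streaming disabled to surface the upstream
# status code (429 vs 529) that seems to get swallowed in streaming mode.
import requests

def check_non_streaming(url: str, headers: dict, body: dict) -> None:
    resp = requests.post(url, headers=headers, json={**body, "stream": False})
    print(resp.status_code, resp.headers.get("x-portkey-trace-id"))
    if resp.status_code in (429, 529):
        print("upstream overload details:", resp.text[:200])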
@flagbug are you available for a call?
it looks like we're now hitting the same issue pretty often with Anthropic too, not just Vertex AI. Previously it happened very rarely, so it just didn't occur often enough for us to notice.
well, unfortunately that ends our journey with Portkey, we need a production-ready tool for our launch 🫤
Sorry to hear that! We're actively working to resolve this edge case