Welcome to Portkey Forum

Arya
Extremely odd issue: my service got hung up for 10 minutes calling OpenAI via the gateway, where the model-specific timeout is set to 2 minutes.
The Portkey logs have a gap of 10 minutes:
trace_id: c1d2e8a6-8e0c-4962-b9c2-322bcb6c1c38 (original)
trace_id: 5c3d907b-17d0-4631-bdfd-720684f4f61c (cached)

The second request that the Portkey logs show after 10 minutes hits the cache, as it is exactly the same request; I would assume this is the result of a retry.

The metadata on the logs shows the first request at 5130 ms, though.

However, when I look into my tracing application, it confirms what I encountered, and the gap ends exactly when the second request shows up in the Portkey logs and hits the cache.

I would assume the call should have timed out.
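For reference, the 2-minute timeout in question is the per-target request_timeout (in milliseconds) in the gateway config, along these lines (a minimal sketch; the placeholder virtual key mirrors the fuller config later in this thread):

Plain Text
{
    "targets": [
        {
            "virtual_key": "open-ai-virtual-xxxx",
            "override_params": { "model": "gpt-4o-2024-08-06" },
            "request_timeout": 120000
        }
    ]
}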
2 comments
If I am sending a new seed every time, would it cause my request to miss the cache? I am trying to figure out why a cache miss is happening for the same request.
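For intuition, here is why a fresh seed would defeat a simple exact-match cache, assuming the cache key is derived from the full request body (the hash below is illustrative, not Portkey's actual implementation):

Python
import hashlib
import json

# If the cache key hashes the entire request body, any changed field --
# including seed -- produces a different key, so the second request can
# never hit the first request's cache entry.
def cache_key(body: dict) -> str:
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

base = {
    "model": "gpt-4o-2024-08-06",
    "messages": [{"role": "user", "content": "What is a Portkey?"}],
}

# Same prompt, different seed -> different key -> cache MISS.
print(cache_key({**base, "seed": 1}) == cache_key({**base, "seed": 2}))  # False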
6 comments
I want to understand the fallback config. I simulated a request timeout, where I get:

Plain Text
{
  "status": 408,
  "headers": {
    "content-type": "application/json",
    "x-portkey-cache-status": "MISS",
    "x-portkey-last-used-option-index": "config.targets[0]",
    "x-portkey-provider": "openai",
    "x-portkey-retry-attempt-count": "0",
    "x-portkey-trace-id": "9ac0fc87-562c-4b42-92e6-ad3cdb100880"
  },
  "body": {
    "error": {
      "message": "Request exceeded the timeout sent in the request: 12ms",
      "type": "timeout_error",
      "param": null,
      "code": null
    }
  },
  "responseTime": 1851,
  "lastUsedOptionJsonPath": "config.targets[0]"
}

My config does not include 408 in its on_status_codes list, yet the gateway falls back and uses the second model. What am I missing?

Also, if I have on_status_codes under strategy and also nested within the target, which one takes preference?
4 comments
Also, when I added retry to the nested object as follows:

Plain Text
{
    "strategy": {
        "mode": "fallback",
        "on_status_codes": [
            401,
            500,
            503,
            520,
            524
        ]
    },
    "request_timeout": 360000,
    "targets": [
        {
            "virtual_key": "open-ai-virtual-xxxx",
            "override_params": {
                "model": "gpt-4o-2024-08-06"
            },
            "request_timeout": 12,
            "retry": {
                "attempts": 1,
                "on_status_codes": [
                    429,
                    408
                ]
            }
        },
        {
            "virtual_key": "anthropic-api-k-xxxx",
            "override_params": {
                "model": "claude-3-7-sonnet-20250219"
            },
            "request_timeout": 120000
        },
        {
            "virtual_key": "anthropic-api-k-xxxx",
            "override_params": {
                "model": "claude-3-5-sonnet-20241022"
            },
            "request_timeout": 120000
        }
    ],
    "cache": {
        "mode": "simple",
        "max_age": 6
    }
}
....

Also, when I play with the retry number, I definitely observe that setting it to 3 takes longer to fall back to Anthropic, but the Portkey UI only shows one log for the GPT-4o call and one for Claude; there is no information available on the retries.
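A back-of-envelope for the extra delay (a sketch; the backoff schedule is a hypothetical placeholder, not Portkey's documented one):

Python
# Rough model of why attempts=3 delays the fallback: the first target is
# tried 1 + attempts times before target[1] is consulted. The backoff
# values below are hypothetical placeholders.
timeout_ms = 12                      # target[0].request_timeout above
attempts = 3                         # retry.attempts
backoff_ms = [0, 1000, 2000, 4000]   # hypothetical wait before each try

tries = 1 + attempts
delay = sum(backoff_ms[:tries]) + tries * timeout_ms
print(f"~{delay} ms before the Anthropic target is attempted")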
2 comments
@Team Portkey anything on the connection issue I flagged above? It's a blocker for us to adopt Portkey and deploy to production. If nothing, the alternative unfortunately would be to use some other gateway.
6 comments
Here is the full traceback:
Plain Text
2025-02-27 13:09:14.737 | ERROR | ai_core.llms.open_ai_wrapper:generate_text_response_async:436 | [Portkey Gateway] Unexpected error while calling Portkey Gateway with config pc-opeai-5393d5: 'ConnectTimeout' object has no attribute 'response'

  File "/app/.venv/lib/python3.12/site-packages/portkey_ai/_vendor/openai/_base_client.py", line 1860, in post
    return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
  File "/app/.venv/lib/python3.12/site-packages/portkey_ai/_vendor/openai/_base_client.py", line 1554, in request
    return await self._request(
  File "/app/.venv/lib/python3.12/site-packages/portkey_ai/_vendor/openai/_base_client.py", line 1601, in _request
    if remaining_retries > 0 and self._should_retry(err.response):
AttributeError: 'ConnectTimeout' object has no attribute 'response'
15 comments
Is there a plan to support Hugging Face Inference Endpoints? Ideally I would like to have all AI configs in a single gateway.
Problem: dedicated Inference Endpoints, especially ones on NVIDIA GPU instances that scale to zero, often go down in prod and warrant a fallback option; a sketch of the desired config shape follows.
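For concreteness, the shape I'd want (a sketch; pointing a target at an OpenAI-compatible endpoint via custom_host is an assumption, not confirmed Hugging Face support, and the URL/key are placeholders):

Plain Text
{
    "strategy": { "mode": "fallback" },
    "targets": [
        {
            "provider": "openai",
            "custom_host": "https://xxxx.endpoints.huggingface.cloud/v1",
            "api_key": "hf_xxxx"
        },
        {
            "virtual_key": "open-ai-virtual-xxxx",
            "override_params": { "model": "gpt-4o-2024-08-06" }
        }
    ]
}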
6 comments
Arya

Mock Fallback

Any idea how I can mock a fallback? (Without messing around by providing a wrong API key.)
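One approach that follows from the earlier experiments in this thread: give the first target an impossibly small request_timeout so it reliably times out and the gateway moves on to the next target (a sketch reusing the placeholder virtual keys from the config above; whether a timeout should trigger fallback without 408 in on_status_codes is exactly the open question earlier in this thread):

Plain Text
{
    "strategy": { "mode": "fallback" },
    "targets": [
        {
            "virtual_key": "open-ai-virtual-xxxx",
            "request_timeout": 1
        },
        {
            "virtual_key": "anthropic-api-k-xxxx",
            "override_params": { "model": "claude-3-7-sonnet-20250219" }
        }
    ]
}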
2 comments
I was able to get it to work via the following schema:
Plain Text
"virtual_key": "xxxxxxxxxxx",
"override_params": {
    "model": "anthropic.claude-3-7-sonnet-20250219-v1:0"
}

But when the response is returned from the server, the model field is empty.

e.g.
Plain Text
{
    "id": "1740523845732",
    "choices": [
        {
            "finish_reason": "max_tokens",
            "index": 0,
            "logprobs": null,
            "message": {
                "content": "{\n    \"description\": \"A Portkey is a magical object in the Harry Potter universe that has been enchanted to instantly transport anyone who touches it to a specific predetermined destination. It can be any ordinary object (like an old boot, newspaper, or bottle) that has been spelled to transport wizards an",
                "role": "assistant",
                "function_call": null,
                "tool_calls": null,
                "refusal": null,
                "audio": null
            }
        }
    ],
    "created": 1740523845,
    "model": "",
    "object": "chat.completion",
    "system_fingerprint": null,
    "usage": {
        "prompt_tokens": 25,
        "completion_tokens": 64,
        "total_tokens": 89,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
    },
    "service_tier": null,
    "provider": "bedrock"
}


So I am unsure how I will know which model, among the multiple fallback options, was actually called.
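One workaround until the model field is populated: the gateway already returns x-portkey-last-used-option-index and x-portkey-provider response headers (visible in the 408 response earlier in this thread), which identify the target that served the request. A sketch with httpx (the x-portkey-api-key and x-portkey-config request headers are assumptions about the REST API, and the values are placeholders):

Python
import httpx

# Identify which fallback target answered by reading Portkey's response
# headers instead of the (empty) model field in the body.
resp = httpx.post(
    "https://api.portkey.ai/v1/chat/completions",
    headers={
        "x-portkey-api-key": "PORTKEY_API_KEY",  # placeholder
        "x-portkey-config": "pc-xxxx",           # placeholder config slug
    },
    json={
        "messages": [{"role": "user", "content": "What is a Portkey?"}],
        "max_tokens": 64,
    },
    timeout=120,
)
print(resp.headers.get("x-portkey-last-used-option-index"))  # e.g. config.targets[0]
print(resp.headers.get("x-portkey-provider"))                # e.g. bedrock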
1 comment