Welcome to Portkey Forum

Hello,
In the documentation I see an option to block a specific model, but when I try the same, the Config is not accepting the "model" property. Please let me know.

This is from Portkey documentation:
For example, if you're using OpenAI and want to block a specific model, your configuration might look like this:

Plain Text
{
  "provider": "openai",
  "api_key": "your-api-key",
  "model": "gpt-4" // Specify only the model(s) you want to use
}
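For what it's worth, the workaround I'm considering is pinning the model with a target-level override_params block (the same construct other configs in this thread use), though I'm not sure whether that actually blocks other models or just overrides whatever the caller sends. A rough sketch only, with a placeholder virtual key:

Plain Text
{
  "targets": [
    {
      "virtual_key": "openai-virtual-xxxx",
      "override_params": {
        "model": "gpt-4" // forces this model; placeholder values throughout
      }
    }
  ]
}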
2 comments
Hi all, working on some configs and trying to understand if it is possible to block a specific model through the config. I am able to load balance and set the weight to zero, but is there a way to explicitly block a model from a provider?
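For reference, the weight-zero approach I mentioned looks roughly like this (placeholder virtual keys and model names, and assuming a loadbalance strategy never routes to a target whose weight is 0):

Plain Text
{
  "strategy": { "mode": "loadbalance" },
  "targets": [
    {
      "virtual_key": "openai-virtual-xxxx",
      "override_params": { "model": "gpt-4o" },
      "weight": 1
    },
    {
      "virtual_key": "openai-virtual-xxxx",
      "override_params": { "model": "gpt-4" },
      "weight": 0 // never routed to, so effectively blocked
    }
  ]
}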
2 comments
Is there a way to pass PDF files to Vertex AI without uploading them to Google Cloud Storage? Their SDK supports it; I was wondering how to do it with Portkey.
https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/inference
4 comments
If I am sending a new seed every time, would it cause my request to miss the cache? I'm trying to figure out why cache misses are happening for the same request.
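To illustrate what I suspect is happening (assuming the simple cache keys on the full request body): two otherwise-identical payloads that differ only in seed would produce different cache keys and therefore miss. The model and messages below are just placeholders:

Plain Text
// request 1 – cached under a key derived from this exact body (assumption)
{ "model": "gpt-4o", "seed": 41, "messages": [{ "role": "user", "content": "Hello" }] }

// request 2 – same prompt, new seed, so a different key and a MISS
{ "model": "gpt-4o", "seed": 42, "messages": [{ "role": "user", "content": "Hello" }] }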
2 comments
Hey everyone. I'm evaluating Portkey's OSS gateway. I'm confused by the documentation, as I'm unable to find some features to get started, in particular virtual keys. Are those not included in the OSS gateway?
2 comments
Heyy everyone, I am new to Portkey and am a bit overwhelmed by this amazing project. I am currently hosting the Docker Compose container locally and am working on conditional routing. The documentation told me to create virtual keys for the targets; however, I could not find out where I can create those virtual keys locally, other than creating them on the official app.portkey.ai console. Does that mean that to use conditional routing, I must access the service through Portkey Cloud?

Also, I could not access the gateway console at http://localhost:8787/public/ as this route simply does not exist in index.ts. Therefore, I am super confused by the GitHub README documentation.
7 comments
Hi everyone!
A few days ago we moved from Gateway 1.5.3 to 1.9.8 and have been having issues with the Bedrock Sonnet model.
With Gateway 1.5.3 we were setting max_tokens=200_000; since the max_tokens parameter was required, we decided to set it to a big number.
Now with Gateway 1.9.8 we're getting the following error:
Plain Text
Error code: 400 - {'error': {'message': 'bedrock error: The maximum tokens you requested exceeds the model limit of 4096. Try again with a maximum tokens value that is lower than 4096.', 'type': None, 'param': None, 'code': None}, 'provider': 'bedrock'}

If we remove the max_tokens param or set it to 4096 or lower, the request is successful.
If we downgrade to Gateway 1.5.3, then even with max_tokens=200_000 the request passes successfully.
So we wonder: why is the request failing in Gateway 1.9.8? It seems there is logic somewhere that checks whether max_tokens_val > max_tokens_max_allowed_val.

Also, we deduced that the problem starts in Gateway 1.8.0, where changes involving max_tokens took place.
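In case it helps others, what works for us in the meantime is simply capping max_tokens at the model limit on the Bedrock target. A sketch only: the virtual key is a placeholder, the model ID is illustrative, and it assumes override_params can carry max_tokens alongside model:

Plain Text
{
  "targets": [
    {
      "virtual_key": "bedrock-virtual-xxxx",
      "override_params": {
        "model": "anthropic.claude-3-sonnet-20240229-v1:0",
        "max_tokens": 4096
      }
    }
  ]
}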
5 comments
Feature Suggestion:

  • Currently, in the prompt tab, users can only create a single folder, with all prompts placed inside it.
  • I suggest adding the ability to create nested folders (sub-folders). This would significantly improve the organization and structure of prompts; as the number of prompts grows over time, users often need another layer of breakdown (sub-folders).
2 comments
I noticed prompts can have labels (production, staging, etc.). How do I use them via the prompt render API?
1 comment
Hey team,

It seems the logs are no longer showing:

  1. Prompt
  2. User input
Now they just show the assistant response; is that intended?
Viewing the user input was extremely helpful. Now it takes us a lot more time to debug.
2 comments
Can I get a report or summary of all the requests whose user email contains a specific string, along with their total cost?
For instance:
Get all requests containing STRING.com in the _user field.
Sum all those requests' costs.
3 comments
Also related: I added the LangChain callback handler, but I do not see the tree-like representation of my trace; instead I just see a linear timeline of LLM calls. I was expecting something like the screenshot here: https://portkey.ai/docs/integrations/agents/langgraph#5-traces
4 comments
I am a bit confused by the docs here under the Auto-Instrumentation heading. The image is broken and the instrumentation field does not exist in the SDK. What am I missing? 😄
https://portkey.ai/docs/integrations/agents/langgraph#auto-instrumentation
8 comments
cartman_

Cancel

How do I cancel my account? I couldn't find it under billing or anywhere else. More importantly, I asked one of your support engineers to cancel the service, and you still charged me for months.
5 comments
Question: With AWS Bedrock + Claude, I see that sometimes the request "kinda fails", i.e. the response has a 200 status code but the response content is empty. Can I somehow use Portkey's native retry mechanism here? Since the status code is 200, I cannot use status-code-based retries.
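For context, this is roughly what my status-code based retry block looks like today; since it keys purely off status codes, I don't see how it could fire on a 200 with an empty body. A sketch only, with a placeholder virtual key:

Plain Text
{
  "virtual_key": "bedrock-virtual-xxxx",
  "retry": {
    "attempts": 3,
    "on_status_codes": [429, 500, 503] // never matches a 200, even if the body is empty
  }
}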
2 comments
Also when I added retry to the nested object as follows:

Plain Text
{
    "strategy": {
        "mode": "fallback",
        "on_status_codes": [
            401,
            500,
            503,
            520,
            524
        ]
    },
    "request_timeout": 360000,
    "targets": [
        {
            "virtual_key": "open-ai-virtual-xxxx",
            "override_params": {
                "model": "gpt-4o-2024-08-06"
            },
            "request_timeout": 12,
            "retry": {
                "attempts": 1,
                "on_status_codes": [
                    429,
                    408
                ]
            }
        },
        {
            "virtual_key": "anthropic-api-k-xxxx",
            "override_params": {
                "model": "claude-3-7-sonnet-20250219"
            },
            "request_timeout": 120000
        },
        {
            "virtual_key": "anthropic-api-k-xxxx",
            "override_params": {
                "model": "claude-3-5-sonnet-20241022"
            },
            "request_timeout": 120000
        }
    ],
    "cache": {
        "mode": "simple",
        "max_age": 6
    }
}
....

Also, when I play with the retry number, I definitely observe that setting it to 3 takes longer before falling back to Anthropic, but the Portkey UI only shows one log for gpt-4 and one for Claude; there is no information available on the retries.
2 comments
I want to understand the fallback config. I simulated a request timeout, where I get:

Plain Text
{
  "status": 408,
  "headers": {
    "content-type": "application/json",
    "x-portkey-cache-status": "MISS",
    "x-portkey-last-used-option-index": "config.targets[0]",
    "x-portkey-provider": "openai",
    "x-portkey-retry-attempt-count": "0",
    "x-portkey-trace-id": "9ac0fc87-562c-4b42-92e6-ad3cdb100880"
  },
  "body": {
    "error": {
      "message": "Request exceeded the timeout sent in the request: 12ms",
      "type": "timeout_error",
      "param": null,
      "code": null
    }
  },
  "responseTime": 1851,
  "lastUsedOptionJsonPath": "config.targets[0]"
}

My config does not include 408 in its on_status_codes list, yet the gateway falls back and uses the second model. What am I missing?

Also, if I have on_status_codes under strategy and then nested within a target, which gets preference?
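To make the question concrete, this is the minimal shape I'm asking about, since on_status_codes appears in two places and it's unclear which governs what. My current reading (which may well be wrong) is that the strategy-level list gates fallback between targets, while the target-level retry list gates retries within a single target. Placeholder virtual keys:

Plain Text
{
  "strategy": {
    "mode": "fallback",
    "on_status_codes": [500, 503]
  },
  "targets": [
    {
      "virtual_key": "openai-virtual-xxxx",
      "retry": {
        "attempts": 2,
        "on_status_codes": [429, 408]
      }
    },
    { "virtual_key": "anthropic-virtual-xxxx" }
  ]
}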
4 comments
@Team Portkey anything on the connection issue I flagged above? It's a blocker for us to adopt Portkey and deploy to production. If nothing, the alternative, unfortunately, would be to use some other gateway.
6 comments
Hi @Team Portkey, I am unable to add Tools for Claude Sonnet 3.5/3.7 with AWS Bedrock on the prompt playground. Can this please be fixed?
7 comments
Also, there are multiple bugs related to image_urls that keep recurring.
8 comments
I work for a giant multinational company, and when I recently created a Portkey account, I ended up being a member of an "Organization" created by some dude 9 time zones away from me who works for a completely independent group within the company. Now I can't create virtual keys, configure anything, or do any of the investigations I need to do in order to build our app's Portkey integration. I have asked the 9-time-zones-away dude to remove me, but in the meantime, is there any way I can get myself removed, or even just delete my account so I can sign up again and decline membership in his organization?
2 comments
I've noticed that the costs in Portkey logs are consistently off for me.

Just now, with Claude Sonnet Thinking, I used OpenRouter via Portkey; OpenRouter showed me a cost of 26 cents, while Portkey said only 7 cents. The token count was wrong too. Maybe because it's a new thinking model?

But I've seen similar issues with Perplexity; the costs on Portkey and my actual usage on the Perplexity API don't add up :/
7 comments
Hey guys, do you have any method to construct a URL using the trace_id returned by prompt completions?
We need an access point with the prompt and input snapshot for each iteration.
If not, at least a way to retrieve the information available through the trace_id.
1 comment
Here is the full traceback:
Plain Text
File "/app/.venv/lib/python3.12/site-packages/portkey_ai/_vendor/openai/_base_client.py", line 1860, in post
    return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
File "/app/.venv/lib/python3.12/site-packages/portkey_ai/_vendor/openai/_base_client.py", line 1554, in request
    return await self._request(
File "/app/.venv/lib/python3.12/site-packages/portkey_ai/_vendor/openai/_base_client.py", line 1601, in _request
    if remaining_retries > 0 and self._should_retry(err.response):
AttributeError: 'ConnectTimeout' object has no attribute 'response'

2025-02-27 13:09:14.737 | ERROR | ai_core.llms.open_ai_wrapper:generate_text_response_async:436 | [Portkey Gateway] Unexpected error while calling Portkey Gateway with config pc-opeai-5393d5: 'ConnectTimeout' object has no attribute 'response'
15 comments