Welcome to Portkey Forum

Hello,
In the documentation I see an option to block a specific model, but when I try the same, the Config is not accepting the "model" property. Please let me know.

This is from Portkey documentation:
For example, if you're using OpenAI and want to block a specific model, your configuration might look like this:

Plain Text
{
  "provider": "openai",
  "api_key": "your-api-key",
  "model": "gpt-4" // Specify only the model(s) you want to use
}
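For what it's worth, the workaround I'm considering is pinning the model with a target-level override_params block (the same construct other configs in this thread use), though I'm not sure whether that actually blocks other models or just overrides whatever the caller sends. A rough sketch only, with a placeholder virtual key:

Plain Text
{
  "targets": [
    {
      "virtual_key": "openai-virtual-xxxx",
      "override_params": {
        "model": "gpt-4" // forces this model; placeholder values throughout
      }
    }
  ]
}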
2 comments
Hi all, working on some configs and trying to understand if it is possible to block a specific model through the config. I am able to load balance and set the weight to zero, but is there a way to explicitly block a model from a provider?
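For reference, the weight-zero approach I mentioned looks roughly like this (placeholder virtual keys and model names, and assuming a loadbalance strategy never routes to a target whose weight is 0):

Plain Text
{
  "strategy": { "mode": "loadbalance" },
  "targets": [
    {
      "virtual_key": "openai-virtual-xxxx",
      "override_params": { "model": "gpt-4o" },
      "weight": 1
    },
    {
      "virtual_key": "openai-virtual-xxxx",
      "override_params": { "model": "gpt-4" },
      "weight": 0 // never routed to, so effectively blocked
    }
  ]
}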
2 comments
Is there a way to pass PDF files to Vertex AI without uploading them to Google Cloud Storage? Their SDK supports it; I was wondering how to do it with Portkey.
https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/inference
4 comments
If I am sending a new seed every time, would it cause my request to miss the cache? I'm trying to figure out why cache misses are happening for the same request.
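To illustrate what I suspect is happening (assuming the simple cache keys on the full request body): two otherwise-identical payloads that differ only in seed would produce different cache keys and therefore miss. The model and messages below are just placeholders:

Plain Text
// request 1 – cached under a key derived from this exact body (assumption)
{ "model": "gpt-4o", "seed": 41, "messages": [{ "role": "user", "content": "Hello" }] }

// request 2 – same prompt, new seed, so a different key and a MISS
{ "model": "gpt-4o", "seed": 42, "messages": [{ "role": "user", "content": "Hello" }] }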
2 comments
Hey everyone. I'm evaluating Portkey's OSS gateway. I'm confused by the documentation, as I'm unable to find some features to get started, in particular virtual keys. Are those not included in the OSS gateway?
2 comments
Heyy everyone, I am new to Portkey and am a bit overwhelmed by this amazing project. I am currently hosting the Docker Compose container locally and am working on conditional routing. The documentation told me to create virtual keys for the targets; however, I could not find out where I can create those virtual keys locally, other than creating them on the official app.portkey.ai console. Does that mean that to use conditional routing, I must access the service through Portkey Cloud?

Also, I could not access the gateway console at http://localhost:8787/public/ as this route simply does not exist in index.ts. Therefore, I am super confused by the GitHub README documentation.
7 comments
Hi everyone!
A few days ago we moved from Gateway 1.5.3 to 1.9.8 and have been having issues with the Bedrock Sonnet model.
With Gateway 1.5.3 we were setting max_tokens=200_000; since the max_tokens parameter was required, we decided to set it to a big number.
Now with Gateway 1.9.8 we're getting the following error:
Plain Text
Error code: 400 - {'error': {'message': 'bedrock error: The maximum tokens you requested exceeds the model limit of 4096. Try again with a maximum tokens value that is lower than 4096.', 'type': None, 'param': None, 'code': None}, 'provider': 'bedrock'}

If we remove the max_tokens param or set it to 4096 or lower, the request is successful.
If we downgrade to Gateway 1.5.3, then even with max_tokens=200_000 the request passes successfully.
So we wonder: why is the request failing in Gateway 1.9.8? It seems there is logic somewhere that checks whether max_tokens_val > max_tokens_max_allowed_val.

Also, we deduced that the problem starts in Gateway 1.8.0, where changes involving max_tokens took place.
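In case it helps others, what works for us in the meantime is simply capping max_tokens at the model limit on the Bedrock target. A sketch only: the virtual key is a placeholder, the model ID is illustrative, and it assumes override_params can carry max_tokens alongside model:

Plain Text
{
  "targets": [
    {
      "virtual_key": "bedrock-virtual-xxxx",
      "override_params": {
        "model": "anthropic.claude-3-sonnet-20240229-v1:0",
        "max_tokens": 4096
      }
    }
  ]
}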
5 comments
Feature Suggestion:

  • Currently, in the prompt tab, users can only create a single folder, with all prompts placed inside it.
  • I suggest adding the ability to create nested folders (sub-folders). This would significantly improve the organization and structure of prompts; as the number of prompts grows over time, users often need another layer of breakdown (sub-folders).
2 comments
I noticed prompts can have labels (production, staging, etc.). How do I use them via the prompt render API?
1 comment
Hey team,

It seems the logs are no longer showing:

  1. Prompt
  2. User input
Now they just show the assistant response; is that intended?
Viewing the user input was extremely helpful. Now it takes us a lot more time to debug.
2 comments
Can I get a report or summary of all the requests whose user email contains a specific string, along with their total cost?
For instance:
Get all requests containing STRING.com in the _user field.
Sum all those requests' costs.
3 comments
Also related: I added the LangChain callback handler, but I do not see the tree-like representation of my trace; instead I just see a linear timeline of LLM calls. I was expecting something like the screenshot here: https://portkey.ai/docs/integrations/agents/langgraph#5-traces
4 comments
I am a bit confused by the docs here under the Auto-Instrumentation heading. The image is broken and the instrumentation field does not exist in the SDK. What am I missing? 😄
https://portkey.ai/docs/integrations/agents/langgraph#auto-instrumentation
8 comments
cartman_

Cancel

How do I cancel my account? I couldn't find it under billing or anywhere else. More importantly, I asked one of your support engineers to cancel the service, and you still charged me for months.
5 comments
Question: With AWS Bedrock + Claude, I see that sometimes the request "kinda fails", i.e. the response has a 200 status code but the response content is empty. Can I somehow use Portkey's native retry mechanism here? Since the status code is 200, I cannot use status-code-based retries.
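For context, this is roughly what my status-code based retry block looks like today; since it keys purely off status codes, I don't see how it could fire on a 200 with an empty body. A sketch only, with a placeholder virtual key:

Plain Text
{
  "virtual_key": "bedrock-virtual-xxxx",
  "retry": {
    "attempts": 3,
    "on_status_codes": [429, 500, 503] // never matches a 200, even if the body is empty
  }
}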
2 comments
Also when I added retry to the nested object as follows:

Plain Text
{
    "strategy": {
        "mode": "fallback",
        "on_status_codes": [
            401,
            500,
            503,
            520,
            524
        ]
    },
    "request_timeout": 360000,
    "targets": [
        {
            "virtual_key": "open-ai-virtual-xxxx",
            "override_params": {
                "model": "gpt-4o-2024-08-06"
            },
            "request_timeout": 12,
            "retry": {
                "attempts": 1,
                "on_status_codes": [
                    429,
                    408
                ]
            }
        },
        {
            "virtual_key": "anthropic-api-k-xxxx",
            "override_params": {
                "model": "claude-3-7-sonnet-20250219"
            },
            "request_timeout": 120000
        },
        {
            "virtual_key": "anthropic-api-k-xxxx",
            "override_params": {
                "model": "claude-3-5-sonnet-20241022"
            },
            "request_timeout": 120000
        }
    ],
    "cache": {
        "mode": "simple",
        "max_age": 6
    }
}
....

Also, when I play with the retry number, I definitely observe that setting it to 3 takes longer before falling back to Anthropic, but the Portkey UI only shows one log for gpt-4 and one for Claude; there is no information available on the retries.
2 comments
I want to understand the fallback config. I simulated a request timeout, where I get:

Plain Text
{
  "status": 408,
  "headers": {
    "content-type": "application/json",
    "x-portkey-cache-status": "MISS",
    "x-portkey-last-used-option-index": "config.targets[0]",
    "x-portkey-provider": "openai",
    "x-portkey-retry-attempt-count": "0",
    "x-portkey-trace-id": "9ac0fc87-562c-4b42-92e6-ad3cdb100880"
  },
  "body": {
    "error": {
      "message": "Request exceeded the timeout sent in the request: 12ms",
      "type": "timeout_error",
      "param": null,
      "code": null
    }
  },
  "responseTime": 1851,
  "lastUsedOptionJsonPath": "config.targets[0]"
}

My config does not include 408 in its on_status_codes list, yet the gateway falls back and uses the second model. What am I missing?

Also, if I have on_status_codes under strategy and then nested within a target, which gets preference?
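To make the question concrete, this is the minimal shape I'm asking about, since on_status_codes appears in two places and it's unclear which governs what. My current reading (which may well be wrong) is that the strategy-level list gates fallback between targets, while the target-level retry list gates retries within a single target. Placeholder virtual keys:

Plain Text
{
  "strategy": {
    "mode": "fallback",
    "on_status_codes": [500, 503]
  },
  "targets": [
    {
      "virtual_key": "openai-virtual-xxxx",
      "retry": {
        "attempts": 2,
        "on_status_codes": [429, 408]
      }
    },
    { "virtual_key": "anthropic-virtual-xxxx" }
  ]
}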
4 comments
@Team Portkey anything on the connection issue I flagged above? It's a blocker for us to adopt Portkey and deploy to production. If nothing, the alternative, unfortunately, would be to use some other gateway.
6 comments
Hi @Team Portkey, I am unable to add Tools for Claude Sonnet 3.5/3.7 with AWS Bedrock on the prompt playground. Can this please be fixed?
7 comments
Also, there are multiple bugs related to image_urls that keep recurring.
8 comments
I work for a giant multinational company, and when I recently created a Portkey account, I ended up being a member of an "Organization" created by some dude 9 time zones away from me who works for a completely independent group within the company. Now I can't create virtual keys, configure anything, or do any of the investigations I need to do in order to build our app's Portkey integration. I have asked the 9-time-zones-away dude to remove me, but in the meantime, is there any way I can get myself removed, or even just delete my account so I can sign up again and decline membership in his organization?
2 comments
I've noticed that the costs in Portkey logs are consistently off for me.

Just now, with Claude Sonnet Thinking, I used OpenRouter via Portkey; OpenRouter showed me a cost of 26 cents, while Portkey said only 7 cents. The token count was wrong too. Maybe because it's a new thinking model?

But I've seen similar issues with Perplexity; the costs on Portkey and my actual usage on the Perplexity API don't add up :/
7 comments
Hey guys, do you have any method to construct a URL using the trace_id returned by prompt completions?
We need an access point with the prompt and input snapshot for each iteration.
If not, at least a way to retrieve the information available through the trace_id.
1 comment
Here is the full traceback:
Plain Text
File "/app/.venv/lib/python3.12/site-packages/portkey_ai/_vendor/openai/_base_client.py", line 1860, in post
    return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
File "/app/.venv/lib/python3.12/site-packages/portkey_ai/_vendor/openai/_base_client.py", line 1554, in request
    return await self._request(
File "/app/.venv/lib/python3.12/site-packages/portkey_ai/_vendor/openai/_base_client.py", line 1601, in _request
    if remaining_retries > 0 and self._should_retry(err.response):
AttributeError: 'ConnectTimeout' object has no attribute 'response'

2025-02-27 13:09:14.737 | ERROR | ai_core.llms.open_ai_wrapper:generate_text_response_async:436 | [Portkey Gateway] Unexpected error while calling Portkey Gateway with config pc-opeai-5393d5: 'ConnectTimeout' object has no attribute 'response'
15 comments