Welcome to Portkey Forum

Saif
Joined November 4, 2024
I am making a chat completions call to Together AI:

Plain Text
import Portkey from "portkey-ai";

let portkey = new Portkey({
  apiKey: process.env.PORTKEYAI_API_KEY,
  virtualKey: process.env.TOGETHERAI_API_KEY,
});

try {
  var response = await portkey.chat.completions.create({
    messages,
    model: "togethercomputer/llama-2-70b-chat",
  });
  console.info("Success", response);
} catch (error) {
  console.error("We saw errors getting response from LLM", error);
}

and I see this error:
Plain Text
error: Either x-portkey-config or x-portkey-provider header is required
      at new APIError (/Users/saifas/PKey/cookbook/scripts/node_modules/portkey-ai/dist/src/error.js:7:8)
      at new BadRequestError (/Users/saifas/PKey/cookbook/scripts/node_modules/portkey-ai/dist/src/error.js:78:8)
      at generate (/Users/saifas/PKey/cookbook/scripts/node_modules/portkey-ai/dist/src/error.js:25:19)
      at /Users/saifas/PKey/cookbook/scripts/node_modules/portkey-ai/dist/src/baseClient.js:118:22
      at fulfilled (/Users/saifas/PKey/cookbook/scripts/node_modules/portkey-ai/dist/src/baseClient.js:5:47)

I don't usually see this error when making API calls to OpenAI. Should we add config or provider explicitly?
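One thing that may be worth ruling out, purely as an assumption and not a confirmed diagnosis: if process.env.TOGETHERAI_API_KEY is unset, virtualKey is undefined and the client has nothing to identify the provider with. The sketch below uses hypothetical helpers (requireEnv and buildClientOptions are not part of portkey-ai) to fail fast with a clearer message before the client is constructed.

```javascript
// Hypothetical helper (not part of portkey-ai): read an environment
// variable and throw a descriptive error if it is missing or empty.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Collect the options that would be passed to `new Portkey({...})`.
// The variable names are the ones used in the snippet above.
function buildClientOptions(env = process.env) {
  return {
    apiKey: requireEnv("PORTKEYAI_API_KEY", env),
    virtualKey: requireEnv("TOGETHERAI_API_KEY", env),
  };
}
```

With this in place, new Portkey(buildClientOptions()) would throw a readable error locally instead of sending a request with a missing key.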
6 comments
Just a general question. Consider the following request:
Plain Text
curl --request POST \
  --url https://api.portkey.ai/v1/chat/completions \
  --header 'Content-Type: application/json' \
  --header 'x-portkey-api-key: sxxx=' \
  --header 'x-portkey-config: {"retry":{"attempts":3},"cache":{"mode":"simple"},"strategy":{"mode":"loadbalance"},"targets":[{"virtual_key":"open-ai-kxey-","weight":0.7},{"virtual_key":"test-virtual-ke","weight":0.3}]}' \
  --header 'x-portkey-virtual-key: open-ai-key' \
  --data '{
    "model": "gpt-3.5-turbo",
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "Hello!" }
    ]
  }'

As far as I understand:
  • All requests will have automatic retries enabled, whether they are load-balanced to target 1 or target 2.
  • All requests will have the simple cache enabled, whether they are load-balanced to target 1 or target 2.
  • The x-portkey-virtual-key header will be ignored since x-portkey-config is set.
  • Since model is a mandatory property and gpt-3.5-turbo is the chosen model, should the load-balance targets be the same gpt-3.5-turbo model with different virtual keys? I suspect I am getting this part wrong.
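As a rough mental model of the load-balancing part (an illustrative sketch, not Portkey's actual implementation), weighted routing can be pictured as follows. pickTarget is a hypothetical helper; the virtual key names and weights are copied from the config in the request above.

```javascript
// Pick a target by weight. `rand` is a number in [0, 1); it is a parameter
// so the routing can be checked deterministically.
function pickTarget(targets, rand = Math.random()) {
  const total = targets.reduce((sum, t) => sum + t.weight, 0);
  let threshold = rand * total;
  for (const target of targets) {
    threshold -= target.weight;
    if (threshold < 0) return target;
  }
  return targets[targets.length - 1]; // guard against floating-point drift
}

const targets = [
  { virtual_key: "open-ai-kxey-", weight: 0.7 },
  { virtual_key: "test-virtual-ke", weight: 0.3 },
];

// Roughly 70% of calls should land on the first target.
console.log(pickTarget(targets).virtual_key);
```

In this picture the weights only decide which virtual key serves a request; whether both targets serve the same model is a separate question that the sketch does not settle.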
6 comments
Saif
·

the post

Yes
3 comments
Trying to do: Monitor logs for Llama 3 (through Ollama) using Portkey

Problem:
I set up ngrok with Llama 3 running locally. When I try to run a chat completions call using Portkey, I see the following response:
Plain Text
bun ollama-llama3.js
reaching llama3
{
  provider: "ollama",
  getHeaders: [Function: getHeaders],
}


Code
Plain Text
const portkey = new Portkey({
  apiKey: process.env.PORTKEY_API_KEY, 
  provider: 'ollama',
  customHost: 'https://6b73-165-1-160-105.ngrok-free.app ',
  traceID: 'ollama-llama3'
});

console.log('reaching llama3');

const chatCompletion = await portkey.chat.completions.create({
  messages: [{ role: 'user', content: 'Say this is a test' }],
  model: 'llama3'
});

console.log(chatCompletion);


Expecting:
Any kind of completion response saying that this is a test.
10 comments
Runs on the Assistants API don't hit the cache; however, creating threads and assistants does seem to successfully HIT the cache. What could I be missing to enable the cache for Assistant runs?
2 comments
I was playing with the Assistants API along with Portkey (using the OpenAI SDK). I was wondering whether the pricing that shows up covers both "Message Pricing" and "Retrieval Pricing" combined?
1 comment
What is the right way to apply the fallback mechanism to prompts that have been created?

Plain Text
const portkey = new Portkey({
  apiKey: PORTKEY_API_KEY,
  config: {
    strategy: {
      mode: 'fallback'
    },
    targets: [
      {
        promptID: 'pp-test-811461'
      },
      {
        promptID: 'pp-l-i-ef463c'
      }
    ]
  }
});

Also, is the following the correct way to invoke prompt completions?
Plain Text
const response = await portkey.prompts.completions.create({
  variables: {
    city: 'Hyderabad',
    name: 'Nobita'
  }
});

console.log(response.choices);
8 comments
How do I find an image generated by a model (DALL-E) in the logs, as shown in this documentation?
3 comments
A suggestion for the integration guides in the documentation: include a section listing the models and the equivalent labels to use in code (at least the top few).
1 comment
Creating an image through Portkey returns base64 output, even after setting response_format to url.
Plain Text
import json

import requests

# `config` (the gateway config dict) and PORTKEY_API_KEY are defined earlier.
config_json = json.dumps(config)

url = 'https://api.portkey.ai/v1/images/generations'

headers = {
    'Content-Type': 'application/json',
    'x-portkey-api-key': PORTKEY_API_KEY,
    'x-portkey-config': config_json,
}

data = {"prompt": "Harry potter using aeroplane for transport", "response_format": "url"}

response = requests.post(url, headers=headers, data=json.dumps(data))

generation_response = response.json()

print(generation_response)

# {'created': '1710664648966', 'data': [{'b64_json': 'iVBORw0KGgoAAAANSUhEUgAAAgAAAAIACA....

This could be because Stability AI does not return a URL in its response (after Accept: image/png), but it does return bytes. How can one get them through Portkey?
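Until that is answered, the base64 payload can at least be turned back into bytes on the client. A minimal Node.js sketch, assuming the response has the shape shown in the printed output above ({ data: [{ b64_json: ... }] }); firstImageBytes and looksLikePng are hypothetical helper names, not Portkey functions.

```javascript
// Hypothetical helper: decode the first image of a generations response
// into raw bytes. Assumes the shape { data: [{ b64_json: "..." }] }.
function firstImageBytes(generationResponse) {
  const encoded = generationResponse.data[0].b64_json;
  return Buffer.from(encoded, "base64");
}

// PNG files begin with the byte 0x89 followed by the ASCII letters "PNG".
function looksLikePng(bytes) {
  return (
    bytes.length >= 4 &&
    bytes[0] === 0x89 &&
    bytes.toString("latin1", 1, 4) === "PNG"
  );
}
```

The resulting Buffer could then be written to disk with fs.writeFileSync. The payload printed above starts with iVBORw0KGgo, which is how a base64-encoded PNG header begins.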
4 comments
The tokens and cost don't show up for image generations. Is this expected?
1 comment
The hyperlinks to "list models" and "model overview" lead to pages that are not found.
1 comment
Saif
·

View model names

Feature request: the ability to scroll the model parameters viewport, or to truncate common prefixes like togethercomputer/, so that the full model name is visible.
2 comments
Saif
·

anomalous tokens

Today I learned that there are anomalous tokens: tokens that models "don’t know what to do" with when they encounter them.

For example, a model couldn't even repeat a name.
1 comment
I picked up the following snippet from the A/B testing cookbook:
Plain Text
import Portkey from 'portkey-ai'

const portkey = new Portkey({
    apiKey: "PORTKEY_API_KEY",
    config: "pc-blog-o-0e83d2" // replace with your config ID
})

// We can also override the hyperparameters
const pcompletion = await portkey.prompts.completions.create({
    promptID: "pp-blog-outli-840877", // Use any prompt ID
    variables: {
       "title": "Should colleges permit the use of AI in assignments?",
       "num_sections": "5"
    },
});

console.log(pcompletion.choices)

The config splits the traffic between two prompts, 50% each.

I wonder why the completions call needs a promptID, given that the prompts are already part of the config?
3 comments
I made an API call to claude-2.1 through Portkey. It is clear that I missed a required parameter for claude-2.1, namely max_tokens. However, I was expecting the error to be shown in the terminal; instead, I only see the error in the console. Is this expected behaviour?
4 comments
This page lists an example. If someone uses it directly in config or UI editor when creating configs, they see errors.

The fix would be:
Plain Text
{
  "strategy": {
    "mode": "fallback"
  },
  "targets": [
    {
      "virtual_key": "openai-virtual-key"
    },
    {
      "virtual_key": "anthropic-virtual-key",
      "override_params": {
        "model": "claude-1"
      }
    }
  ]
}
1 comment
The Fallback feature allows you to specify a list of Language Model APIs (LLMs) in a prioritized order. If the primary LLM fails to respond or encounters an error, Portkey will automatically fallback to the next LLM in the list, ensuring your application's robustness and reliability.
At https://portkey.ai/docs/product/ai-gateway-streamline-llm-integrations/fallbacks

"or encounters an error"
Should I assume that any response code outside of 2XX will trigger fallbacks?
3 comments
I am trying to understand the use of override params in the following snippet:
Plain Text
{
  "strategy": {
    "mode": "fallback"
  },
  "targets": [
    {
      "virtual_key": "openai-virtual-key"
    },
    {
      "virtual_key": "anthropic-virtual-key",
      "override_params": {
        "model": "claude-1"
      }
    }
  ]
}

Here is how I understood this, please correct me:

  1. Ideally, the API call is first made as a default call. The client is instantiated with a virtual_key, and the model (assume gpt-3.5) is specified in the chat completion call.
  2. When #1 is tried and fails, the above strategy is applied, meaning the call is routed to target #1 above: openai-virtual-key + model (assume gpt-3.5).
  3. Finally, if #2 fails, the same call is sent through anthropic-virtual-key. Since the model in the chat completions call is still gpt-3.5 at that point, it is important to override the model to claude.
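One reading of the config, sketched under two assumptions that are not confirmed here: targets are attempted in the order listed, and override_params are merged over the parameters of the original call. planAttempts is a hypothetical helper, not a Portkey function.

```javascript
// Hypothetical helper: list the attempts a fallback config would produce
// for one request, in order. Later attempts are only used if earlier ones
// fail. override_params, when present, replace matching request parameters.
function planAttempts(config, requestParams) {
  return config.targets.map((target) => ({
    virtual_key: target.virtual_key,
    params: { ...requestParams, ...(target.override_params ?? {}) },
  }));
}

const config = {
  strategy: { mode: "fallback" },
  targets: [
    { virtual_key: "openai-virtual-key" },
    {
      virtual_key: "anthropic-virtual-key",
      override_params: { model: "claude-1" },
    },
  ],
};

const attempts = planAttempts(config, {
  model: "gpt-3.5-turbo",
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(attempts.map((a) => `${a.virtual_key} -> ${a.params.model}`));
```

Under those assumptions the first attempt keeps the model from the call, and only the second attempt swaps it for claude-1, which is why the override sits on the Anthropic target alone.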
3 comments
I am trying to run the AI gateway on Replit, but I see an error from Replit that the app is not open to HTTP traffic. I have not deployed on Replit yet; I am just running it in development mode. I figure this is because the AI gateway runs the server on localhost (127.0.0.1), while Replit needs it to run on 0.0.0.0. Where do I change this in the AI gateway codebase?
3 comments
TIL: AI researchers use fluid dynamics principles to detect deepfakes 🌊. Fundamentally, humans produce voice by passing air over the various structures of the vocal tract, including the vocal folds, tongue, and lips. This anatomy constrains the acoustic behavior, resulting in a relatively small range of correct sounds for each. 😮
2 comments
I want to force-refresh the cache for a chat completions request using the Portkey SDK. I don't want to hit the cache for specific requests.
Plain Text
import { Portkey, createHeaders } from "portkey-ai";

let reqHeaders = createHeaders({
  "cache-force-refresh": true,
});
// rest
try {
  var response = await portkey.chat.completions.create(
    {
      messages,
      model: "gpt-3.5-turbo",
      stream: true,
    },
    { headers: reqHeaders }
  );
  for await (const chunk of response) {
    process.stdout.write(chunk.choices[0]?.delta?.content || "");
  }
} catch (error) {
  console.error("Errors usually happen:", error);
}

These are my logs (request):
Plain Text
  "headers": {
    "user-agent": "Bun/1.0.1",
    "x-forwarded-proto": "https",
    "x-portkey-headers": "{\"x-portkey-cache-force-refresh\":true}",
  },

Nevertheless, I see cache HITs. Any suggestions would help.
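One observation from the log: the flag appears inside the JSON value of x-portkey-headers rather than as a header of its own. Whether that explains the cache HITs is an assumption, not something confirmed here. A small sketch for checking where a header ended up in a logged request (locateHeader is a hypothetical helper):

```javascript
// The request headers as they appear in the log above (trimmed to the
// relevant entry).
const loggedHeaders = {
  "x-portkey-headers": "{\"x-portkey-cache-force-refresh\":true}",
};

// Hypothetical helper: report where a header name shows up in a logged
// header object: as its own header, nested inside the JSON-encoded
// "x-portkey-headers" value, or not at all.
function locateHeader(headers, name) {
  if (name in headers) return "top-level";
  const nested = headers["x-portkey-headers"];
  if (nested !== undefined && name in JSON.parse(nested)) return "nested";
  return "absent";
}

console.log(locateHeader(loggedHeaders, "x-portkey-cache-force-refresh"));
```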
7 comments
It would help if the Config Object page had a hyperlink or section with example code snippets showing how headers can be added to enable the AI gateway features, and how referencing a config ID makes this simpler.
2 comments
Docs request: A new tab with Curl examples in the Universal API Signature page
1 comment
When using function calling within chat completions, we can use tools to provide a function specification and get structured function arguments back. I also read that we can set tool_choice in subsequent calls to the LLM.

Does this mean the LLM providers remember the tools you've defined forever?
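For context, with OpenAI-style chat completions each request is self-contained, so the tool definitions are normally sent again with every call; other providers may behave differently. A minimal sketch of two consecutive request bodies, where buildRequest and the get_weather tool are hypothetical and only the body shape (tools, tool_choice) follows the OpenAI chat completions format:

```javascript
// Build a chat completions request body. The tool definitions travel with
// every request; nothing in this sketch relies on the provider storing them.
function buildRequest(messages, tools, toolChoice = "auto") {
  return { model: "gpt-3.5-turbo", messages, tools, tool_choice: toolChoice };
}

// Hypothetical tool definition, for illustration only.
const tools = [
  {
    type: "function",
    function: {
      name: "get_weather",
      description: "Look up the current weather for a city",
      parameters: {
        type: "object",
        properties: { city: { type: "string" } },
        required: ["city"],
      },
    },
  },
];

const first = buildRequest(
  [{ role: "user", content: "Weather in Hyderabad?" }],
  tools
);

// The follow-up call repeats the same tools and pins tool_choice to one of them.
const followUp = buildRequest(
  [...first.messages, { role: "user", content: "And tomorrow?" }],
  tools,
  { type: "function", function: { name: "get_weather" } }
);
```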
2 comments