Welcome to Portkey Forum

Updated last month

Gemini Flash 1.5, 2.0, OpenAI 4o mini, and Anthropic Haiku 3.5.

At a glance

The post mentions several AI models: Gemini Flash 1.5 and 2.0, OpenAI 4o mini, and Anthropic Haiku 3.5. The comments discuss benchmarking the performance of these models, specifically in terms of response time. Community members are interested in comparing the performance of direct API calls versus using a platform like Portkey. One community member shares a screenshot of response time data and suggests creating a video to demonstrate the issue further. However, other community members indicate they are unable to replicate the performance issue on their end and request a video to help investigate the problem.

Useful resources
Gemini Flash 1.5, 2.0, OpenAI 4o mini, and Anthropic Haiku 3.5.

I will be more than happy to provide any other information you need to facilitate this.
6 comments
Thanks, and to confirm - are you looking at tok/s on the Portkey prompt playground vs. OpenRouter, or also benchmarking API requests directly?
@sega @visarg could we check this - or share more on whether a significant difference in throughput between raw API calls and going through Portkey is something that should happen?
I'm mainly interested in the response time.

E.g.:

I have a simple prompt and a user input. I click execute and measure the time it takes for the last token to be printed. That is equivalent to the number of seconds it takes to complete the answer, which you can see here: https://app.screencast.com/3P33uInuh0VtY

This timing should be the same as when executing via OpenRouter or calling the model directly from Postman, plus the added overhead of 40 ms that your documentation suggests.
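The measurement described above (wall-clock time until the last streamed token, compared between a direct call and a call through a gateway) could be sketched roughly as follows. This is only an illustration: `time_stream` and `fake_stream` are hypothetical names, and `fake_stream` simulates a streaming response with `time.sleep` instead of calling any real API. The 40 ms figure is the overhead quoted in this thread, used here as a simulated delay.

```python
import time
from typing import Callable, Iterable, Iterator


def time_stream(tokens: Iterable[str],
                clock: Callable[[], float] = time.perf_counter) -> dict:
    """Consume a token stream and report first-token and last-token timings."""
    start = clock()
    first = None
    count = 0
    for _ in tokens:
        count += 1
        if first is None:
            # Elapsed time until the first token arrived.
            first = clock() - start
    # Elapsed time until the stream was exhausted (last token printed).
    total = clock() - start
    return {
        "tokens": count,
        "time_to_first_token_s": first,
        "time_to_last_token_s": total,
        "tokens_per_s": count / total if total > 0 else 0.0,
    }


def fake_stream(n_tokens: int, delay_s: float,
                extra_latency_s: float = 0.0) -> Iterator[str]:
    """Hypothetical stand-in for a streaming model response (no network)."""
    time.sleep(extra_latency_s)  # simulated one-off gateway overhead
    for i in range(n_tokens):
        time.sleep(delay_s)  # simulated per-token generation time
        yield f"tok{i}"


if __name__ == "__main__":
    direct = time_stream(fake_stream(20, 0.005))
    gateway = time_stream(fake_stream(20, 0.005, extra_latency_s=0.04))
    overhead_ms = (gateway["time_to_last_token_s"]
                   - direct["time_to_last_token_s"]) * 1000
    print(f"direct:   {direct['time_to_last_token_s']:.3f} s")
    print(f"gateway:  {gateway['time_to_last_token_s']:.3f} s")
    print(f"overhead: {overhead_ms:.0f} ms")
```

To benchmark real endpoints, the simulated generator would be replaced by the token iterator of whichever client is in use, with the same prompt sent to each route several times, since a single run is easily skewed by network jitter and provider load.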

I will create a video showing a simple test.
that will help, yes @shockdav!
we're unable to replicate this on our end for some reason - also checked with a bunch of customers who are getting normal latency times
a video would def help to see if there's some issue