OpenAI prompt caching issue with o1-mini models using Portkey

I've noticed something strange while testing OpenAI's new prompt caching with the o1-mini models through Portkey.

Every few requests, one request comes back with zero cached tokens, even though it should hit the cache because the message history is identical.

The request immediately before it hits the cache, and the one right after does too.

Is this an issue with OpenAI or with Portkey? Could Portkey be sending the request from a different IP, or is something else causing the cache miss?

OpenAI doesn't provide detailed enough logs for me to check what happened there :/
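
For reference, a minimal sketch of how to read the cache counters per request when routing through Portkey. It assumes the OpenAI Python SDK (a recent v1.x release, which exposes `usage.prompt_tokens_details`), Portkey's documented gateway URL and header names, and a placeholder `SHARED_HISTORY` prefix; adapt the details to your own setup.

```python
# A minimal sketch, assuming the OpenAI Python SDK (recent v1.x, which
# exposes usage.prompt_tokens_details) pointed at Portkey's gateway.
# The base URL, header names, and SHARED_HISTORY below are placeholders
# taken from Portkey's docs / this thread; adjust to your own setup.
from openai import OpenAI

client = OpenAI(
    api_key="OPENAI_API_KEY",              # your OpenAI key
    base_url="https://api.portkey.ai/v1",  # route requests via Portkey
    default_headers={
        "x-portkey-api-key": "PORTKEY_API_KEY",
        "x-portkey-provider": "openai",
    },
)

# OpenAI's prompt caching only kicks in for prompts >= 1024 tokens and
# keys on an identical prefix, so the leading messages must be reused
# byte-for-byte. (o1-mini takes user messages, not system messages.)
SHARED_HISTORY = [
    {"role": "user", "content": "<long, identical context prefix>"},
]

resp = client.chat.completions.create(
    model="o1-mini",
    messages=SHARED_HISTORY + [{"role": "user", "content": "next question"}],
)

# cached_tokens > 0 means the prefix was served from OpenAI's cache;
# 0 on a request with the same prefix suggests a miss or an eviction.
details = resp.usage.prompt_tokens_details
print("prompt:", resp.usage.prompt_tokens,
      "cached:", details.cached_tokens if details else 0)
```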
3 comments
@Anshul could it be that cache eviction is happening? Portkey does send all your requests from a different IP, but that shouldn't be an issue.
Hey @Vrushank | Portkey,

Is there any way to find out if cache eviction is happening? I mean, is there any visible indicator?

Generally the requests are made in quick succession, since they're part of an automation I run, so they're definitely within the 5-minute window.

I'll run a few more tests to see if it's still happening, but it's a bit time-consuming to go and check the JSON for each request.
There isn't, unfortunately. Eviction timing is also currently unpredictable; you'll get evicted faster during peak load.
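
Lacking a built-in indicator, one rough workaround is to log `cached_tokens` across a short run of identical-prefix requests instead of opening each JSON log by hand. A sketch, reusing the (assumed) `client` and `SHARED_HISTORY` from the snippet above:

```python
# A rough eviction check, reusing the client and SHARED_HISTORY defined
# in the earlier sketch: fire a few identical-prefix requests in a row
# and print cached_tokens, rather than inspecting each JSON log by hand.
import time

for i in range(5):
    resp = client.chat.completions.create(
        model="o1-mini",
        messages=SHARED_HISTORY + [{"role": "user", "content": f"run {i}"}],
    )
    details = resp.usage.prompt_tokens_details
    cached = details.cached_tokens if details else 0
    # After the first call warms the cache, cached should stay > 0 for
    # requests landing in quick succession; a sudden 0 mid-run points
    # at an eviction rather than a prefix mismatch.
    print(f"run {i}: prompt={resp.usage.prompt_tokens} cached={cached}")
    time.sleep(2)  # well inside the ~5-10 min inactivity window
```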