Welcome to Portkey Forum


Constraints of API Returns for Streaming

Hey folks, it looks like Deepseek via Together doesn't report cost (this is OK), but it also doesn't return the tokens used in the API response while streaming (just like pplx).

This is pretty annoying, especially since I had to find it out myself. Is there a (better) way to know the constraints of what an API returns while streaming? Also, since it seems I have to tokenize this myself (just like pplx), any tips on a tokenizer for Deepseek?

You guys handle this somehow on your end?

```json
{
  "model": "",
  "usage": {
    "completion_tokens": 1875
  }
}
```
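[Editor's note] For anyone hitting the same issue: the accurate approach is to load the model's own tokenizer (DeepSeek models on Hugging Face ship one, loadable via `transformers.AutoTokenizer`). When pulling in a tokenizer isn't practical, a crude character-based estimate is a common stopgap. The sketch below is that heuristic only, not the real DeepSeek tokenizer, and the ~4-characters-per-token ratio is an assumption that holds loosely for English text:

```python
# Rough token estimate for streamed completions when the vendor
# omits "usage" in the streaming response. This is a crude
# heuristic (~4 characters per token, an assumption that is only
# loosely valid for English), NOT the real DeepSeek tokenizer.
# For accurate counts, load the model's tokenizer from Hugging
# Face via transformers.AutoTokenizer instead.

def estimate_completion_tokens(chunks):
    """Concatenate streamed text chunks and estimate token count."""
    text = "".join(chunks)
    if not text:
        return 0
    # Roughly one token per 4 characters; any non-empty text is
    # at least one token.
    return max(1, len(text) // 4)

# Example: accumulate chunk deltas as they arrive from the stream,
# then estimate usage once the stream closes.
streamed = ["Deep", "seek ", "does not ", "return usage ", "here."]
print(estimate_completion_tokens(streamed))
```

This will drift from the true count on code, non-English text, or unusual tokens, so treat it as an order-of-magnitude figure for cost dashboards rather than billing-grade accounting.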
Hey @Gijs - yeah, in streaming use cases some vendors still aren't calculating usage, and we don't append it ourselves because that increases latency. Would you be OK if we did something about it but increased latency by 100ms or so?
Hey @Gijs, through which platform are you using Deepseek?
fireworks?
I'm using together.ai
I think 100ms would generally be OK with me, to be honest.
Deepseek models added to together.ai
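[Editor's note] For readers landing here later: the OpenAI-compatible streaming spec added a `stream_options` flag that asks the server to send one final chunk carrying the `usage` object. Whether Together's Deepseek endpoint honors this flag is an assumption you should verify against their docs; the request shape is:

```json
{
  "model": "deepseek-ai/...",
  "stream": true,
  "stream_options": {
    "include_usage": true
  },
  "messages": [
    {"role": "user", "content": "Hello"}
  ]
}
```

When supported, every streamed chunk except the last has `"usage": null`, and the final chunk contains the prompt and completion token counts, which removes the need for client-side tokenizing.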