charge you per second of usage instead of tokens

At a glance

The post discusses the pricing model of providers like Replicate.com, which charge per second of usage instead of tokens used. The community members are wondering under what circumstances it makes sense to use this model instead of the ordinary token-based APIs, and whether it can provide more tokens per second.

In the comments, one community member shares their experience with a chain of thought prompt that consumed 1400 tokens and cost 5.5 cents per response, taking about 15 seconds. They suggest that a second-based pricing model could have reduced the cost to 0.15 cents at most.

Another community member expresses caution, noting that if Replicate.com could provide higher throughput (TPS) than providers like Anyscale/Together, they would likely highlight this in their marketing. The absence of such a claim suggests they may not have an edge in this area. However, the community member also acknowledges a lack of anecdotal data on Replicate.com's usage in production environments.

The final comment simply states that the discussion is interesting.

eekevu.

There are providers like replicate.com that charge you per second of usage instead of tokens used. I was wondering, under what circumstances does it make sense to use this instead of the ordinary APIs? And can I expect more tokens per second this way than with token-based APIs?

3 comments

Saif

I was once trying a chain-of-thought prompt that consumed about 1400 tokens per call. It cost me about 5.5 cents every single time and took about 15 seconds to get a response.

If it were billed per second, my cost would be just 0.15 cents at most. Of course, I might be missing a lot of other things, like which model is used, etc.
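To make the arithmetic in this comment easier to check, here is a minimal Python sketch. The rates are assumptions back-derived from the figures quoted above (1400 tokens, 5.5 cents, 15 seconds, 0.15 cents); they are not published prices of any provider, and the function names are hypothetical.

```python
# Illustrative comparison of per-token and per-second billing.
# All rates are back-derived from the figures quoted in the comment;
# they are assumptions, not actual prices of any provider.


def per_token_cost(tokens: int, usd_per_1k_tokens: float) -> float:
    """Cost in USD when billed by token count."""
    return tokens / 1000 * usd_per_1k_tokens


def per_second_cost(billed_seconds: float, usd_per_second: float) -> float:
    """Cost in USD when billed by seconds of usage."""
    return billed_seconds * usd_per_second


def break_even_tokens_per_second(usd_per_second: float,
                                 usd_per_1k_tokens: float) -> float:
    """Throughput above which per-second billing becomes the cheaper option."""
    return usd_per_second / (usd_per_1k_tokens / 1000)


TOKENS = 1400
SECONDS = 15
USD_PER_1K_TOKENS = 0.055 / TOKENS * 1000  # implied by 5.5 cents per response
USD_PER_SECOND = 0.0015 / SECONDS          # implied by 0.15 cents per response

if __name__ == "__main__":
    print("per-token cost (USD): ", per_token_cost(TOKENS, USD_PER_1K_TOKENS))
    print("per-second cost (USD):", per_second_cost(SECONDS, USD_PER_SECOND))
    print("observed tokens/s:    ", TOKENS / SECONDS)
    print("break-even tokens/s:  ",
          break_even_tokens_per_second(USD_PER_SECOND, USD_PER_1K_TOKENS))
```

Under these assumed rates, per-second billing wins whenever the observed throughput exceeds the break-even value. Note that `billed_seconds` should be whatever the provider actually bills, which may differ from the response time you observe, and that the two prices may not refer to the same model.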

Vrushank | Portkey

I'd be wary: if they could give more throughput (TPS) than providers like Anyscale/Together, that's something they would lead with in their marketing. The absence of such a claim suggests that they don't have an edge there.

Not sure about the pricing comparisons, though. The thing is, I haven't really seen many people go to prod with Replicate, so there is little anecdotal data to go on, I think.

But paging @rohit who may know more on this

eekevu.

That's interesting, thank you for sharing!
