Hey, sorry for the late reply.
Here's the cost calculation for one of the requests using Gemini Flash 001, priced against two sources (all prices in USD):
Source 1
https://cloud.google.com/vertex-ai/generative-ai/pricing
Image Input - $0.00002 / image
Text Input - $0.00001875 / 1k characters
Text Output - $0.000075 / 1k characters
Source 2
https://ai.google.dev/pricing#1_5flash
Input Pricing - $0.075 / 1 million tokens
Output Pricing - $0.30 / 1 million tokens
Request
2332 characters (799 tokens), 1 image
Source 1
2.332 * 0.00001875 + 1 * 0.00002 = 0.000063725
Source 2
799 * 0.075 / 1000000 = 0.000059925
Response
2157 characters (571 tokens)
Source 1
2.157 * 0.000075 = 0.000161775
Source 2
571 * 0.30 / 1000000 = 0.0001713
Source 1 total
= 0.000063725 + 0.000161775 = $0.0002255 ≈ 0.02 cents
Source 2 total
= 0.000059925 + 0.0001713 = $0.000231225 ≈ 0.02 cents
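For anyone who wants to re-run the numbers, here is the same arithmetic as a small Python sketch. The prices and counts are copied from the figures above; the function and variable names are my own and are not part of any Portkey or Google API.

```python
# Prices in USD, copied from the two pricing pages quoted above.
IMAGE_INPUT_PER_IMAGE = 0.00002
TEXT_INPUT_PER_1K_CHARS = 0.00001875
TEXT_OUTPUT_PER_1K_CHARS = 0.000075
INPUT_PER_1M_TOKENS = 0.075
OUTPUT_PER_1M_TOKENS = 0.30


def character_based_cost(input_chars, output_chars, images):
    """Source 1: billed per 1k characters plus a flat price per image."""
    request = input_chars / 1000 * TEXT_INPUT_PER_1K_CHARS + images * IMAGE_INPUT_PER_IMAGE
    response = output_chars / 1000 * TEXT_OUTPUT_PER_1K_CHARS
    return request + response


def token_based_cost(input_tokens, output_tokens):
    """Source 2: billed per 1 million tokens."""
    request = input_tokens * INPUT_PER_1M_TOKENS / 1_000_000
    response = output_tokens * OUTPUT_PER_1M_TOKENS / 1_000_000
    return request + response


# The single request/response pair from this post.
source_1_total = character_based_cost(input_chars=2332, output_chars=2157, images=1)
source_2_total = token_based_cost(input_tokens=799, output_tokens=571)

# Cost shown by Portkey for the same request: 0.13 cents, converted to USD.
portkey_reported = 0.13 / 100

for label, total in (("Source 1", source_1_total), ("Source 2", source_2_total)):
    print(f"{label}: ${total:.9f} = {total * 100:.4f} cents, "
          f"Portkey is {portkey_reported / total:.1f}x higher")
```

With either source, the Portkey figure comes out roughly 5 to 6 times higher than the calculated cost for this request (keeping in mind the 0.13 cents is itself a rounded value).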
My calculations give about 0.02 cents for this request, but Portkey reports 0.13 cents.
What am I missing?
Our usage of flows that call Gemini is negligible for now: not even a million tokens in November.
Still, the total cost difference for November is noticeable: $0.90 in Portkey versus $0.25 in the GCP reports. The real gap is even larger, because the GCP figure also includes a few direct calls to Vertex AI that do not go through Portkey.