I'm mainly interested in the response time.
E.g.:
I have a simple prompt and a user input, I click execute, and I measure the time it takes for the last token to be printed. That is equivalent to the number of seconds it takes to complete the answer, which you can see here:
https://app.screencast.com/3P33uInuh0VtY
This timing should be the same as when executing via OpenRouter or a simple Postman call to the model directly, plus the ~40 ms overhead that your documentation suggests.
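To be concrete, this is roughly how I time it: a minimal Python sketch that sends a non-streaming chat completion and measures the wall-clock time until the full answer comes back. The endpoint URL, model name, and API key below are placeholder assumptions, not my real values.

```python
# Minimal timing sketch (placeholder endpoint/model/key, not my real setup):
# measure wall-clock time from request start until the complete answer is returned.
import time
import requests

API_KEY = "sk-or-..."  # placeholder API key
URL = "https://openrouter.ai/api/v1/chat/completions"  # assumed OpenAI-compatible endpoint

payload = {
    "model": "openai/gpt-4o-mini",  # placeholder model name
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},  # simple prompt
        {"role": "user", "content": "Summarize this text in one sentence: ..."},  # user input
    ],
}

start = time.perf_counter()
resp = requests.post(URL, headers={"Authorization": f"Bearer {API_KEY}"}, json=payload)
elapsed = time.perf_counter() - start  # seconds until the full answer arrives

print(f"Status: {resp.status_code}, total response time: {elapsed:.2f}s")
```

The same request sent from Postman should show essentially the same total time, which is the number I'm comparing against.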
I will create a video showing a simple test.