I am using the Llama-3.3-Vision-Llama-Turbo model for some prompts, but the cost shows up as 0 cents in Portkey. @Vrushank | Portkey this seems like a small pricing bug.
Hi, is there a fallback strategy where I can fall back if I hit the max length of the first LLM? Basically, I want to call Llama by default and switch to Claude if the Llama call hits max tokens (rough sketch of the behavior I mean below).
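To illustrate the ask, here is a minimal application-level sketch of that behavior, assuming the portkey-ai Python SDK's OpenAI-compatible chat interface; the virtual keys and model names are placeholders:
```python
# Rough sketch only: fall back to Claude when the Llama call stops because it
# hit max_tokens. Virtual keys and model names below are placeholders.
from portkey_ai import Portkey

PORTKEY_API_KEY = "PORTKEY_API_KEY"  # placeholder


def complete_with_length_fallback(messages, max_tokens=1024):
    # Try Llama first via its (placeholder) virtual key.
    llama = Portkey(api_key=PORTKEY_API_KEY, virtual_key="llama-vk")
    response = llama.chat.completions.create(
        model="meta-llama/Llama-3.3-70B-Instruct-Turbo",  # illustrative model name
        messages=messages,
        max_tokens=max_tokens,
    )

    # If the response was cut off because it hit the token limit,
    # retry the same messages on Claude instead.
    if response.choices[0].finish_reason == "length":
        claude = Portkey(api_key=PORTKEY_API_KEY, virtual_key="claude-vk")
        response = claude.chat.completions.create(
            model="claude-3-5-sonnet-20241022",  # illustrative model name
            messages=messages,
            max_tokens=max_tokens,
        )
    return response


reply = complete_with_length_fallback(
    [{"role": "user", "content": "Summarize this long document ..."}]
)
print(reply.choices[0].message.content)
```
It would be great if the gateway config could do this for me instead of handling it in application code.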
A small suggestion, mostly around improving the experience of the Portkey product:
Could we replace the pagination with infinite scroll? It would make scrolling much simpler.
Also, an easier way to navigate between logs: could we pre-load the "previous"/"next" log when the "current" log is open? That would make the experience very smooth.
I was going through our logs and found a minor ordering issue. I think we sent two requests back to back, but the order got reversed in Portkey Logs. I can see that Portkey has the same timestamp for both requests, which may have caused it.
I found a very weird bug while trying out the Google Gemini models.
In my current prompts (GPT-specific), I have a system prompt followed by user prompts. But when I changed the provider, the system prompt just disappeared (maybe because Gemini doesn't support system prompts?). I think the system prompt should not be deleted.
Hi, I have a few suggestions for the Portkey product that would make my team really productive:
P0
Folders for Prompts (currently, it becomes very difficult to organize the prompts)
Support for Hindi Unicode when displaying Prompts (already reported to Rohit)
P1
Personal Prompts and Team Prompts (So that navigating prompts does not become cluttered)
Keeping Production & Sandbox separate. (I am always worried that some change in dev will break production. We use the same prompt ID for Sandbox & Production. I would want to keep the "text" of the prompt the same, but variables like "Model Provider" or "Virtual Key" different.)
The same for keeping Sandbox keys and Production keys separate.
Testing the same prompt on multiple models and comparing the results.
P2
Evaluation/Test Framework. Maybe the ability to rate prompt responses through another LLM prompt (rough sketch of what I mean below).
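To make that last point concrete, here is a minimal sketch of the kind of LLM-as-judge rating I have in mind, assuming the portkey-ai Python SDK; the judge model name and virtual key are placeholders:
```python
# Rough sketch only: rate one prompt's response by sending it to a second
# "judge" prompt. Virtual key and judge model name are placeholders.
from portkey_ai import Portkey

client = Portkey(api_key="PORTKEY_API_KEY", virtual_key="judge-vk")  # placeholders


def rate_response(original_prompt: str, candidate_response: str) -> str:
    # Ask a separate "judge" model to score the candidate response.
    judge_messages = [
        {
            "role": "system",
            "content": "You are a strict evaluator. Reply with a score from 1-10 "
                       "and one sentence of justification.",
        },
        {
            "role": "user",
            "content": f"Prompt:\n{original_prompt}\n\n"
                       f"Response:\n{candidate_response}\n\nRate the response.",
        },
    ]
    result = client.chat.completions.create(
        model="gpt-4o",  # illustrative judge model
        messages=judge_messages,
        max_tokens=200,
    )
    return result.choices[0].message.content
```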
Hi @Vrushank | Portkey, there is an issue with setting the output length for models. It behaves very unreliably: sometimes it says the max output length is 2K, sometimes 8K, for the same model. Could you have a look?
E.g., this picture is for the Together AI Llama-70 model. On the Together website the max limit is 8K, but here it is 32K.
Also, when I change the model to, say, the Llama-7b model, it still stays at 32K.