Unified AI Inference Gateway: Addressing Reliability Challenges

Is there a plan to support Hugging Face Inference Endpoints? Ideally I'd like to have all AI configs in a single gateway.
Problem: Dedicated Inference Endpoints, especially ones on NVIDIA GPU instances that scale to zero, often go down in prod and warrant a fallback option.
dedicated inference endpoints are already supported in the gateway
you have to pass the x-portkey-huggingface-base-url header for dedicated hosts
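For reference, here is a minimal sketch of passing that header against the gateway's REST API directly; the endpoint URL, keys, and model name below are placeholders rather than values from this thread.
Python
import requests

# Hypothetical values: swap in your own Portkey key, HF token, and endpoint URL
resp = requests.post(
    "https://api.portkey.ai/v1/chat/completions",
    headers={
        "x-portkey-api-key": "<PORTKEY_API_KEY>",
        "x-portkey-provider": "huggingface",
        "Authorization": "Bearer <HF_TOKEN>",
        # Dedicated host goes in this header, as noted above
        "x-portkey-huggingface-base-url": "https://<your-endpoint>.endpoints.huggingface.cloud",
        "Content-Type": "application/json",
    },
    json={
        "model": "tgi",  # placeholder; dedicated endpoints typically ignore the model name
        "messages": [{"role": "user", "content": "Hello"}],
    },
)
print(resp.json())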
I have a cross-encoder (MS MARCO) model deployed which doesn't use the OpenAI schema, so I'm not sure it would work in that case.
Yes, you can send arbitrary JSON payloads to endpoints that are not in the unified endpoints list. Let me share an example:
Python
from portkey_ai import Portkey

# Point the gateway at the dedicated Hugging Face endpoint
portkey = Portkey(
    api_key="",  # your Portkey API key
    provider="huggingface",
    Authorization="Bearer hf_",  # your Hugging Face token
    huggingface_base_url="https://rpld3pbvx.us-east-1.aws.endpoints.huggingface.cloud"
    # content_type="multipart/form-data"
)

# post() forwards an arbitrary JSON payload to the given route as-is
response = portkey.post(
    url="endpoints/PortkeyGuardrails-gibberish/invocations",
    inputs="asdasdasdasdasdsdas"
)

print(response)
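On the fallback need from the original question: gateway configs support a fallback routing strategy. Below is a minimal sketch, assuming Portkey's config schema (strategy, targets, override_params) with placeholder keys and URLs; note it applies to OpenAI-compatible routes, not to the raw cross-encoder payload above.
Python
from portkey_ai import Portkey

# Hypothetical fallback config: try the dedicated HF endpoint first,
# then fall back to OpenAI if it errors (e.g. the endpoint has scaled to zero).
# All keys, URLs, and model names below are placeholders.
config = {
    "strategy": {"mode": "fallback"},
    "targets": [
        {
            "provider": "huggingface",
            "api_key": "hf_...",
            "custom_host": "https://<your-endpoint>.endpoints.huggingface.cloud",
        },
        {
            "provider": "openai",
            "api_key": "sk-...",
            "override_params": {"model": "gpt-4o-mini"},
        },
    ],
}

portkey = Portkey(api_key="<PORTKEY_API_KEY>", config=config)

response = portkey.chat.completions.create(
    model="tgi",  # placeholder; the OpenAI target overrides this via override_params
    messages=[{"role": "user", "content": "ping"}],
)
print(response)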