Workers AI
Use AI Gateway for analytics, caching, and security on requests to Workers AI.
To interact with a REST API, update the URL used for your request:
- Previous: https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/run/{model_id}
- New: https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/workers-ai/{model_id}
For these parameters:
- {account_id} is your Cloudflare account ID.
- {gateway_id} refers to the name of your existing AI Gateway.
- {model_id} refers to the model ID of the Workers AI model.
First, generate an API token with Workers AI Read access and use it in your request.
```bash
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/workers-ai/@cf/meta/llama-3.1-8b-instruct \
  --header 'Authorization: Bearer {cf_api_token}' \
  --header 'Content-Type: application/json' \
  --data '{"prompt": "What is Cloudflare?"}'
```

```bash
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/workers-ai/@cf/huggingface/distilbert-sst-2-int8 \
  --header 'Authorization: Bearer {cf_api_token}' \
  --header 'Content-Type: application/json' \
  --data '{ "text": "Cloudflare docs are amazing!" }'
```

Workers AI supports OpenAI compatible endpoints for text generation (/v1/chat/completions) and text embedding models (/v1/embeddings). This allows you to use the same code as you would for your OpenAI commands, but swap in Workers AI easily.
```bash
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/workers-ai/v1/chat/completions \
  --header 'Authorization: Bearer {cf_api_token}' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "@cf/meta/llama-3.1-8b-instruct",
    "messages": [
      {
        "role": "user",
        "content": "What is Cloudflare?"
      }
    ]
  }'
```
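Because these endpoints mirror the OpenAI API shape, you can also reuse an existing OpenAI SDK client by pointing its base URL at the gateway. The sketch below assumes the openai npm package and a Cloudflare API token with Workers AI Read access supplied via an environment variable; the embedding model name is only an illustration.

```typescript
// Sketch: reuse the OpenAI SDK against the gateway's OpenAI-compatible endpoints.
// {account_id} and {gateway_id} are placeholders you supply yourself.
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.CF_API_TOKEN, // Cloudflare API token with Workers AI Read access
  baseURL:
    "https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/workers-ai/v1",
});

// Text generation via /v1/chat/completions
const chat = await client.chat.completions.create({
  model: "@cf/meta/llama-3.1-8b-instruct",
  messages: [{ role: "user", content: "What is Cloudflare?" }],
});
console.log(chat.choices[0].message.content);

// Text embeddings via /v1/embeddings (model name shown as an example)
const embedding = await client.embeddings.create({
  model: "@cf/baai/bge-base-en-v1.5",
  input: "Cloudflare docs are amazing!",
});
console.log(embedding.data[0].embedding.length);
```

Apart from the base URL, the application code stays the same, while the gateway keeps handling analytics, caching, and security for each request.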
To include an AI Gateway within your Worker, add the gateway as an object in your Workers AI request.

```typescript
export interface Env {
  AI: Ai;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const response = await env.AI.run(
      "@cf/meta/llama-3.1-8b-instruct",
      {
        prompt: "Why should you use Cloudflare for your AI inference?",
      },
      {
        gateway: {
          id: "{gateway_id}",
          skipCache: false,
          cacheTtl: 3360,
        },
      },
    );
    return new Response(JSON.stringify(response));
  },
} satisfies ExportedHandler<Env>;
```

Workers AI supports the following parameters for AI gateways:
- id (string) - Name of your existing AI Gateway. Must be in the same account as your Worker.
- skipCache (boolean, default: false) - Controls whether the request should skip the cache.
- cacheTtl (number) - Controls the Cache TTL.
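As a usage sketch of these parameters, the fragment below reuses the Env binding from the Worker above and sets skipCache to true so a particular request always bypasses the gateway cache; the model and prompt are illustrative.

```typescript
// Sketch: per-request cache control, reusing the Env/AI binding from the Worker above.
async function freshAnswer(env: Env): Promise<Response> {
  const response = await env.AI.run(
    "@cf/meta/llama-3.1-8b-instruct",
    { prompt: "Summarize today's status update." },
    {
      gateway: {
        id: "{gateway_id}", // must be in the same account as this Worker
        skipCache: true, // bypass the gateway cache for this request
      },
    },
  );
  return new Response(JSON.stringify(response));
}
```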