This guide shows you how to get the token count and the number of billable characters for a prompt.
This page covers the following topics:
- Choose a method to count tokens: Compare the
countTokens
API and the integrated SDK tokenizer to select the best option for your use case. - Supported models: Find out which multimodal models support token counting.
- Get the token count for a prompt: Follow step-by-step instructions to get the token count using the Google Cloud console, the Vertex AI SDK, or the REST API.
- Pricing and quota: Understand the pricing and quota limits for the
CountTokens
API.
Choose a method to count tokens
You can get the token count for a prompt using either the countTokens
API or the Vertex AI SDK for Python tokenizer. For most scenarios, using the SDK tokenizer is recommended. The following table compares the two methods.
Method | Description | Pros | Cons |
---|---|---|---|
Vertex AI SDK for Python tokenizer | An integrated tokenizer within the Python SDK that performs local token counting. For details, see List and count tokens. | Fast (no network latency) and easy to integrate into Python workflows. This is the recommended method. | Specific to the Python SDK. |
countTokens API |
A REST API endpoint that returns the token count and billable characters for a prompt. | Language-agnostic (works with any language that can make REST calls) and provides billable character count. | Requires a network API call, which introduces latency. |
Supported models
The following multimodal models support getting an estimate of the prompt token count:
- Gemini 2.5 Flash-Lite
- Gemini 2.0 Flash with image generation (Preview)
- Vertex AI Model Optimizer (Experimental)
- Gemini 2.5 Pro
- Gemini 2.5 Flash
- Gemini 2.0 Flash
- Gemini 2.0 Flash-Lite
To learn more about model versions, see Gemini model versions and lifecycle.
Get the token count for a prompt
You can get a token count estimate and the number of billable characters for a prompt by using the Vertex AI API.
Console
To get the token count for a prompt by using Vertex AI Studio in the Google Cloud console, do the following:
- In the Vertex AI section of the Google Cloud console, go to the Vertex AI Studio page.
- Click either Open Freeform or Open Chat.
- The number of tokens is calculated and displayed as you type in the Prompt pane. This count includes the number of tokens in any input files.
- To see more details, click <count> tokens to open the Prompt tokenizer.
- To view the tokens in the text prompt, click Token ID to text. The tokens are highlighted with different colors to mark the boundary of each token ID. This view doesn't support media tokens.
- To view the token IDs, click Token ID.
To close the tokenizer tool pane, click X, or click outside of the pane.
Python
Install
pip install --upgrade google-genai
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values # with appropriate values for your project. export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT export GOOGLE_CLOUD_LOCATION=global export GOOGLE_GENAI_USE_VERTEXAI=True
Go
Learn how to install or update the Go.
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values # with appropriate values for your project. export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT export GOOGLE_CLOUD_LOCATION=global export GOOGLE_GENAI_USE_VERTEXAI=True
Node.js
Install
npm install @google/genai
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values # with appropriate values for your project. export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT export GOOGLE_CLOUD_LOCATION=global export GOOGLE_GENAI_USE_VERTEXAI=True
Java
Learn how to install or update the Java.
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values # with appropriate values for your project. export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT export GOOGLE_CLOUD_LOCATION=global export GOOGLE_GENAI_USE_VERTEXAI=True
REST
To get the token count and the number of billable characters for a prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.
Before using any of the request data, make the following replacements:
- LOCATION: The region to process the request. Available
options include the following:
Click to expand a partial list of available regions
us-central1
us-west4
northamerica-northeast1
us-east4
us-west1
asia-northeast3
asia-southeast1
asia-northeast1
- PROJECT_ID: Your project ID.
- MODEL_ID: The model ID of the multimodal model that you want to use.
- ROLE:
The role in a conversation associated with the content. Specifying a role is required even in
singleturn use cases.
Acceptable values include the following:
USER
: Specifies content that's sent by you.
- TEXT: The text instructions to include in the prompt.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:countTokens
Request JSON body:
{ "contents": [{ "role": "ROLE", "parts": [{ "text": "TEXT" }] }] }
To send your request, choose one of these options:
curl
Save the request body in a file named request.json
,
and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:countTokens"
PowerShell
Save the request body in a file named request.json
,
and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:countTokens" | Select-Object -Expand Content
You should receive a JSON response similar to the following.
Example for text with image or video
Python
Install
pip install --upgrade google-genai
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values # with appropriate values for your project. export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT export GOOGLE_CLOUD_LOCATION=global export GOOGLE_GENAI_USE_VERTEXAI=True
Go
Learn how to install or update the Go.
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values # with appropriate values for your project. export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT export GOOGLE_CLOUD_LOCATION=global export GOOGLE_GENAI_USE_VERTEXAI=True
Node.js
Install
npm install @google/genai
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values # with appropriate values for your project. export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT export GOOGLE_CLOUD_LOCATION=global export GOOGLE_GENAI_USE_VERTEXAI=True
Java
Learn how to install or update the Java.
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values # with appropriate values for your project. export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT export GOOGLE_CLOUD_LOCATION=global export GOOGLE_GENAI_USE_VERTEXAI=True
REST
To get the token count and the number of billable characters for a prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.
MODEL_ID="gemini-2.5-flash" PROJECT_ID="my-project" TEXT="Provide a summary with about two sentences for the following article." REGION="us-central1" curl \ -X POST \ -H "Authorization: Bearer $(gcloud auth print-access-token)" \ -H "Content-Type: application/json" \ https://${REGION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${REGION}/publishers/google/models/${MODEL_ID}:countTokens -d \ $'{ "contents": [{ "role": "user", "parts": [ { "file_data": { "file_uri": "gs://cloud-samples-data/generative-ai/video/pixel8.mp4", "mime_type": "video/mp4" } }, { "text": "'"$TEXT"'" }] }] }'
Pricing and quota
There is no charge for using the CountTokens
API. The maximum quota for this API is 3,000 requests per minute.
What's next
- Learn how to use the Vertex AI SDK for Python to list and count tokens (Preview).
- Learn about sending chat prompts and text generation.