GetChatCompletionsOptions interface

This module contains models that we want to live side-by-side with the corresponding generated models. This is useful for providing customer-facing models that have different names/types than the generated models.

Extends

OperationOptions

Properties

azureExtensionOptions

The configuration entries for the Azure OpenAI chat extensions that the request should use. This additional specification is only compatible with Azure OpenAI.

frequencyPenalty

A value that influences the probability of generated tokens appearing based on their cumulative frequency in generated text. Positive values will make tokens less likely to appear as their frequency increases and decrease the likelihood of the model repeating the same statements verbatim.

functionCall

Controls how the model responds to function calls. "none" means the model does not call a function and responds to the end user. "auto" means the model can choose between responding to the end user and calling a function. Specifying a particular function via {"name": "my_function"} forces the model to call that function. "none" is the default when no functions are present. "auto" is the default if functions are present.

functions

A list of functions the model may generate JSON inputs for.

logitBias

A map between GPT token IDs and bias scores that influences the probability of specific tokens appearing in a completions response. Token IDs are computed via external tokenizer tools, while bias scores reside in the range of -100 to 100 with minimum and maximum values corresponding to a full ban or exclusive selection of a token, respectively. The exact behavior of a given bias score varies by model.

maxTokens

The maximum number of tokens to generate.

n

The number of chat completions choices that should be generated for a chat completions response. Because this setting can generate many completions, it may quickly consume your token quota. Use carefully and ensure reasonable settings for maxTokens and stop.

presencePenalty

A value that influences the probability of generated tokens appearing based on their existing presence in generated text. Positive values will make tokens less likely to appear when they already exist and increase the model's likelihood to output new topics.

responseFormat

An object specifying the format that the model must output. Used to enable JSON mode.

seed

If specified, the system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed, and you should refer to the system_fingerprint response parameter to monitor changes in the backend.

stop

A collection of textual sequences that will end completions generation.

temperature

The sampling temperature to use that controls the apparent creativity of generated completions. Higher values will make output more random while lower values will make results more focused and deterministic. It is not recommended to modify temperature and topP for the same completions request as the interaction of these two settings is difficult to predict.

toolChoice

If specified, configures which of the provided tools the model can use for the chat completions response.

tools

The available tool definitions that the chat completions request can use, including caller-defined functions.

topP

An alternative to sampling with temperature called nucleus sampling. This value causes the model to consider the results of tokens with the provided probability mass. As an example, a value of 0.15 will cause only the tokens comprising the top 15% of probability mass to be considered. It is not recommended to modify temperature and topP for the same completions request as the interaction of these two settings is difficult to predict.

user

An identifier for the caller or end user of the operation. This may be used for tracking or rate-limiting purposes.

Inherited Properties

abortSignal

The signal which can be used to abort requests.

onResponse

A function to be called each time a response is received from the server while performing the requested operation. May be called multiple times.

requestOptions

Options used when creating and sending HTTP requests for this operation.

tracingOptions

Options used when tracing is enabled.

Property Details

azureExtensionOptions

The configuration entries for the Azure OpenAI chat extensions that the request should use. This additional specification is only compatible with Azure OpenAI.

azureExtensionOptions?: AzureExtensionsOptions

Property Value

AzureExtensionsOptions

frequencyPenalty

A value that influences the probability of generated tokens appearing based on their cumulative frequency in generated text. Positive values will make tokens less likely to appear as their frequency increases and decrease the likelihood of the model repeating the same statements verbatim.

frequencyPenalty?: number

Property Value

number

functionCall

Controls how the model responds to function calls. "none" means the model does not call a function and responds to the end user. "auto" means the model can choose between responding to the end user and calling a function. Specifying a particular function via {"name": "my_function"} forces the model to call that function. "none" is the default when no functions are present. "auto" is the default if functions are present.

functionCall?: string | FunctionName

Property Value

string | FunctionName
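
The accepted forms of functionCall and its documented defaults can be sketched as follows. This is a minimal illustration, not SDK code: FunctionNameSketch and FunctionCallSketch are local stand-ins for the SDK types (assuming FunctionName carries a name field, as the {"name": "my_function"} form suggests), and both helpers are hypothetical.

```typescript
// Local stand-in for the SDK's FunctionName type. Assumption: it carries a
// `name` field, matching the {"name": "my_function"} form described above.
interface FunctionNameSketch {
  name: string;
}

// Mirrors the documented property type: string | FunctionName.
type FunctionCallSketch = string | FunctionNameSketch;

// Hypothetical helper that reproduces the documented defaults:
// "none" when no functions are present, "auto" when at least one is.
function defaultFunctionCall(functionCount: number): FunctionCallSketch {
  return functionCount > 0 ? "auto" : "none";
}

// Hypothetical helper that tells the string forms apart from the forced form.
function describeFunctionCall(choice: FunctionCallSketch): string {
  return typeof choice === "string" ? choice : `force:${choice.name}`;
}

// The three documented forms.
const neverCall: FunctionCallSketch = "none";
const modelDecides: FunctionCallSketch = "auto";
const forceOne: FunctionCallSketch = { name: "my_function" };

console.log(
  describeFunctionCall(neverCall),
  describeFunctionCall(modelDecides),
  describeFunctionCall(forceOne),
);
```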

functions

A list of functions the model may generate JSON inputs for.

functions?: FunctionDefinition[]

Property Value

FunctionDefinition[]

logitBias

A map between GPT token IDs and bias scores that influences the probability of specific tokens appearing in a completions response. Token IDs are computed via external tokenizer tools, while bias scores reside in the range of -100 to 100 with minimum and maximum values corresponding to a full ban or exclusive selection of a token, respectively. The exact behavior of a given bias score varies by model.

logitBias?: Record<string, number>

Property Value

Record<string, number>
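
A logitBias map can be assembled as in the sketch below. The helpers clampBias and buildLogitBias are hypothetical and not part of the SDK, and the token IDs are placeholders rather than real tokenizer output. Clamping on the client is a design choice made here for illustration; how the service treats out-of-range values is not stated above.

```typescript
// Clamp a bias score into the documented range of -100 to 100.
function clampBias(score: number): number {
  return Math.max(-100, Math.min(100, score));
}

// Build a logitBias map. Keys are token IDs rendered as strings because the
// documented property type is Record<string, number>.
function buildLogitBias(entries: Array<[number, number]>): Record<string, number> {
  const bias: Record<string, number> = {};
  for (const [tokenId, score] of entries) {
    bias[String(tokenId)] = clampBias(score);
  }
  return bias;
}

// Placeholder token IDs, chosen only for illustration.
const logitBias = buildLogitBias([
  [1234, -100], // minimum value: corresponds to a full ban of the token
  [5678, 250], // out of range, so this sketch clamps it to 100
]);

console.log(logitBias);
```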

maxTokens

The maximum number of tokens to generate.

maxTokens?: number

Property Value

number

n

The number of chat completions choices that should be generated for a chat completions response. Because this setting can generate many completions, it may quickly consume your token quota. Use carefully and ensure reasonable settings for maxTokens and stop.

n?: number

Property Value

number
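
To see why n can consume quota quickly, the arithmetic can be sketched as a ceiling on generated tokens: each of the n choices may run up to maxTokens. The helper name generatedTokenCeiling is hypothetical, and the figure excludes prompt tokens.

```typescript
// Upper bound on generated tokens for one request: n choices, each of which
// may produce up to maxTokens tokens. Prompt tokens are not included.
function generatedTokenCeiling(n: number, maxTokens: number): number {
  if (!Number.isInteger(n) || n < 1) {
    throw new RangeError("n must be a positive integer");
  }
  if (!Number.isInteger(maxTokens) || maxTokens < 1) {
    throw new RangeError("maxTokens must be a positive integer");
  }
  return n * maxTokens;
}

// Five choices capped at 200 tokens each can generate up to 1000 tokens.
const ceiling = generatedTokenCeiling(5, 200);
console.log(ceiling);
```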

presencePenalty

A value that influences the probability of generated tokens appearing based on their existing presence in generated text. Positive values will make tokens less likely to appear when they already exist and increase the model's likelihood to output new topics.

presencePenalty?: number

Property Value

number
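
The difference between frequencyPenalty and presencePenalty can be illustrated with a small sketch. It assumes the adjustment formula given in OpenAI's public parameter notes (the token's score is reduced by count times the frequency penalty, plus the presence penalty once if the token has appeared at all). The function penalizedLogit is hypothetical, and the service's actual implementation may differ.

```typescript
// Adjust one token's raw score (logit) given how often the token has already
// appeared in the generated text. Assumed formula:
//   logit - count * frequencyPenalty - (count > 0 ? 1 : 0) * presencePenalty
function penalizedLogit(
  logit: number,
  count: number,
  frequencyPenalty: number,
  presencePenalty: number,
): number {
  const presenceTerm = count > 0 ? presencePenalty : 0;
  return logit - count * frequencyPenalty - presenceTerm;
}

// A token that has not appeared yet is unaffected.
const unseen = penalizedLogit(1.0, 0, 0.5, 0.25);

// A token seen three times: the frequency term scales with the count,
// while the presence term is applied once.
const repeated = penalizedLogit(1.0, 3, 0.5, 0.25);

console.log(unseen, repeated);
```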

responseFormat

An object specifying the format that the model must output. Used to enable JSON mode.

responseFormat?: ChatCompletionsResponseFormat

Property Value

ChatCompletionsResponseFormat

seed

If specified, the system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed, and you should refer to the system_fingerprint response parameter to monitor changes in the backend.

seed?: number

Property Value

number
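
One way to act on the advice about system_fingerprint is to compare fingerprints before treating two seeded results as comparable. The sketch below uses a local stand-in type; the field name systemFingerprint is an assumption about how the response exposes system_fingerprint, and sameBackend is a hypothetical helper.

```typescript
// Local stand-in for a response. Assumption: the fingerprint is exposed as an
// optional string field.
interface SeededResultSketch {
  text: string;
  systemFingerprint?: string;
}

// Two seeded results are only worth comparing for determinism when both
// report a fingerprint and the fingerprints match.
function sameBackend(a: SeededResultSketch, b: SeededResultSketch): boolean {
  return a.systemFingerprint !== undefined && a.systemFingerprint === b.systemFingerprint;
}

// Placeholder values, not real service output.
const first: SeededResultSketch = { text: "hello", systemFingerprint: "fp_a" };
const second: SeededResultSketch = { text: "hello", systemFingerprint: "fp_a" };
const afterChange: SeededResultSketch = { text: "hi", systemFingerprint: "fp_b" };

console.log(sameBackend(first, second), sameBackend(first, afterChange));
```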

stop

A collection of textual sequences that will end completions generation.

stop?: string[]

Property Value

string[]
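
The effect of stop can be pictured with a client-side sketch that cuts text at the earliest stop sequence. This is only an illustration: the service ends generation itself, and the sketch assumes the stop sequence is excluded from the returned text. The helper truncateAtStop is hypothetical.

```typescript
// Cut `text` at the earliest occurrence of any stop sequence.
// Empty sequences are ignored because they would match at position 0.
function truncateAtStop(text: string, stop: string[]): string {
  let cut = text.length;
  for (const sequence of stop) {
    if (sequence.length === 0) continue;
    const index = text.indexOf(sequence);
    if (index !== -1 && index < cut) {
      cut = index;
    }
  }
  return text.slice(0, cut);
}

const truncated = truncateAtStop("Answer: 42\nQuestion: next", ["\nQuestion:", "END"]);
console.log(JSON.stringify(truncated));
```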

temperature

The sampling temperature to use that controls the apparent creativity of generated completions. Higher values will make output more random while lower values will make results more focused and deterministic. It is not recommended to modify temperature and topP for the same completions request as the interaction of these two settings is difficult to predict.

temperature?: number

Property Value

number
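
The effect of temperature can be sketched with the standard temperature-scaled softmax, in which raw scores are divided by the temperature before being turned into probabilities. This is a toy illustration under that assumption, not a description of the service's internals; the function name and the scores are made up.

```typescript
// Convert raw scores (logits) into probabilities after dividing by temperature.
// This sketch requires a positive temperature to avoid dividing by zero.
function softmaxWithTemperature(logits: number[], temperature: number): number[] {
  if (temperature <= 0) {
    throw new RangeError("temperature must be positive in this sketch");
  }
  const scaled = logits.map((x) => x / temperature);
  const max = Math.max(...scaled); // subtract the max for numerical stability
  const exps = scaled.map((x) => Math.exp(x - max));
  const total = exps.reduce((sum, x) => sum + x, 0);
  return exps.map((x) => x / total);
}

const logits = [2.0, 1.0, 0.1];
const focused = softmaxWithTemperature(logits, 0.5); // sharper: the top token dominates
const neutral = softmaxWithTemperature(logits, 1.0);
const flatter = softmaxWithTemperature(logits, 2.0); // closer to uniform, so more random

console.log(Math.max(...focused), Math.max(...neutral), Math.max(...flatter));
```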

toolChoice

If specified, configures which of the provided tools the model can use for the chat completions response.

toolChoice?: ChatCompletionsNamedToolSelectionUnion

Property Value

ChatCompletionsNamedToolSelectionUnion

tools

The available tool definitions that the chat completions request can use, including caller-defined functions.

tools?: ChatCompletionsToolDefinitionUnion[]

Property Value

ChatCompletionsToolDefinitionUnion[]

topP

An alternative to sampling with temperature called nucleus sampling. This value causes the model to consider the results of tokens with the provided probability mass. As an example, a value of 0.15 will cause only the tokens comprising the top 15% of probability mass to be considered. It is not recommended to modify temperature and topP for the same completions request as the interaction of these two settings is difficult to predict.

topP?: number

Property Value

number
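
Nucleus sampling can be sketched as keeping the smallest set of highest-probability tokens whose cumulative probability reaches topP, and sampling only from that set. The toy distribution and the function nucleus below are made up for illustration and say nothing about the service's internals.

```typescript
// Return the tokens that survive nucleus filtering: walk the candidates from
// most to least probable and stop once the cumulative probability reaches topP.
function nucleus(candidates: Array<[string, number]>, topP: number): string[] {
  const sorted = candidates.slice().sort((a, b) => b[1] - a[1]);
  const kept: string[] = [];
  let cumulative = 0;
  for (const [token, probability] of sorted) {
    kept.push(token);
    cumulative += probability;
    if (cumulative >= topP) break;
  }
  return kept;
}

// Toy next-token distribution (probabilities sum to 1).
const distribution: Array<[string, number]> = [
  ["a", 0.3],
  ["the", 0.5],
  ["cat", 0.15],
  ["dog", 0.05],
];

const narrow = nucleus(distribution, 0.15); // only the single most likely token is needed
const wider = nucleus(distribution, 0.7); // the two most likely tokens cover 0.8

console.log(narrow, wider);
```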

user

An identifier for the caller or end user of the operation. This may be used for tracking or rate-limiting purposes.

user?: string

Property Value

string

Inherited Property Details

abortSignal

The signal which can be used to abort requests.

abortSignal?: AbortSignalLike

Property Value

AbortSignalLike

Inherited From OperationOptions.abortSignal
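
A common way to supply abortSignal is through the standard AbortController, for example to enforce a client-side timeout. The sketch below assumes a runtime that provides AbortController globally (modern browsers and recent Node.js) and that a standard AbortSignal is acceptable where AbortSignalLike is expected. The helper withTimeout is hypothetical.

```typescript
// Hypothetical helper: returns a signal that aborts after timeoutMs, plus a
// cancel function that aborts immediately and clears the pending timer.
function withTimeout(timeoutMs: number): { abortSignal: AbortSignal; cancel: () => void } {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  return {
    abortSignal: controller.signal,
    cancel: () => {
      clearTimeout(timer);
      controller.abort();
    },
  };
}

// The signal would be passed as the abortSignal option of a request.
const pending = withTimeout(60_000);
const abortedBefore = pending.abortSignal.aborted;

// Cancelling early aborts the signal and lets the process exit without waiting.
pending.cancel();
const abortedAfter = pending.abortSignal.aborted;

console.log(abortedBefore, abortedAfter);
```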

onResponse

A function to be called each time a response is received from the server while performing the requested operation. May be called multiple times.

onResponse?: RawResponseCallback

Property Value

RawResponseCallback

Inherited From OperationOptions.onResponse
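
Because onResponse may be called multiple times, a callback that accumulates state is usually safer than one that overwrites it. The sketch below records status codes with local stand-in types; the shape of RawResponseSketch (a numeric status field) is an assumption, and the real RawResponseCallback may receive additional arguments.

```typescript
// Local stand-ins. Assumption: the raw response exposes a numeric `status`.
interface RawResponseSketch {
  status: number;
}
type OnResponseSketch = (rawResponse: RawResponseSketch) => void;

// Hypothetical helper: returns a callback together with the list it fills.
// Each invocation appends, so repeated calls are all preserved.
function createStatusRecorder(): { onResponse: OnResponseSketch; statuses: number[] } {
  const statuses: number[] = [];
  return {
    statuses,
    onResponse: (rawResponse) => {
      statuses.push(rawResponse.status);
    },
  };
}

// Simulate the callback being invoked twice during one operation.
const recorder = createStatusRecorder();
recorder.onResponse({ status: 429 });
recorder.onResponse({ status: 200 });

console.log(recorder.statuses);
```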

requestOptions

Options used when creating and sending HTTP requests for this operation.

requestOptions?: OperationRequestOptions

Property Value

OperationRequestOptions

Inherited From OperationOptions.requestOptions

tracingOptions

Options used when tracing is enabled.

tracingOptions?: OperationTracingOptions

Property Value

OperationTracingOptions

Inherited From OperationOptions.tracingOptions