Groq
A client for Groq running in OpenAI-compatible mode (which is the default).
It attempts to adhere to the OpenAI API specification as strictly as possible (even if some features are not supported by Forma).
🔑 Authentication: uses the GROQ_API_KEY API key
Note: the documentation below has been adapted nearly verbatim from OpenAI's documentation, for reference.
Full Specification
endpoint: string
model: string
stream: boolean # optional
tools:
  - OpenAITool
  - ... # optional
format: JsonSchema # optional
num_keep: int # optional
seed: int # optional
num_predict: int # optional
top_k: int # optional
response_format: OpenAIResponseFormat # optional
top_p: number # optional
min_p: number # optional
typical_p: number # optional
repeat_last_n: int # optional
temperature: number # optional
repeat_penalty: number # optional
presence_penalty: number # optional
frequency_penalty: number # optional
stop:
  - string
  - ... # optional
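For orientation, a minimal configuration combining the fields above might look like the sketch below; the endpoint and model values are illustrative placeholders, not verified defaults:

endpoint: https://api.groq.com/openai/v1   # illustrative; point this at your Groq-compatible endpoint
model: llama-3.1-8b-instant                # illustrative model name
stream: true                               # optional: stream the response as it is generated
temperature: 0.7                           # optional sampling parameter
stop:                                      # optional stop sequences
  - "###"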
endpoint
The endpoint to utilize. Defaults to http://127.0.0.1:11434
model
The model to utilize. Defaults to llama3.1:8b
stream (optional)
If set to true, the model response data will be streamed to the client as it is generated using server-sent events. See the Streaming section below, along with the streaming responses guide, for more information on how to handle the streaming events.
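For example, streaming can be turned on with a single line in the configuration:

stream: true  # emit the response incrementally as server-sent events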
tools (optional)
A list of tools the model may select as appropriate to call.
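A sketch of a tools entry, assuming OpenAITool mirrors OpenAI's function-calling tool definition (the tool name and fields below are illustrative and may differ in Forma):

tools:
  - type: function
    function:
      name: get_weather                    # hypothetical tool name
      description: Get the current weather for a city
      parameters:                          # JSON schema for the tool's arguments
        type: object
        properties:
          city:
            type: string
        required:
          - city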
format (optional)
An object specifying the format that the model must output.
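As an illustration, and assuming format accepts a JSON schema directly (per the JsonSchema type in the specification above), a schema constraining the output to a simple object might look like:

format:
  type: object
  properties:
    answer:
      type: string
    confidence:
      type: number
  required:
    - answer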
num_keep (optional)
seed (optional)
num_predict (optional)
top_k (optional)
response_format (optional)
An object specifying the format that the model must output.
Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs, which ensures the model will match your supplied JSON schema.
Setting to { "type": "json_object" } enables the older JSON mode, which ensures the message the model generates is valid JSON.
Using json_schema is preferred for models that support it.
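A sketch of a structured-output configuration, assuming OpenAIResponseFormat mirrors OpenAI's response_format object (the schema name and fields below are illustrative):

response_format:
  type: json_schema
  json_schema:
    name: answer                 # illustrative schema name
    schema:
      type: object
      properties:
        answer:
          type: string
      required:
        - answer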
top_p (optional)
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.
We generally recommend altering this or temperature but not both.
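For example, a configuration that relies on nucleus sampling rather than temperature might look like this (the value is illustrative):

top_p: 0.1   # consider only the tokens in the top 10% probability mass
# temperature is left unset, since altering both is not recommended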