Temperature:
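Controls the randomness of the model's output by rescaling the next-token probability distribution before sampling. A lower temperature concentrates probability on the most likely tokens, making the output more deterministic; a higher temperature flattens the distribution, making the output more varied and less predictable.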
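As a rough illustration of the temperature setting, the sketch below divides raw scores (logits) by the temperature before applying softmax. The function name softmax_with_temperature and the example logits are made up for this example; real inference stacks may handle temperature differently, for instance by treating a temperature of 0 as greedy decoding.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by a temperature.

    Hypothetical helper for illustration only, not any library's API.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three candidate tokens.
logits = [2.0, 1.0, 0.1]
cold = softmax_with_temperature(logits, 0.5)     # sharper distribution
neutral = softmax_with_temperature(logits, 1.0)  # plain softmax
hot = softmax_with_temperature(logits, 2.0)      # flatter distribution
```

Comparing the first entry of cold, neutral, and hot shows the top token's share of probability shrinking as the temperature rises.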
Top-P (Nucleus Sampling):
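Restricts sampling to the smallest set of most likely tokens whose cumulative probability reaches the Top-P value (the "nucleus"); the next token is then drawn from that set only. A lower value makes the model more focused and deterministic, while a higher value admits less likely tokens and increases randomness.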
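One way to picture nucleus sampling is as a filter over the next-token probabilities. The sketch below is a minimal version under stated assumptions: top_p_filter is a hypothetical name, tokens are identified by list index, and the probabilities are invented. It keeps the most likely tokens until their cumulative probability reaches the threshold, then renormalizes the survivors.

```python
def top_p_filter(probs, top_p):
    """Return {token_index: renormalized probability} for the nucleus.

    Hypothetical helper for illustration only, not any library's API.
    """
    if not 0 < top_p <= 1:
        raise ValueError("top_p must be in (0, 1]")
    # Rank token indices from most to least likely.
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    kept = []
    cumulative = 0.0
    for index, p in ranked:
        kept.append((index, p))
        cumulative += p
        if cumulative >= top_p:
            break  # the nucleus now covers at least top_p of the mass
    total = sum(p for _, p in kept)
    return {index: p / total for index, p in kept}


# Made-up probabilities for four candidate tokens.
probs = [0.5, 0.3, 0.15, 0.05]
focused = top_p_filter(probs, 0.75)  # keeps only the two likeliest tokens
broad = top_p_filter(probs, 1.0)     # keeps every token
```

Sampling would then draw from the returned dictionary, for example with random.choices over its keys and values.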
Max Output Tokens:
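Caps the number of tokens (words or subword pieces) the model can generate in its response. Generation stops as soon as the cap is reached, so a response that needs more tokens is cut off rather than condensed. Use this setting to bound the length of the response.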
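The effect of the cap can be sketched as a simple generation loop. Everything here is an assumption made for illustration: generate, scripted_model, the "<eos>" stop token, and the "stop" / "length" reason labels are hypothetical and do not describe any particular API.

```python
def generate(next_token, max_output_tokens, stop_token="<eos>"):
    """Collect tokens until the model stops or the cap is reached.

    next_token is a hypothetical callable: it receives the tokens produced
    so far and returns the next one.
    """
    output = []
    while len(output) < max_output_tokens:
        token = next_token(output)
        if token == stop_token:
            return output, "stop"    # the model finished on its own
        output.append(token)
    return output, "length"          # cut off by the cap


def scripted_model(script):
    """Toy stand-in for a model: replays a fixed list of tokens."""
    def next_token(so_far):
        return script[len(so_far)] if len(so_far) < len(script) else "<eos>"
    return next_token


model = scripted_model(["The", " cat", " sat", " down", "."])
short, short_reason = generate(model, max_output_tokens=3)
full, full_reason = generate(model, max_output_tokens=10)
```

With a cap of 3 the toy response ends mid-sentence, while a cap of 10 leaves room for the full five-token reply.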