Input image
Input prompt
Specify things to not see in the output
0
100
Number of diffusion steps
5
Conditioning scale
1
Factor to scale image by
10
Guidance scale to match the prompt
4
Number of outputs to generate
Which scheduler to use
Random seed for reproducibility, leave blank to randomize output
Select the language for the output
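
The inputs listed above could be collected into a single request object before being sent to a model. The sketch below is only an illustration under stated assumptions: every field name and every default value is hypothetical (none comes from the form itself), and the only behavior it models is the seed rule, where a blank seed means a random one is drawn.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class GenerationInputs:
    # All field names are hypothetical; each mirrors one label in the form.
    # All default values are illustrative placeholders, not documented defaults.
    image_path: str                    # Input image
    prompt: str                        # Input prompt
    negative_prompt: str = ""          # Things to not see in the output
    num_steps: int = 20                # Number of diffusion steps
    conditioning_scale: float = 1.0    # Conditioning scale
    scale_factor: float = 2.0          # Factor to scale image by
    guidance_scale: float = 7.0        # Guidance scale to match the prompt
    num_outputs: int = 1               # Number of outputs to generate
    scheduler: str = "default"         # Which scheduler to use
    seed: Optional[int] = None         # Blank (None) means randomize
    output_language: str = "en"        # Language for the output


def resolve_seed(seed: Optional[int]) -> int:
    """Return the given seed, or draw a random 32-bit seed when it is blank."""
    if seed is None:
        return random.randrange(2**32)
    return seed


# Example: the seed is left blank, so a random one is chosen at run time.
inputs = GenerationInputs(image_path="input.png", prompt="a detailed photo")
effective_seed = resolve_seed(inputs.seed)
```

Recording `effective_seed` alongside the output is what makes a randomized run reproducible later: passing the same value back in as `seed` skips the random draw.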