PlayDiffusion
Inpaint
Text to Speech
Voice Conversion
Upload an audio file and run ASR to get the text.
Then, specify the desired output text.
Run the inpainter to generate the modified audio.
Note: The model and demo currently target English.
Advanced options
number of sampling steps codebook (range: 1 to 100)
Initial temperature (range: 0.5 to 10)
Initial diversity (range: 0 to 10)
guidance (range: 0 to 10)
guidance rescale factor (range: 0 to 1)
sampling from top-k logits (range: 1 to 10000)
Audio Token Syllable Ratio
Automatic calculation (recommended) provides the best results in most cases.
Use manual audio token syllable ratio
Audio token syllable ratio (manual)
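As a rough illustration of what the temperature and top-k options usually control in token samplers, here is a minimal generic sketch in Python. It is not taken from PlayDiffusion; the function name `top_k_sample` and its parameters are hypothetical, and how the demo applies these settings internally is an assumption.

```python
import math
import random


def top_k_sample(logits, k, temperature, rng=random):
    """Sample one index from `logits` after top-k filtering and temperature scaling.

    Generic illustration only; not PlayDiffusion's actual sampler.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # Keep only the indices of the k highest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]

    # Divide by the temperature: values above 1 flatten the distribution,
    # values below 1 sharpen it.
    scaled = [logits[i] / temperature for i in top]

    # Numerically stable softmax weights over the surviving candidates.
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]

    # Draw one of the surviving indices in proportion to its weight.
    return rng.choices(top, weights=weights, k=1)[0]
```

With `k=1` the call always returns the highest-scoring index; a larger `k` and a higher temperature admit more variety in the sampled tokens.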
Upload audio to be modified
Run ASR
Input text from ASR
Desired output text
Word times from ASR
Run Inpainter
Output audio
Text to Speech
Advanced options
number of sampling steps codebook (range: 1 to 100)
Initial temperature (range: 0.5 to 10)
Initial diversity (range: 0 to 10)
guidance (range: 0 to 10)
guidance rescale factor (range: 0 to 1)
sampling from top-k logits (range: 1 to 10000)
Audio Token Syllable Ratio
Automatic calculation (recommended) provides the best results in most cases.
Use manual audio token syllable ratio
Audio token syllable ratio (manual)
TTS Input
Voice to use for TTS
Convert to Speech
Generated Speech
Real-Time Voice Conversion (works best for English)
Source Conversion Speech
Target Voice
Real-Time Voice Conversion
Converted Speech