Samples for Hierarchical Timbre-Painting and Articulation Generation

A comparison to the DDSP method is presented in the following audio samples.

We trained models for four timbre-transfer targets: cello, saxophone, trumpet, and violin.

Six source domains were used for the mean opinion score (MOS) evaluation: clarinet, violin, female singer, male singer, trumpet, and saxophone.

Our method achieves better results than DDSP on both target-domain similarity and melody preservation.

Audio demo on the evaluation subset (please use headphones for better evaluation):

Clarinet Saxophone Female Singer Male Singer Trumpet Violin
Inputs
DDSP-Cello
Our-Cello
DDSP-Saxophone
Our-Saxophone
DDSP-Trumpet
Our-Trumpet
DDSP-Violin
Our-Violin