How Kits AI Sources AI Training Datasets, Ethically
Published on April 11, 2024
Kits is a musician- and vocalist-first organization. We understand the nuanced debate around AI tools that use artists' likenesses or inform their creative process, and we know how valuable it is to support artists as we build our technology. We closely follow the Artist Rights Alliance's work urging tech companies to stop using AI that infringes on artists' rights, and we proudly support the ARA's mission to prioritize ethics. Here, we want to share how and why our data sourcing practices support the work of musicians and creatives.
How Voice Models Are Created
Let's quickly discuss how AI voice models work. Each AI voice on Kits is a uniquely fine-tuned AI model. To mimic a real voice, the model trains on a reference audio dataset. Ideally, this dataset comprises 30 minutes of high-quality dry vocals. Every Kits model is trained on such a vocal dataset, so the resulting voice model is as close to the original as possible.
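To make the 30-minute guideline concrete, here is a minimal sketch of how someone might check the total length of a folder of WAV recordings before training. It is an illustration only, not Kits' actual tooling: the function names, the folder layout, and the assumption that the vocals are stored as uncompressed WAV files are all hypothetical.

```python
import wave
from pathlib import Path

# The article's guideline of roughly 30 minutes of dry vocals.
TARGET_MINUTES = 30.0


def wav_duration_seconds(path):
    """Return the length of one WAV file in seconds (frames / sample rate)."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def dataset_duration_minutes(folder):
    """Sum the length of every .wav file directly inside a (hypothetical) dataset folder."""
    total_seconds = sum(
        wav_duration_seconds(p) for p in sorted(Path(folder).glob("*.wav"))
    )
    return total_seconds / 60.0


def meets_target(total_minutes, target_minutes=TARGET_MINUTES):
    """True when the dataset is at least as long as the target."""
    return total_minutes >= target_minutes
```

This check covers length only. Whether the vocals are dry and of high quality still has to be judged separately, for example by listening.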
Our AI Voice Library on Kits is continually growing, and for each model we sourced a high-quality dataset of singing vocals to train on. It’s easy to do this the irresponsible way and use datasets that artists have not approved and that do not support them in any way. So how did we approach this responsibly?
Many AI voice platforms simply scrape vocals from the web and rush to train models, aiming for sheer quantity. This approach is not only unethical but also puts end users at significant risk.
If a user converts audio with a noncompliant voice model, anything they create could face copyright claims and takedown notices. This means artists' voices are used without consent, and any work created with those models is also at risk.
How Kits Sources Our Training Data, Ethically
Kits models train exclusively on vocal data for which we acquire full rights. We begin by contacting session vocalists and studio partners interested in providing vocal datasets. We educate providers on AI model training and create contracts to compensate them and rightfully acquire their vocal datasets.
On the provider side, this agreement ensures vocalists understand the nuances of AI Voice technology and receive compensation for any vocals they provide. On the Kits side, it ensures that any model you use from the Kits Library has been fairly sourced and that you, as the end user, retain full rights over any work created with that model, now and into the future.
We started Kits to show artists and the music industry how AI Voice technology can be used for good, and there is still plenty of work to be done. In the coming months, we will share more on how we are developing innovative tools to help put artists in the driver's seat of their own IP and help inform the future of AI Voice technology.
Are you an artist looking to share your voice with the world safely, ethically, and with compensation? We’d love to hear from you! Please reach out to us at outreach@kits.ai.
Best,
The Kits Team