Hume, a New York-based artificial intelligence (AI) firm, unveiled a new tool on Monday that will allow users to customise AI voices. Dubbed Voice Control, the new feature is aimed at helping developers integrate these voices into their chatbots and other AI-based applications. Instead of offering a large range of voices, the company offers granular control over 10 different dimensions of voices. By selecting the desired parameters in each of the dimensions, users can generate unique voices for their apps.
The company detailed the new AI tool in a blog post. Hume stated that it is trying to solve the problem of enterprises finding the right AI voice to match their brand identity. With this feature, users can customise different aspects of how a voice is perceived, allowing developers to create a more assertive, relaxed, or buoyant voice for AI-based applications.
Hume’s Voice Control is currently available in beta, but it can be accessed by anyone registered on the platform. Gadgets 360 staff members were able to access the tool and test the feature. There are 10 different dimensions developers can adjust: gender, assertiveness, buoyancy, confidence, enthusiasm, nasality, relaxedness, smoothness, tepidity, and tightness.
Instead of adding prompt-based customisation, the company has added a slider that goes from -100 to +100 for each of the metrics. The company stated that this approach was taken to eliminate the vagueness associated with the textual description of a voice and to offer granular control over the languages.
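To illustrate the slider model described above, here is a minimal Python sketch of how a set of the 10 dimensions might be represented and validated against the -100 to +100 range. It is not Hume's API: the function name `make_voice_settings`, the dictionary layout, and the choice of 0 as the default for untouched sliders are all assumptions made for illustration.

```python
# Hypothetical sketch, not Hume's API. The dimension names come from the
# article; everything else (names, layout, default of 0) is assumed.
DIMENSIONS = (
    "gender", "assertiveness", "buoyancy", "confidence", "enthusiasm",
    "nasality", "relaxedness", "smoothness", "tepidity", "tightness",
)
SLIDER_MIN, SLIDER_MAX = -100, 100


def make_voice_settings(**overrides):
    """Return a value for every dimension; unspecified sliders stay at 0."""
    settings = {name: 0 for name in DIMENSIONS}
    for name, value in overrides.items():
        if name not in settings:
            raise KeyError(f"unknown dimension: {name}")
        if not SLIDER_MIN <= value <= SLIDER_MAX:
            raise ValueError(
                f"{name} must be between {SLIDER_MIN} and {SLIDER_MAX}"
            )
        settings[name] = value
    return settings


# Example: a more assertive, slightly less relaxed voice.
assertive_voice = make_voice_settings(assertiveness=60, relaxedness=-20)
```

Rejecting out-of-range values, rather than silently clamping them, mirrors a slider that cannot physically move past its end stops.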
In our testing, we found that changing any of the 10 dimensions makes an audible difference to the AI voice, and the tool was able to disentangle the different dimensions correctly. The AI firm claimed that this was achieved by developing a new “unsupervised approach” which preserves most characteristics of each base voice when specific parameters are varied. Notably, Hume did not detail the source of the procured data.
After creating an AI voice, developers will have to deploy it to the application by configuring its Empathic Voice Interface (EVI) AI model. While the company did not specify, the EVI-2 model was likely used for this experimental feature.
In the future, Hume plans to expand the range of base voices, introduce more interpretable dimensions, enhance the preservation of voice characteristics under extreme modifications, and develop advanced tools to analyse and visualise voice characteristics.