Hume AI Unveils Second Generation Emotional Intelligence AI Ahead of OpenAI
On September 11, 2024, Hume AI launched the Empathic Voice Interface 2 (EVI 2), which it touts as the world's first conversational AI with emotional intelligence. The new version cuts response latency by 40% and lowers pricing by about 30% compared with its predecessor, EVI 1.
EVI 2 analyzes users' speech—such as accents, tone, pitch, intonation, rhythm, and pauses—to understand their emotions and psychological states, providing real-time responses. Improvements include enhanced voice quality, increased emotional intelligence, and support for custom voice settings.
Hume AI was founded in 2021 by former Google DeepMind researcher Alan Cowen, who serves as CEO and chief scientist. The company raised $50 million in Series B funding in March 2024.
Feature Enhancements: Improved Voice Quality, Greater Emotional Intelligence, and Custom Voices
EVI 2 incorporates an advanced voice generation model and an empathic large language model (eLLM), enabling it to process and generate both text and audio. This multimodal approach results in more natural-sounding speech, more appropriate intonation, and greater expressiveness.
By handling voice and language in the same model, EVI 2 can better understand the emotional nuances of user inputs, allowing for empathetic adjustments in both content and tone.
Additionally, EVI 2 supports custom voice settings, enabling developers to adjust parameters like pitch, timbre, and gender to suit specific applications, such as customer service bots or virtual AI assistants. Users can even modify EVI 2's speaking style dynamically during interactions.
Cowen noted that EVI 2 will not offer voice cloning features due to high risks and unclear benefits, emphasizing a focus on customizable voices instead.
Cost and Efficiency Improvements: 40% Lower Latency and 30% Lower Pricing, More Languages Expected by Year-End
EVI 2 features an average response time of 500 to 800 milliseconds, enhancing conversational fluidity. Hume AI has also reduced pricing by approximately 30%, from $0.102 per minute for EVI 1 to $0.072 per minute for EVI 2, with bulk discounts available for enterprises.
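The pricing figures above can be checked with a few lines of arithmetic. The sketch below uses only the two per-minute prices quoted in the article. The 1,000-minute workload is a hypothetical example, and bulk discounts are ignored because the article does not give their size.

```python
# Per-minute prices quoted in the article (USD).
EVI1_PRICE_PER_MIN = 0.102
EVI2_PRICE_PER_MIN = 0.072


def percent_reduction(old: float, new: float) -> float:
    """Return the relative drop from old to new, in percent."""
    return (old - new) / old * 100


def conversation_cost(price_per_min: float, minutes: float) -> float:
    """Cost at a flat per-minute rate (no bulk discount applied)."""
    return price_per_min * minutes


reduction = percent_reduction(EVI1_PRICE_PER_MIN, EVI2_PRICE_PER_MIN)
print(f"Price reduction: {reduction:.1f}%")  # prints "Price reduction: 29.4%"

# Hypothetical workload: 1,000 minutes of conversation.
minutes = 1000
saving = (conversation_cost(EVI1_PRICE_PER_MIN, minutes)
          - conversation_cost(EVI2_PRICE_PER_MIN, minutes))
print(f"Saving over {minutes} minutes: ${saving:.2f}")  # prints "$30.00"
```

The exact reduction works out to about 29.4%, which is consistent with the "approximately 30%" figure.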
However, according to VentureBeat, OpenAI’s text-to-speech services remain significantly cheaper than Hume AI’s offering. Currently, EVI 2 only supports English, with plans to add support for Spanish, French, and German by the end of 2024.
Cowen explained that EVI 2 picked up multiple languages from its training data without being trained on them specifically, which allows it to generate speech in various languages.
Conclusion: Early Launch Aims to Capture Market Share
Hume AI has positioned itself ahead of potential competitors, such as Anthropic, whose models are reportedly being used to revamp Amazon’s Alexa. Meanwhile, OpenAI's advanced voice mode, powered by the GPT-4o model, remains limited to a small user group.
Although it is less well known than OpenAI or Anthropic, Hume AI has launched a human-like voice assistant that is available for use immediately, potentially securing a foothold in a competitive market.
For more information, visit: Hume AI Official Site