Vozy from Colombia develops neuronal text-to-speech solution for the Spanish language

October 29, 2019


Contxto – Back in the day, text-to-speech (TTS) technology was boring and monotonous. Not wanting to lull customers to sleep, some startups such as Vozy are developing solutions to make these machine-generated voices more lifelike.

The Colombian startup had already developed advanced English TTS technology, and it now offers Spanish services, too.

Given the regional diversity of the Spanish language, Vozy's bilingual virtual agents can now distinguish between various accents. This way, the AI technology behind its neuronal TTS can adapt to customers depending on, for example, how fast they speak or how they roll their "rs."

Neuronal TTS

Out with the old and in with the new. Although Vozy has Colombian origins, the Miami-based startup intends to replace standard TTS models with neuronal upgrades.

First, let's backtrack. By standard, I mean the agonizingly dull voices with little to no character that are now being replaced with something more relatable. Because machine-generated voices follow text-to-speech scripts, the original system divided the text into small units.

Like a puzzle, pieces of audio were essentially joined together according to those units. Typically, this required large amounts of recorded data so that the audio corresponded accurately with the text. Needless to say, this was often a long and complicated process.
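
To make the puzzle analogy concrete, here is a minimal Python sketch of the unit-joining idea. It is an illustration only, not Vozy's system or any real TTS engine: the `UNIT_BANK` dictionary, its sample values, and the fixed two-letter units are all made up (real systems worked with phonetic units and recorded audio).

```python
# Toy unit-joining synthesis: look up a stored clip per unit and join the clips.
# UNIT_BANK and its sample values are hypothetical, for illustration only.
UNIT_BANK = {
    "ho": [0.1, 0.3, 0.2],
    "la": [0.4, 0.0, -0.2],
}


def split_into_units(text, size=2):
    """Cut the text into fixed-size units (an assumed, simplified unit scheme)."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def concatenate(text, bank):
    """Join the stored clip for every unit; fail if a unit was never recorded."""
    samples = []
    for unit in split_into_units(text):
        if unit not in bank:
            raise KeyError(f"no recording for unit {unit!r}")
        samples.extend(bank[unit])
    return samples


audio = concatenate("hola", UNIT_BANK)
```

The `KeyError` branch shows why this approach needed so much data: any unit missing from the bank simply cannot be spoken.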

By contrast, neuronal TTS sounds more realistic because machine learning models convert the text to voice. First, the text goes into the system and passes through an acoustic generator. From there, the result goes to a vocoder, where the sound is produced.
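
The two stages can be pictured with a toy Python sketch. Everything here is a stand-in chosen for illustration: in a real system both stages are trained models, whereas this sketch uses a made-up rule that assigns each character a pitch and renders it as a short sine tone. The names `acoustic_generator`, `vocoder`, `SAMPLE_RATE`, and `SAMPLES_PER_FRAME` are assumptions, not Vozy's API.

```python
import math

SAMPLE_RATE = 8000        # samples per second (assumed)
SAMPLES_PER_FRAME = 400   # audio samples rendered per acoustic frame (assumed)


def acoustic_generator(text):
    """Stand-in for the learned model: one made-up pitch value (Hz) per character."""
    return [200.0 + 10.0 * (ord(ch) % 20) for ch in text]


def vocoder(frames):
    """Stand-in for the vocoder: render each frame as a short sine tone."""
    audio = []
    for freq in frames:
        for n in range(SAMPLES_PER_FRAME):
            audio.append(math.sin(2 * math.pi * freq * n / SAMPLE_RATE))
    return audio


def synthesize(text):
    """Text -> acoustic frames -> audio samples, mirroring the two stages."""
    return vocoder(acoustic_generator(text))


audio = synthesize("hola")
```

The point of the sketch is the data flow: characters become a sequence of acoustic frames, and only the last stage produces audio samples.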

With this comes the ability to train machines to adapt to unique speech styles, just like a human could. Rather than spending a year in Argentina to learn the regional accent, the neuronal model allows the machine to master these nuances in just a few hours. Overall, this process is faster and simpler than its predecessor.

Behind this service is machine learning that converts encoded text into culturally specific voices. Once the text is encoded as a string of characters, those characters are turned into a sequence of "cepstrum coefficients," compact numerical descriptions of the frequency content of the sound. When these pass through the vocoder, they become a continuous audio signal.
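
For readers curious about what a cepstrum coefficient is, the sketch below computes the textbook "real cepstrum" of one short audio frame in plain Python: the inverse Fourier transform of the log of the frame's magnitude spectrum. The article does not say which exact features Vozy's models use, so treat this as a general illustration; the function names and the `floor` parameter are assumptions.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform, fine for short illustrative frames."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def real_cepstrum(frame, floor=1e-12):
    """Inverse DFT of the log magnitude spectrum of one real-valued frame."""
    n = len(frame)
    # The floor avoids log(0) when a frequency bin is exactly zero.
    log_mag = [math.log(max(abs(v), floor)) for v in dft(frame)]
    # For a real frame, log_mag is symmetric, so the inverse DFT is real
    # and reduces to a cosine sum.
    return [sum(log_mag[k] * math.cos(2 * math.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]


# Example frame: 32 samples of a made-up two-tone signal.
frame = [math.sin(2 * math.pi * 3 * t / 32) + 0.5 * math.sin(2 * math.pi * 7 * t / 32)
         for t in range(32)]
coefficients = real_cepstrum(frame)[:13]  # keep the first few coefficients
```

Keeping only the first handful of coefficients is a common way to summarize the overall shape of the spectrum in a few numbers per frame.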

Voice recognition

Equipped with this communication solution, companies will be better able to serve customers in the Spanish-speaking world. All in all, the Colombian startup combines voice technology, AI and human understanding to develop personalized customer interactions at scale.

So far, the neuronal text-to-speech technology is available in eight accents. These reportedly include Colombian, Mexican, Argentine, Peruvian, and Puerto Rican, among others. Today, Vozy has more than 200 customers in 15 countries, including MAPFRE and Infopáginas in Puerto Rico.

Recently, Vozy raised some funds from the Puerto Rico Science, Technology and Research Trust after collaborating with the Parallel18 accelerator. According to Vozy, it's the only Latin American company providing this type of technology for the Spanish language.

-JA
