Speech Recognition: The Power of Voice
Compared to other technologies, the concept of speech recognition has been around for a long time. Sci-fi shows of the 1960s featured voice-controlled cars and robots, raising high hopes and bold predictions that voice-controlled gadgets would be the norm in the exciting future of the year 2000.
Decades later, we have seen the rise of cloud computing, which brings with it the processing power needed to make speech recognition a reality.
Now, speech recognition is set for growth. According to MarketsandMarkets, the speech recognition market will be worth an impressive $24.9 billion by 2025.
A brief history…
In 1952, Bell Laboratories designed its ‘Audrey’ system, which could recognise the digits 0 to 9 when spoken by a single voice. At the time, this was a remarkable feat of modern technology.
Ten years later, IBM showcased ‘Shoebox’, which could recognise 16 spoken English words, including the digits 0 to 9.
These were certainly advances in the field, but the technology was so complex that voice control still seemed likely to remain little more than a nice idea.
By the early 1970s, IBM had developed many of the underlying speech algorithms, but one thing was still missing: the computer processing power to make speech recognition a reality.
In the early 1990s, the SUNDIAL project led to the creation of the British Airways Flight Information Service. This service, operating in four languages (English, French, Italian and German), used speech recognition over the telephone to allow callers to check the arrival and departure times of flights at London’s Heathrow Airport.
Costing over £4 million to develop, it claimed 96.6% accuracy by asking callers questions that required only simple, short answers. The system was a great success and ran for many years. One group found it particularly useful: local taxi drivers, who wanted to know what time the delayed flight from JFK would actually land.
Modern-Day Speech Recognition is Everywhere
Now, the advent of Big Data, Machine Learning and Artificial Intelligence, coupled with cloud-based computing, has allowed speech systems to mature. Today we are surrounded by all manner of voice-controlled devices.
Take, for example, voice assistants such as ‘Alexa’, ‘Siri’, ‘OK Google’ and ‘Cortana’. These all rely on speech recognition to function. All of them have also seen a significant rise in the number of people using them to make voice calls to contact centres.
Speech recognition is now so prevalent that it’s easy to take it for granted. We forget how long its development has taken and how complex the technology is. The seeming simplicity of our own speech is misleading.
Learning to Talk
It takes children years to learn how to talk and understand speech. An adult can take up to two years to learn an entirely new language. We do this by listening and responding, and by correcting and remembering our mistakes.
Speech recognition technology works in a similar way, except that we use Big Data and Machine Learning to teach computers to listen and respond. This is still a work in progress, with machines learning thousands of languages, accents and dialects. Google is in the lead: its machine learning algorithms are reported to have achieved a 95% accuracy rate with the English language, which is about the same level as the average human.
Secure Payments and Speech Recognition
The introduction of this technology to our solutions means that businesses taking Cardholder Not Present (CNP) payments can offer their customers an additional secure option when paying for products and services. Customers who pay using speech recognition will also have the best possible experience, consistent with the rest of our suite of solutions.
Our focus at PCI Pal has always been to offer our customers and partners the most robust, globally available, true cloud environment for securing payments. Adding speech recognition is fully in line with this goal.