Compared to other technologies, speech recognition has been around for a long time. Voice-controlled cars and robots featured in 1960s sci-fi shows and built high hopes and predictions that speech-driven gadgets would become the norm in the exciting future of the year 2000. IBM had developed many of the speech algorithms by the early 1970s, but one thing was missing: the computer processing power to put them to practical use. It has taken the rise of cloud computing systems to finally deliver that processing power and make speech recognition a reality.
According to Research and Markets, the speech recognition market will be worth an impressive $3.5 billion by 2024.
A brief history…
In 1952, Bell Laboratories designed its ‘Audrey’ system. It could recognize the numbers 0 to 9 spoken by a single voice, which at the time was a remarkable feat of modern technology. Exactly ten years later, IBM showcased ‘Shoebox,’ which could recognize 16 spoken words, including the digits 0 to 9.
However, given the complexity involved, it seemed that voice-controlled technology would remain little more than a nice idea.
In the early 1990s, the SUNDIAL project led to the creation of the British Airways Flight Information Service. This service, which operated in four languages (English, French, Italian and German), used speech recognition over the telephone to allow callers to check arrival and departure times of flights at London’s Heathrow Airport. Costing over £4 million to develop, it claimed 96.6% accuracy, asking callers questions that required simple, short-sentence answers. The system was a great success and ran for many years. It was particularly popular with local taxi drivers who wanted to know what time the delayed flight from JFK would actually land.
The advent of Big Data, Machine Learning and Artificial Intelligence, coupled with cloud-based computing, has enabled speech systems to develop to the point where, today, we are surrounded by all manner of voice-controlled devices. Because of this, it’s easy to take for granted how long the development process has been and how the technology works. The simplicity of speech can be misleading.
It takes children years to learn how to talk and understand speech, and it can take an adult up to two years to learn an entirely new language. This is all done through listening and responding, with mistakes rectified and remembered. Essentially, Speech Recognition technology works in the same way, except that we are using Big Data and Machine Learning to teach computers to listen and respond. This is still a work in progress, as machines are learning thousands of languages, accents and dialects. That said, Google’s machine learning algorithms are reported to have achieved a 95% accuracy rate with the English language, which is about the same level as the average human.
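To put a figure like 95% in context, one common way of scoring a speech recognizer is word error rate (WER): the number of word substitutions, deletions and insertions needed to turn the recognizer's output into the correct transcript, divided by the length of that transcript. Accuracy is then roughly 1 minus WER. The sketch below is purely illustrative. The function name and sample sentences are our own, and we are not suggesting this is how the reported Google figure was measured.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between the two texts, divided by the
    number of words in the reference transcript."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j words of the hypothesis.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


# Illustrative sentences only: one misheard word out of ten.
spoken = "what time will the delayed flight from new york land"
heard = "what time will the delayed flight from new fork land"
wer = word_error_rate(spoken, heard)
print(f"Accuracy: {1 - wer:.0%}")  # prints "Accuracy: 90%"
```

By this measure, a 95% accuracy rate would mean about one word in twenty is misheard, dropped or added.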
Secure payments and Speech Recognition
Our focus at PCI Pal has always been to offer our customers and partners the most robust, globally available, true cloud environment for securing payments. We are delighted to have added Speech Recognition to PCI Pal Agent Assist and PCI Pal IVR. By waiting until now, we have ensured that PCI Pal customers using our Speech Recognition feature will have the best possible experience consistent with our entire suite of solutions.
Speech Recognition has come a long way in the last five years. The introduction of this technology to our solutions means that businesses taking Card Not Present (CNP) payments can offer their customers an additional secure option when paying for products and services.
The rise of speech recognition devices such as ‘Alexa’, ‘Siri’, ‘OK Google’ and ‘Cortana’ has led to a significant increase in the number of people making voice calls to contact centers from these devices, which means they may not have access to a telephone keypad to enter the required information. Speech recognition systems solve this problem. The PCI Pal cloud system ‘listens’ for the spoken card details and removes the caller’s speech from the call before passing the audio down the line to the contact center (and on to the call recording platform). All safe and secure and, of course, PCI compliant.
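To give a feel for what ‘listening’ for card details can involve once speech has been turned into text, here is a minimal sketch. It is not the PCI Pal implementation. The function names, the small digit vocabulary and the sample sentence are all assumptions made for illustration. It picks the digits out of a transcript, checks them with the Luhn checksum that payment card numbers use, and masks everything except the last four digits.

```python
import re

# Hypothetical vocabulary: spoken English digit words mapped to digits.
DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7",
    "eight": "8", "nine": "9",
}


def spoken_to_digits(transcript: str) -> str:
    """Collect the digits from a transcript, ignoring every other word."""
    digits = []
    for token in re.findall(r"[a-z]+|\d", transcript.lower()):
        if token.isdigit():
            digits.append(token)
        elif token in DIGIT_WORDS:
            digits.append(DIGIT_WORDS[token])
    return "".join(digits)


def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    if not number.isdigit():
        return False
    total = 0
    for index, char in enumerate(reversed(number)):
        value = int(char)
        if index % 2 == 1:  # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def mask(number: str) -> str:
    """Hide all but the last four digits."""
    return "*" * max(len(number) - 4, 0) + number[-4:]


# 4111 1111 1111 1111 is a widely used test number, not a real card.
transcript = "my card number is four " + "one " * 15
card = spoken_to_digits(transcript)
if luhn_valid(card):
    print(mask(card))  # prints "************1111"
```

A real system has far more to deal with, such as phrases like ‘double four’, hesitations and corrections, and it works on the live call audio as well as on text. The sketch only shows the basic idea of capturing the digits and keeping them away from anyone downstream.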