With dual-channel recording now available via a Twilio API, this speech analytics API provider says the time has come to start attaching spoken-data transcripts to big data to create a ‘big voice’ opportunity.
You’ve no doubt heard about “big data,” that short-and-simple phrase that came into being some five or so years ago as a way of describing the massive amounts of structured (bits and bytes) and unstructured (text, for example) data that companies were amassing as the world became increasingly digital. Applying advanced analytical techniques, companies have been turning these big-data stockpiles into treasure troves of insight… and making smarter business decisions, more quickly, as a result.
But what about “big voice” — have you heard about that?
I’m going to venture to say that you probably haven’t, but soon enough, you may.
The big voice idea comes from a startup called VoiceBase, which you may have seen on the Enterprise Connect 2016 expo floor. VoiceBase offers speech recognition and speech analytics APIs aimed at letting enterprises mine voice for actionable insight just as they have been doing with data.
Consider call logs from customer interactions. Historically, information associated with call logs has been along the lines of, “You had a call on this day, at this time, from this number, it lasted this long and, if the call has been recorded, here’s the download button.” That’s all well and good — if you want to know about the phone system. VoiceBase is about telling a company how the business is doing, said Jay Blazensky, VoiceBase co-founder, during a post-Enterprise Connect call. As an example of the sort of useful information it can provide: “Hey, small business, you had 412 calls yesterday — 17 resulted in appointments but 32 were from people who sound like they’re going to leave you.” With this insight in hand, that company could take action to boost the in-person experience of the 17 and to reduce the churn of the 32.
A few factors are coming together to make speech analytics at enterprise scale a reality. For one, the analytics capability is getting more sophisticated by the day: VoiceBase uses a neural network-based engine, for example, which powers the custom vocabulary capability built into its speech recognition API.
Parallel processing makes a huge difference as well. The ability to process huge traffic loads in parallel across cloud-resident servers turns jobs that used to take hours, if not days, into near-real-time activities. And, because cloud resources are always there for the taking, high-priority jobs can run during the day and low-priority jobs during off-hours — and the costs can be balanced out accordingly for affordability across the board, said Walter Bachtiger, VoiceBase founder and CEO, with whom I spoke at Signal, the communications API developers’ conference that Twilio hosted in San Francisco this week.
Telecom APIs, and the programmatic call control they enable, are critical as well, Jay Blazensky noted during a Signal session titled “How To Build Speech Analytics Into Any Platform.” Twilio, largely credited with popularizing the communications API market, has upped the speech analytics game with a dual-channel recording capability now available for its Programmable Voice API.
Historically, call recordings have been mono-channel, which has made distinguishing who said what quite difficult, explained Nisha George, a Twilio associate product manager, during the speech analytics session. So difficult, in fact, that it has led to this statistic George cited: only 5% of customer calls get analyzed today. With dual-channel recordings, which capture each party on its own audio channel, isolating and analyzing the customer’s voice becomes much easier, and Twilio’s mission is to move that percentage of analyzed calls to 100, she added.
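To make the dual-channel idea concrete, here is a minimal sketch in Python of requesting a dual-channel recording when placing an outbound call through Twilio’s Programmable Voice REST API. The `Record` and `RecordingChannels` parameters follow Twilio’s documented Calls resource; the phone numbers, account SID, and callback URLs are placeholders, and exact parameter names should be verified against current Twilio documentation.

```python
# Sketch: build the form-encoded body for an outbound-call request that asks
# Twilio to record the call with one audio channel per party ("dual") rather
# than a mixed mono track. Values here are illustrative placeholders.
import urllib.parse

ACCOUNT_SID = "ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"  # placeholder credential
CALLS_ENDPOINT = (
    f"https://api.twilio.com/2010-04-01/Accounts/{ACCOUNT_SID}/Calls.json"
)

def build_call_request(to, from_, twiml_url):
    """Build the POST body for a recorded, dual-channel outbound call."""
    params = {
        "To": to,
        "From": from_,
        "Url": twiml_url,                # TwiML instructions for the call
        "Record": "true",                # record the call
        "RecordingChannels": "dual",     # separate channels for each party
        "RecordingStatusCallback": "https://example.com/recording-done",
    }
    return urllib.parse.urlencode(params)

body = build_call_request("+15551230001", "+15551230002",
                          "https://example.com/twiml")
print(body)
```

Posting that body (with valid credentials) to the Calls endpoint would start the call; when the recording is ready, Twilio notifies the `RecordingStatusCallback` URL.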
In a Twilio-VoiceBase scenario, Twilio records calls at scale and sends those recordings, via a RESTful API, to VoiceBase for speech recognition and analytics processing. Using the results, enterprises can then enrich their business intelligence initiatives by attaching the spoken data to their big data, Blazensky said. His conclusion: “It’s open season on call recordings.”
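A rough sketch of that handoff, assuming Python: when Twilio signals that a dual-channel recording is ready, the recording URL is mapped into a media-submission request for a VoiceBase-style transcription service. The configuration fields and speaker labels below are illustrative assumptions, not VoiceBase’s documented API; only the Twilio callback fields (`RecordingUrl`) follow Twilio’s conventions.

```python
# Sketch of the Twilio-to-VoiceBase handoff described above. Field names in
# `configuration` are hypothetical stand-ins for a speech analytics API.
import json

def recording_to_media_request(recording_callback):
    """Map a Twilio recording-status callback to a transcription request body."""
    # Twilio serves a finished recording at RecordingUrl plus a format suffix.
    audio_url = recording_callback["RecordingUrl"] + ".wav"
    configuration = {
        # With a dual-channel recording, each channel can be labeled a speaker,
        # so "who said what" is known without diarization guesswork.
        "channels": [{"speakerName": "Agent"}, {"speakerName": "Caller"}],
        "language": "en-US",
    }
    return {"mediaUrl": audio_url, "configuration": json.dumps(configuration)}

callback = {
    "RecordingUrl": (
        "https://api.twilio.com/2010-04-01/Accounts/AC123/Recordings/RE123"
    )
}
req = recording_to_media_request(callback)
print(req["mediaUrl"])
```

The resulting body would then be POSTed to the analytics provider’s media endpoint; the transcript and analytics that come back are what get joined to the rest of the company’s big data.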