In past blog posts I have discussed the possibilities of combining machine learning and information management, for example predictive information governance (PIG) and auto-categorization, to automate the management of electronically stored information (ESI). One of the challenges the information management industry continues to face is how to extend this machine learning capability to audio and video content.
Audio and video content makes up a much larger percentage of corporate data than in years past. Voicemails, sales and marketing videos, WebEx presentations, conference calls, surveillance video, website content, podcasts, and corporate audio/video social media content are just a few examples of information that is almost impossible to index and search without specialized and expensive applications and third-party services.
In the past, untold hours were spent watching and listening to audio and video files to determine their content, category, and keywords, and how long they should be retained. With the advent of Azure Cognitive Services, Azure-based information management of non-text content is now possible at a much lower price than the current “roll your own” independent audio/video applications.
With the addition of Azure Cognitive Services to a native Azure information management platform, audio and video data can be managed just like text-based files with full indexing, search, review, tagging, policy management, and analytics.
Cognitive Services for audio and video data also provides automatic transcription and translation, speech authentication, sentiment analysis with text analytics, video face tracking, image analysis, and the ability to OCR words in images, to name just a few.
Several regulatory retention requirements cover audio and video content, which must also be kept and managed. Under SEC Rule 17a-4 for broker-dealers, voicemail recordings from customers wishing to buy or sell securities are considered communications and must be retained and quickly producible.
Another well-known regulation is the EU-based Markets in Financial Instruments Directive, or MiFID II. MiFID II specifically requires the recording of communications with customers, including telephone conversations. In fact, Article 16(7) states:
“Records shall include the recording of telephone conversations or electronic communications relating to, at least, transactions concluded when dealing on own account and the provision of client order services that relate to the reception, transmission and execution of client orders.”
Recital 57 of the MiFID II Directive states:
“Such records should ensure that there is evidence to detect any behavior that may have relevance in terms of market abuse, including when firms deal on own account.”
These requirements highlight an obvious need: audio (and video) files must be captured, stored in the appropriate repository (WORM), indexed, and tagged with the appropriate metadata and keywords. And because regulatory authorities expect suspect content to be produced quickly, the ability to search that content and retrieve it without delay is a necessity.
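To make the indexing and tagging step concrete, here is a minimal Python sketch. It assumes each recording has already been transcribed by a speech-to-text service (not shown), and it uses a simple in-memory inverted index rather than any particular Azure search product. The `Recording` schema, the tag names, and the sample data are all hypothetical.

```python
import re
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Recording:
    """A stored audio/video file plus its transcript and metadata (hypothetical schema)."""
    recording_id: str
    transcript: str
    tags: dict = field(default_factory=dict)


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_index(recordings):
    """Map every transcript word to the ids of the recordings that contain it."""
    index = defaultdict(set)
    for rec in recordings:
        for word in tokenize(rec.transcript):
            index[word].add(rec.recording_id)
    return index


def search(index, query):
    """Return ids of recordings whose transcript contains every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Illustrative data only; the ids, transcripts, and tags are made up.
recordings = [
    Recording("vm-001", "Please buy five hundred shares of Contoso for my account.",
              {"type": "voicemail", "retention_years": 6}),
    Recording("vm-002", "Calling to confirm our lunch meeting on Friday.",
              {"type": "voicemail", "retention_years": 1}),
]
index = build_index(recordings)
hits = search(index, "buy shares")
```

With an index like this, a reviewer's keyword query becomes a set lookup instead of hours of listening; the `tags` dictionary is where retention periods and categories would travel with each file.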
eDiscovery and litigation support
Audio and video analysis is an even greater requirement in the eDiscovery process. Between 70% and 80% of the cost of eDiscovery is consumed during the review phase, and that share can be even higher if there are large amounts of audio and video content to review. Because audio/video files are rarely described or categorized before storage, they must be reviewed during eDiscovery if there is even a slight chance they could be responsive to the case. This means thousands of additional review hours could be consumed, driving up the cost of eDiscovery even more.
Customer satisfaction analysis
Azure Cognitive Services will enable organizations to be more responsive to customers. By applying audio and textual sentiment analysis to call center recordings and other customer communications, they can quickly find and analyze the communications that come from dissatisfied customers.
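As a rough illustration of that triage step, the Python sketch below assumes each call has already been transcribed and scored by a sentiment analysis service. The `negative_score` field, the 0.7 threshold, and the sample calls are assumptions made for the example, not values taken from any Azure API.

```python
from dataclasses import dataclass


@dataclass
class CallRecord:
    """A call center recording with a precomputed sentiment score (hypothetical schema)."""
    call_id: str
    transcript: str
    negative_score: float  # 0.0 to 1.0, assumed to come from a sentiment service


def dissatisfied_calls(calls, threshold=0.7):
    """Return calls scoring at or above the threshold, most negative first."""
    flagged = [c for c in calls if c.negative_score >= threshold]
    return sorted(flagged, key=lambda c: c.negative_score, reverse=True)


# Illustrative data only; the scores are invented for the example.
calls = [
    CallRecord("call-101", "Thanks, that fixed my problem.", 0.05),
    CallRecord("call-102", "I have been on hold three times and nobody helps.", 0.91),
    CallRecord("call-103", "The invoice is wrong again and I want a refund.", 0.78),
]
review_queue = [c.call_id for c in dissatisfied_calls(calls)]
```

The threshold is a tuning choice: a lower value sends more calls to human review, while a higher one surfaces only the angriest customers.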
In the not too distant future
I am blue-skying here, but what if you could, instead of flat out denying access, redact or anonymize specific content in a document based on the employee’s Azure AD profile? Or alert financial services management when a broker promises a client specific returns on a recorded phone call? Or, in an example most CSOs will recognize, block an email from being sent because it contains corporate IP?
The advantage of next-generation machine learning capability built into a public cloud repository like Azure is huge. You get the best of both worlds: powerful machine learning capability at a price well below that of individual on-premises solutions.
Azure Cognitive Services and Archive2Azure
Archive360’s Archive2Azure is the only native Azure Information Management and Compliance solution available today. Archive2Azure is perfectly positioned to take advantage of these new and developing Azure Cognitive Services. Contact Archive360 to find out more about the possibilities.