The eDiscovery process can be a complex and expensive undertaking. Ever-increasing data stores, new applications and data formats, country regulations limiting data movement, and, increasingly, documents authored in foreign languages continue to drive up cost, response time, and risk.
One eDiscovery task that has been an ongoing pain for companies is dealing with foreign-language documents during collection and review.
Dealing with foreign language documents
So what happens when a corporate legal department kicks off an eDiscovery collection that it suspects may involve non-English documents? Attorneys have told me that the first step is to consider which custodians and countries could be involved in the suit. Are any of the custodians located in other countries? Does the lawsuit involve business activities in a foreign country? Could potentially relevant data, such as research reports, patent filings, audio or video files, or contracts, be in a foreign language? Once the legal department has identified the custodians and countries likely to be involved, it can build keyword search lists that include the equivalent foreign-language keywords. However, that does not address the main problem…
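As a rough illustration of that last step, here is a minimal Python sketch that expands an English keyword list with foreign-language equivalents. The glossary, the language codes, and the function name expand_keywords are all hypothetical and made up for this example; in practice the equivalent terms would come from a translator or legal glossary reviewed by counsel, not from a hard-coded table.

```python
from typing import Dict, List

# Hypothetical glossary for illustration only: English term -> {language code: equivalent}.
GLOSSARY: Dict[str, Dict[str, str]] = {
    "contract": {"de": "Vertrag", "fr": "contrat"},
    "patent": {"de": "Patent", "fr": "brevet"},
}


def expand_keywords(
    keywords: List[str],
    languages: List[str],
    glossary: Dict[str, Dict[str, str]],
) -> List[str]:
    """Return the original keywords plus any known foreign-language equivalents.

    Duplicates are dropped case-insensitively, on the assumption that the
    search tool itself matches terms without regard to case.
    """
    expanded: List[str] = []
    seen = set()
    for keyword in keywords:
        equivalents = glossary.get(keyword.lower(), {})
        candidates = [keyword] + [equivalents.get(lang) for lang in languages]
        for term in candidates:
            # Skip languages with no known equivalent and terms already listed.
            if term and term.lower() not in seen:
                seen.add(term.lower())
                expanded.append(term)
    return expanded


search_terms = expand_keywords(["contract", "patent"], ["de", "fr"], GLOSSARY)
```

Keywords with no glossary entry pass through unchanged, so the list never gets shorter; it only flags where a foreign-language equivalent is still missing if you compare the input and output.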
The same old way is expensive and time consuming
Translating foreign-language documents so they can be reviewed for responsiveness is a challenge many companies end up facing. In the past, companies would simply include these documents in the eDiscovery data set they sent to their law firms, expecting the firms to take care of translation. The client would be charged for the extra work, of course, but this pushed the problem to the law firm, which is usually prepared for the process. Law firms hire outside consultants or companies specializing in translation. Depending on the number of documents to be translated and the consultant’s or company’s workload, the translations can take much longer than planned, driving up the total cost and lengthening response time.
Relying on the internet…?
To help cut eDiscovery translation costs, law firms started using web-based translation tools such as the free Google Translate service. With Google Translate, you upload a document to the Google Translate website and choose the language you want it translated into. The process is relatively cumbersome: you need to upload one document at a time, and file size is constrained. That is not really effective when dealing with large numbers of documents.
The main legal issue with Google Translate and other free web-based translation services is one of confidentiality. For example, Google’s Terms of Service state:
“When you upload, submit, store, send or receive content to or through our Services, you give Google (and those we work with) a worldwide license to use, host, store, reproduce, modify, create derivative works (such as those resulting from translations, adaptations or other changes we make so that your content works better with our Services), communicate, publish, publicly perform, publicly display and distribute such content.”
The question is this: can information that has been uploaded to the Google Translate website still be considered confidential or privileged if Google can freely use and distribute the content in any way it chooses? This is not to say that Google Translate is a bad service. On the contrary, this free service is a perfect fit for many uses. Using it to translate legal documents, however, is questionable.
The cloud is the answer
The Google Translate site is a free, “open to all” service that forces you to upload documents one at a time and has potential issues around confidentiality. The better and lower-cost solution for legal translations is to do them automatically from within your own storage and data management solution.
Microsoft’s Azure is a cloud computing platform that enables vendors to build native Azure applications providing a wide range of powerful services from within the Azure cloud.
Microsoft recently announced the availability of Azure Cognitive and Media Services. These new services provide automatic indexing and transcription of audio and video content, as well as automatic real-time translation of documents into (currently) 53 languages. These capabilities, built around Microsoft’s machine learning technology, enable vendors to build storage and information governance applications that include these services.
Archive2Azure and Azure Cognitive and Media Services
Archive360’s Archive2Azure now includes Azure Cognitive and Media Services. Archive2Azure is the industry’s first intelligent managed cloud archive specifically designed for long-term archiving of compliance and unstructured data on the Microsoft Azure platform. A major differentiator from other cloud archive vendors is that archived data is stored in its native format in your company’s Azure subscription. Azure and Archive2Azure offer complete access controls and encryption that are controlled by you, rather than by a proprietary cloud archive controlled by others.
Archive2Azure provides an intelligent cloud-based archive in which eDiscovery data sets can be uploaded, managed, and secured, with the ability to translate any number of documents in real time while maintaining all original metadata. Its case management capability covers access controls, tagging, review, translation, audio and video file transcription with full indexing, and export of data when needed. And again, your data is always stored in its native format in your Azure subscription with your encryption protocols.
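To make the idea of translating a whole data set while keeping the original metadata concrete, here is a minimal Python sketch. It is not Archive2Azure’s implementation or any Azure API. The Document and TranslatedDocument types, the translate_batch function, and the fake_translate stand-in are all hypothetical; a real deployment would plug in a call to whatever translation service runs inside your own subscription.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    language: str
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass(frozen=True)
class TranslatedDocument:
    original: Document        # untouched source document, including its metadata
    translated_text: str
    target_language: str


# A translator takes (text, source_language, target_language) and returns text.
Translator = Callable[[str, str, str], str]


def translate_batch(
    docs: Iterable[Document],
    translate: Translator,
    target_language: str = "en",
) -> List[TranslatedDocument]:
    """Translate every document in one pass, keeping each original alongside its translation."""
    results: List[TranslatedDocument] = []
    for doc in docs:
        if doc.language == target_language:
            translated = doc.text  # already in the review language, nothing to do
        else:
            translated = translate(doc.text, doc.language, target_language)
        results.append(TranslatedDocument(doc, translated, target_language))
    return results


def fake_translate(text: str, source: str, target: str) -> str:
    """Placeholder used only for this example; it tags the text instead of translating it."""
    return f"[{source}->{target}] {text}"


docs = [
    Document("1", "Der Vertrag ist unterschrieben.", "de", {"custodian": "A. Muster"}),
    Document("2", "The contract is signed.", "en", {"custodian": "J. Doe"}),
]
translated = translate_batch(docs, fake_translate)
```

Storing the translation next to the original, instead of overwriting it, keeps the native document and its metadata available for production and export.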