In past blogs I have discussed the possibilities of machine learning for information management, namely predictive information governance (PIG) and auto-categorization to automate the management of electronically stored information (ESI). One of the challenges the information management industry continues to face is how to extend this machine learning capability to audio and video content.
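As a rough illustration of what auto-categorization means in practice, here is a minimal sketch assuming a tiny set of already-labeled sample documents and the scikit-learn library; the labels and training text are hypothetical, and real PIG systems are far more sophisticated than this.

```python
# Minimal auto-categorization sketch (hypothetical labels and training data).
# A real predictive information governance system would train on far more
# content and handle many more record classes.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny, made-up training set: document text -> record category
train_docs = [
    "Q3 invoice for purchase order 4451",
    "Signed master services agreement with vendor",
    "Lunch plans for Friday",
    "Updated employee benefits enrollment form",
]
train_labels = ["finance-record", "contract", "non-record", "hr-record"]

model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(train_docs, train_labels)

# Classify new, unseen content
print(model.predict(["Amendment to the services agreement, fully executed"]))
```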
Many companies faced with a need to archive data (usually email) due to regulatory mandates, eDiscovery responsibilities, or business requirements look for solutions based on capabilities, cost, vendor reputation, security, and regulatory compliance.
In the past, companies in need of archiving solutions purchased one of the many on-premises or cloud-based solutions that met their needs. However, many of these archiving solutions actually converted the data to enable more efficient storage, indexing, and search. The problem with data conversion is that the data can be corrupted, or metadata changed or lost, nullifying its "golden copy" or copy-of-record status. In most cases, this is not really a problem… unless you are anticipating, or are in fact involved in, litigation.
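One way to guard against silent conversion or corruption problems, sketched below on the assumption that the original native file is kept alongside the archived copy, is to record a cryptographic hash of each item at ingestion and re-verify it whenever copy-of-record status matters; the file paths here are purely illustrative.

```python
# Sketch: verify that an archived copy still matches the original "golden copy".
# File paths are illustrative; a real archive would also track metadata separately.
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Return the SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

original = Path("original/message_0001.eml")
archived = Path("archive/message_0001.eml")

if sha256_of(original) == sha256_of(archived):
    print("Archived copy is bit-for-bit identical to the original.")
else:
    print("WARNING: archived copy differs; copy-of-record status is in doubt.")
```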
SharePoint adoption continues to grow year over year. It was originally envisioned as an entry-level document management system but has since grown into a content management system, an intranet portal for enterprise information and applications, a groupware platform, a social platform, a personal cloud, and a platform for custom web applications. Because of these expanding use cases, many have found that the standard SharePoint storage allowance is not enough.
Social media platforms have proliferated as a direct method for companies to connect with their customers. However, in the last several years, businesses have been forced to collect and make available social media content for both eDiscovery and regulatory compliance.
For centuries, records/information managers have had to rely on end-users to take the first, second, and third steps in information governance, which are:
- Decide whether a document should be retained
- Decide how long it should be kept (the retention period)
- Actually move the document somewhere for safekeeping and management
Over the last 15 to 20 years, many companies have marketed and sold "records management systems" that would supposedly make information management much easier. However, these systems didn't address the three points above: they still relied on end users to initiate the process and to make decisions about the importance of the content.
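To make those three steps concrete, here is a minimal, rule-based sketch of what automating them might look like; the keyword rules, retention periods, and folder layout are all hypothetical, and real systems (including the machine-learning approaches discussed above) are considerably more nuanced.

```python
# Sketch: automate the three end-user steps with simple keyword rules.
# Keywords, retention periods, and folder layout are hypothetical examples.
import shutil
from pathlib import Path

# Rule table: keyword -> (record category, retention period in years)
RULES = {
    "invoice": ("finance-record", 7),
    "contract": ("legal-record", 10),
    "resume": ("hr-record", 3),
}

ARCHIVE_ROOT = Path("archive")

def process(document: Path) -> None:
    text = document.read_text(errors="ignore").lower()
    for keyword, (category, years) in RULES.items():
        if keyword in text:
            # Step 1: decide the document should be retained.
            # Step 2: assign a retention period.
            destination = ARCHIVE_ROOT / category / f"retain_{years}_years"
            destination.mkdir(parents=True, exist_ok=True)
            # Step 3: move the document somewhere it can be managed.
            shutil.move(str(document), destination / document.name)
            return
    # No rule matched: leave the document in place for human review.
```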
Our founders knew they had something. I guess they should have known it when one of the largest banks in North America became one of their first clients. They brought a simple tool to market: a software solution that moved data, moved it fast, and moved it completely, to the cloud. At the time, we liked to describe the company as a moving company, and everyone was (and still is) always moving. What made it better was that everyone's lawyer and every new law required our customers to never throw any of those old boxes of stuff away. By law, every relatively insignificant email, attachment, scrap of metadata, etc., from every deal and every past and current employee had to be boxed up and kept in storage in perpetuity, or until someone somewhere had the guts to actually say "delete it."
Finally, the financial industry is no longer forced into purchasing and supporting overpriced on-premises WORM storage or high-priced, specialty cloud archives that lock it into a platform and impose ridiculously high penalties when it wants to move its data out. Granted, many of the on-premises WORM storage systems, such as EMC Centera, have a proven history of meeting SEC Rule 17a-4 requirements; however, the financial industry is moving to the cloud for lower cost and better security.
Updated: Corporate eDiscovery data storage practices have progressed (a bit) over the last 10 years. More than a few times over the years, I've received emails from my employer's corporate legal department informing me that I would need to search my email, along with my local and online file repositories, for any potentially relevant content and set it aside until it was asked for. Come to think of it, I never received any follow-up emails releasing me from those instructions…
Many companies that store content in cloud-based archives are stunned by their cloud vendor's one-way attitude: it's free to move huge amounts of data into the cloud-based archive, but it's another story when you want to move it out again.
Whether you need to export a large data set in response to an eDiscovery request or, heaven forbid, you've grown dissatisfied with the cloud vendor and want to move your data somewhere else, the cost to extract your data skyrockets, in many cases to ridiculous levels.
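To put rough numbers on it, here is a back-of-the-envelope calculation assuming a purely hypothetical egress rate of $0.09 per GB; actual vendor pricing varies widely and often adds per-item retrieval or rehydration fees on top.

```python
# Back-of-the-envelope egress cost; the rate below is a hypothetical example only.
EGRESS_RATE_PER_GB = 0.09     # assumed $/GB; check your vendor's actual pricing
export_size_tb = 50           # a single large eDiscovery export

cost = export_size_tb * 1024 * EGRESS_RATE_PER_GB
print(f"Exporting {export_size_tb} TB at ${EGRESS_RATE_PER_GB}/GB is about ${cost:,.0f}")
# Exporting 50 TB at $0.09/GB is about $4,608
```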
President Trump signed an Executive Order (EO) on 5/11 designed to strengthen the cybersecurity of federal networks by continuing a massive shift in how the US Government handles its data, with the aim of creating a single federal IT enterprise. This effort will be quarterbacked by the Department of Homeland Security (DHS) and the Office of Management & Budget (OMB). Homeland Security Advisor Tom Bossert explained that there will be a preference in federal procurement for shared IT services among the 190 federal agencies, and that the goal of this move to the cloud is to avoid defending antiquated and fragmented systems.
Can your defense team reduce litigation costs and lower risk by using the cloud to dramatically reduce the number of data transfers?
The cloud has become a ubiquitous tool for most companies (and industries) over the last several years. However, when dealing with legal situations and eDiscovery, companies are still in the habit of shipping hard disks, optical disks, or, if they're lucky, electronically transferring terabytes of data to their external law firms in response to eDiscovery demands. Those same law firms turn around and follow the same data shipping/transfer processes when turning over client eDiscovery data to opposing counsel.
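One alternative, sketched below with the azure-storage-blob Python SDK, is to leave the production set in cloud storage and hand the receiving party a read-only, time-limited link instead of shipping media; the storage account, key, and container names are hypothetical.

```python
# Sketch: share an eDiscovery production set via a time-limited, read-only SAS URL
# instead of shipping disks. Account, key, and container names are hypothetical.
from datetime import datetime, timedelta
from azure.storage.blob import generate_container_sas, ContainerSasPermissions

ACCOUNT_NAME = "ediscoverystore"          # hypothetical storage account
ACCOUNT_KEY = "<storage-account-key>"     # keep secret; never share the key itself
CONTAINER = "matter-1234-production"      # hypothetical container with the production set

sas_token = generate_container_sas(
    account_name=ACCOUNT_NAME,
    container_name=CONTAINER,
    account_key=ACCOUNT_KEY,
    permission=ContainerSasPermissions(read=True, list=True),
    expiry=datetime.utcnow() + timedelta(days=30),   # access lapses automatically
)

share_url = f"https://{ACCOUNT_NAME}.blob.core.windows.net/{CONTAINER}?{sas_token}"
print(share_url)  # send this URL to opposing counsel; no physical media changes hands
```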
With the continuing explosion of data piling up across organizations around the world, many are turning to the cloud as an economical way to keep pace with the vast amounts of data they must store, manage, and share. Eventually, much of this data is archived for regulatory, legal, and business reasons, while all of it is backed up as part of disaster recovery practices.
I have written about application retirement a couple of times in my blog. The concept is simple: organizations retire (shut down) aging business applications all the time for several reasons, including cost reduction, application consolidation, risk reduction, and new regulatory and eDiscovery requirements. The big question continues to be what should be done with the associated application data. Before I address that question, let's look at a large, well-known business application that many organizations are now targeting for retirement: Documentum.
Back in January of this year, we published a blog titled Quarantine your Stale Data about the need to quarantine your stale (or grey) data. In it, we talked about a conversation we had with Alan Dayley, Research Director at Gartner, about the problems his clients were having with managing stale data: those files that, for whatever reason, become less valuable to the end-user over time.
The journaling function in on-premises Microsoft Exchange email systems was originally developed back in the late 1990s for financial services organizations to meet SEC requirements. The main requirement consisted of capturing broker/dealer communications (emails) immediately and ensuring those emails could not be altered or deleted before they were stored on immutable (WORM) storage, per SEC Rules 17a-3 and 17a-4. The SEC wanted to ensure that broker/dealer communications were available for review in an unaltered state if complaints were later filed against the financial services organization or individual broker/dealers. Other companies adopted journaling as well, mostly when under litigation hold, to ensure target custodian email was captured and held, thereby avoiding spoliation charges. However, the financial services industry was the only industry truly required to journal by government regulation.
Archive360 Webinar on 3/14/2017: See how the power of Azure manages and protects your digital assets. Topics include regulatory data storage compliance, unlocking enterprise data analysis and search, and a cost-effective SaaS archive solution built on the Microsoft Cloud.
Many companies struggle with the long-term storage of low-touch data that, because of its nature, does not really fit the high-priced, high-performance enterprise storage strategy many companies rely on.
Every day, corporate employees beg for more enterprise share-drive capacity to store work documents, backups, internet research, and more, all while demanding that their aging, low-touch files not be deleted from those same corporate file shares.
The cloud is an obvious candidate for storing vast amounts of email, files, and other forms of unstructured data for compliance. Organizations in highly regulated industries such as financial services, healthcare, government, and energy are very familiar with the regulatory rules that require secure retention of electronically stored information (ESI). However, before you proceed, it's a good idea to carefully review some of the basic requirements for compliance archiving in the cloud.
This week I spoke with Alan Dayley, Gartner Research Director, and the topic of conversation was the management of "stale" data. "Every customer I speak with has a potential problem with managing stale data," Dayley said. Stale data usually consists of end-user files that, for various reasons, become less valuable to the end user, for example at the end of a project or simply due to age. However, many of these files can remain, or become, more valuable to the organization because of the intellectual property or other sensitive data they can contain (see Figure 1). Stale data is usually found on user desktops, file shares, and just about anywhere else files are stored.
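As a rough illustration of how you might begin to find stale data, here is a minimal sketch that walks a file share and flags anything not modified within a given number of days; the share path and threshold are hypothetical, and a real assessment would also weigh content, ownership, and access patterns.

```python
# Sketch: flag files on a share that have not been modified in N days.
# The share path and the 730-day threshold are illustrative examples.
import time
from pathlib import Path

STALE_AFTER_DAYS = 730                  # roughly 2 years without modification
SHARE = Path("/mnt/corp-fileshare")     # hypothetical file share mount

cutoff = time.time() - STALE_AFTER_DAYS * 86400

for path in SHARE.rglob("*"):
    if path.is_file() and path.stat().st_mtime < cutoff:
        age_days = int((time.time() - path.stat().st_mtime) / 86400)
        print(f"{path}  (last modified {age_days} days ago)")
```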