A couple of weeks ago I sat in on a presentation about the legal profession and information governance hosted by two attorneys. The presentation was very good; however, a couple of things set my teeth on edge. The presenters opened by pointing out that content is everywhere, is always changing, and is under increased scrutiny, and that it can therefore be a major headache for attorneys during discovery and for compliance personnel responding to information requests from government agencies. Hard to disagree with that.
The underlying problem is that much enterprise information (which is 80% unstructured data) is unmanaged and spread across numerous enterprise repositories or, worse, stored on hundreds or thousands of employee workstations.
Manage the 30%, ignore the rest
The attorneys offered that to get control of information, a company must decide what information is important and what is not. The implication was that only important information, such as records, should be governed. With this point, I couldn’t disagree more. What should be done with the other (approximately 70%) of data deemed unimportant to the organization? They never addressed this topic.
In reality, the vast majority of companies simply ignore this data and leave its management to individual employees. One obvious and expensive problem with this strategy is that it greatly increases the risk and cost of discovery, not to mention the annual cost of additional storage and overhead.
The presentation touched on the use of cloud storage for better security, easier access, and a single repository to search when responding to eDiscovery or regulatory requests. However, one of the presenters stated that putting everything in the cloud doesn’t make sense and should be discouraged. This statement baffles me…
First, what do you do with the content not put into the cloud? Obviously the implication here is to keep it on enterprise resources, with all of the associated costs. As an example, the fully loaded cost of enterprise storage (depending on which analyst you talk to) can run between 15 and 30 cents per GB per month. The key here is the term “fully loaded.” This includes all of the associated costs of backup, replication, floor space, IT personnel, etc.
Information management and human nature
An apparent answer would be to employ a defensible deletion strategy to dispose of all information not deemed important. However, this is not realistic for most companies, which have fostered a culture of relying on employees to delete their own unimportant content. In fact, this expectation overlooks human nature: employees will keep almost everything “just in case” they ever need it again.
Cloud storage (including security and management), by contrast, costs anywhere from less than 1 cent to a couple of cents per GB per month, compared with 15 to 30 cents for fully loaded enterprise storage. So my question is, why wouldn’t you put everything in the cloud?
Out of hundreds of TB of data, can you determine importance?
Another comment during the presentation was to keep (manage) only the data used in the running of the business and to “centralize where that data (important data) is kept,” which makes sense, but again, what about the rest? At this point it was obvious the discussion had slipped back into “managing records” only. The point is, managing only records isn’t a long-term solution when looking to establish a full-fledged information governance program.
Managing all information within a company makes more sense, if the company can put the policies, procedures, and technology in place to make it cost effective. As an example, the estimated cost to store and maintain 100 TB of data on premises is $15,000 per month, while the same 100 TB stored in Microsoft Azure with geo-redundant storage (GRS) is $3,440 per month, a 77% reduction in cost. In addition to the obvious storage savings, you also receive much higher levels of security and powerful search across all data, and with an additional Azure application like Archive2Azure, information governance, eDiscovery support, and data analytics are also included.
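For readers who want to check the arithmetic, here is a rough sketch of the comparison. It assumes decimal terabytes (1 TB = 1,000 GB) and the low end of the 15 to 30 cents per GB per month range for on-premises storage, since those are the assumptions under which 100 TB works out to $15,000. The Azure figure is simply the $3,440 quoted above, not a live price, and the function names are my own.

```python
# Back-of-the-envelope storage cost comparison using the figures quoted in this post.
# Assumptions: decimal units (1 TB = 1,000 GB) and the low-end on-premises rate.

GB_PER_TB = 1_000


def monthly_cost(terabytes: float, dollars_per_gb_month: float) -> float:
    """Monthly cost of storing `terabytes` at a flat per-GB monthly rate."""
    return terabytes * GB_PER_TB * dollars_per_gb_month


def percent_reduction(before: float, after: float) -> float:
    """Savings expressed as a percentage of the original cost."""
    return (before - after) / before * 100


data_tb = 100
on_prem = monthly_cost(data_tb, 0.15)  # low end of the fully loaded range
azure_grs = 3_440.0                    # monthly figure quoted above, not computed

reduction = percent_reduction(on_prem, azure_grs)
implied_rate = azure_grs / (data_tb * GB_PER_TB)  # dollars per GB per month

print(f"On premises: ${on_prem:,.0f} per month")
print(f"Azure GRS:   ${azure_grs:,.0f} per month")
print(f"Reduction:   {reduction:.0f}%")
print(f"Implied cloud rate: {implied_rate * 100:.2f} cents per GB per month")
```

Swap in 0.30 for the on-premises rate to see the high end of the analyst range; the gap only gets wider.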
Of course, this strategy is a major culture change for most companies. For the last 40-plus years, information has been divided into two types: company records and employee data. To now say (and act on) the idea that all data is the company’s is a huge leap for many. The real question is, how do you centralize all data while making it (mostly) transparent to employees?
I have consulted with companies that have put a centralization capability in place. One strategy was to limit where employee data can be stored. A bank I worked with forced all employee-generated files to be stored only on a file share while at the same time disabling all USB ports on each workstation. The other strategy was to sync all data saved by employees on their workstations to a centralized repository, allowing the data to remain on the employee workstation while ensuring copies are captured and managed centrally.
The second strategy is far less of a culture shock for employees while providing all of the advantages of centralized data to the company. Given the cloud cost savings described above, saving all data to the cloud and managing it with retention/disposition policies, machine learning for categorization, and data analytics seems like a strategy that will enable information governance.