
Automatically Group and Classify Electronically Stored Information (ESI): Can Machines Really Think?


Since the onset of computer technology, arguably as early as the 1950s, when Arthur Samuel developed a checkers-playing program for IBM, there has been significant debate about the benefits, logistics and even the morality of creating computers that can think for themselves. “Machine learning,” according to Samuel, “is a field of study that gives computers the ability to learn without being explicitly programmed.” (1) Today, we understand machine learning to be either supervised (the algorithm learns from examples that have already been labeled with the desired categories) or unsupervised (the algorithm finds groupings in data that has no predefined labels or attribute structure). These definitions are central to our current discussion of machine learning and information governance (IG).

The CAAT “machine learning” engine, integrated with Altitude IG, will enable a software platform that can automatically group and classify an organization’s electronically stored information (ESI) at a previously unheard-of level of accuracy. Created by the Content Analyst Company, the technology’s key differentiator is that it does not rely on a manually created list of keywords, terms or phrases; instead, it can learn, infer and apply knowledge as it goes along. For example, applying the IG process to the defensible deletion of emails might typically involve creating a long list of associated search words, like “unsubscribe,” in order to ferret out the relevant documents. With machine learning, as applied to the Altitude IG platform, the software takes the example of the word “unsubscribe” and then applies a “find more like this” approach, determining for itself whether similar phrases like “opt out” and “manage your subscriptions” should also fall within the scope of the defensible deletion process. The engine recognizes that each of these phrases may be relevant to your search, and instead of requiring a complex series of rules to filter and sort, CAAT learns to sort content semantically into specific categories. As more documents are added, it learns new ways to filter them, eliminating the need for an IT professional to constantly maintain complex filtering rules.
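To make the “find more like this” idea concrete, here is a minimal sketch in Python. It is not CAAT’s algorithm or any Altitude IG API; it assumes a much simpler stand-in technique, cosine similarity over word counts, and the function names, sample emails and the 0.3 threshold are all hypothetical. The point it illustrates is that an example email containing “unsubscribe” can surface emails that never use that word, because they share the surrounding vocabulary.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_more_like_this(example, documents, threshold=0.3):
    """Return (score, document) pairs scoring at or above threshold, best first.

    The threshold is an illustrative assumption, not a recommended value.
    """
    example_vec = Counter(tokenize(example))
    scored = [
        (cosine_similarity(example_vec, Counter(tokenize(doc))), doc)
        for doc in documents
    ]
    matches = [(score, doc) for score, doc in scored if score >= threshold]
    return sorted(matches, key=lambda pair: pair[0], reverse=True)


# Hypothetical sample data: one example email and three candidates.
example = "Click here to unsubscribe from this newsletter mailing list"
emails = [
    "To opt out of this newsletter mailing list click here",
    "Manage your subscriptions for this newsletter mailing list here",
    "Quarterly budget review meeting moved to Thursday afternoon",
]

for score, doc in find_more_like_this(example, emails):
    print(f"{score:.2f}  {doc}")
```

In this toy data, the two subscription-related emails are returned even though neither contains the word “unsubscribe,” while the meeting notice falls below the threshold. A production engine would use a far richer semantic model than raw word overlap, but the workflow is the same: supply an example, score everything else against it, and keep what is similar enough.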


What this means for an organization’s bottom line is pretty simple. The return on the investment in CAAT/Sherpa Software is quicker sorting of documents, less employee time committed to IG tasks (without losing any efficiency) and smoother integration of various software tools, saving both time and money in the long run. That just makes sense, whether man or machine thought of it!

Please share this information with your colleagues, or send us your questions, comments and feedback. You can find more information on how to automatically group and classify ESI, along with information governance and eDiscovery resources, on our web site. We look forward to answering any information governance or eDiscovery questions you may have; please contact us at 1 (800) 263-8733.


