When looking at best practices for successful data protection, the first or second step is almost always discovery and identification. Data-loss prevention (DLP) tools can give you some sense of what you have and can monitor the flow of information, but they cannot provide identity to the degree of certainty that is required for proper data governance. In contrast, data classification provides the permanent and explicit identification labels that DLP systems need to correctly process the data.
Let’s take a look at five reasons why organizations that want to get the most from their DLP implementation choose to roll out classification first.
Top 5 Reasons to Classify First
- Data Security Is a Business Problem That Technology Alone Cannot Solve
A widely held belief says that data security can be solved by implementing a new piece of technology. The truth, however, is not that simple; true security is an ongoing process that involves everyone in an organization. Many DLP implementations hit their first snag with the initial setup. Often, the IT department receives a list of criteria that define sensitive information and security policies for dealing with it. The data and business-process owners are not involved in enforcement.
IT staff program the search algorithms that catch data breaches. To ensure that nothing is leaked, these algorithms are set to be stringent at first, meaning they catch many potential breaches. But the tighter the security, the more “false positives” occur and the more calls workers place to the IT department asking for data to be released. This situation stops business workflow and frustrates users.
Users must be empowered to take responsibility for the security of data they use and create. User-driven classification provides much greater data-identity accuracy and will thus help ensure the DLP system handles the data correctly. Greater accuracy will also release the IT team from excessive manual monitoring. Moreover, user classification has the added benefit of fostering a culture of security in the user community.
- Classification Fosters a Security Culture
Security systems are bad at preventing accidental disclosure by careless users with legitimate access. Although a DLP system’s failure to catch a particular breach can be classified as an “error,” the user who accessed and distributed the information is the real problem. Asking users to classify each file helps to address the source of the problem: users who lack awareness of the proper security procedures.
Common data-breach accidents include sending sensitive data in an email or attachment, accessing data from unsecured public sources, and inappropriately sharing information with personal email accounts and devices. Although a DLP system is vital to providing a second look when these mistakes occur, a lack of classification may cause some breaches to slip by. Even if a DLP system does catch the breach, there is usually no informative response to help users remediate or learn from their error.
A classification tool, however, consistently reminds users of data-security policies each time they save a document or send an email. Because users are required to identify the sensitivity of the information, data security remains constantly top of mind.
And by checking the selected classification against the email content and attachments, classification tools can immediately identify possible breaches before the email ever leaves the user’s control.
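The check described above can be sketched in a few lines. This is only an illustration under stated assumptions: the four sensitivity labels, the `Email` and `Attachment` types, and the `find_conflicts` function are hypothetical names chosen for the example and do not describe any particular product.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical sensitivity labels, ordered from least to most sensitive.
LEVELS = ["Public", "Internal", "Confidential", "Restricted"]


@dataclass
class Attachment:
    name: str
    classification: str


@dataclass
class Email:
    subject: str
    classification: str
    attachments: List[Attachment] = field(default_factory=list)


def rank(label: str) -> int:
    """Position of a label in the ordered list of sensitivity levels."""
    return LEVELS.index(label)


def find_conflicts(email: Email) -> List[str]:
    """Return one warning per attachment classified above the email itself."""
    warnings = []
    for att in email.attachments:
        if rank(att.classification) > rank(email.classification):
            warnings.append(
                f"{att.name} is {att.classification}, "
                f"but the email is marked {email.classification}"
            )
    return warnings
```

A mail client add-in could run a check like this when the user clicks Send and show the warnings before the message leaves the outbox, so the user can either raise the email classification or remove the attachment.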
- DLP Systems Must Know the Data to Know How to Manage It
To achieve data-loss prevention, your DLP technology must know what to block. On the basis of what they find, DLP systems have several options, from preventing access to denying copy actions to encrypting data. But all these useful data-governance actions depend on how the DLP system identifies the data. Failure of the search algorithm means either failure to enforce the proper security policy or freezing the data until it is manually reviewed.
DLP searches look for key strings of text in the data or in its properties. In some cases, this data can be very specific, such as a Social Security number (SSN). In other cases, the sensitive data indicators might be a specific string of text unique to your organization. In both cases, the DLP system is still making a guess.
Regardless of the content or the formatting, explicit classification tags allow DLP systems to manage data with certainty. It doesn’t matter whether the DLP scan confuses a telephone number with a Social Security number. Classification provides precise governance instructions in either case. Of note, the DLP system should still be configured to record when its scan conflicts with the classification. With both tools in place, any irregularities in worker behavior can be tracked to locate careless or possibly malicious employees.
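A minimal sketch of this interplay follows, assuming a deliberately simplified SSN-style pattern and two hypothetical labels. The names `scan_guess`, `decide` and `conflict_log` are illustrative only. The explicit label wins whenever it is present, and every disagreement with the scan is recorded for later review.

```python
import re
from typing import List, Optional

# Simplified SSN-style pattern (an assumption for illustration):
# three digits, two digits, four digits, separated by hyphens.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# In-memory stand-in for an audit log of label/scan disagreements.
conflict_log: List[str] = []


def scan_guess(text: str) -> str:
    """Content scan: a pattern-based guess at the sensitivity of the text."""
    return "Restricted" if SSN_PATTERN.search(text) else "Public"


def decide(doc_id: str, text: str, label: Optional[str]) -> str:
    """Prefer the explicit label, fall back to the scan, record disagreements."""
    guess = scan_guess(text)
    if label is None:
        return guess  # unclassified data: the scan result is all there is
    if label != guess:
        conflict_log.append(f"{doc_id}: label={label}, scan={guess}")
    return label
```

For instance, a part number that happens to look like an SSN in a document labeled Public would be handled as Public, while the mismatch is still logged so that an administrator can review whether the label or the pattern was wrong.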
- DLP Works Best on Known Threats
DLP systems are designed to check for specific patterns in text. But if the identifying data is difficult to isolate as risky (common phrases or shared terms) or isn’t text based, DLP systems can miss it altogether.
Intellectual property (IP) often falls into this category. Unlike a credit-card number or a patient ID, intellectual property varies widely in format. For each new project, DLP administrators may need to create and test new rules based on the expected content.
Chemical formulas, manufacturing processes, customer lists, product-development documents: these are all examples of data that may contain terms so specific that a DLP system cannot realistically be updated to detect them, or terms so common that filtering for them would bring up far too many false positives. Media files, such as videos, audio recordings and images, may contain private data or IP as well, but scanning their contents is difficult. Unless these files receive an explicit classification using metadata the DLP system can read, the DLP search capabilities are nearly powerless.
Since intellectual property is generated by your users, they should be tasked with identifying its sensitivity. This action will not only dramatically help your DLP systems protect IP from illicit access or sharing, but it will also remind users that this information has real value and belongs to the organization.
- Additional Benefits of Classification
Classification provides several other benefits, beyond enhancing DLP, that should not be overlooked.
Interoperability With the Entire Security Ecosystem
Persistent classification metadata makes it possible to trigger other protection systems on the basis of classification, such as the automatic application of Microsoft Active Directory Rights Management Services (AD RMS) or S/MIME protection for email.
Data-Retention Management
Classification simplifies data retention because it gives both the content-archiving system and its users more information for deciding on the appropriate retention period. Classifications can include date or status fields that, when filled or edited, can instantly update the retention and disposition status.
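As a rough sketch of how a date field could drive disposition, the example below derives a disposal date from a classification label and a closing date. The retention periods, the `RETENTION_YEARS` table and the `disposition_date` function are assumptions made up for the example, not recommended values or a real archiving API.

```python
from datetime import date
from typing import Dict, Optional

# Hypothetical retention periods, in years, per classification label.
RETENTION_YEARS: Dict[str, int] = {"Public": 1, "Internal": 3, "Confidential": 7}


def disposition_date(label: str, closed_on: Optional[date]) -> Optional[date]:
    """Return the earliest disposal date, or None while the record is open."""
    if closed_on is None:
        return None  # the date field has not been filled in yet
    target_year = closed_on.year + RETENTION_YEARS[label]
    try:
        return closed_on.replace(year=target_year)
    except ValueError:
        # Closed on Feb 29 and the target year is not a leap year.
        return closed_on.replace(year=target_year, day=28)
```

Filling in or editing the closing date is enough to recompute the result, which mirrors the idea that a change to a classification field can update the retention status immediately.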
Email Redactions
Email text often contains sensitive information. By checking the email’s classification level against the email content, it’s possible to alert users when they are about to send information in a manner that conflicts with policy. Users can have the option to redact the sensitive data.
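One way to picture the redaction option is the sketch below. It assumes a simplified SSN-style pattern and a hypothetical set of labels that are allowed to carry such data. The names `ALLOWED_LABELS` and `check_and_redact` are illustrative, and a real tool would prompt the user rather than redact silently.

```python
import re
from typing import Tuple

# Simplified SSN-style pattern (an assumption for illustration).
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Hypothetical labels under which such data may be sent unredacted.
ALLOWED_LABELS = {"Confidential", "Restricted"}


def check_and_redact(body: str, classification: str) -> Tuple[str, int]:
    """Return the (possibly redacted) body and the number of redactions made."""
    if classification in ALLOWED_LABELS:
        return body, 0
    # re.subn returns the new string together with the substitution count.
    return SSN_PATTERN.subn("[REDACTED]", body)
```

A nonzero count is the signal to alert the user that the content conflicts with the chosen classification, at which point the user can accept the redacted text or reclassify the email.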
Flexible Email and Document Visual Markings
Classification can enable the application of customizable headers and footers, watermarks, email subject-line marking, email message-body labeling, dynamic disclaimers and portion markings. These markings remind users of information sensitivity, which promotes responsible handling.
Classification Markings on File Icons
Users can quickly identify the sensitivity of a document by the icon overlay. Files are flagged with a customizable marker that shows the classification to users without their having to open the file.
E-Discovery
Classification helps organizations avoid accidentally including too much information, or even the wrong information, in an e-discovery process. Classification labels can help sort and qualify only the data required.
A Dual Approach to Data Safety
Although data-loss prevention systems are extremely powerful and useful in the bid to keep private data private, the technology alone won’t guarantee success. By empowering users to classify their data, it’s possible to foster a culture of security awareness. Providing classification definitions and clear feedback makes it easy for users to apply the right classification, which helps the DLP system enforce the correct handling policy. The application of classification markings to the document or email provides an extra reminder to staff, resulting in greater attention to security and fewer errors.
Tim Upton, Founder & CEO, TITUS