Corporate efforts to secure data, comply with regulations, tier storage and meet new legal-discovery demands depend on having a good data-classification method in place.
Traditional classification methods that rely on file-system metadata lack comprehensive content visibility. The Common Internet File System for Windows and the Network File System for Unix offer no more than eight metadata attributes for classification, such as file name, directory name, file size, type and modified or access dates.
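To make the limitation concrete, here is a minimal Python sketch of a metadata-only classifier. The function names, labels and thresholds are hypothetical and chosen only for illustration; they do not describe any particular product. The classifier sees the attributes listed above and nothing inside the file.

```python
import time
from pathlib import Path

def metadata_summary(path):
    """Collect the file-system attributes a metadata-only classifier can see."""
    p = Path(path)
    st = p.stat()
    return {
        "file_name": p.name,
        "directory": str(p.parent),
        "size_bytes": st.st_size,
        "type": p.suffix.lower(),
        "modified": st.st_mtime,
        "accessed": st.st_atime,
    }

def classify_by_metadata(summary, now=None):
    """Hypothetical rule set: labels and thresholds are illustrative only."""
    now = time.time() if now is None else now
    age_days = (now - summary["modified"]) / 86400
    if summary["type"] in {".xls", ".xlsx", ".csv"}:
        return "spreadsheet"
    if age_days > 365:
        return "archive-candidate"
    return "unclassified"
```

A recently modified text file that contains a Social Security number would come back as "unclassified" here, because no rule can look at its contents.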
These basic solutions are proving inadequate to address IT's requirements for accurate data classification.
That has spurred the emergence of a market segment called Information Classification and Management (ICM). These tools offer advanced features such as file-path metadata parsing, in-file content visibility, context category classification, file-classification tagging and policy-based management and tracking.
Unfortunately, some of these solutions still suffer from serious performance, scalability, flexibility and capability issues because of their foundational architectures, namely relational databases and/or enterprise search engines.
The latter have proven quite adequate for Web-based searching, as demonstrated by Google and others. In enterprise environments, however, their limitations can make them unwieldy, and many IT professionals find them unsuitable for ICM requirements. Think of search as building a dictionary to find a few words. One must first build a large index of all words in all files.
This process is quite slow (as much as two weeks to index 10TB) and can consume lots of storage (increasing requirements by 50% to 300% in most cases).
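The dictionary analogy can be sketched in a few lines of Python. This is a toy inverted index, not how any specific search engine is implemented, and the helper names are assumptions. The two small functions after it simply restate the figures quoted above; applying the 50% to 300% range to a 10TB corpus is itself an assumption made for illustration.

```python
import re
from collections import defaultdict

def build_inverted_index(documents):
    """Map every word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index

def index_overhead_tb(corpus_tb, low=0.5, high=3.0):
    """Extra storage implied by a 50% to 300% increase over the corpus size."""
    return corpus_tb * low, corpus_tb * high

def indexing_rate_mb_per_s(corpus_tb, days):
    """Average throughput in decimal MB/s if the corpus takes `days` to index."""
    return corpus_tb * 1_000_000 / (days * 86_400)

docs = {
    "memo.txt": "Quarterly revenue forecast",
    "notes.txt": "Revenue notes for the quarterly review",
}
index = build_inverted_index(docs)
extra_low, extra_high = index_overhead_tb(10)   # 5.0 and 30.0 TB
rate = indexing_rate_mb_per_s(10, 14)           # roughly 8.3 MB/s
```

Every word of every file has to pass through the indexing loop before the first query can be answered, which is where the time and storage go.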
Advanced solutions targeted at ICM must go beyond search to provide true data mining of information. This includes the ability to find Social Security numbers, credit card numbers, source code or confidential information stored in unsecured locations. They also must be able to find data that resembles a name, company name, account number or litigation case name, or even a data-point value in a spreadsheet cell.
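As a rough illustration of in-file pattern detection, the Python sketch below scans text for strings shaped like Social Security numbers and credit card numbers. The regular expressions and function names are assumptions made for this example. The Luhn checksum is a standard way to discard digit runs that cannot be valid card numbers. A production tool would need many more patterns and safeguards against false positives.

```python
import re

# Assumed formats: SSNs written as 123-45-6789, card numbers of 13 to 16
# digits optionally separated by single spaces or hyphens.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")

def luhn_valid(number):
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:          # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def scan_text(text):
    """Return (label, matched_text) pairs for sensitive-looking patterns."""
    findings = []
    for m in SSN_RE.finditer(text):
        findings.append(("ssn", m.group()))
    for m in CARD_RE.finditer(text):
        if luhn_valid(m.group()):
            findings.append(("credit_card", m.group()))
    return findings
```

Run against each file in an unsecured share, a scanner like this flags candidates for review without first building a full-text index.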
Some tools can use pattern or context recognition to detect document summaries or themes. This provides content visibility similar to search but adds context to make sure that John Apple is classified as a name and not a company or fruit.
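One very simple way to picture context recognition is to look at the words on either side of a match. The Python sketch below is a toy heuristic under that assumption. The cue lists and the function name are hypothetical, and real tools use far richer linguistic and statistical models.

```python
import re

# Hypothetical cue words; a real system would learn or configure these.
PERSON_CUES = {"mr", "ms", "mrs", "dr", "employee", "customer"}
COMPANY_CUES = {"inc", "corp", "llc", "ltd"}

def classify_mention(text, phrase):
    """Label the first occurrence of `phrase` using up to two words on each side."""
    match = re.search(re.escape(phrase), text)
    if match is None:
        return None
    before = re.findall(r"[A-Za-z]+", text[:match.start()])[-2:]
    after = re.findall(r"[A-Za-z]+", text[match.end():])[:2]
    before_words = {w.lower() for w in before}
    after_words = {w.lower() for w in after}
    if after_words & COMPANY_CUES:
        return "company"
    if before_words & PERSON_CUES:
        return "person"
    return "unknown"
```

With these cues, "Mr. John Apple" is labeled a person and "John Apple Inc." a company, while a bare mention with no nearby cue stays "unknown" rather than being guessed.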