Data classification is the process of organizing data into categories based on attributes such as file type, content, or metadata. The data is then assigned class labels that describe a set of attributes for the corresponding data sets. The goal is to give meaningful class attributes to information that was formerly less structured.
Data classification can be viewed as a set of labels that define the type of data, particularly with respect to confidentiality and integrity.[1] Data classification is typically a manual process; however, there are tools that can help gather information about the data.[2] It is often proposed that classification schemes also take data sensitivity levels into account.
A corporate data classification policy sets out how employees are required to treat the different types of data they handle. Automated classification is sometimes performed by software that analyzes content for keywords or phrases and assigns a label accordingly. This approach might be used for reports generated by ERP systems, or where the data contains specific personal information that the software can identify. In some cases, employees might be responsible for deciding which label is appropriate.[3]
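The keyword-based approach to automated classification can be illustrated with a minimal Python sketch. The label names (`public`, `internal`, `confidential`, `restricted`), the patterns, and the `classify` function are all hypothetical and chosen for illustration only; an actual policy would define its own labels and rules. The sketch assumes that when several rules match, the most restrictive label applies.

```python
import re

# Hypothetical labels, ordered from least to most restrictive.
LABELS = ["public", "internal", "confidential", "restricted"]

# Hypothetical rules: each label maps to keyword or phrase patterns
# that trigger it. Real policies would supply their own patterns.
RULES = {
    "restricted": [r"\b\d{3}-\d{2}-\d{4}\b", r"\bpassport number\b"],
    "confidential": [r"\bsalary\b", r"\bdate of birth\b"],
    "internal": [r"\binternal use only\b", r"\bdraft\b"],
}

# Compile the patterns once, matching case-insensitively.
COMPILED = {
    label: [re.compile(pattern, re.IGNORECASE) for pattern in patterns]
    for label, patterns in RULES.items()
}


def classify(text, default="public"):
    """Return the most restrictive label whose patterns match the text."""
    best = default
    for label, patterns in COMPILED.items():
        if any(p.search(text) for p in patterns):
            # Keep the match only if it is stricter than the current label.
            if LABELS.index(label) > LABELS.index(best):
                best = label
    return best


print(classify("Quarterly newsletter"))                      # public
print(classify("DRAFT: internal use only"))                  # internal
print(classify("Employee salary report, ID 123-45-6789"))    # restricted
```

A rule table of this kind is easy to audit, but it only recognizes the phrases it has been given, which is one reason a manual decision by an employee may still be needed.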