The value of data today is greater than ever before, with companies looking for ways to optimize its collection and use to provide customers with timely, personalized experiences. As data’s value increases, so do the associated risks and costs. Cloud storage alone accounts for 30% of a company’s overall IT budget, with one terabyte (TB) of data costing $3,351 per year on average. At that rate, 300 TB of data costs roughly $1M per year in storage alone. Beyond rising storage costs, data breaches are also becoming more prevalent as the volume and variety of data collected by organizations grows. The average cost of a data breach in 2022 was $4.35M.
The problem is clear: More data, more costs, more risk.
But is there more value? That depends on how your organization makes use of it. Hoarding data or collecting it without a clear purpose not only increases the storage costs and breach risk mentioned above, but can also violate myriad regulations, as well as data minimization principles and data retention policies.
Unstructured data and its challenges
If it’s so clear that data minimization and data retention are the answer to high storage costs, data breach risks, and non-compliance issues, why isn’t everyone doing it? A big part of the reason is that more than 80% of the data stored by organizations is unstructured.
This means it’s in the form of:
This data also usually becomes meaningless within 90 days, and nearly a third of it is considered redundant, obsolete, or trivial (ROT). ROT data not only adds wasted storage costs, it’s also prime fodder for data breaches because it typically sits outside secure systems. It expands your company’s attack surface: the full set of points through which an unauthorized user or attacker could breach your systems.
With these concerns about unstructured data and a growing attack surface in mind, most privacy regulations today call for data minimization practices to be part of standard operating procedures. Recent enforcement actions from the Federal Trade Commission (FTC) show that data minimization is a key tenet of privacy and data security best practices. Companies can start to include it in their data workflows, using privacy by design principles in their products or services to ensure data is minimized from the outset and that collection and use are clearly communicated to customers.
How can companies operationalize data retention and minimization?
Now that it’s clear the solution is to incorporate privacy by design into your products and services from their inception, the next step is figuring out how to integrate it into your processes seamlessly.
1. Observe your current data lifecycle
To kick things off, look at your most common data workflows and scenarios. Analyze your metadata for relevant fields, such as when data was created and when it was last accessed or modified. Identify when data stops being necessary and where it is commonly deleted in these situations, then see how this could map to a data retention schedule.
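As a rough illustration of the metadata analysis described above, the following Python sketch walks a directory tree and reports files that have been neither modified nor accessed within a threshold. It assumes the data lives on an ordinary file system; the function names, the `FileAge` type, and the 90-day default are hypothetical and only for illustration. Creation time is left out because it is not reported consistently across operating systems.

```python
import os
import time
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass
class FileAge:
    """Age of one file, in days, derived from file system metadata."""
    path: str
    days_since_modified: float
    days_since_accessed: float


def scan_file_ages(root, now=None):
    """Collect modification and access ages for every file under `root`."""
    now = time.time() if now is None else now
    ages = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # skip files that vanished or cannot be read
            ages.append(
                FileAge(
                    path=path,
                    days_since_modified=(now - st.st_mtime) / SECONDS_PER_DAY,
                    days_since_accessed=(now - st.st_atime) / SECONDS_PER_DAY,
                )
            )
    return ages


def stale_files(ages, threshold_days=90):
    """Return files untouched (no access, no modification) for longer than the threshold."""
    return [
        a for a in ages
        if min(a.days_since_modified, a.days_since_accessed) > threshold_days
    ]
```

Running `stale_files(scan_file_ages("/some/share"))` on a representative location gives a first list of candidates to review against your retention schedule. Note that some systems do not update access times reliably, so treat the results as a starting point, not a deletion list.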
2. Establish a deletion method
After identifying where data is deleted and formulating a retention schedule around these scenarios, you can apply these retention periods to your data, for example by archiving or deleting SharePoint files once they pass a certain age threshold.
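One way to picture a retention schedule is as a lookup from data category to archive and delete thresholds. The Python sketch below is a minimal illustration under that assumption: the categories, the day counts, and the names `RetentionRule`, `RETENTION_SCHEDULE`, and `retention_action` are all hypothetical placeholders. It only decides what should happen to an item of a given age; it does not call SharePoint or any other storage API.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    KEEP = "keep"
    ARCHIVE = "archive"
    DELETE = "delete"


@dataclass(frozen=True)
class RetentionRule:
    """Thresholds, in days, after which data is archived and then deleted."""
    archive_after_days: int
    delete_after_days: int


# Hypothetical schedule; real values come from your legal and business requirements.
RETENTION_SCHEDULE = {
    "project_docs": RetentionRule(archive_after_days=90, delete_after_days=365),
    "meeting_notes": RetentionRule(archive_after_days=30, delete_after_days=180),
}

# Fallback for categories that have no explicit rule (also an assumption).
DEFAULT_RULE = RetentionRule(archive_after_days=90, delete_after_days=730)


def retention_action(category, age_days):
    """Decide whether an item of the given age should be kept, archived, or deleted."""
    rule = RETENTION_SCHEDULE.get(category, DEFAULT_RULE)
    if age_days >= rule.delete_after_days:
        return Action.DELETE
    if age_days >= rule.archive_after_days:
        return Action.ARCHIVE
    return Action.KEEP
```

Keeping the decision separate from the mechanism that actually archives or deletes makes the schedule easy to review and test before anything is removed.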
3. Use a centralized data governance tool
Once your retention periods are defined and your deletion methods are established, a centralized tool is the most efficient way to apply them consistently.
How can automation help?
OneTrust Data Discovery can help your organization operationalize data retention policies by first helping you identify unstructured data across your entire IT infrastructure. Once you have full visibility across your data ecosystem in structured, semi-structured, and unstructured environments, you can then:
To learn more about how OneTrust Data Discovery can take your organization’s data retention and minimization policies to the next level, request a demo today.