How to better govern your unstructured data

From emails to images, unstructured data containing sensitive information abounds in your ecosystem

Jason Koestenblatt
Manager, Content Marketing
September 13, 2023

Unstructured data is broadly defined as any information that isn’t stored in a traditional row-and-column format. It is becoming a significant part of organizations’ data landscapes: by some estimates, up to 90% of the world’s data is held in an unstructured format.

For most organizations, applications that store or process unstructured data, such as SharePoint, Outlook, Google Drive, and Slack, are intrinsic to the day-to-day operations of the business. Like all data, unstructured data carries risks that privacy, security, and governance professionals need to address. However, unstructured data poses unique risks that make uncovering and addressing them particularly challenging.

Examples of unstructured data:

  • Text files or documents: Word processing documents, spreadsheets, presentations, PDFs, emails, and log files.
  • Emails: Email is sometimes considered semi-structured data. Its metadata (such as sender, recipient, and date) is structured, so analytics tools can classify and search it easily. The message body, however, is unstructured free text that these tools cannot readily parse.
  • Media: Digital images, video, and audio files.
  • Social media: Data from social networks like Facebook, LinkedIn, and Twitter.
  • Websites: Content on Instagram, YouTube, and photo-sharing sites.
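
To make the email example concrete, here is a minimal Python sketch using the standard library's `email` module. The sample message and the `split_email` helper are invented for illustration; the point is only that header fields can be read by name, while the body comes back as one block of free text that needs further analysis.

```python
from email import message_from_string

# Hypothetical sample message, invented for illustration.
RAW_EMAIL = """\
From: alice@example.com
To: bob@example.com
Subject: Invoice details
Date: Wed, 13 Sep 2023 10:00:00 +0000

Hi Bob, please pay to account 12345678 by Friday.
"""


def split_email(raw):
    """Return (structured header fields, unstructured body text)."""
    msg = message_from_string(raw)
    # Headers are addressable by name: this is the structured part.
    headers = {name: msg[name] for name in ("From", "To", "Subject", "Date")}
    # For a simple, non-multipart message the payload is a plain string.
    body = msg.get_payload()
    return headers, body


headers, body = split_email(RAW_EMAIL)
print(headers["Subject"])
print(body.strip())
```

Note that the account number sits in the body, the part that cannot be looked up by field name.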

Conversely, structured data is stored in a relational database management system (RDBMS), where it is easily identified and can be queried with a language such as SQL (Structured Query Language). This type of data is mapped and organized into designated fields, so it can be searched by data type as well as by content.

Learn how to build a data governance program with this ultimate guide.

 

Data type presents major challenges

The variety of unstructured data sources is staggering — PNG, JPEG, TXT, PDF, CSV, MP4, etc.

One of the biggest challenges of unstructured data is the huge variety of data types. Access any file hosting or sharing application and you will likely find everything from PNGs and PDFs to TXT and MP4 files. The sheer range of file formats and the volume of data can be bewildering, yet every one of these files holds data and may therefore hold personal or sensitive information. Because these file types are free-form, almost any kind of information can turn up in them.

Therefore, organizations need to understand the types and classifications of data found in these files in order to meet data privacy and protection obligations. PDFs, for example, can contain anything from bank account information to a complete profile of an individual or huge lists of personal data. The same goes for images, which can easily contain classified information saved for use at a later date. Storing this type of unstructured data could mean a violation of internal privacy and security policies or, in the worst-case scenario, even the law.

Realistically, classifying and categorizing the data found within unstructured file types cannot be done manually because of the volume of data that would need to be processed. For a full and accurate picture of what is hidden in your unstructured data, automation is essential.

Technology is a must for unstructured data discovery projects to find, comprehend, and catalog all of this data, allowing privacy, security, and governance teams the opportunity to implement the appropriate controls over it.
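
As a rough illustration of what automated classification involves, the Python sketch below scans extracted text for a few sensitive-looking patterns. It is a deliberately simple, rule-based stand-in and not how any particular product works: the pattern set, the category names, and the `classify_text` helper are all assumptions made for this example, and real discovery tools rely on far more robust techniques.

```python
import re

# Illustrative patterns only (assumptions for this sketch). Real classifiers
# need stricter rules and validation, such as checksum tests for card numbers.
PATTERNS = {
    "email_address": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card_number": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
}


def classify_text(text):
    """Count pattern matches per category in one blob of extracted text."""
    counts = {}
    for label, pattern in PATTERNS.items():
        hits = pattern.findall(text)
        if hits:
            counts[label] = len(hits)
    return counts


sample = "Contact jane.doe@example.com, SSN 123-45-6789, card 4111 1111 1111 1111."
print(classify_text(sample))
```

In practice the text would first have to be extracted from each file format (PDF, image, audio, and so on), which is a large part of why this work cannot be done by hand.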

 

Accessibility: A benefit full of unstructured data risk

A significant and beneficial feature of file hosting and sharing applications is the flexibility they give users to host, share, and access files quickly and easily. Almost anything can be shared or accessed by almost anyone. For organizations, this promotes cross-functional collaboration, improves efficiency in day-to-day tasks, and inspires innovation. With this flexibility, however, comes a potential downside: data getting into the wrong hands.

Unstructured sources hold large amounts of data in many file types, and any of those files may contain sensitive or restricted data. Combine that with often open access, and the likelihood of a major incident or breach involving restricted, personal, or other protected data rises sharply. Understanding the classification of the data found in unstructured sources is rarely enough to govern it properly. Once data has been discovered, classified, and cataloged, proper access controls need to be applied to personal data and sensitive information, and remedial action needs to be taken to understand who has, and who has had, access so that the data can be better protected.
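
The idea of combining classification with access information can be sketched in a few lines of Python. Everything here is hypothetical: the `FileRecord` shape, the `SENSITIVE` labels, and the `overexposed` helper are assumptions for illustration, since the real inputs would come from whatever discovery and permission data your tools expose.

```python
from dataclasses import dataclass, field


@dataclass
class FileRecord:
    """Hypothetical catalog entry produced by a discovery scan."""
    path: str
    classifications: set = field(default_factory=set)
    shared_with: set = field(default_factory=set)
    public_link: bool = False


# Assumed labels that mark a file as needing restricted access.
SENSITIVE = {"personal", "financial", "restricted"}


def overexposed(files, allowed_groups):
    """Return sensitive files that are shared beyond the allowed groups."""
    flagged = []
    for f in files:
        if not (f.classifications & SENSITIVE):
            continue  # non-sensitive files are out of scope for this check
        extra = f.shared_with - allowed_groups
        if f.public_link or extra:
            flagged.append((f.path, sorted(extra), f.public_link))
    return flagged


files = [
    FileRecord("hr/payroll.pdf", {"financial"}, {"hr", "all-staff"}),
    FileRecord("marketing/logo.png", set(), {"all-staff"}, public_link=True),
    FileRecord("legal/contract.docx", {"personal"}, {"legal"}),
]
print(overexposed(files, allowed_groups={"hr", "legal"}))
```

Only the payroll file is flagged: it is sensitive and shared with a group outside the allowed set. The publicly linked logo is ignored because it carries no sensitive classification.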

 

Have you kept your data for too long?

Raise a hand if you have emails dating back years, or even decades, stored on your email host’s server.

Do you know what is contained in those emails? 

Now, extrapolate that email volume across hundreds or thousands of employees, and you can start to understand the scale of the problem that unstructured data causes for organizations. The personal information hidden within emails that have been lingering in the archives for years may now be in violation of data retention policies. And this problem extends further than just email: files stored in file share applications can sit unused and unaccounted for longer than is necessary, at which point they need to be deleted. Under the General Data Protection Regulation (GDPR), you must be able to justify how long you keep personal data.
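
A retention check of this kind is simple to express once you have an inventory with timestamps. The Python sketch below flags items whose last-modified date is older than a retention window. The `past_retention` helper, the sample items, and the seven-year period are assumptions for illustration only; appropriate periods depend on your own policies and legal obligations.

```python
from datetime import datetime, timedelta, timezone


def past_retention(items, retention_days, now=None):
    """Return names of items last modified before the retention cutoff.

    `items` is an iterable of (name, last_modified) pairs, where
    last_modified is a timezone-aware datetime.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    return [name for name, last_modified in items if last_modified < cutoff]


# Hypothetical inventory entries, invented for this example.
inventory = [
    ("archive/2010-offer-letter.msg", datetime(2010, 1, 1, tzinfo=timezone.utc)),
    ("inbox/2023-project-update.msg", datetime(2023, 1, 1, tzinfo=timezone.utc)),
]

# Assumed retention period of roughly seven years, fixed "now" for repeatability.
as_of = datetime(2023, 9, 13, tzinfo=timezone.utc)
print(past_retention(inventory, retention_days=7 * 365, now=as_of))
```

Flagged items would then go to review or deletion rather than being removed automatically, since some records have to be kept for legal or business reasons.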

 

How OneTrust Data Discovery helps

OneTrust Data Discovery serves as a valuable tool for Chief Data Officers, Chief Privacy Officers, and Chief Information Security Officers alike. Enhanced unstructured data discovery capabilities help you find unstructured data across common shared-use applications and understand the compliance obligations attached to the sensitive or personal information found within these files.

OneTrust’s enhanced unstructured data discovery capabilities use advanced machine learning-based classification to give users a clearer view into at-risk, sensitive, or personal data down to the individual data element level, and they automatically populate data maps to help maintain compliance with privacy and security regulations. Moreover, OneTrust Data Discovery adds further context to your data by helping you understand who has access and confirm that the right level of access is implemented alongside applicable governance policies.

OneTrust Data Discovery automatically populates data inventories, giving governance teams a clear, centralized view of their data, helping with compliance obligations, retention periods, and access controls.

Learn more about OneTrust Data Discovery tools and Data Governance by requesting a demo.

