In the first part of this blog series, we explored the challenges associated with today’s data landscape and the reasons why more and more organizations need an automated data discovery solution to address these challenges. The ever-increasing amount of data being processed, the evolving regulatory landscape, and the variety of technologies (legacy systems, data lakes, SaaS tools, etc.) that store and process data are just a few of the challenges organizations face when it comes to their data.
Watch the Webinar: Automate Your Privacy Program with Data Discovery
Organizations take different approaches to their data challenges. While the right approach depends on internal and external factors, there are a few common mistakes organizations make when trying to know and govern their data:
- Locking down and over-governing: Many organizations live in fear of misusing their data, resulting in fines, data breaches, or loss of trust from their stakeholders. A typical response to ensure that this doesn’t happen is to lock down or over-govern the data. This might mean giving data access only to a small group of individuals or data scientists who are expected to meet the entire business’s needs, or putting layers of approval processes and red tape between the company and the data it needs. In a data-driven economy, where data is an asset and a competitive advantage, this is not a solution that can scale for most businesses.
- Relying on manual data discovery processes: Organizations may choose to forgo deploying automated data discovery solutions in favor of a manual approach that may be cheaper and quicker to implement. Manual approaches to data discovery typically involve sending surveys to IT owners that ask questions about the types of data, how the data flows, and why it is used. While surveys and assessments play a key role in privacy, security, and data governance programs, using surveys as the sole method for data discovery can be tedious and a drain on resources. This tactic often provides an outdated and inaccurate reflection of what data you have and how it is governed.
While restricting access to data and running manual discovery processes both have a place in privacy, security, and data governance programs, these methods should be used in tandem with an automated data discovery solution to address modern data challenges.
Essential elements of an effective data discovery solution
A truly automated data discovery solution helps organizations understand their data across their business and third-party relationships. It does this by connecting to and scanning IT systems, using AI, machine learning, and other technologies to classify, tag, and enrich the data, and then building an inventory of that data and acting on it in different ways. Almost all data discovery solutions cover these key elements in some form, so an organization must dig deeper when evaluating which tool is the best fit. A few essential capabilities to look for in a data discovery solution include:
- Discovery in the systems you know about and those you don’t: Many data discovery tools require the organization to know all of the IT systems they want to connect to and scan. In modern organizations, data is often sprawled across hundreds or thousands of systems, and shadow IT is almost always present in some form. Therefore, it is key for your data discovery solution to also help identify all of your systems by referencing existing sources of this information in the business like CMDB, IAM, CSPs, and CASB tools to ensure you have an accurate inventory of all data assets.
- Going deeper than the metadata: Some data discovery solutions only scan at the metadata level and attempt to give you information about the data you have without analyzing the actual data. Often the sensitive data you want to know about won’t be found in this manner. This can be the case with structured data (for example, credit card data mistakenly entered into a field called “Phone Number”) and, more often, with unstructured data such as files, emails, and free-text fields in structured systems.
- Scaling across large volumes of data: Today, many organizations hold petabytes of data. At that volume, it is not practical for a discovery solution to scan every piece of data every time a scan runs. While it is critical to go deeper than the metadata, it is also key to take a scalable approach that balances scanning full data sets with scanning smaller samples of data where appropriate.
- AI trained on the correct data: AI, machine learning, and other similar technologies used for data classification and enrichment are only as good as the data on which they are trained. Look for discovery solutions that train their technology on the most up-to-date regulatory requirements, frameworks, and data definitions to ensure the proper governance context is applied to your data.
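To make the points about content-level scanning and sampling concrete, here is a minimal Python sketch. It is an illustration only, not how any particular discovery product works: the function names (`luhn_valid`, `find_card_numbers`, `scan_column`), the regular expression, and the default sample size are all assumptions chosen for the example. It inspects the values in a column rather than its name, flags values that look like payment card numbers using the Luhn checksum, and examines a random sample instead of every row when the column is large.

```python
import random
import re

# Candidate card numbers: 13 to 19 digits, optionally separated by spaces or hyphens.
# (Illustrative pattern; a real classifier would use many more signals.)
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(value: str) -> list[str]:
    """Return digit strings in value that look like card numbers and pass Luhn."""
    hits = []
    for match in CARD_PATTERN.finditer(value):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


def scan_column(values: list[str], sample_size: int = 1000, seed: int = 0) -> float:
    """Estimate the share of values containing a card number from a random sample."""
    if not values:
        return 0.0
    rng = random.Random(seed)
    # Small columns are scanned in full; large ones are sampled.
    sample = values if len(values) <= sample_size else rng.sample(values, sample_size)
    flagged = sum(1 for v in sample if find_card_numbers(v))
    return flagged / len(sample)


# A column whose metadata says "Phone Number" but which contains a test card number.
phone_number_column = ["555-0100", "4111 1111 1111 1111", "555-0199", "555-0142"]
share = scan_column(phone_number_column)
```

In this sketch a metadata-only scan would see a field called “Phone Number” and move on, while the content scan reports that a quarter of the sampled values look like card numbers. Sampling trades certainty for speed, so a non-zero result on a sample would be a reasonable trigger for a full scan of that column.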
The final critical element to consider is what you do with that information once you know your data. Many data discovery solutions are just that: data discovery solutions. Once the data is discovered, these tools need to integrate with other solutions to help your privacy, security, and data governance teams act on and govern the data. Each of these teams has different use cases when it comes to acting on this information.
Privacy teams need to not only process DSARs and other data requests but also build records of the processing of personal data. Security teams require the ability to control access and understand the security risks posed by IT systems. Data governance teams must map the lineage of data, apply governance policies, and enable the use of the data for analytics and business value. Look for a data discovery tool that can integrate with these solutions or consider a platform approach.
OneTrust not only helps you discover, classify, and know your data but also provides the full platform to help privacy, security, and data governance teams operationalize the downstream actions, governance processes, and compliance reporting that need to occur after discovery.
In the next blog in this series, we’ll discuss how data discovery helps privacy teams address the challenges of processing personal data in compliance with evolving privacy regulations worldwide.