# Document Datasets with Personal Data

Now that a user in a GDPR admin group has configured the project-level data privacy settings, let’s see how a user in a GDPR documentation group can identify datasets containing personal data.

While logged in as a user with the *privacy\_doc* group privilege, from the project Flow, open the *Customers* dataset.

Click on the blue **GDPR** icon next to the dataset’s name to view the four GDPR fields for the dataset.

| GDPR field | What it documents |
| --- | --- |
| Contains personal data | Whether or not the dataset contains personally identifying information |
| Purposes | For auditing purposes, the reason why this data was collected |
| Retention policy | For auditing purposes and to take appropriate filtering actions, how long the personal data can be used |
| Legal basis for consent | For auditing purposes, how the personal data came into our possession |

In this case, provide the following sample answers and click **Save**:

* Contains personal data: Yes

* Purposes: `Marketing communication`, `Recommender system`

* Retention policy: `3 years after last action`

* Legal basis for consent: `Explicit consent on website`
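
For readers who think in code, the four fields and the sample answers above can be pictured as a small record. This is only an illustrative sketch: the `GdprFields` class and its attribute names are hypothetical and are not part of Dataiku DSS, and treating the purposes as two separate entries is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class GdprFields:
    """Hypothetical record of the four GDPR fields (not a Dataiku DSS API)."""

    # None means the dataset has not been documented yet.
    contains_personal_data: Optional[bool] = None
    purposes: List[str] = field(default_factory=list)
    retention_policy: str = ""
    legal_basis_for_consent: str = ""


# The sample answers entered for the Customers dataset.
customers = GdprFields(
    contains_personal_data=True,
    purposes=["Marketing communication", "Recommender system"],
    retention_policy="3 years after last action",
    legal_basis_for_consent="Explicit consent on website",
)
```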

Notice that, now that the dataset is documented as containing personal data, the formerly blue GDPR icon has turned red. Next, select **Actions > Share** and choose any project to share this dataset with.

Because the *privacy\_doc* user has documented this dataset as containing personal data, and because of the project-level privacy settings configured by the *privacy\_admin* user, Dataiku DSS prevents the share from happening.

Now open the *Orders\_enriched\_prepared* dataset. The Prepare recipe that produced it removed the personally identifying information, so we can mark this dataset as not containing personal data. Once this is done, the GDPR icon turns green to signal the absence of personal data.

In the Flow, you can view the GDPR status of each dataset by choosing **Metadata fields** from the View menu at the bottom left of the screen. Ensure that **GDPR fields - Contains personal data** is the selected field.

## Personal Data Status in Recipe Output

It is important to understand how recipes impact a dataset’s GDPR status with respect to the presence of personal data. The principle to keep in mind is that the documentation of personal data is, by design, a human task. Accordingly, documenting the output of a recipe with respect to personal data is also a human task.

Note

**Key Concept**

* When building a Flow, the *output* dataset of any recipe is initially **not defined** with respect to the presence of personal data, regardless of whether its *input* dataset was marked blue, green, or red.

It remains a human task to document the presence of personal data in an output dataset because it is not obvious whether the recipe in question has introduced or removed personal data.

After a user has documented the status of personal data in an output dataset, attention must also be given to the effect of **editing and saving** the recipe that produced it.

If all inputs to a recipe have been documented as being free of personal data (green), and a user has also marked the output dataset as green, then editing and saving the recipe will not affect the status of personal data in the output dataset. The output dataset remains green.

On the other hand, if the inputs to a recipe are not entirely clear of personal data (at least one blue or one red input), regardless of the previous documentation of the output dataset, editing and saving the recipe will revert the status of the output dataset to not yet defined (blue). Once again, the plugin ensures a human is in the loop to document the presence of personal data.
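
The rules in this section can be summarized as a small decision function. The following is a minimal sketch of the behavior described above, not the plugin's actual implementation: the `Status` enum and both function names are hypothetical. The text does not say what happens when all inputs are green but the output is not, so the sketch leaves the output unchanged in that case; this is an assumption.

```python
from enum import Enum
from typing import Iterable


class Status(Enum):
    """Hypothetical model of the three icon colors described in this section."""

    UNDEFINED = "blue"          # not yet documented
    NO_PERSONAL_DATA = "green"  # documented as free of personal data
    PERSONAL_DATA = "red"       # documented as containing personal data


def initial_output_status(input_statuses: Iterable[Status]) -> Status:
    """A newly created output dataset is undefined, whatever its inputs are."""
    return Status.UNDEFINED


def status_after_recipe_save(
    input_statuses: Iterable[Status], output_status: Status
) -> Status:
    """Status of the output dataset after its recipe is edited and saved."""
    inputs = list(input_statuses)
    # At least one blue or red input: the output reverts to undefined,
    # regardless of how it was documented before.
    if any(status is not Status.NO_PERSONAL_DATA for status in inputs):
        return Status.UNDEFINED
    # All inputs green: a green output stays green (stated in the text).
    # Leaving other output statuses unchanged is an assumption.
    return output_status


green, red = Status.NO_PERSONAL_DATA, Status.PERSONAL_DATA
print(status_after_recipe_save([green, green], green).value)  # green
print(status_after_recipe_save([green, red], green).value)    # blue
```

In both branches the function never sets an output to green or red on its own, which mirrors the principle that documenting personal data remains a human task.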
