Do not export contact_email by default for better GDPR compliance #91

Closed
drn05r opened this issue Aug 19, 2020 · 3 comments

@drn05r
Contributor

drn05r commented Aug 19, 2020

contact_email has moved from core DataObj/EPrints.pm to flavours/pub_lib/cfg.d/eprint_fields_common.pl. If any changes had been made to this field in core (for example, to prevent it being exported for GDPR reasons), those changes will be reverted.

For better GDPR compliance, it makes sense to set export_as_xml to 0 by default (in flavours/pub_lib/cfg.d/eprint_fields_common.pl); system admins can then change this if they want the field exported. I believe the field will still appear in history revisions, but it would be useful to check how easy it is to re-include it in internal (authenticated) exports.
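A minimal sketch of what that default and a local override might look like. The `$c` hash below is a stand-in for the repository configuration so the snippet runs on its own, the `type` value is an assumption, and `set_export_as_xml` is a hypothetical helper rather than anything EPrints provides.

```perl
use strict;
use warnings;

# Stand-in for the EPrints configuration hash; in a real cfg.d file
# $c is supplied by EPrints and this line would not be needed.
my $c = { fields => { eprint => [] } };

# Proposed default: define contact_email with export_as_xml disabled.
# The "type" value is an assumption for illustration.
push @{ $c->{fields}->{eprint} },
{
    name          => 'contact_email',
    type          => 'email',
    export_as_xml => 0,
};

# Hypothetical helper for admins who do want the field exported:
# find the field definition by name and change the property.
sub set_export_as_xml
{
    my( $config, $fieldname, $value ) = @_;
    for my $field ( @{ $config->{fields}->{eprint} } )
    {
        next unless $field->{name} eq $fieldname;
        $field->{export_as_xml} = $value;
        return 1;
    }
    return 0;    # no field with that name
}
```

A local configuration file loaded after the flavour's defaults could then call something like `set_export_as_xml( $c, 'contact_email', 1 )` to restore the old behaviour.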

@drn05r drn05r added the bug Something isn't working label Aug 19, 2020
@drn05r drn05r added this to the 3.4.3 milestone Aug 19, 2020
@drn05r drn05r self-assigned this Aug 19, 2020
@drn05r drn05r closed this as completed in 8d8f704 Aug 28, 2020
@jesusbagpuss
Contributor

jesusbagpuss commented Jul 27, 2021

Hi @drn05r,

> I believe this will still appear in history revisions

Did you validate this is true?

From observations in v3.3.16, and code inspection of the current 3.4 branch, I don't think this is the case. Are you able to confirm that fields with export_as_xml => 0 appear in your XML revision files? (I've just tried testing this on http://tryme.demo.eprints-hosting.org/ but it's running 3.4.0 and its contact_email field doesn't have export_as_xml => 0 set.)

I think they will get excluded by the following:

next if !$field->property( "export_as_xml" );

To my mind, this means XML revision files are an incomplete record of changes. For contact_email and GDPR purposes, this may still be desirable, but other fields with export_as_xml => 0 should possibly be included in the revisions.

The export_as_xml flag is being used for two purposes: controlling what appears in public exports (e.g. by Export::Simple) and controlling what back-end processes write (e.g. revision files).
There should probably be two different flags for this (or export_as_xml could be a set rather than a boolean, but that would be quite messy IMO).
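To illustrate the concern, here is a simplified sketch of a single filter serving both consumers. The fields are plain hashes rather than EPrints::MetaField objects, `exported_field_names` is a hypothetical name, and treating a missing export_as_xml as true is an assumption.

```perl
use strict;
use warnings;

# Simplified field definitions as plain hashes (not EPrints::MetaField objects).
my @fields = (
    { name => 'title',         export_as_xml => 1 },
    { name => 'contact_email', export_as_xml => 0 },
    { name => 'abstract' },    # no explicit setting
);

# Returns the names of the fields that survive the export_as_xml check.
sub exported_field_names
{
    my( @defs ) = @_;
    my @names;
    for my $field ( @defs )
    {
        # Assumption: an unset property counts as true.
        my $export = exists $field->{export_as_xml} ? $field->{export_as_xml} : 1;
        next if !$export;    # same shape as the check quoted above
        push @names, $field->{name};
    }
    return @names;
}

# Both consumers go through the same filter, so both lose contact_email.
my @public_export = exported_field_names( @fields );
my @revision_file = exported_field_names( @fields );
print "public:   @public_export\n";
print "revision: @revision_file\n";
```

With one boolean there is no way to say "keep this out of public exports but still record it in revisions", which is the case for a second flag.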

@drn05r
Contributor Author

drn05r commented Jul 27, 2021

Hi John. You are right. This is another example of me assuming the logical thing (i.e. that all non-volatile fields need to be included in revision files, or you cannot properly track revision changes) when that is not actually the case. I think this may explain why I often see new revisions that don't seem to change anything but the lastmod and rev_number fields, which should only be updated if something non-volatile in the eprint has changed.

What I have initially done (not yet committed) is add an option for $dataobj->to_xml called "revision_generation". Setting it to 1 overrides export_as_xml, i.e. a field is not skipped whether export_as_xml is true or false. I have found that this means the fileinfo field is now exported when previously it wasn't. I am not sure whether this is a good or bad thing: as it is a generated field based on the values of others, maybe it should have been set as volatile all along.
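Roughly, the idea looks like the sketch below. This is not the uncommitted change itself: the fields are plain hashes, `fields_for_xml` is a hypothetical stand-in for the relevant part of to_xml, and only the option name "revision_generation" comes from the description above.

```perl
use strict;
use warnings;

# Simplified field definitions as plain hashes (not EPrints::MetaField objects).
my @fields = (
    { name => 'title',         export_as_xml => 1 },
    { name => 'contact_email', export_as_xml => 0 },
    { name => 'fileinfo',      export_as_xml => 0 },
);

# Hypothetical stand-in for the field loop inside to_xml.
sub fields_for_xml
{
    my( $defs, %opts ) = @_;
    my @names;
    for my $field ( @{$defs} )
    {
        # Skip non-exportable fields unless a revision file is being generated.
        next if !$field->{export_as_xml} && !$opts{revision_generation};
        push @names, $field->{name};
    }
    return @names;
}

my @public   = fields_for_xml( \@fields );
my @revision = fields_for_xml( \@fields, revision_generation => 1 );
print "public:   @public\n";
print "revision: @revision\n";
```

In this sketch the revision list picks up fileinfo as well as contact_email, which mirrors the side effect described above.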

@drn05r drn05r reopened this Jul 27, 2021
@drn05r drn05r modified the milestones: 3.4.3, 3.4.4 Jul 27, 2021
@jesusbagpuss
Contributor

I think this is longer-standing than just this change - but it felt like the right place to comment.

I think there's an interplay with the 'hide_volatile' flag, which is passed in to revision XML generation.
I haven't fully traced this through, but I think it is honoured by Subobjects that are documents, but not by other metafield types, which might be where the fileinfo stuff comes in?
