Proposed Work Item: First-Party Sets #17

Closed
krgovind opened this issue Jun 3, 2020 · 48 comments

Labels
  • interest: blink - Implementer interest from Blink (e.g. Brave, Google/Chrome, Microsoft/Edge)
  • interest: webkit - Implementer interest from WebKit (e.g. Apple/Safari, Igalia/GNOME Web)
  • work item? - Formal request to adopt this proposal as a Work Item of the CG

Comments

@krgovind

krgovind commented Jun 3, 2020

First-Party Sets is a web platform mechanism that allows a set of registrable domains (or origins) to be defined as "first-party" to each other. Our primary motivation for this proposal is to define a privacy boundary that allows browsers to eliminate cross-site tracking that currently relies on mechanisms such as third-party cookies and fingerprinting. Tracking policies and privacy models from various browser vendors - Chromium, Edge, Mozilla, WebKit - scope access to user identity to some notion of first-party, which we refer to as a privacy boundary.

Although the top-level document’s registrable domain can act as a natural privacy boundary, it is clear that multi-domain sites are a reality, which compels us to define a better alternative. For example, Firefox ships an entity list to group together domains belonging to the same organization.

Organizations generally prefer maintaining distinct domain names to manage branding, or to allow for future business sales/acquisitions. Additionally, choosing the registrable domain as the privacy boundary may compel organizations to move all their web properties to a single parent domain. The parent domain that a property is hosted on may then change with business ownership, which would train users to make security decisions based on the subdomain component of URLs. This could make them more susceptible to phishing attacks.

First-Party Sets allows site operators to assert a list of domains as being associated with the same entity. This then allows us to define a top-level document’s First-Party Set as the privacy boundary. Browsers may choose to not impose cross-domain communication restrictions across members of a given First-Party Set (such as is done in practice with disconnect.me’s extension, Firefox ETP’s use of the entity list, and Edge Tracking Protection’s similar exception for same-party domains). However, it is important to apply a set of countervailing pressures:

  • Preventing abuse by unrelated websites forming a First-Party Set - This is achieved by requiring every organization to submit their list for acceptance based on conformance with a published UA policy.
  • Making site associations visible to the user - This is achieved by making First-Party Sets discoverable via various browser UI surfaces.
  • Discouraging formation of arbitrarily large sets by imposing storage and entropy limits - Browser storage limits and entropy limits (such as the proposed Privacy Budget) that are currently applied per-domain are instead applied per First-Party Set.
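
The pressures listed above suggest a simple data model. The following is only a rough sketch in Python, not the proposal's actual design: the `ACCEPTED_SETS` table, the example domains, the `boundary_for` and `same_party` helpers, and the `StorageBudget` class are all hypothetical names introduced for illustration, and the byte limit used with them is arbitrary. It assumes each accepted set is keyed by an owner domain, and shows a storage budget charged per set rather than per domain.

```python
from dataclasses import dataclass, field

# Hypothetical sets that have already been accepted under a UA policy.
# Each set is keyed by an owner domain; every name here is made up.
ACCEPTED_SETS = {
    "brand-a.example": {"brand-a.example", "brand-a-shop.example"},
}

# Reverse index: member domain -> owner domain of its set.
OWNER_OF = {
    member: owner
    for owner, members in ACCEPTED_SETS.items()
    for member in members
}


def boundary_for(registrable_domain: str) -> str:
    """Return the privacy-boundary key for a registrable domain.

    Members of an accepted set share the owner's key; any other
    domain is its own boundary.
    """
    return OWNER_OF.get(registrable_domain, registrable_domain)


def same_party(a: str, b: str) -> bool:
    """True if both registrable domains fall inside one boundary."""
    return boundary_for(a) == boundary_for(b)


@dataclass
class StorageBudget:
    """Storage limit charged per boundary (per set), not per domain."""

    limit_bytes: int
    used: dict = field(default_factory=dict)

    def try_write(self, registrable_domain: str, size: int) -> bool:
        key = boundary_for(registrable_domain)
        current = self.used.get(key, 0)
        if current + size > self.limit_bytes:
            return False  # the whole set has exhausted its shared budget
        self.used[key] = current + size
        return True
```

Under these assumptions, two members of the same set draw from one budget, so adding more domains to a set does not multiply the storage available to it.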

First-Party Sets has recently been the subject of discussion on various forums, including the PrivacyCG F2F and WebAdvBG.

We have been working to incubate First-Party Sets in WICG, and it was recently transferred there: https://github.com/WICG/first-party-sets

We'd like to propose that the Privacy CG discuss it and see if the group would like to take it on as a Work Item.

@hober added the labels "interest: blink" (Implementer interest from Blink, e.g. Brave, Google/Chrome, Microsoft/Edge), "needs implementer interest" (Proposals cannot become work items without multi-implementer interest), and "work item?" (Formal request to adopt this proposal as a Work Item of the CG) on Jun 4, 2020

@othermaciej

Apple supports adopting this proposal as a Privacy CG Work Item. We have proposed similar mechanisms in the past and continue to be interested in this area.

In honesty, we would probably not implement the spec as-is because it leaves too many of the hard problems with such a mechanism unsolved or up to each individual browser, but we believe they are eminently solvable, and Privacy CG would be a great place to work through them.

@melanierichards

Echoing Microsoft Edge sentiment from the WICG Discourse thread: we believe that First-Party Sets could be useful in helping unblock valid intra-organizational use cases while maintaining the right privacy promises. We’re supportive of exploring this idea further. Agreed that as a community we’ll need to continue workshopping mitigations against abuse while striking the right balance between organizational cohesion vs. sets that can be reasoned about by most users. We’re hopeful that we can collectively come up with solutions to these considerations, and are interested in continued discussion on First-Party Sets.

Privacy CG would be a great home for this.

@hober added the label "interest: webkit" (Implementer interest from WebKit, e.g. Apple/Safari, Igalia/GNOME Web) and removed the label "needs implementer interest" (Proposals cannot become work items without multi-implementer interest) on Jun 4, 2020
@hober changed the title from "First-Party Sets" to "Proposed Work Item: First-Party Sets" on Jun 4, 2020

@pbannist

pbannist commented Jun 5, 2020

Echoing what I wrote on the Discourse thread, I think this proposal is better discussed in WICG. Privacy is a major consideration here, but it is not the overriding or exclusive consideration. The Privacy Group would seem to relegate all other considerations to second-class, which is not appropriate for a standard that has so many implications that go beyond privacy.

@johnwilander

> Echoing what I wrote on the Discourse thread, I think this proposal is better discussed in WICG. Privacy is a major consideration here, but it is not the overriding or exclusive consideration. The Privacy Group would seem to relegate all other considerations to second-class, which is not appropriate for a standard that has so many implications that go beyond privacy.

First Party Sets aim to relax privacy (and potentially security) protections on the web. Such protections are an overriding concern but not an exclusive concern. If we don't figure out how to uphold existing protections, browser vendors who prioritize user privacy are unlikely to implement First Party Sets and the end result would be a bifurcated web in terms of how domain names are handled. That's why I think First Party Sets should be discussed in the Privacy CG. This is a place where we have a reasonable chance of figuring out a version of this proposal that's acceptable to most browser vendors.

@pbannist

pbannist commented Jun 5, 2020

If adopted by browsers other than Chrome (like Safari/Webkit) then, yes FPS does have the side effect (not aim) of reducing privacy, and perhaps security, protections. However, within Chrome, it is part of a set of proposals that aim to increase privacy and security, while limiting economic damage to publishers.

It is possible that a more desirable outcome for "the web" is:

  1. More privacy, on the whole
  2. Less economic damage to publishers
  3. A bifurcated web around standards (which already exists in many cases)
  4. Less power concentrated among a small number of multi-national conglomerates

It seems less likely that an honest conversation across all stakeholders can be had if privacy is the overriding concern.

@johnwilander

> If adopted by browsers other than Chrome (like Safari/Webkit) then, yes FPS does have the side effect (not aim) of reducing privacy, and perhaps security, protections. However, within Chrome, it is part of a set of proposals that aim to increase privacy and security, while limiting economic damage to publishers.
>
> It is possible that a more desirable outcome for "the web" is:
>
>   1. More privacy, on the whole
>   2. Less economic damage to publishers
>   3. A bifurcated web around standards (which already exists in many cases)
>   4. Less power concentrated among a small number of multi-national conglomerates
>
> It seems less likely that an honest conversation across all stakeholders can be had if privacy is the overriding concern.

I don't understand what "within Chrome" means. Do you mean this is a one-browser feature? If the aim is not to get browser interoperability, I don't see why it should be discussed anywhere within W3C. This is a place where we work together to enhance and develop a web platform that works regardless of which (modern) browser is being used. Given that the goal is interoperability, I think the Privacy CG is the right place to work on First Party Sets.

I'll let Google and @krgovind speak to whether they share your views since they are the ones proposing First Party Sets.

@pbannist

pbannist commented Jun 5, 2020

I mean that if Webkit/Safari and other browsers could consider other perspectives (around economic benefits, increased competition, support for diverse voices, etc.) beyond privacy, perhaps an interoperable standard could be created that does decrease privacy in return for other end-user benefits. Or, the choice could be made that an interoperable standard is not possible.

However, without an honest conversation around all considerations, it seems that the only possible outcomes of an interoperable FPS standard are:

  1. increased privacy on the whole: increased privacy in Chrome, slightly decreased privacy in other browsers
  2. one additional user benefit: transparent cross-domain functionality within an "organization" - I am oversimplifying the benefit, to be fair
  3. significant degradation of other considerations: reduced competition/innovation, reduced economic outcomes, reduced diversity of voices on the web

I'm also very interested in the Chrome team's point of view.

@erik-anderson
Member

The Privacy CG has very explicit goals around multi-implementer support and evaluating web compatibility impacts, so a characterization that it doesn't take a holistic view is, in my opinion, unfair. Current Work Items, including Storage Access API and Private Click Measurement, are designed to provide capabilities to help address some of the concerns outlined. Privacy considerations will be an important part of the conversation no matter where this is incubated.

The Privacy CG has a more regular cadence for discussion than WICG (which is designed to be lightweight), including twice-a-month teleconferences, breakout sessions, and face-to-faces. It's likely to get more focused time and attention from a diverse set of interests, including both the ads industry and browser developers. As a result, I believe it's likely to move forward more quickly in the Privacy CG.

@bslassey

bslassey commented Jun 6, 2020

> First Party Sets aim to relax privacy (and potentially security) protections on the web.

I'll disagree with this framing. First Party Sets are aiming to establish a well defined notion of first parties that can safely maintain existing capabilities granted to third parties in order to enable browsers to put greater restrictions on true third parties. It's also important to note that both Firefox and Edge have seen the need to use entities.json for a similar purpose. So this is hopefully standardizing that existing behavior.
If we end up doing an origin-based FPS, this could allow a tightening of same-site security boundaries. I think this opportunity is really interesting, if only for the potential to deprecate or at least reduce the dependency on the PSL.

@johnwilander

> > First Party Sets aim to relax privacy (and potentially security) protections on the web.
>
> I'll disagree with this framing. First Party Sets are aiming to establish a well defined notion of first parties that can safely maintain existing capabilities granted to third parties in order to enable browsers to put greater restrictions on true third parties. It's also important to note that both Firefox and Edge have seen the need to use entities.json for a similar purpose. So this is hopefully standardizing that existing behavior.
> If we end up doing an origin-based FPS, this could allow a tightening of same-site security boundaries. I think this opportunity is really interesting, if only for the potential to deprecate or at least reduce the dependency on the PSL.

Sorry, I should have been more precise. Today, a third-party means differing registrable domain from the top frame. With FPS, the intention is to, for at least some engine decisions, treat some such differing registrable domains as first party. That to me is a relaxation.

But all of this should be discussed in issues, not the proposal. 🙂 I've been wanting to solve this for years, as shown by my two pitches of the idea to WebAppSec in 2017, and I really hope we can get to a definition that holds over time as new business decisions are made based on the existence of FPS and that meets user expectations. I even have some ideas for how to resolve some things. I’ll share once we have a repo.
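
To make the "relaxation" point in this comment concrete, here is a small illustrative sketch in Python. It is not taken from the proposal: the function names and example domains are hypothetical, and the inputs are assumed to already be registrable domains (deriving those from a URL needs the Public Suffix List, which is out of scope here). It compares the classification described as today's behavior with one that consults a list of sets.

```python
from typing import Iterable, Set


def is_third_party_today(top_frame: str, resource: str) -> bool:
    """Today: third party means a differing registrable domain."""
    return top_frame != resource


def is_third_party_with_sets(
    top_frame: str, resource: str, sets: Iterable[Set[str]]
) -> bool:
    """With sets: a differing domain in the same set is not third party.

    Which engine decisions would actually use this check is left open
    in the discussion; this only models the classification itself.
    """
    if top_frame == resource:
        return False
    return not any(top_frame in s and resource in s for s in sets)
```

For a hypothetical set containing `shop.example` and `shop-cdn.example`, the first function treats the pair as third party to each other while the second does not; an unrelated domain is third party under both.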

@annevk

annevk commented Jun 6, 2020

> existing capabilities

  1. I think allowing for this at all was a design mistake.
  2. Standards have always allowed for different policies to prevent tracking, e.g., https://html.spec.whatwg.org/#user-tracking. What Safari has done and others are doing as well is making that the default.

> It's also important to note that both Firefox and Edge have seen the need to use entities.json for a similar purpose. So this is hopefully standardizing that existing behavior.

How would that work without it being centrally managed?

@krgovind
Author

krgovind commented Jun 6, 2020

> > It's also important to note that both Firefox and Edge have seen the need to use entities.json for a similar purpose. So this is hopefully standardizing that existing behavior.
>
> How would that work without it being centrally managed?

@annevk I think what you're advocating for is a centralized/unified UA policy as defined in the current proposal, in order to enable standardization? Please feel free to open an issue on the repo with that suggestion. :)

@jackfrankland

I'd like to make a quick argument for the proposal #11, if I may :)

Instead of defining a relationship between domains, I believe a better solution is to define the relationship between a domain and the business that owns it. A business may own multiple domains, and therefore relationships between domains can be inferred, potentially serving the same goals as first party sets. In just this regard I believe it has the following advantages:

  • There is no need for a possibly arbitrary owner domain.
  • It's not necessary to make requests to two domains to confirm the relationship.
  • I believe there is value in knowing the business/entity that has access to the user's data for a domain, and that this relationship is a more easily defined thing that user agents can freely use to determine differing behaviour. This is in contrast to something that arguably has less meaning/value by itself when it depends on dynamic UA policy for its definition. In order for first party sets to be most successful, it may require consistency of behaviour between user agents, which could be difficult.
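
As a rough illustration of the alternative argued for here, the sketch below (Python) stores a domain-to-business mapping and infers relationships between domains from it. The table contents, business names, and function names are all hypothetical; how such a mapping would be published or verified is not modeled.

```python
from collections import defaultdict
from typing import Dict, Set

# Hypothetical published mapping: registrable domain -> owning business.
BUSINESS_OF: Dict[str, str] = {
    "brand-a.example": "Example Holdings Ltd",
    "brand-b.example": "Example Holdings Ltd",
    "unrelated.example": "Other Co",
}


def same_business(a: str, b: str, business_of: Dict[str, str]) -> bool:
    """Infer a domain-to-domain relationship from shared ownership.

    A domain with no declared business is never related to anything.
    """
    owner = business_of.get(a)
    return owner is not None and owner == business_of.get(b)


def domains_by_business(business_of: Dict[str, str]) -> Dict[str, Set[str]]:
    """Group domains by business, giving the inferred sets."""
    groups: Dict[str, Set[str]] = defaultdict(set)
    for domain, business in business_of.items():
        groups[business].add(domain)
    return dict(groups)
```

In this sketch no owner domain is needed, and each domain's declaration can be read on its own; the set of related domains falls out of grouping by business.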

@bslassey

bslassey commented Jun 8, 2020

> Sorry, I should have been more precise. Today, a third-party means differing registrable domain from the top frame. With FPS, the intention is to, for at least some engine decisions, treat some such differing registrable domains as first party. That to me is a relaxation.

I just want to be really clear about this point, while FPS establishes a set of domains that are owned/controlled/run by the same party, it is not suggesting to treat them as first party to each other such that they would be equivalent to subdomains of the same registrable domain. Perhaps this was a mistake in naming (perhaps "Entity Sets" would be better to put it in the context of entities.json and happy to revisit that choice).

> > existing capabilities
>
>   1. I think allowing for this at all was a design mistake.

And that is why we are all looking to reduce the capabilities of third parties, which this helps to enable. Or are you suggesting that those capabilities are too powerful to allow for a set of domains that are owned/controlled/run by the same party?

>   2. Standards have always allowed for different policies to prevent tracking, e.g., https://html.spec.whatwg.org/#user-tracking. What Safari has done and others are doing as well is making that the default.
>
> > It's also important to note that both Firefox and Edge have seen the need to use entities.json for a similar purpose. So this is hopefully standardizing that existing behavior.
>
> How would that work without it being centrally managed?

As @krgovind pointed out, central management is certainly a possibility, but defining what that means is important. I am very much of the opinion that the current central management isn't working very well for a number of reasons (no clear policy, sets that are clearly wrong, lack of awareness or opt-in from affected entities, etc.).

@hober
Member

hober commented Jun 8, 2020

@krgovind, would you like to talk about this during this week's telcon? If so, please add the 'agenda+' label to this issue. Thanks!

@krgovind added the label "agenda+" (Request to add this issue to the agenda of our next telcon or F2F) on Jun 8, 2020

@johnwilander

johnwilander commented Jun 9, 2020

> > Sorry, I should have been more precise. Today, a third-party means differing registrable domain from the top frame. With FPS, the intention is to, for at least some engine decisions, treat some such differing registrable domains as first party. That to me is a relaxation.
>
> I just want to be really clear about this point, while FPS establishes a set of domains that are owned/controlled/run by the same party, it is not suggesting to treat them as first party to each other such that they would be equivalent to subdomains of the same registrable domain. Perhaps this was a mistake in naming (perhaps "Entity Sets" would be better to put it in the context of entities.json and happy to revisit that choice).

This is very similar to the discussion back in spring of 2017 when I called this proposal Same-Origin Policy v2. People thought I proposed relaxing parts of the existing same-origin policy, similar to what you describe above with subdomains. That was never the case and it is not the case here where I say relaxation.

There are many more "engine decisions" made on first versus third party than same-origin policy ones. I went through some of them back in 2017 and would like to explore them anew as part of this work item. Some examples:

  • Partitioning. You could envision a joint FPS partition or even no partitioning within an FPS.
  • Cookie blocking.
  • CORS preflights. You could envision a cross-origin resource not having to do preflights if it's loaded by a website within its FPS. That mode could be opt-in. (You could argue that this is actually a case of relaxing the existing same-origin policy. 🙂)
  • Storage Access API decisions on prompting or wording in the prompt.
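
The partitioning example in the list above can be sketched as follows. This is a hypothetical illustration in Python, not any engine's actual storage-key scheme: the `Partitioning` modes, the `storage_key` function, and the `owner_of` table are names invented here, and inputs are assumed to be registrable domains. It contrasts keying embedded storage on the top-frame site with keying it on the top frame's set.

```python
from enum import Enum
from typing import Dict, Tuple


class Partitioning(Enum):
    PER_SITE = "per-site"    # partition by the top-frame site
    JOINT_SET = "joint-set"  # one shared partition per set


def storage_key(
    top_frame: str,
    embedded: str,
    owner_of: Dict[str, str],
    mode: Partitioning,
) -> Tuple[str, str]:
    """Return (partition, embedded site) as an illustrative storage key.

    owner_of maps a set member to its set's owner domain; a site that
    is not in any set forms its own partition in both modes.
    """
    if mode is Partitioning.JOINT_SET:
        partition = owner_of.get(top_frame, top_frame)
    else:
        partition = top_frame
    return (partition, embedded)
```

With a hypothetical set of `news.example` and `news-video.example`, an embedded widget gets two separate stores in the per-site mode and a single shared store in the joint mode, while a top-level site outside the set is still partitioned on its own.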

@englehardt

It would be helpful to understand precisely the problems we’d like to solve with First Party Sets, and why those problems can’t be solved through other web platform features or proposals (e.g., the Storage Access API).

The definition of “first party” should be clear and understandable to users, web developers, and publishers. The simplest, most natural approach is to enforce a strict one-to-one mapping between first party and registrable domain (i.e., eTLD+1) or a narrower selector (e.g., origin). Using information from the top-level URL is the ideal way to indicate first party because this is already familiar to most users, it is based on a unique identifier for the website owner, it is consistent across web browsers, it is visible in the address bar, and is even visible in a URL to a page that has not yet been visited.

Unfortunately, a definition of first party based on top-level URL isn’t compatible with all sites on the web today. Some cross-site applications expect unrestricted access to third-party cookies. For this reason, Mozilla has deployed Disconnect’s entity list. This is a web compatibility intervention that we hope to deprecate as fewer browsers support third party cookies and fewer sites rely on them.

Standardizing such an intervention through First Party Sets solidifies new means of cross-site communication that are unintuitive, and that reduce the accountability a site has to a user. This is opposite of the direction we'd like to move the web.

Shared membership in a First Party Set is not easily discoverable. Why should a user expect that a visit to siteA-flowers.example would automatically be correlated to their siteB-roses.example account? We should not have to rely on their shared ownership being implicit knowledge. We don’t see an additional “UI treatment” that will fix the unwanted surprise.

Requiring the user agent to enforce a policy puts too much onus on the user agent in constructing a policy and rules for determining which First Party Sets are permitted. Inconsistent application of those rules, especially between different browsers, creates considerable uncertainty for sites. This creates compatibility problems for all browsers that are most felt by smaller actors, and may force browsers to adopt the most permissive of the policies (as pointed out by Maciej). This might be alleviated by agreeing to a common set of rules, but we don’t expect to reach agreement on those rules, leaving uncertainty where there is no agreement.

These issues seem fundamental to the design of the proposal, and hence Mozilla is not supportive of First Party Sets.

@englehardt

> > First Party Sets aim to relax privacy (and potentially security) protections on the web.
>
> [snip] It's also important to note that both Firefox and Edge have seen the need to use entities.json for a similar purpose. So this is hopefully standardizing that existing behavior.

To respond to this specifically: we found entities.json to be necessary for web compatibility, but that shouldn't be used as justification for standardizing such functionality. The need arises from shipping protections in the face of applications that rely on the legacy functionality of permissive third-party cookie policies at a time when blocking some (or all) third-party cookies was not a shared goal among browser vendors. It's something we can do without requiring websites to change, but First Party Sets requires change.

@brodrigu

> This creates compatibility problems for all browsers that are most felt by smaller actors

While use-cases of larger actors are clearer and these actors have the resources to be more vocal and represented, we should be cautious about prioritizing the larger actors' use-cases above those of smaller actors, particularly if we aim to promote a dynamic and open web.

@jdwieland8282

> I just want to be really clear about this point, while FPS establishes a set of domains that are owned/controlled/run by the same party, it is not suggesting to treat them as first party to each other such that they would be equivalent to subdomains of the same registrable domain. Perhaps this was a mistake in naming (perhaps "Entity Sets" would be better to put it in the context of entities.json and happy to revisit that choice).

I think the definition of a FPS should expand to include domains acting in a cooperative fashion, otherwise FPS heavily favors big companies, Google.com & youtube.com for example.

@othermaciej

I agree with Mozilla's concerns about this proposal. However, I think it's at least possible, if uncertain, that the user-understandability, bad-faith, and interop problems can be solved, and I think it's worth a try.

@jwrosewell

A method of addressing the competing concerns the proposal highlights is needed. Two options available are:

  1. First-Party Sets were reviewed by W3C Technical Architecture Group (TAG) during May. TAG have a set of adopted Ethical Privacy Principles which would have been used to assess this, and any other proposal. Is it possible to ask TAG reviewers for their assessment regarding the competing concerns raised here?

  2. The problems described here are not new. The issues surfaced in these comments were documented in 2002 by MIT in their "Tussle in Cyberspace" document. Page 3 is particularly interesting. The “Tussle” should be settled before specific proposals dependent on the outcome of the “Tussle” are contemplated. W3C values among other documents provide some guidance.

Overall the proposal is based on a number of assumptions which do not sit comfortably with both TAG and W3C positions.

  • People are incapable of trusting a domain owner AND their supply chains. In no other industry is this the case.
  • People should not have the ability to make such trust choices.
  • The W3C should create standards to resolve matters related to commercial practice.

My broader comments to TAG review can be found here.

@jdwieland8282

> I would encourage you to engage with other proposals such as FLoC and TURTLEDOVE, which aim to preserve a vibrant and competitive open web.

@krgovind thank you for addressing our support for expanding FPS. Wrt FLoC, our concern is that the only entity with the ability to create FLoCs or cohorts is the browser; we feel that is anti-competitive. What if the FLoCs generated don't perform any better than contextual? What if the FLoCs are unpredictable? It will be very challenging to "preserve a vibrant and competitive open web" if we are made to design bidding strategies against a FLoC created by what is essentially a black box to us.

Our desire for expanding FPS tracks directly to our desire to have another trusted entity that can create FLoCs or cohorts. That trusted entity will need cross domain identity signals to build viable cohorts, similar to what the browser will use, except different in one important way. Whereas the browser could have access to all browsing habits, we are only asking for browsing habits within the FPS.

@michaelkleber

Hello @jdwieland8282,

I do believe that some of the ideas on how to build TURTLEDOVE-style interest groups should support your desire here: a bunch of sites that band together and jointly create ad targeting audiences based on activity on any of those sites.

For prior discussion of more powerful ways to build audiences, check out TURTLEDOVE issue #26, Criteo's SPARROW version, and Facebook's approach. But there's room for a lot of flexibility here.

It sounds like you also want to limit these audiences so that they can only be targeted while someone is visiting that same collection of sites? That hasn't come up before, but it would be an easy feature to add.

Anyway, if your goal is building cohorts to target ads at, please work with us in making the TURTLEDOVE/SPARROW idea space support your needs.

@jdwieland8282

Hi @michaelkleber,

> I do believe that some of the ideas on how to build TURTLEDOVE-style interest groups should support your desire here: a bunch of sites that band together and jointly create ad targeting audiences based on activity on any of those sites.

Not entirely. TD interest groups do an OK job at retargeting, but there is no mechanism for finding the "next 1000" customers interested in my product or service. Modeling, the idea that given a seed one can predict what other users will be interested in, is essential to Ad Tech and is more or less what (based on my understanding) FLoC does. Criteo's SPARROW version is promising, but FB's proposal won't work for publishers with limited 1st party data; FB is unique in that they have many, many users who generate lots of 1st party data which can be used for a seed and modeled audiences.

> It sounds like you also want to limit these audiences so that they can only be targeted while someone is visiting that same collection of sites? That hasn't come up before, but it would be an easy feature to add.

This is not a conclusion I would draw based on my previous comments. I think we can set it aside for now. The core point I'm making is that we need cohort creation to be possible by more than just the browsers, and the only way for small publishers to generate enough data/signal for this cohort creation is for them to be able to share data horizontally among themselves (not necessarily w/ advertisers), e.g., an FPS.

Thanks for your comments, I plan to attend the Sparrow Tech workshop next week.

@michaelkleber

We definitely do want interest groups to support the "next 1000 customers" use case. The SPARROW Lookalike Targeting section is explicitly about this, and I'm happy to work on how something like the FB proposal can be made available to someone who is a third party on many consenting sites, rather than one large first party.

But we (Chrome) are not interested in an approach that involves joining up individual users' browsing histories across many different sites. Our focus is on ways to build audiences that don't require giving out browsing history. First Party Sets is the wrong tool for this problem.

@johnwilander

Given the comments on separate proposals above, I think it would be useful to have a separate discussion on them to see if there is any multi vendor interest. Chrome folks, do you intend to set something like that up for e.g. Turtledove?

@michaelkleber

Yes! TURTLEDOVE & SPARROW have just moved into WICG (discourse thread), very much because we want to have multi-vendor conversations about it.

@jackfrankland

> Your proposal mentions an authority that signs domain-to-entity relationship - so there is clearly some policy at play, although you are proposing a unified policy - which I think comprises two requirements: (a) common ownership, and (b) common privacy policy - across all user agents. Others have also expressed concerns about inconsistencies between user agent policy, and that is something we are willing to work to bring consistency to.

@krgovind thanks a lot for the reply. You're right, an authority would have to follow a policy in order to sign off on the information given.

In my proposal, this would mean verifying that the correct business is being registered for the domain. The correct business should be the one that is named on the published privacy policy on the site. A published privacy policy is already commonplace, and is required by law in certain jurisdictions (e.g. https://gdpr.eu/privacy-notice/). The proposal's main aim is to have some of this information readable by the user agent programmatically, in an effort to reduce the over-prevalence of consent overlays, and to foster better transparency / control of the user's data.

My proposal does not go as far as defining UA behaviour/policy, how it should treat two domains owned by a common business, or requirements for their privacy policies to be the same. In that respect, the goals for these two proposals are quite different. However, my argument is that the publication of the domain-to-business relationship suits the goals for this proposal nicely, and may be more useful than the publication of a domains-to-domains relationship - especially if the policy for this proposal ends up being that the domains must have matching business ownership according to their privacy policies.

@hober
Member

hober commented Jul 24, 2020

We have consensus among the @privacycg/chairs and @krgovind (as required by our charter) to adopt this as a Work Item, with @krgovind and @davidben as Editors. I'll work with the @WICG/chairs to transfer the repository over soon.

@krgovind added the agenda+ label (Request to add this issue to the agenda of our next telcon or F2F) on Aug 10, 2020
@krgovind
Author

If they indeed did that it seems like it would make it much more transparent to users who they have a relationship with. How is that not a win?

@annevk I tried to explain this, but perhaps didn't do a good job. :) Essentially, forcing all sites to move to subdomains of their parent/owner domains would have, in my example, manifested as flickr.com moving to flickr.yahoo.com, then flickr.verizon.com and subsequently to flickr.smugmug.com. This would train users to stop paying attention to the registrable domain, and focus only on the subdomain. Thus, it would make them susceptible to entering their credentials on mybank.evil.com, because mybank is in the subdomain, and lead the user to think "perhaps mybank was recently acquired by evil.com"?
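To make the registrable-domain point concrete, here is a minimal sketch of how the eTLD+1 is derived from a hostname. It assumes a tiny hard-coded suffix set (`PUBLIC_SUFFIXES`) standing in for the real Public Suffix List, and the function name `registrable_domain` is purely illustrative:

```python
# Hypothetical stand-in for the Public Suffix List; real code should
# consult the full, maintained list rather than this toy set.
PUBLIC_SUFFIXES = {"com", "co.uk"}


def registrable_domain(host: str) -> str:
    """Return the eTLD+1 of `host`, using the toy suffix set above."""
    labels = host.lower().rstrip(".").split(".")
    # Walk candidate suffixes from longest to shortest so that a
    # multi-label suffix such as "co.uk" is matched before a shorter one.
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in PUBLIC_SUFFIXES:
            if i == 0:
                raise ValueError(f"{host!r} is itself a public suffix")
            # Keep exactly one label to the left of the public suffix.
            return ".".join(labels[i - 1:])
    raise ValueError(f"no known public suffix for {host!r}")


if __name__ == "__main__":
    for host in (
        "flickr.yahoo.com",
        "flickr.verizon.com",
        "flickr.smugmug.com",
        "mybank.evil.com",
    ):
        print(host, "->", registrable_domain(host))
```

In each of the Flickr moves the registrable domain changes while the `flickr` label stays put, and `mybank.evil.com` resolves to `evil.com` in exactly the same way. A user who has learned to read only the leftmost label has nothing left to tell the two cases apart.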

I will also mention a couple of other use-cases that we learned about:

  • Sites that serve user-uploaded content may want to serve such untrusted content on a separate domain. For example, googleusercontent.com exists for this reason. Similarly, we recently ran into the example of codepen.io and cdpn.io, where cdpn.io appears to depend on 3p cookies when embedded on codepen.io.
  • We've also heard from others in PrivacyCG that organizations prefer to maintain top-level domain names as an indication of branding/identity. This is motivated by business reasons.

@pbannist - The example that you mentioned, Geico and Dairy Queen, would actually not be a valid set given our current thinking around the FPS policy. Berkshire Hathaway is a holding company, with Geico and DQ being subsidiaries. Regardless, you do bring up challenges around defining the policy in a way that stays true to first principles, but I'm confident that we can work together towards that goal.

Regarding the question of whether ownership/organization is the right principle to design FPS around, there is user research around users' expectations/comfort with being tracked within a first party. For example, see this paper. Of particular interest are Section 4.2.3, and "Trust" under Section 4.3.2.

The UX signal is the only signal that is useful to a user, and that is independent of the presence of a shared organization.

I agree that it's important to surface FPS affiliation information to users, and we are proposing that it be surfaced in the browser UX. Are you suggesting that this is not sufficient?


@jackfrankland

The proposal's main aim is to have some of this information readable by the user agent programmatically, in an effort to reduce the over-prevalence of consent overlays, and to foster better transparency / control of the user's data.

Got it. Would this be similar to the P3P project? If so, it may be instructive to study the criticisms, and address how we can overcome those issues with your proposal.

However, my argument is that the publication of the domain-to-business relationship suits the goals for this proposal nicely, and may be more useful than the publication of a domains-to-domains relationship - especially if the policy for this proposal ends up being that the domains must have matching business ownership according to their privacy policies.

Our specification of FPS as a domain-to-domains relationship is mostly an artifact of needing to find a domain to host the central/unified manifest file on. :) As I mentioned in my previous response, having a single source of truth makes verification and deployment easier. Do you envision a way that we can maintain a central manifest file using a domain-to-business relationship?
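As a rough illustration of the two shapes being compared, here is a sketch that derives the same set from a single domain-to-domains manifest and from per-domain business records. Every name, field, and domain below is hypothetical; this is not the FPS manifest format, only a way to show where the single source of truth would live in each model:

```python
from collections import defaultdict

# Hypothetical domain-to-domains manifest, hosted on one "owner" domain.
# The field names are illustrative only, not the FPS manifest format.
SET_MANIFEST = {
    "owner": "example.com",
    "members": ["example.co.uk", "example-cdn.com"],
}

# Hypothetical domain-to-business records: one assertion per domain,
# e.g. as signed by some authority.
BUSINESS_RECORDS = {
    "example.com": "Example Ltd",
    "example.co.uk": "Example Ltd",
    "example-cdn.com": "Example Ltd",
    "unrelated.com": "Other Inc",
}


def set_from_manifest(manifest: dict) -> frozenset:
    """The set is read directly from the one central file."""
    return frozenset([manifest["owner"], *manifest["members"]])


def sets_from_business_records(records: dict) -> dict:
    """The sets are derived by grouping domains that name the same business."""
    grouped = defaultdict(set)
    for domain, business in records.items():
        grouped[business].add(domain)
    return {business: frozenset(domains) for business, domains in grouped.items()}


if __name__ == "__main__":
    print(sorted(set_from_manifest(SET_MANIFEST)))
    print(sorted(sets_from_business_records(BUSINESS_RECORDS)["Example Ltd"]))
```

Both paths arrive at the same three domains here. The difference is that the first reads one file from one host, while the second has to collect and reconcile a record for every domain before a set exists at all, which is the verification and deployment cost I was referring to.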

@annevk

annevk commented Aug 17, 2020

@krgovind that very much depends on the browser UI, no? If sites all moved in that direction, browsers could respond by highlighting the registrable domain even more prominently (or only showing that).

@krgovind
Author

@krgovind that very much depends on the browser UI, no? If sites all moved in that direction, browsers could respond by highlighting the registrable domain even more prominently (or only showing that).

@annevk: I'm not seeing how browsers highlighting the registrable domain would help this situation, because in the Flickr case, the URL bar would have changed from yahoo.com, to verizon.com, to smugmug.com, with the page content itself being the only indication that it is Flickr. Would users notice if it went to somethingelse.com with the phishing page showing content identical to Flickr's?

@brodrigu

Essentially, forcing all sites to move to subdomains of their parent/owner domains would have, in my example, manifested as flickr.com moving to flickr.yahoo.com, then flickr.verizon.com and subsequently to flickr.smugmug.com. This would train users to stop paying attention to the registrable domain, and focus only on the subdomain. Thus, it would make them susceptible to entering their credentials on mybank.evil.com, because mybank is in the subdomain, and lead the user to think "perhaps mybank was recently acquired by evil.com"?

@krgovind The deprecation of 3rd party cookies would be the forcing function which would push sites to consolidate domains to retain some functional benefits they see in a 3rd party cookie world and set up the situation you are solving for above. This drive to consolidate to as few eTLD+1s as possible for functional benefit would not be limited to only 1st parties that are owned by the same organization. You could imagine sites forming a co-op or joining together in a publisher network where you might see two not-co-owned sites nyherald.co-op.com and laregister.co-op.com sharing a registrable domain. As business needs and incentives change, nyherald.co-op.com might move to nyherald.pub-network.com or nyherald.amp.com, causing the same user apathy towards the registrable domain.

Is this something you have considered?

@annevk

annevk commented Aug 17, 2020

@krgovind I think that's a good illustration as to why they might not want to do that (those domain transitions would also not be cheap I suspect).

@krgovind
Author

@brodrigu - I think you are arguing for a solution for publisher consortiums/networks. As discussed earlier on this thread, we think that those should be better served by other APIs such as TURTLEDOVE, and are ideally not compelled to join under a single domain. Note that moving registrable domains like you described also has the cost of losing access to your previous state/cookies, so that would need to be weighed against other incentives.

@annevk It sounds like you're taking the position that if a multi-domain site wanted to share data across its domains, the only way it should be allowed to do that is by taking the significant step of consolidating on a single domain? Would that recommendation stand for ccTLD domain variants, as well as for content separated for security reasons (e.g. googleusercontent.com)?

@brodrigu

brodrigu commented Aug 18, 2020

@krgovind It's important to note that the publisher consortium use case is more incentivized than co-owned domains to migrate to a shared eTLD+1 and that if the problem first party sets is trying to avoid is user domain apathy, FPS will likely not be successful if the use case isn't addressed.

we think that those should be better served by other APIs such as TURTLEDOVE, and are ideally not compelled to join under a single domain

Certainly there are tradeoffs, but the upside of sharing an eTLD+1 amongst a trusted consortium is higher than that of the currently available alternatives.

First Party Sets is a great proposal, but the rigidity of co-ownership as a requirement for set membership hinders its potential to meet a developing security concern.

Update: moved to issue: WICG/first-party-sets#17

@annevk

annevk commented Aug 19, 2020

@krgovind on a set of domains that have a common registrable domain as defined by the URL Standard, yes. It's hard enough to get users to grasp that; conveying through UI that two unrelated domains are in a set would go far beyond that and frankly does not really seem feasible.

@johnwilander

Since FPS is now a work item, can we continue the conversation in separate issues? Maybe the editors can find the cycles to migrate the subdomains vs registrable domain set discussion into an issue. 🙏🏼

@krgovind
Author

Since FPS is now a work item, can we continue the conversation in separate issues? Maybe the editors can find the cycles to migrate the subdomains vs registrable domain set discussion into an issue. 🙏🏼

Thanks for the advice, John. I've created WICG/first-party-sets/issues/19 to capture this discussion.

@hober removed the agenda+ label (Request to add this issue to the agenda of our next telcon or F2F) on Aug 25, 2020
@hober
Member

hober commented Aug 25, 2020

Closing, as this is now a Work Item.
