Talk:Cookie syncing
Cookie syncing has been listed as one of the Engineering and technology good articles under the good article criteria. If you can improve it further, please do so. If it no longer meets these criteria, you can reassess it. Review: May 10, 2025. (Reviewed version).
This article is rated GA-class on Wikipedia's content assessment scale. It is of interest to the following WikiProjects:
A fact from Cookie syncing appeared on Wikipedia's Main Page in the Did you know column on 6 March 2025. The text of the entry was as follows:
Did you know nomination
- The following is an archived discussion of the DYK nomination of the article below. Please do not modify this page. Subsequent comments should be made on the appropriate discussion page (such as this nomination's talk page, the article's talk page or Wikipedia talk:Did you know), unless there is consensus to re-open the discussion at this page. No further edits should be made to this page.
The result was: promoted by SL93 talk 01:35, 27 February 2025 (UTC)
- ... that syncing zombie cookies can create an evercookie clone?
- Source: Acar, Gunes; Eubank, Christian; Englehardt, Steven; Juarez, Marc; Narayanan, Arvind; Diaz, Claudia (2014-11-03). "The Web Never Forgets: Persistent Tracking Mechanisms in the Wild". Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security. CCS '14. New York, NY, USA: Association for Computing Machinery. p. 683. doi:10.1145/2660267.2660347. ISBN 978-1-4503-2957-6.
- ALT1: ... that the practise of matching cookies could expose personally-identifiable information? Source: Papadopoulos, Panagiotis; Kourtellis, Nicolas; Markatos, Evangelos (2019-05-13). "Cookie Synchronization: Everything You Always Wanted to Know but Were Afraid to Ask". The World Wide Web Conference. WWW '19. New York, NY, USA: Association for Computing Machinery. p. 1439. doi:10.1145/3308558.3313542. ISBN 978-1-4503-6674-8.
- ALT2: ... that cookie matching could compromise the encryption of VPNs? Source: Papadopoulos, Panagiotis; Kourtellis, Nicolas; Markatos, Evangelos (2019-05-13). "Cookie Synchronization: Everything You Always Wanted to Know but Were Afraid to Ask". The World Wide Web Conference. WWW '19. New York, NY, USA: Association for Computing Machinery. p. 1439. doi:10.1145/3308558.3313542. ISBN 978-1-4503-6674-8.
- Reviewed: Template:Did you know nominations/Antiqua et nova
Sohom (talk) 23:54, 2 February 2025 (UTC).
- General: Article is new enough and long enough
- Policy: Article is sourced, neutral, and free of copyright problems
- Hook: Hook has been verified by provided inline citation
- QPQ: Done.
Overall: Prefer both ALT1 and ALT2 for general understandability for the main page audience, though I admit the term "zombie cookie" is a lot of fun. Will leave up to the promoter to decide which of the three they like best. Earwig pass with 2.0%. Next time you nominate, please remember to use "moved to mainspace" instead of created; it makes it easier for your future reviewers to look for that. ThaesOfereode (talk) 17:27, 4 February 2025 (UTC)
- @ThaesOfereode Will keep the "Moved to mainspace" message in mind. Regarding the hooks, we could do something like the following:
- ALT 3: ... that syncing zombie cookies can create a cookie that is almost impossible to crumble?
- ALT 4: ... that syncing zombie cookies can create a cookie that is almost impossible to delete?
- Lmk if these would work better (I recognize that it still might be unintelligible to a user if they are not familiar with web security/privacy, but still worth a try). Sohom (talk) 22:17, 5 February 2025 (UTC)
- @Sohom Datta: ALT4 is definitely fine with me. ALT3 is a little more opaque, but I wouldn't object to it being promoted. ThaesOfereode (talk) 23:36, 5 February 2025 (UTC)
GA review
The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.
- This review is transcluded from Talk:Cookie syncing/GA1. The edit link for this section can be used to add comments to the review.
Nominator: Sohom Datta (talk · contribs) 09:09, 25 April 2025 (UTC)
Reviewer: Vanamonde93 (talk · contribs) 16:41, 2 May 2025 (UTC)
Claiming this one. I am not a software specialist, but this seems straightforward enough that I ought to be able to handle it. Vanamonde93 (talk) 16:41, 2 May 2025 (UTC)
- Sure, looking forward to the review, let me know if you get stuck on something. Sohom (talk) 16:49, 2 May 2025 (UTC)
- Passing per my comments here notwithstanding my note in the last spotcheck, but please take a look at that when you have a chance. Nice work, it's not a trivial task to write about a contemporary and technical subject. Vanamonde93 (talk) 16:42, 10 May 2025 (UTC)
Checklist
GA review – see WP:WIAGA for criteria
- Is it well written?
- A. The prose is clear and concise, and the spelling and grammar are correct:
- All my minor comments have been addressed
- B. It complies with the manual of style guidelines for lead sections, layout, words to watch, fiction, and list incorporation:
- No issues here
- Is it verifiable with no original research?
- A. It contains a list of all references (sources of information), presented in accordance with the layout style guideline:
- No formatting issues
- B. All in-line citations are from reliable sources, including those for direct quotations, statistics, published opinion, counter-intuitive or controversial statements that are challenged or likely to be challenged, and contentious material relating to living persons—science-based articles should follow the scientific citation guidelines:
- See spotchecks below. All text is cited adequately.
- C. It contains no original research:
- Spotchecks are clear
- D. It contains no copyright violations nor plagiarism:
- Spotchecks are clear
- Is it broad in its coverage?
- A. It addresses the main aspects of the topic:
- B. It stays focused on the topic without going into unnecessary detail (see summary style):
- No extraneous material.
- Is it neutral?
- It represents viewpoints fairly and without editorial bias, giving due weight to each:
- Is it stable?
- It does not change significantly from day to day because of an ongoing edit war or content dispute:
- No stability issues
- Is it illustrated, if possible, by images?
- A. Images are tagged with their copyright status, and valid fair use rationales are provided for non-free content:
- Diagrams created by the author, no issues
- B. Images are relevant to the topic, and have suitable captions:
- Overall:
- Pass or Fail:
Comments
- This article does a good job overall of making its contents accessible. There are a few instances in which its language remains opaque to the layperson, dropping some examples here:
- "allowing them to link identifiers"
Fixed Sohom (talk) 16:08, 9 May 2025 (UTC)
- "bid on an impression in real time through automated means" (specifically, "impression" is context-specific jargon)
Fixed by explaining impression inline Sohom (talk) 16:08, 9 May 2025 (UTC)
- "from learning information about other non-affiliated sites" (non-affiliated relative to what?)
- Let me know if the current wording works, it should be "non-affiliated to the website the user is currently on" Sohom (talk) 16:08, 9 May 2025 (UTC)
- If possible, it would be nice to drop some dates into what is otherwise a timeless article. When did this practice arise? When was the GDPR promulgated? This isn't a major issue, to be clear, but would contextualize some pieces of it better.
- I've added the year GDPR was promulgated, I couldn't find sources mentioning when the practise started (the first known discovery was in the 2016 study, so it must be sometime before that probably? :) Sohom (talk) 16:08, 9 May 2025 (UTC)
- Here we are reaching the limit of my technical knowledge: cookies are not necessarily ad-related, yes? So can we unequivocally state that cookie syncing only follows ads?
- Cookie syncing primarily occurs in ads or in user-tracking contexts; there isn't a reason/incentive for non-ad/non-tracking cookies to be synced. While we can't unequivocally say that it doesn't happen in other contexts, research hasn't shown widespread use in other places. Sohom (talk) 16:08, 9 May 2025 (UTC)
- "Advertisers can, and often do, bid on new users for whom they do not have existing cookies" this seems to be a complicated way of saying people without cookie synced profiles still see ads...what am I missing?
- The sentence after that is the emphasis here.
Winning the bid enables them to serve ads to the user and simultaneously perform cookie syncing, thereby augmenting their dataset for future ad auctions.
The point is to say "Advertisers can and will perform cookie syncing even if they don't have an existing profile about you". (Let me know if I should be rewording this) Sohom (talk) 16:08, 9 May 2025 (UTC)
- Unless there's any doubt over the findings of the 2019 paper, I think we can say it in Wikipedia's voice: there isn't much wiggle room there, is there? The data can be compromised, or it can't.
- With the popularization of HTTPS, the findings of the paper are much weaker (the chances of the attack happening are almost nil nowadays), necessitating the "they said then that this could happen; we can't comment on the attack's current efficacy" voice. (Let me know if I should be rewording this) Sohom (talk) 16:08, 9 May 2025 (UTC)
- Similar with the other instances of in-text attribution: I would simplify to "A 2016 survey..." etc. I appreciate the care here, but in the absence of contention we should simplify prose.
Done Sohom (talk) 16:08, 9 May 2025 (UTC)
- I did a sweep for sources: there's a lot of web sources of reasonable reliability covering the topic, but I don't immediately see any content they have that would be encyclopedic and isn't included here. I found this source [1] about the future of this practice, but it seems to me to be a perspective piece, and I wouldn't require its inclusion. Despite the reference list not being long, criterion 3a is met, in my view. Vanamonde93 (talk) 16:42, 10 May 2025 (UTC)
Sources
I am hesitant about some of the sources used here: of the eight sources, four appear to be conference proceedings, one is a preprint, and one is a guide written for...who exactly? I'm flagging this for discussion because the average conference proceeding at a genuine conference (ie, from a real scientific or professional society, rather than the predatory or pay-to-play equivalents) is likely reliable, but whether it undergoes peer review depends on the conference. A preprint, too, isn't peer-reviewed. That said, I'm willing to be persuaded that these are in fact the best sources on the subject. Can we talk through the five sources I have flagged? Is there evidence of peer review for the proceedings, or evidence that the authors are subject-matter experts, or corroborating sources that have been peer-reviewed?
- So, quick response, this is an artifact of the field. In general, academia in the field of computer security (much like AI or other CS disciplines) really likes publishing in conferences, and papers published in conferences are often considered to be more rigorously peer-reviewed than journal submissions. (I know for a fact there are security professors in R1 universities who have only like 2-3 journal publications, this guy for example). You can see the Google Scholar rankings of the top conferences in the area here. Most of the conferences cited in the article are among the top 10 conferences in the area and have extremely rigorous peer-review processes. You could make an argument that
"Exclusive: How the (Synced) Cookie Monster breached my encrypted VPN session"
may not have gone through a rigorous peer review process since it was published in a workshop. However, the workshop does mention that it performs peer review and is colocated with EuroSys (which is a pretty well-respected conference in computer systems research). Regarding the guide, this is documentation published by Google for folks building ad technology companies/solutions. Google is one of the largest ad companies and this is their guide on how it manages industry standard processes like real time bidding. I would personally consider their explanation of the process to be generally reliable, even if it is a primary industry source describing its own functions. Regarding the pre-print, the first link in the citation should be the actual published paper; the preprint is provided because the first link is behind a paywall (it should be accessible through TWL). I will throw in a more detailed analysis of the individual conferences in a bit. Sohom (talk) 19:56, 4 May 2025 (UTC)
More detailed analysis follows,
- This seems like a journal submission, but it is in fact a conference proceeding; the conference (NDSS) is organized by Internet Society and does/did have double-blind peer-review standards. Additionally, the last author (which in cybersecurity is typically the person overseeing the work) is Claude Castelluccia, who is a research director at INRIA and definitely a subject-matter expert on the topic of web privacy. I wasn't able to find a ton about the first author [2] but they appear to be an Invited Expert for W3C working groups, which (imo) implies they are a WP:SME as well (NDSS is also one of the big 4 conferences of cybersecurity)
- Guide written by Google, explaining ad-tech industry standards. The goal of the documentation is to explain how to interoperate with Google servers. I would consider it a primary source, but a reliable one since Google doesn't have an incentive to lie about its own APIs or ad-tech industry standards in a document expressly meant to help interoperability.
- This is from an econ journal. I'm not super familiar with econ journals as they go, but scimago does not rate it badly and it appears to be scopus indexed. The first author is an Associate Professor at Cornell [3] and the overseeing author is a research director at Google [4]. I would consider them both to be subject-matter experts.
- This is also a conference that does have/had a stringent peer review process. The first author, Gunes Acar, is an associate professor at Radboud University [5] and the last author, Claudia Diaz, was a professor at KU Leuven working on privacy [6] (they have since switched to working part-time). CCS is also one of the big 4 cybersecurity conferences.
- (I've already tackled this in my response above)
- This is one of the seminal research papers in web security; it was published at CCS (which does/did peer-review) and is probably the first paper trying to understand web security at a large scale. The first author has since worked with the Federal Trade Commission, DuckDuckGo, and Mozilla, and I would consider them to be a WP:SME; the overseeing author is Arvind Narayanan, who is the current director of Princeton's Center for Information Technology Policy and should be a WP:SME.
- This was published at World Wide Web Conference, while this conference is not one of the top conferences in security, it is the top conference for research about web technologies (and their 2019 edition did feature peer-review). The overseeing author[7] is a professor at Foundation for Research & Technology – Hellas and a director of the cybersecurity department of the institute, I would consider them a WP:SME. I was not able to ascertain much about the first author except that ACM says that they had a brief stint at Brave, while I would consider them to be a WP:SME, given that the paper went through peer-review and had SME authors, I would still push for inclusion.
- This paper was also published in a conference, ASIA CCS; the conference has/had a peer-review process and is considered among the top 10 conferences in cybersecurity. The first author appears to be a professor at a German institute (Westphalian University of Applied Sciences) [8], and I would consider them a WP:SME. The overseeing author is Norbert Pohlmann, who is also a WP:SME on security and privacy (same with Thorsten Holz, who is a professor at an institute under the Helmholtz Association [9] and whom I would consider a WP:SME)
Sohom (talk) 01:31, 5 May 2025 (UTC)
- Vanamonde93, I've left comments against some of your points and actioned some of them. Wrt to the reliability of the sources, I've done a detailed breakdown of the peeps writing the sources. Lemme know if they make sense or if I should be making other/more changes. Sohom (talk) 16:18, 9 May 2025 (UTC)
- My apologies, I was pulled away by RL more than expected this week: and what time I had left was consumed by commenting at ARCA and some related content pages. I'll get to this before Sunday though, possibly later today. Vanamonde93 (talk) 00:30, 10 May 2025 (UTC)
- Based on your responses, I am convinced that all sources meet the requisite standards, except for the Google guide [10]. Google dominates its field, but it is an involved party, not an independent source - and like every involved party it has a vested interest in presenting itself in the best possible light. Is the source replaceable? You could also rephrase to attribute any material directly to Google. In the meantime I'll go ahead with spot-checks - otherwise this looks good. Vanamonde93 (talk) 03:54, 10 May 2025 (UTC)
- I've removed the Google guide, since in the only place it was used, the info was already covered by a research paper. Sohom (talk) 14:47, 10 May 2025 (UTC)
Spotchecks
- Fn1a: checks out
- Fn1b: checks out
- Fn5: checks out
- Fn6: checks out
- Fn7: checks out in large part, I can't see where it explicitly says that the GDPR does not prohibit cookie syncing, but it's likely I'm missing it, as the entire paper is relevant to this subject.