Wikipedia talk:WikiProject Chemicals/Archive 2025
This is an archive of past discussions on Wikipedia:WikiProject Chemicals. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Carboxypeptidase
Could you make an image like this, but with a carboxypeptidase acting at the other end of the tetrapeptide, to illustrate this article? Thanks. 2A02:3037:2E0:3F0F:AC77:E39:1D98:1FE (talk) 17:27, 12 January 2025 (UTC)
Appropriate amount of historical information
How detailed should historical sections be in articles about chemicals? Most chemical sources don't seem to care, but I assume that at least the discoverer(s), a date, a link to the original publication and a description of the way of its discovery are the bare minimum, am I right? What about historical names and historical synthesis routes? 5.178.188.143 (talk) 10:56, 22 November 2024 (UTC)
- That all sounds good to include. But if it grows to over a page, then a separate history article would be due. Graeme Bartlett (talk) 11:44, 22 November 2024 (UTC)
- IP editor: There is general guidance for chemicals at MOS:CHEM/Chemicals. My view is that we need reliable secondary sources but if we have them, we should use them. However, we certainly don't need to list all the possible names/code numbers for chemicals (ChemSpider and PubChem do these) nor every possible synthesis. History sections are worthwhile: I recently wrote an article about substructure search and found its history more interesting than the basic topic, mainly because I already knew about that! Mike Turnbull (talk) 12:13, 22 November 2024 (UTC)
- IMHO, you need to exercise judgement. What is "interesting" is hardly universal. In terms of history, some areas are very "clubby", and some editors use history to engage in WP:BOOSTERISM (e.g., overemphasizing institutions). Older organic chemists, those members of the Cult of RB Woodward, favor name dropping (and name reactions). To some extent, history sections are best written by citing an article ON the history of the compound/synthesis/chemist. Otherwise, a semi-long excursion based on an editor's reading of the literature trail becomes close to WP:OR. My two cents. --Smokefoot (talk) 14:41, 22 November 2024 (UTC)
- As for historical names, I indeed don't mean every single one in existence but only those relatively prominent, such as names coined by discoverers or major pre-Geneva names widely used in 18th-19th cc., e. g. trimethylene for cyclopropane.
- As for sources, I struggle to understand you, could you please reword? 5.178.188.143 (talk) 11:28, 23 November 2024 (UTC)
- You seem to understand the loose guidelines we follow. One specific thing that I was trying to say: one risks engaging in OR if one writes extensively (multiple sentences) on history without a source that discusses that history. So, if one were to discuss the history of cyclopropane, one would cite a source that analyzed that history. My other remark is snarky: organic chemists seem to focus on a pantheon (a collection of gods) of "pioneers", which is probably of little interest to most readers.--Smokefoot (talk) 15:03, 23 November 2024 (UTC)
- I was replying to Mike but let me respond to you as well: does a reference book from 1890s-1910s qualify as a suitable secondary source if it discusses the history? 5.178.188.143 (talk) 19:29, 23 November 2024 (UTC)
- P. S. Let's say a history section goes like this: a certain compound was initially identified in a natural source by A in 18aa, who called it α; in 18bb, B reproduced A's research and extensively studied the compound, modifying the name to β, which is still occasionally used; it was first synthesized by C in 18cc, who also proposed a couple of possible structures for it; the discovery of certain type of reactions by D allowed his student E to determine which structure is correct in 18ee; the first industrial synthesis route, which was still used in China as of 2010, was developed by F and G in 19gg, and it was commercialized by Company H in 19hh. Assume that every claim is sourced but there's no single source describing all the history from the beginning to the end. Which combination of sources would suffice for such a compilation? — Preceding unsigned comment added by 5.178.188.143 (talk) 07:51, 24 November 2024 (UTC)
- To reduce use of primary sources, you can use the information in later sources about the predecessors' work. Though check the earlier sources to confirm if reporting is accurate. Use reviews and textbooks if possible. I would think it would be WP:DUE to include this history. Problems occur when it is recent research, and writing is based on press releases from institutions which will puff up importance of the work. Graeme Bartlett (talk) 20:22, 27 November 2024 (UTC)
- Let me update this thread with an illustration of the core disagreement Smokefoot and I apparently have. The earliest literature mention of DMPU can be dated to 1976 and a certain team of three chemists from the University of Cincinnati (one of whom also worked for Exxon, but I doubt it's notable). I believe we must indicate who and when discovered the compound, but Smokefoot removed that information from the article, describing it as 'c-h-e-e-z-e-y' (sic). What does this community believe about this situation? 5.178.188.143 (talk) 19:51, 18 January 2025 (UTC)
- The question is always one of notability. Is info notable? If it is, we want it. If it's not, we don't want it.
- Then there are practical issues. Try to envision Wikipedia where every innovation acknowledges either or both the inventor(s) and the institution? Who did that synthesis, and where? Who determined the X-ray structure, and where? Who showed that compound x was effective for an application and where? Such gestures are no doubt well-intended, but IMHO they risk gumming up the WikiChem machinery by adding lots of content that does not contribute to what readers seek. As a practical matter, one can imagine how many times we would be mentioning Univ of Cambridge, Harvard, ETH? There is no problem if one were to write an article about U of Cincinnati and point out the contributions from that institution. Readers (who are interested in UofC) would find such info relevant. But it is (understatement) unlikely that readers seeking info on a particular tetraalkylurea want to know about UofC's role in their popularization.
- Such shout-outs also open another can of worms: fights over precedent. Who did what first, which can get testy. Even more so because we don't really like to cite patents here, and often patents precede publications on money-making stuff. I think many editors have no problem with acknowledging those individuals who led major advances. Nobel Prizes etc.
- Lastly, there is a small problem of COI. People associated with UofC will be prowling articles to add such info because they are justifiably proud of their institution. COI is a problem in Wikipedia. Some COI is malicious, but most is just plain boosterism (WP:BOOSTERISM), distracting or tangential info.--Smokefoot (talk) 21:22, 18 January 2025 (UTC)
- Let's first limit ourselves only to the initial discovery for simplicity, consider cases where there's no doubt about the precedent, and start from a single person. Sometimes the discoverer is notable themselves, but more often they are not. We always have their surname, almost always first name and affiliation as well, and sometimes birth and death years are available. Which of them are notable, in your opinion?
- Next, let's move on to two people as co-discoverers. Most often it's a doctoral student and their scientific advisor, the latter is sometimes personally notable, the former is quite rarely in my experience. If neither is, again out of their names, affiliation (almost always common) and years of life, what do you see as notable?
- If there are three or more co-discoverers, I agree that listing names or years of life would be too cumbersome. If you want to omit the affiliation like in this case, then only their nationality is left. Do you see that as notable?
- Regarding unclear precedents, I agree that this is a can of worms that should be managed carefully, especially since people often don't cite previous work properly. However this has never stopped Wikipedians from describing history of scientific discoveries in articles on topics other than chemistry.
- Your argument that "often patents precede publications on money-making stuff" makes sense indeed. But why wouldn't we cite the earliest available patent, at least if it itself is cited in secondary sources, like it has always been done in articles on history of technology?
- The last paragraph reads as an allusion that I might be affiliated with UofC, so I would like to make sure that I have never been affiliated with any US (or W. European) higher ed or research institution (I have only ever studied and worked in E. Europe). 5.178.188.143 (talk) 09:35, 19 January 2025 (UTC)
- I generally agree. Editors avoid patents because they are unrefereed, I think. Regarding the specific trigger of this exchange, "[DMPU] was introduced by chemists from the University of Cincinnati as an analog of tetramethylurea in 1976". Seems contrived (cheezy). The compound was made well before 1976 by someone (at the illustrious ICI), who tangled with making di-secondary diamines, as indicated on the Talk page. Not to make a big deal.--Smokefoot (talk) 14:58, 19 January 2025 (UTC)
- Could you please answer the questions on notability I asked?
- Regarding the patents, indeed someone could file an application on what they have never synthesized, but do you think large chemical companies practice that?
- Regarding DMPU, if I made a factual mistake then these UofC folks won't be notable indeed. On which talk page is anything indicated about ICI or other earlier syntheses? I couldn't find anything anywhere and would be grateful for your help! 5.178.188.143 (talk) 20:05, 19 January 2025 (UTC)
- IP editor. The first mention of DMPU I can find in the literature is in GB560700, where Bill Boon (a well-known ICI chemist owing to his work on paraquat) mentioned it as a side-product when he was making certain ureas. It is in example 1 of that patent, and is there called 1:3-dimethyl-2-keto-hexahydropyrimidine. You'll find this patent reference in the PubChem entry for DMPU. Boon wasn't interested in the application of DMPU as a solvent, he was just making an accurate summary of what he found in one of his reactions. To your other question: yes, large chemical companies regularly file applications for patents on compounds they have never synthesized. The patent system encourages them to do so when, say, filing on a set of examples they have made they are allowed to include examples they have not made which are described by a Markush structure: see that article for an explanation. Mike Turnbull (talk) 12:04, 20 January 2025 (UTC)
- I had left a note at Talk:1,3-Dimethyl-2-imidazolidinone. It is one of Boon's preps. If one is determined to find something notable about cyclic ureas, I guess it's their prep. But Boon didn't seem to imply notability, he just cranked out a slew of these things. A good source on impactful substituted ureas is doi 10.1002/14356007.o27_o04. Some pesticides and apps to stay-pressed clothing. --Smokefoot (talk) 15:11, 20 January 2025 (UTC)
Consider "Molecule of the Week" for articles
I just cited a few-paragraph ditty on glyceraldehyde from American Chem Soc's archive of "Molecule of the Week" (MOTW). The archive has hundreds of entries. Some entries are so short that they are not very useful, like abacavir at https://www.acs.org/molecule-of-the-week/archive/a/abacavir.html. Many, like the one for glyceraldehyde, are multi-paragraph commentaries. The MOTW site is open access, and the discussion is mid-level such that these articles could enhance Wikipedia articles. Like all things with ACS (a for-profit, unlike RSC), these links are a come-on for using SciFinder. --Smokefoot (talk) 16:37, 13 February 2025 (UTC)
Chembox validation in 2025
Posting here instead of the WP:CHEMVAL talk page to get more comments. The situation right now is that almost all chemical articles on Wikipedia link to a page with outdated information about chembox validation. On that page it still says there's a bot that updates validation based on an index (which was not linked on the page until I added it yesterday). As of now, CheMoBot has been inactive for 7 years, so no one is keeping an eye on the ticks and crosses. I marked the page as historical, which an admin quickly reverted, so I assume that was the wrong thing to do. How should we go about updating the page so people know chembox validation doesn't work like it used to? HansVonStuttgart (talk) 09:20, 23 February 2025 (UTC)
- There was a discussion of chembox validation back in 2020, now at WT:WikiProject Chemistry/CAS validation#Chembox Validation, CheMoBot. I think that such bot-handled validation is not useful and we should scrap it, along with the ticks and crosses. We can rely on human validation, as for the vast majority of edits. The bot was validating what are provided by clickable links to external databases, so human verification is usually easy. Mike Turnbull (talk) 14:55, 24 February 2025 (UTC)
- I've always found them to be useless. Wasn't there a plan to have wikidata handle this? Project Osprey (talk) 15:00, 24 February 2025 (UTC)
- Yes, that was part of the 2020 discussion. The CAS Common Chemistry db became available then. Wikidata is a much better place to do any validation. Mike Turnbull (talk) 16:11, 24 February 2025 (UTC)
- So what we need then is a report that shows differences to Wikidata. But then, once we have checked an entry, we should be able to fix or explain the difference and record it (e.g. family vs. specific compound, stereoisomer, error in the database entry, multiple database entries for one thing). Then it does not have to be rechecked over and over.
- I have been slowly manually checking CAS numbers that did not have a tick or cross and adding the green tick, but is that a waste of time? Most of our readers would not care. Some of our readers may be happy to know that the entry was checked, but for the very serious ones, they should check it for themselves. Perhaps we need some categories that are useful to us to identify discrepancies. Graeme Bartlett (talk) 00:13, 25 February 2025 (UTC)
- I removed the references to CheMoBot on the page for now. It's still not good enough to be a real explanation of the process: it now implies people are doing exactly the same tasks CheMoBot used to do, but that's closer to the truth than claiming the bot is still active. If it ever comes back online, my changes can be reverted. HansVonStuttgart (talk) 08:36, 25 February 2025 (UTC)
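The report of differences to Wikidata suggested in this thread could look roughly like the Python sketch below. It is only an illustration under stated assumptions: the function name `discrepancy_report` and the three input mappings are hypothetical, and how the chembox and Wikidata values are actually collected is left open. The point it shows is that a recorded explanation is keyed on the exact pair of values, so a checked difference is not reported again unless one of the two values changes.

```python
def discrepancy_report(chembox_cas, wikidata_cas, explained):
    """List articles whose chembox CAS number differs from Wikidata.

    chembox_cas  -- dict: article title -> CAS number in the chembox (assumed input)
    wikidata_cas -- dict: article title -> CAS number on Wikidata (assumed input)
    explained    -- dict: (title, chembox value, Wikidata value) -> recorded reason,
                    e.g. "stereoisomer" or "family vs. specific compound"
    """
    report = []
    for title in sorted(chembox_cas):
        local = chembox_cas[title]
        remote = wikidata_cas.get(title)  # None if Wikidata has no value
        if local == remote:
            continue  # values agree, nothing to check
        if (title, local, remote) in explained:
            continue  # this exact difference was already checked and explained
        report.append((title, local, remote))
    return report
```

Once a difference has been checked by hand, adding the `(title, chembox value, Wikidata value)` triple to `explained` with a short reason keeps it out of later reports.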
Anyone willing to provide a standard structural formula? I'm currently not sitting at the right PC to do it myself. --Leyo 19:46, 31 January 2025 (UTC)
- I replaced it with a black and white SVG image. Innerstream (talk) 20:24, 31 January 2025 (UTC)
- Thank you. --Leyo 22:11, 2 March 2025 (UTC)
ECHA database mining
If you go to this page and scroll to the bottom you can download a list of all chemicals registered in the EU (you must check the disclaimer box). 4463 of these have active registrations for >100 tons per year. Is there a way to cross-reference these (by CAS number?) against our pages to generate a list of things which are
- Produced on a significant scale:
- We don't have pages on.
It strikes me as a very good worklist to have. Project Osprey (talk) 10:10, 28 February 2025 (UTC)
- Here's a dump of the wikilinked chemical names: User:Marbletan/REACH. Not sure it helps, but at the very least it might be a place to get ideas for new articles (or new redirects to existing articles). Marbletan (talk) 15:00, 28 February 2025 (UTC)
- Thanks for that. Browsing through, there are significantly more red links than blue. Some of that might be false negatives - ECHA tends to use formal IUPAC names. --Project Osprey (talk) 15:37, 28 February 2025 (UTC)
- There are also loads of redlinks that we would never want to have articles on, e.g. "Reaction product of....", "No public name....", "Slag....". The list will be very useful if pruned of the ones of obvious no interest. Note that any name having square brackets breaks the wikilinking. Thanks, User:Marbletan for producing that so rapidly. Mike Turnbull (talk) 15:59, 28 February 2025 (UTC)
- ... incidentally, your list is all 22,000+ on the REACH list. I think we only need to focus on the >100 ton examples, per Project Osprey's original suggestion. Mike Turnbull (talk) 16:06, 28 February 2025 (UTC)
- Thanks for catching that. I have now trimmed the list to those that are denoted as ">100 tpa". Much shorter and manageable. Marbletan (talk) 16:18, 28 February 2025 (UTC)
- Please feel free to edit the page as you see fit, to prune it of "ones of obvious no interest" or to fix the broken wikilinks. Marbletan (talk) 16:19, 28 February 2025 (UTC)
- A CAS look-up would probably be more helpful, but I've never been able to get Wikidata to work for me. The reaction mixtures can be deceptive: "Reaction products of phosphoryl trichloride and 2-methyloxirane" is Tris(chloropropyl) phosphate. This reflects ECHA's role as a regulator; they won't use the simple name because the material is never produced pure. Sometimes that can get interesting: there are a few compounds that they refuse to use the CAS numbers for because the CAS relates to a pure compound. Frankly, even if only 1000 of these warrant pages that's still enough to keep me going for years... Project Osprey (talk) 16:53, 28 February 2025 (UTC)
- I've deleted about 1,000 that would be highly unlikely to be names of articles in Wikipedia and begun to put the list in alphabetical order. Your point about reaction products is reasonable but we should only ever have them as the actual product name. Mike Turnbull (talk) 17:27, 28 February 2025 (UTC)
- Oh yes, definitely. I wouldn't even use such descriptions as redirects, I'm just trying to explain why they exist at all. Project Osprey (talk) 17:42, 28 February 2025 (UTC)
- I have started making some redirects. But some do not appear worthwhile eg Aluminum, (octadecanoato-.kappa.O)oxo- which we have as Aluminium monostearate. So Marbletan, would you like me to note these on your list? Or do it here? Graeme Bartlett (talk) 23:00, 28 February 2025 (UTC)
- In the cases where a page exists, either on Wikipedia or Wikidata, but the redirect doesn't seem appropriate, it seems like it would be a good idea to note it on the list in some way (but actual format probably doesn't matter much). Marbletan (talk) 17:10, 2 March 2025 (UTC)
- It would also be helpful to cross-reference them if corresponding Wikidata pages exist.--Leiem (talk) 03:02, 1 March 2025 (UTC)
- There are quite a few Wikidata entries for pure substances when there is no article here. Graeme Bartlett (talk) 03:39, 1 March 2025 (UTC)
- I'd expect most of the pure substances to have Wikidata entries. Note that you can link to them using the template {{ill}}, which will give redlink entries like 1,2-dimethylimidazole (I've updated that one in Marbletan's list as an example). This makes it much easier to create chemboxes when writing articles, as, of course, Wikidata contains many of the IDs we usually include, as well as InChI, SMILES etc. When such links occur in mainspace lists (e.g. List of herbicides), there is even a bot that comes along to convert these links into standard links when an article is created, as you can see in the edit history for herbicides. Mike Turnbull (talk) 12:33, 1 March 2025 (UTC)
I will paste some code I have used to process the data (no guarantee of working though):
Convert the csv ECHA file to tsv, remove verbose content to make list of name, EC, CAS, (amount indicator 3 is >100 tpa), infocard
- sed -e 's/\(.*\),[^,]*,[^,]*\/\([0-9.]*\),[^,]*,[^,]*$/\1\t\2/g; s/,intermediate\t/\t2\t/g; s/,<100 tpa\t/\t1\t/g; s/,>100 tpa\t/\t3\t/g; s/,not yet assigned\t/\t0\t/g; s/,active registrations(s) under REACH[,]*\t/\t/g; s/,\([0-9\-]*\),\([0-9\-]*\)\t/\t\1\t\2\t/g; s/^"//g; s/"\t/\t/g' <chemical_universe_list_en.csv >output
Retrieve the Wikidata entries that have an ECHA infocard ID, returning QID, label, CAS No., infocard ID, EC number and the English Wikipedia article if there is one:
- curl --data-urlencode "query@wiktquery" -H "Accept: text/tab-separated-values" https://query.wikidata.org/bigdata/namespace/wdq/sparql >outputwikidata
The file wiktquery has this content:
SELECT ?item ?itemLabel ?casNo ?echaId ?einecs ?article WHERE {
  ?item wdt:P2566 ?echaId .
  OPTIONAL { ?item wdt:P231 ?casNo . }  # CAS Registry Number
  OPTIONAL { ?item wdt:P232 ?einecs . }  # EINECS
  OPTIONAL {
    ?article schema:about ?item .
    ?article schema:inLanguage "en" .
    ?article schema:isPartOf <https://en.wikipedia.org/> .
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
Graeme Bartlett (talk) 01:14, 5 March 2025 (UTC)
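For the cross-referencing by CAS number that started this thread, the two files produced above could be combined along the lines of the Python sketch below. It is a rough illustration, not part of the original post. It assumes that `output` has the tab-separated columns name, EC, CAS, amount indicator, infocard in that order; that `outputwikidata` has a header row followed by the six query columns; and that the query service wraps IRIs in angle brackets and literals in double quotes. The function names are hypothetical.

```python
import csv


def clean(cell):
    """Strip the <...> or "..." wrapping assumed to surround values in the SPARQL TSV."""
    cell = cell.strip()
    if cell.startswith("<") and cell.endswith(">"):
        return cell[1:-1]
    if cell.startswith('"'):
        # drop the opening quote and everything from the closing quote onwards
        return cell[1:].rsplit('"', 1)[0]
    return cell


def load_wikidata(lines):
    """Map CAS number -> (QID, True if an English Wikipedia article is linked)."""
    by_cas = {}
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    next(reader, None)  # skip the header row
    for row in reader:
        row += [""] * (6 - len(row))  # pad short rows
        item, _label, cas, _echa, _ec, article = (clean(c) for c in row[:6])
        if not cas:
            continue
        qid = item.rsplit("/", 1)[-1]
        first_qid, has_article = by_cas.get(cas, (qid, False))
        by_cas[cas] = (first_qid, has_article or bool(article))
    return by_cas


def missing_articles(echa_lines, wikidata_lines, band="3"):
    """Return (name, CAS, QID or None) for >100 tpa substances lacking an article."""
    wikidata = load_wikidata(wikidata_lines)
    missing = []
    for row in csv.reader(echa_lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) < 4:
            continue
        name, cas, amount = row[0], row[2].strip(), row[3]
        if amount != band or cas in ("", "-"):
            continue  # wrong tonnage band, or no CAS number to match on
        qid, has_article = wikidata.get(cas, (None, False))
        if not has_article:
            missing.append((name, cas, qid))
    return missing
```

With both files in the current directory, `missing_articles(open("output"), open("outputwikidata"))` would give the worklist. Entries without a CAS number are skipped here, so mixtures and "reaction product" entries would still need the name-based list.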
All articles have been assessed
Just wanted to point out that finally, all the Chemicals articles in the wikiproject have been assessed. You can view the classification in the big chart on the main page. This is essentially thanks to a combination of the automatic classification of set index articles as List-class added to Module:WikiProject Banner Shell, and people manually classifying articles. Just got the last article classified. Mrfoogles (talk) 21:33, 7 March 2025 (UTC)
Ammonium oleate
@Lamro: Lamro, do you really think that "Technical Paper. U.S. Government Printing Office. 1917" and "American Druggist and Pharmaceutical Record. ...1895" are good foundations for something (Ammonium oleate) notable? --Smokefoot (talk) 23:16, 7 March 2025 (UTC)
- Actually those sources have substantial content, and would be reliable. But they are not suitable for how they are used, as there is no chemical formula, and they are both on applications of the substance. Graeme Bartlett (talk) 00:26, 8 March 2025 (UTC)