Confusing chemical names: why do some sound so similar?

It’s the end of March as I write this and, here in the UK at least, things are starting to feel a little bit hopeful. We’ve passed the spring equinox and the clocks have just gone forward. Arguments about the rights and wrongs of that aside, it does mean daylight late into the day, which means more opportunities to get outside in the evenings. Plus, of course, COVID-19 vaccines are rolling out, with many adults having had at least their first dose.

Ah, yes. Speaking of vaccines… a couple of weeks ago I spotted a rather strange item trending on Twitter. The headline was: “Some COVID-19 vaccines contain polyethylene glycol (PEG), a safe substance found in toothpaste, laxatives and other products, according to Science magazine and health experts.”

Apart from being a bit of a mouthful, this seemed like the most non-headline ever. And also, isn’t it the kind of thing that might raise suspicions in a certain mind? In a, “yeah, and why do they feel the need to tell us that, huh” sort of way?

Why on earth did it even exist?

A little bit of detective work later (by which I mean me tweeting about it and other people kindly taking the time to enlighten me) and I had my answer. The COVID-19 sceptic Alex Berenson had tweeted that the vaccine(s) contained antifreeze. Several people had immediately responded to say that, no, none of the vaccine formulations contain antifreeze. Antifreeze is ethylene glycol, which is definitely not the same thing as polyethylene glycol.

I’m not going to go much further into the vaccine ingredients thing, because actual toxicologists weighed in on that, and there’s nothing I (not a toxicologist) can really add. But this did get me thinking about chemical names, how chemists name compounds, and why some chemical names seem terrifyingly long while others seem, well, a bit silly.

A lot of the chemical names that have been around for a long time are just… names. That is, given to substances for a mixture of reasons. They do usually have something to do with the chemical makeup of the thing in question, but it might be a bit tangential.

formic acid, HCOOH, was first extracted from ants

For example, formic acid, HCOOH, takes its name from the Latin word for ant, formica, because it was first isolated by, er, distilling ant bodies (sorry, myrmecologists). On the other hand, limestone, CaCO3, quicklime, CaO, and limewater, a solution of Ca(OH)2, all get their names from the Old English word lim, meaning “a sticky substance” (which is also connected to the Latin limus, from which we get the modern word slime), because lime (mostly CaO) is the sticky stuff used to make building mortar.

The trouble with this sort of system, though, is that it gets out of control. The number of organic compounds listed in the American Chemical Society’s index is in excess of 30 million. On top of which, chemists have an annoying habit of making new ones. Much as some people might think forcing budding chemists to memorise hundreds of thousands of unrelated names is a jolly good idea, it’s simply not very practical (hehe).

It’s the French chemist, Auguste Laurent, who usually gets most of the credit for deciding that organic chemistry needed a system. He was a remarkable scientist who discovered and synthesised lots of organic compounds for the first time, but it was his proposal that organic molecules be named according to their functional groups that would change things for chemistry students for many generations to come.

Auguste Laurent

Back in 1760 or so, memorising the names of substances wasn’t that much of a chore. There were half a dozen acids, a mere eleven metallic substances, and about thirty salts which were widely known and studied. There were others, of course, but still, compared to today it was a tiny number. Even if they were all named after something to do with their nature, or the discoverer, or a typical property, it wasn’t that difficult to keep on top of things.

But over the next twenty years, things… exploded. Sometimes literally, since health and safety wasn’t really a thing then, but also figuratively, in terms of the number of compounds being reported. It was horribly confusing, there were lots of synonyms, and the situation really wasn’t satisfactory. How can you replicate another scientist’s experiment if you’re not even completely sure of their starting materials?

In 1787 another French chemist, Guyton de Morveau, suggested the first general nomenclature — mostly for acids, bases and salts — with a few simple principles:

  • each substance should have a unique name, as short and specific as possible
  • the name should reflect what the substance consisted of, that is, describe its “composing parts”
  • unknown substances should be assigned names with no particular meaning, being sure not to suggest something false about the substance (if you know it’s not an acid, for example, don’t name it someinterestingname acid)
  • new names should be based on old languages, such as Latin

His ideas were accepted and adopted by most chemists at the time, although a few did attack them, claiming they were “barbarian, incomprehensible, and without etymology” (reminds me of some of the arguments I’ve had about sulfur). Still, his classification was eventually made official, after he presented it to the Académie des Sciences.

Chemists needed a naming system that would allow them to quickly identify chemical compounds.

However, by the middle of the 1800s, the number of organic compounds — that is, ones containing carbon and hydrogen — was growing very fast, and it was becoming a serious problem. Different methods were proposed to sort through the messy, and somewhat arbitrary, accumulation of names.

Enter Auguste Laurent. His idea was simple: name your substance based on the longest chain of carbon atoms it contains. As he said, “all chemical combinations derive from a hydrocarbon.” There was a bit more to it, and he had proposals for dealing with specific substances such as amines and aldehydes, and of course it was in French, but that was the fundamental idea.

It caused trouble, as good ideas so often do. Most of the other chemists of the time felt that chemical names should derive from the substance’s origins. Indeed, some of the common ones that chemistry professors are clinging onto today still do. For example, the Latin for vinegar is acetum, from which we get acetic acid. But, since organic chemistry was increasingly about making stuff, it didn’t entirely make sense to name compounds after the natural sources they might have come from when, in fact, they hadn’t come from nature at all.

So, today, we have a system that’s based on Laurent’s ideas, as well as work by Jean-Baptiste Dumas and, importantly, the concept of homology — which came from Charles Gerhardt.

Homology means putting organic compounds into “families”. For example, the simplest family is the alkanes, and the first few are named like this: methane (one carbon), ethane (two), propane (three), butane (four) and pentane (five).

Like human families, chemical families share parts of their names and certain characteristics.

The thing to notice here is that all the family members have the same last name, or rather, their names all end with the same thing: “ane”. That’s what tells us they’re alkanes (they used to be called paraffins, but that’s a name with other meanings — see why we needed a system?).

So the end of the name tells us the family, and the first part of the name tells us about the number of carbons: something with one carbon in it starts with “meth”. Something with five starts with “pent”, and so on. We can go on and on to much bigger numbers, too. It’s a bit like naming your kids by their birth order, not that anyone would do such a thing.
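If you like to see rules written down as code, here's a minimal Python sketch of the "first part counts the carbons, last part gives the family" idea for straight-chain alkanes. The prefixes and the general formula CnH2n+2 are standard chemistry; the function names (`alkane_name`, `alkane_formula`) are just made up for this illustration, and the sketch deliberately stops at ten carbons and ignores branches entirely.

```python
# Standard stems for chains of 1 to 10 carbon atoms.
CARBON_PREFIXES = {
    1: "meth", 2: "eth", 3: "prop", 4: "but", 5: "pent",
    6: "hex", 7: "hept", 8: "oct", 9: "non", 10: "dec",
}


def alkane_name(carbons: int) -> str:
    """Name a straight-chain alkane from its carbon count (1 to 10 only)."""
    if carbons not in CARBON_PREFIXES:
        raise ValueError("this sketch only covers 1 to 10 carbons")
    # Prefix = number of carbons, "ane" = the alkane family ending.
    return CARBON_PREFIXES[carbons] + "ane"


def alkane_formula(carbons: int) -> str:
    """Molecular formula of an alkane, from the general formula CnH2n+2."""
    if carbons < 1:
        raise ValueError("an alkane needs at least one carbon")
    hydrogens = 2 * carbons + 2
    carbon_part = "C" if carbons == 1 else f"C{carbons}"
    return f"{carbon_part}H{hydrogens}"


# Print the first five members of the family.
for n in range(1, 6):
    print(n, alkane_name(n), alkane_formula(n))
```

Real nomenclature has many more rules than this (branches, rings, double bonds and so on), so treat it as a toy rather than a naming engine.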

There are lots of chemical families. The alcohols all end in “ol”. Carboxylic acids all end in “oic acid” and ketones end in “one” (as in bone, not the number). These endings tell us about certain groups of atoms the molecules all contain — a bit like everyone in a family having the same colour eyes, or the same shaped nose.
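The same trick works in reverse: given a name, look at the ending and guess the family. Below is a deliberately naive Python sketch using only the endings mentioned above. The function name `guess_family` is hypothetical, and plain suffix matching is an assumption made for simplicity, not how real name-parsing software works.

```python
# (suffix, family) pairs, checked longest first so that "oic acid"
# is tested before any of the shorter endings.
FAMILY_SUFFIXES = [
    ("oic acid", "carboxylic acid"),
    ("one", "ketone"),
    ("ane", "alkane"),
    ("ol", "alcohol"),
]


def guess_family(name: str) -> str:
    """Guess a compound's family purely from the end of its name."""
    cleaned = name.strip().lower()
    for suffix, family in FAMILY_SUFFIXES:
        if cleaned.endswith(suffix):
            return family
    return "unknown"


for example in ["methane", "ethanol", "propanone", "ethanoic acid", "acetic acid"]:
    print(example, "->", guess_family(example))
```

Notice that the systematic name "ethanoic acid" is recognised, but the common name "acetic acid" for the very same substance comes back as "unknown". That's the whole problem with nicknames in a nutshell.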

A chemist who’s learned the system can look at a systematic name and tell you, just from the words, exactly which atoms are present, how many there are of each, and how they’re joined together. Which, when you think about it, is actually pretty awesome.

Which brings me back to the start and the confusion of glycols. Ah, you may be thinking, so ethylene glycol and polyethylene glycol are part of the same family? Their names end with the same thing, but they start differently?

Well, hah, yes and no. You remember a moment ago when I said that there are still some “common” names in use, that came from origins — for example acetic acid (properly named ethanoic acid)? Well, these substances are a bit like that. The ending “glycol” originates from “glycerine” because the first ones came from, yes, glycerine — which you get when fats are broken down.

Polyethylene glycol (PEG) is a polymer, with very different properties to ethylene glycol

Things that end in glycol are actually diols, that is, molecules which contain two -OH groups of atoms (“di” meaning two, “ol” indicating alcohol). Ethylene glycol is systematically named ethane-1,2-diol, from which a chemist would deduce that it contains two carbon atoms (“eth”) with alcohol groups (“ol”) on different carbons (1,2).

Polyethylene glycol, on the other hand, is named poly(ethylene oxide) by the International Union of Pure and Applied Chemistry (IUPAC), who get the final say on these things. The “poly” tells us it’s a polymer — that is, a very long molecule made by joining up lots and lots of smaller ones. In theory, the “ethylene oxide” bit tells us what those smaller molecules were, before they all got connected up to make some new stuff.

Okay, fine. So what’s ethylene oxide? Well, you see, that’s not quite a systematic name, either. Ethylene oxide is a triangular-shaped molecule with an oxygen atom in it, systematically named oxirane. Why poly(ethylene oxide), and not poly(oxirane), then? Mainly, as far as I can work out, to avoid confusion with epoxy resins and… look, I think we’ve gone far enough into the labyrinth at this point.

The thing is, polyethylene glycol is usually made from ethylene glycol. Since everyone tends to call ethylene glycol that (and rarely, if ever, ethane-1,2-diol), it makes sense to call the polymer polyethylene glycol. Ethylene glycol makes polyethylene glycol. Simple.
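To put some numbers on "lots of smaller units joined up", here's a rough Python sketch. It assumes the usual textbook way of writing PEG, H(OCH2CH2)nOH, which isn't spelled out in this post, and it uses rounded standard atomic masses. The function names are invented for the example.

```python
# Rounded standard atomic masses in g/mol.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}


def peg_atom_counts(n: int) -> dict:
    """Atom counts for H(OCH2CH2)nOH, i.e. n repeat units plus the end groups."""
    if n < 1:
        raise ValueError("need at least one repeat unit")
    # Each repeat unit adds C2H4O; the two ends together add H2O.
    return {"C": 2 * n, "H": 4 * n + 2, "O": n + 1}


def peg_molar_mass(n: int) -> float:
    """Approximate molar mass in g/mol for a chain of n repeat units."""
    counts = peg_atom_counts(n)
    return sum(ATOMIC_MASS[element] * count for element, count in counts.items())


# n = 1 is plain ethylene glycol; bigger n gives longer and heavier chains.
for n in (1, 10, 100):
    print(n, peg_atom_counts(n), f"{peg_molar_mass(n):.2f} g/mol")
```

With n = 1 the counts come out as C2H6O2, which is ethylene glycol itself. Everything with a larger n is a different molecule, and one with hundreds of units is a very different beast from the small one it is named after.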

Plastic bags are made from polythene, which has very different properties to the ethene that’s used to make it.

Polymers are very different to the molecules they’re made from. Of course they are, otherwise why bother? For example, ethene (also called ethylene, look, I’m sorry) is a colourless, flammable gas at room temperature. Poly(ethylene) — often just called polythene — is used to make umpteen things, including plastic bags. They’re verrrrry different. A flammable gas wouldn’t be much use for keeping the rain off your broccoli and sourdough.

Likewise, ethylene glycol is a colourless, sweet-tasting, thick liquid at room temperature. It’s an ingredient in some antifreeze products, and is, yes, toxic if swallowed — damaging to the heart, kidneys and central nervous system and potentially fatal in high enough doses. Polyethylene glycol, or PEG, on the other hand, is a solid or a liquid (depending on how many smaller molecules were joined together) that’s essentially biologically inert. It passes straight through the body, barely stopping along the way. In fact, it’s even used as a laxative.

So the headlines were accurate: PEG is “a safe substance found in toothpaste, laxatives and other products.” It is non-toxic, and describing it as “antifreeze” is utterly ridiculous.

In summary: different chemicals, in theory, have nice, logical, tell-you-everything-about-them names. But, a bit like humans, some of them have obscure nicknames that bear little resemblance to their “real” names. They will insist on going by those names, though, so we just need to get on with it.

The one light in this confusingly dark tunnel is the internet. In my day (croak) you had to memorise non-systematic chemical names because, unless you had a copy of the weighty rubber handbook within reach, there was no easy way to look them up. These days you can type a name into Google (apparently other search engines are available) and, in under a second, all the names that chemical has ever been called will be presented to you. And its chemical formula. And multiple other useful bits of information. It’s even possible to search by chemical structure these days. Kids don’t know they’re born, I tell you.

Anyway, don’t be scared of chemical names. They’re just names. Check what things actually are. And never, ever listen to Alex Berenson.

And get your vaccine!


Content is © Kat Day 2021. You may share or link to anything here, but you must reference this site if you do.

So how do you spell element 16?

IUPAC says sulfur, and what they say goes

I found myself yet again discussing the correct spelling of the name of element number 16 today with a group of students. Now, on the one hand, going over this again and again is a tad wearisome. On the other, I’m quietly glad that in a time in which the media constantly blather on about terrible literacy levels, rant about the use of txt spk and generally mutter under their (or there/theyre/one of those) breath about the inability of the nation to use an apostrophe properly, I can consistently find an entire roomful of youngsters who care so much about spelling that they’re willing to argue over the correct use of ‘f’ vs. ‘ph’.

I am, of course, talking about sulfur.

You will note that I have spelled it with an ‘f’.  I should point out that the spelling chequer* on my browser has just underlined that with a row of red dots. It disagrees with me as well.

However, IUPAC (The International Union of Pure and Applied Chemistry – sounds like a fun place for a holiday doesn’t it?) do not, and in this they get the deciding vote. One of the many things IUPAC does is to sort out the official nomenclature of organic and inorganic molecules.

Of course, chemistry professors have been cheerfully ignoring them for years, and so it is that generations of chemistry students have tripped gaily into their first university session, fresh from A-level teachers using systematic names, to be immediately and thoroughly bamboozled by a lecturer talking about acetone, neopentane, para-nitrophenol and the gloriously-named glacial acetic acid.

But there it is, when it comes to element 16, IUPAC are crystal clear. It’s sulfur. With an f. That means it’s also sulfide with an f, and sulfate, with an f. Oh and sulfuric, as in the acid, with an f. Interestingly Richard Osman, on the BBC quiz show Pointless, has been very keen to point out in elements rounds that it’s sulfur, and then in a round about acids spelled it sulphuric. Weird.

In their notes, IUPAC even say that ‘”aluminum” and “cesium” are commonly used alternative spellings for “aluminium” and “caesium.”’ No such note is made for sulfur. Time to get over it.

Volcanic sulfur - it looks prettier than it smells.

If the Online Etymology Dictionary is to be believed, the ph/f thing has gone backwards and forwards a few times. It was apparently sulphur in Latin, and sulfur in Late Latin. There was an Old English word ‘swefl’ meaning sulfur or brimstone (same thing really, just with more religious connotations), and an Old French one: ‘soufre’. Actually, according to Google Translate, that’s the modern French spelling as well. I am pretty clueless when it comes to French, so feel free to correct me.

The UK started spelling the word with a ph in around the 14th century, along with several other words that have since fallen out of use, such as phantastic and turph. The ph makes some sense in words with a Greek origin, such as philosophy and orphan, since the Greek alphabet actually has the letter phi, but little sense otherwise. However the scribes of the time believed that the more letters there were in a word the more impressive it would look, so they made everything as long and complicated as possible. Why use f when you can use ph? Why spell it ‘tho’ when you can write ‘though’? And you also have them to blame for all those annoyingly unnecessary double consonants that turn up far from occasionally (I absolutely never get that one right first time).

If we’re honest, this belief still persists to some extent. True, we don’t throw extra letters in for good measure any more, but there are plenty of sesquipedalianist writers out there who believe such behaviour makes them look intelligent (see what I did there?). And just look at how annoyed people get about text speak, or how many quietly sneer about tweeting.

So back to element 16. Chuck in a few more centuries and we come, more or less, full circle. IUPAC adopted the spelling sulfur in 1990, and the Royal Society of Chemistry Nomenclature Committee followed suit in 1992. The Qualifications and Curriculum Authority for England and Wales switched in 2000, and it’s now the spelling you will see in both GCSE and A-level examinations and, consequently, the one in any text book published within the last decade. For those that complain it’s an American spelling, even The Oxford Dictionaries admit that “In chemistry… the -f- spelling is now the standard form in all related words in the field in both British and US contexts.”

So it’s sulfur. With an f. It’s not “the American spelling”. Well, ok, it IS, but it’s also the British spelling. And the rest of the world’s spelling. So add sulfur to your spell checker’s dictionary and let’s move along.

——

* this is a joke. Probably not a very good one, since a number of people have pointed out my ‘mistake’. It’s never a good sign if you have to explain your attempts at humour is it? Anyway, it’s a reference to this famous (well I thought it was, anyway) poem.

My Pointless addiction

I love the TV show Pointless (5:15pm weekdays on BBC1, and I didn’t have to look that up).  I am an unashamed addict.  For those that have never watched, they give 100 members of the public 100 seconds to answer questions, and the aim of the contestants is to name the most obscure answer provided in a particular category, in other words the one the fewest people answered correctly.  As I write, the current topic is elements of the periodic table.  They use the periodic table quite a bit – they must have a chemist in their team of research elves (it’s sort of implied that the lovely Richard Osman makes up all the questions, but I’ve always assumed he has at least some help).

Naturally I can name all the answers in this round.  They are: sulfur, copper, mercury, sodium, fluorine, nitrogen and barium.  Yes all right, I’m showing off now.  I reckon nitrogen might be the lowest, since its clue is about its boiling point.  We’ll see in a bit…

I’m actually rather comforted by some of the high scores.  34 people out of 100 recognised sodium and 39 barium.  54 got copper and 80 mercury.  There is hope for the nation after all.

Ah, it turns out that fewer people knew it was a compound of fluorine in toothpaste.  Heston Blumenthal and his habit of chucking liquid nitrogen around has a lot to answer for.  It was close though, the second lowest score.

Now for the next round.  I’m helped here by the fact that they’ve included atomic numbers, else I’m not sure I’d have got the one referring to album sales.  I’m vaguely aware that gold and platinum albums are possible (and silver?), but I’m clueless as to the sales numbers required.  Ok: oxygen, zinc, platinum, magnesium, bromine, radon and carbon.  I reckon the lowest is going to be… um… well probably platinum.

And while I wait, thank you Richard for stating the correct spelling of sulfur in two episodes now.  I’m always arguing with people about that.

Hmm, I couldn’t have been more wrong.  Platinum was in fact the second highest answer on the board after oxygen.  As Alexander Armstrong pointed out, I suppose that’s really about your knowledge of the music business (an area in which I am pitifully uninformed).  The lowest score was bromine.  I should have realised.

Oh dear the next round is on English Football.  I’d be straight out.