Encoded URLs, Canonical Tags & Special Characters for SEO: A Deep-Dive - Detailed.com

Encoded URLs, Canonical Tags & Special Characters for SEO: A Deep-Dive

Written by Glen Allsopp
LAST UPDATED
June 18th, 2023

I recently came across an interesting client situation where a number of URLs were internally linked to in both an encoded and “decoded” manner.

The canonical tags had the same problem: they weren’t always consistent with the version that was internally linked.

A URL might be internally linked to as https://detailed.com/site-advice-(next-level)

But then have a canonical tag as https://detailed.com/site-advice-%28next-level%29

Notice how the left and right parentheses change to %28 and %29 respectively.
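If you want to see the two forms side by side, here is a minimal Python sketch using the standard library's urllib.parse module. The helper name paths_match_when_decoded is hypothetical, and comparing decoded paths is just one way to spot this kind of inconsistency in an audit. It says nothing about how a search engine treats the two versions.

```python
from urllib.parse import quote, unquote, urlsplit

linked = "https://detailed.com/site-advice-(next-level)"
canonical = "https://detailed.com/site-advice-%28next-level%29"

# quote() percent-encodes everything except letters, digits and "_.-~"
# (plus "/" by default), so the parentheses become %28 and %29.
encoded_path = quote(urlsplit(linked).path)

# unquote() reverses the process.
decoded_path = unquote(urlsplit(canonical).path)


def paths_match_when_decoded(url_a: str, url_b: str) -> bool:
    """Hypothetical helper: True if two URLs differ only in path encoding."""
    a, b = urlsplit(url_a), urlsplit(url_b)
    return (a.scheme, a.netloc, unquote(a.path), a.query) == (
        b.scheme, b.netloc, unquote(b.path), b.query
    )


print(encoded_path)                                 # /site-advice-%28next-level%29
print(decoded_path)                                 # /site-advice-(next-level)
print(linked == canonical)                          # False
print(paths_match_when_decoded(linked, canonical))  # True
```

A plain string comparison treats the two URLs as different, which is exactly why a crawler report can flag them as a canonical mismatch.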

Although I have been doing SEO and auditing websites for more than 15 years, I couldn’t name from memory which characters fit into standard encoding, and there was no single source of truth for how to handle canonical tags in this situation either.

As I’ve done hours of research into this, I decided that I would document my findings and solution for anyone who comes across this problem in the future.

Regarding special characters in URLs, we have multiple classification types.

Reserved Characters for URL Syntax

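The reserved characters are: ! * ' ( ) ; : @ & = + $ , / ? % # [ ]

Strictly speaking, % is not part of the reserved set defined in RFC 3986. It is the escape character that introduces an encoded sequence, which is why a literal % has to be written as %25.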

Written out with UTF-8 encodings and the name of each character, you get:

  • ! – Exclamation mark (%21 with UTF-8 Encoding)
  • * – Asterisk (%2A with UTF-8 Encoding)
  • ' – Apostrophe / Single quote (%27 with UTF-8 Encoding)
  • ( – Left parenthesis (%28 with UTF-8 Encoding)
  • ) – Right parenthesis (%29 with UTF-8 Encoding)
  • ; – Semicolon (%3B with UTF-8 Encoding)
  • : – Colon (%3A with UTF-8 Encoding)
  • @ – At sign (%40 with UTF-8 Encoding)
  • & – Ampersand (%26 with UTF-8 Encoding)
  • = – Equals (%3D with UTF-8 Encoding)
  • + – Plus (%2B with UTF-8 Encoding)
  • $ – Dollar sign (%24 with UTF-8 Encoding)
  • , – Comma (%2C with UTF-8 Encoding)
  • / – Forward slash (%2F with UTF-8 Encoding)
  • ? – Question mark (%3F with UTF-8 Encoding)
  • % – Percent (%25 with UTF-8 Encoding)
  • # – Pound sign (%23 with UTF-8 Encoding)
  • [ – Left square bracket (%5B with UTF-8 Encoding)
  • ] – Right square bracket (%5D with UTF-8 Encoding)
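Rather than memorising these codes, you can generate them. The sketch below uses Python's urllib.parse.quote; the RESERVED string and the encodings dictionary are names I've made up for illustration.

```python
from urllib.parse import quote

# The characters from the list above, as a plain string.
RESERVED = "!*'();:@&=+$,/?%#[]"

# safe="" tells quote() to encode "/" too, which it leaves alone by default.
encodings = {char: quote(char, safe="") for char in RESERVED}

for char, encoded in encodings.items():
    print(char, encoded)
```

Each code is simply a percent sign followed by the character's byte value in hexadecimal, so a comma (byte 0x2C) becomes %2C.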

If you want to create URLs with emojis, you probably shouldn’t. Google’s advice is that any special characters in URLs, including characters from non-English languages, should be written with UTF-8 encoding in mind.
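To show what UTF-8 encoding means in practice for characters outside the basic Latin alphabet, here is a short Python sketch. The slug is a made-up example, not a URL from any real site.

```python
from urllib.parse import quote, unquote

# Hypothetical slug containing an accented letter and an emoji.
slug = "café-🎉"

# quote() encodes to UTF-8 bytes first, then percent-encodes each byte.
encoded = quote(slug)

print(encoded)                   # caf%C3%A9-%F0%9F%8E%89
print(unquote(encoded) == slug)  # True
```

A single visible character can turn into several encoded bytes: the é becomes two (%C3%A9) and the emoji becomes four, which is one reason these URLs get long and unwieldy when shared.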

A good real-world example of URLs containing special characters that still rank is Wikipedia.

You end up with URLs like: https://en.wikipedia.org/wiki/28_(number)

If we look at the canonical tag for the page, it’s written in the same way: with parentheses, rather than %28 and %29.
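If you need to run this check across many pages, pulling the canonical tag out of the HTML is easy to script. Below is a rough sketch using Python's built-in html.parser. The CanonicalFinder class is a hypothetical name, and the HTML string is a hand-written stand-in rather than Wikipedia's actual markup.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Hypothetical helper: records the href of the first rel="canonical" link."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attributes.get("href")


# Stand-in HTML for illustration only.
html = (
    "<html><head>"
    '<link rel="canonical" href="https://en.wikipedia.org/wiki/28_(number)">'
    "</head><body></body></html>"
)

finder = CanonicalFinder()
finder.feed(html)

page_url = "https://en.wikipedia.org/wiki/28_(number)"
print(finder.canonical)              # https://en.wikipedia.org/wiki/28_(number)
print(finder.canonical == page_url)  # True
```

When the final comparison comes back False, the next step is to check whether the difference is only in encoding or whether the canonical points at a genuinely different page.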

Another example is https://en.wikipedia.org/wiki/40%25_(song), where 40% is actually referencing the content of the page (a song, named 40%).

They rank first in Google for the name of the song followed by its creator (granted, it is Wikipedia), and a few other well-ranking results include % as well.
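The percent sign deserves extra care because it is also the escape character. The Python sketch below decodes the Wikipedia path and then shows what happens if an already-encoded path is encoded a second time. The variable names are my own.

```python
from urllib.parse import quote, unquote

path = "/wiki/40%25_(song)"

# Decoding turns %25 back into a literal percent sign.
title = unquote(path)

# Encoding the decoded title escapes the % again (and the parentheses too).
once = quote(title)

# Encoding an already-encoded path escapes every % a second time.
twice = quote(once)

print(title)  # /wiki/40%_(song)
print(once)   # /wiki/40%25_%28song%29
print(twice)  # /wiki/40%2525_%2528song%2529
```

Two things stand out. Re-encoding does not give back the exact string Wikipedia uses, since the parentheses are now escaped. And double encoding produces sequences like %2525, which are a common sign that a URL has passed through an encoder once too often.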

If You Can Avoid Special Characters, Then Do So

Google is just one platform on the internet.

Even if you know how things are processed there, links may still break on chat applications, forums, social media platforms like Twitter and Facebook, and so on.

If you can keep URLs simple, that’s always best.

I also understand, though, that if you’re doing SEO for a site, these things may have been set up without your knowledge, or before you joined the project.

Written by Glen Allsopp, the founder of Detailed. You may know me as 'ViperChill' if you've been in internet marketing for a while. Detailed is a small bootstrapped team behind the Detailed SEO Extension for Chrome & Firefox (250,000 weekly users), trying to share some of the best SEO insights on the internet.