Encoded URLs, Canonical Tags & Special Characters for SEO: A Deep-Dive

Written by Glen Allsopp

I recently came across an interesting client situation where a number of URLs were internally linked to in both an encoded and “decoded” manner.

Canonical tags had the same problem: they weren’t always consistent with the internal links.

A URL might be internally linked to as https://detailed.com/site-advice-(next-level)

But then have a canonical tag as https://detailed.com/site-advice-%28next-level%29

Notice how the left and right brackets change to %28 and %29 respectively.
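To show how a mismatch like this can be spotted in an audit, here is a minimal Python sketch. The function name same_page_after_decoding is my own, and the rule it applies (decode the path, then compare) is an assumption that suits an audit script. It is not a statement about how Google or any particular server treats the two forms.

```python
from urllib.parse import urlsplit, unquote


def same_page_after_decoding(url_a: str, url_b: str) -> bool:
    """Return True when two URLs differ only in percent-encoding of the path.

    Hypothetical audit helper: it decodes every escape in the path, so it
    would also treat %2F and a literal "/" as equal, which a stricter
    comparison might not want.
    """
    a, b = urlsplit(url_a), urlsplit(url_b)
    return (
        a.scheme.lower() == b.scheme.lower()
        and a.netloc.lower() == b.netloc.lower()
        and unquote(a.path) == unquote(b.path)
        and a.query == b.query
    )


internal_link = "https://detailed.com/site-advice-(next-level)"
canonical_tag = "https://detailed.com/site-advice-%28next-level%29"

print(internal_link == canonical_tag)                          # False
print(same_page_after_decoding(internal_link, canonical_tag))  # True
```

A plain string comparison flags the two as different, while the decoded comparison shows they point at the same path, which is exactly the kind of inconsistency worth listing in an audit.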

Although I have been doing SEO and auditing websites for more than 15 years, I couldn’t name from memory which characters fit into standard encoding, and there was no single source of truth for how to handle canonical tags in this situation either.

As I’ve done hours of research into this, I decided that I would document my findings and solution for anyone who comes across this problem in the future.

Regarding special characters in URLs, we have multiple classification types.

Reserved Characters for URL Syntax

The reserved characters are: ! * ' ( ) ; : @ & = + $ , / ? % # [ ]

Written out with UTF-8 encodings and the name of each character, you get:

! – Exclamation mark (%21 with UTF-8 Encoding)
* – Asterisk (%2A with UTF-8 Encoding)
' – Apostrophe / Single quote (%27 with UTF-8 Encoding)
( – Left parenthesis (%28 with UTF-8 Encoding)
) – Right parenthesis (%29 with UTF-8 Encoding)
; – Semicolon (%3B with UTF-8 Encoding)
: – Colon (%3A with UTF-8 Encoding)
@ – At sign (%40 with UTF-8 Encoding)
& – Ampersand (%26 with UTF-8 Encoding)
= – Equals (%3D with UTF-8 Encoding)
+ – Plus (%2B with UTF-8 Encoding)
$ – Dollar sign (%24 with UTF-8 Encoding)
, – Comma (%2C with UTF-8 Encoding)
/ – Forward slash (%2F with UTF-8 Encoding)
? – Question mark (%3F with UTF-8 Encoding)
% – Percent (%25 with UTF-8 Encoding)
# – Pound sign (%23 with UTF-8 Encoding)
[ – Left square bracket (%5B with UTF-8 Encoding)
] – Right square bracket (%5D with UTF-8 Encoding)
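If you would rather not trust a list like this from memory, the encoded form of any character can be checked in a few lines of Python. This is a small sketch using the standard library's urllib.parse.quote. The RESERVED string and the encoded dictionary are names I made up for the example.

```python
from urllib.parse import quote

# The characters discussed in this section, as one string.
RESERVED = "!*'();:@&=+$,/?%#[]"

# safe="" tells quote() to escape everything except letters, digits and _.-~
encoded = {char: quote(char, safe="") for char in RESERVED}

for char, escape in encoded.items():
    print(char, escape)

print(encoded["!"])  # %21
print(encoded[","])  # %2C
print(encoded["("], encoded[")"])  # %28 %29
```

Passing safe="" matters here: by default quote() leaves the forward slash alone, which is handy for whole paths but hides its encoded form when you are checking characters one at a time.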

If you want to create URLs with emojis, you probably shouldn’t. Google’s advice is that any special characters in URLs, including characters from non-English languages, should be written with UTF-8 encoding.
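As a rough illustration of what UTF-8 encoding does to non-ASCII characters, here is a short Python sketch. The slug café-münchen is a made-up example, not a URL from any real site. Python's quote() uses UTF-8 by default, so each character becomes one escape per byte.

```python
from urllib.parse import quote, unquote

# Hypothetical slug containing non-ASCII letters.
slug = "café-münchen"
encoded_slug = quote(slug)
print(encoded_slug)  # caf%C3%A9-m%C3%BCnchen

# An emoji takes four bytes in UTF-8, so it becomes four escapes.
encoded_emoji = quote("😀")
print(encoded_emoji)  # %F0%9F%98%80

# Decoding gets the original text back.
print(unquote(encoded_slug) == slug)  # True
```

A single emoji turning into twelve characters of escapes is one practical reason to keep them out of URLs.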

A good real-world example of URLs with special characters in them that can rank is Wikipedia.

There you end up with URLs like: https://en.wikipedia.org/wiki/28_(number)

If we look at the canonical tag for the page, it’s written in the same way: with parentheses, rather than %28 and %29.

Another example is https://en.wikipedia.org/wiki/40%25_(song), where 40% is actually referencing the content of the page (a song, named 40%).

Wikipedia ranks first in Google for the name of the song followed by its creator (granted, it is Wikipedia), and a few other well-ranking results include % as well.
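The 40% example is a nice one to reproduce, because the percent sign is the one character that always has to be escaped: it is what introduces every other escape. Here is a minimal sketch in Python. Keeping the parentheses unescaped via safe="()" is my choice to mirror the Wikipedia URL above, not a rule.

```python
from urllib.parse import quote, unquote

title = "40%_(song)"

# Escape the percent sign but leave the parentheses as they are.
path = quote(title, safe="()")
print(path)  # 40%25_(song)

url = "https://en.wikipedia.org/wiki/" + path
print(url)  # https://en.wikipedia.org/wiki/40%25_(song)

# Decoding the path returns the human-readable title.
print(unquote(path))  # 40%_(song)
```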

If You Can Avoid Special Characters, Then Do So

Google is just one platform on the internet.

Even if you know how things are processed there, links may still break on chat applications, forums, social media platforms like Twitter and Facebook, and so on.

If you can keep URLs simple, that’s always best.

I also understand, though, that if you’re doing SEO for a site, these things may have been set up without your knowledge, or before you joined the project.
