Posted by Teun van Veggel on July 23rd, 2014

SEO on a multilingual Drupal 7 site

Many blog posts have been written about SEO for Drupal 7, but far fewer about the specifics of multilingual SEO. What if some of your content is translated into several languages while other content is the same for all languages? How does that work for hreflang meta tags and XML sitemaps?

Drupal, in combination with some contributed modules, offers powerful solutions for improving your ranking on search engines, but a multilingual website is a whole different kettle of fish. I looked into the options and tried to find the modules that implement them.
When you enable more than one language in Drupal (with the URL language negotiation option switched on) and translate your nodes with the Entity Translation module, every single page can be accessed through as many URLs as there are enabled languages. Even if the content isn't actually translated but language neutral, it will still be available in, e.g., Spanish or German by using the language prefix 'es/' or 'de/' in the URL. For example, a single blog post might be accessed through 'es/blog/this-is-a-blog-post' or 'de/blog/this-is-a-blog-post', showing exactly the same content but with a translated UI. Google might consider this duplicate content.
Google doesn't like duplicate content because some people duplicate their content to make Google believe that they have a lot of it. Google doesn't like being fooled and penalizes this practice (although nobody knows exactly how, or to what extent).
Drupal generates quite a few content duplications by default because of the way it is built. First of all, content can be accessed both by its 'system URL' (e.g. node/53) and by its friendly URL alias (e.g. content/this-is-an-article). Secondly, you can access most nodes with or without a trailing slash (/) at the end of the URL. There is a lot of debate about whether or not this has any influence on your ranking; read for example this post about it (1), which says it doesn't. But I think it's better to be safe than sorry, and in any case the solution is easy: install and activate the Global Redirect module to create 301 redirects and the Metatag module to add canonical meta tags to the head of your pages. There are plenty of blog posts (2,3,4) out there that explain how.
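To make the idea of a single canonical URL concrete, here is a minimal Python sketch of the normalization involved. It is not how the Global Redirect module is implemented; the alias table, the function names and the example.com domain are all made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical alias table: system path -> friendly URL alias.
ALIASES = {"node/53": "content/this-is-an-article"}


def canonical_url(url, aliases=ALIASES):
    """Return the one URL that all duplicate variants should point to."""
    parts = urlsplit(url)
    path = parts.path.strip("/")      # drop leading and trailing slashes
    path = aliases.get(path, path)    # prefer the friendly alias if one exists
    # Keep the query string, drop the fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/" + path, parts.query, ""))


def redirect_target(url, aliases=ALIASES):
    """Return the URL to 301-redirect to, or None if url is already canonical."""
    target = canonical_url(url, aliases)
    return None if target == url else target


if __name__ == "__main__":
    for variant in (
        "http://example.com/node/53",
        "http://example.com/content/this-is-an-article/",
        "http://example.com/content/this-is-an-article",
    ):
        print(variant, "->", redirect_target(variant))
```

Both the system path and the trailing-slash variant resolve to the same address, which is the value you would want in the 301 redirect and in the canonical tag.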
What about node translations? In a way, translated nodes could be considered duplicate content because, even though translated, they originate from the same content. There are ways to let Google know that we're merely dealing with translations and not with content duplication. Google suggests using the rel="alternate" hreflang="XX" tag. The hreflang tags can be used for fully translated pages as well as for pages with language-neutral content in which only the navigation blocks might be translated, as described in this blog post (5).
Apart from preventing Google from marking part of your website as duplicate content, adding the hreflang tag will also help Google serve the right version of the website to the right audience. That means that your website will be displayed in, e.g., Spanish when people are searching for it in or from Spain.

Multilingual XML Sitemap

There are two ways to add the rel="alternate" hreflang="XX" tag. You could add it to your XML sitemap, as explained in detail in this article (6). An XML sitemap is an XML file that lists all your pages in a nicely organised way, which you can then submit through your Google Webmasters account. There is a powerful module that generates your XML sitemap automatically based on the content of your website, but unfortunately it does not yet implement the rel="alternate" hreflang="XX" option. You will need to apply the patch in this issue queue (7) to enable the hreflang tag for your sitemap file.
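For reference, this is roughly what a sitemap with hreflang annotations looks like when you build it by hand. The Python sketch below is not the output of the patched XML Sitemap module; it assumes every language lives under a URL prefix such as 'es/' or 'de/', and the function name and the example.com domain are placeholders.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_sitemap(base_url, languages, paths):
    """Return sitemap XML with one <url> per language version of each path.

    Every <url> lists all language versions, including itself, as
    <xhtml:link rel="alternate" hreflang="..."> children.
    """
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element("{%s}urlset" % SITEMAP_NS)
    for path in paths:
        # Assumption: each language version is reachable at <base>/<lang>/<path>.
        alternates = {
            lang: "%s/%s/%s" % (base_url.rstrip("/"), lang, path.lstrip("/"))
            for lang in languages
        }
        for loc in alternates.values():
            url = ET.SubElement(urlset, "{%s}url" % SITEMAP_NS)
            ET.SubElement(url, "{%s}loc" % SITEMAP_NS).text = loc
            for lang, href in alternates.items():
                ET.SubElement(
                    url,
                    "{%s}link" % XHTML_NS,
                    rel="alternate",
                    hreflang=lang,
                    href=href,
                )
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap("http://example.com", ["es", "de"], ["blog/this-is-a-blog-post"]))
```

Note that the alternates are repeated under every language version of a page, so the file grows with the square of the number of languages per page.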

HrefLang link in page header

You can also add the tag in your HTML page header like this:

<link rel="alternate" hreflang="es" href="" />

As mentioned in the same blog post, all translations of the page, whether the content is translated or just the UI, need a link in the header, including the current language. The Metatag module implements a lot of page header attributes, but at this stage it surprisingly doesn't support the hreflang tag yet, although there is a feature request pending. Another module does add rudimentary support for the hreflang tag: the i18n_hreflang module, which ships with the Internationalization Contributions module (8). The problem with that module is that, for some reason, the hreflang tags are only added to the front page and not to the rest of the pages. A patch is pending review in this issue queue.
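The rule that every page links to all of its language versions, itself included, can be sketched in a few lines of Python. This is only an illustration of the expected output, not code from the i18n_hreflang module; it assumes each language uses a URL prefix, and the function name and the example.com domain are invented.

```python
from html import escape


def hreflang_links(base_url, languages, path):
    """Return one <link rel="alternate"> tag per enabled language.

    The list is the same on every language version of the page, so the
    current language is always included.
    """
    tags = []
    for lang in languages:
        # Assumption: each language version is reachable at <base>/<lang>/<path>.
        href = "%s/%s/%s" % (base_url.rstrip("/"), lang, path.lstrip("/"))
        tags.append(
            '<link rel="alternate" hreflang="%s" href="%s" />'
            % (escape(lang, quote=True), escape(href, quote=True))
        )
    return tags


if __name__ == "__main__":
    for tag in hreflang_links("http://example.com", ["es", "de"], "blog/this-is-a-blog-post"):
        print(tag)
```

If your default language has no URL prefix, the way the href is built would need to change accordingly.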
With the patched XML Sitemap module and the patched i18n_hreflang module you should be able to improve the SEO of your multilingual website considerably. I've applied both to this very website and saw considerable improvements. For example, typing 'drupal madrid' into the Spanish version of Google would previously not show Nuez Web until the 6th page, and with the description in English! Now we are somewhere on the second page (and climbing, I hope).
SEO is not an exact science, and lots of opinions and facts are mixed up in the various blog posts I've read about it. If you think I'm missing something, please let me know in the comments.