Duplicate content problem with WordPress and Drupal

Robots.txt VS noindex, follow meta tag

I’ve run into a somewhat weird problem with my WordPress blog. Since 3 February 2011 my traffic has dropped seriously. When I checked Google Webmaster Tools, I saw that nobody could find my website in Google anymore. I hadn’t had this problem in 2 years, and now it suddenly appeared. So I had to find a solution. But first I had to know what made me “disappear” from Google.

Step 1: Why can’t I find any of my permalinks (my single-page URLs) in the Google index?

The first thing I did was look for my website in Google with the site: operator, for example site:blog.linxiting.com. When I did the search I saw this:

Date index by google

To my surprise, all of my tag, category and date pages were indexed by Google, but 99% of my permalinks had been removed from the Google index. So why did the Googlebot (the crawler that visits my website) make my permalinks disappear?

Step 2: Why did the Googlebot make my permalinks disappear?

I searched the internet and pretty quickly found what the problem was. Because of my tag, category and date pages, the Googlebot thought I had duplicate content.

The Googlebot found my tag page first, and when it then saw my permalink page, it said to itself (I know it can’t speak, but it’s a figure of speech :p): I already found this, so I don’t have to add it to the index anymore, this is duplicate content.

Step 3: How can I delete the tag-, category and date pages from the Google index?

I did not find much information about this, because I did not know how to delete these kinds of pages from the Google index. Thankfully the Google forum exists. I posted the question there, and after a long conversation with Webado we found the solution. But first some info on what I did wrong.

Step 4: What did I do wrong – Robots.txt?

I created a Robots.txt (for more info on that visit the website) and told the Googlebot not to crawl the pages with the following paths: /category/, /page/, /2009/, /2010/, /2011/, /2012/, /tag/, etc… It looked like this:

Robots.txt example

So the Robots.txt made sure the above paths weren’t crawled anymore. But silly me: the pages that were already indexed stay indexed, because the Googlebot no longer crawls them and so cannot delete them. Keep your Robots.txt on your local drive, it will come in handy later.
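
To make the effect of those rules concrete, here is a minimal Python sketch using the standard library's urllib.robotparser. The rules are an assumption rebuilt from the paths named above (the original file is only shown as an image), and /my-post-title/ is a hypothetical permalink, not a real URL from this blog.

```python
from urllib.robotparser import RobotFileParser

# Assumed rules, rebuilt from the paths named in the post.
# The real Robots.txt is not reproduced here and may have differed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /category/
Disallow: /page/
Disallow: /tag/
Disallow: /2009/
Disallow: /2010/
Disallow: /2011/
Disallow: /2012/
"""


def blocked_paths(robots_txt, paths, agent="Googlebot"):
    """Return the paths that the given crawler is not allowed to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(agent, p)]


# Hypothetical example URLs: three archive pages and one permalink.
EXAMPLE_PATHS = ["/tag/seo/", "/category/news/", "/2011/02/", "/my-post-title/"]

if __name__ == "__main__":
    for path in blocked_paths(ROBOTS_TXT, EXAMPLE_PATHS):
        print("blocked:", path)
```

Every path reported as blocked is never fetched again, so the crawler cannot notice any change made to those pages afterwards.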

Step 5: The solution – noindex, follow meta tag

What is the noindex, follow meta tag? Well, it’s simple: if you add this meta tag to a webpage, the Googlebot will know this page doesn’t want to be indexed in Google, while the links on it may still be followed. So what did I do about my problem? I deleted the paths from my Robots.txt and made it almost empty. It looked like this:

Robots.txt for noindex, follow meta tag

You can see I’ve deleted some paths from my first Robots.txt. The next thing I did was add the noindex, follow meta tag to all of the pages that don’t have to be indexed by Google, using a great WordPress plugin, the Ultimate Noindex Nofollow Tool II. Drupal users can find more information on Tom Lamberts website.
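
For reference, the robots meta tag is normally written in the page head as `<meta name="robots" content="noindex, follow">`. Below is a small Python sketch, using only the standard library's html.parser, that checks whether a page carries it. The sample HTML is made up for illustration and is not actual output of the plugin.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma separated, e.g. "noindex, follow".
        for directive in content.split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html):
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def is_noindex_follow(html):
    """True if the page asks not to be indexed but allows link following."""
    found = robots_directives(html)
    return "noindex" in found and "nofollow" not in found


# Hypothetical tag-archive page, for illustration only.
SAMPLE_TAG_PAGE = (
    '<html><head><meta name="robots" content="noindex, follow">'
    "</head><body></body></html>"
)

if __name__ == "__main__":
    print(sorted(robots_directives(SAMPLE_TAG_PAGE)))
```

A crawler only sees this tag if it is allowed to fetch the page, which is why the Robots.txt had to be opened up first.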

Step 6: Wait and update Robots.txt again

Now I had to wait for the Googlebot to see all the pages with the noindex, follow meta tag. After a week I searched for my website in Google again with site:blog.linxiting.com. Now I saw that all of my tag, category and date pages had been removed from the index and my permalink pages (single-page URLs) had reappeared. It now looks like this:

My permalinks are back in the Google index

Great!! I’m back! But I’m not done just yet. Now I had to restore my Robots.txt to the first one (see my first Robots.txt image). Now I never have to fear duplicate content anymore. Okay, I know, you always have to write unique stuff to make sure you never have to fear duplicate content, but I think the people who are reading this post already know that…

So this was my journey with my duplicate content problem. Hopefully I helped someone with this post, and if you have trouble understanding something, post your question below. I will try to answer it as well and as understandably as possible.