I’ve run into a somewhat weird problem with my WordPress blog. Since 3 February 2011 my traffic has dropped seriously. When I checked Google Webmaster Tools, I saw that nobody could find my website in Google anymore. I hadn’t had this problem in 2 years, and now it suddenly appeared. So I had to find a solution. But first I had to know what made me “disappear” from Google.
Step 1: Why can’t I find any of my permalinks (my single-page URLs) in the Google index?
The first thing I did was search for my website in Google with the site: operator, for example site:blog.linxiting.com. When I did the search I saw this:
To my surprise, all of my tag, category and date pages were indexed by Google, but 99% of my permalinks had been dropped from the Google index. So why did the Googlebot (the crawler that visits my website) make my permalinks disappear?
Step 2: Why did the Googlebot make my permalinks disappear?
I searched the internet and pretty quickly found what the problem was. Because of my tag, category and date pages, the Googlebot thought I had duplicate content.
The Googlebot found my tag page first, and when it then saw my permalink page, it said to itself (I know it can’t speak, but it’s a figure of speech :p): I already found this, so I don’t have to add it to the index anymore, this is duplicate content.
Step 3: How can I delete the tag, category and date pages from the Google index?
I did not find much information about this, because I did not know how to delete these kinds of pages from the Google index. Thankfully the Google forum exists. I posted the question there, and after a long conversation with Webado we found the solution. But first some info on what I did wrong.
Step 4: What did I do wrong – Robots.txt?
I created a Robots.txt (for more info on that visit the website) and told the Googlebot not to crawl the pages with the following URL paths: /category/, /page/, /2009/, /2010/, /2011/, /2012/, /tag/, etc… It looked like this:
So the Robots.txt made sure the paths above weren’t crawled anymore. But silly me: the pages that were already indexed stay indexed, because the Googlebot no longer crawls them and so never gets the chance to drop them. Keep your Robots.txt on your local drive, it will come in handy later on.
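The original screenshot is not reproduced here, so the following is only a rough Python sketch of what such rules do. The file content is an assumption rebuilt from the paths listed above (the "etc…" entries are left out), and /my-post-title/ is a made-up permalink. It uses the standard library's urllib.robotparser to ask whether a crawler may fetch a path.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt, rebuilt from the paths named in the post.
# The real file may have contained more rules.
ROBOTS_TXT = """\
User-agent: *
Disallow: /category/
Disallow: /page/
Disallow: /tag/
Disallow: /2009/
Disallow: /2010/
Disallow: /2011/
Disallow: /2012/
"""


def is_crawlable(path, robots_txt=ROBOTS_TXT, agent="Googlebot"):
    """Return True if the given robots.txt lets the agent fetch the path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # The last path is a hypothetical permalink, not one from the blog.
    for path in ("/tag/mobile/", "/category/wordpress/", "/2011/02/", "/my-post-title/"):
        print(path, is_crawlable(path))
```

The archive paths come back as blocked while the permalink stays crawlable. That is exactly the trap: a blocked page is never fetched again, so the crawler cannot see anything on it that would tell it to drop the page from the index.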
Step 5: The solution – noindex, follow meta tag
What is the noindex, follow meta tag? Well, it’s simple: if you add this meta tag to a webpage, the Googlebot knows the page should not be indexed in Google, while the links on it may still be followed. So what did I do about my problem? I deleted those rules from my Robots.txt and made it almost empty, so the Googlebot could crawl the pages again and see the tag. It looked like this:
You can see I’ve deleted some rules from my first Robots.txt. The next thing I did was add the noindex, follow meta tag to all of the pages that should not be indexed by Google, using a great WordPress plugin, the Ultimate Noindex Nofollow Tool II. Drupal users can find more information on Tom Lamberts website.
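To make the idea concrete, here is a minimal Python sketch of the decision such a plugin makes. It is not the plugin's actual code. The pattern list is an assumption based on the archive paths mentioned in this post, and it presumes that permalinks do not themselves start with a year or with /tag/, /category/ or /page/.

```python
import re

# Hypothetical archive patterns: tag, category and paginated listings,
# plus date archives that start with a four-digit year.
ARCHIVE_PATTERN = re.compile(r"^/(category|tag|page)/|^/\d{4}/")

# The standard robots meta tag: keep the page out of the index,
# but still let the crawler follow the links on it.
ROBOTS_NOINDEX_FOLLOW = '<meta name="robots" content="noindex, follow">'


def robots_meta_for(path):
    """Return the noindex, follow tag for archive pages, '' for everything else."""
    if ARCHIVE_PATTERN.search(path):
        return ROBOTS_NOINDEX_FOLLOW
    return ""


if __name__ == "__main__":
    # /my-post-title/ is a made-up permalink used for illustration only.
    for path in ("/tag/mobile/", "/2011/02/", "/my-post-title/"):
        print(path, "->", robots_meta_for(path) or "(no robots meta tag)")
```

The tag belongs in the head of the page. It only has an effect if the crawler is allowed to fetch the page, which is why the Robots.txt rules had to go first.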
Step 6: Wait and update Robots.txt again
Now I had to wait for the Googlebot to see all the pages with the noindex, follow meta tag. After a week I searched for my website in Google again with site:blog.linxting.com. Now I saw that all of my tag, category and date pages had been removed from the index and my permalink pages (single-page URLs) had reappeared. It now looks like this:
Great!! I’m back! But I’m not done just yet. Now I had to change my Robots.txt back to the first version (see my first Robots.txt image). Now I never have to fear duplicate content anymore. Okay, I know, you always have to write unique stuff to make sure you never have to fear duplicate content, but I think the people who are reading this post already know that…
So this was my journey through my duplicate content problem. Hopefully I helped someone with this post, and if you have trouble understanding something, post your question below. I will try to answer it as well and as clearly as possible.
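While waiting, it can help to confirm that the archive pages really carry the tag. The sketch below is one possible check, not something from the original workflow. It reads the robots directives out of a page's HTML with Python's built-in html.parser, and the sample HTML is made up for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        # "noindex, follow" becomes {"noindex", "follow"}
        for part in content.split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def robots_directives(html):
    """Return the set of robots directives found in the given HTML source."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    # Made-up page source standing in for a tag archive page.
    sample = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
    print(robots_directives(sample))
```

Feeding it the saved source of a tag page and of a permalink page shows quickly whether the plugin settings apply to the right pages.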

Hi, I had the same issue and could not find any information about it apart from this article. All my post URLs were listed with my website address followed by a category or tag. I have been pulling my hair out for months trying to sort this out. Hopefully the actions you have mentioned above have helped. Fingers crossed. Thanks
Yes, hopefully they will! Let me know!
I am facing a similar issue. I have googled for a solution, but nothing came up apart from your article.
I intend to remove only the page navigation from the Google index. Take, for example, the URL “/tag/mobile”. Google now has multiple copies of this tag page, viz. “/tag/mobile/page=1”, “/tag/mobile/page=12”, etc. Is there a way to remove all of these URLs except for the tag page itself, i.e. “/tag/mobile”?
I have been looking for an answer to this but have been unsuccessful so far.
Have you tried the no index no follow plugin?