Edwin says:
Thanks, I added the robots.txt to my blog.
Wasabi says:
Try this:
# This group applies to all user-agents
User-agent: *
# Disallow these directories and everything inside them
Disallow: /cgi-bin/
Disallow: /stats/
Disallow: /dh_
Disallow: /about/legal-notice/
Disallow: /about/copyright-policy/
Disallow: /about/terms-and-conditions/
Disallow: /contact/
Disallow: /tag/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/
Disallow: /wp-
Disallow: /trackback/
# Googlebot is Google's main search bot
User-agent: Googlebot
# Disallow all files ending with these extensions
Disallow: /*.php$
Disallow: /*.js$
Disallow: /*.inc$
Disallow: /*.css$
Disallow: /*.gz$
Disallow: /*.wmv$
Disallow: /*.tar$
Disallow: /*.tgz$
Disallow: /*.cgi$
Disallow: /*.xhtml$
# Disallow Google from parsing individual post feeds and trackbacks
Disallow: */feed/
Disallow: */trackback/
# Disallow all URLs containing a ?
Disallow: /*?*
Disallow: /*?
# Disallow all monthly archives
Disallow: /2006/0*
Disallow: /2007/0*
Disallow: /2006/1*
Disallow: /2007/1*
# Googlebot-Image is Google's image bot
User-agent: Googlebot-Image
# Allow Everything
Allow: /*
# This is Google's ad bot
User-agent: Mediapartners-Google*
# Allow Everything
Allow: /*
# Sitemap for search engines
Sitemap: http://siteweb/sitemap.xml
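If you want to check which URLs the wildcard rules above would catch before uploading the file, here is a rough Python sketch. It assumes the Google-style matching where `*` stands for any run of characters, a trailing `$` anchors the end of the URL, and everything else is a plain prefix match. The helper names `rule_to_regex` and `is_blocked` are made up for this example, and it only looks at Disallow patterns: it does not handle Allow precedence or choosing the right User-agent group. On that last point, Google documents that a crawler follows only the most specific group that matches it, so Googlebot would read its own block and skip the `*` block.

```python
import re
from typing import Iterable, Pattern


def rule_to_regex(pattern: str) -> Pattern[str]:
    """Translate one robots.txt path pattern into a compiled regex.

    Assumption: '*' matches any run of characters and a trailing '$'
    anchors the end of the URL; all other characters are literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(path: str, disallow_rules: Iterable[str]) -> bool:
    """Return True if any Disallow pattern matches from the start of path."""
    return any(rule_to_regex(rule).match(path) for rule in disallow_rules)


if __name__ == "__main__":
    googlebot_rules = ["/*.php$", "*/feed/", "/*?", "/2007/0*"]
    for path in ["/index.php", "/2007/05/my-post/feed/", "/my-post/", "/search?q=x"]:
        print(path, is_blocked(path, googlebot_rules))
```

Swap in your own paths in the loop at the bottom to see what would be blocked. Treat the result as a sanity check only and confirm with the search engine's own robots.txt tester.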
EngLee says:
Wow, that’s an advanced version. Will examine your robots.txt file. Thanks.