How to add a page to the robots.txt file

10 years ago
Hi,

It looks like Google Webmaster Tools is reading a generic version of the nopCommerce robots.txt file from my web site. How can I add specific pages to my robots.txt file?

Do I need to update inside the Common Controller and add them to the list?

Thanks for any help.
Ted
10 years ago
Hi Ted,

To disallow specific pages from web crawlers/search engine indexers, edit your robots.txt file and add the pages you want to disallow.

If you do not already have one, you can either create one yourself from scratch or, if you're feeling a bit lazy, generate one here (http://www.mcanerin.com/EN/search-engine/robots-txt.asp).

If you would like to see an example of a robots.txt file, take a look at http://www.last.fm/robots.txt.

For the full documentation, go to http://www.robotstxt.org/robotstxt.html
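For example, a minimal robots.txt that keeps crawlers away from a couple of specific pages could look like the sketch below (the two paths are only placeholders for whatever URLs you actually want excluded):

            User-agent: *
            Disallow: /some-private-page
            Disallow: /another-private-page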
10 years ago
Tbone123 wrote:
Do I need to update inside the Common Controller and add them to the list?

Right. The robots.txt file is generated in that controller.
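For reference, the action in that controller builds the file from a list of disallowed paths and then writes the result to the response. A rough sketch (the action name, default entries and the added page are illustrative, not the exact nopCommerce source):

            //rough sketch of the robots.txt action (illustrative only)
            public ActionResult RobotsTextFile()
            {
                //default paths that should not be crawled (illustrative entries)
                var disallowPaths = new List<string>
                {
                    "/bin/",
                    "/content/files/"
                };
                //add the specific page(s) you want to exclude (hypothetical example)
                disallowPaths.Add("/my-private-page");

                var sb = new StringBuilder();
                sb.Append("User-agent: *");
                sb.Append(Environment.NewLine);
                foreach (var path in disallowPaths)
                {
                    sb.AppendFormat("Disallow: {0}", path);
                    sb.Append(Environment.NewLine);
                }

                Response.ContentType = "text/plain";
                Response.Write(sb.ToString());
                return null;
            }

Any page you add to that list ends up as a Disallow line in the generated file.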
10 years ago
Hi,
I have nopCommerce running and bought a domain on which Magento was running. It has over 30K 404 error pages, so I cannot hard-code them all. What I suggest are the following options:

1. Add a robots.txt location; if it is present, use that instead of the hard-coded URLs. Of course, still provide the list that should be added to the robots.txt.

2. Implement an option to extend the robots file with the following code snippet:

            //load robots.txt additions and append them to the end of the generated file
            //(or even load a complete custom file and use it instead of the generated one)
            string pathToFiles = Server.MapPath("~/Content/");
            string robotsFile = System.IO.Path.Combine(pathToFiles, "Robots.txt");
            if (System.IO.File.Exists(robotsFile))
            {
                string robotsFileContent = System.IO.File.ReadAllText(robotsFile);
                sb.Append(robotsFileContent);
            }

            Response.ContentType = "text/plain";
            Response.Write(sb.ToString());
            return null;


In this way I am able to extend the file with the 30K lines.

J.

Note: I also read that Google wants a nofollow tag when a 404 page is thrown; that will stop indexing. The other option is to extend the robots.txt file. It might be an idea to add this tag as well (see the crawl errors in Google Webmaster Tools).


        //page not found
        public ActionResult PageNotFound()
        {
            this.Response.StatusCode = 404;
            this.Response.TrySkipIisCustomErrors = true;
            //20140124 QE JO WorkItem: --> Too many 404 errors in Google Webmaster Tools
            //https://developers.google.com/webmasters/control-crawl-index/docs/robots_meta_tag?hl=nl&csw=1
            this.Response.AddHeader("X-Robots-Tag", "noindex, nofollow");
            return View();
        }
10 years ago
I added to robots.txt file:

Disallow: /*?SID=
Disallow: /*?q=
Disallow: /eucookielawaccept
Disallow: /*?viewmode
Disallow: /account/register
Disallow: /cart
Disallow: /*p=


And other URLs from the 404 crawl errors in Google Webmaster Tools.
10 years ago
I found out that it would be best if this were settable per shop.
9 years ago
Is it possible to add exclusions to Robots.txt without modifying CommonController?

For example: to exclude certain routes of a plugin, it would be great to be able to extend the content of Robots.txt directly from the plugin's code.

If not, it would be a great new feature.
8 years ago
If you want to change robots.txt without replacing the CommonController, just comment out or remove this section in the Web.config file:
<add name="RobotsTxt" path="robots.txt" verb="*" type="System.Web.Routing.UrlRoutingModule" resourceType="Unspecified" preCondition="integratedMode" />

And then add your own robots.txt file to the root directory.
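For reference, that handler entry sits in the <handlers> section under <system.webServer> in Web.config; commenting it out would look roughly like this (surrounding entries omitted):

            <system.webServer>
              <handlers>
                <!-- other handler entries -->
                <!-- commented out so IIS serves the static robots.txt from the site root instead -->
                <!--<add name="RobotsTxt" path="robots.txt" verb="*" type="System.Web.Routing.UrlRoutingModule" resourceType="Unspecified" preCondition="integratedMode" />-->
              </handlers>
            </system.webServer>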
8 years ago
            string robotsFile = System.IO.Path.Combine(_webHelper.MapPath("~/"), "robots.custom.txt");
            if (System.IO.File.Exists(robotsFile))
            {
                //the robots.txt file exists
                string robotsFileContent = System.IO.File.ReadAllText(robotsFile);
                sb.Append(robotsFileContent);
            }
            else
            {
                //doesn't exist. Let's generate it (default behavior)

                var disallowPaths = new List<string>

Look at the code. You can create a robots.custom.txt file and put your entries in it; that way you don't need to change any of the code.
7 years ago
anik1991 wrote:
            string robotsFile = System.IO.Path.Combine(_webHelper.MapPath("~/"), "robots.custom.txt");
            if (System.IO.File.Exists(robotsFile))
            {
                //the robots.txt file exists
                string robotsFileContent = System.IO.File.ReadAllText(robotsFile);
                sb.Append(robotsFileContent);
            }
            else
            {
                //doesn't exist. Let's generate it (default behavior)

                var disallowPaths = new List<string>

Look at the code. You can create a robots.custom.txt file and put your entries in it; that way you don't need to change any of the code.


Where should I put the robots.custom.txt file? I mean the exact path.
This topic was automatically closed 365 days after the last reply. New replies are no longer allowed.