Hi,
I have nopCommerce running and bought a domain on which Magento was previously running. It has over 30K 404 error pages, which I cannot add in the code / hardcode. What I suggest are the following options:
1. Add a configurable robots.txt location; if a file is present there, use it instead of the hard-coded URLs. Of course, provide a list of entries that should always be added to the robots.txt.
2. Implement an option to extend the robots.txt output with the next code snippet:
// Load robots.txt additions and append them to the end of the generated file.
// Or even load a complete custom file, ignoring the generated content.
string pathToFiles = Server.MapPath("~/Content/");
string robotsFile = System.IO.Path.Combine(pathToFiles, "Robots.txt");
if (System.IO.File.Exists(robotsFile))
{
    string robotsFileContent = System.IO.File.ReadAllText(robotsFile);
    sb.Append(robotsFileContent);
}
Response.ContentType = "text/plain";
Response.Write(sb.ToString());
return null;
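The custom Content/Robots.txt file would then simply contain the extra rules, one per line. For example (these paths are made up; in my case they would be the old Magento URLs):
Disallow: /index.php/checkout/cart/
Disallow: /customer/account/login/
Disallow: /old-category/old-product.html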
In this way I would be able to extend the file with the 30K lines.
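As a one-off helper, something like the following could convert the list of dead URLs (e.g. exported from the Google Webmaster Tools crawl errors) into Disallow lines. This is just a sketch: the file names are placeholders and it assumes the export contains full absolute URLs.
// One-off console helper (sketch): converts a plain list of absolute URLs
// into "Disallow:" lines and appends them to Robots.txt.
using System;
using System.IO;
using System.Linq;

class BuildRobotsAdditions
{
    static void Main()
    {
        // Input: one absolute URL per line, e.g. http://example.com/old-page.html
        var deadUrls = File.ReadAllLines("magento-404-urls.txt");
        var disallowLines = deadUrls
            .Where(u => !string.IsNullOrWhiteSpace(u))
            .Select(u => "Disallow: " + new Uri(u).PathAndQuery);
        File.AppendAllLines("Robots.txt", disallowLines);
    }
}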
J.
Note: I also read that Google wants a noindex/nofollow tag when a 404 page is thrown; that will stop indexing. Another option is to extend the robots.txt file. Maybe it is an idea to add this tag as well (see the Google Webmaster Tools crawl errors). Check here and here.
// Page not found
public ActionResult PageNotFound()
{
    this.Response.StatusCode = 404;
    this.Response.TrySkipIisCustomErrors = true;
    // 20140124 QE JO WorkItem: --> Too many 404 errors in Google Webmaster Tools
    // https://developers.google.com/webmasters/control-crawl-index/docs/robots_meta_tag?hl=nl&csw=1
    this.Response.AddHeader("X-Robots-Tag", "noindex, nofollow");
    return View();
}
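For completeness, the action above would be reached through a catch-all route registered after all other routes, roughly like this (a sketch; the controller and route names are my assumptions):
// Registered last so it only catches URLs no other route matched.
routes.MapRoute(
    name: "PageNotFound",
    url: "{*url}",
    defaults: new { controller = "Common", action = "PageNotFound" });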