Shouldn't the /admin directory be disallowed from the robots.txt

This topic was automatically closed 365 days after the last reply. New replies are no longer allowed.
7 years ago
Some admin pages are showing up in Google searches on our NopCommerce deployment.

Example page: https://store.domain.tld/admin/customer/Edit/1

Somehow, the search result shows details from the page, such as who the customer is, even though the page forces a redirect.

Shouldn't the /admin directory be disallowed from the robots.txt?

I am looking at the code and it is not excluded.
https://github.com/nopSolutions/nopCommerce/blob/develop/src/Presentation/Nop.Web/Controllers/CommonController.cs

I can add the exclusion, but admin requires a login anyway.

Why and how are pages from /admin showing up?
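
For reference, here is a minimal sketch of what such an exclusion would tell a well-behaved crawler, using Python's standard `urllib.robotparser`. The robots.txt body and the helper name `is_crawl_allowed` are assumptions for illustration, not what nopCommerce actually generates.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body. The Disallow rule is an assumption for
# illustration, not the file nopCommerce actually serves.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin
"""


def is_crawl_allowed(url: str, user_agent: str = "*") -> bool:
    """Return True if a compliant crawler may fetch the URL under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Admin page from the example above, and a made-up public page.
    print(is_crawl_allowed("https://store.domain.tld/admin/customer/Edit/1"))
    print(is_crawl_allowed("https://store.domain.tld/some-product"))
```

Note that the rule is only a request to crawlers. It does not protect the pages, which is what the login is for.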
7 years ago
We found this post: https://www.nopcommerce.com/boards/t/37032/builtinsearch_engine_recordcom-is-this-some-spam-or-.aspx

Turns out that there is a builtin@search_engine_record.com account. On our system it was a member of the administrators group. My guess is that it shouldn't be an administrator and that this is the cause.

Now, how did this user become an administrator?
7 years ago
rhyous wrote:
Shouldn't the /admin directory be disallowed from the robots.txt?

Sure. I've just created a work item. But anyway, guests don't have access to the admin area.

rhyous wrote:
Turns out that there is a builtin@search_engine_record.com account. On our system it was a member of the administrators group. My guess is that it shouldn't be an administrator and that this is the cause.

Now, how did this user become an administrator?

Of course, it should not be in this role. Otherwise, search engines will have access to the admin area. But this customer record is not in this role by default. I presume somebody added it to the role.
7 years ago
Thanks. I appreciate you creating a work item.
6 years ago
Dear Andrei and nopCommerce team,

Sorry to reopen this case, but I saw that nopCommerce version 3.90 addresses this issue by adding /admin/* to the robots.txt. A robots.txt entry only asks crawlers not to fetch those pages. It does not keep Google & co. from indexing the URLs, so they can still show up in search results if something links to them. See also: https://stackoverflow.com/a/18316292

Hence, I would, if I may, recommend using the following code (meta tag) on the admin layout page to keep search engines from indexing these pages. Note that a crawler has to be able to fetch a page to see the tag, so it only takes effect on pages that robots.txt does not block:

<meta name="robots" content="noindex">
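
To check whether a rendered page actually carries the tag, something like the following rough sketch could be used. It relies only on Python's standard `html.parser`; the class name `RobotsMetaParser` and the helper `has_noindex` are made up for this example and are not part of nopCommerce.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Attribute names arrive lowercased; values keep their original case.
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        for part in content.split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex"></head></html>'
    print(has_noindex(page))
```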


Best, Max