4.0 not using browscap.xml?

This topic was automatically closed 365 days after the last reply. New replies are no longer allowed.
6 years ago
I was having an issue with multiple customers being created due to my use of UpTimeRobot.

The line:
   crawlerItems = XDocument.Load(sr).Root?.Elements("browscapitem").ToList();

returns a zero-length list (because there are no elements in the crawlers-only file). However, the next line checks for null:

   if (crawlerItems == null)

Because crawlerItems is a zero-length list rather than null, the check never matches and the full browscap.xml file is never loaded.
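The null-vs-empty distinction is the crux of the bug. A minimal sketch (using Python's ElementTree as a stand-in for XDocument; the file contents below are just the empty root described later in this thread):

```python
import xml.etree.ElementTree as ET

# An empty crawlers-only file, as described above: a bare root element.
empty_doc = "<browsercapitems/>"
root = ET.fromstring(empty_doc)

# findall() on a root with no matching children returns an empty list,
# not None -- so a null/None check never fires.
crawler_items = root.findall("browscapitem")
print(crawler_items is None)   # False: the list exists, it is just empty
print(len(crawler_items))      # 0

# The guard would have to test for emptiness as well as for null/None:
if crawler_items is None or len(crawler_items) == 0:
    print("fall back to the full browscap.xml")
```

The same holds for LINQ's `.ToList()` in the C# code: it always returns a list object, so `crawlerItems == null` can never be true on that path.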



In a related item, I tried deleting the browscap.crawlersonly.xml file, which should cause the system to pull all the crawler items from browscap.xml and place them in browscap.crawlersonly.xml. However, the line:

   crawlerItems = XDocument.Load(sr).Root?.Elements("browscapitem")
      //only crawlers
      .Where(IsBrowscapItemIsCrawler).ToList();

doesn't load any items. Am I missing something? Based on this, crawler bots may be generating a large number of blank customers in the customer table because the system isn't classifying them as bots correctly.
6 years ago
Looks like part of the problem is that the line should read:

   crawlerItems = XDocument.Load(sr).Root?.Elements("browsercapitems")?.Elements("browscapitem")

The browscapitem elements are nested inside a browsercapitems element, so you need to reference both.

I'm still pretty sure that the check for null is incorrect as well.
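The nesting can be illustrated with a small sketch (using Python's ElementTree as a stand-in for XDocument; the fragment below is made up for illustration):

```python
import xml.etree.ElementTree as ET

# A made-up fragment with the nesting described above: browscapitem
# elements sit inside a browsercapitems wrapper under the root.
doc = """
<root>
  <browsercapitems>
    <browscapitem name="Googlebot"/>
    <browscapitem name="bingbot"/>
  </browsercapitems>
</root>
"""
root = ET.fromstring(doc)

# Looking only at the root's immediate children finds nothing:
print(len(root.findall("browscapitem")))                  # 0

# Referencing both levels, as in the corrected line, finds the items:
print(len(root.findall("browsercapitems/browscapitem")))  # 2
```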
6 years ago
mdlyen wrote:
Looks like part of the problem is that the line should read:

   crawlerItems = XDocument.Load(sr).Root?.Elements("browsercapitems")?.Elements("browscapitem")

The browscapitem is nested inside a browsercapitems element so you need to reference both.

I'm still pretty sure that the check for null is incorrect as well.

Hi.
I'm not sure, but it works for me:

   XDocument xDoc2 = XDocument.Load(xmlYOURFilePath);
   IEnumerable<XElement> xElms2;

   xElms2 = xDoc2.Descendants...


mdlyen wrote:

.....
if (crawlerItems == null)
.....
I'm still pretty sure that the check for null is incorrect as well.



I suppose it should check for an empty string rather than null ... you can do, for example:

   foreach (XElement xElm2 in xElms2)
   {
      if (xElm2.Element("XXXXXXXX").Value == ds.Tables[0].Rows[ii].ItemArray[0].ToString())
      {
         listBox1.Items.Add("XXXXXXX = " + xElm2.Element("XXXXXXX").Value);
         listBox1.Items.Add("YYYYYYYY = " + xElm2.Element("YYYYYYY").Value);
      }
   }
...

Best Regards.
6 years ago
Do you have App_Data\browscap.crawlersonly.xml file? Does it have any records?

I've just found an issue where this file is not properly re-created when it didn't exist. Although this file exists by default in all 4.00 packages, it's not available in the github repository (https://github.com/nopSolutions/nopCommerce/tree/develop/src/Presentation/Nop.Web/App_Data). So I presume you've cloned the package directly from the github repository. In the meantime, I would recommend you download this browscap.crawlersonly.xml file and upload it to your \App_Data folder.
6 years ago
a.m. wrote:
Do you have App_Data\browscap.crawlersonly.xml file? Does it have any records?

I've just found an issue when this file is not properly re-created (when didn't exist). Although this file exists by default in all 4.00 packages, it's not available on github (https://github.com/nopSolutions/nopCommerce/tree/develop/src/Presentation/Nop.Web/App_Data) repository. So I presume you've cloned the package directly from github repository .In the meantime I would recommend you to download this browscap.crawlersonly.xml file and upload to your \App_Data


For future reference (until it gets fixed): you are correct that I cloned the repository and deployed from there.

I think this is what happened:

1. Deployed without the browscap.crawlersonly.xml file (by design).
2. The first time the application ran, it attempted to read the browscap.xml file.
3. Because of the bug above, this resulted in an XML document containing only the following:

      <?xml version="1.0" encoding="UTF-8"?>
      <browsercapitems/>

4. The next time the application starts, it reads the empty file and doesn't load any records.
5. Every request (including requests from crawlers) is considered a valid session and creates a cookie and a guest customer record.
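Steps 3 and 4 of this sequence can be reproduced with a short sketch (Python's ElementTree standing in for the C# XML handling; the temp-file path is just for the demo):

```python
import xml.etree.ElementTree as ET
import os
import tempfile

# Step 3: the buggy extraction finds no crawler items, so the file that
# gets written out is just an empty root element.
empty_root = ET.Element("browsercapitems")
path = os.path.join(tempfile.mkdtemp(), "browscap.crawlersonly.xml")
ET.ElementTree(empty_root).write(path, encoding="UTF-8", xml_declaration=True)

# Step 4: on the next start, the empty file parses without error but
# yields no records, so no user agent is ever classified as a crawler.
reloaded = ET.parse(path).getroot()
crawler_items = reloaded.findall("browscapitem")
print(len(crawler_items))   # 0 -- every request now looks like a real visitor
```

Because the empty file parses cleanly, nothing ever signals that the crawler list is broken; the failure only shows up indirectly as guest-customer bloat.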

Thanks for creating the bug record and getting this fixed!
6 years ago
mdlyen wrote:
Looks like part of the problem is that the line should read:

   crawlerItems = XDocument.Load(sr).Root?.Elements("browsercapitems")?.Elements("browscapitem")

The browscapitem is nested inside a browsercapitems element so you need to reference both.

I'm still pretty sure that the check for null is incorrect as well.


I just realized the real solution is probably to change the .Elements() to .Descendants()

.Elements() only pulls the immediate children.
.Descendants() pulls all underlying nodes including children, grand-children, etc.

I changed .Elements("browscapitem") to .Descendants("browscapitem") and it seems to be working correctly now.
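The children-vs-descendants difference can be seen with Python's ElementTree analog, where `findall` with a plain tag matches only direct children and `iter` walks the whole subtree (the nested fragment below is hypothetical):

```python
import xml.etree.ElementTree as ET

# Hypothetical nesting: the item we want is a grandchild of the root.
doc = """
<root>
  <browsercapitems>
    <browscapitem name="crawler-1"/>
  </browsercapitems>
</root>
"""
root = ET.fromstring(doc)

# Like .Elements(): direct children only, so the nested item is missed.
print(len(root.findall("browscapitem")))      # 0

# Like .Descendants(): the whole subtree, so the item is found at any depth.
print(len(list(root.iter("browscapitem"))))   # 1
```

The trade-off is that a descendants search matches at any depth, so it also works if the wrapper element is renamed or another level is added, at the cost of being less strict about document shape.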
6 years ago
Hello Mark, thank you for your help, I fixed this problem. Please see this commit for more details
6 years ago
This seems like a fairly large bug in 4.0 to me... why are such things not backported to 4.0 in a maintenance release, e.g. 4.0.1?
5 years ago
Hi,

My site runs abysmally slowly and I'm looking for ways to speed it up. I'm running 4.1 on Everleap with the single-site medium server that Everleap recommends, with 20,000+ items in the database.

Reading this thread about the browscap.xml file, I'm hoping updating it to the newest version will help, since I'm getting about 100k new customers per day added to my database.

Am I correct to assume that 4.1 now creates the browscap.crawlersonly.xml file from the full browscap.xml automatically when the site starts?

The date of the browscap.xml included with 4.1 is May 18, 2018.

My plan, if this will work:

1. Replace the browscap.xml file in the App_Data folder with the newest file from browscap.org.
2. Delete browscap.crawlersonly.xml.
3. Restart nopCommerce.

Am I correct in assuming this will work?

Thanks!