Multilingual blog post sitemap double language V4.60.2

8 months ago
- Version 4.60.2
- "SEO friendly URLs with multiple languages enabled" is checked

When I publish a blog post in one language and check the box to include it in the sitemap, I expect it to appear in the sitemap for that language only, and NOT for the other languages I support.

Let's say I support English (en) and Dutch (nl) on my website, and I publish a blog article:

{website_url}/en/blog/wonderful_blog

What I expect in the sitemap is an entry for the language of the blog post only, because the post is not available in another language:
  <url>
    <loc>{website}/en/wonderful-blog</loc>
    <xhtml:link rel="alternate" hreflang="en" href="{website}/en/wonderful-blog"/>
    <changefreq>weekly</changefreq>
    <lastmod>2022-09-14</lastmod>
  </url>

(or I guess it does not need the alternate tag)

But I get the following sitemap entry:
  <url>
    <loc>{website}/en/wonderful-blog</loc>
    <xhtml:link rel="alternate" hreflang="en" href="{website}/en/wonderful-blog"/>
    <xhtml:link rel="alternate" hreflang="nl" href="{website}/nl/wonderful-blog"/>
    <changefreq>weekly</changefreq>
    <lastmod>2022-09-14</lastmod>
  </url>

Now Google gets both a Dutch and an English version of the same blog article, even though it was written and published only in English. This creates confusion when Google ranks the wrong-language version.
You can land on the {website}/nl/wonderful-blog version, read the blog post in English, and then be greeted by a Dutch website when you click through to another page!
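
To make the expected behaviour concrete, here is a minimal sketch of how a single-language entry could be built. It is not nopCommerce code: the `Post` record, the `SitemapSketch` class and `BuildUrlEntry` are hypothetical names made up for illustration, and the URL layout `{base}/{language}/{slug}` is assumed from the example above.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Xml.Linq;

// Hypothetical stand-in for a blog post; not the nopCommerce entity.
public record Post(string Slug, string LanguageCode, DateTime UpdatedOn);

public static class SitemapSketch
{
    static readonly XNamespace Sm = "http://www.sitemaps.org/schemas/sitemap/0.9";
    static readonly XNamespace Xhtml = "http://www.w3.org/1999/xhtml";

    // Builds one <url> entry. Only the post's own language is used, so no
    // alternate link is emitted for languages the post was not written in.
    public static XElement BuildUrlEntry(string baseUrl, Post post)
    {
        var loc = $"{baseUrl}/{post.LanguageCode}/{post.Slug}";
        return new XElement(Sm + "url",
            new XElement(Sm + "loc", loc),
            new XElement(Xhtml + "link",
                new XAttribute("rel", "alternate"),
                new XAttribute("hreflang", post.LanguageCode),
                new XAttribute("href", loc)),
            new XElement(Sm + "changefreq", "weekly"),
            new XElement(Sm + "lastmod",
                post.UpdatedOn.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture)));
    }
}
```

Calling `SitemapSketch.BuildUrlEntry("https://example.com", new Post("wonderful-blog", "en", new DateTime(2022, 9, 14)))` yields an entry with exactly one alternate link, the English one.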
8 months ago
Thanks a lot for reporting. I've just created a work item.
8 months ago
The work item has an error. It should not list this as the correct situation:

  <url>
    <loc>{website}/en/wonderful-blog</loc>
    <xhtml:link rel="alternate" hreflang="en" href="{website}/en/wonderful-blog"/>
    <xhtml:link rel="alternate" hreflang="nl" href="{website}/nl/wonderful-blog"/>
    <changefreq>weekly</changefreq>
    <lastmod>2022-09-14</lastmod>
  </url>


But this:

  <url>
    <loc>{website}/en/wonderful-blog</loc>
    <xhtml:link rel="alternate" hreflang="en" href="{website}/en/wonderful-blog"/>
    <changefreq>weekly</changefreq>
    <lastmod>2022-09-14</lastmod>
  </url>


I also dug deeper into the source code and found that this is actually two bugs:
1 - The XML sitemap creates a version in every language for each blog post (as I already reported).
2 - The URL sitemap pages at {website}/en/sitemap and {website}/nl/sitemap list ALL blog posts in ALL languages. They should list only the blog posts in the current language.

I've created some quick and dirty fixes with my limited knowledge of the solution.

For the XML sitemap:

In Nop.Web.Factories.SitemapModelFactory.cs line 333

From:

            return await (await _blogService.GetAllBlogPostsAsync(store.Id))
                .Where(p => p.IncludeInSitemap)
                .SelectAwait(async post => await PrepareLocalizedSitemapUrlAsync("BlogPost",
                    async lang => new { SeName = await _urlRecordService.GetSeNameAsync(post, post.LanguageId, ensureTwoPublishedLanguages: false) },
                    post.CreatedOnUtc, UpdateFrequency.Weekly)).ToListAsync();

To:
            return await (await _blogService.GetAllBlogPostsAsync(store.Id))
                .Where(p => p.IncludeInSitemap)
                .SelectAwait(async post => await PrepareLocalizedSitemapUrlAsync("BlogPost",
                    async lang => new { SeName = await _urlRecordService.GetSeNameAsync(post, post.LanguageId, ensureTwoPublishedLanguages: false) },
                    post.CreatedOnUtc, UpdateFrequency.Weekly, post.LanguageId)).ToListAsync();


Around line 797
From:
        public virtual async Task<SitemapUrlModel> PrepareLocalizedSitemapUrlAsync(string routeName,
            Func<int?, Task<object>> getRouteParamsAwait = null,
            DateTime? dateTimeUpdatedOn = null,
            UpdateFrequency updateFreq = UpdateFrequency.Weekly)

To:
        public virtual async Task<SitemapUrlModel> PrepareLocalizedSitemapUrlAsync(string routeName,
            Func<int?, Task<object>> getRouteParamsAwait = null,
            DateTime? dateTimeUpdatedOn = null,
            UpdateFrequency updateFreq = UpdateFrequency.Weekly,
            int langId = 0)


Around line 830
From:
            if (languages == null || languages.Count == 1)
                return new SitemapUrlModel(url, new List<string>(), updateFreq, updatedOn);

To:

            //A blog post belongs to a single language, so build its URL for that language only
            bool isBlog = routeName == "BlogPost";
            if (isBlog)
            {
                var pathBase1 = _actionContextAccessor.ActionContext.HttpContext.Request.PathBase;
                //Extract server and path from url
                var scheme1 = new Uri(url).GetComponents(UriComponents.SchemeAndServer, UriFormat.Unescaped);
                var path = new Uri(url).PathAndQuery;

                //Swap the language SEO code in the path for the one of the post's language
                path = path
                    .RemoveLanguageSeoCodeFromUrl(pathBase1, true)
                    .AddLanguageSeoCodeToUrl(pathBase1, true, await _languageService.GetLanguageByIdAsync(langId));
                url = new Uri(new Uri(scheme1), path).ToString();
            }

            //Blog posts return early, without alternate-language links
            if (languages == null || languages.Count == 1 || isBlog)
                return new SitemapUrlModel(url, new List<string>(), updateFreq, updatedOn);


This fixes the sitemap, so now I get the following for each blog post record:
  <url>
    <loc>https://{website}/en/wonderful_blog</loc>
    <changefreq>weekly</changefreq>
    <lastmod>2023-01-26</lastmod>
  </url>

No Dutch version anymore! But very dirty code ;).


For the URL sitemap, the fix is simpler:

In the same SitemapModelFactory.cs, find:

                //blog posts
                if (_sitemapSettings.SitemapIncludeBlogPosts && _blogSettings.Enabled)
                {
                    var blogPostsGroupTitle = await _localizationService.GetResourceAsync("Sitemap.BlogPosts");
                    var blogPosts = (await _blogService.GetAllBlogPostsAsync(storeId: store.Id))
                        .Where(p => p.IncludeInSitemap);

Change to:

                //blog posts
                if (_sitemapSettings.SitemapIncludeBlogPosts && _blogSettings.Enabled)
                {
                    var blogPostsGroupTitle = await _localizationService.GetResourceAsync("Sitemap.BlogPosts");
                    var blogPosts = (await _blogService.GetAllBlogPostsAsync(storeId: store.Id))
                        .Where(p => p.IncludeInSitemap && p.LanguageId == language.Id);


I don't know what will happen when the site is not multilingual, so this will probably need some more checks, but it works on my site.
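
One possible extra check is sketched below. It is only an illustration, not tested against nopCommerce: `PostInfo`, `BlogSitemapFilter` and `ForLanguage` are hypothetical names, and it assumes that a store with a single published language should keep listing every post regardless of its LanguageId. Whether that is the desired behaviour is itself an assumption.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a blog post; not the nopCommerce entity.
public record PostInfo(string Title, int LanguageId, bool IncludeInSitemap);

public static class BlogSitemapFilter
{
    // Returns the posts to list on the sitemap page for currentLanguageId.
    // publishedLanguageCount is the number of languages published for the store.
    public static List<PostInfo> ForLanguage(
        IEnumerable<PostInfo> posts, int currentLanguageId, int publishedLanguageCount)
    {
        var included = posts.Where(p => p.IncludeInSitemap);

        // Assumption: with a single language there is nothing to separate,
        // so every included post is kept.
        if (publishedLanguageCount <= 1)
            return included.ToList();

        // Multilingual store: keep only posts written in the current language.
        return included.Where(p => p.LanguageId == currentLanguageId).ToList();
    }
}
```

In the factory, the language count would have to come from whatever the solution already uses to load the published languages for the store; that part is left out here on purpose.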