Pitfalls of Joomla! Multilingual Settings and the Current State of SEO
Since the reopening of Store.Alaudae.JP in April 2024, I've been working diligently on SEO, relying heavily on Google Search Console. However, I haven't been seeing the results I wanted, so I've been investigating the root cause.
Initially, I suspected issues with redirect settings for old link pages and the internal link structure. I systematically worked on removing content, but the number of duplicate pages remained stubbornly high. I began to question whether the problem was with my approach. Then, I realized something.
It was a bug in Joomla!'s multilingual settings. As part of my SEO strategy, I added ".html" to the end of URLs and implemented canonical tags. However, I discovered that Googlebot was crawling unintended URLs.
Specifically, where a Japanese page should be "alaudae.jp/sample.html", Googlebot was also selecting and indexing URLs like "alaudae.jp/sample.html?format=html" and "alaudae.jp/ja/sample.html", treating them as duplicate content.
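To make the duplication pattern concrete, here is a minimal Python sketch of how those variants could be mapped back to the intended URL, for example when sorting through a list of URLs exported from Search Console. It is only an illustration, not what Joomla! does internally. The names canonicalize and find_duplicates are hypothetical, and the sketch assumes that "ja" is the only language prefix, that "format" is the only unwanted query parameter, and that URLs are written with the https:// scheme.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumptions for illustration only: "ja" is the only language prefix and
# "format" is the only unwanted query parameter on the site.
LANGUAGE_PREFIXES = ("ja",)
UNWANTED_PARAMS = {"format"}


def canonicalize(url: str) -> str:
    """Return the intended URL for a variant (hypothetical helper)."""
    parts = urlsplit(url)

    # "/ja/sample.html" splits into ["", "ja", "sample.html"];
    # drop the language segment when it is the first path component.
    segments = parts.path.split("/")
    if len(segments) > 2 and segments[1] in LANGUAGE_PREFIXES:
        del segments[1]
    path = "/".join(segments)

    # Keep every query parameter except the unwanted ones.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in UNWANTED_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, path, urlencode(kept), ""))


def find_duplicates(urls):
    """Return the URLs that differ from their canonical form."""
    return [url for url in urls if canonicalize(url) != url]


if __name__ == "__main__":
    crawled = [
        "https://alaudae.jp/sample.html",
        "https://alaudae.jp/sample.html?format=html",
        "https://alaudae.jp/ja/sample.html",
    ]
    for url in find_duplicates(crawled):
        print(url, "->", canonicalize(url))
```

Run against the three example URLs, the sketch flags the "?format=html" and "/ja/" variants as duplicates of "https://alaudae.jp/sample.html" and leaves the intended URL alone.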
Even after I changed the settings of the language filter plugin, the bug kept the language code in the URLs or appended unnecessary parameters, which blocked my attempts to fix the problem at its source. As a result, I've had to resort to the time-consuming process of removing the unintended URLs that Googlebot has already indexed and then requesting re-indexing.
Currently, I'm removing over 1,000 unwanted URLs daily, and I can finally see the finish line. However, with nearly 10,000 pages still problematic, it's uncertain whether I can resolve all issues this year.
Of course, I could use noindex tags to keep these pages out of the index, or robots.txt to stop search engines from crawling them. But with such a large number of duplicate pages, I believe it's better to delete them and eliminate any trace.
My top priority for SEO is to completely clean up these unwanted URLs. Once this task is complete, I can finally start focusing on more advanced SEO strategies.