Website Optimization for Global Crawlers

By Seb · June 5, 2024 · Manchester, UK


A Cautionary Tale of Building My First Single-Page App

As a seasoned Java developer, I’ve spent the better part of my career working on the behind-the-scenes aspects of software systems. But a couple of years ago, a curious itch led me to dip my toes into the world of front-end development. Armed with a passion for learning and a brand new side project, I set out to create a hobby application that would be used by just me and a few friends.

I stumbled upon JHipster, a well-documented development platform that promised to help me build a web application using modern technologies like Angular, React, or Vue for the client-side, and Spring plus Gradle or Maven for the server-side. Within a few short weeks, I had a functioning application that met all my needs. Little did I know, this was just the calm before the storm.

Suddenly, other people started using my application, and I was thrilled. But as the user base grew, so did the challenges. I found myself spending countless nights and weekends trying to improve the application, only to be met with a growing list of roadblocks. It wasn’t long before I realized that my choice of technologies was becoming a hindrance to making the application better.

I didn’t know what I didn’t know.

This classic developer lament sums up my experience. Despite the initial appeal of JHipster and Angular, there were plenty of reasons they were not the best fit for my project – reasons I only learned about later. And as I soon discovered, my lack of knowledge in these areas came back to bite me in the form of issues with search engine optimization (SEO), social sharing, and caching.

The Perils of Single-Page Apps and SEO

One of the first problems I encountered was with SEO. You see, traditional web pages work on a client-server model, where the browser sends a request to the server and the server responds with the full HTML document required to render the page. But single-page apps, like the one I had built, break this paradigm.

In a single-page app, the initial response from the server is just a bare-bones HTML document with a few placeholder elements and links to JavaScript files. It’s these JavaScript files that ultimately fetch the content and dynamically update the HTML in the browser to create the meaningful web page.
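To make that concrete, here is roughly what the initial server response from a typical Angular build looks like (the app name and bundle file names are illustrative):

```html
<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <!-- The title is about the only "content" visible without running JS -->
  <title>MyApp</title>
</head>
<body>
  <!-- Placeholder element; Angular replaces this once the bundles run -->
  <app-root></app-root>
  <!-- All the real content is locked behind these scripts -->
  <script src="runtime.js" defer></script>
  <script src="main.js" defer></script>
</body>
</html>
```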

As an article on the Stack Overflow blog explains, search engine crawlers like Google's don't necessarily execute JavaScript when they crawl a website. They simply analyze the HTML structure to try to understand what the page is about.
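You can simulate what a non-rendering crawler extracts with a few lines of stdlib Python – a crude sketch, using an illustrative SPA shell as input:

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text while skipping <script> contents –
    a rough stand-in for what a non-rendering crawler 'sees'."""
    def __init__(self):
        super().__init__()
        self.in_script = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        if not self.in_script and data.strip():
            self.chunks.append(data.strip())

# Illustrative single-page-app shell, like the one my server returned
spa_shell = """
<html><head><title>MyApp</title></head>
<body><app-root></app-root>
<script src="main.js"></script></body></html>
"""

parser = TextExtractor()
parser.feed(spa_shell)
print(parser.chunks)  # only the <title> text survives
```

Only the title survives; every word of the real content is hidden behind the script bundles, leaving the crawler to guess from scraps.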

And that’s exactly what happened in my case. When I looked at the search analytics for my site, Google was only ranking it for a single keyword – one that had nothing to do with the actual content of my website. It turns out Google’s crawler was interpreting my site as being related to Maven proxy configuration, all because the default HTML template included some occurrences of the words “mvnw” and “proxy”.

Needless to say, my hobby project had nothing to do with Apache Maven. But because the crawler couldn't execute the JavaScript to see the real content, it was left to make its best guess based on the limited HTML structure.

The Challenges of Social Sharing

Another area where I ran into problems was social sharing. Much like search engines, social networks rely on the content of web pages to understand what a page is about. But instead of focusing on the visible content, they tend to rely more on the metadata – the tags inside the HTML head that we humans rarely look at.

When you share a link to a website on Facebook, for example, the first thing that happens is Facebook reads the webpage and generates a nice preview of that article. The preview has a title, a line or two of descriptive text, and an image. But these previews aren’t generated magically – Facebook is relying on the metadata in the HTML to create them.

As Lumar's website explains, many CMSs like WordPress make this metadata easy to manage with plugins. But I was building a brand-new application from scratch, so I would have to create this metadata on my own.

And just like with SEO, the fact that my single-page app’s content was being generated dynamically by JavaScript meant that social networks were only seeing the bare-bones HTML template. The result? Every link shared from my website, whether it was users sharing their custom content or one of the static pages I had created, had the exact same preview. Not exactly the best way to entice people to click through and discover my site.
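For reference, the previews Facebook builds are driven by a handful of Open Graph meta tags in the page head – exactly the tags my single static template could never vary per page. A sketch of what each page would have needed (the values and URLs are illustrative):

```html
<meta property="og:title" content="My Page Title">
<meta property="og:description" content="A line or two describing this specific page.">
<meta property="og:image" content="https://example.com/preview.png">
<meta property="og:url" content="https://example.com/this-page">
```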

The Struggle with Caching

As my user base started to grow, I also became increasingly concerned with the performance of my application. Were my database queries optimized? Was I taxing my MongoDB instance with too many requests? Would new users get frustrated and give up if pages took too long to load?

One of the first things that came to mind was caching. I had worked with enterprise caching solutions in the past, but they all felt like overkill for my little hobby project. Instead, I decided to leverage the power of Cloudflare’s free page caching feature.

The idea was simple – Cloudflare would act as a reverse proxy, caching the responses from my server and serving them up to users, reducing the load on my own infrastructure. But in practice, it didn’t quite work as expected.

As Yoast’s documentation on crawl optimization explains, Cloudflare doesn’t execute any JavaScript before caching the response. It simply takes the raw HTML and caches that. In my case, that meant it was caching the same bare-bones template HTML for every page, rather than the fully rendered content.

So when a user requested a cached page, Cloudflare would serve up the template, and the user’s browser would then have to load the JavaScript files and make additional requests to my server to get the real content. Essentially, I was still facing the same performance challenges I was trying to solve in the first place.
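One easy way to see this in practice is to inspect the response headers Cloudflare adds: the CF-Cache-Status header tells you whether a response was served from the cache. A sketch of such an exchange (the path and values are illustrative):

```http
GET /some-page HTTP/1.1
Host: example.com

HTTP/1.1 200 OK
Content-Type: text/html
CF-Cache-Status: HIT
Cache-Control: public, max-age=14400
```

A HIT here just means Cloudflare returned the stored response – in my case, the same bare template every time.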

The Perils of Technology Envy

Throughout this entire journey, I couldn’t help but feel a little envious of the cool new technologies and frameworks I was seeing others use. As a developer always looking to improve my skills, I was constantly tempted to jump on the latest bandwagon, even if it wasn’t necessarily the best fit for my project.

As the team at Lumar emphasizes, it’s important to be adaptable and know the right tools for the job, rather than chasing the latest buzzwords. And that’s exactly what I failed to do when I started my single-page app project.

I was so caught up in the excitement of learning a new technology that I never stopped to consider how that decision would impact the future of the app. And as new feature requests came in or I had new ideas, I found the choice of framework increasingly frustrating, constantly looking up how to do things and going down countless rabbit holes.

In the end, I realized that my project had taken on a new meaning, and it was no longer the right context for me to explore and learn. I ultimately decided to rewrite the application using technology I was more familiar with, but the experience was invaluable.

Lessons Learned

While this may have been just a hobby project, the lessons I learned along the way have been priceless. Mistakes are good – they help us learn, and they help us make better decisions down the road.

I now have a deeper understanding of the unique challenges that come with building single-page apps, and I can confidently speak to clients about when a single-page app makes sense for their use case. I can also share my experience with colleagues, helping them to avoid some of the pitfalls I had to deal with.

Most importantly, I’ve gained a new appreciation for the importance of choosing the right tools for the job, and not getting caught up in the excitement of the latest technology trends. Because at the end of the day, it’s not about how shiny and new the tools are – it’s about delivering a great user experience and driving real results for your business.

And that’s exactly what the team at MCR SEO in Manchester, UK is all about. They understand that website optimization isn’t just about chasing the latest algorithms – it’s about creating a technical foundation that allows your business to thrive, no matter where in the world your customers may be.

So if you’re looking to take your website to the next level, be sure to give the MCR SEO team a call. With their deep expertise in SEO, site speed, and accessibility, they can help you navigate the ever-changing landscape of global web crawlers and ensure your online presence is truly optimized for success.

Copyright 2023 © MCRSEO.ORG