A Firebase/React project reviewed - now migrating to App Engine/SvelteKit

MartinJ (mjoycemilburn)

Posted on September 26, 2023 - Last reviewed: December 2023

In brief:

What went right on my Firebase/React project (Firebase and React!) - what went wrong (SEO!) - what I plan to do to fix things (Next.js/SvelteKit and SSR?)

  1. Firebase and React/ReactRouter provided a perfect low-cost solution for redevelopment of my PHP "budget" system. Running costs are low and performance is excellent.

  2. I messed up badly over coding my single-page webapp for SEO and failed to get Search Engine indexing. Beyond this, the limitations of Google's support for client-side React rendering were painfully exposed.

  3. I intend to try Server Side Rendering for my next project. Next.js on the Google Cloud Platform will probably be the first port of call. I'm hoping this will fix future SEO problems whilst continuing the positive experience with React and Firebase.

Introduction

Twelve months ago I was looking at the code for a non-profit website. This was hosted on a commercial server and used an unholy mixture of client JavaScript and server PHP; it was a mess and my life as a software developer was miserable. But I'd enjoyed some positive experience tinkering with Firebase and React/React Router technologies and thought I'd give these a try on a serious project.

It took me nine months to rewrite my website (there was a lot to learn) and three months to fully implement (entirely due to SEO difficulties), but the results were spectacularly successful. Performance is excellent and costs are minimal. More significantly, from my personal point of view, life is fun again! Although "my" website is small-scale in terms of data volume, it more than makes up for this in terms of complexity. In the past, bug fixes and enhancements took so long that it got to the point that I could barely face opening the system. By contrast, the client-centric arrangements in my new React system make working on the code a positive joy. I don't think I've ever been so happy (yes, I know - sad, isn't it!)

Here are a few key headings from the experience:

  1. Pre-implementation experience - working with the local server
    1.1 Project structure and build arrangements
    1.2 Firestore Database and Testing Arrangements
    1.3 Security
    1.4 Cloud Functions and Third-party APIs

  2. Post-implementation experience - SEO problems
    2.1 The npm Helmet library
    2.2 "Mobile first" issues for a site with distinct mobile and desktop flavours
    2.3 Googlebot React rendering limitations

  3. The future
    3.1 The next project - Next.js and server-side rendering?

  4. Post-post post - the final decision (added December 2023)
    4.1 Next.js
    4.2 Svelte
    4.3 Conclusion

1. Pre-implementation Experience

Working up the new code was pure pleasure - I still get a kick when I make a code change in VSCode, refocus on Chrome, and note that Vite has already refreshed the browser's view of the page. Here are some high points:

  • 1.1 Project structure and build arrangements

Although this wasn't my first React project, when I began the development I had yet to develop a consistent style for file/folder/component naming and the like. I spent quite a while experimenting and backtracking over this while working on the first few routes. For what it's worth, my current standards are documented here.

Although I used create-react-app to configure my initial working environment, I switched to Vite part-way through when I found that the React Router tutorial was now using it. I'm still using Vite because it is lightning-fast, and it has never given me any problems.

  • 1.2 Firestore Database experience - testing arrangements

I've used quite a few different database engines in the past, most recently MySQL, and I was itching to try Google's NoSQL Firestore arrangement. The bottom line is that I loved it. Both the setup arrangements in the Firestore console and the syntax of Firestore CRUD functions take a bit of getting used to, but once you're comfortable with these, your coding simply flies. I found it helped a lot to maintain strict consistency over coding style (many variants are available) and now always use templates to get myself started, as sketched below.
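To give a flavour, here's the shape of the template I mean - a minimal sketch using the modular Firestore syntax, with a hypothetical "members" collection (firebaseConfig is the settings object copied from the Firebase console):

    import { initializeApp } from "firebase/app";
    import {
        getFirestore, collection, doc, setDoc,
        getDocs, query, where, orderBy
    } from "firebase/firestore";

    // Settings object copied from the Firebase console
    const firebaseConfig = { /* apiKey, projectId, ... */ };
    const db = getFirestore(initializeApp(firebaseConfig));

    // Create (or overwrite) a document with a known id
    await setDoc(doc(db, "members", "m001"), { name: "A. Member", joined: 2021 });

    // Read a filtered, ordered set of documents
    const snapshot = await getDocs(
        query(collection(db, "members"), where("joined", ">=", 2020), orderBy("joined"))
    );
    snapshot.forEach((d) => console.log(d.id, d.data()));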

However, I think it's likely that I would have been much less pleased with Firestore if I'd been working with thousands of documents rather than the hundreds that are more typical of my site. Only one collection in my system really pushed Firestore: it contains 45,000 documents and took over an hour to load. This collection had also previously benefited from SQL's wild-card search syntax. There is no equivalent in Firestore, and consequently my system now uses an in-core index - a far from ideal arrangement.
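By way of illustration, the in-core index amounts to nothing more than reading the collection once and filtering in memory - a sketch, with a hypothetical "catalogue" collection standing in for my own (db as initialised above):

    import { collection, getDocs } from "firebase/firestore";

    // Build the in-core index once per session: an array of searchable fields
    const snapshot = await getDocs(collection(db, "catalogue"));
    const index = snapshot.docs.map((d) => ({ id: d.id, title: d.data().title }));

    // SQL's  WHERE title LIKE '%term%'  becomes a client-side filter
    function wildcardSearch(term) {
        const t = term.toLowerCase();
        return index.filter((entry) => entry.title.toLowerCase().includes(t));
    }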

One slightly scary thing about Firestore is that commands running in the local server are perfectly happy to access live collections in the Google Cloud. But then, being still in recovery from the testing messes I'd experienced when working with local test MySQL databases, I realised that I could easily turn this to my advantage. My test data is now hosted in the Cloud alongside its live counterpart, and a strictly-applied collection-naming scheme preserves integrity. This arrangement has done wonders for my productivity (and sanity). See Configuring a Firebase/React webapp for life post-implementation if you'd like to read more about this arrangement.
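The naming scheme itself can be as simple as a prefixing helper keyed on a build flag - a sketch, assuming a Vite VITE_TEST_MODE environment variable (the names here are hypothetical):

    import { collection } from "firebase/firestore";

    // In test builds, every collection name gains a prefix, so test data can
    // live safely alongside its live counterpart in the same Cloud project
    const PREFIX = import.meta.env.VITE_TEST_MODE === "true" ? "test_" : "";

    export function appCollection(db, name) {
        return collection(db, PREFIX + name);
    }

    // appCollection(db, "members") resolves to "test_members" in test builds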

  • 1.3 Security

One of the most welcome benefits of moving my system onto the Cloud platform was the opportunity to use Google's firebase/auth API to replace my homemade PHP security arrangement. This saved an enormous amount of time - and is, of course, much more secure.
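For anyone who hasn't met it, the core of the firebase/auth API comes down to just a few calls - a minimal sketch of an email/password sign-in (credentials hypothetical):

    import { getAuth, signInWithEmailAndPassword, onAuthStateChanged } from "firebase/auth";

    const auth = getAuth(); // uses the app initialised for Firestore

    // React to authentication-state changes anywhere in the webapp
    onAuthStateChanged(auth, (user) => {
        console.log(user ? `Signed in as ${user.email}` : "Signed out");
    });

    // Firebase handles password hashing, sessions, and token refresh
    await signInWithEmailAndPassword(auth, "user@example.com", enteredPassword); // enteredPassword from your login form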

  • 1.4 Cloud Functions and Third-party APIs

For one reason or another, I found myself using Cloud Functions quite heavily during the redevelopment. I found these initially intimidating, but once I got used to the logging system (which is where you do most of your debugging) we got along just fine. My only gripe would be the time it takes to register a new version (I know I could have used the emulator, but this is just too much like hard work).

One particular advantage of using a function to code a procedure was that it was then so easy to configure the Cloud Scheduler to run the code automatically.
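For example, with the v2 functions API a scheduled job is a one-liner to declare - a sketch, with a hypothetical nightly housekeeping task:

    const { onSchedule } = require("firebase-functions/v2/scheduler");

    // Cloud Scheduler fires this automatically - nothing to configure
    // beyond the schedule string itself
    exports.nightlyHousekeeping = onSchedule("every day 02:00", async (event) => {
        // ... tidy collections, roll logs, etc.
        console.log("Housekeeping run at", event.scheduleTime);
    });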

Another reason for using functions is that they keep your API security keys safely server-side. The old website had used Paypal, Postmark, and Mailchimp libraries at various points and, when I started redevelopment, I was completely unsure about how I would migrate these bits of the system. It was a huge relief to find that each was supported by a well-documented npm library, and these, in turn, turned out to be really straightforward to use.
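The key-protection pattern looks something like this - a sketch using the postmark npm library and the v2 secrets mechanism (function and secret names hypothetical):

    const { onCall } = require("firebase-functions/v2/https");
    const { defineSecret } = require("firebase-functions/params");
    const postmark = require("postmark");

    // The key lives in Google Secret Manager and never reaches the client
    const POSTMARK_KEY = defineSecret("POSTMARK_SERVER_TOKEN");

    exports.sendReceipt = onCall({ secrets: [POSTMARK_KEY] }, async (request) => {
        const client = new postmark.ServerClient(POSTMARK_KEY.value());
        await client.sendEmail({
            From: "noreply@example.com",
            To: request.data.email,
            Subject: "Your receipt",
            TextBody: "Thank you for your payment.",
        });
    });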

2. Post-implementation Experience

You can't really test for SEO, can you? To be honest, I'd not paid the slightest attention to SEO issues during development. So I got a big shock when, after a flawless implementation, I found my new website was receiving absolutely no Google indexing.

  • 2.1 The npm Helmet library

What I'd forgotten was that Google gets most of its indexing information from a website's <head> content. When you're busy developing a new route for your SPA, it's easy to forget that your system has only one explicitly coded <head> - the one in your initial index.html. Yes, the one that was created when you used create-react-app to initialise your project and that probably still contains React's demo settings! So, when I cheerfully delivered my system's sitemap to the Google Search Console, I was served with a quite shocking list of "duplicate page with no user-selected canonical" complaints. And no indexing! As far as Google was concerned, every page was the same and contained absolutely nothing of any interest.

Once I'd realised what was happening, the next question was how to fix it. After all, how do you create individualised <head> content for a route? Then I found the npm Helmet library and was thereby saved from total SEO disaster. Once installed, this allowed me to drop <head> entries straight into each route's return() code as follows:

    <Helmet>
        <title> ...page title... </title>
        <link rel="canonical" href="... the page's React 'URL' ..." />
        <meta name="description" content="... page description ..." />
    </Helmet>

But, in my particular case, the problem went much deeper than this...

  • 2.2 "Mobile first" issues for a site with distinct mobile and desktop flavours

Desktop users of my website are looking for completely different information from mobile users. For example, only the site's desktop users will want to view the PDF files in its archive, since these are illegible on small-screen devices. Mobile users, on the other hand, will just want a snappy summary rather than the generous, artistic view of facilities served on the expansive "real estate" of a desktop screen.

You can only go so far with responsive CSS, so I have two home pages on my site. But I want these to be accessed by the same root URL. If you're not careful when coding the <Route> paths for these pages, you'll end up with the desktop and mobile crawlers saying they've seen two pages with different content but the same canonical URL - and once again you'll get no indexing.

It's not too difficult to use the pixel screen width of the viewing device to determine which page to reference in main.jsx's root.render() - see the sketch below. In my webapp, the page that renders the mobile home page does so on a route defined as "/" (i.e. the root address for my website), with Helmet settings as follows:

<link rel="canonical" href="... root ..." />

By contrast, the page that renders the desktop version does so on a route defined as "/home" and declares its canonical as:

<link rel="canonical" href="... root/home ..." />
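The width test itself needs nothing elaborate - a sketch of the main.jsx arrangement, assuming React Router's createBrowserRouter and a hypothetical 768px breakpoint (component names invented):

    import ReactDOM from "react-dom/client";
    import { createBrowserRouter, RouterProvider, Navigate } from "react-router-dom";
    import MobileHome from "./routes/MobileHome";    // hypothetical components
    import DesktopHome from "./routes/DesktopHome";

    const isMobile = window.innerWidth < 768; // hypothetical breakpoint

    const router = createBrowserRouter([
        // Mobile users get the root URL; desktop users are bounced to /home
        { path: "/", element: isMobile ? <MobileHome /> : <Navigate to="/home" replace /> },
        { path: "/home", element: <DesktopHome /> },
    ]);

    ReactDOM.createRoot(document.getElementById("root")).render(
        <RouterProvider router={router} />
    );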

With these arrangements in place, basic indexing of my Home page started to appear more or less as expected. But, frustratingly, indexing still wasn't appearing for pages further down the site's page hierarchy.

While I was well aware of Google's "mobile first" policy, the consequence of this for sites like mine hadn't fully dawned on me. Although Google is quite happy for you to have different mobile and desktop versions of your site, it expects them both to contain the same content - see Make sure that content is the same on desktop and mobile, where it says "almost all indexing on your site comes from the mobile site".

The pages that weren't getting indexed were those that were only referenced on the desktop version of my site.

This resulted in a massive re-appraisal of what I actually considered worth indexing. Once I'd finished (having found ways of painlessly inserting references to the missing links into my mobile page), the list of pages in my sitemap was reduced to a mere skeleton and I had added a <meta name="robots" content="noindex" /> to all my desktop routes. This way I could guarantee I'd avoid all risk of further Google complaints about duplicate canonicals. Note that while I could have achieved the same effect by placing disallow instructions in my robots.txt file, I think it's preferable to keep all of these indexing instructions together in the routes themselves. The only entries in my sitemap are thus those pages that I specifically wanted Google to index.

  • 2.3 Googlebot React rendering limitations

But there was still a problem with Google's view of the content of my pages - when I inspected "crawled" pages, some (specifically those that were built dynamically by reference to collection content) were incomplete.

At this point, I needed to draw in a deep breath and remind myself that getting indexing from Google is a privilege not a right. Google's indexing engines chew through massive amounts of information every second and this all comes for free. Truly, the Google search engine is a modern miracle. But React webapps present a particular challenge to Google because they need to be rendered before it can think about indexing them. In other words, React webapps make a huge task even more difficult.

You can get a feel for the process if you use the Search Console to request rendering for a React Router page. If you now inspect the "crawled page" you'll just see the code of your index.html page. Only if you "test the Live URL" and then "view the tested page" will you see the rendered code. Eventually, the system will catch up and the "crawled page" will also display the rendered code - but the additional step inevitably slows and complicates the process.

Regrettably, however, I also came to realise that Google's enthusiasm for rendering React code is not always wholehearted.

A page I particularly wanted Google to render displayed lists of pdf-file links that were built dynamically from Cloud content. It seems that there is a limit to the amount of time that Googlebot's "headless browser" (the engine that renders React pages) is prepared to spend on stuff like this. All I really know is that the headless browser will sometimes balk at rendering the content for more than one long or complex list.

It may be that Google's actions are guided by its assessment of the "importance" or "value" of your site (people talk of a site's "crawl budget"), but it seems fruitless to speculate further - Google will do what Google does. For what it's worth, my particular page contained three lists of links, one generated asynchronously from bucket content and the other two from Firestore collections. Only the bucket list was ever rendered.

You can get a feel for the situation by asking ChatGPT what it knows. The bottom line is: "while Googlebot has improved, it may not always perfectly render and index dynamic JavaScript content."

3. The future

The problem described in 2.3 above is well-known (if not by me!) and the general advice is to find some way of supplying web crawlers with pre-rendered code. One popular approach is to use the commercial services of Prerender.io. To set this up, you open an account with Prerender and ask them to cache copies of the website pages for which you require SEO. Then you find yourself a Node.js host and use it to re-engineer the root address of your site to point at a "proxy server" (Prerender can provide you with sample code for this). The purpose of the proxy is to inspect the source of each request and determine which have come from bots and which from ordinary site users. Prerender's cache is used to serve pre-rendered pages to the bots, while the rest are forwarded to the application's own normal rendering routes. As part of the deal, Prerender also refreshes the cached pre-rendered pages on a regular cycle.
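In outline, the proxy is doing no more than user-agent sniffing - a hand-rolled Express sketch of the idea (not Prerender's official middleware; the cache URL shape and token handling are illustrative):

    const express = require("express");
    const axios = require("axios");

    const BOTS = /googlebot|bingbot|yandex|baiduspider|facebookexternalhit/i;
    const app = express();

    app.get("*", async (req, res) => {
        if (BOTS.test(req.headers["user-agent"] || "")) {
            // Bots receive cached, pre-rendered HTML from Prerender
            const cached = await axios.get(
                "https://service.prerender.io/https://www.example.com" + req.originalUrl,
                { headers: { "X-Prerender-Token": process.env.PRERENDER_TOKEN } }
            );
            res.send(cached.data);
        } else {
            // Ordinary users get the SPA's normal index.html
            res.sendFile("index.html", { root: "dist" });
        }
    });

    app.listen(8080);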

For small-scale projects, use of the service is free. Additionally, you might like to know that Prerender has an excellent support desk.

Just out of interest, I tried various home-grown alternatives, using a Cloud Function as a cheap way of obtaining a Node.js environment in which to retrieve a file from the Prerender.io cache via Axios. I had to use Node.js because the Prerender.io cache doesn't supply CORS headers, which meant that Fetch, my usual standby, launched from client-side JavaScript, failed with CORS errors.

But though this "worked", I now realise that the prerendered code thus generated was still essentially a React page. What Google gets is the webapp's index.html page with a <body> pointing to a script that contains the prerendered code. So, whatever restrictions hamper Google's subsequent indexing of the code still kick in. To avoid this problem, I would have had to use a fully independent proxy server, as advised by Prerender.

Having thought long and hard about this, it seemed to me that the extra complexity simply wasn't justified for my project. As things stand, I have just one dynamic page that defeats Googlebot and, in reality, the requirement for keeping this permanently up to date is so weak that I can handle things by maintaining my own static version manually. This is all far from perfect, but it works.

  • 3.1 The next project - Next.js and server-side rendering?

While I've been labouring over my website conversion, I've been fascinated to note the excitement that's been building over new framework systems like Next.js and Svelte. At one point I found time to run a quick trial and wrote it up at Nextjs and CDN. This was an eye-opener! I was mainly interested in the performance advantages of server-side rendering technologies, but I now realise that these also provide a natural way of fixing React's SEO issues.

I've now reached the point of thinking that if you have a dynamic site you should ideally use SSR from the outset. Planning to resolve SEO problems by deploying a pre-rendering proxy feels simply wrong.

At the same time, however, I don't want to resurrect the problems I'd previously experienced with server-based work. The biggest win on my website conversion project was achieved through React and React Router "state" technologies. I certainly don't want to move to SSR if it involves losing either this or the robust, economical hosting provided by Firebase.

But my understanding is that both Next.js and Svelte offer React-like "state" management on local servers and can both, in principle, be hosted on the Google Cloud. So, my plan is to give each of them a try on my next project.

Fingers crossed then!

4. Post-post post (joke)

Three months of frantic research later, I think I've reached a final decision on where to go next. My preference is for Svelte with Google App Engine hosting. Here's the selection logic:

  • 4.1 Next.js

It took an absolute minimum of effort to create a new project that used Next.js static rendering to build a "static" version of my pdf-file links page at build time. This page accessed the Firebase and Cloud Storage data in the original live project to create a long list of anchor links. After linking my project to a live URL and deploying it to Firebase hosting in the usual way, I was able to use the Search Console to confirm that my new page was correctly indexed by Google.
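The page itself followed the standard static-generation pattern - a sketch assuming the Next.js pages router and the firebase-admin SDK (collection and field names hypothetical):

    // pages/archive.js - rendered once, at build time
    import { initializeApp, getApps, applicationDefault } from "firebase-admin/app";
    import { getFirestore } from "firebase-admin/firestore";

    if (!getApps().length) initializeApp({ credential: applicationDefault() });

    export async function getStaticProps() {
        // Runs in Node.js during "next build", so admin credentials stay safe
        const snapshot = await getFirestore().collection("pdfArchive").get();
        const links = snapshot.docs.map((d) => ({ id: d.id, ...d.data() }));
        return { props: { links } };
    }

    export default function Archive({ links }) {
        return (
            <ul>
                {links.map((l) => (
                    <li key={l.id}><a href={l.url}>{l.title}</a></li>
                ))}
            </ul>
        );
    }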

So, Next.js static rendering would provide a systematic way of replacing the temporary manual arrangement I described earlier. But how would I make this arrangement dynamic, so that changes in the collections that provide the base data automatically feed through into the links list? For this I needed Next.js "dynamic" rendering (SSR). This is where life starts to get tricky for a Firebase user.

What you need now is for the Next.js code that builds the rendered HTML files in the static configuration described above to repeat this trick at run time. But remember that when a static pre-rendered page is generated at build time, Next.js is running in Node.js in a terminal session. It's not easy to see how this might be arranged on Firebase hosting.

In the past you might have taken out an account with a commercial hosting site where a request for, say, a "myPHPprogram.php" URL would have been configured to launch a myPHPprogram.php file hosted on the site using its PHP engine. This, in turn, would have been programmed to return a page of HTML. On the face of it, no such arrangement is available on a Firebase host.

Except, if you're smart, maybe there is. Recall that Firebase functions enable you to run JavaScript in a remote Node.js environment; maybe they could provide a way. Recall also how the "rewrites" section in firebase.json can be set to match the pattern of an incoming URL and redirect it elsewhere (in a conventional React configuration, this is set to redirect everything to the SPA's index.html file). It turns out that those far-sighted chaps at Google have realised that it might be useful if a rewrite could take an incoming URL and make it call a function. So, a firebase.json rewrite rule like:

    "rewrites": [
      {
        "source": "**/my_page",
        "function": "renderMyPageServerSide"
      }
    ]
]

will redirect a URL referencing 'my_page' to a 'renderMyPageServerSide' function. What exactly would this function do, and how might it be coded?

Since, ideally, you want a common function that serves all incoming URLs, the function's first job will be to establish which particular URL it's being asked to render. Then comes the killer part: it needs to launch the same code that runs when you build pre-rendered pages with npm run build. The surprise is that this is available as a next() function call that can be invoked inside the Firebase function, just like any other API element.
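Put together, the function looks something like this - a sketch of the pattern only, assuming the built .next directory is deployed alongside the function code:

    const { onRequest } = require("firebase-functions/v2/https");
    const next = require("next");

    // One Next.js instance is reused across invocations of a warm function
    const nextApp = next({ dev: false, conf: { distDir: ".next" } });
    const handle = nextApp.getRequestHandler();

    exports.renderMyPageServerSide = onRequest(async (req, res) => {
        await nextApp.prepare();   // effectively a no-op after the first call
        return handle(req, res);   // Next.js renders and returns the HTML
    });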

Details of this arrangement can be found at Deploying Your Next.js App to Google Cloud: A Step-by-Step Guide. I followed this excellent post and managed to get the arrangement to work, but I cannot honestly recommend it because:

  1. It's excessively complicated
  2. Testing is difficult
  3. Deployment is slow (since you have to deploy a function)
  4. Google Cloud Function instances are stateless but the Next server isn't: it caches by default, and it's not clear to me that the two arrangements are compatible.
  5. The "cold start" latency of Cloud Function instances is likely to be an issue.

In any case, there are alternatives. The link cited above also mentions the possibility of using Google App Engine to host a Next.js server. App Engine instances host "containerised" applications, so a container that replicates the Node.js environment you use to launch a local Next.js server will be just as happy running in Google App Engine.

But to get to the point where you can try this, you need to learn how to use Docker to create a container - see Next.js app deployed with Docker - does it make sense? for an example.
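For the record, the recipe itself is short - a hedged sketch of a minimal two-stage Dockerfile for a Next.js server (Node version and port illustrative):

    # Build stage: compile the Next.js app
    FROM node:18-alpine AS builder
    WORKDIR /app
    COPY package*.json ./
    RUN npm ci
    COPY . .
    RUN npm run build

    # Run stage: ship only what "npm start" (next start) needs
    FROM node:18-alpine
    WORKDIR /app
    COPY --from=builder /app ./
    EXPOSE 3000
    CMD ["npm", "start"]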

Once again, however, things here are complicated and at this point I decided to take a break from Next.js and have a look at Svelte.

  • 4.2 Svelte

Setting up a Svelte project in VSCode turns out to be just as easy as with Next.js. Svelte is also very similar to Next.js when it comes to defining routes - i.e. you do this by creating route files in your project hierarchy. The details differ, of course, but the general approach is the same. Where Svelte differs significantly, however, is in the way you define "reactivity". Next.js uses the familiar React useState mechanism, but Svelte uses its own idiosyncratic syntax to define the "reactive variables" that would have been properties of state objects in React. It also dumps JSX.
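To illustrate: where React declares state with useState, a Svelte component uses ordinary variables, with the "$:" label marking derived values - a minimal sketch:

    <script>
        // Plain assignment replaces React's setCount(count + 1)
        let count = 0;

        // "$:" marks a reactive statement, re-run whenever count changes
        $: doubled = count * 2;
    </script>

    <button on:click={() => count += 1}>
        Clicked {count} times (doubled: {doubled})
    </button>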

Strangely, after very little practice, this all seemed perfectly natural.

The crunch point came when, after successfully creating a Svelte version of my anchor link page and testing it in the local Svelte server, I considered how I was going to deploy this in the Google Cloud.

After the Next.js experience, I wasn't going to go anywhere near Cloud functions and instead started to consider how I might use Google App Engine.

The first task here, of course, was to build a container. As previously noted, this can be a complex procedure, but for Svelte it turned out to be almost ridiculously simple. Using a Svelte community adapter (the npm library "svelte-adapter-appengine" - massive credit to its developer, HalfdanJ), I found that I could configure my project to produce a container with a single npm run build command!
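For reference, the configuration amounts to swapping the adapter in svelte.config.js - a sketch based on the svelte-adapter-appengine README:

    // svelte.config.js
    import adapter from "svelte-adapter-appengine";

    const config = {
        kit: {
            // "npm run build" now emits an App Engine-ready bundle,
            // including the build/app.yaml used in the deploy command below
            adapter: adapter(),
        },
    };

    export default config;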

The next task was to register a Google App Engine account. Once again, this proved perfectly straightforward, and I found I was then able to deploy my container from my VSCode terminal session with a single command:

gcloud app deploy --project <CLOUD_PROJECT_ID> build/app.yaml

This returned a URL for my deployment (similar to the familiar test URL you get from a Firebase deployment) and, when I entered it into my browser, the project ran! I was, frankly, astounded! I have since confirmed that I can use this technique to obtain Google indexing for a clone of my anchor-links page - i.e. the SSR code it produces is accepted by Googlebot. In the world of engineering, you can always "sense" when a design points you down the correct path. I believe that Svelte passes this test.

  • 4.3 Conclusion

Both Next.js and Svelte SSR are capable of fixing my project's SEO problems but each, in its own way, would make migration a serious challenge.

Relatively few code changes would be required to move my project to Next.js, since this is basically just an extension of React. But hosting on the Google Cloud platform would be an issue, and it's likely that a move to Next.js would also imply a move to Vercel.

In passing, I find it curious that Google hasn't put more effort into support for development frameworks. Sure, there is an "Integrate Next.js" feature that "enables you to deploy your Next.js Web apps to Firebase and serve them with Firebase Hosting", but this is still in beta and you'd have to be desperate to use it.

Conversely, if I were to try to switch to Svelte, while continued Google hosting seems assured, Svelte's radically different approach to reactivity means that I would have to make massive changes to my codebase.

Fortunately, for my particular project, there is no pressing need to take action, since I can get by quite happily with my manual "patch".

But I've enjoyed this bit of research enormously. It's introduced me to a long list of new and interesting technologies, and I've now firmly decided that Svelte will be the platform for my next project.

So, onward and upward!
