Replacing GCP with Railway for faster cold start

TL;DR I switched a Dart API from Cloud Run to Railway for a roughly 3.6x faster cold start (about 53 s down to about 15 s), simplified DevOps, and a straightforward fee structure.

Problem

I'm working on github.com/daohoangson/flutter_widget_from_html, a pub.dev package that's super handy for Flutter developers who want to seamlessly render HTML in their apps.

Now, when it comes to HTML, it can get pretty dynamic, right? That's why having a playground to showcase features, troubleshoot issues, and tackle bugs is crucial. The Google team has this fantastic tool called dartpad.dev, which is just perfect for this kind of thing. However, there's a little catch - third-party packages like mine usually can't be used there (unless you have thousands of likes, as explained on Medium).

So I decided to take matters into my own hands: I forked DartPad and deployed try.fwfh.dev with additional package support.

Cloud Run Deployment

Back in 2021, the code was split into two repositories: frontend and backend. I deployed each of them as a Cloud Run service in Google Cloud Platform. For some unknown reason, I couldn't use domain mapping, so a load balancer was required to serve try.fwfh.dev. There wasn't much traffic, so the compute cost was minimal. Still, the bill came to around $20 per month because of the LB. This is pain point number 1.

Fast forward to June 2023, and the two repositories were merged into one. I noticed that it was now possible to deploy the frontend to Firebase Hosting, which is cheaper and offers better performance. However, the backend remained on Cloud Run and continued to suffer from an incredibly slow cold start. This has been a persistent issue since the beginning, and it is pain point number 2.

I went looking for a better solution and decided to explore Railway. My goals were twofold:

  1. Achieve a faster cold start time
  2. Keep the cost reasonably low, ideally under $10 per month

Railway Experiment

Getting started is a breeze since everything's already containerized. I opted for the most budget-friendly plan, priced at just $5 per month.

Railway's Hobby plan for $5 per month

Initially, I migrated both the frontend and backend, but it became evident that containerized NGINX couldn't outperform static hosting, so the frontend stayed on Firebase and I only moved the backend. The final PR changes just one file.

Metrics

I used time and curl to call the endpoints and get some numbers. Railway was faster in all tests, which was quite a surprise.

              Cloud Run    Railway
Cold start    53,214 ms    14,862 ms
Analyze          830 ms       420 ms
Compile        5,880 ms     5,120 ms
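For readers who want to repeat this kind of measurement without a shell, here is a minimal Python sketch of the same idea as running `time curl <url>`. It is not the script used for the numbers above; the helper names `time_call_ms`, `fetch`, and `measure_endpoint_ms`, as well as the 120-second timeout, are assumptions made for illustration.

```python
import time
import urllib.request
from typing import Callable


def time_call_ms(call: Callable[[], object]) -> float:
    """Return how long a single invocation of `call` takes, in milliseconds."""
    start = time.perf_counter()
    call()
    return (time.perf_counter() - start) * 1000.0


def fetch(url: str, timeout_s: float = 120.0) -> bytes:
    """Issue one GET request and read the whole body, like a plain `curl URL`.

    The generous timeout is an assumption: a cold start can take tens of seconds.
    """
    with urllib.request.urlopen(url, timeout=timeout_s) as response:
        return response.read()


def measure_endpoint_ms(url: str) -> float:
    """Time one request to `url`.

    The first request after the service has gone idle includes the cold start;
    follow-up requests measure the warm response time.
    """
    return time_call_ms(lambda: fetch(url))
```

Calling `measure_endpoint_ms` once against an idle service and then again right away should give a cold figure followed by a warm one. A single sample is noisy, so repeating the run a few times is advisable before drawing conclusions.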

Comparison

With such a large performance gap, I was worried about the cost. So I deployed a separate pair of services, one on each provider, purely to benchmark them against each other.

After a week, the results are in:

              Cloud Run                       Railway
Cold start    46,875 ms                       12,780 ms
Compile       30,107 ms                       25,314 ms
Cost          $0.47 after -$1.21 free tier    $0.65

The cold start numbers come from the UptimeRobot response time reports, and the compile numbers come from the Loader.IO load testing reports.
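To put the week-long numbers in relative terms, the arithmetic can be sketched as below. The function names `speedup` and `time_saved_pct` are made up for this example; the inputs are the cold start and compile figures reported above.

```python
def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """How many times faster the candidate is compared to the baseline."""
    return baseline_ms / candidate_ms


def time_saved_pct(baseline_ms: float, candidate_ms: float) -> float:
    """Share of the baseline time that the candidate saves, in percent."""
    return (baseline_ms - candidate_ms) / baseline_ms * 100.0


# Week-long benchmark figures: (Cloud Run ms, Railway ms).
RESULTS = {
    "Cold start": (46_875, 12_780),
    "Compile": (30_107, 25_314),
}

for name, (cloud_run_ms, railway_ms) in RESULTS.items():
    ratio = speedup(cloud_run_ms, railway_ms)
    saved = time_saved_pct(cloud_run_ms, railway_ms)
    print(f"{name}: {ratio:.2f}x faster, {saved:.1f}% less time")
```

By this calculation the cold start is about 3.67x faster on Railway (roughly 73% less waiting), while compile under load is about 1.19x faster.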

Further cost break down:

                 Cloud Run                         Railway
                 Usage                    Cost     Usage      Cost
CPU core-sec     45,835                   $1.1     4,351      $0.0336
Memory GB-sec    45,823                   $0.11    159,960    $0.6173
Traffic GB       0.41 North America       $0       0.01       $0.0009
                 0.11 Intercontinental    $0.01
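One way to compare the two bills is to derive an effective unit price from each line of the breakdown. The sketch below does only that: the `LineItem` class is a made-up helper, and the resulting prices are back-calculated from rounded bill lines rather than taken from either provider's price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineItem:
    provider: str
    resource: str
    usage: float      # core-seconds or GB-seconds, as billed
    cost_usd: float   # cost shown in the breakdown (rounded by the provider)

    @property
    def unit_price(self) -> float:
        """Effective price per unit of usage, derived from the bill."""
        return self.cost_usd / self.usage


ITEMS = [
    LineItem("Cloud Run", "CPU core-sec", 45_835, 1.10),
    LineItem("Cloud Run", "Memory GB-sec", 45_823, 0.11),
    LineItem("Railway", "CPU core-sec", 4_351, 0.0336),
    LineItem("Railway", "Memory GB-sec", 159_960, 0.6173),
]

for item in ITEMS:
    print(f"{item.provider:<10} {item.resource:<14} ${item.unit_price:.8f} per unit")

# Railway's three line items (CPU, memory, traffic) add up to the reported total.
railway_total = 0.0336 + 0.6173 + 0.0009
print(f"Railway total: ${railway_total:.2f}")
```

Read this way, Railway's effective CPU price per core-second comes out at roughly a third of Cloud Run's, while its memory price per GB-second is about 1.6 times higher. Since Railway billed far fewer CPU seconds and far more memory seconds for the same workload, the unit prices alone don't decide which bill ends up lower.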

Some interesting observations:

  • Railway seems to put the service to sleep after 40 minutes of inactivity
  • Cloud Run categorizes an instance as active while it serves requests. After that the instance goes idle, and it is eventually terminated after approximately 15 minutes
  • I didn't include storage costs because each provider has a different billing model. Railway charges for disk usage per GB per minute, while GCP bills for container registry storage
  • GCP has a "Networking Traffic Egress GCP Replication" SKU, which cost around $0.39 during the trial period. I incur this cost when Cloud Build pushes the Docker image to the registry and again when data is replicated across regions 🤷
  • The providers calculate resource usage differently. Cloud Run's CPU and memory numbers are nearly identical at 45k, whereas Railway's memory usage is significantly higher at 160k, with only 4k in CPU usage. For this particular service, we can potentially tweak Cloud Run to use less CPU and save costs, but with Railway, we don't need to worry since they bill based on actual usage, which is convenient.
  • GCP introduces additional billing items such as build time and request count, among others, whereas Railway only charges for CPU, memory, disk and bandwidth.

Conclusion

Given the comparable costs and Railway's exceptional performance, Railway is the clear choice for now. I'll keep a close eye on this and reconsider if the performance ever takes a dip.

Referral link: https://railway.app?referralCode=daohoangson
