Why We Ditched Firebase, Auth0, Okta, Clerk: A Deep Dive into Our Secure, Scalable Auth System

July 21, 2025


Ever wondered why teams ditch third-party auth giants like Google or Firebase for a homegrown system? It’s not just about control — it’s about tailoring security to real-world chaos. Here’s how we evolved our auth strategy from basic sessions to a robust, scalable JWT hybrid, solving revocation headaches at scale. (Spoiler: It’s inspired by PayPal and SuperTokens, but customized for API-first, mobile/SPA apps.)


Why Build Custom Auth Over Third-Party Services?

Third-party tools shine for simple logins, but fall short on complexity. We opted for our own system to handle:

This gives us the edge in flexibility without compromising basics.


Stateless vs. Stateful: Why JWT?

JWTs are signed, tamper-evident tokens that verify users without constant DB hits — perfect for stateless auth. Here’s a quick JWT breakdown:

How JWT Works (Simple Implementation):

Example Payload:

{ 
    "jti": "uuid-here", 
    "roles": ["admin", "user"], 
    "user_id": "abc123" 
}

On request:
Server verifies signature; no DB needed for validation.

We initially chose stateless JWT over traditional session-based auth for these reasons:


Challenges with Stateless JWTs

Stateless is efficient, but not flawless. We hit these roadblocks:

Our initial single-token setup amplified problems:

Dilemma: Short-lived for security = poor UX; long-lived = risky.


Research and Evolution: From Single Token to Hybrid

We dove into articles on token management (shoutout to SuperTokens’ JWT revocation guide) and dissected PayPal’s granular auth. The breakthrough? A hybrid fusion:

This splits concerns: Access for security, refresh for UX. But it meant DB hits every 15 mins for revocation — inefficient at scale.
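
As a rough illustration of the access/refresh split, the sketch below builds the claims for a token pair. The 15-minute and 7-day lifetimes match the TTLs used in this post; the `type` claim and the function names are assumptions made for the example, and signing is left out to keep it short.

```python
import time
import uuid
from typing import Optional, Tuple

ACCESS_TTL_SECONDS = 15 * 60            # short-lived: limits damage if leaked
REFRESH_TTL_SECONDS = 7 * 24 * 60 * 60  # long-lived: keeps the user signed in


def new_claims(user_id: str, token_type: str, ttl: int, now: int) -> dict:
    """Build claims with a unique jti so each token can be revoked individually."""
    return {
        "jti": str(uuid.uuid4()),
        "user_id": user_id,
        "type": token_type,  # hypothetical claim to tell the two tokens apart
        "iat": now,
        "exp": now + ttl,
    }


def issue_token_pair(user_id: str, now: Optional[int] = None) -> Tuple[dict, dict]:
    """Return (access_claims, refresh_claims) issued at the same instant."""
    issued_at = int(time.time()) if now is None else now
    access = new_claims(user_id, "access", ACCESS_TTL_SECONDS, issued_at)
    refresh = new_claims(user_id, "refresh", REFRESH_TTL_SECONDS, issued_at)
    return access, refresh


def is_expired(claims: dict, now: Optional[int] = None) -> bool:
    """A token is expired once the current time reaches its exp claim."""
    current = int(time.time()) if now is None else now
    return current >= claims["exp"]
```

When the access token expires, the client presents the refresh token to obtain a new pair, which is the point where a revocation check has to happen.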



Optimizing with Redis and UUIDs

To cut DB load, we added Redis for caching:

Storage:

SET session:<user_id>:access_jti <jti> EX 900 # 15 min
SET session:<user_id>:refresh_jti <jti> EX 604800 # 7 days

Validation:

Results: Fast in-memory checks, minimal DB load, and stateful revocation without the downsides of fully stateless tokens.
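
The storage and validation steps can be sketched as follows. To keep the example runnable without a server, `TTLStore` is a small in-memory stand-in for Redis `SET ... EX`, `GET`, and `DEL`; it is not a Redis client, and the helper names are hypothetical. The key layout follows the `session:<user_id>:access_jti` pattern shown above.

```python
import time
from typing import Callable, Dict, Optional, Tuple

ACCESS_TTL = 900       # 15 min
REFRESH_TTL = 604_800  # 7 days


class TTLStore:
    """In-memory stand-in for Redis SET ... EX / GET / DEL (illustration only)."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[str, float]] = {}

    def set(self, key: str, value: str, ex: int) -> None:
        self._data[key] = (value, self._clock() + ex)

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # lazily drop expired keys
            return None
        return value

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


def store_session(store: TTLStore, user_id: str, access_jti: str, refresh_jti: str) -> None:
    """Record the currently valid jti for each token type."""
    store.set(f"session:{user_id}:access_jti", access_jti, ex=ACCESS_TTL)
    store.set(f"session:{user_id}:refresh_jti", refresh_jti, ex=REFRESH_TTL)


def is_access_valid(store: TTLStore, user_id: str, jti: str) -> bool:
    """Valid only if the presented jti matches the stored, unexpired one."""
    stored = store.get(f"session:{user_id}:access_jti")
    return stored is not None and stored == jti


def revoke_session(store: TTLStore, user_id: str) -> None:
    """Deleting the keys makes every outstanding token for the user fail validation."""
    store.delete(f"session:{user_id}:access_jti")
    store.delete(f"session:{user_id}:refresh_jti")
```

With a real Redis client the same three operations map onto its set-with-expiry, get, and delete calls.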


Handling Redis Failures: No Single Point of Breakdown 🛡️

Redis is great, but relying on it alone would make it a single point of failure (SPOF). Our multi-layer backup strategy, inspired by industry pros:

If Redis Fails:
Query Postgres directly — optimized for spikes, like GitHub’s OAuth scaling.
Slack-Style Hybrid: Short TTLs (e.g., 30s) in DB for revocation, avoiding constant calls via async writes.
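
A minimal sketch of the fallback path, assuming the cache and the database are each reachable through a simple lookup function. `cache_get` and `db_get` are hypothetical callables, and the built-in `ConnectionError` stands in for whatever exception a real Redis client raises. Treating a cache miss the same as an outage is also an assumption of this sketch.

```python
from typing import Callable, Optional

Lookup = Callable[[str], Optional[str]]


def get_refresh_jti(user_id: str, cache_get: Lookup, db_get: Lookup) -> Optional[str]:
    """Prefer the cache; fall back to the database if it is down or has no entry."""
    key = f"session:{user_id}:refresh_jti"
    try:
        cached = cache_get(key)
    except ConnectionError:
        cached = None  # cache unreachable: handle it like a miss
    if cached is not None:
        return cached
    # Source of truth: only reached on cache failure or miss.
    return db_get(user_id)
```

The request still succeeds while the cache is unavailable, at the cost of one database query per check.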



What If Queue Is Delayed (e.g. 20 min)?

Problem statement:
If refresh_token_jti is written to Redis and only queued for async DB persistence, a long queue delay (e.g., 20+ min) creates a risk. If Redis evicts the jti before it is written to the DB, and the access token expires in that window, the user can't refresh their session: the jti is in neither store, so the system sees the refresh token as invalid even though it was issued correctly. This causes unexpected 401s due to a race between caching and persistence.

To close this window, we persist a minimal record synchronously and defer only the enrichment:

  1. Write refresh_token_jti to Redis (TTL 7d)
  2. Write basic refresh_token_jti to DB immediately (sync) with minimal info
  3. Queue full enrichment write (device info, IP, etc.) for async processing
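
The three steps above can be sketched as below. This is an illustration under stated assumptions: a plain dict stands in for Redis, an in-memory SQLite database stands in for Postgres, and `queue.Queue` stands in for the job queue. The table and column names are hypothetical.

```python
import queue
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create the stand-in table; device and ip stay NULL until enrichment runs."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE refresh_tokens ("
        "jti TEXT PRIMARY KEY, user_id TEXT NOT NULL, device TEXT, ip TEXT)"
    )
    return db


def issue_refresh(cache: dict, db: sqlite3.Connection, jobs: queue.Queue,
                  user_id: str, jti: str, device: str, ip: str) -> None:
    # 1. Cache the jti (a real Redis call would also set the 7-day TTL).
    cache[f"session:{user_id}:refresh_jti"] = jti
    # 2. Synchronous minimal write: the jti is durable before we return.
    with db:
        db.execute(
            "INSERT INTO refresh_tokens (jti, user_id) VALUES (?, ?)",
            (jti, user_id),
        )
    # 3. Defer the non-critical details to the queue.
    jobs.put({"jti": jti, "device": device, "ip": ip})


def process_enrichment(db: sqlite3.Connection, jobs: queue.Queue) -> int:
    """Drain queued jobs and fill in the extra columns; returns jobs processed."""
    processed = 0
    while True:
        try:
            job = jobs.get_nowait()
        except queue.Empty:
            return processed
        with db:
            db.execute(
                "UPDATE refresh_tokens SET device = ?, ip = ? WHERE jti = ?",
                (job["device"], job["ip"], job["jti"]),
            )
        processed += 1
```

Because step 2 completes before the token is handed out, a delayed queue only postpones the device and IP details; it can no longer make a correctly issued refresh token look invalid.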

Key Wins and Takeaways 📈


Final Thoughts

Whether you’re scaling a new product or revamping an existing one, smart token design can balance security, UX, and performance. This hybrid architecture saved us from many fires and gave us deeper control.

What’s your go-to strategy for JWT revocation at scale? Have you battled similar auth dilemmas? Drop a comment — let’s geek out! 🚀


👥 Team Behind the Build

This system was built by a 5-member founding dev team of final-year CSE students at a startup: