Azure AD, Google Directory, and SCIM: picking a user-sync story for a multi-tenant Laravel app

In late 2024 I spent a few weeks digging into how a multi-tenant Laravel platform I was working on should let tenant administrators pull users in from external identity providers. The customer asks were predictable — “we use Azure,” “we use Google Workspace,” “can you just hook into our directory?” — and the answer turned out to be more interesting than the question. After looking at Azure Active Directory, Google Directory, and the System for Cross-domain Identity Management (SCIM) protocol, we landed on SCIM as the primary path, with the two cloud-directory options reduced to footnotes. 🐳

This post is a tidied-up version of the investigation notes. If you’re picking a user-sync mechanism for a Software-as-a-Service (SaaS) app and the customer is pointing at one of these three things, the trade-offs below might save you a week.

Why not just speak LDAP to Azure AD directly?

Azure Active Directory (Azure AD) is Microsoft’s cloud identity service — it sits behind Office 365, handles sign-in for Microsoft cloud apps, and is what enterprise customers usually mean when they say “our directory.” The instinct, coming from a traditional on-prem world, is to point a Lightweight Directory Access Protocol (LDAP) client at it and start browsing users.

You can’t. Azure AD does not natively speak LDAP. What it offers instead is one of three pictures, depending on the customer’s deployment:

  1. On-prem AD synced to Azure AD via Azure AD Connect. Your server can speak LDAP to the on-prem Active Directory box, the way it always has. Azure AD is just a downstream replica used for cloud sign-in. The authoritative data still lives on-prem.
  2. Pure cloud Azure AD with no LDAP at all. No LDAP endpoint exposed, anywhere. You either talk to it via the Microsoft Graph REST API, or you don’t talk to it.
  3. Azure AD with Azure AD Domain Services (Azure AD DS) enabled. This spins up a separate managed domain in the cloud that does support LDAP. It’s a paid feature, and it’s a new domain rather than a view into the existing one — the customer would have to decide to migrate into it.

For our app to “just work” with an existing LDAP browser, the customer needed to be in case (1) or (3). Plenty of enterprise customers aren’t — they’re cloud-first, with no on-prem AD and no Domain Services subscription. For those, LDAP is simply not an option, and the realistic alternative is Microsoft Graph.

Graph is a fine API, but adopting it as a sync source means real development work: capture tenant ID, client ID, client secret, and the consented permission scopes (User.Read.All, Directory.Read.All); add a Create-Read-Update-Delete (CRUD) interface for those settings; bring in something like composer require microsoft/microsoft-graph; build the sync loop. None of it is exotic, but it’s all Azure-specific code we’d then have to write again for the next vendor.

One other footnote worth knowing: Azure AD’s free tier covers basic Graph reads, but stress-testing 20,000 users will hit throttling quickly. Azure AD Connect is free with any Azure subscription; Azure AD Domain Services is a premium feature with its own line item. The cost picture is benign for development, less benign for serious load testing.

Google Directory: same destination, different road

Google Directory is the directory layer of Google Workspace — same job as Azure AD, different ecosystem. It manages user accounts, groups, and devices for Workspace tenants and handles sign-in to Gmail, Drive, and the rest.

And just like Azure AD’s cloud-only mode, Google Directory does not speak LDAP. There is no LDAP browser story here at all — no equivalent of Azure AD Domain Services that opens an LDAP port. The only programmatic access is the Admin SDK REST APIs. So whatever LDAP-based extension you’ve been using on the Active Directory side (in our case directorytree/ldaprecord-laravel, which is genuinely lovely for AD work) is just dead weight here.

The Google API client for PHP (composer require google/apiclient) covers the API surface, and the call pattern is similar to Microsoft Graph: OAuth2 service-account credentials, scoped permissions, paginated list endpoints. The schema mismatch is a small extra annoyance — fields we care about, like “manager email” or “team lead,” aren’t always populated in a default Workspace setup, and the customer may need to extend their directory schema via the Admin SDK before our sync sees anything useful.

Cost-wise: Google Workspace doesn’t have a long-term free tier, just a 14-day trial. For development, that’s enough to wire things up; for sustained QA, someone has to pay.

So now we have two cloud directories that don’t speak LDAP, each with its own REST API, its own auth model, and its own schema quirks. If we want to support both, we write the integration twice. This is the moment SCIM starts looking obviously better.

SCIM: the protocol that lets the identity provider do the work

SCIM (System for Cross-domain Identity Management) is a standard for provisioning users between systems. The relevant Requests for Comments are RFC 7643 (core schema) and RFC 7644 (protocol). The pitch, in one sentence: your app exposes a small REST API in a fixed shape, and the customer’s identity provider pushes user changes to it.

That inversion of control is the whole point. Instead of our app polling Azure for users, then polling Google for users, then polling Okta for users — three different APIs, three different auth dances, three different schemas — Azure, Google, and Okta all push the same SCIM-shaped requests to the same endpoint on our side. We write the receiver once. The vendors compete to be good SCIM clients; we just have to be a correct SCIM server.

The terminology is worth getting straight, because it’s a bit counter-intuitive:

  • The identity provider (Azure AD, Google, Okta) is the SCIM client — it initiates requests.
  • Our application is the SCIM service provider — it receives them.

“Client” feels like it should be the consumer, but in SCIM the client is the pusher. Just memorise it; it’ll come up.
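
To make the direction of traffic concrete, here is a rough sketch of the kind of body a SCIM client sends on POST /Users, written as a PHP array. The attribute names and schema URNs come from RFC 7643; the values (Alice, the externalId, the department) are made up for illustration, and real identity providers differ in which attributes they populate.

```php
<?php
declare(strict_types=1);

// Illustrative only: roughly what an identity provider (the SCIM client)
// pushes to the service provider when it creates a user.
function exampleScimCreatePayload(): array
{
    return [
        'schemas' => [
            'urn:ietf:params:scim:schemas:core:2.0:User',
            'urn:ietf:params:scim:schemas:extension:enterprise:2.0:User',
        ],
        'externalId' => 'a1b2c3', // hypothetical identifier on the IdP side
        'userName' => 'alicesmith@example.com',
        'name' => ['givenName' => 'Alice', 'familyName' => 'Smith'],
        'emails' => [
            ['value' => 'alicesmith@example.com', 'type' => 'work', 'primary' => true],
        ],
        'active' => true,
        // Enterprise extension attributes are nested under their own URN key.
        'urn:ietf:params:scim:schemas:extension:enterprise:2.0:User' => [
            'department' => 'Operations',
        ],
    ];
}

echo json_encode(exampleScimCreatePayload(), JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), PHP_EOL;
```

The service provider's job is to accept this shape, store what it can, and echo the resource back with its own id.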

After a conversation with a couple of teammates in early December, we settled on SCIM as the path forward, with the Microsoft Graph and Google API integrations parked as “maybe later, if a customer specifically asks.” Below is the shape of what we actually built. 🛠️

What the SCIM receiver needs to expose

For Laravel, the arietimmerman/laravel-scim-server package gives you most of the SCIM endpoint scaffolding for free — base routes, schema discovery, the right error envelopes. Standing it up takes an hour. Making it actually map to your domain takes much longer, because every SCIM server eventually becomes opinionated about what the incoming attributes mean.

The endpoints we needed:

  • /api/scim/v2/ — service root, returns capability metadata.
  • /api/scim/v2/Users — supports HTTP POST (create), GET (read/search), PATCH or PUT (update), DELETE (deprovision). A DELETE doesn’t actually nuke the user; it flips their status to suspended, same as our existing LDAP-based flow did.
  • /api/scim/v2/Schemas — schema discovery. The library generates this from your model definitions.
  • /api/scim/v2/Groups — deliberately left out. Our internal group model doesn’t line up well with SCIM’s, and forcing the mapping would have been more painful than asking customers to manage group membership in-app.
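
The "DELETE suspends rather than deletes" behaviour from the list above can be sketched in a few lines. This is a minimal illustration against a hypothetical in-memory store, not the real controller; the class and function names are invented, and the status codes follow RFC 7644 (204 on success, 404 for an unknown id).

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory stand-in for the tenant's users table.
final class InMemoryUsers
{
    /** @var array<string, array{userName: string, status: string}> */
    private array $rows = [];

    public function add(string $id, string $userName): void
    {
        $this->rows[$id] = ['userName' => $userName, 'status' => 'active'];
    }

    public function find(string $id): ?array
    {
        return $this->rows[$id] ?? null;
    }

    public function suspend(string $id): void
    {
        $this->rows[$id]['status'] = 'suspended';
    }
}

/** Handle DELETE /Users/{id} and return the HTTP status code to send. */
function handleScimDelete(InMemoryUsers $users, string $id): int
{
    if ($users->find($id) === null) {
        return 404;
    }
    $users->suspend($id); // soft delete: the row stays, the account is disabled

    return 204;
}

$users = new InMemoryUsers();
$users->add('42', 'alicesmith@example.com');
echo handleScimDelete($users, '42'), PHP_EOL;
echo $users->find('42')['status'], PHP_EOL;
```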

Because the platform was multi-tenant — Stancl’s tenancy library, one database per tenant — we created a fresh route file routes/tenant_api.php rather than co-mingling SCIM with the central admin API. The controller lives at app/Http/Controllers/Tenant/Api/SCIMUserController.php, with the messier translation logic factored into app/Services/Tenant/ScimService.php. Vanilla shape, but worth being explicit because the SCIM library defaults assume a single-tenant Laravel app.

Auth tokens via Sanctum and Jetstream

The SCIM client side (Azure, Okta, etc.) expects a long-lived bearer token. We already had Jetstream + Sanctum wired up for the app, so the natural move was to let tenant admins mint API tokens from their profile page, with a specific SCIM ability scoped on each token. Sanctum handles the storage, expiry, and revocation; Jetstream handles the management UI; we just had to make sure the tokens lived in the tenant database rather than the central one, and reword the permissions checklist so the SCIM ability was discoverable.

The flow from an admin’s perspective:

  1. Open profile page → create API token → name it “Azure SCIM” or similar → tick the SCIM permission.
  2. Copy the one-time-displayed token.
  3. Paste it into the SCIM client configuration in Azure / Okta / Google, alongside the SCIM base URL (which the in-app guide page displays with a copy button).

The “shown only once” pattern is Sanctum’s default and it’s the right one — but you do have to write a sensible warning into the regeneration flow, because admins will absolutely lose the token and try to regenerate, and you want them to understand that the old token stops working the moment they do.
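
The "shown only once" idea is easier to reason about with a toy version in front of you. The sketch below is not Sanctum's code; it only imitates the pattern (Sanctum does keep a SHA-256 hash of the token rather than the plaintext). The function names and the 'scim' ability string are assumptions for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Mint a token: the plaintext is returned once, only its hash is kept.
 *
 * @param string[] $abilities
 * @return array{plain: string, record: array{hash: string, abilities: string[]}}
 */
function mintToken(array $abilities): array
{
    $plain = bin2hex(random_bytes(20)); // 40 hex characters

    return [
        'plain' => $plain,
        'record' => ['hash' => hash('sha256', $plain), 'abilities' => $abilities],
    ];
}

/** Check a presented bearer token against the stored record and a required ability. */
function tokenAllows(array $record, string $presented, string $ability): bool
{
    $matches = hash_equals($record['hash'], hash('sha256', $presented));

    return $matches && in_array($ability, $record['abilities'], true);
}

$minted = mintToken(['scim']);
var_dump(tokenAllows($minted['record'], $minted['plain'], 'scim'));
var_dump(tokenAllows($minted['record'], 'wrong-token', 'scim'));
```

Because only the hash is stored, regenerating means minting a new record, and the old plaintext can never match again. That is exactly the behaviour the warning in the regeneration flow has to explain.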

Schema: just two tables

The data model is unglamorous. Sanctum brings its own personal_access_tokens table (publish it with php artisan vendor:publish --provider="Laravel\Sanctum\SanctumServiceProvider"), and we added one extra table to capture SCIM-specific overflow.

TABLE personal_access_tokens {
  id              INT        [pk, INCREMENT]
  tokenable_type  VARCHAR
  tokenable_id    INT
  name            VARCHAR
  token           VARCHAR    [UNIQUE]
  abilities       text
  last_used_at    TIMESTAMP
  expires_at      TIMESTAMP
  created_at      TIMESTAMP
  updated_at      TIMESTAMP

  indexes {
    tokenable_type_tokenable_id_index [tokenable_type, tokenable_id]
  }
}

The companion table stores everything that SCIM tells us about a user that we don’t have a first-class column for. Phone numbers, employee numbers, alternate addresses, locale — none of that is in our core users schema, but the SCIM contract is that we acknowledge it and return it on a subsequent GET. So we stash it as JSON.

TABLE scim_users {
  id           BIGINT     [pk, INCREMENT]
  user_id      BIGINT     [NOT NULL]
  extra_fields json       [DEFAULT NULL]
  created_at   TIMESTAMP  [DEFAULT NULL]
  updated_at   TIMESTAMP  [DEFAULT NULL]

  indexes {
    user_id_index [user_id]
  }

  foreign_keys {
    user_id [REFERENCES users(id), ON DELETE cascade]
  }
}

Two tables, one foreign key, no surprises. Most of the complexity is in the controller layer, not the schema. 💡
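
As a rough sketch of how the overflow column gets filled, the function below splits an incoming SCIM user into first-class columns and a JSON blob for everything else. The column names (email, first_name, last_name) are assumptions about the users table rather than the app's real schema, and the payload is assumed to be well formed.

```php
<?php
declare(strict_types=1);

/**
 * Split an incoming SCIM user into core columns and JSON overflow.
 *
 * @return array{core: array<string, mixed>, extra_fields: string}
 */
function splitScimUser(array $scim): array
{
    $core = [
        'email' => $scim['userName'] ?? null,
        'first_name' => $scim['name']['givenName'] ?? null,
        'last_name' => $scim['name']['familyName'] ?? null,
    ];

    // Everything not mapped above is kept verbatim so a later GET can return it.
    $extra = $scim;
    unset(
        $extra['schemas'],
        $extra['userName'],
        $extra['name']['givenName'],
        $extra['name']['familyName']
    );
    if (isset($extra['name']) && $extra['name'] === []) {
        unset($extra['name']);
    }

    return ['core' => $core, 'extra_fields' => json_encode($extra, JSON_THROW_ON_ERROR)];
}

$result = splitScimUser([
    'schemas' => ['urn:ietf:params:scim:schemas:core:2.0:User'],
    'userName' => 'alicesmith@example.com',
    'name' => ['givenName' => 'Alice', 'familyName' => 'Smith'],
    'phoneNumbers' => [['value' => '555-0100', 'type' => 'work']],
    'locale' => 'en-CA',
]);
echo $result['extra_fields'], PHP_EOL;
```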

The “no vendor actually batches” surprise

RFC 7644 defines a /Bulk endpoint for batch operations. I built scaffolding for it on day one, assumed it would be the hot path for the initial onboarding sync, and started planning a queued-job pipeline to handle the load.

Then I actually watched Azure AD push 50 users.

It sent 50 individual POSTs to /Users, each followed by a GET to confirm the new resource existed. No /Bulk call. Okta does the same. Google Workspace does the same. None of the major SCIM clients actually use the bulk endpoint, despite it being in the spec, because their internal architectures are already issuing one provisioning event per user and there’s no operational benefit to coalescing them. So we ripped out the batch scaffolding and the queued-job pipeline, and treated user creation as a straightforward sequential operation that returns the SCIM-shaped “user created” envelope synchronously. Much simpler. 🎉
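
For a sense of what "sequential and synchronous" means in practice, here is a minimal sketch of a create handler that returns either the 201 resource or a SCIM error envelope. The in-memory store, function names, and id scheme are hypothetical; the envelope fields and scimType values follow RFC 7644.

```php
<?php
declare(strict_types=1);

/** Build the SCIM error envelope defined in RFC 7644. */
function scimError(int $status, string $detail, string $scimType): array
{
    return [
        'schemas' => ['urn:ietf:params:scim:api:messages:2.0:Error'],
        'status' => (string) $status, // the RFC encodes the status as a string
        'scimType' => $scimType,
        'detail' => $detail,
    ];
}

/**
 * Handle POST /Users against a hypothetical in-memory store keyed by userName.
 *
 * @param array<string, array> $store
 * @return array{status: int, body: array}
 */
function handleScimCreate(array &$store, array $payload, string $baseUrl): array
{
    $userName = $payload['userName'] ?? '';
    if ($userName === '') {
        return ['status' => 400, 'body' => scimError(400, 'userName is required', 'invalidValue')];
    }
    if (isset($store[$userName])) {
        return ['status' => 409, 'body' => scimError(409, 'userName already exists', 'uniqueness')];
    }

    $id = (string) (count($store) + 1);
    $store[$userName] = ['id' => $id] + $payload;

    return ['status' => 201, 'body' => [
        'schemas' => ['urn:ietf:params:scim:schemas:core:2.0:User'],
        'id' => $id,
        'userName' => $userName,
        'active' => $payload['active'] ?? true,
        'meta' => ['resourceType' => 'User', 'location' => $baseUrl . '/Users/' . $id],
    ]];
}

$store = [];
$response = handleScimCreate($store, ['userName' => 'alicesmith@example.com'], 'https://app.example.com/api/scim/v2');
echo $response['status'], ' ', $response['body']['meta']['location'], PHP_EOL;
```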

Related simplification: we don’t need to keep our own sync logs. The customer’s SCIM client (Azure, Okta, Workspace) keeps a detailed provisioning log on its side, including every error response we return. Building a duplicate log on our side would have been busywork the customer would never look at.

The dev-tunneling problem

One genuinely annoying problem: how does Azure reach a SCIM server running on a developer’s laptop?

For an external-facing customer-installed app this isn’t an issue — Azure hits a public URL. For local development, you need a tunnel. A teammate suggested Laravel Expose, which is a nice piece of software in principle: it gives your local app a public HTTPS URL via a relay server, exactly what we needed. On their laptop it worked perfectly. On mine, the Vite-served UI elements rendered partially, page loads were broken, and SCIM requests kept timing out for reasons I never fully diagnosed. We worked around it for a while by sharing their laptop as the integration-test environment, but it’s the kind of friction you want to remove before more developers join the team. Cloudflare Tunnel and ngrok are the obvious alternatives if Expose doesn’t behave.

For staging, the equivalent question is “how does Azure reach our staging server?” In our case staging was behind the company training Virtual Private Network (VPN), which Azure obviously can’t see. The answer turned out to be straightforward — once the staging server got a real public DNS name and a public-facing route, Azure could talk to it like any other SCIM endpoint. The VPN was incidental to the staging architecture, not load-bearing.

The IP-restriction question

Worth mentioning because it’ll come up the moment a security-conscious customer reviews your SCIM setup: can we restrict the SCIM endpoint to specific source Internet Protocol (IP) addresses?

SCIM doesn’t require it, and arguably doesn’t want it — the protocol assumes the customer is the one initiating requests, and the bearer token is the security boundary. Azure AD’s outbound IPs also change over time, which makes any allowlist brittle, yet some customers will still ask for one out of habit. Our position was: not yet, ask us again when a customer makes it a deal-breaker. If we do implement it, it goes at the reverse proxy / firewall layer, not inside the app — keeping the controller agnostic about source IPs is the right call.

What I’d tell past-me

Three things, condensed:

  1. If the customer has any plurality of identity providers, SCIM is the answer. Writing one Azure Graph integration is fine. Writing Azure + Google + Okta is a maintenance tax that compounds. SCIM lets each vendor be responsible for translating its directory into a common shape on the wire.
  2. Build the receiver, skip the bulk endpoint, skip your own sync logs. Vendors push one user at a time and keep their own logs. Anything past that is over-engineering.
  3. The dev-environment story is half the work. Local tunneling, staging public reachability, token rotation flows, and “show this token once” UI are the parts that don’t appear in the protocol spec but absolutely show up in onboarding friction. Plan for them up front.

For a tenant-aware Laravel app, the actual code footprint of SCIM is small: a route file, a controller, a service, two tables, a UI page or two, and a Sanctum ability. The conceptual footprint — getting comfortable with “the customer’s directory is in charge, we just listen” — is the bigger shift, and the one that pays off across every future identity-provider integration. 🔐


Free Azure AD SCIM provisioning to a Laravel app on your laptop, via home router + dynamic DNS

In the last post I sketched why SCIM (System for Cross-domain Identity Management) won out over direct Azure Active Directory (Azure AD) and Google Directory integrations for a multi-tenant Laravel app I was working on. This one is the hands-on follow-up: how to actually get Azure pushing user-provisioning events to a Laravel application running on your laptop, for free. 🐳

The shape of the setup: open a port on your home router, point a free dynamic-Domain-Name-System (dynamic-DNS) hostname at it, and run Laravel Sail behind that hostname. It mirrors the production push flow closely enough to be a useful test rig, gives you a real Azure portal to point at when you’re tuning attribute mappings, and costs nothing beyond the hour or so of one-time setup.

Why not just use Laravel Expose?

Laravel Expose is the obvious answer — it relays public HTTPS requests to a process on your machine, with a friendly Laravel-shaped Command-Line Interface (CLI). On the free tier it works fine for a single-host app. The wrinkle for us was multi-tenancy: the app routes by subdomain, and every tenant lives at tenantname.app.example.com. To exercise that locally, you need a tunnel that gives you a wildcard subdomain, not just a single hostname.

Expose’s wildcard-subdomain feature is paywalled — $60 USD per year plus tax, which lands at roughly $96 once it clears the border. That’s not a lot of money, but it’s enough to think twice when you remember you’ve already got a static-ish home Internet Protocol (IP) address and a router with a port-forwarding screen. So I bypassed it.

Step 1: poke a hole through your home router

Most consumer routers have a “port forwarding” or “service hosting” section. Pick a public port on the Wide-Area-Network (WAN) side and point it at your laptop’s Local-Area-Network (LAN) IP on the port your Laravel Sail container listens on (port 80 by default inside the Sail web container, exposed however your docker-compose.yml maps it — often 80 or 8000 on the host).

One Internet Service Provider (ISP)-specific gotcha that bit me, and might bite you: some ISPs silently block inbound port 80 to residential connections. They don’t explicitly tell you this on the router admin page; the port-forward rule will save happily and then silently drop every connection. The fix is to forward a non-80 port like 8080 instead — most ISPs allow that, and Azure doesn’t care whether your SCIM endpoint lives at port 80 or 8080, since the Tenant Uniform Resource Locator (URL) field lets you specify the port explicitly. If your forward “works” on the router but a phone on cellular data can’t reach it, suspect a residential port-80 block.

Test from outside your network — phone with WiFi off is the easy version:

# from your phone or any external machine
curl -v http://<your-public-ip>:8080/

If you see your app’s HTML, the tunnel is open. If you get connection-refused or a timeout, it’s either the ISP, the router firewall, the OS firewall, or the wrong LAN IP — work through those in order.

Step 2: a hostname that isn’t your raw IP

Azure will happily accept an IP-address Tenant URL, but the multi-tenant subdomain routing in the app needs a real Domain Name System (DNS) name. no-ip.com gives you a free dynamic-DNS account: you sign up, pick a hostname from one of their domains (mine ended up on redirectme.net, but they have several to choose from), and either run a tiny daemon on your machine or update the IP via their web form whenever your home IP changes.

So now you have something like myapp.redirectme.net pointing at your home IP, and port 8080 on that IP forwarding to your laptop. Putting http://myapp.redirectme.net:8080/scim/v2 into the Tenant URL field on Azure works the same way it would if you were running on a cloud server. The dev-vs-prod difference is mostly: less uptime, and you have to remember to keep the laptop awake during a provisioning test. 💡

For the multi-tenant subdomain wrinkle, you also need per-tenant hostnames. The pragmatic shortcut is to set up one tenant whose domain matches your no-ip hostname exactly. So if your no-ip hostname is myapp.redirectme.net, you want a tenant in your database keyed to that domain. Two small code changes accomplish this.

Step 3: point your Laravel app at the public hostname

Two files need to know about the new hostname. First, .env:

APP_URL=http://myapp.redirectme.net

Second, your tenant seeder. If you’re using Stancl’s tenancy library (we were), there’s typically a manual-test seeder somewhere like database/seeders/Tenant/ManualTestSeeder.php that creates your fixture tenant and assigns it a domain. Change that domain to match the no-ip hostname:

<?php

namespace Database\Seeders\Tenant;

use App\Models\Tenant; // assumes the Tenant model lives in App\Models
use Illuminate\Database\Seeder;

class ManualTestSeeder extends Seeder
{
    private $testTenantId = 'test_tenant_one';
    private $testTenantDomain = 'myapp.redirectme.net'; // must match the no-ip hostname
    private $testTenantAdminEmail = 'test_user1@localhost.com';

    public function run()
    {
        // Create the fixture tenant, then attach the public hostname to it.
        $tenant = Tenant::create([
            'id' => $this->testTenantId,
        ]);
        $tenant->domains()->create(['domain' => $this->testTenantDomain]);
        // ... seed an admin user, an API token, etc.
    }
}

Re-seed, then sanity-check that the app loads at http://myapp.redirectme.net:8080/ from a browser on a network that isn’t yours. If you see the tenant’s landing page rather than the central-app landing page, the subdomain routing is doing the right thing.
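
Conceptually, the domain lookup that decides "tenant page or central page" boils down to something like the sketch below. This is not the tenancy library's implementation, only an illustration of the idea; the function name and the lookup array are hypothetical, and it ignores IPv6 literals.

```php
<?php
declare(strict_types=1);

/**
 * Map a Host header to a tenant id, ignoring the port and letter case.
 *
 * @param array<string, string> $domains domain => tenant id
 */
function resolveTenantId(string $hostHeader, array $domains): ?string
{
    // "myapp.redirectme.net:8080" becomes "myapp.redirectme.net"
    $host = strtolower(explode(':', $hostHeader, 2)[0]);

    return $domains[$host] ?? null;
}

$domains = ['myapp.redirectme.net' => 'test_tenant_one'];
var_dump(resolveTenantId('myapp.redirectme.net:8080', $domains));
var_dump(resolveTenantId('something-else.example.com', $domains));
```

A null result is the case where the request falls through to the central app, which is why the seeded domain has to match the dynamic-DNS hostname exactly.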

Step 4: the Azure portal walkthrough

Now to the Azure side. Sign in at portal.azure.com. A work account on Office 365 works; a personal Microsoft account (Hotmail / Outlook / Live) also works, since Azure gives you a free tenant attached to your consumer identity. Either way, you land on the Azure home page with a row of service tiles. The walkthrough is nine steps.

  1. Click “Enterprise applications”. This is the catalog of apps that have been onboarded into your Azure AD tenant. You’re going to add a new one that represents the Laravel app.
  2. “+ New application” → “Create your own application”. A side panel slides in asking what you’re integrating. Pick “Integrate any other application you don’t find in the gallery (Non-gallery)”. The gallery is for vendors who pre-registered their Enterprise application templates with Microsoft; yours obviously isn’t there.
  3. Name your app. Whatever you like. I used scimtesterapp3 because I’d already created and torn down two others while figuring this out. The name is just a label inside the Azure tenant.
  4. Go back to the Azure home page → click “Users”. This is the directory of users in your Azure AD tenant. Out of the box, you have one user — yourself — and you’ll need at least one more to provision through SCIM. Self-provisioning the owner is a special case and won’t exercise the create-user path.
  5. “+ New user” → “Create new user”. Fill in the Basics tab (user principal name, display name, mail nickname, a password — none of these will ever be used to actually sign in, so don’t sweat the password complexity). Then the Properties tab: email, first name, last name, job title, company name, department, manager. The properties matter because they’re what Azure will hand to your SCIM endpoint when you provision. I usually create someone called Alice Smith (alicesmith@example.com), give her a job title like “Secretary,” a department, and a manager — enough fields populated that the SCIM payload looks realistic.
  6. Confirm the user shows up in the Users list. Two entries now: you, and Alice.
  7. Back to Enterprise applications → your app → “Users and groups” → “+ Add user/group”. Pick Alice from the picker and assign her to the application. This is the bit that says, in Azure-speak, “Alice is in scope for this app’s provisioning.” Without this assignment, even a perfectly-configured SCIM connection won’t push her anywhere.
  8. Open the “Provisioning” blade → “Get started”. Set Provisioning Mode to Automatic. Under Admin Credentials, fill in:
    • Tenant URL: your full public SCIM endpoint, including the path. For our app that’s http://myapp.redirectme.net:8080/scim/v2.
    • Secret Token: an Application Programming Interface (API) token you generated in the Laravel app’s profile-page token UI, with the SCIM ability checked. (See the prior post for how that’s wired up via Sanctum.)

    Hit “Test Connection.” If everything’s right, Azure will get a 200 from your SCIM service-discovery endpoint and confirm that the supplied credentials are authorised. Don’t forget to click Save — Azure’s screen design here is genuinely sneaky, and the Save button is easy to overlook above the form. Without Save, your next step won’t know about the credentials.

  9. “Provision on demand” → pick Alice → run it. Azure walks through a four-step pipeline: import the user from the source, evaluate scoping rules, look up whether the target already has her, then perform the action (create, in this case). If all four steps come back green, your SCIM endpoint just got a real POST /Users from a real identity provider, and Alice now exists in the Laravel app’s tenant users table. 🎉

Once “Provision on demand” works end-to-end, you can set the Provisioning Status to On and Azure will start running provisioning cycles every 40 minutes or so — but for development the on-demand button is the one you’ll live in, because it gives you a per-step success/failure breakdown and lets you iterate on attribute mappings without waiting for the next cycle.
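
The "look up whether the target already has her" step typically reaches the receiver as a GET /Users request with a filter query along the lines of userName eq "alicesmith@example.com". A receiver therefore needs at least a minimal filter parser and a well-formed empty result. The sketch below handles only that simplest eq form; the function names are hypothetical, and the ListResponse fields follow RFC 7644.

```php
<?php
declare(strict_types=1);

/**
 * Parse the simplest SCIM filter form: <attribute> eq "<value>".
 * Returns [attribute, value], or null for anything more complex.
 */
function parseEqFilter(string $filter): ?array
{
    $pattern = '/^\s*([A-Za-z][\w.:]*)\s+eq\s+"([^"]*)"\s*$/i';
    if (preg_match($pattern, $filter, $m) !== 1) {
        return null;
    }

    return [$m[1], $m[2]];
}

/** The body to return when the lookup finds nobody. */
function emptyListResponse(): array
{
    return [
        'schemas' => ['urn:ietf:params:scim:api:messages:2.0:ListResponse'],
        'totalResults' => 0,
        'startIndex' => 1,
        'itemsPerPage' => 0,
        'Resources' => [],
    ];
}

print_r(parseEqFilter('userName eq "alicesmith@example.com"'));
echo json_encode(emptyListResponse()), PHP_EOL;
```

An empty ListResponse is what tells the identity provider to go ahead with the create.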

Quirks worth knowing about

A few things that aren’t bugs exactly, but are worth bracing for:

  • Your dev environment vanishes when your laptop sleeps. Port 8080 stops answering, Azure’s next “Provision on demand” attempt fails with a connection error, and you’ll spend thirty seconds confused about what changed. Nothing changed; you closed the lid. I came to see this as a feature, not a bug — it’s a default-deny posture for the rest of the internet when I’m not actively testing.
  • Your home IP changes occasionally. Most consumer ISPs hand out “dynamic” IPs that in practice change once every few months. no-ip’s daemon handles this transparently; if you go the manual-update route, expect to update the IP a couple of times a year when SCIM mysteriously stops connecting.
  • HTTP, not HTTPS. The setup above uses plain HTTP because port-forwarding to a local TLS-terminating container is fiddly. Azure will accept this for SCIM — it complains in the UI but allows it — and for a local dev box the tradeoff is reasonable, because the only data flowing through is test users you made up. For staging or anything resembling production, terminate Transport Layer Security (TLS) at a proper public host. Don’t be the person who ships an HTTP SCIM endpoint with real customer data on it.
  • Azure caches things. When you change attribute mappings in the Provisioning blade, sometimes the next “Provision on demand” run uses the old mapping. Wait a minute, retry, and don’t go diving into your code looking for the bug straight away.

Why this is worth doing

The point of all this is to get a real Azure portal pushing to your code. SCIM has enough vendor-specific edge cases — Azure’s quirks around externalId, the enterprise-extension schema for department and manager, the slight differences between how Azure and Okta encode multi-value attributes — that you really do want to be testing against the actual identity provider, not a stub. Once the tunnel-and-dynamic-DNS scaffolding is in place, the iteration loop is fast: tweak controller code, re-run “Provision on demand,” watch the Azure log and your Laravel log side by side. 🔐
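
One flavour of those edge cases, as a hedged sketch: some clients reportedly send PATCH operation names capitalised ("Replace" rather than "replace") and booleans as strings ("False"), so a tolerant receiver normalises both before applying the change. The function below handles only flat, path-based operations and is an illustration, not the app's real PATCH handling.

```php
<?php
declare(strict_types=1);

/**
 * Apply simple path-based PATCH operations to a flat user array.
 * Operation names are lower-cased first, and string booleans on the
 * "active" attribute are coerced to real booleans.
 */
function applyScimPatch(array $user, array $operations): array
{
    foreach ($operations as $op) {
        $name = strtolower((string) ($op['op'] ?? ''));
        $path = $op['path'] ?? null;
        if (!is_string($path)) {
            continue; // path-less operations are out of scope for this sketch
        }

        if ($name === 'add' || $name === 'replace') {
            $value = $op['value'] ?? null;
            if ($path === 'active' && is_string($value)) {
                $value = strtolower($value) === 'true';
            }
            $user[$path] = $value;
        } elseif ($name === 'remove') {
            unset($user[$path]);
        }
    }

    return $user;
}

$patched = applyScimPatch(
    ['active' => true, 'title' => 'Secretary'],
    [['op' => 'Replace', 'path' => 'active', 'value' => 'False']]
);
var_dump($patched['active']);
```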

An hour of one-time setup, no recurring cost, and you’ve got a SCIM integration test rig sitting on your dev machine. The next post in this thread will be the equivalent walkthrough for Okta, which differs from Azure in some interesting ways — but the home-network side of the setup stays the same.


Laravel Jobs, Queues, Batches, and Redis: A Field Guide

Laravel’s queue system is one of those features you can use for years without really understanding what’s happening underneath. You call SomeJob::dispatch(), a worker somewhere picks it up, and life goes on. But the moment a job mysteriously runs twice, or your failed_jobs table fills up overnight, or Redis OOMs because of a job backlog you forgot about, you suddenly need to understand the moving parts. This is the field guide I wish I’d had. 🐘

What a “Job” Actually Is

A Laravel Job is a plain PHP class that represents a unit of background work — sending an email, generating a Portable Document Format (PDF) report, syncing a record to an external Application Programming Interface (API). You hand it to the queue, and a separate worker process picks it up and runs it later.

The cast of characters:

  • Job class — your code. A class that uses the Dispatchable trait and implements a handle() method.
  • Queue connection — where jobs are stored. Configured in config/queue.php. Common drivers: sync (run immediately, no queue), database, redis, sqs, beanstalkd.
  • Worker — a long-running PHP process started by php artisan queue:work. It pulls jobs off the queue and runs them.
  • Coordinator (optional) — Redis, when it is also your cache store, holds the locks behind unique jobs and overlap-prevention middleware. Batch metadata stays in the job_batches database table.

The Database Tables Laravel Uses

Even if you end up running on Redis, the database tables tell you what Laravel is conceptually tracking. The three you’ll see most often:

  • jobs — the pending queue. Used only when the queue driver is database. Each row is one serialized job payload waiting to be picked up.
  • failed_jobs — the graveyard. Used regardless of driver. When a job throws and exhausts its retry attempts, it lands here with its exception trace.
  • job_batches — batch metadata. One row per batch, tracking total_jobs, pending_jobs, failed_jobs, cancelled_at, and the serialized then / catch / finally callbacks.

You also indirectly touch the cache / cache_locks tables when you use job middleware like WithoutOverlapping or the ShouldBeUnique contract — but only if your cache driver is database. With Redis, those locks live in Redis instead, which is much faster under contention.

Create the tables with the built-in artisan generators:

php artisan queue:table
php artisan queue:failed-table
php artisan queue:batches-table
php artisan migrate

Each command publishes a migration; migrate applies them. ✨

Dispatching a Job

A minimal job class looks like this:

<?php

namespace App\Jobs;

use App\Models\Invoice;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;

class SendInvoiceEmail implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public int $tries = 3;
    public int $timeout = 60;

    public function __construct(public Invoice $invoice) {}

    public function handle(): void
    {
        // ...send the email
    }
}

Dispatching is one line:

SendInvoiceEmail::dispatch($invoice);

// or pin it to a specific queue / connection / delay
SendInvoiceEmail::dispatch($invoice)
    ->onConnection('redis')
    ->onQueue('high')
    ->delay(now()->addMinutes(5))
    ->afterCommit();

That afterCommit() at the end is one of the most important methods in this whole post. We’ll come back to it in the gotchas section. 💡

Batching With Bus::batch

Sometimes you have a thousand things to do and you want to know when they’re all done. That’s what Bus::batch is for:

use Illuminate\Bus\Batch;
use Illuminate\Support\Facades\Bus;
use Throwable;

$batch = Bus::batch(
    $invoices->map(fn ($invoice) => new SendInvoiceEmail($invoice))
)
    ->name('Send monthly invoices')
    ->allowFailures()
    ->onQueue('emails')
    ->then(function (Batch $batch) {
        // All jobs completed successfully
    })
    ->catch(function (Batch $batch, Throwable $e) {
        // First failure observed
    })
    ->finally(function (Batch $batch) {
        // Batch finished, success or not
    })
    ->dispatch();

return $batch->id;   // a UUID you can use to poll status later

Behind the scenes, Laravel inserts a row into job_batches with total_jobs = 1000, pending_jobs = 1000, and decrements pending_jobs as each child job completes. The then / catch / finally closures are serialized into the row and fired when the appropriate transition happens.

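The allowFailures() call is important: without it, the first job that fails marks the whole batch as cancelled, and jobs that check $this->batch()->cancelled() stop doing work. With it, every job runs and you can inspect the $batch->failedJobs count at the end. One prerequisite the minimal job class above leaves out: every job you put in a batch must also use the Illuminate\Bus\Batchable trait.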

Worker / Queue Command Lines

The headline command is queue:work. There’s also queue:listen, which restarts the framework on every job — useful only in local development if you want code changes to apply without restarting. In production, you always use queue:work under a process supervisor.

php artisan queue:work redis \
    --queue=high,default,low \
    --tries=3 \
    --backoff=5,15,60 \
    --timeout=60 \
    --memory=256 \
    --max-jobs=1000 \
    --max-time=3600 \
    --sleep=3

What each flag does:

  • --queue=high,default,low: pull from these queues in priority order. high drains first.
  • --tries=3: attempt each job up to three times (the first run plus two retries) before sending it to failed_jobs.
  • --backoff=5,15,60: wait 5s before the first retry, 15s before the second, 60s before the third. With --tries=3 there are only two retries, so the 60 only comes into play if you raise --tries.
  • --timeout=60: kill the worker process if a job runs longer than 60 seconds; the process supervisor then starts a fresh worker.
  • --memory=256: restart the worker if memory usage exceeds 256MB. Cheap insurance against leaks.
  • --max-jobs=1000: restart the worker after processing 1000 jobs. Also cheap insurance.
  • --max-time=3600: restart the worker after running for an hour.
  • --sleep=3: when the queue is empty, sleep 3 seconds before polling again. Matters most for polling drivers such as database; the Redis driver can block on a pop instead when block_for is set on the connection.
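To make the interaction between --tries and --backoff concrete, here is a small standalone sketch. The backoffFor() helper is hypothetical, not part of Laravel, and the "reuse the last value when the schedule runs out" rule is an assumption made for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Return the delay in seconds before retrying after the given attempt.
 * $backoff is a comma-separated schedule such as "5,15,60". When the
 * schedule is shorter than the attempt number, the last value is reused
 * (an assumption for this sketch).
 */
function backoffFor(string $backoff, int $attempt): int
{
    $schedule = array_map('intval', explode(',', $backoff));

    return $schedule[$attempt - 1] ?? $schedule[count($schedule) - 1];
}

// With three tries there are only two retries, so only two delays apply.
$tries = 3;
$delays = [];
for ($attempt = 1; $attempt < $tries; $attempt++) {
    $delays[$attempt] = backoffFor('5,15,60', $attempt);
}

print_r($delays);
```

Raising $tries to 4 in this sketch is what brings the 60-second delay into play.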

The maintenance commands you’ll reach for:

php artisan queue:failed                # list failed jobs
php artisan queue:retry all              # retry every failed job
php artisan queue:retry 5                # retry job with id 5
php artisan queue:forget 5               # delete a failed job
php artisan queue:flush                  # delete ALL failed jobs (careful)
php artisan queue:prune-failed --hours=48   # delete failed jobs older than 48h
php artisan queue:prune-batches --hours=48  # delete finished batches older than 48h
php artisan queue:clear redis high       # nuke all pending jobs on a queue
php artisan queue:restart                # signal all workers to gracefully restart

In production, you almost always run workers under Supervisor. A typical config looks like this:

[program:laravel-worker]
process_name=%(program_name)s_%(process_num)02d
command=php /var/www/app/artisan queue:work redis --queue=high,default --tries=3 --max-time=3600
autostart=true
autorestart=true
stopasgroup=true
killasgroup=true
user=www-data
numprocs=4
redirect_stderr=true
stdout_logfile=/var/log/laravel/worker.log
stopwaitsecs=70

Note stopwaitsecs=70: it should be greater than --timeout so Supervisor gives an in-flight job time to finish before sending SIGKILL.

Redis in the Mix

The database driver is fine for small applications, but it has two real costs: every poll is a SQL query (so --sleep matters), and lock contention on the jobs table grows nonlinearly with worker count. Switch to redis when you have more than a few workers or you need sub-second latency between dispatch and execution.

Behind the scenes, the Redis driver uses a handful of keys per queue. For a queue called default:

  • queues:default: the list of jobs waiting to be processed.
  • queues:default:delayed: a sorted set of jobs scheduled for the future. The score is the run-after timestamp.
  • queues:default:reserved: a sorted set of jobs currently being processed. The score is the lease expiration. If a worker dies, the lease expires and the job is re-queued.
  • queues:default:notify: a list used for blocking pops. When block_for is set on the connection, a worker does a BLPOP on this key and wakes up the moment a job is dispatched, which is what makes Redis-backed queues feel instant.

The atomic claim-a-job operation is implemented as a Lua script, so two workers can never reserve the same job. This is the part you really, really don’t want to reinvent yourself. 🛡️
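The lease idea behind the reserved set is easier to see in a toy model. The sketch below is plain in-memory PHP, not Laravel's implementation (which does the same bookkeeping atomically inside Redis); the ToyQueue class and its method names are made up for illustration, and time is passed in explicitly so the behavior is deterministic.

```php
<?php
declare(strict_types=1);

/** Toy in-memory model of the ready list and the reserved set. */
final class ToyQueue
{
    /** @var list<string> jobs waiting to run (like queues:default) */
    private array $ready = [];

    /** @var array<string,int> job => lease expiry (like queues:default:reserved) */
    private array $reserved = [];

    public function __construct(private int $retryAfter) {}

    public function push(string $job): void
    {
        $this->ready[] = $job;
    }

    /** Reserve the next job, or return null when nothing is ready. */
    public function pop(int $now): ?string
    {
        // Move jobs whose lease has expired back onto the ready list.
        foreach ($this->reserved as $job => $expiresAt) {
            if ($expiresAt <= $now) {
                unset($this->reserved[$job]);
                $this->ready[] = $job;
            }
        }

        $job = array_shift($this->ready);
        if ($job !== null) {
            $this->reserved[$job] = $now + $this->retryAfter;
        }

        return $job;
    }

    /** Called when a worker finishes a job successfully. */
    public function delete(string $job): void
    {
        unset($this->reserved[$job]);
    }
}

$queue = new ToyQueue(retryAfter: 90);
$queue->push('invoice-42');

$first = $queue->pop(now: 0);   // worker A reserves the job, lease runs to t=90
$none  = $queue->pop(now: 30);  // worker B finds nothing: the job is leased
$again = $queue->pop(now: 95);  // A never called delete(), so the lease expired
```

The third pop hands the same job out again, which is the recovery path when a worker dies, and also the double-execution path when a slow worker is merely late.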

For Redis-backed queues, the production answer is Laravel Horizon — a dashboard and process supervisor that replaces hand-rolled Supervisor configs. Install it, configure your queues and worker counts in config/horizon.php, and run:

php artisan horizon              # start the master + workers
php artisan horizon:terminate    # graceful shutdown for deploys
php artisan horizon:status

Horizon also gives you per-queue throughput graphs, recent job inspection, and runtime tagging — well worth the install if you’re already on Redis.

Gotchas (the actual reason you’re reading this)

These are the ones I’ve personally been bitten by. None are bugs; all are design tradeoffs you need to know.

1. Dispatching inside a database transaction

If you dispatch a job inside a DB::transaction(), a fast worker can pick it up before the transaction commits. The job then tries to load a row that doesn’t exist yet, and you get a ModelNotFoundException that’s impossible to reproduce locally.

Two fixes. Per dispatch:

SendInvoiceEmail::dispatch($invoice)->afterCommit();

Or globally, in config/queue.php:

'connections' => [
    'redis' => [
        'driver'        => 'redis',
        'connection'    => 'default',
        'queue'         => 'default',
        'retry_after'   => 90,
        'block_for'     => null,
        'after_commit'  => true,    // <-- this
    ],
],

2. retry_after MUST be greater than --timeout

The worker’s --timeout is how long the worker waits before killing the job. The connection’s retry_after is how long Laravel waits before assuming the job is dead and making it available to workers again. If retry_after is less than --timeout, the queue hands the job to a second worker while the original worker is still happily running it. You get two simultaneous executions and any side effect (emails, charges, webhooks) happens twice.

Rule of thumb: retry_after = timeout + 30.
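A check like this is cheap to automate. The following is a minimal standalone sketch, assuming a connections array shaped like the one in config/queue.php; the unsafeConnections() helper and the sample values are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Return the names of connections whose retry_after leaves less than
 * $margin seconds of headroom over the worker timeout.
 */
function unsafeConnections(array $connections, int $workerTimeout, int $margin = 30): array
{
    $unsafe = [];
    foreach ($connections as $name => $config) {
        // Connections without a retry_after key (such as sync) are skipped.
        if (!isset($config['retry_after'])) {
            continue;
        }
        if ($config['retry_after'] < $workerTimeout + $margin) {
            $unsafe[] = $name;
        }
    }

    return $unsafe;
}

$connections = [
    'redis'    => ['driver' => 'redis', 'retry_after' => 90],
    'database' => ['driver' => 'database', 'retry_after' => 45],
    'sync'     => ['driver' => 'sync'],
];

// Workers run with --timeout=60, so anything under 90 is flagged.
$problems = unsafeConnections($connections, workerTimeout: 60);
print_r($problems);
```

Something along these lines could run in a test suite or a deploy check so that a config change never silently breaks the rule.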

3. Code changes are NOT picked up automatically

The worker boots the framework once and holds it in memory for as long as the process lives. Deploying new code doesn’t change the running worker’s behavior at all. You must run php artisan queue:restart after every deploy (or php artisan horizon:terminate if Horizon manages your workers). That command signals the workers to exit cleanly once their current job finishes, and Supervisor (or whatever supervises Horizon) starts them again with the new code.

Many hours of confused debugging have been spent on this. Make it part of your deploy script and never think about it again.

4. Jobs serialize their constructor arguments

The SerializesModels trait stores only the model’s class and primary key, then re-fetches the row when the job runs. This is great — except if the row has been deleted between dispatch and execution, the job blows up with ModelNotFoundException.

For jobs that should tolerate a soft-deleted parent, either pass the ID and load with withTrashed(), or override getRestoredPropertyValue(). For jobs where the row really must exist, log loudly enough that the failure is obvious.

5. failed_jobs only catches thrown exceptions

A job that never throws but loops forever or silently no-ops never reaches failed_jobs. For the runaway case the only safety net is --timeout, so set it. A timeout-killed job goes to failed_jobs with a MaxAttemptsExceededException after retries are exhausted.

6. Bus::batch does not respect afterCommit by default

The per-dispatch afterCommit() method does not exist on the batch builder. If you dispatch a batch inside a transaction, wrap it yourself:

DB::transaction(function () use ($invoices) {
    // ...do transactional work

    DB::afterCommit(function () use ($invoices) {
        Bus::batch($invoices->map(fn ($i) => new SendInvoiceEmail($i)))
            ->dispatch();
    });
});

7. Queued closures are fragile

dispatch(function () { … }) looks tempting, and it does work: the framework serializes the closure with laravel/serializable-closure and signs it with your app key. But many edge cases (use statements, non-serializable captures) break it. Don’t. Write a real job class.

8. Redis memory pressure with large backlogs

Every queued job lives entirely in Redis memory until it runs. A million jobs at 5KB each is 5GB of Redis. If your payload is large or your backlog is unbounded, either trim the payload (pass an ID, fetch on the other side), or use the database driver for that specific queue where backlog matters more than latency.
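The arithmetic is worth running against your own payloads. This is a rough standalone sketch: the backlogBytes() helper and the sample payloads are made up, and the estimate ignores both Redis overhead and the envelope Laravel wraps around each job, so treat the numbers as a lower bound.

```php
<?php
declare(strict_types=1);

/** Rough backlog size in bytes: job count times average payload size. */
function backlogBytes(int $jobs, int $bytesPerJob): int
{
    return $jobs * $bytesPerJob;
}

// A fat payload carrying the whole record versus a thin one carrying an ID.
$fat = json_encode([
    'invoice' => [
        'id'    => 42,
        'lines' => array_fill(0, 200, ['sku' => 'ABC-123', 'qty' => 1]),
    ],
]);
$thin = json_encode(['invoice_id' => 42]);

$fatBacklog  = backlogBytes(1_000_000, strlen($fat));
$thinBacklog = backlogBytes(1_000_000, strlen($thin));

printf("fat: %d bytes, thin: %d bytes\n", $fatBacklog, $thinBacklog);
```

Passing an ID and re-fetching on the worker side trades a database read per job for a much smaller backlog footprint.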

9. Horizon’s auto-balancer can starve long-running queues

Horizon’s auto balancing strategy reallocates workers based on queue length every few seconds. If a queue’s jobs are long-running (say, 10-minute video transcodes), Horizon sees “queue empty, reassign workers elsewhere” and the next batch waits. Use the simple strategy for that supervisor, or pin a minimum worker count.
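One way to apply that advice is to give the long-running queue its own supervisor. The fragment below is a sketch of what that might look like in config/horizon.php; the supervisor names, the transcodes queue, and the numbers are assumptions for illustration, not values from a real project.

```php
<?php
// config/horizon.php (fragment; merge into your existing file)
return [
    'environments' => [
        'production' => [
            // Long-running jobs: fixed worker count, no auto-balancing.
            'supervisor-transcodes' => [
                'connection' => 'redis',
                'queue'      => ['transcodes'],
                'balance'    => 'simple',
                'processes'  => 4,
                'timeout'    => 900,
                'tries'      => 1,
            ],
            // Short jobs: let Horizon scale between a floor and a ceiling.
            'supervisor-default' => [
                'connection'   => 'redis',
                'queue'        => ['high', 'default'],
                'balance'      => 'auto',
                'minProcesses' => 1,
                'maxProcesses' => 10,
            ],
        ],
    ],
];
```

If you raise timeout like this, remember gotcha 2: the connection’s retry_after has to stay above it.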

Closing Thoughts

The queue layer is the boring kind of magical: it works exactly the same on top of sync, database, redis, or sqs. You develop against database, ship to redis, and your code doesn’t change. The cost of that abstraction is knowing the gotchas above — none of them are bugs, but all of them have bitten me at least once, and the diagnosis is much faster the second time. 🎉

If you take three things away: use afterCommit() whenever you dispatch inside a transaction, always run queue:restart on deploy, and keep retry_after greater than your worker timeout. Everything else, you can learn the hard way. 🛠️

Further Reading

  • Laravel Queues documentation: laravel.com/docs/queues. The canonical reference for jobs, batching, retries, and middleware.
  • Laravel Horizon documentation: laravel.com/docs/horizon. Configuration, balancing strategies, metrics, deployment.
  • Job middleware: laravel.com/docs/queues#job-middleware. Covers WithoutOverlapping, rate limiting, and skipping.
  • Bus::batch: laravel.com/docs/queues#job-batching. Including the before, progress, then, catch, finally callbacks and allowFailures().

Laravel Sail: a developer’s cheat sheet 🐳

Laravel ships with Sail — a thin command-line wrapper around docker compose that gives you the whole Laravel toolchain (PHP, MySQL, Redis, Mailpit, Node) in containers, without you needing to install any of them on your host. The only thing you need on the laptop is Docker. Everything else lives in containers and goes away when you delete the project.

This is the quick-reference I keep open in another tab while building Laravel apps on macOS. 🍎

What you actually need on the host

  • macOS (these notes target Apple Silicon and Intel Macs equally)
  • Docker Desktop — the only hard prerequisite. Sail uses it for everything else (PHP, Composer, Node, MySQL, Redis).
  • That’s it. You don’t need PHP installed locally. You don’t need Composer locally. You don’t need Node locally. You install them once via Sail’s bootstrap and from then on every command runs inside containers.

Spin up a fresh project (with MySQL and Redis)

The official one-liner uses Laravel’s builder image to scaffold a new app and pre-select the services you want. List them in the with query parameter; naming services explicitly replaces the default set, so include mailpit alongside mysql and redis if you want the mail catcher described below:

curl -s "https://laravel.build/example-app?with=mysql,redis,mailpit" | bash
cd example-app
./vendor/bin/sail up -d

That brings up four containers (your app, MySQL, Redis, and Mailpit, the dev mail-catcher) and exposes the app on http://localhost. The first run pulls images and takes a couple of minutes; subsequent sail up calls are fast.

Tip: alias sail so you don’t have to type the long path every time.

alias sail='[ -f sail ] && sh sail || sh vendor/bin/sail'

Drop that into your ~/.zshrc and you can just type sail up -d, sail artisan …, etc., from the root of any Sail project (the alias looks for vendor/bin/sail relative to the current directory).

The Artisan commands you’ll reach for daily

Anything you’d run as php artisan … on a non-Sail setup, you run as sail artisan …. Sail just shells into the app container and forwards the command. The most common ones:

sail artisan tinker                      # interactive REPL with your app booted
sail artisan route:list                  # show every registered route
sail artisan migrate                     # run pending migrations
sail artisan make:controller UserController
sail artisan make:model Department -m    # model + migration in one shot
sail artisan queue:work                  # start a worker against the default queue

tinker is the standout feature you’ll likely use most — it’s a Laravel-aware PHP REPL with every facade, every model, and your full config() ready to go. Need to check what User::find(1)->roles returns? sail artisan tinker, type the expression, get an answer. Beats writing a controller-and-route just to peek at data.

Mailpit — see every email your app sends

Sail bundles Mailpit, a friendly local SMTP server with a web UI. Any mail your app tries to send (password resets, notifications, queued emails) gets caught and shown at:

http://localhost:8025

No SMTP credentials, no real provider, no actual emails leaving your machine. Just open the inbox and see what your app sent. The .env Sail generates already wires MAIL_MAILER=smtp, MAIL_HOST=mailpit, MAIL_PORT=1025, so it works on first run.

Database workflow: migrate, seed, refresh

The mental model: migrations describe schema changes, seeders insert sample data, and there’s a small family of commands for moving between states while you’re iterating on a feature.

# Roll back every migration, re-run them all from scratch, then run seeders
sail artisan migrate:refresh --seed

# Create a new migration file in database/migrations/
sail artisan make:migration create_departments_table

# Roll back the most recent migration (--step=N for the last N) and re-apply it,
# the fastest way to iterate on a brand-new migration you're still tweaking
sail artisan migrate:rollback --step=1 && sail artisan migrate

The third one is the workhorse for daily development: edit the migration, roll it back one step, run forward, repeat. migrate:refresh --seed is heavier: it rolls back every migration and re-applies them all, so save it for when you’ve made many changes and want a clean slate.

Installing dependencies

Composer (PHP) and npm (frontend) both run inside the Sail container. The full “I just pulled a fresh branch” sequence:

sail composer install && sail npm install && sail npm run dev

sail npm run dev starts Vite in dev mode for hot reloading. For a production-style build, use sail npm run build and serve the compiled assets.

Routes and pages

The flow for a new page is short. Define a route, point it at a controller method, render a Blade view.

// routes/web.php
use App\Http\Controllers\DashboardController;

Route::get('/dashboard', [DashboardController::class, 'index'])
    ->name('dashboard');

Generate the controller:

sail artisan make:controller DashboardController

Then fill in the index method:

// app/Http/Controllers/DashboardController.php
public function index()
{
    return view('dashboard', ['user' => auth()->user()]);
}

Then check what’s wired by listing every registered route:

sail artisan route:list

Add --except-vendor to hide routes registered by vendor packages and see only yours; --name=dashboard filters the list by route name.

Getting a shell inside a container

Sometimes you need to poke around inside a container — inspect a config file, run a one-off mysql command, check redis state. Sail has shortcuts:

sail shell        # bash inside the app container, as the sail user
sail root-shell   # same, but as root (be careful)
sail mysql        # mysql client connected to the dev database
sail redis        # redis-cli connected to the local redis

Under the hood these are docker compose exec calls. Rough docker exec equivalents:

docker exec -it -u sail example-app-laravel.test-1 bash   # roughly what 'sail shell' does
docker exec -it example-app-mysql-1 bash                  # a shell in the MySQL container; 'sail mysql' goes on to start the mysql client
docker exec -it example-app-redis-1 redis-cli             # roughly what 'sail redis' does

The container names are <project-name>-<service-name>-1, so substitute your project’s directory name for example-app. sail shell drops you in as the unprivileged sail user; sail root-shell gives you root when you need to install a package or fix permissions. The container is a development sandbox, but you can still break things by being careless. Treat it like an SSH session into a dev box.

Tests

Laravel uses PHPUnit under the hood (with Pest as a popular alternative). Sail makes the runner one command:

# Generate a unit test stub
sail artisan make:test UserTest --unit

# Run the whole suite
sail artisan test

# Run with HTML coverage (output goes to ./coverage)
sail artisan test --coverage-html coverage

--unit creates the test under tests/Unit/ (no Laravel app boot, fastest to run). Without it, you get a feature test under tests/Feature/ which boots the application and gives you the full HTTP-style helpers ($this->get('/dashboard')->assertOk()). Use Unit for pure logic, Feature for anything touching routes, models, or services.

The --coverage-html flag requires Xdebug or PCOV in the container. Sail’s image ships PCOV, so this works out of the box on a default Sail setup.

When things misbehave: the cleanup checklist

Laravel caches a lot — config, routes, views, compiled service container. After bigger changes (especially editing config/*.php or env vars), the caches can lie to you. The reset:

sail artisan cache:clear
sail artisan config:clear
sail artisan route:clear
sail artisan view:clear

And of course, the first place to look when something is broken is the application log. Tail it in a separate terminal while you reproduce the bug:

tail -f storage/logs/laravel.log

Stack traces, query logs, anything you’ve passed to Log::info(): it all ends up here. If your app is logging to a different channel (configured in config/logging.php), check there instead.

The day-to-day shape

Once you’ve used Sail for a project or two, the daily loop becomes muscle memory: sail up -d in the morning, sail artisan commands as you build, sail artisan test before pushing, sail down when you switch projects. Nothing leaks onto the host, every project’s PHP/MySQL/Redis versions stay independent, and onboarding a new teammate is “install Docker, clone the repo, ./vendor/bin/sail up”.

For most Laravel work I do these days, I never type php directly anymore. ⛵


The Core AWS Stack vs Lightsail: When Building Blocks Beat the Bundle

Amazon Web Services (AWS) gives you a Lego bin the size of a warehouse. Powerful, but overwhelming when you just want a website online. Before you can decide whether the full stack is worth the complexity, it helps to know what the core pieces actually do. ☁️

Virtual Private Cloud (VPC)

A VPC is your own private network inside AWS. You pick the address range, carve it into subnets, decide what is public-facing and what stays internal, and attach firewall rules through security groups and network ACLs. Nothing in AWS runs in a vacuum — almost every other service ends up inside a VPC somewhere.

Elastic Compute Cloud (EC2)

EC2 is virtual machines on demand. You choose an instance type (CPU, RAM, network), an Amazon Machine Image (AMI), and a region, and a few seconds later you have a Linux or Windows server. Pay by the second, terminate when you are done. It is the workhorse of AWS — every shortcut service is, somewhere underneath, EC2 in a trench coat.

Relational Database Service (RDS)

RDS is managed databases — MySQL, PostgreSQL, MariaDB, Oracle, SQL Server, Aurora. AWS handles backups, patching, replication, and failover. You still write your own queries and tune your own indexes, but you do not have to babysit the operating system. For most teams this is worth it on day one. 🗄️

Elastic Load Balancer (ELB)

An ELB sits in front of a pool of EC2 instances (or containers) and spreads traffic across them. It also handles Transport Layer Security (TLS) termination, health checks, and graceful failover when an instance dies. The Application Load Balancer (ALB) speaks HTTP, the Network Load Balancer (NLB) speaks TCP. Without one, scaling horizontally is a do-it-yourself project.

Route 53

Route 53 is the Domain Name System (DNS) service. Register domains, host zones, point records at your load balancer, and use latency-based or geolocation routing if you want users to land on the closest region. The number is a nod to the standard DNS port — port 53. 🌐

Elastic File System (EFS)

EFS is a managed Network File System (NFS) share — a real file system that multiple EC2 instances can mount at the same time. It is the bridge between a single disk attached to one server (EBS) and object storage (S3). If you have a fleet of web servers behind an ALB and they all need to read and write the same uploads folder, EFS is the boring correct answer. 📁

EFS is arguably the more important storage service of the two for an application. It is your main file system — where uploads, session files, generated thumbnails, plugin directories, and shared application state live. Two perks worth calling out:

  • Auto Infrequent Access detection. EFS watches access patterns per file and shuffles cold files into a cheaper Infrequent Access tier on its own. You set the threshold (for example 7, 14, 30, 60, or 90 days of no access) and AWS does the migration. If you enable the transition back to standard storage on first access, a file that gets read again moves back automatically. No cron jobs, no “did I forget to archive that?” moments.
  • Automatic backups. EFS integrates with AWS Backup and turns on a default daily backup plan when you create the file system — retention defaults to 35 days. You can disable it or replace it with a custom plan, but the safe option is on out of the box.

Simple Storage Service (S3)

S3 is object storage. Buckets hold files, files have URLs, and the durability is famously high (11 nines). It is cheap, effectively infinite, and the right home for archival data — old database dumps, log shipments, build artifacts, finished video renders, static site assets you serve through a Content Delivery Network (CDN).

The mental model is: EFS is your live filesystem, S3 is your archive and your public asset bucket. S3 will not give you POSIX semantics or a mount point your application can fopen(). EFS will not give you 11 nines of durability across regions for pennies a gigabyte. Use both, for different things.

Putting it together

A boring-but-correct production setup looks like this: a VPC with public and private subnets, an ALB in the public subnets, EC2 instances in the private subnets behind the ALB, RDS in another private subnet, EFS mounted on the EC2 fleet for shared application state, Route 53 pointing your domain at the ALB, and S3 holding archives and assets.

Route 53  -->  ALB  -->  EC2 (private subnet)  -->  RDS (private subnet)
                              |   |
                              |   +-->  EFS (live filesystem: uploads, sessions)
                              |
                              +-->  S3 (archives, static assets, backups)

That works. It also takes a weekend to wire up the first time, and you will read more Identity and Access Management (IAM) documentation than you wanted to.

Then there is Lightsail

AWS Lightsail is the same company saying: what if we hid all of that? 💡 A Lightsail instance is a virtual private server with a flat monthly price, a pre-installed stack (WordPress, LAMP, Node.js, etc.), a managed database option, a built-in load balancer, snapshots, and a one-click static IP. The pricing is predictable — no surprise bill from a forgotten NAT Gateway.

Under the hood it is still EC2, Elastic Block Store (EBS), and friends. Lightsail just buries the dials. You get a simpler console, fewer choices, and a price that fits on a sticky note.

When to pick which

Pick Lightsail when you want a single server (or a small fleet) running a well-known stack, you value predictable pricing over flexibility, and your scaling story is “resize the instance” rather than “auto-scale across availability zones”. Personal projects, small business sites, internal tools, and side hustles all live happily on Lightsail.

Pick the full AWS stack when you need fine-grained networking, multiple environments, custom IAM policies, autoscaling groups, multi-region failover, or any of the hundred services that only plug into a real VPC. The moment you say “we need a private link to another account” or “this has to scale to a million requests a second”, Lightsail is no longer the answer.

A useful escape hatch: Lightsail can peer with a VPC. You can start simple and reach into the bigger toolbox later without rebuilding from scratch. 🚀


A few useful additions.

RDS auto-backup deserves more billing. Automated backups are on by default: RDS takes a daily snapshot plus continuous transaction log capture, which gives you point-in-time recovery anywhere inside the retention window (1 to 35 days). For most teams this alone justifies paying the RDS premium over running MySQL on a raw EC2 box.

Selling points at a glance

If you only remember one thing per service:

  • VPC — your private network, with isolation, subnets, and firewall rules you actually control.
  • EC2 — any server you want, by the second, in any region.
  • RDS — managed databases with automated backups and point-in-time recovery built in.
  • ELB — traffic spreading, TLS termination, and health-checked failover without writing your own load balancer.
  • Route 53 — DNS with latency, geo, weighted, and failover routing baked in.
  • EFS — shared file system across many EC2 instances, auto-tiering for cold files, automatic backups by default. Your live filesystem.
  • S3 — durable, cheap, effectively infinite object storage with Intelligent-Tiering for cost control. Your archive and public asset bucket.
  • Lightsail — most of the above hidden behind a sticky-note price, when you want to ship instead of architect.

The pattern is the same across all of them: AWS sells you a primitive you could probably build yourself, then quietly adds the operational features (backups, tiering, failover, monitoring) that you would forget to build until the night you needed them. ✨


GraphQL for Java Developers: What You Actually Need to Know

GraphQL has been on the Java back-end radar for a while, mostly as something the front-end team kept bringing up. In 2022 that shifted. Spring for GraphQL 1.0 became generally available in May. The official Spring team now provides first-party GraphQL integration built on graphql-java, making GraphQL endpoints easier to adopt and evaluate. 🧩

This article is a tour of GraphQL for Java developers — what holds up in practice, what is oversold, and what the Spring side looks like in code.

One Endpoint, Client-Driven Queries

The headline feature of GraphQL is that clients dictate the exact shape of the data they want, and they do it through a single endpoint — almost always /graphql. No more /users/123, /users/123/orders, /users/123/orders/recent; one URL, one POST per request, the query body describes everything.

The wire format looks like this:

# The Query
query {
  user(id: "1") {
    name
    email
  }
}

# The Response
{
  "data": {
    "user": {
      "name": "Alex",
      "email": "alex@example.com"
    }
  }
}

The response mirrors the query exactly. If you wanted only the name, you’d ask for only the name and the response would carry only the name. That eliminates two long-running REST headaches in one move: over-fetching (the API gives you fifteen fields you don’t need) and under-fetching (you have to chain three calls to assemble the page).

The Schema is the Contract

Every GraphQL Application Programming Interface (API) is described by a Schema Definition Language (SDL) document — the strict, machine-checked blueprint of every type, every field, every argument. The server refuses to execute a query that names a field the schema doesn’t declare. No more reading documentation and crossing your fingers. ✍️

With Spring for GraphQL, you drop your schema into src/main/resources/graphql/schema.graphqls:

type Query {
  userById(id: ID!): User
}

type User {
  id: ID!
  name: String!
  email: String!
}

The exclamation marks mean “non-null” — Spring for GraphQL maps that straight onto Java’s nullability story, and the schema is loaded at startup with full validation. Typos fail fast. The other consequence of this strong typing is that the response shape mirrors the query exactly, which is wonderful for front-end developers — they can predict the response payload just by reading their own code.

Three Operation Types

GraphQL splits all interactions into three first-class operations:

  • Queries — read-only data fetches.
  • Mutations — writes (create, update, delete).
  • Subscriptions — long-lived streams of events pushed from server to client.

You’ll see a lot of articles claim subscriptions “are delivered over WebSockets.” That’s the most common transport, but it’s not the only one — the graphql-sse spec (Server-Sent Events) is a perfectly valid alternative if you don’t want to drag a WebSocket stack into your service. ⚠️ The point is: subscriptions are an operation type, not a transport. Pick the transport that fits your infrastructure.

Resolvers and the N+1 Trap

On the server, every field is backed by a function called a resolver. The runtime walks the query tree, calling the resolver for each requested field. Resolvers are where you connect GraphQL to your actual data — a Java Persistence API (JPA) repository, a downstream Representational State Transfer (REST) call, a Redis lookup, whatever.

In Spring for GraphQL, resolvers are just @Controller beans with the right annotations: 🔌

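import org.springframework.graphql.data.method.annotation.Argument;
import org.springframework.graphql.data.method.annotation.QueryMapping;
import org.springframework.stereotype.Controller;

@Controller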
public class UserController {

    private final UserRepository users;

    public UserController(UserRepository users) {
        this.users = users;
    }

    @QueryMapping
    public User userById(@Argument String id) {
        return users.findById(Long.parseLong(id)).orElse(null);
    }
}

That maps the SDL field Query.userById to a Java method. Spring for GraphQL handles the dispatch, the argument binding, and the JavaScript Object Notation (JSON) serialization.

The trap waiting for you: N+1 queries. If a client asks for a list of 50 users and each user’s orders field is resolved independently, you’ll hit your database 51 times — once for the user list, then once per user for their orders. The classic fix in the GraphQL world is DataLoader: a batching, caching shim that collects all the individual userId requests within a single execution and flushes them as one batched database call. Spring for GraphQL wraps this with @BatchMapping, which is the most ergonomic version I’ve used:

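import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

import org.springframework.graphql.data.method.annotation.BatchMapping;
import org.springframework.stereotype.Controller;

@Controller
public class UserController {

    private final OrderRepository orders;

    public UserController(OrderRepository orders) {
        this.orders = orders;
    }

    // Called once per request with every User whose orders field was selected
    @BatchMapping(typeName = "User")
    public Map<User, List<Order>> orders(List<User> users) {
        List<Long> ids = users.stream().map(User::getId).toList();
        Map<Long, List<Order>> byUser = orders.findAllByUserIdIn(ids)
                .stream()
                .collect(Collectors.groupingBy(Order::getUserId));
        return users.stream()
                .collect(Collectors.toMap(u -> u, u -> byUser.getOrDefault(u.getId(), List.of())));
    }
}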

One database call for any number of users in the query. The java-dataloader library is doing the heavy lifting under the hood; you just don’t have to think about it.

Introspection: Great in Dev, Risky in Prod

One of the genuinely lovely things about GraphQL is introspection — the schema can be queried for its own structure. That’s what powers tools like GraphiQL and Apollo Studio: you point them at the endpoint, they pull the schema, and you get auto-complete, inline docs, and a query playground for free.

The caveat that doesn’t make it into the listicles: many teams disable introspection in production. A public introspection endpoint hands attackers a free map of every field, every mutation, every internal type name. graphql-java exposes a flag to turn it off, and Spring Boot surfaces the same switch as the spring.graphql.schema.introspection.enabled property. Leave it on in dev and staging; turn it off (or gate it behind auth) in production. 🛡️

“Versionless” is a Goal, Not a Guarantee

You’ll read that GraphQL APIs are “versionless” — that you evolve continuously, add fields, mark old ones with @deprecated, and never have to ship a /v2/ path. The first half of that is true; the second half is a bit of a fairy tale.

GraphQL gives you genuinely good tools for additive evolution. Adding a field never breaks anyone: old clients don’t ask for it, new clients do. @deprecated shows up in introspection and in IDE tooling, which is more visible than a JIRA ticket. But you can absolutely ship breaking changes: remove a field, narrow an enum, make a nullable argument non-null (or a non-null output field nullable), rename a type. Any of those can break clients in production. Versionless evolution is a discipline, and the schema doesn’t enforce it for you.

The thing GraphQL really gives you is observability of usage. Because every query names the fields it touches, you can log queries and see exactly which clients depend on which fields. Combine that with deprecation tooling and you can retire fields safely — but it’s still work, not magic.

Errors and Status Codes

This one trips up everyone on day one: in classic GraphQL, almost every response returns Hypertext Transfer Protocol (HTTP) 200 OK, even when the operation failed. Errors come back in a dedicated errors array inside the JSON body, not as a 4xx or 5xx.

{
  "data": null,
  "errors": [
    {
      "message": "User not found",
      "path": ["userById"],
      "extensions": { "classification": "NOT_FOUND" }
    }
  ]
}

This is jarring if you’ve spent years writing Spring @ExceptionHandlers that map exceptions to 4xx/5xx. But it’s intentional — the transport (HTTP) succeeded; the application semantics (the query) had a problem, and that’s a different layer. The newer GraphQL-over-HTTP draft spec does loosen this and allow proper status codes when the response uses application/graphql-response+json, but most clients in the wild still expect the old convention. Code for 200 + errors array first; you can adopt the newer behavior later.
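
As a rough illustration of "code for 200 + errors array first", the sketch below classifies an already-parsed response body without looking at the HTTP status. The class, enum, and method names are hypothetical, and it assumes the JSON has been parsed into plain Maps and Lists by whatever library you use.

```java
import java.util.List;
import java.util.Map;

/** Classifies a parsed GraphQL response body (hypothetical helper, not a library API). */
final class GraphQlOutcomeSketch {

    enum Outcome { SUCCESS, PARTIAL, FAILURE }

    static Outcome classify(Map<String, Object> body) {
        Object errors = body.get("errors");
        boolean hasErrors = errors instanceof List && !((List<?>) errors).isEmpty();
        boolean hasData = body.get("data") != null;

        if (!hasErrors) {
            return Outcome.SUCCESS;   // no errors array, or an empty one
        }
        // GraphQL allows data and errors in the same response
        return hasData ? Outcome.PARTIAL : Outcome.FAILURE;
    }
}
```

The response shown above ("data": null plus one error) classifies as FAILURE; a response carrying both data and errors is the partial-success case that has no clean equivalent in REST status codes.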

So, Is It Worth It?

GraphQL earns its keep when your front end is genuinely composite — a single page assembling data from five backend domains — and when you have clients you don’t fully control (mobile apps shipped to app stores, partner integrations). It’s a poor fit for simple Create, Read, Update, Delete (CRUD) services with one consumer; you’ll spend more time wiring resolvers than you’d ever spend writing REST controllers.

For Java teams in 2022, the calculus has genuinely shifted. Spring for GraphQL turns the integration cost from “weekend project” into “afternoon project,” and @BatchMapping handles the most common performance footgun out of the box. If you’ve been waiting for the right moment to put GraphQL in front of a Spring Boot service, this is the moment. 🎯

Further Reading

  • Spring for GraphQL documentation (docs.spring.io/spring-graphql). The reference is genuinely good; start with the “Controllers” and “Batch Loading” sections.
  • graphql-java (graphql-java.com). The underlying engine; you’ll only touch it directly for advanced customization.
  • The GraphQL specification (spec.graphql.org). Surprisingly readable. The “Validation” and “Execution” sections are worth the hour.
  • Production Ready GraphQL (Marc-André Giroux) — the best single book on operating GraphQL at scale.
  • GraphQL-over-HTTP working draft (github.com/graphql/graphql-over-http). Where the status code and transport conventions are being formalized.

Apache Kafka vs RabbitMQ for Messaging in Java (and Where ActiveMQ Fits In)

If you’re standing in front of a whiteboard in Java land and someone has just drawn a box labelled “message queue,” you’re probably going to argue about Apache Kafka and RabbitMQ for the next forty minutes. They’ve become the default two-horse race for back-end messaging, and the choice between them isn’t really about speed or popularity — it’s about whether you want a queue or a log. Get that distinction right and the rest of the decision falls out cleanly. ☕

The One-Sentence Difference

RabbitMQ is a traditional message broker: producers push messages into queues, consumers pop them off, and once a message is acknowledged, it’s gone. Apache Kafka is a distributed commit log: producers append records to partitioned topics, consumers read at their own pace by tracking offsets, and records hang around for a configured retention period — usually days — whether anyone has read them or not. 💡

Everything else flows from that. The protocols differ, the durability model differs, the scaling model differs, the operational pain points differ. But the shape of the abstraction is the thing.
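
The queue-versus-log distinction can be sketched in a few lines of plain Java. This is a toy in-memory model with hypothetical class names, not how either broker is implemented: the queue forgets a message once it is consumed, while the log keeps every record and lets each reader choose its own offset.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class QueueVersusLogSketch {

    /** Broker-style queue: consuming a message removes it for everyone. */
    static final class ToyQueue {
        private final Deque<String> messages = new ArrayDeque<>();

        void publish(String message) {
            messages.addLast(message);
        }

        /** Returns the next message (or null if empty) and deletes it. */
        String consume() {
            return messages.pollFirst();
        }

        int size() {
            return messages.size();
        }
    }

    /** Log-style topic: records stay put; each reader tracks its own offset. */
    static final class ToyLog {
        private final List<String> records = new ArrayList<>();

        /** Appends a record and returns its offset. */
        long append(String record) {
            records.add(record);
            return records.size() - 1;
        }

        /** Reads everything from the given offset onward without removing anything. */
        List<String> readFrom(long offset) {
            return new ArrayList<>(records.subList((int) offset, records.size()));
        }

        int size() {
            return records.size();
        }
    }
}
```

Reading the log twice from offset 0 returns the same records both times; consuming the queue twice returns two different messages. That difference is the whole argument in miniature.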

RabbitMQ in 90 Seconds

RabbitMQ implements the Advanced Message Queuing Protocol (AMQP) 0-9-1 by default, with plugins for Message Queuing Telemetry Transport (MQTT), Streaming Text Oriented Messaging Protocol (STOMP), and a Hypertext Transfer Protocol (HTTP) API. It’s written in Erlang, which is part of the secret to its reliability — Erlang’s process model is excellent at the kind of supervised, fault-tolerant work a broker has to do.

The mental model is exchanges, bindings, and queues. A producer publishes to an exchange; the exchange routes the message to zero or more queues based on bindings; consumers subscribe to queues. Routing can be direct (exact key match), topic (pattern match), fanout (broadcast), or headers-based. Once a queue holds a message and a consumer acknowledges it, the broker deletes it. 🐰
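
For the topic routing type, the pattern rules are simple enough to sketch directly: routing keys are dot-separated words, * matches exactly one word, and # matches zero or more words. The matcher below is an illustrative re-implementation with a hypothetical class name, not RabbitMQ's own code.

```java
/** Toy matcher for AMQP topic-exchange binding patterns (illustrative only). */
final class TopicPatternSketch {

    static boolean matches(String pattern, String routingKey) {
        return match(pattern.split("\\."), 0, routingKey.split("\\."), 0);
    }

    private static boolean match(String[] p, int i, String[] k, int j) {
        if (i == p.length) {
            return j == k.length;          // pattern used up: key must be too
        }
        if (p[i].equals("#")) {
            // '#' may absorb zero or more words; try every possible length
            for (int next = j; next <= k.length; next++) {
                if (match(p, i + 1, k, next)) {
                    return true;
                }
            }
            return false;
        }
        if (j == k.length) {
            return false;                  // pattern still needs a word, key has none
        }
        if (p[i].equals("*") || p[i].equals(k[j])) {
            return match(p, i + 1, k, j + 1);
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(matches("orders.*", "orders.created"));
        System.out.println(matches("orders.#", "orders.eu.created"));
    }
}
```

A binding of orders.* receives orders.created but not orders.eu.created, while orders.# receives both.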

A minimal Spring Boot consumer:

@Component
public class OrderListener {

    @RabbitListener(queues = "orders.created")
    public void onOrderCreated(OrderCreatedEvent event) {
        // ...handle the order
    }
}

You add spring-boot-starter-amqp, declare your queue and exchange as @Beans, and you’re done. The library handles connections, channels, and retries; for JSON payloads like the one above, register a Jackson2JsonMessageConverter bean so message conversion goes through Jackson.

Apache Kafka in 90 Seconds

Kafka is not a broker in the RabbitMQ sense. It’s a partitioned, replicated, append-only log running across a cluster of brokers. Producers write records to topics; each topic is split into partitions; each partition is an ordered, immutable sequence of records stored on disk and replicated to a configurable number of brokers. Consumers read by maintaining their own position (an offset) into each partition.

The model has two consequences that surprise people coming from RabbitMQ:

  • Reading a record doesn’t delete it. The next consumer (or the same one, restarted) can read it again by rewinding the offset. Retention is time-based or size-based.
  • Ordering is per-partition, not per-topic. If you need strict ordering across a logical key (“all events for user 42 in order”), you partition on that key.
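
The second point works because the same key always lands on the same partition. A minimal sketch of that mapping follows; it uses String.hashCode as a stand-in, which is an assumption for illustration (Kafka's Java client hashes the serialized key bytes with murmur2 instead), and the class and method names are hypothetical.

```java
/** Toy key-to-partition mapping (stand-in hash, not Kafka's murmur2 partitioner). */
final class KeyPartitionerSketch {

    static int partitionFor(String key, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        // floorMod keeps the result in [0, numPartitions) even for negative hash codes
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        // Every event for the same user maps to the same partition, so their order is kept
        System.out.println(partitionFor("user-42", 6));
        System.out.println(partitionFor("user-42", 6));
    }
}
```

The flip side is that changing the partition count changes the mapping, which is one reason partition counts are awkward to alter after the fact.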

A Spring Boot consumer:

@Component
public class OrderEventListener {

    @KafkaListener(topics = "orders", groupId = "order-processor")
    public void onOrderEvent(ConsumerRecord<String, OrderEvent> record) {
        // ...process the event
    }
}

Same shape, totally different machine underneath.

So… Is ActiveMQ the New Kafka?

Short answer: no. ActiveMQ is the new RabbitMQ — or rather, it has always been RabbitMQ’s older Java-ecosystem sibling. Apache ActiveMQ is a Java Message Service (JMS)-first broker that has been around since 2004. The modern version, ActiveMQ Artemis (formerly HornetQ, donated by Red Hat to the Apache Software Foundation), is a ground-up rewrite that delivers genuinely high throughput and a much better internal architecture than the original ActiveMQ “Classic.”

Artemis is fast, supports AMQP, STOMP, MQTT, OpenWire, and JMS, and is a perfectly reasonable broker. But it’s still a broker. It deletes messages on acknowledgement, it tracks consumer state inside the broker, and it doesn’t have Kafka’s “the log is the truth, rewind whenever” model.

The reason people conflate ActiveMQ and Kafka is that Artemis has gotten fast enough that the old throughput argument doesn’t apply — “use Kafka because ActiveMQ can’t keep up” was true in 2014, not in 2022. But the architectural difference is unchanged: Kafka is a log; ActiveMQ is a queue. Pick based on the abstraction, not the throughput number. 📊

Side-by-Side Comparison

| Aspect | RabbitMQ | Apache Kafka | ActiveMQ Artemis |
| --- | --- | --- | --- |
| Core abstraction | Broker (queue) | Distributed log | Broker (queue) |
| Implementation language | Erlang | Scala / Java | Java |
| Wire protocols | AMQP 0-9-1, MQTT, STOMP | Custom binary (Kafka protocol) | AMQP, MQTT, STOMP, OpenWire, JMS |
| Message lifetime | Deleted on ack | Retained for configured period (e.g. 7 days) | Deleted on ack |
| Consumer state | Tracked in broker | Tracked by consumer (offset) | Tracked in broker |
| Replay | Not natively; needs a dead-letter dance | First-class: just reset the offset | Not natively |
| Ordering guarantee | Per queue | Per partition | Per queue |
| Throughput ceiling | Tens of thousands msg/s per node | Hundreds of thousands to millions msg/s per cluster | Tens of thousands msg/s per node |
| Latency | Sub-millisecond achievable | Few-millisecond typical (batch-oriented) | Sub-millisecond achievable |
| Routing model | Rich (exchanges, bindings, headers) | Topic + partition key, no broker-side routing | Rich (similar to RabbitMQ) |
| Transactions | Yes (broker) | Yes (across topics, idempotent producer) | Yes (JMS XA) |
| Stream processing | External (e.g. Spring Cloud Stream) | First-class (Kafka Streams, ksqlDB) | External |
| Cluster coordinator | Built-in (Erlang clustering / Khepri or Mnesia) | ZooKeeper, or KRaft on newer versions | Built-in |
| Management UI | Web UI ships in-box | None official; tools like AKHQ, Conduktor | Web console ships in-box |
| Spring starter | spring-boot-starter-amqp | spring-kafka | spring-boot-starter-artemis |
| Best fit | Task queues, RPC, complex routing | Event streaming, audit logs, change-data-capture | JMS-heavy enterprises, AMQP polyglot |

How to Actually Choose

Forget the benchmarks. Ask these three questions:

  1. Do I need to replay history? If a new consumer needs to see events from yesterday, last week, or last month, you want Kafka. If “once it’s processed, it’s gone” is your model, you want a broker.
  2. Do I need broker-side routing? If your producer publishes one logical event and seven different consumers each need a different subset based on routing keys or headers, that’s RabbitMQ’s home turf. Kafka does this differently — every consumer reads the whole topic and filters client-side, or you write a separate topic per logical subset.
  3. What does “high throughput” actually mean for you? A few thousand messages per second is comfortably in RabbitMQ / ActiveMQ territory. Tens of thousands per second sustained, with partitioned ordering and replayability, is where Kafka starts to earn its operational complexity tax. Don’t pay that tax if you don’t need it. 🐳

The Operational Footnote

Kafka is more work to run than RabbitMQ. A production Kafka cluster needs at least three brokers, careful disk planning (it’s an append-only log; throughput is bounded by sequential write speed), partition-count decisions you can’t easily change later, and consumer-group rebalancing that misbehaves in subtle ways. Managed options — Confluent Cloud, Amazon Managed Streaming for Apache Kafka (MSK), Aiven — are how most teams avoid building this muscle in-house. 🛡️

RabbitMQ clusters are simpler. Three nodes, quorum queues for the data you really can’t lose, the built-in management UI, and you’re done. The operational ceiling is lower but so is the floor.

ActiveMQ Artemis sits in between — easier than Kafka, slightly more configuration surface area than RabbitMQ, and a natural fit if your shop is already JMS-heavy.

Closing Thoughts

The Kafka-versus-RabbitMQ debate is mostly a category error. They aren’t competing products; they’re competing abstractions. Kafka is a log you happen to read like a queue. RabbitMQ is a queue with rich routing. ActiveMQ Artemis is a queue with deep JMS integration. Pick the abstraction your problem actually has, run it on the managed service if you can, and put the saved attention into your application code. 🎉

Further Reading

  • Kafka: The Definitive Guide (Confluent; confluent.io/resources/kafka-the-definitive-guide). The standard reference; covers partitions, replication, exactly-once semantics, Kafka Streams.
  • RabbitMQ documentation (rabbitmq.com/documentation.html). Especially the “Reliability Guide” and “Quorum Queues” sections.
  • ActiveMQ Artemis documentation (activemq.apache.org/components/artemis/documentation/). The architecture and clustering chapters are worth the time.
  • Spring for Apache Kafka (docs.spring.io/spring-kafka). Listener containers, transaction management, error handling.
  • Spring AMQP (docs.spring.io/spring-amqp). The RabbitMQ-side equivalent.
  • “The Log: What every software engineer should know about real-time data’s unifying abstraction” (Jay Kreps). The classic LinkedIn engineering post that explains the log abstraction better than any documentation does.

Java Web Servers Compared: Tomcat, JBoss EAP, WildFly, and Spring Boot

If you’ve been writing Java for the web at any point in the last two decades, you’ve had to pick a web server or application server at least once. The choices haven’t changed much in name — Tomcat, JBoss, WildFly, and the relative newcomer Spring Boot are still the four most common answers — but the boundaries between them have. This post is a quick tour of each one and an honest table of when to reach for which. ☕

Servlet Container vs. Application Server

Before we get into the four contenders, one piece of vocabulary that trips people up. A servlet container implements the part of the Jakarta Enterprise Edition (Jakarta EE, formerly Java EE) specification that handles Hypertext Transfer Protocol (HTTP) requests, servlets, and JavaServer Pages (JSP). A full Jakarta EE application server implements that and the rest of the platform: Enterprise JavaBeans (EJB), Java Persistence API (JPA), Java Message Service (JMS), Java Transaction API (JTA), Contexts and Dependency Injection (CDI), Java API for RESTful Web Services (JAX-RS), and so on.

Tomcat is a servlet container. JBoss Enterprise Application Platform (EAP) and WildFly are full Jakarta EE servers. Spring Boot is something different — it skips the container/server distinction entirely and ships an embedded server inside your application JAR. We’ll come back to that. 💡

Apache Tomcat

Tomcat is the original, the workhorse, the one that has been quietly running half the Java web in production since 1999. It’s developed under the Apache Software Foundation, implements the servlet, JSP, WebSocket, and Expression Language specs, and does not implement the heavier Jakarta EE pieces (no EJB, no JMS, no JTA out of the box).

That smaller surface area is its biggest feature. Tomcat starts in seconds, the configuration files (server.xml, web.xml) are short and well-understood, and the operational model is boring in the best sense — you drop a Web Application Archive (WAR) into webapps/ and a few seconds later the app is live.

wget https://dlcdn.apache.org/tomcat/tomcat-10/v10.1.18/bin/apache-tomcat-10.1.18.tar.gz
tar xzf apache-tomcat-10.1.18.tar.gz
cd apache-tomcat-10.1.18
cp ~/Downloads/my-app.war webapps/
./bin/catalina.sh run

If your app needs EJB or JMS, you’ll be wiring in Spring or Apache Camel or a standalone broker to fill those gaps. At which point you’re effectively building a half-application-server out of libraries — which is exactly the gap Spring Boot exists to close.

JBoss EAP

JBoss EAP is Red Hat’s commercial Jakarta EE application server. It’s the supported, certified, contractually-backed product version of WildFly. Same codebase, slower release cadence, security patches backported for years, paid support attached. You buy JBoss EAP when you have a procurement department that wants someone to sue if the server crashes at 2 a.m.

From a developer perspective, JBoss EAP is WildFly — the management console looks the same, the standalone.xml configuration looks the same, the deployment commands look the same. The difference is the support contract and the slower, hardened release line. 🛡️

WildFly

WildFly is the upstream open-source community version of JBoss EAP. It was renamed from JBoss Application Server in 2013 to make the distinction between the community product and the Red Hat product less confusing. (It mostly worked.) WildFly is fully compliant with Jakarta EE, supports MicroProfile out of the box for cloud-native APIs, and has a genuinely good administration story — the jboss-cli.sh tool can configure datasources, deploy WARs, change subsystem settings, and a hundred other things, scriptably.

./bin/standalone.sh -c standalone-full.xml
./bin/jboss-cli.sh --connect --command="deploy /path/to/my-app.war"

If you actually need EJB, JMS via HornetQ/Artemis, distributed transactions, or the rest of the Jakarta EE stack, WildFly delivers all of it without you having to assemble it from libraries. The cost is startup time (tens of seconds, not single digits) and configuration surface area (standalone.xml is a thousand-plus lines on a fresh install).

Spring Boot

Spring Boot is the answer to the question: what if we just put the server inside the app? You write a main() method, the framework boots an embedded Tomcat (or Jetty, or Undertow) on a port, and your application starts handling requests. No external server to install, no WAR to deploy, no separate lifecycle to manage. You build a single executable JAR and run it like any other Java program.

@SpringBootApplication
public class MyApp {
    public static void main(String[] args) {
        SpringApplication.run(MyApp.class, args);
    }
}

Run the resulting JAR like any other Java program:

java -jar my-app.jar

This is the model that won the last decade. It fits perfectly with containers — one JAR, one process, one port, one health endpoint — and it removes an entire category of operational tickets (“why does our staging Tomcat have a different version than production?”). 🐳

Under the hood, Spring Boot defaults to embedded Tomcat. You can switch to embedded Jetty or Undertow by swapping a starter dependency. None of that changes your application code.
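
The "server inside the app" idea is not specific to Spring. As a bare-bones illustration, the sketch below starts the HTTP server that ships with the JDK (com.sun.net.httpserver) from an ordinary method. It is not what Spring Boot does internally, which is to embed Tomcat, Jetty, or Undertow; the class name and the /health path are made up for the example.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

/** Minimal embedded-server illustration using the JDK's built-in HTTP server. */
final class EmbeddedServerSketch {

    /** Starts a server inside this JVM; port 0 asks the OS for any free port. */
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/health", exchange -> {
            byte[] body = "UP".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = start(8080);
        System.out.println("Listening on port " + server.getAddress().getPort());
    }
}
```

One process, one port, no external container to install: the same operational shape as a Spring Boot JAR, minus everything the framework adds on top.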

Side-by-Side Comparison

The honest summary:

| Aspect | Tomcat | JBoss EAP | WildFly | Spring Boot |
| --- | --- | --- | --- | --- |
| Type | Servlet container | Jakarta EE app server (commercial) | Jakarta EE app server (community) | Embedded-server framework |
| License | Apache 2.0 | Commercial (Red Hat) | LGPL 2.1 | Apache 2.0 |
| Vendor | Apache Software Foundation | Red Hat / IBM | Red Hat / community | Broadcom / VMware Tanzu |
| Jakarta EE compliance | No certified profile; implements Servlet, JSP, WebSocket, EL | Full Platform | Full Platform + MicroProfile | Not certified; uses pieces via Spring |
| EJB / JMS / JTA | No (add libraries) | Yes | Yes | No (use Spring equivalents or libraries) |
| Deployment unit | WAR dropped into webapps/ | WAR / EAR via CLI or console | WAR / EAR via CLI or console | Executable JAR (or WAR if you must) |
| Startup time | Seconds | Tens of seconds | Tens of seconds | Seconds |
| Memory footprint | Light (~150 MB) | Heavy (500 MB+) | Heavy (500 MB+) | Light to moderate |
| Config style | server.xml, web.xml | standalone.xml, jboss-cli, web console | standalone.xml, jboss-cli, web console | application.properties / application.yml |
| Hot redeploy | Reload context | Yes, via CLI | Yes, via CLI | DevTools restart in dev |
| Container fit | Good | Workable, heavy | Workable, heavy | Excellent (one JAR, one process) |
| Commercial support | Third parties (Tomitribe, etc.) | Red Hat subscription | Community / Red Hat | Broadcom / Tanzu / third parties |
| Best fit | Simple servlet/JSP or Spring-based apps | Regulated enterprises needing certified Jakarta EE | Teams wanting full Jakarta EE without paying | New cloud-native services |

So Which Do You Pick?

For a new service in 2022 with no legacy constraints, Spring Boot is the default answer. You get the productivity of a modern framework, an embedded server that fits the container model, and an ecosystem that covers virtually every integration you’ll need. The friction is low and the talent pool is huge. 🎉

If you’re maintaining an existing Tomcat deployment, there’s rarely a reason to migrate just for the sake of it — Tomcat is still excellent at what it does, and a Spring application running inside a standalone Tomcat is a perfectly valid setup. The migration to Spring Boot’s embedded model is a refactor you do when it pays for itself, not a religious obligation.

JBoss EAP and WildFly remain the right answer when you genuinely need the Jakarta EE Full Platform — distributed transactions across multiple resources, message-driven EJBs, two-phase commit between a database and a JMS queue. If that sentence sounds like your domain, you already knew. If it doesn’t, you probably don’t need them.

Closing Thoughts

The boundary between “servlet container” and “application server” mattered a lot in 2005. Today, with Spring Boot doing most of what an app server used to do via libraries, and Jakarta EE app servers slimming down to compete (WildFly’s wildfly-jar-maven-plugin can build a single fat JAR these days), the four contenders overlap more than they diverge. Pick the one your team already knows, unless there’s a specific feature pulling you elsewhere. 🛠️

Further Reading

  • Apache Tomcat documentation (tomcat.apache.org). Configuration reference for whichever Tomcat major version you’re running.
  • WildFly documentation (docs.wildfly.org). Admin guide, model reference, and the jboss-cli tutorial.
  • Red Hat JBoss EAP documentation (access.redhat.com/documentation/en-us/red_hat_jboss_enterprise_application_platform). Production hardening, clustering, and the support matrix.
  • Spring Boot reference documentation (docs.spring.io/spring-boot). The authoritative reference for the version you’re on.
  • Jakarta EE specifications (jakarta.ee/specifications). The actual standards each of these servers claims to implement.

Starting a Spring Boot API Microservice From Scratch With Spring Initializr

The fastest way to get a new Java microservice off the ground is also the most boring one, and that’s a compliment. You go to start.spring.io, click a few checkboxes, download a zip, and you have a runnable Hypertext Transfer Protocol (HTTP) service in under five minutes. No archetype incantations, no copy-pasting a pom.xml from a colleague’s repo, no “wait, which version of Spring Cloud goes with which Boot?” — Initializr handles compatibility for you. ☕

What We’re Building

A small, API-laden microservice; call it orders-service. It exposes a handful of Representational State Transfer (REST) endpoints, talks JavaScript Object Notation (JSON), persists to a database, validates inputs, exposes health checks, and publishes an interactive Swagger / OpenAPI page that lets you try the endpoints from a browser. The goal isn’t a particular business problem. It’s a clean skeleton you can fork the next time someone says “we need a new service for X.”

Generate the Project

Head to https://start.spring.io. The form is small but every choice matters, so let’s walk through it.

  • Project: Maven. Gradle is fine too; pick whichever your team already uses. The rest of this post shows Maven snippets.
  • Language: Java.
  • Spring Boot version: the latest non-snapshot release. Avoid the SNAPSHOT and M (milestone) entries — they’re moving targets.
  • Group / Artifact / Name: com.acme, orders-service, orders-service. Group becomes the base package; artifact becomes the JAR name.
  • Packaging: Jar. WARs only make sense if you’re deploying into an existing servlet container, which you shouldn’t be.
  • Java version: the latest Long-Term Support (LTS) release available — 17 at time of writing. Stay on LTS for production.

Then the dependency list. For an API-laden service, click the Add Dependencies button and pick:

  • Spring Web — REST controllers, embedded Tomcat, JSON via Jackson.
  • Spring Data JPA — repository abstraction over the database.
  • PostgreSQL Driver — or your database of choice (MySQL Driver, H2 for in-memory testing).
  • Validation — Jakarta Bean Validation (the @Valid, @NotNull, @Size annotations).
  • Spring Boot Actuator — health, metrics, info endpoints.
  • Spring Boot DevTools — auto-restart on classpath changes during local dev.
  • Lombok — kill the getter/setter/builder boilerplate.
  • Testcontainers — run a real PostgreSQL in tests instead of mocking the repository layer.

Click Generate. You get a zip. Unzip it, cd in, and you can already run the service:

unzip orders-service.zip && cd orders-service
./mvnw spring-boot:run

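That’s it, with one catch. Because Spring Data JPA and the PostgreSQL driver are on the classpath, startup fails with a “Failed to configure a DataSource” error until the datasource settings from “Wire Up the Database” below point at a reachable database. Once they do, you’ll see Tomcat start on port 8080 and the actuator endpoints come alive. 🎉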

Add Swagger / OpenAPI Manually

Initializr doesn’t ship a Swagger checkbox, so add it by hand. The community-maintained library that plays nicely with current Spring Boot is springdoc-openapi. In your pom.xml:

<dependency>
    <groupId>org.springdoc</groupId>
    <artifactId>springdoc-openapi-starter-webmvc-ui</artifactId>
    <version>2.3.0</version>
</dependency>

Restart the app and visit http://localhost:8080/swagger-ui.html. The library scans your @RestController classes, reads their annotations, and generates an interactive page where you can click Try it out and fire real requests. The raw OpenAPI 3 spec is served at /v3/api-docs in JSON, which is what you give to consumers who want to generate clients. 📘

Avoid the older springfox library you’ll find in many Stack Overflow answers — it has not kept up with current Spring Boot and you’ll fight startup errors. springdoc-openapi is the answer.

A Minimal Working Endpoint

Initializr drops an OrdersServiceApplication.java with the @SpringBootApplication annotation. Add an entity, a repository, a Data Transfer Object (DTO), and a controller next to it:

@Entity
@Table(name = "orders")
@Getter @Setter
@NoArgsConstructor @AllArgsConstructor
public class Order {
    @Id @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    @NotBlank
    private String customerEmail;

    @NotNull @Positive
    private BigDecimal amount;

    private Instant createdAt = Instant.now();
}

The repository:

public interface OrderRepository extends JpaRepository<Order, Long> {
    List<Order> findByCustomerEmail(String customerEmail);
}

The request DTO:

public record CreateOrderRequest(
    @NotBlank @Email String customerEmail,
    @NotNull @Positive BigDecimal amount
) {}

The controller:

@RestController
@RequestMapping("/api/orders")
@RequiredArgsConstructor
public class OrderController {

    private final OrderRepository repository;

    @PostMapping
    @ResponseStatus(HttpStatus.CREATED)
    public Order create(@Valid @RequestBody CreateOrderRequest request) {
        return repository.save(new Order(
            null, request.customerEmail(), request.amount(), Instant.now()
        ));
    }

    @GetMapping("/{id}")
    public Order get(@PathVariable Long id) {
        return repository.findById(id)
            .orElseThrow(() -> new ResponseStatusException(HttpStatus.NOT_FOUND));
    }

    @GetMapping
    public List<Order> list(@RequestParam(required = false) String customerEmail) {
        return customerEmail == null
            ? repository.findAll()
            : repository.findByCustomerEmail(customerEmail);
    }
}

That’s a real, validated, persistent endpoint in under fifty lines. Lombok handles the boilerplate, Spring Data writes the SQL, and Jakarta Validation rejects bad payloads before they ever reach your code.

Wire Up the Database

The default application.properties Initializr drops is empty. Add database settings, but read them from environment variables so you can deploy the same artifact to multiple environments:

spring.application.name=orders-service
server.port=${PORT:8080}

spring.datasource.url=${DB_URL:jdbc:postgresql://localhost:5432/orders}
spring.datasource.username=${DB_USER:orders}
spring.datasource.password=${DB_PASSWORD:orders}

spring.jpa.hibernate.ddl-auto=validate
spring.jpa.properties.hibernate.jdbc.time_zone=UTC

management.endpoints.web.exposure.include=health,info,metrics,prometheus
management.endpoint.health.probes.enabled=true

springdoc.swagger-ui.path=/swagger-ui.html
springdoc.api-docs.path=/v3/api-docs

Two important defaults to flag:

  • spring.jpa.hibernate.ddl-auto=validate — validates that the schema matches your entities at startup, but does not create or modify tables. Use a real migration tool (Flyway or Liquibase) for that.
  • management.endpoint.health.probes.enabled=true — exposes /actuator/health/liveness and /actuator/health/readiness, which are exactly what Kubernetes wants. 🐳
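
The ${NAME:default} placeholders in the properties above follow a simple rule: use the environment value if present, otherwise everything after the first colon. The sketch below mimics that rule for a single expression. It is an illustration with hypothetical names, not Spring's resolver, and it ignores nesting and escaping.

```java
import java.util.Map;

/** Toy resolver for a single "${NAME}" or "${NAME:default}" expression. */
final class PlaceholderSketch {

    static String resolve(String expression, Map<String, String> env) {
        if (!expression.startsWith("${") || !expression.endsWith("}")) {
            return expression;                       // plain literal, nothing to resolve
        }
        String inner = expression.substring(2, expression.length() - 1);
        int colon = inner.indexOf(':');              // only the first colon separates name and default
        String name = colon < 0 ? inner : inner.substring(0, colon);
        String fallback = colon < 0 ? null : inner.substring(colon + 1);

        String value = env.get(name);
        if (value != null) {
            return value;
        }
        if (fallback != null) {
            return fallback;
        }
        throw new IllegalArgumentException("No value and no default for " + name);
    }

    public static void main(String[] args) {
        System.out.println(resolve("${PORT:8080}", System.getenv()));
    }
}
```

Splitting on the first colon only is what lets a default such as jdbc:postgresql://localhost:5432/orders contain colons of its own.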

Add Flyway for Migrations

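Add to pom.xml (Spring Boot manages the version; Flyway 10 and later also need the flyway-database-postgresql artifact for PostgreSQL support):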

<dependency>
    <groupId>org.flywaydb</groupId>
    <artifactId>flyway-core</artifactId>
</dependency>

Then create src/main/resources/db/migration/V1__create_orders.sql:

CREATE TABLE orders (
    id              BIGSERIAL PRIMARY KEY,
    customer_email  VARCHAR(255) NOT NULL,
    amount          NUMERIC(12, 2) NOT NULL CHECK (amount > 0),
    -- TIMESTAMPTZ stores an unambiguous instant, matching the Instant field on the entity
    created_at      TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE INDEX idx_orders_customer_email ON orders(customer_email);

Flyway runs on app startup, applies any pending migrations, and refuses to start if a previously-applied migration’s checksum has changed. That last property is the single most useful safety net in Java backend development — it makes “I changed the migration after it was applied somewhere” loud instead of silent. 🛡️
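
The checksum guard is easy to picture in code. The sketch below shows only the principle, using CRC32 over the script text and a recorded value passed in by the caller; Flyway's real implementation keeps its checksums in a schema history table and differs in detail, and the class and method names here are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Toy version of "refuse to start if an applied migration changed". */
final class MigrationChecksumSketch {

    /** Checksum of the migration script as it exists on disk right now. */
    static long checksum(String sql) {
        CRC32 crc = new CRC32();
        crc.update(sql.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    /** Throws if the script no longer matches the checksum recorded when it was applied. */
    static void validate(String version, String sqlOnDisk, long recordedChecksum) {
        long current = checksum(sqlOnDisk);
        if (current != recordedChecksum) {
            throw new IllegalStateException(
                "Migration " + version + " changed after it was applied: recorded "
                    + recordedChecksum + ", found " + current);
        }
    }
}
```

The fix for a failed validation is a new migration file (V2__...), never an edit to one that has already run somewhere.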

A Quick Test With Testcontainers

Real database tests are a hill worth dying on. @DataJpaTest with the in-memory H2 driver tests Hibernate’s idea of your database, not the actual database. Testcontainers spins up a real PostgreSQL in Docker, points your test at it, and tears it down when you’re done:

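// TestRestTemplate is only auto-configured when the test starts a real server port
@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)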
@Testcontainers
class OrderControllerTest {

    @Container
    static PostgreSQLContainer<?> postgres = new PostgreSQLContainer<>("postgres:16");

    @DynamicPropertySource
    static void props(DynamicPropertyRegistry registry) {
        registry.add("spring.datasource.url",      postgres::getJdbcUrl);
        registry.add("spring.datasource.username", postgres::getUsername);
        registry.add("spring.datasource.password", postgres::getPassword);
    }

    @Autowired private TestRestTemplate restTemplate;

    @Test
    void rejectsInvalidPayload() {
        var response = restTemplate.postForEntity(
            "/api/orders",
            Map.of("customerEmail", "not-an-email", "amount", -5),
            String.class
        );
        assertThat(response.getStatusCode()).isEqualTo(HttpStatus.BAD_REQUEST);
    }
}

Slow on first run (Docker pulls the image), fast on every subsequent run.

Dockerize It

Spring Boot has built-in image building. No Dockerfile needed:

./mvnw spring-boot:build-image \
    -Dspring-boot.build-image.imageName=acme/orders-service:0.1.0

That uses Cloud Native Buildpacks under the hood to produce a small, layered image you can docker run immediately. If you’d rather hand-roll a Dockerfile for fine-grained control, the multi-stage version is:

FROM eclipse-temurin:17-jdk AS build
WORKDIR /app
COPY . .
RUN ./mvnw -B package -DskipTests

FROM eclipse-temurin:17-jre
WORKDIR /app
COPY --from=build /app/target/orders-service-*.jar app.jar
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "app.jar"]

The Final Library Cheat Sheet

If you’re starting a new API-laden Spring Boot service today and you want a sensible default stack, pick these:

  • spring-boot-starter-web — REST + embedded Tomcat.
  • spring-boot-starter-data-jpa + the relevant JDBC driver — persistence.
  • spring-boot-starter-validation — request validation.
  • spring-boot-starter-actuator — health, metrics, probes.
  • springdoc-openapi-starter-webmvc-ui — Swagger UI + OpenAPI 3 spec.
  • lombok — boilerplate killer.
  • flyway-core — versioned schema migrations.
  • testcontainers (postgresql, junit-jupiter) — real-database integration tests.
  • spring-boot-starter-security — when you need authentication (often the next thing you’ll add).

Closing Thoughts

Spring Initializr does its best work when you treat it as a starting point, not a destination. Generate the project, run it once to confirm it boots, then add the one or two libraries Initializr doesn’t include (Swagger, sometimes Flyway), wire up a real database, and you’re at the point where you can start writing actual business logic. The whole arc from “empty browser tab” to “first endpoint live in Docker” is comfortably under an afternoon. 💡

Further Reading

  • Spring Initializr (start.spring.io). The actual tool. The web version, the IDE integrations (IntelliJ, VS Code), and the underlying REST API all live here.
  • Spring Boot reference documentation (docs.spring.io/spring-boot/docs/current/reference/html/). Always the authoritative source for the version you’re on.
  • springdoc-openapi documentation (springdoc.org). Configuration, annotations, and migration notes from springfox.
  • Testcontainers for Java (java.testcontainers.org). Containers for PostgreSQL, MySQL, Kafka, Redis, and just about anything else you might integrate with.
  • Flyway documentation (flywaydb.org/documentation). Naming conventions, callbacks, and CI integration.

Happy bootstrapping. 🛠️

Posted in java

Why Ember.js Still Makes Sense for Big Teams Building Big Apps

In a JavaScript world dominated by React’s flexibility and Vue’s friendliness, Ember.js can feel like the quiet older sibling who keeps showing up to work in a suit. It’s opinionated, batteries-included, and unapologetically convention-driven. Which is exactly why some of the largest engineering teams in the world — LinkedIn, Apple Music’s web UI, Intercom, Square’s dashboard, Discourse — still bet on it. 🔥

What Is Ember.js?

Ember is a framework, not a library. The distinction matters. React gives you a rendering primitive and lets you pick the router, state manager, testing harness, build tool, data layer, and folder layout yourself. Ember gives you all of that in one box: router, components, data layer (Ember Data), testing framework (QUnit), build pipeline (Ember CLI), and a strong convention for where every file goes.

It’s the same philosophy as Ruby on Rails: convention over configuration. If you accept the conventions, you write less code and onboard new engineers faster. If you fight the conventions, you have a bad time.

Why Ember? (And Why Now?)

The case for Ember is rarely “it’s the fastest” or “it has the smallest bundle” — those are not its strengths. The case is about cost over time:

  • Stability without stagnation — Ember follows a six-week release cadence with strict semver and a deprecation pipeline. Upgrades from version to version are mostly mechanical, not rewrites. Long-running apps survive 5+ years on Ember without the “big rewrite” you see in React projects every 18 months.
  • One way to do things — there’s a canonical place for routes, components, models, services, and adapters. New hire on day one knows where to look.
  • First-class testing — every ember-cli generator scaffolds a test file alongside the source. Unit, integration, and acceptance tests are first-class citizens, not an afterthought.
  • Ember Data — a JSON:API client baked in. Identity-mapped, cached, with relationships and dirty tracking out of the box. You don’t roll your own data layer.

Is It Better for Bigger Teams?

Yes, and that’s not a coincidence; it’s the design goal. Ember’s opinions are an up-front cost that pays for itself at scale.

On a 5-person React team, freedom is a feature. Everyone knows the codebase, conventions emerge organically, and the team can pivot when something better comes along. On a 50-person team spread across multiple squads, that same freedom becomes friction: Squad A’s state management decision becomes Squad B’s onboarding pain. Ember sidesteps that whole class of debate. There is a state management story (@tracked, services), there is a routing story, there is a data layer story. Squads argue about features instead of stacks.

This is why LinkedIn’s web app, with hundreds of engineers, ships on Ember. The opinions scale; the decisions don’t multiply with headcount. 💡

The Architecture Pattern

The most common production shape for an Ember app: Ember owns the entire UI as a Single-Page Application (SPA), the backend is sliced into microservices, and Ember Data adapters glue them together over JSON:API.

┌───────────────────────────────────────────────┐
│              Ember.js SPA (browser)           │
│   Routes · Components · Services · Ember Data │
└──────────────┬────────────────────────────────┘
               │ JSON:API over HTTPS
      ┌────────┼────────┬────────────┬─────────┐
      ▼        ▼        ▼            ▼         ▼
   users     billing  catalog   notifications  search
   (Go)      (Java)   (Node)    (Python)       (Elastic)

A typical setup:

  • One Ember app — the monolithic frontend — owns routing, layout, design system, and user experience end to end. No micro-frontend fragmentation.
  • Backend split into focused services (auth, billing, catalog, search, notifications), each exposing JSON:API endpoints.
  • Ember Data adapters per model — sometimes per service — handle URL conventions, auth headers, and error normalization.
  • A thin gateway or Backend-for-Frontend (BFF) in front, if request aggregation or auth fan-out is needed.

An adapter pointing at a specific microservice looks like this:

// app/adapters/invoice.js — billing service
import JSONAPIAdapter from '@ember-data/adapter/json-api';
import { inject as service } from '@ember/service';

export default class InvoiceAdapter extends JSONAPIAdapter {
  @service session;

  host = 'https://billing.api.acme.com';
  namespace = 'v1';

  get headers() {
    return {
      Authorization: `Bearer ${this.session.token}`,
      Accept: 'application/vnd.api+json',
    };
  }
}
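
The adapter asks for application/vnd.api+json, so it helps to see roughly what that media type looks like on the wire. The sketch below uses a made-up invoice payload (the number and amount attributes match the template later in this post; the values are invented) and a hypothetical flattenDocument helper to show the kind of normalization a JSON:API client has to do. It illustrates the document shape only and is not Ember Data’s actual serializer code.

```javascript
// A made-up JSON:API response from the billing service (illustrative data only).
const response = {
  data: [
    { type: 'invoices', id: '1', attributes: { number: 'INV-001', amount: 120 } },
    { type: 'invoices', id: '2', attributes: { number: 'INV-002', amount: 75 } },
  ],
};

// Hypothetical helper: turn one JSON:API resource object into a flat record.
function flattenResource(resource) {
  return { id: resource.id, type: resource.type, ...resource.attributes };
}

// In JSON:API, "data" may be an array, a single resource object, or null.
function flattenDocument(doc) {
  if (doc.data == null) return [];
  const resources = Array.isArray(doc.data) ? doc.data : [doc.data];
  return resources.map(flattenResource);
}

const invoices = flattenDocument(response);
console.log(invoices.map((invoice) => `${invoice.number}: ${invoice.amount}`));
```

In a real app you never write this yourself; the point is that every service in the diagram has to emit the same envelope (type, id, attributes) for one client-side data layer to consume them all.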

And a component that fetches and renders invoices is just:

// app/components/invoice-list.js
import Component from '@glimmer/component';
import { inject as service } from '@ember/service';
import { tracked } from '@glimmer/tracking';

export default class InvoiceListComponent extends Component {
  @service store;
  @tracked invoices = [];

  constructor() {
    super(...arguments);
    this.load();
  }

  async load() {
    this.invoices = await this.store.findAll('invoice');
  }
}

{{! app/components/invoice-list.hbs }}
<ul>
  {{#each this.invoices as |invoice|}}
    <li>{{invoice.number}} — {{invoice.amount}}</li>
  {{/each}}
</ul>

That’s the whole component. No reducer, no slice, no thunk, no useEffect dance. The store handles caching, identity, and the network call.
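
If “identity” sounds abstract, here is a minimal, framework-free sketch of the idea. TinyStore is hypothetical: it borrows the method names push, peekRecord, and findAll from Ember Data for familiarity, but the implementation is an assumption made for illustration, and it is synchronous where the real store is asynchronous.

```javascript
// Hypothetical identity map: at most one record object per type + id.
class TinyStore {
  constructor(fetchAll) {
    this.fetchAll = fetchAll; // (type) => array of plain { id, ...attrs } objects
    this.records = new Map(); // "type:id" -> the single shared record object
  }

  // Insert or update, always handing back the same object for a given type + id.
  push(type, data) {
    const key = `${type}:${data.id}`;
    const existing = this.records.get(key);
    if (existing) {
      Object.assign(existing, data); // update in place so every holder sees the change
      return existing;
    }
    const record = { ...data };
    this.records.set(key, record);
    return record;
  }

  // Cache-only lookup: never touches the network.
  peekRecord(type, id) {
    return this.records.get(`${type}:${id}`) ?? null;
  }

  // Fetch, then route every payload through the identity map.
  findAll(type) {
    return this.fetchAll(type).map((data) => this.push(type, data));
  }
}

// Two "network" responses for the same invoice, the second with a new amount.
const payloads = [
  [{ id: '1', number: 'INV-001', amount: 10 }],
  [{ id: '1', number: 'INV-001', amount: 25 }],
];
let call = 0;
const store = new TinyStore(() => payloads[call++]);

const first = store.findAll('invoice')[0];
const second = store.findAll('invoice')[0];
console.log(first === second, first.amount);
```

Because both lookups return the same object, a list view and a detail view holding “invoice 1” can never disagree about its amount, which is the property the store gives you for free.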

Top 5 Ember Libraries Worth Knowing

A short tour of the addons that show up in almost every serious Ember production app. All of them are open source — click through and read the source, the Ember community’s code tends to be remarkably clean and well-tested. 📚

  1. ember-concurrency — Tasks instead of promises. Cancellation, debouncing, dropping, restartable behavior, all declarative. Once you’ve used it, raw promises feel primitive.
    Source: github.com/machty/ember-concurrency

  2. ember-data — Already mentioned, but worth listing. JSON:API client, identity map, relationships, dirty tracking. The reference implementation for how to talk to a REST/JSON:API backend from an SPA.
    Source: github.com/emberjs/data

  3. ember-power-select — A flexible, accessible, fully-themed select/typeahead component. Sounds boring; replaces about 800 lines of hand-rolled select logic in every app.
    Source: github.com/cibernox/ember-power-select

  4. ember-simple-auth — Authentication and session management. Pluggable authenticators: OAuth2 (the second version of Open Authorization), JWT, or custom. Also session persistence and route mixins to gate access. The de facto auth layer.
    Source: github.com/mainmatter/ember-simple-auth

  5. ember-cli-mirage — A client-side API mock you wire into your dev server and acceptance tests. Define your backend’s shape in JavaScript, get a fake server with realistic latency and relationships. Invaluable when the backend microservice you depend on is still in flight.
    Source: github.com/miragejs/ember-cli-mirage
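
To make the ember-concurrency entry a little more concrete: “restartable” means a new call cancels the one still in flight. The sketch below imitates that idea in plain JavaScript with a small generator-driven runner. restartableTask and its result shape are hypothetical names invented here; the real addon has its own API and much more complete cancellation and error handling.

```javascript
// Hypothetical, framework-free imitation of a "restartable" task.
function restartableTask(generatorFn) {
  let current = null; // the most recently started run

  return function perform(...args) {
    if (current) current.cancelled = true; // a new call cancels the run in flight
    const run = { cancelled: false };
    current = run;
    const iterator = generatorFn(...args);

    return new Promise((resolve, reject) => {
      function step(input) {
        if (run.cancelled) {
          iterator.return(); // stop the generator so the rest of its body is skipped
          resolve({ cancelled: true });
          return;
        }
        let result;
        try {
          result = iterator.next(input);
        } catch (error) {
          reject(error);
          return;
        }
        if (result.done) {
          resolve({ cancelled: false, value: result.value });
        } else {
          Promise.resolve(result.value).then(step, reject); // wait on the yielded promise
        }
      }
      step(undefined);
    });
  };
}

const wait = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
const applied = [];

// Each yield is a point where a newer call can cancel this run.
const search = restartableTask(function* (query) {
  yield wait(20); // debounce-style pause
  applied.push(query);
  return query;
});

// Three rapid calls, as if the user were typing into a search box.
const outcomes = Promise.all([search('e'), search('em'), search('ember')]);
outcomes.then((results) => console.log(results.map((r) => r.cancelled), applied));
```

With raw promises you would track a request counter or an AbortController by hand in every component; expressing it once as a task modifier is the ergonomic win the addon is known for.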

So Should You Pick Ember in 2020?

If you’re a solo developer building a side project, probably not — the conventions feel heavy when there’s no team to coordinate. If you’re a 3-person startup shipping a Minimum Viable Product (MVP) and you might pivot the whole stack next quarter, also probably not. (And yes, that’s Minimum Viable Product — not Most Valuable Player, and not Most Valuable Person. Tech acronyms collide constantly. 🏀)

But if you’re building an application that needs to be maintained by dozens of engineers across multiple squads for the next five years, and you’d rather argue about features than about folder structure — Ember is still one of the strongest bets you can make. The TypeScript story is getting better, Octane (the modern component model) has fixed most of the historical ergonomics complaints, and the upgrade treadmill is gentler than anything else in this space. 🚀

Posted in javascript