Frontend Masters Boost RSS Feed

Introducing Drizzle

Adam Rackis — Mon, 17 Jun 2024 11:28:41 +0000

This is a post about an exciting new ORM tool (that’s “object relational mapper”) that is different than any ORM I’ve used before—and I’ve used quite a few! Spoiler: it’s Drizzle.

Wait, what’s an ORM?

Even if you’re using a non-relational database now (think MongoDB or Redis), sooner or later you’ll likely need a relational DB.

Knowing SQL is an essential skill for any software engineer, but writing SQL directly can be tricky! The tooling is usually primitive, with only minimal auto-complete to guide you, and you invariably go through a process of running your query, correcting errors, and repeating until you get it right.

ORMs try to help you with this process of crafting SQL. Typically, you tell the ORM about the shape of your DB and it exposes APIs to do typical things. If you have a books table in your DB, an ORM will give you an API for it where you can do stuff like:

const longBooks = books.find({ pages: { gt: 500 } });

Behind the scenes, the SQL created might be something like:

SELECT * FROM books WHERE pages > 500;

That may not look like a massive simplification, but as the parameters get more complex or strung together, the SQL can get a bit mindbending. Not to mention the ORM keeps that code in a language you’re likely already using, like JavaScript.

This ease of use may seem nice, but it can cause other problems. For example, you might struggle figuring out how to do non-trivial queries. And there are performance foot-guns, such as the infamous Select N + 1 problem, which you might cause without realizing it due to the abstracted away syntax.

Why Drizzle is Different

Drizzle takes a novel approach. Drizzle does provide you a traditional ORM querying API, like we saw above. But in addition to that, it also provides an API that is essentially a layer of typing on top of SQL itself. So rather than what we saw before, we might query our books table like this

const longBooks = await db
  .select()
  .from(books)
  .where(gt(books.pages, 500));

It’s more lines, but it’s closer to actual SQL, which provides us some nice benefits: it’s easier to learn, more flexible, and avoids traditional ORM footguns.

Let’s dive in and look closer. This post will take a brief overview of setting up Drizzle, and querying, and then do a deeper dive showing off some of its powerful abilities with this typed SQL querying API. The docs are here if you’d like to look closer at anything.

Using Drizzle in general, and some of the advanced things we’ll cover in this post requires a decent knowledge of SQL. If you’ve never, ever used SQL, you might struggle with a few of the things we discuss later on. That’s expected. Skim and jump over sections as needed. If nothing else, hopefully this post will motivate you to look at SQL.

Setting up the Schema

Drizzle can’t do much of anything if it doesn’t know about your database. There’s lots of utilities for showing Drizzle the structure (or schema) of your tables. We’ll take a very brief look, but a more complete example can be found here.

Drizzle supports Postgres, MySQL, and SQLite. The ideas are the same either way, but we’ll be using MySQL.

Let’s start to set up a table.

import { int, json, mysqlTable, varchar } from "drizzle-orm/mysql-core";

export const books = mysqlTable("books", {
  id: int("id").primaryKey().autoincrement(),
  userId: varchar("userId", { length: 50 }).notNull(),
  isbn: varchar("isbn", { length: 25 }),
  pages: int("pages"),
});

We tell Drizzle about our columns (we won’t show all of them here), and their data types.

Now we can run queries:

const result = await db
  .select()
  .from(books)
  .orderBy(desc(books.id))
  .limit(1);

This query returns an array of items which match the schema we provided Drizzle for this table.

Alternatively, as expected, we can also narrow our select list.

const result = await db
  .select({ id: books.id, isbn: books.isbn })
  .from(books)
  .orderBy(desc(books.id))
  .limit(1);

Note that the types of the columns match whatever we define in the schema. We won’t go over every possible column type (check the docs), but let’s briefly look at the JSON type:

export const books = mysqlTable("books", {
  id: int("id").primaryKey().autoincrement(),
  userId: varchar("userId", { length: 50 }).notNull(),
  isbn: varchar("isbn", { length: 25 }),
  pages: int("pages"),
  authors: json("authors"),
});

This adds an authors field to each book. But the type might not be what you want. Right now it’s unknown. This makes sense: JSON can have just about any structure. Fortunately, if you know your json column will have a predictable shape, you can specify it, like this:

export const books = mysqlTable("books", {
  id: int("id").primaryKey().autoincrement(),
  userId: varchar("userId", { length: 50 }).notNull(),
  isbn: varchar("isbn", { length: 25 }),
  pages: int("pages"),
  authors: json("authors").$type(),
});

And now, when we check, the authors property is of type string[] | null.

If you were to mark the authors column as notNull() it would be typed as string[]. As you might expect, you can pass any type you’d like into the $type helper.

Query Whirlwind Tour

Let’s run a non-trivial, but still basic query to see what Drizzle looks like in practice. Let’s say we’re looking to find some nice beach reading for the summer. We want to find books that belong to you (userId == “123”), and is either less than 150 pages, or was written by Stephan Jay Gould. We want the first ten, and we want them sort from most recently added to least recently added (the id key is auto-numbered, so we can sort on that for the same effect)

In SQL we’d do something like this:

SELECT *
FROM books
WHERE userId = '123' AND (pages < 150 OR authors LIKE '%Stephen Jay Gould%')
ORDER BY id desc
LIMIT 10

With Drizzle we’d write this:

const result = await db
  .select()
  .from(books)
  .where(
    and(
      eq(books.userId, userId),
      or(lt(books.pages, 150), like(books.authors, "%Stephen Jay Gould%"))
    )
  )
  .orderBy(desc(books.id))
  .limit(10);

Which works!

[
  {
    "id": 1088,
    "userId": "123",
    "authors": ["Siry, Steven E"],
    "title": "Greene: Revolutionary General (Military Profiles)",
    "isbn": "9781574889130",
    "pages": 144
  },
  {
    "id": 828,
    "userId": "123",
    "authors": ["Morton J. Horwitz"],
    "title": "The Warren Court and the Pursuit of Justice",
    "isbn": "0809016257",
    "pages": 144
  },
  {
    "id": 506,
    "userId": "123",
    "authors": ["Stephen Jay Gould"],
    "title": "Bully for Brontosaurus: Reflections in Natural History",
    "isbn": "039330857X",
    "pages": 544
  },
  {
    "id": 412,
    "userId": "123",
    "authors": ["Stephen Jay Gould"],
    "title": "The Flamingo's Smile: Reflections in Natural History",
    "isbn": "0393303756",
    "pages": 480
  },
  {
    "id": 356,
    "userId": "123",
    "authors": ["Stephen Jay Gould"],
    "title": "Hen's Teeth and Horse's Toes: Further Reflections in Natural History",
    "isbn": "0393311031",
    "pages": 416
  },
  {
    "id": 319,
    "userId": "123",
    "authors": ["Robert J. Schneller"],
    "title": "Cushing: Civil War SEAL (Military Profiles)",
    "isbn": "1574886967",
    "pages": 128
  }
]

The Drizzle version was actually a little bit longer. But we’re not optimizing for fewest possible lines of code. The Drizzle version is typed, with autocomplete to guide you toward a valid query, and TypeScript to warn you when you miss. The query is also a lot more composable. What do I mean by that?

Composability: Putting Queries Together

Let’s write something slightly more advanced and slightly more realistic. Let’s code up a function that takes any number of search filters, and puts together a query. Here’s what the filters look like

type SearchPacket = Partial<{
  title: string;
  author: string;
  maxPages: number;
  subjects?: number[];
}>;

Note the Partial type. We’re taking in any number of these filters—possibly none of them. Whichever filters are passed, we want them to be additive; we want them combined with and. We’ve seen and already, and it can take the result of calls to eq, lt, and lots of others. We’ll need to create an array of all of these filters, and Drizzle gives us a parent type that can hold any of them: SQLWrapper.

Let’s get started.

async function searchBooks(args: SearchPacket) {
  const searchConditions: SQLWrapper[] = [];
}

We’ve got our array of filters. Now let’s start filling it up.

async function searchBooks(args: SearchPacket) {
  const searchConditions: SQLWrapper[] = [];
  if (args.title) {
    searchConditions.push(like(books.title, `%${args.title}%`));
  }
}

Nothing new, yet. This is the same filter we saw before with authors.

Speaking of authors, let’s add that query next. But let’s make the author check a little more realistic. It’s not a varchar column, it holds JSON values, which themselves are strings of arrays. MySQL gives us a way to search JSON: the ->> operator. This takes a JSON column, and evaluates a path on it. So if you had objects in there, you’d pass string paths to get properties out. We just have an array of strings, so our path is $, which is the actual values in the array. And the string comparrisons when we’re filtering on JSON columns like this is no longer case insensitive, so we’ll want to use the LOWER function in MySQL.

Typically, with traditional ORM’s you’d scramble to the docs to look for an equivalent to the ->> operator, as well as the LOWER function. Drizzle does something better, and gives us a nice escape hatch to just write SQL directly in situations like this. Let’s implement our authors filter.

async function searchBooks(args: SearchPacket) {
  const searchConditions: SQLWrapper[] = [];
  if (args.title) {
    searchConditions.push(like(books.title, `%${args.title}%`));
  }
  if (args.author) {
    searchConditions.push(
      sql`LOWER(${books.authors}->>"$") LIKE ${`%${args.author.toLowerCase()}%`}`
    );
  }
}

Note the sql tagged template literal. It lets us put arbitrary SQL in for one-off operations that may not be implemented in the ORM. Before moving on, let’s take a quick peak at the SQL generated by this:

{
  "sql": "select `id`, `userId`, `authors`, `title`, `isbn`, `pages` from `books` where (`books`.`userId` = ? and LOWER(`books`.`authors`->>\"$\") LIKE ?) order by `books`.`id` desc limit ?",
  "params": ["123", "%gould%", 10]
}

Let’s zoom in on the authors piece. What we entered as…

sql`LOWER(${books.authors}->>"$") LIKE ${`%${args.author.toLowerCase()}%`}`

… gets transformed as:

LOWER(`books`.`authors`->>"$") LIKE ?

Our search term was parameterized; however, Drizzle was smart enough to not parameterize our column. I’m continuously impressed by small details like this. The maxPages piece is the same as before

if (args.maxPages) {
  searchConditions.push(lte(books.pages, args.maxPages));
}

Nothing new or interesting. Now let’s look at the subjects filter. We can pass in an array of subject ids, and we want to filter books that have that subject. The relationship between books and subjects is stored in a separate table, booksSubjects. This table simply has rows with an id, a book id, and a subject id (and also the userId for that book, to make other queries easier).

So if book 12 has subject 34, there’ll be a row with bookId of 12, and subjectId of 34.

In SQL when we want to see if a given row exists in some table, we use the exists keyword, and Drizzle has an exists function for this very purpose. Let’s move on with our function

async function searchBooks(args: SearchPacket) {
  const searchConditions: SQLWrapper[] = [];
  if (args.title) {
    searchConditions.push(like(books.title, `%${args.title}%`));
  }
  if (args.author) {
    searchConditions.push(
      sql`LOWER(${books.authors}->>"$") LIKE ${`%${args.author.toLowerCase()}%`}`,
    );
  }
  if (args.maxPages) {
    searchConditions.push(lte(books.pages, args.maxPages));
  }
  if (args.subjects?.length) {
    searchConditions.push(
      exists(
        db
          .select({ _: sql`1` })
          .from(booksSubjects)
          .where(
            and(
              eq(books.id, booksSubjects.book),
              inArray(booksSubjects.subject, args.subjects),
            ),
          ),
      ),
    );
  }

We can pass an exists() call right into our list of filters, just like with real SQL. This bit:

_: sql`1`

… is curious, but that’s just us saying SELECT 1 which is a common way of putting something into a SELECT list, even though we’re not pulling back any data; we’re just checking for existence. Lastly, the inArray Drizzle helper is how we generate an IN query. Here’s what the generated SQL looks like for this subjects query:

select `id`, `userId`, `authors`, `title`, `isbn`, `pages`
from `books`
where (`books`.`userId` = ? and exists (select 1
                                        from `books_subjects`
                                        where (`books`.`id` = `books_subjects`.`book` and
                                               `books_subjects`.`subject` in (?, ?))))
order by `books`.`id` desc
limit ?

That was our last filter. Now we can pipe our filters in to execute the query we put together.

async function searchBooks(args: SearchPacket) {
  const searchConditions: SQLWrapper[] = [];
  if (args.title) {
    searchConditions.push(like(books.title, `%${args.title}%`));
  }
  if (args.author) {
    searchConditions.push(
      sql`LOWER(${books.authors}->>"$") LIKE ${`%${args.author.toLowerCase()}%`}`
    );
  }
  if (args.maxPages) {
    searchConditions.push(lte(books.pages, args.maxPages));
  }
  if (args.subjects?.length) {
    searchConditions.push(
      exists(
        db
          .select({ _: sql`1` })
          .from(booksSubjects)
          .where(
            and(
              eq(books.id, booksSubjects.book),
              inArray(booksSubjects.subject, args.subjects)
            )
          )
      )
    );
  }

  const result = await db
    .select()
    .from(books)
    .where(and(eq(books.userId, userId), ...searchConditions))
    .orderBy(desc(books.id))
    .limit(10);
}

The ability to treat SQL queries as typed function calls that can be combined arbitratily is what makes Drizzle shine.

Digging deeper

We could end the post here, but let’s go further and see how Drizzle handles something fairly complex. You might never need (or want to) write queries like this. My purpose in including this section is to show that you can, if you ever need to.

With that out of the way, let’s write a query to get aggregate info about our books. We want our most and least popular subject(s), and how many books we have with those subjects. We also want to know any unused subjects, as well as that same info about tags (which we haven’t talked about). And also the total number of books we have overall. This data might be displayed in a screen like this.

Screenshot

To keep this section manageable we’ll just get the book counts and the most and least subjects. The other pieces are variations on that theme. You can see the finished product here.

Let’s look at some of the SQL for this and how to write it with Drizzle.

Number of books per subject

In SQL we can group things together with GROUP BY.

SELECT
    subject,
    count(*)
FROM books_subjects
GROUP BY subject

Now our SELECT list, rather than pulling items from a table, is now pulling from a (conceptual) lookup table. We (conceptually) have a bunch of buckets stored by subject id. So we can select those subject id’s, as well as aggregate info from the buckets themselves, which we do with the count(*). This selects each subject, and the number of books under that subject.

And it works:

But we want the most, and least popular subjects. SQL also has what are called window functions. We can, on the fly, sort these buckets in some order, and then ask questions about the data, sorted in that way. We basically want the subject(s) with the highest, or lowest number of books, including ties. It turns out RANK is exactly what we want. Let’s see how this works

SELECT
    subject,
    count(*) as count,
    RANK() OVER (ORDER BY count(*) DESC) MaxSubject,
    RANK() OVER (ORDER BY count(*) ASC) MinSubject
FROM books_subjects
WHERE userId = '123'
GROUP BY subject

We ask for the rank of each row, when the whole result set is sorted in whatever way we describe.

Subjects 79, 137 and 150 all have a minSubject rank of 1, which means they are the least used subject, which makes sense since there’s only one book with that subject.

It’s a little mind bendy at first, so don’t worry if this looks a little weird. The point is to show how well Drizzle can simplify SQL for us, not to be a deep dive into SQL, so let’s move on.

We want the subjects with a MaxSubject of 1, or a MinSubject of 1. We can’t use WHERE for this, at least not directly. The solution in SQL is to turn this query into a virtual table, and query that. It looks like this:

SELECT
    t.subject id,
    CASE WHEN t.MinSubject = 1 THEN 'MinSubject' ELSE 'MaxSubject' END as label,
    t.count
FROM (
    SELECT
        subject,
        count(*) as count,
        RANK() OVER (ORDER BY count(*) DESC) MaxSubject,
        RANK() OVER (ORDER BY count(*) ASC) MinSubject
    FROM books_subjects
    WHERE userId = '123'
    GROUP BY subject
) t
WHERE t.MaxSubject = 1 OR t.MinSubject = 1

And it works.

Moving this along

We won’t show tags, since it’s basically identical except we hit a books_tags table, instead of books_subjects. We also won’t show unused subjects (or tags), which is also very similar, except we use a NOT EXISTS query.

The query to get the total number of books looks like this:

SELECT count(*) as count
FROM books
WHERE userId = '123'

Let’s add some columns to get it in the same structure as our subjects queries:

SELECT
    0 id,
    'Books Count' as label,
    count(*) as count
FROM books
WHERE userId = '123'

Now, since these queries return the same structure, let’s combine them into one big query. We use UNION for this.

SELECT *
FROM (
    SELECT
        t.subject id,
        CASE WHEN t.MinSubject = 1 THEN 'MinSubject' ELSE 'MaxSubject' END as label,
        t.count
    FROM (
        SELECT
            subject,
            count(*) as count,
            RANK() OVER (ORDER BY count(*) DESC) MaxSubject,
            RANK() OVER (ORDER BY count(*) ASC) MinSubject
        FROM books_subjects
        GROUP BY subject
    ) t
    WHERE t.MaxSubject = 1 OR t.MinSubject = 1
) subjects
UNION
    SELECT
        0 id,
        'Books Count' as label,
        count(*) as count
    FROM books
    WHERE userId = '123';

And it works! Phew!

But this is gross to write manually, and even grosser to maintain. There’s a lot of pieces here and there’s no (good) way to break this apart and manage separately. SQL is ultimately text, and you can, of course, generate these various pieces of text with different functions in your code, and then concatenate them together.

But that’s fraught with difficulty too. It’s easy to get small details wrong when you’re pasting strings of code together. And believe it or not, this query is much simpler than much of what I’ve seen.

The Drizzle Way

Let’s see what this looks like in Drizzle. Remember that initial query to get each subject, with its count, and rank? Here it is in Drizzle

const subjectCountRank = () =>
  db
    .select({
      subject: booksSubjects.subject,
      count: sql`COUNT(*)`.as("count"),
      rankMin: sql`RANK() OVER (ORDER BY COUNT(*) ASC)`.as("rankMin"),
      rankMax: sql`RANK() OVER (ORDER BY COUNT(*) DESC)`.as("rankMax"),
    })
    .from(booksSubjects)
    .where(eq(booksSubjects.userId, userId))
    .groupBy(booksSubjects.subject)
    .as("t");

Drizzle supports grouping, and it even has an as function to alias a query, and enable it to be queried from. Let’s do that next.

const subjectsQuery = () => {
  const subQuery = subjectCountRank();

  return db
    .select({
      label:
        sql`CASE WHEN t.rankMin = 1 THEN 'MIN Subjects' ELSE 'MAX Subjects' END`.as(
          "label"
        ),
      count: subQuery.count,
      id: subQuery.subject,
    })
    .from(subQuery)
    .where(or(eq(subQuery.rankMin, 1), eq(subQuery.rankMax, 1)));
};

We stuck our query to get the ranks in a function, and then we just called that function, and queried from its result. SQL is feeling a lot more like normal coding!

The query for the total book count is simple enough.

db
  .select({ label: sql`'All books'`, count: sql`COUNT(*)`, id: sql`0` })
  .from(books)
  .where(eq(books.userId, userId)),

Hopefully we won’t be too surprised to learn that Drizzle has a union function, to union queries together. Let’s see it all together:

const dataQuery = union(
  db
    .select({
      label: sql`'All books'`,
      count: sql`COUNT(*)`,
      id: sql`0`,
    })
    .from(books)
    .where(eq(books.userId, userId)),
  subjectsQuery()
);

Which generates this SQL for us:

(select 'All books', COUNT(*), 0 from `books` where `books`.`userId` = ?)
union
(select CASE WHEN t.rankMin = 1 THEN 'MIN Subjects' ELSE 'MAX Subjects' END as `label`, `count`, `subject`
 from (select `subject`,
              COUNT(*)                             as `count`,
              RANK() OVER (ORDER BY COUNT(*) ASC)  as `rankMin`,
              RANK() OVER (ORDER BY COUNT(*) DESC) as `rankMax`
       from `books_subjects`
       where `books_subjects`.`userId` = ?
       group by `books_subjects`.`subject`) `t`
 where (`rankMin` = ? or `rankMax` = ?))

Basically the same thing we did before, but with a few more parens, plus some userId filtering I left off for clarity.

I left off the tags queries, and the unused subjects/tags queries, but if you’re curious what they look like, the code is all here and the final union looks like this:

const dataQuery = union(
  db
    .select({
      label: sql`'All books'`,
      count: sql`COUNT(*)`,
      id: sql`0`,
    })
    .from(books)
    .where(eq(books.userId, userId)),
  subjectsQuery(),
  unusedSubjectsQuery(),
  tagsQuery(),
  unusedTagsQuery()
);

Just more function calls thrown into the union.

Flexibility

Some of you might wince seeing that many large queries all union‘d together. Those queries are actually run one after the other on the MySQL box. But, for this project it’s a small amount of data and there’s not multiple round trips over the network to do it. Our MySQL engine executes those queries one after the other.

But let’s say you decide you’re better off breaking that union apart, and sending N queries, with each piece, and putting it all together in application code. These queries are already separate function calls. It would be fairly easy to remove those calls from the union, and instead invoke them in isolation (and then modify your application code).

This kind of flexibility is what I love the most about Drizzle. Refactoring large, complex stored procedure has always been a pain with SQL. When you code it through Drizzle, it becomes much more like refactoring a typed programming language like TypeScript or C#.

Debugging Queries

Before we wrap up, let’s take a look at how easily Drizzle let’s you debug your queries. Let’s say the query from earlier didn’t return what we expected, and we want to see the actual SQL being run. We can do that by removing the await from the query, and then calling toSQL on the result.

import { and, desc, eq, like, lt, or } from "drizzle-orm";

const result = db
  .select()
  .from(books)
  .where(
    and(
      eq(books.userId, userId),
      or(lt(books.pages, 150), like(books.authors, "%Stephen Jay Gould%"))
    )
  )
  .orderBy(desc(books.id))
  .limit(10);

console.log(result.toSQL());

This displays the following:

{
  "sql": "select `id`, `userId`, `authors`, `title`, `isbn`, `pages` from `books` where (`books`.`userId` = ? and (`books`.`pages` < ? or `books`.`authors` like ?)) order by `books`.`id` desc limit ?",
  "params": ["123", 150, "%Stephen Jay Gould%", 10]
}

result.toSQL() returned an object, with a sql field with our query, and a params field with the parameters. As any ORM would, Drizzle parameterized our query, so fields with invalid characters wouldn’t break anything. You can now run this query directly against your database to see what went wrong.

Wrapping Up

I hope you’ve enjoyed this introduction to Drizzle. If you’re not afraid of a little SQL, it can make your life a lot easier.

Prefetching When Server Loading Won’t Do

Adam Rackis — Wed, 15 May 2024 23:26:46 +0000

This is a post about a boring* topic: loading data.

(* Just kidding it will be amazing and engaging.)

Not how to load data, but instead we’ll take a step back, and look at where to load data. Not in any particular framework, either, this is going to be more broadly about data loading in different web application architectures, and paricularly how that impacts performance.

We’ll start with client-rendered sites and talk about some of the negative performance characteristics they may have. Then we’ll move on to server-rendered apps, and then to the lesser-known out-of-order streaming model. To wrap up, we’ll talk about a surprisingly old, rarely talked about way to effectively load slow data in a server-rendered application. Let’s get started!

Client Rendering

Application metaframeworks like Next and SvelteKit have become incredibly popular. In addition to offering developer conveniences like file system-based routing and scaffolding of API endoints, they also, more importantly, allow you to server render your application.

Why is server rendering so important? Let’s take a look at how the world looks with the opposite: client-rendered web applications, commonly referred to as “single page applications” or SPAs. Let’s start with a simplified diagram of what a typical request for a page looks like in an SPA.

The browser makes a request to your site. Let’s call it yoursite.io. With an SPA, it usually sends down a single, mostly empty HTML page, which has whatever script and style tags needed to run the site. This shell of a page might display your company logo, your static header, your copyright message in the footer, etc. But mostly it exists to load and run JavaScript, which will build the “real” site.

This is why these sites are called “single page” applications. There’s a single web page for the whole app, which runs code on the client to detect URL changes, and request and render whatever new UI is needed.

Back to our diagram. The inital web page was sent back from the web server as HTML. Now what? The browser will parse that HTML and find script tags. These script tags contain our application code, our JavaScript framework, etc. The browser will send requests back to the web server to load these scripts. Once the browser gets them back, it’ll parse, and execute them, and in so doing, begin executing your application code.

At this point whatever client-side router you’re using (i.e. react-router, Tanstack Router, etc) will render your current page.

But there’s no data yet!

So you’re probably displaying loading spinners or skeleton screens or the like. To get the data, your client-side code will now make yet another request to your server to fetch whatever data are needed, so you can display your real, finished page to your user. This could be via a plain old fetch, react-query, or whatever. Those details won’t concern us here.

SSR To The Rescue

There is a pretty clear solution here. The server already has has the URL of the request, so instead of only returning that shell page, it could (should) request the data as well, get the page all ready to go, and send down the complete page.

Somehow.

This is how the web always worked with tools like PHP or asp.net. But when your app is written with a client-side JavaScript framework like React or Svelte, it’s surprisingly tricky. These frameworks all have API’s for stringifying a component tree into HTML on the server, so that markup can be sent down to the browser. But if a component in the middle of that component tree needs data, how do you load it on the server, and then somehow inject it where it’s needed? And then have the client acknowledge that data, and not re-request it. And of course, once you solve these problems and render your component tree, with data, on the server, you still need to re-render this component tree on the client, so your client-side code, like event handlers and such, start working.

This act of re-rendering the app client side is called hydration. Once it’s happened, we say that our app is interactive. Getting these things right is one of the main benefits modern application meta-frameworks like Next and SvelteKit provide.

Let’s take a look at what our request looks like in this server-rendered setup:

That’s great. The user sees the full page much, much sooner. Sure, it’s not interactive yet, but if you’re not shipping down obscene amounts of JavaScript, there’s a really good chance hydration will finish before the user can manage to click on any buttons.

We won’t get into all this, but Google themselves tell you this is much better for SEO as well.

So, what’s the catch? Well, what if our data are slow to load. Maybe our database is busy. Maybe it’s a huge request. Maybe there is a network hiccup. Or maybe you just depend on slow services you can’t control. It’s not rare.

This might be worse than the SPA we started with. Even though we needed multiple round trips to the server to get data, at least we were displaying a shell of a page quickly. Here, the initial request to the server will just hang and wait as long as needed for that data to load on the server, before sending down the full page. To the user, their browser (and your page) could appear unresponsive, and they might just give up and go back.

Out of Order Streaming

What if we could have the best of all worlds. What if we could server render, like we saw. But if some data are slow to load, we ship the rest of the page, with the data that we have, and let the server push down the remaining data, when ready. This is called streaming, or more precisely, out-of-order streaming (streaming, without the out-of-order part, is a separate, much more limited thing which we won’t cover here).

Let’s take a hypothetical example where the data abd, and data xyz are slow to load.

With out-of-order streaming we can load the to-do data load on the server, and send the page with just that data down to the user, immediately. The other two pieces of data have not loaded, yet, so our UI will display some manner of loading indicator. When the next piece of data is ready, the server pushes it down:

What’s the catch?

So does this solve all of our problems? Yes, but… only if the framework you’re using supports it. To stream with Next.js app directory you’ll use Suspense components with RSC. With SvelteKit you just return a promise from your loader. Remix supports this too, with an API that’s in the process of changing, so check their docs. SolidStart will also support this, but as of writing that entire project is still in beta, so check its docs when it comes out.

Some frameworks do not support this, like Astro and Next if you’re using the legacy pages directory.

What if we’re using those projects, and we have some dependencies on data which are slow to load? Are we stuck rendering this data in client code, after hydration?

Prefetching to the rescue

The web platform has a feature called prefetching. This lets us add a tag to the section of our HTML page, with a rel="prefetch" attribute, and an href attribute of the URL we want to prefetch. We can put service endpoint calls here, so long as they use the GET verb. If we need to pre-fetch data from an endpoint that uses POST, you’ll need to proxy it through an endpoint that uses GET. It’s worth noting that you can also prefetch with an HTTP header if that’s more convenient; see this post for more information.

When we do this, our page will start pre-fetching our resources as soon as the browser parses the link tag. Since it’s in the , that means it’ll start pre-fetching at the same time our scripts and stylesheets are requested. So we no longer need to wait until our script tags load, parse, and hydrate our app. Now the data we need will start pre-fetching immediately. When hydration does complete, and our application code requests those same endpoints, the browser will be smart enough to serve that data from the prefetch cache.

Let’s see prefetching in action

To see pre-fetching in action, we’ll use Astro. Astro is a wonderful web framework that doesn’t get nearly enough attention. One of the very few things it can’t do is out-of-order streaming (for now). But let’s see how we can improve life with pre-fetching.

The repo for the code I’ll be showing is here. It’s not deployed anywhere, for fear of this blog posting getting popular, and me getting a big bill from Vercel. But the project has no external dependencies, so you can clone, install, and run locally. You could also deploy this to Vercel yourself if you really want to see it in action.

I whipped up a very basic, very ugly web page that hits some endpoints to pull down a hypothetical list of books, and some metadata about the library, which renders the books once ready. It looks like this:

The endpoints return static data, which is why there’s no external dependencies. I added a manual delay of 700ms to these endpoints (sometimes you have slow services and there’s nothing you can do about it), and I also installed and imported some large JavaScript libraries (d3, framer-motion, and recharts) to make sure hydration would take a moment or two, like with most production applications. And since these endpoints are slow, they’re a poor candidate for server fetching.

So let’s request them client-side, see the performance of the page, and then add pre-fetching to see how that improves things.

The client-side fetching looks like this:

useEffect(() => {
  fetch("/api/books")
    .then((resp) => resp.json())
    .then((books) => {
      setBooks(books);
    });

  fetch("/api/books-count")
    .then((resp) => resp.json())
    .then((booksCountResp) => {
      setCount(booksCountResp.count);
    });
}, []);

Nothing fancy. Nothing particularly resilient here. Not even any error handling. But perfect for our purposes.

Network diagram without pre-fetching

Running this project, deployed to Vercel, my network diagram looks like this:

Notice all of the script and style resources, which need to be requested and processed before our client-side fetches start (on the last two lines).

Adding pre-fetching

I’ve added a second page to this project, called with-prefetch, which is the same as the index page. Except now, let’s see how we can add some tags to request these resources sooner.

First, in the root layout, let’s add this in the head section

<slot name="head">slot>

this gives us the ability to (but does not require us to) add content to our HTML document’s . This is exactly what we need. Now we can make a PrefetchBooks React component:

import type { FC } from "react";

export const PrefetchBooks: FC<{}> = (props) => {
  return (
    <>
      <link rel="prefetch" href="/api/books" as="fetch" />
      <link rel="prefetch" href="/api/books-count" as="fetch" />
    
  );
};

Then render it in our prefetching page, like so

"head" />

Note the slot attribute on the React component, which tells Astro (not React) where to put this content.

With that, if we run that page, we’ll see our link tags in the head

Now let’s look at our updated network diagram:

Notice our endpoint calls now start immediately, on lines 3 and 4. Then later, in the last two lines, we see the real fetches being executed, at which point they just latch onto the prefetch calls already in flight.

Let’s put some hard numbers on this. When I ran a webpagetest mobile Lighthouse analysis on the version of this page without the pre-fetch, I got the following.

Note the LCP (Largest Contentful Paint) value. That’s essentially telling us when the page looks finished to a user. Remember, the Lighthouse test simulates your site in the slowest mobile device imagineable, which is why it’s 4.6 seconds.

When I re-run the same test on the pre-fetched version, things improved about a second

Definitely much better, but still not good; but it never will be until you can get your backend fast. But with some intelligent, targetted pre-fetching, you can at least improve things.

Parting thoughts

Hopefully all of your back-end data requirements will be forever fast in your developer journeys. But when they’re not, prefetching resources is a useful tool to keep in your toolbelt.