
Streaming JSON Recordsets to Client in Express

Normally Express sends a JSON response to the client in a single chunk: the entire object must be built in memory on the server before the response can be sent. For large recordsets this is impractical.

By using streams we can avoid this. Normally we bypass Express and send data to the client with socket.io and streams, but in many cases you may prefer to just use Express and Ajax calls.

Most database connectors, both SQL and NoSQL, provide mechanisms to return recordsets as streams, emitting a chunk for every row. In the case of Mongo the cursor is a readable stream and the rows are JavaScript objects. For MS SQL Server, using tedious, it is easy to create a wrapper that produces a stream. We actually pass the tedious results through a transform stream because they are “heavy”, containing metadata in addition to values.
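
For example, here is a minimal sketch of a route handler that attaches a Mongo cursor, as a stream of stringified rows, to the request for the middleware shown later to consume. The route path, collection name, and req.db are illustrative, not part of the package:

'use strict';
let express = require("express");
let router = express.Router();

// Hypothetical route: expose the cursor as a readable stream that
// emits one stringified JSON row per chunk, then hand off.
router.get("/orders", (req, res, next) => {
	req.resultStream = req.db.collection("orders")
		.find({})
		.stream({ transform: JSON.stringify }); // one JSON string per row
	next();
});

module.exports = router;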

We pass those streams through a transform stream that produces one chunk per row: a stringified JSON object (well, really a buffer representing the same) before we return to our Express router.
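
That per-row transform is tiny. A minimal sketch with through2 in object mode (illustrative, not the package's exact code):

'use strict';
let through2 = require("through2");

// Object mode in (one row object per chunk), JSON text out.
let stringifyRows = () => through2.obj(function (row, enc, cb) {
	cb(null, JSON.stringify(row));
});

// Usage: dbRowStream.pipe(stringifyRows()) yields JSON text chunks.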

Rather than consume that stream and create a big array, we created a piece of Express middleware that transforms the stream and pipes it directly to the response. The transform is very simple:

  1. push "[" onto the stream before the first chunk
  2. if it is not the first chunk, push a "," onto the stream
  3. push the chunk
  4. rinse and repeat
  5. when the last chunk has been pushed, push "]" onto the stream and
    then end the stream

You could, of course, transform the stream as soon as you receive it from the database, but when sending to a socket instead of the browser you don't want the rows wrapped in an array, so adding the middleware to Express is more convenient.

Most modern browsers will accept a chunked response, but you are out of luck with IE8 and prior. Some browsers will not give you access to the data on the client until the entire response is received, but even in that case you still get the benefit of reduced memory use on the server and lower latency, because the response can start as soon as the first row is received.
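
Where the browser does expose the body incrementally, the chunks can be read as they arrive. A sketch using the fetch streaming API (assuming a browser that supports ReadableStream on responses; the /orders route is illustrative):

// Log each chunk of the streamed JSON array as it arrives.
fetch("/orders").then((response) => {
	let reader = response.body.getReader();
	let decoder = new TextDecoder();
	let read = () => reader.read().then(({ done, value }) => {
		if (done) return;
		console.log(decoder.decode(value, { stream: true })); // partial JSON text
		return read();
	});
	return read();
});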

The code is embarrassingly simple, but who likes to read a blog post with no code?

'use strict';
let Stream = require("stream"),
	through2 = require("through2");

// Express middleware: if a previous handler attached a readable stream
// of stringified rows to req.resultStream, wrap it in a JSON array and
// pipe it straight to the response.
module.exports = (req, res, next) => {
	if (!req.resultStream ||
		!(req.resultStream instanceof Stream) ||
		!req.resultStream.readable) {
		next();
		return;
	}
	let chunksSent = 0;
	req.resultStream.pipe(through2(function (chunk, enc, cb) {
		// Open the array before the first row; separate later rows with commas.
		if (chunksSent === 0) {
			this.push(Buffer.from("["));
		} else {
			this.push(Buffer.from(","));
		}
		this.push(chunk);
		chunksSent++;
		cb();
	}, function (cb) {
		// An empty recordset never ran the transform, so open the array here.
		if (chunksSent === 0) {
			this.push(Buffer.from("["));
		}
		this.push(Buffer.from("]"));
		cb(); // calling the flush callback ends the stream
	})).pipe(res);
};

The complete code is available on npm as express-stream-json.
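
Wiring it up might look like this. The route and getOrderStream are hypothetical stand-ins for whatever produces the row stream; the package exports the middleware directly, as the code above shows:

'use strict';
let express = require("express"),
	streamJson = require("express-stream-json");

let app = express();

// Hypothetical handler: attach a readable stream of stringified rows
// (getOrderStream stands in for a database call), then let the
// middleware pipe it to the response.
app.get("/orders", (req, res, next) => {
	req.resultStream = getOrderStream();
	next();
}, streamJson);

app.listen(3000);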

The bottom line (literally): streams are your friends. Use them.
