Get All Results from DynamoDB Queries Easily!

A single DynamoDB query can return at most 1 MB of data. So how do you get all of the results when the result set is bigger than that? There are two ways: the simple way and the cool way.

How DynamoDB Query Pagination Works

Before we start getting an insane number of records, let's back up and talk about how pagination works in DynamoDB. When a query matches more records than it can return in one response, either because you set the Limit property or because there's more than 1 MB of results, DynamoDB returns as many records as it can along with a LastEvaluatedKey property.

As long as the LastEvaluatedKey property is set in the query results, there may be more results left to get. The only reliable sign that you have reached the end is a response without LastEvaluatedKey. To get the next page, update the original query parameters so that ExclusiveStartKey is set to LastEvaluatedKey and run the query again.
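The handshake described above can be sketched without touching AWS at all. In the sketch below, `fakeQuery`, `table`, and the `Page` and `PageRequest` types are hypothetical in-memory stand-ins, not part of the AWS SDK; only the property names Limit, ExclusiveStartKey, and LastEvaluatedKey mirror the real API.

```typescript
// Hypothetical in-memory stand-in for a DynamoDB table; not part of the AWS SDK.
type Key = { id: number };

interface PageRequest {
  Limit: number;
  ExclusiveStartKey?: Key;
}

interface Page {
  Items: Key[];
  LastEvaluatedKey?: Key; // present only when more results may remain
}

const table: Key[] = [1, 2, 3, 4, 5].map((id) => ({ id }));

// Mimics one query call: return up to Limit items after ExclusiveStartKey.
async function fakeQuery(req: PageRequest): Promise<Page> {
  const startKey = req.ExclusiveStartKey;
  const start = startKey
    ? table.findIndex((row) => row.id === startKey.id) + 1
    : 0;
  const Items = table.slice(start, start + req.Limit);
  const more = start + Items.length < table.length;
  return more
    ? { Items, LastEvaluatedKey: Items[Items.length - 1] }
    : { Items };
}

// Fetch the first two pages by hand to show the handshake.
async function firstTwoPages(): Promise<Key[][]> {
  const first = await fakeQuery({ Limit: 2 });
  // LastEvaluatedKey from page one becomes ExclusiveStartKey for page two.
  const second = await fakeQuery({
    Limit: 2,
    ExclusiveStartKey: first.LastEvaluatedKey,
  });
  return [first.Items, second.Items];
}
```

Repeating that second step until a response arrives without LastEvaluatedKey is all that either approach in this article does.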

The Simple Way

The simple way is the most obvious: use a do...while loop. After each query, copy LastEvaluatedKey into ExclusiveStartKey, and make the loop condition check whether ExclusiveStartKey is set. If it is set, there may be more records, and you should run the query again.

import * as AWS from "aws-sdk";
import { inspect } from "util";

export const getAllResults = async <T>(
    db: AWS.DynamoDB.DocumentClient,
    query: AWS.DynamoDB.DocumentClient.QueryInput
  ): Promise<Array<T>> => {

  const result = [] as Array<T>;
  // Work on a copy so the caller's query object is not mutated.
  const params = { ...query };
  console.info(`Querying Dynamo`, params.ExpressionAttributeValues);

  let queryResult: AWS.DynamoDB.DocumentClient.QueryOutput;
  do {
    queryResult = await db.query(params).promise();
    console.info(`Query Result`, queryResult);
    // Undefined on the last page, which ends the loop.
    params.ExclusiveStartKey = queryResult.LastEvaluatedKey;

    // Items are asserted to be T; nothing validates their shape at runtime.
    queryResult.Items?.forEach(item => result.push(item as T));
  } while (params.ExclusiveStartKey);

  console.info(
    `Found ${result.length} results for ${params.KeyConditionExpression}`,
    inspect(params.ExpressionAttributeValues)
  );

  return result;
};

Using an Async Generator

The problem with a do...while loop is that all of the results are held in memory at once. If you have a lot of records, that could be a problem. The fancier way to do this is with a JavaScript async generator.

An async generator is a function that returns an async iterator, which lets you yield values as they're retrieved instead of collecting them first. The following code sample creates the DynamoDB async generator.

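import * as AWS from "aws-sdk";

export async function* getAllResults<T>(
    db: AWS.DynamoDB.DocumentClient,
    query: AWS.DynamoDB.DocumentClient.QueryInput
  ): AsyncGenerator<T> {
  // Work on a copy so that an early exit by the consumer does not leave
  // a stale ExclusiveStartKey on the caller's query object.
  const params = { ...query };
  console.info(`Querying Dynamo`, params.ExpressionAttributeValues);

  let queryResult: AWS.DynamoDB.DocumentClient.QueryOutput;
  do {
    queryResult = await db.query(params).promise();
    console.info(`Query Result`, queryResult);
    params.ExclusiveStartKey = queryResult.LastEvaluatedKey;

    if (queryResult.Items) {
      // Hand items to the consumer one at a time; T is asserted, not validated.
      yield* (queryResult.Items as T[]);
    }
  } while (params.ExclusiveStartKey);
}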

Calling It

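Now that the generator is created, you can consume it using the following slightly awkward for await...of syntax. The loop has to run inside an async function or at the top level of an ES module.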

for await (const record of getAllResults(db, query)) {
  console.info(record);
}
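One consequence of consuming pages lazily is that leaving the loop early stops any further page requests. The following is a minimal sketch of that behavior, not the DynamoDB code itself: `fetchPage`, `allItems`, `findFirst`, and the `pagesFetched` counter are hypothetical names, and `fetchPage` is an in-memory stand-in for repeated query calls.

```typescript
// Counts how many pages were actually requested (for illustration only).
let pagesFetched = 0;

// Hypothetical page source: four pages of three numbers each (0..11).
async function fetchPage(
  pageNumber: number
): Promise<{ items: number[]; hasMore: boolean }> {
  pagesFetched += 1;
  const first = pageNumber * 3;
  return { items: [first, first + 1, first + 2], hasMore: pageNumber < 3 };
}

// Yields items page by page; the next page is requested only when needed.
async function* allItems(): AsyncGenerator<number> {
  let page = 0;
  let hasMore = true;
  while (hasMore) {
    const result = await fetchPage(page);
    page += 1;
    hasMore = result.hasMore;
    yield* result.items;
  }
}

// Returns the first matching item; returning from the loop ends the generator,
// so later pages are never fetched.
async function findFirst(
  predicate: (n: number) => boolean
): Promise<number | undefined> {
  for await (const item of allItems()) {
    if (predicate(item)) {
      return item;
    }
  }
  return undefined;
}
```

Searching for a value that sits on the second page requests only two of the four pages, whereas a loop that collects everything first would always request all of them.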

Conclusion

To conclude, if you have a DynamoDB query that returns more than 1 MB of results, the simple way to handle pagination is a do...while loop that collects every page into an array, and the more memory-efficient way is an async generator that yields records as each page arrives.
