How to efficiently paginate large query results in GridDB using the Java API?

In GridDB, you generally don’t need to implement traditional pagination (like LIMIT/OFFSET) to handle large result sets efficiently. The Java API is designed to support streaming-style retrieval, which is both memory-efficient and scalable.

1. Use RowSet as a cursor (recommended)

When you execute a query:

```java
Query<SensorData> query = container.query(
        "SELECT * FROM sensor_data WHERE deviceId = 'device_1'");
RowSet<SensorData> rs = query.fetch();
```

RowSet works like a cursor, meaning it does not load all rows into memory at once. You can iterate through results safely and process them in batches at the application level:

```java
int batchSize = 1000;
List<SensorData> batch = new ArrayList<>(batchSize);
while (rs.hasNext()) {
    batch.add(rs.next());
    if (batch.size() == batchSize) {
        processBatch(batch);
        batch.clear();
    }
}
if (!batch.isEmpty()) {
    processBatch(batch); // flush the final partial batch
}
```

This approach keeps memory usage bounded and is the recommended pattern in GridDB.
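Because the pattern depends only on cursor-style iteration, it can be exercised without a GridDB server by substituting a plain `java.util.Iterator` for `RowSet`. The `BatchDemo` class and `drainInBatches` method below are illustrative stand-ins, not part of the GridDB API:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class BatchDemo {
    // Drains a cursor-style iterator in fixed-size batches; memory held at
    // any moment is bounded by batchSize, never by the total result size.
    static int drainInBatches(Iterator<Integer> rows, int batchSize, List<List<Integer>> sink) {
        List<Integer> batch = new ArrayList<>(batchSize);
        int total = 0;
        while (rows.hasNext()) {
            batch.add(rows.next());
            total++;
            if (batch.size() == batchSize) {
                sink.add(new ArrayList<>(batch)); // stand-in for processBatch(batch)
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {
            sink.add(new ArrayList<>(batch)); // flush the final partial batch
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            data.add(i);
        }
        List<List<Integer>> batches = new ArrayList<>();
        int total = drainInBatches(data.iterator(), 1000, batches);
        System.out.println(total + " rows in " + batches.size() + " batches");
    }
}
```

With 2,500 rows and a batch size of 1,000, this yields two full batches and a final batch of 500, which is exactly how the RowSet loop behaves.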

2. Avoid OFFSET-based pagination

Using queries like:

```sql
SELECT * FROM sensor_data LIMIT 1000 OFFSET 100000
```

is not efficient because the database must scan and skip rows internally. Performance degrades as the offset increases, so this pattern should be avoided for large datasets.
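A toy illustration in plain Java (no GridDB involved) makes the cost visible: under LIMIT/OFFSET the engine still visits every skipped row, so the work per page grows linearly with the offset. `OffsetCost` and `page` are hypothetical names used only for this sketch:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class OffsetCost {
    // Simulates LIMIT/OFFSET over a cursor: rows before the offset are
    // fetched and discarded, which is exactly the wasted work. Returns
    // the number of rows visited to produce one page.
    static int page(Iterator<Integer> rows, int offset, int limit, List<Integer> out) {
        int visited = 0;
        while (rows.hasNext() && visited < offset) {
            rows.next(); // skipped, but still scanned
            visited++;
        }
        while (rows.hasNext() && out.size() < limit) {
            out.add(rows.next());
            visited++;
        }
        return visited;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 101_000; i++) {
            data.add(i);
        }
        List<Integer> out = new ArrayList<>();
        int visited = page(data.iterator(), 100_000, 1000, out);
        System.out.println("returned " + out.size() + " rows, visited " + visited);
    }
}
```

Fetching 1,000 rows at offset 100,000 visits 101,000 rows: the cost of each page is dominated by the offset, not the page size.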

3. Prefer key-based (timestamp) pagination for large datasets

Since your container uses timestamp as the @RowKey, you can implement range-based pagination, which scales much better:

```sql
SELECT * FROM sensor_data
WHERE deviceId = 'device_1' AND timestamp > ?
ORDER BY timestamp
```

Example:

```java
// GridDB's TQL Query interface does not support bound parameters, so the
// last-seen key is embedded as a TQL timestamp literal via
// TimestampUtils.format (both from com.toshiba.mwcloud.gs).
Date lastTimestamp = null;
int batchSize = 1000;
while (true) {
    String tql = "SELECT * FROM sensor_data WHERE deviceId = 'device_1' "
            + (lastTimestamp != null
                    ? "AND timestamp > TIMESTAMP('" + TimestampUtils.format(lastTimestamp) + "') "
                    : "")
            + "ORDER BY timestamp";
    Query<SensorData> query = container.query(tql);
    query.setFetchOption(FetchOption.LIMIT, batchSize); // fetch at most one batch per round trip
    RowSet<SensorData> rs = query.fetch();
    int count = 0;
    while (rs.hasNext()) {
        SensorData row = rs.next();
        process(row);
        lastTimestamp = row.timestamp;
        count++;
    }
    if (count < batchSize) {
        break; // fewer rows than a full batch: no more data
    }
}
```

This method:

Uses the index efficiently (RowKey)

Avoids scanning skipped rows

Provides predictable performance even with millions of records
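The keyset loop itself can be checked without a server by treating a sorted `TreeMap` as a stand-in for the indexed container: each page seeks directly past the last-seen key (an O(log n) operation, analogous to a RowKey range scan) instead of rescanning from the start. `KeysetDemo` and `nextPage` are illustrative names, not GridDB API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class KeysetDemo {
    // One page of keyset pagination: seek strictly past lastKey via the
    // sorted index (tailMap), then take at most batchSize keys.
    static List<Long> nextPage(TreeMap<Long, String> table, Long lastKey, int batchSize) {
        List<Long> page = new ArrayList<>(batchSize);
        Iterable<Long> range = (lastKey == null)
                ? table.keySet()
                : table.tailMap(lastKey, false).keySet(); // strictly greater than lastKey
        for (Long key : range) {
            page.add(key);
            if (page.size() == batchSize) {
                break;
            }
        }
        return page;
    }

    public static void main(String[] args) {
        TreeMap<Long, String> table = new TreeMap<>();
        for (long t = 0; t < 2500; t++) {
            table.put(t, "row" + t);
        }
        Long last = null;
        int pages = 0, total = 0;
        while (true) {
            List<Long> page = nextPage(table, last, 1000);
            if (page.isEmpty()) {
                break; // no more rows
            }
            total += page.size();
            last = page.get(page.size() - 1); // remember the last-seen key
            pages++;
        }
        System.out.println(pages + " pages, " + total + " rows");
    }
}
```

No page re-reads rows already consumed, which is the property that keeps performance flat as the data set grows.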

4. Small optimization

If you don’t need to lock the returned rows for update, fetch with forUpdate set to false:

```java
RowSet<SensorData> rs = query.fetch(false); // forUpdate = false
```

fetch(true) acquires row locks within the current transaction; fetch(false), which is also what the no-argument fetch() uses, skips that locking overhead on read-only scans.
