Equivalent of Python's collections.Counter in DolphinDB


The error in your groupby(count, myList, myList) call most likely arises because count is a built-in DolphinDB function with its own semantics, and it does not always behave like an ordinary aggregate function (such as sum or avg) when passed directly to a higher-order function.

To achieve the same result as Python's collections.Counter, here are the idiomatic ways to do it in DolphinDB:

1. Using SQL Syntax (Most Efficient)

The most straightforward way to count occurrences in a vector is to treat it as a column in a table. This is highly optimized:

myList = [1, 2, 3, 2, 1, 3, 2, 1]
t = table(myList as val)
select count(*) from t group by val
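For readers coming from Python, the query above amounts to a group-by-and-count over a single column. A minimal standard-library sketch of that logic follows. The helper name count_rows is made up for illustration, and the sorted row order is an assumption about presentation, not a statement about DolphinDB internals.

```python
from itertools import groupby


def count_rows(values):
    """Return (value, count) rows, one per distinct value, ordered by value."""
    # itertools.groupby only merges adjacent equal items, so sort first.
    return [(key, sum(1 for _ in run)) for key, run in groupby(sorted(values))]


my_list = [1, 2, 3, 2, 1, 3, 2, 1]
rows = count_rows(my_list)
```

Each tuple in rows plays the role of one row of the grouped result: the distinct value and how many times it occurs.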

2. Using groups and each (Dictionary Output)

If you want the result as a dictionary (similar to what Counter returns), use the groups function. It returns a dictionary mapping each unique value to the index positions where it occurs; counting those positions for each key gives the frequencies. Note that each takes the function as its first argument:

myList = [1, 2, 3, 2, 1, 3, 2, 1]
// groups() returns a dictionary of index positions; each(count, ...) counts them per key
counts = each(count, groups(myList))
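The two-step idea described here (collect the positions of each value, then count them) can be sketched in Python and checked against collections.Counter. This is only an illustration of the logic, not DolphinDB code, and the helper name group_indices is hypothetical.

```python
from collections import Counter, defaultdict


def group_indices(values):
    """Map each distinct value to the list of positions where it occurs."""
    positions = defaultdict(list)
    for i, v in enumerate(values):
        positions[v].append(i)
    return dict(positions)


my_list = [1, 2, 3, 2, 1, 3, 2, 1]
index_groups = group_indices(my_list)

# Counting the positions stored under each key yields the frequencies.
counts = {value: len(idx) for value, idx in index_groups.items()}

# Same mapping that Counter produces for this input.
assert counts == dict(Counter(my_list))
```

Building the index lists costs more memory than counting directly, so this route mainly pays off when the positions themselves are also needed.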

3. Using stat (Quick Summary)

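If you only need a quick overall summary of the vector, you can also use the stat function. Note that it returns descriptive statistics (such as count, average, minimum and maximum) rather than per-value frequencies, so it does not replace the methods above when you need Counter-style output: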

stat(myList)

Summary of the functions:

groups(): Best for creating a mapping from each unique element to the positions where it occurs.

count: Works best when applied to the results of a grouped object or within a SQL select statement.

table(): Converting a list to a table allows you to leverage DolphinDB's high-performance analytical engine.
