Efficiently generate all unique 4-integer ratios within a range up to scaling

2 weeks ago 15

ARTICLE AD BOX

Calculate all the primes up to and including half of your range, and their indices.

Make a "Boolean" bit array "primes" of size [range][prime count].

For each integer in your range, calculate if it is divisible by each prime, and if so, mark that boolean/bit as true/1. (Numbers should not be marked as multiples of themselves, or 1)

Now you finally iterate over the combinations. For each combination, look up the primes array for each of the values. If any index in that array is true/1 for all of the values in your combination, then all the values in this combination are a multiple of that prime, and thus of another combination, and thus can be skipped.

If the number of integers is a hardcoded constant, then you can use log2 bitwise-and operations to combine the bitmasks. ((A^B)^(C^D). If you really want to push the envelope, then the x64 AVX2 instruction set has a 256-bit logical-and operation VPAND.

Values up to ~625 would only have to calculate up to 64 primes, and thus each integers' primes row fits in a 64-bit integer, taking about 5kb total. 128 bits gets you ranges up to ~1417, about 23kb. 256 bits gets you ranges up to ~3237. You can very slightly increase the range by hard coding the "divisible by 2" check, but this usually only increases the range by ~2, so that's not worth the performance penalty.

The basic idea is that all numbers are multiples of primes, so to check if numbers have a common denominator, all you have to check are the primes. If two numbers are both divisible by the non-prime 15, then they're also divisible by both the prime 3 and the prime 5. And so if you know which primes multiply into each number as bits, then you can simply use logical-and to end up with a list of the primes that multiply into both numbers.

Mooing Duck

67.7k19 gold badges107 silver badges169 bronze badges

Are there any algorithms, data structures, or precomputed tables

Euclid's GCD algorithm seems pretty relevant here, on the face of it. Use it to discard candidate 4-tuples.

The sieve of Eratosthenes will often be relevant when separating primes from compound numbers. There's a rich literature surrounding it, suggesting ways to make it take less memory and less time.

Other data structures may be relevant, such as Bloom filters. You have not yet told us how to evaluate competing solutions, so it's categorically not possible to sensibly tune the Bloom parameters at this time.

Your "math for scale" clarification suggests you're interested in a solution space of on the order of a trillion 4-tuples. Presumably you're not going to use all of them in a single batch -- you will explore the space in some order, perhaps starting with smaller integers. To effectively apply any of these well known mathematical tools, you will need to first write down what your true business requirements are. Only then can we explore the solution space in a sensible way which adds value beyond the obvious "enumerate all tuples and naïvely discard the scaled ones".

You have constraints: time, money, RAM. What are they? How do you value them?

Counting to 1e12 takes a long time. The originally stated problem of counting to 1e4 is trivial, even for an Apple ][+.

Prefer to solve "the 2-tuple problem" first. Your experiences there will give insight into what would make sense for the 3-tuple and eventually the 4-tuple problem. It appears there are symmetries you could exploit.

J_H

21.5k5 gold badges29 silver badges52 bronze badges

3 Comments

The goal here is simply to see if it’s possible, not actually to generate a trillion tuples in RAM. Time and memory aren’t constraints for the 1–1000 example it’s meant to illustrate the math. The problem is solvable using theory, without brute-forcing or waiting forever. I’m a bit surprised the focus is on feasibility and on questioning why we’re doing it instead of the actual math. I understand my approach might not be conventional, but that doesn’t invalidate the question: is it possible?

2026-01-08T02:10:11.633Z+00:00

You choose not to write down a specification to an engineering problem. That's going to impact your ability to solve the problem within calendar and dollar constraints, no matter what exotic mathematical tools you bring to bear. As it stands you've not told us how to rank one solution (e.g. Euclid GCD) as "better" than another (e.g. Sieve). Absent a specification, there can be no "better', as there is no basis for comparison.

2026-01-08T02:18:40.783Z+00:00

I don’t see how the sieve would be effective here in practice. I’m not sure if GCD is the best possible solution, because you still need to run through all candidate tuples at least once. Are there algorithms or data structures that allow you to avoid examining the full dataset and only generate the relevant primitive patterns like how the Sieve of Eratosthenes automatically skips even numbers when looking for primes? I'm not sure.

2026-01-08T02:24:00.503Z+00:00

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.

Read Entire Article

LEFT SIDEBAR AD

Hidden in mobile, Best for skyscrapers.

Efficiently generate all unique 4-integer ratios within a range up to scaling

ARTICLE AD BOX

3 Comments

Related

Jupyter Kernel stuck on "Connecting" in Docker (CellOracle) on macOS Silicon (M2)

Flask API pagination and filtering works, but totalRecords count seems incorrect

Submit job without having access to worker code

LEFT SIDEBAR AD