What Is An Efficient Way To Find The Minimum Sum Of Multiple Dictionary Values, Given Keys Of Mutually Exclusive Integers?
Solution 1:
I tried solving this problem three different ways: an optimized brute force, a dynamic programming approach, and a greedy algorithm. The first two could not handle inputs for n > 17, but they generated optimal solutions, so I could use them to verify the average performance of the greedy method. I'll start with the dynamic programming approach, and then describe the greedy one.
Dynamic Programming
First, note that if we determine that (1, 2, 3, 4) and (5, 6, 7, 8) sum to a smaller value than (3, 4, 5, 6) and (1, 2, 7, 8), then your optimal solution absolutely cannot contain both (3, 4, 5, 6) and (1, 2, 7, 8) - because you could swap them out for the former and get a smaller sum. Extending this logic, there will be one optimal combination of (a, b, c, d) and (e, f, g, h) that results in the minimal sum over all combinations of x0, x1, x2, x3, x4, x5, x6, x7, and so we can rule out all of the others.
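To make the exchange argument concrete, here is a tiny hand-made fragment of a const_dict (the values are hypothetical) showing why the more expensive pairing can never appear in an optimal solution:

```python
# Hypothetical const_dict fragment illustrating the exchange argument.
const_dict = {
    (1, 2, 3, 4): 10, (5, 6, 7, 8): 12,   # disjoint pair covering {1..8}, total 22
    (3, 4, 5, 6): 15, (1, 2, 7, 8): 14,   # another pair covering {1..8}, total 29
}

# Both pairs cover the same 8 integers, so any candidate solution containing
# the second pair could swap in the first pair and strictly improve its sum.
a = const_dict[(1, 2, 3, 4)] + const_dict[(5, 6, 7, 8)]
b = const_dict[(3, 4, 5, 6)] + const_dict[(1, 2, 7, 8)]
# a == 22 < b == 29, so (3, 4, 5, 6) and (1, 2, 7, 8) are ruled out together
```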
Using this knowledge, we can build a dictionary mapping all x0, x1, x2, x3, x4, x5, x6, x7 combinations from the set [0, n) to their minimal sums, by brute forcing the sums of all combinations of x0, x1, x2, x3. Then, we can use these mappings to repeat the process for x0, x1, ..., x11 from pairs of x0, ..., x7 and x0, ..., x3 combinations. We repeat this process until we obtain all minimal sums for x0, x1 ... x_(4*m-1), which we then iterate over to find the minimal sum.
```python
from itertools import combinations

def dp_solve(const_dict, n, m):
    lookup = {comb: (comb,) for comb in const_dict.keys()}
    keys = set(range(n))

    for size in range(8, 4 * m + 1, 4):
        for key_total in combinations(keys, size):
            key_set = set(key_total)
            min_keys = (key_total[:4], key_total[4:])
            min_val = const_dict[min_keys[0]] + const_dict[min_keys[1]]

            # reversed() pairs each 4-combination with its complement,
            # since combinations() emits subsets in lexicographic order
            key1, key2 = min(zip(combinations(key_total, 4),
                                 reversed(list(combinations(key_total, size - 4)))),
                             key=lambda x: const_dict[x[0]] + const_dict[x[1]])

            k = tuple(sorted(x for x in key1 + key2))
            const_dict[k] = const_dict[key1] + const_dict[key2]
            lookup[k] = lookup[key1] + lookup[key2]

    key, val = min(((key, val) for key, val in const_dict.items() if len(key) == 4 * m),
                   key=lambda x: x[1])
    return lookup[key], val
```
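For small inputs, the result can be sanity-checked against a self-contained brute force of the kind mentioned above. The weights below are made up for illustration; the recursion enumerates each unordered selection of m mutually exclusive 4-tuples exactly once by ordering the groups by their smallest member:

```python
from itertools import combinations

# Hypothetical setup: n integers with weights; const_dict maps each sorted
# 4-tuple of distinct integers to the sum of its members' weights.
n, m = 9, 2
weights = [3, 1, 4, 1, 5, 9, 2, 6, 5]   # assumed example weights
const_dict = {c: sum(weights[i] for i in c) for c in combinations(range(n), 4)}

def brute_force(const_dict, n, m):
    """Try every grouping of m mutually exclusive 4-tuples; return the best."""
    best_key, best_val = None, float("inf")

    def rec(chosen, used, total, start):
        nonlocal best_key, best_val
        if len(chosen) == m:
            if total < best_val:
                best_key, best_val = tuple(chosen), total
            return
        # Each later group's elements all exceed the previous group's minimum,
        # so every unordered grouping is enumerated exactly once.
        for comb in combinations(range(start, n), 4):
            if used.isdisjoint(comb):
                rec(chosen + [comb], used | set(comb),
                    total + const_dict[comb], comb[0] + 1)

    rec([], set(), 0, 0)
    return best_key, best_val

key, val = brute_force(const_dict, n, m)
# val == 27: the best grouping drops only the single heaviest weight (9)
```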
Admittedly this implementation is pretty gnarly, because I kept micro-optimizing piece after piece hoping to make it fast enough without having to switch to a greedy approach.
Greedy
This is probably the one you care about, since it handles fairly large inputs quickly and is quite accurate.

Start by constructing a list of partial sums, and begin iterating over the elements in your dictionary in order of increasing value. For each element, find all partial sums that don't create any collisions with its keys, "combine" them into new partial sums, and append those to the list. In doing so, you build a list of minimal partial sums that can be created from the smallest k values in your dictionary. To speed this all up, I use hash sets to quickly check which partial sums contain any of the same keys.
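That collision check can be sketched in isolation (the names mirror the full function further down, but the data here is a toy example): each lookup[x] set holds the indices of the partial sums that already use integer x, so a few set subtractions leave exactly the partial sums disjoint from the new key:

```python
# Toy state: two partial sums already recorded.
# min_sums[0] uses integers {0, 1}; min_sums[1] uses integers {2, 3}.
lookup = [{0}, {0}, {1}, {1}, set()]   # lookup[x] = indices of partial sums using x
nset = {0, 1}                          # indices of all current partial sums
pkey = (1, 4)                          # new key being added (shortened for brevity)

valid = set(nset)
for x in pkey:
    valid -= lookup[x]                 # drop partial sums that already use x

# valid == {1}: only the partial sum using {2, 3} is collision-free with pkey
```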
In the "fast" greedy approach, you would abort the moment you find a partial sum that has a key length of 4 * m (or equivalently, of m 4-tuples). This usually nets fairly good results in my experience, but I wanted to add some logic to make it more accurate if need be. To do so, I add two factors:

- extra_runs - dictates how many extra iterations to search for better solutions before breaking
- check_factor - specifies a multiple of the current search "depth" to scan forward for a single new integer that creates a better solution with the current state. This is different from the above in that it does not "preserve" each new integer checked - it only does a quick sum to see if it creates a new minimum. This makes it significantly faster, at the cost that the other m - 1 4-tuples must already exist in one of the partial sums.

Combined, these checks seem to always find the true minimal sum, at the cost of about 5x longer runtime (still quite fast though). To disable them, just pass 0 for both factors.
```python
def greedy_solve(const_dict, n, m, extra_runs=10, check_factor=2):
    pairs = sorted(const_dict.items(), key=lambda x: x[1])
    lookup = [set() for _ in range(n)]
    nset = set()
    min_sums = []
    min_key, min_val = None, None

    for i, (pkey, pval) in enumerate(pairs):
        valid = set(nset)
        for x in pkey:
            valid -= lookup[x]
            lookup[x].add(len(min_sums))

        nset.add(len(min_sums))
        min_sums.append(((pkey,), pval))

        for x in pkey:
            lookup[x].update(range(len(min_sums), len(min_sums) + len(valid)))
        for idx in valid:
            comb, val = min_sums[idx]
            for key in comb:
                for x in key:
                    lookup[x].add(len(min_sums))
            nset.add(len(min_sums))
            min_sums.append((comb + (pkey,), val + pval))

            if len(comb) == m - 1 and (not min_key or min_val > val + pval):
                min_key, min_val = min_sums[-1]

        if min_key:
            if not extra_runs:
                break
            extra_runs -= 1

            for pkey, pval in pairs[:int(check_factor * i)]:
                valid = set(nset)
                for x in pkey:
                    valid -= lookup[x]

                for idx in valid:
                    comb, val = min_sums[idx]
                    if len(comb) < m - 1:
                        nset.remove(idx)
                    elif min_val > val + pval:
                        min_key, min_val = comb + (pkey,), val + pval

    return min_key, min_val
```
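For reference, here is how the kind of input both solvers expect might be generated (the random costs and the seed are purely illustrative):

```python
import random
from itertools import combinations

n, m = 12, 3
random.seed(42)  # illustrative; any cost values work

# Assumed input shape: every sorted 4-tuple of distinct integers from [0, n)
# mapped to a cost. This is the const_dict that dp_solve and greedy_solve consume.
const_dict = {comb: random.randint(1, 100) for comb in combinations(range(n), 4)}

# keys, total = greedy_solve(const_dict, n, m)
# -> (tuple of m mutually exclusive 4-tuples, their minimal total)
```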
I tested this for n < 36 and m < 9, and it seemed to run fairly fast (a couple of seconds to complete at worst). I'd imagine it should handle your case of 12 <= n <= 24 pretty quickly.