2021-03-19 Cybozu Labs Study Session Explain âsolving by attributing to a minimum cut problemâ that can be used to solve certain optimization problems.
Simple optimization problem
- There are n two choices.
- You get 100 yen for each choice you make.
- But it costs Ci yen to choose.
- What are the choices that maximize profit?
answer
- Profit 100 - Choose the one with positive Ci
Simple optimization problem 2
- There are n two choices.
- You get 100 yen for each choice you make.
- But it costs Ci yen to choose.
- If you choose M alternatives what choice maximizes profit?
answer
- Sort and take M pieces from the one with the largest profit.
A problem one step further.
- There are N choices of two options (N=20)
- You get 100 yen for each choice you make.
- But it costs Ci yen to choose.
- If one chooses option i and does not choose option j, it costs an additional Dij.
- M non-zero costs (M=100).
- What are the choices that maximize profit?
answer
- Search all streets
- to calculate Dij * 100 so about times
- Python3
11.3 s ± 173 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
- PyPy3
1.1 s ± 37 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
- spec
problem to be solved this time
- There are N choices of two options (N=100) âđ€
- You get 100 yen for each choice you make.
- But it costs Ci yen to choose.
- If one chooses option i and does not choose option j, there is an additional Dij cost
- Non-zero cost is M (M=100)
- What are the choices that maximize profit?
What if itâs a full search?
- Even if we use a machine that can calculate times in 1 second, it would take years.
answer
- Attribute to the minimum cut problem, transform it into a maximum flow problem with the maximum flow minimum cut theorem, and solve it with the Dinic algorithm.
- Quite fast.
- When N=M=100, it was less than one millisecond.
- Time repeated 1000 times
- Python3
858 ms ± 222 ms per loop (mean ± std. dev. of 5 runs, 1000 loop each)
- PyPy3
214 ms ± 139 ms per loop (mean ± std. dev. of 7 runs, 1000 loop each)
- N=M=10000, but about 0.2 seconds.
- Python3
243 ms ± 40 ms per loop (mean ± std. dev. of 5 runs, 1 loop each)
- PyPy3
114 ms ± 45 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
- Python3
- N=M=100000, PyPy would be about 1-2 seconds.
- PyPy3
1.4 s ± 172 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
- PyPy3
What is Cutting?
-
The bipartition (S, T) of a graph G(V, E) at vertex V is called a cut.
- Just split the âset of verticesâ in two.
- As a physical image, âpainting in two colorsâ fits.
- The word âcutâ is a confusing word to think of âcutting the graph in twoâ.
- I donât care what side youâre on.
- Hereâs another cut.
- Some descriptions in the world say âwhat about flowâ at this point, but I donât care here because it is fine to think about flow in the later phase of âmapping the minimum cut to the maximum flowâ.
- Thinking about flow at this stage is also confusing. Since there is no flow corresponding to the above cuts.
What is the minimum cut problem?
- (This time we are only talking about directed graphs.)
- Given a cut G = (V, E)], the edge [ of G that is is called a cut edge.
- The sum of the weights of the cut edges is called the size of the cut, and the cut with the smallest size is called the smallest cut.
- Often it is convenient to consider âvertex s that always goes into Sâ and âvertex t that always goes into T.â A cut such that s and t are so is sometimes called an âs-t cut.
- Q: Is an edge that has a T at the base and an S at the end not a cut edge?
- A: Good question, confirm understanding on next slide
Confirm understanding of minimum cuts
- What would be the s-t minimum cut on this graph?
- (Note: In the original figure, vertices were written in uppercase, but this was changed to lowercase to distinguish between vertex set S and vertex s.)
answer
-
when cut size is minimum at 3
-
Itâs easy to get confused when you think of a cut as âcutting the graphâ and not cutting the âmiddle 5 edgesâ.
- The size of the cut is irrelevant because the edge is such that the root is a T and the tip is an S
-
There are four ways to paint the vertices with two colors since s and t are fixed.
- when 15
- 6 when both are S or both are T.
- So 3 is the minimum
- Cutting the smallest cut is not âcutting an edgeâ.
- Cut = Two-color painting of apex
- when 15
-
reprint
-
Transformed into a maximum flow problem by the maximum flow-minimum cut theorem, attributed to the minimum cut problem and solved by the Dinic algorithm
-
-
I just finished explaining what a âminimum cut problem is.â
-
Next, I will explain how the following problem can be attributed to a minimum cut problem
-
There are N choices of two options (N=100) âđ€
-
You get 100 yen for each choice you make.
-
But it costs Ci yen to choose.
-
If one chooses option i and does not choose option j, there is an additional cost of Dij. M non-zero costs (M=100)
-
What choice maximizes profit?
-
-
Want to maximize profit = minimize cost
- Cost can be minimized with the smallest cut if cost is the weight of the edge
-
If âchoose option iâ is set to âput vertex i into S
-
What is âChoice and Cost Ciâ?
- Subtract an edge of weight Ci from vertex i to vertex t
-
What is âReward 100 if you chooseâ?
- Since the reward is negative cost, subtract the edge of weight -100 from vertex i to vertex t
- (Negative sides will need to be removed later, but donât think about that now.)
-
If you pick i and donât pick j, cost Dij.â
- Subtract an edge of weight Dij from vertex i to vertex j
-
-
Find the minimum cut in this graph and you will get the answer you want to get
concrete example
- There are three two-option choices a, b, and c
- There is a reward of 100 for choosing
- Each choice costs 10,70,150.
- If you donât choose c when you choose a, youâll have to pay an additional cost of 40.
- Graph to be constructed
- Q: âIs it okay to have the s separated? Is that the obvious place to make the smallest cut?â
- A: There is an edge with negative weight, so that is not necessarily the minimum solution.
- This confusion is caused by confusing âcutâ with âsplit the graph in twoâ.
- From the next slide, convert this minimum cut problem to a maximum flow problem. In the process, s is connected and becomes a graph suitable for considering flow
Whatâs next?
- reprint
-
Transformed into a maximum flow problem by the maximum flow-minimum cut theorem, attributed to the minimum cut problem and solved by the Dinic algorithm
-
- I was able to attribute it to the minimum cut issue.
- Next convert to maximum flow
maximum flow problem
- Given a graph with non-negative edge weights
- This weight is called âcapacityâ in the context of flow
- Consider the flow between two vertices s, t of this graph
- Image of water flowing from s to t
- How much will flow when the maximum flow? is the maximum flow problem
- A flow on G is a weighted graph , where:.
- The weight of each side does not exceed the capacity at G:
- The incoming and outgoing flows match for vertex i except s,t:
maximum flow minimum cut theorem
- It is known that the size of the s-t minimum cut of a graph with non-negative edge weights and the flow rate of the maximum flow from s to t coincide
- No proof this time.
- rough image
- Considering the graph (residual network) of the maximum flow F from s to t reduced from the capacity of G, this is always split into two or more connected components
- If not, it would further increase the flow and contradict the assumption (e in the figure).
- The âdivide the graph vertex into twoâ is a cut, and maximizing the flow results in a bottleneck in the area of the smallest cut.
Negative side removal
-
Convert a minimum cut problem to a maximum flow problem and solve it.
-
Negative edges need to be removed because the capacity of the edges must be non-negative in the maximum flow problem
-
If we focus on a vertex i, it can either be in S or in T
-
So make it so that the same amount of additional cost c is charged when you enter S and when you enter T
- Now the negative side disappears.
- The minimum cut size of the modified graph will increase by c
- Keep track of the sum (offset) of the values added in the process of negative side removal, and subtract it after the answer is obtained.
-
example
- Negative side removal
- Minimum cut is -80
- Negative side removal
-
(Negative edge removal between free vertices is described below.)
Dinic Algorithm
- Efficient algorithm for finding the maximum flow of a graph
- A slightly modified version of the classic Ford-Fulkerson
- (Although, Dinic is also 1970, so it is a classic.)
- I wonât explain in detail this time. see: Slide by Guo Zhixian
- If you put in a directed graph with non-negative edge weights and a start point s end point t, you get an s-t max flow
- This matches the size of the smallest cut
- After calculating the maximum flow, we can search on the residual network we have inside to find the vertices that can be reached from each vertexâs s. That will be the S of the minimum cut.
Dinicâs computational complexity
- Dinicâs computational complexity is assumed to be O(V^2E)
- But itâs a search-based method, so even 10^4 vertices and 10^4 edges are not always calculated 10^12 times
- When N=M=10000 in the previous problem, the graph has 10002 vertices and 20000 edges
- This could be calculated in 114 ms.
- Up to what size problem can you solve?
- Tried with N=10000 fixed and increasing M (horizontal axis M, vertical axis ms) data.
- Almost linear (a little worse?) in this range.
- This is fixed V and increasing E, so itâs linear in theory.
- Move N and observe under the condition N=M data
- N^3 in terms of computational complexity, but the actual measurement appears to be linear.
- Perhaps âthe graph produced from this optimization problemâ has âfeatures that are convenient for Dinicâ.
- Dinic speed : When the edge weights are integers, the computational complexity goes down under various conditions.
- In this experiment, each cost is a ârandom integer between 1 and 200â so the edge weights are integer and capped
- In this case, the upper bound is C.
application
- Constraints of type âWhen i is chosen, j must be chosen
- Just make iâj a large enough cost.
- and: âIf you chose i, you must choose j and k.â
- OR: âIf i or j is chosen, then k must be chosen.â
- With the knowledge so far, you can solve a type of problem called Project Selection Problem.
- There are N projects.
- Doing project i yields income [$ r_i
- There are m machines.
- Buying machine j costs [$ c_j
- In order to do a project, you have to buy the machinery needed for that project.
- Which project should I choose to maximize profit = revenue - cost?
- There are N projects.
Application Continued
- not
- When you pick i, donât pick j.â
- This is generally not always possible.
- Express choosing i by putting i in S and expressing choosing j by putting j in T
- This way we can express the above constraints, but we need to decide in advance âwhich way it will go when selectedâ.
- At this time, the above constraint can only be applied between âthe vertex that enters S when selectedâ and âthe vertex that enters T when selected
- In a word: âbipartite graph.â
- Adjacency of a single row or two-dimensional square
- Just alternate S and T.
- Use this for problems where painting squares and adjacent squares incur costs.
More than two choices
- be made
- Example of three choices
- Similarly, N choices can be represented by arranging N-1 vertices
- Similarly, for a real-valued variable, âif i is greater than or equal to x, then j must be greater than or equal to yâ can be done
Negative edge removal between free vertices
-
How do we remove these types of negative sides?
-
Maybe we canât keep this up.
-
Change j to the vertex jâ of âT when chosenâ.
-
It is even easier if the infinity edge is facing the opposite way.
- Since there is no problem with a larger âsufficiently large weightâ
What I donât know yet
- Can we have and of the form âif i and j are S, then k is also Sâ?
- Can we have an or of the form âif i is S, then j or k is Sâ?
- Canât we solve this by sandwiching an intermediate layer in the same way we used to represent the n choices?
- For example, in the case where there are two choices (0/1) for N objects (A, B), the pattern is âcost c if Ai=1 and Bi=0â, âcost d if Ai=1 and Bi=1â, and âcost e if Ai=0
- Iâm having a hard time figuring out how to realize AND if I think these are the two choices.
- If you treat it as one of the three options, itâs a straightforward form of cost for each option.
- Perhaps the problem lies in the fact that we, who are accustomed to expressing problems in logical expressions, first express them in logical expressions and then try to convert them to minimum cuts.
- I donât think there is a conversion from a general logical expression to a minimum cut.
- Rather, the problem needs to be expressed in the smallest possible cut from the beginning.
- This is similar to the composition of âthinking of a logical equation and then turning it into a neural netâ and âthinking of a logical equation and then turning it into a Hamiltonianâ.
This page is auto-translated from /nishio/æć°ă«ăăććŒ·äŒ using DeepL. If you looks something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. Iâm very happy to spread my thought to non-Japanese readers.