# Multi-Agent Path Finding with Deadlines

←

→

**Page content transcription**

If your browser does not render page correctly, please read the page content below

Multi-Agent Path Finding with Deadlines ∗ Hang Ma1 , Glenn Wagner2 , Ariel Felner3 , Jiaoyang Li1 , T. K. Satish Kumar1 , Sven Koenig1 1 University of Southern California 2 CSIRO 3 Ben-Gurion University hangma@usc.edu, glenn.wagner@data61.csiro.au, felner@bgu.ac.il jiaoyanl@usc.edu, tkskwork@gmail.com, skoenig@usc.edu Abstract [Ma et al., 2016b]. It can be solved with reductions to arXiv:1806.04216v1 [cs.AI] 11 Jun 2018 other well-studied combinatorial problems [Surynek, 2015; We formalize Multi-Agent Path Finding with Dead- Surynek et al., 2016; Yu and LaValle, 2013a; Erdem et lines (MAPF-DL). The objective is to maximize the al., 2013] and dedicated optimal [Standley and Korf, 2011; number of agents that can reach their given goal Goldenberg et al., 2014; Sharon et al., 2013; Wagner and vertices from their given start vertices within the Choset, 2015; Sharon et al., 2015; Boyarski et al., 2015; deadline, without colliding with each other. We Felner et al., 2018], bounded-suboptimal [Barer et al., 2014; first show that MAPF-DL is NP-hard to solve opti- Cohen et al., 2016], and suboptimal MAPF algorithms [Sil- mally. We then present two classes of optimal algo- ver, 2005; Sturtevant and Buro, 2006; Wang and Botea, 2011; rithms, one based on a reduction of MAPF-DL to Luna and Bekris, 2011; de Wilde et al., 2013] as described a flow problem and a subsequent compact integer in several surveys [Ma et al., 2016a; Felner et al., 2017]. linear programming formulation of the resulting MAPF has recently been generalized in different directions reduced abstracted multi-commodity flow network [Ma and Koenig, 2016; Hönig et al., 2016a; Ma et al., 2016a; and the other one based on novel combinatorial Hönig et al., 2016b; Ma et al., 2017b; Ma et al., 2017c] but search algorithms. Our empirical results demon- none of them capture an important characteristic of many ap- strate that these MAPF-DL solvers scale well and plications, namely the ability to meet deadlines. A MAPF each one dominates the other ones in different sce- variant, G-TAPF, assigns tasks with deadlines to agents but narios. does not directly maximize the number of agents that can fin- ish the tasks by the deadlines [Nguyen et al., 2017]. We thus formalize Multi-Agent Path Finding with Dead- 1 Introduction lines (MAPF-DL). The objective is to maximize the number In many robotics applications, for example, aircraft-towing of agents that can reach their given goal vertices from their vehicles [Morris et al., 2016], warehouse and office robots given start vertices within a given deadline, without collid- [Wurman et al., 2008; Veloso et al., 2015], game characters ing with each other. Since none of the existing results di- [Ma et al., 2017d], and other multi-robot systems [Ma et al., rectly transfers to MAPF-DL, we first show that MAPF-DL 2017a], robots need to finish tasks that have deadlines. For is NP-hard to solve optimally. We then present two families example, in applications that require long-term autonomy for of algorithms to solve MAPF-DL. The first family is based a team of robots, it is important to move as many robots as on a reduction of MAPF-DL to a flow problem and a subse- possible from a dangerous area to reach a shelter area before quent compact integer linear programming formulation of the a disaster occurs in inclement or adversarial conditions. resulting reduced abstracted multi-commodity flow network. One aspect of the problem, namely Multi-Agent Path Find- The second family is based on novel combinatorial search al- ing (MAPF), is to plan collision-free paths for multiple agents gorithms. We introduce three search-based MAPF-DL algo- in known environments from their given start vertices to rithms and conduct systematic experiments to compare them their given goal vertices [Ma and Koenig, 2017]. The ob- on a number of MAPF-DL instances. The results show that jective is to minimize the sum of the arrival times of the all algorithms scale well to large problem instances but each agents or the makespan. MAPF is NP-hard to solve op- one dominates the other ones in different scenarios. We study timally [Yu and LaValle, 2013b] and even to approximate their pros and cons and provide a set of guidelines for identi- within a small constant factor for makespan minimization fying when each one should be used. ∗ The research at the University of Southern California was sup- ported by the National Science Foundation (NSF) under grant num- 2 Multi-Agent Path Finding with Deadlines bers 1724392, 1409987, and 1319966 as well as a gift from Amazon. The research at Ben-Gurion University was supported by Israel Sci- In this section, we define MAPF-DL formally and prove its ence Foundation under grant number 844/17 and the Israel Ministry computational hardness. We then present an optimal MAPF- of Science. DL algorithm based on integer linear programming (ILP).

Problem Definition Algorithm 1: High Level of CBS-DL (and MA-DBS) We formalize MAPF-DL as follows: We are given a dead- Input: MAPF-DL instance line, denoted by a time step Tend , a finite undirected graph 1 Root.constraints ← ∅ 2 Root.plan ← path for each agent found by a low-level search G = (V, E), and M agents a1 , a2 . . . aM . Each agent ai 3 Root.cost ← 0 has a start vertex si and a goal vertex gi . In each time step, 4 OPEN ← {Root} each agent either moves to an adjacent vertex or stays at the 5 while true do 6 N ← arg minN 0 ∈OPEN N 0 .cost same vertex. Each agent can reach its goal vertex in Tend 7 OPEN ← OPEN \ {N } time steps in the absence of other agents (without loss of 8 Try to find a collision in N.plan 9 if N.plan has no collision then generality). Let li (t) be the vertex occupied by agent ai at 10 return N.plan time step t ∈ {0 . . . Tend }. We call an agent ai successful 11 C ← first vertex or edge collision (ai , aj , . . . ) in N.plan iff it occupies its goal vertex at the deadline Tend , that is, 12 // begin: below for MA-DBS only li (Tend ) = gi . A plan consists of a path li assigned to each 13 if shouldMerge(ai , aj ) then successful agent ai that satisfies the following conditions: (1) 14 a{i,j} ← merge(ai , aj ) 15 Update N.constraints (external constraints of a{i,j} ) li (0) = si [each successful agent starts at its start vertex]. (2) 16 Update N.plan by invoking a low-level search (with DBS) for (li (t − 1), li (t)) ∈ E or li (t − 1) = li (t) [each successful a{i,j} agent always either moves to an adjacent vertex or does not 17 N.cost ← Munsucc in N.plan move]. Each unsuccessful agent ai is removed at time step 18 OPEN ← OPEN ∪ {N } 19 continue // go to [Line 5] zero, and the plan thus contains no path assigned to it, that is, 20 // end: above for MA-DBS only li = ∅.1 We define a collision between two different success- 21 foreach ai involved in C do ful agents ai and aj to be either a vertex collision (ai , aj , v, t) 22 N 0 ← new node 23 N 0 .plan ← N.plan iff v = li (t) = lj (t) [both successful agents occupy the same 24 N 0 .constraints ← N.constraints ∪ {(ai , . . . )} vertex simultaneously] or an edge collision (ai , aj , u, v, t) iff 25 Update N 0 .plan by invoking a low-level search (with A* or DBS) u = li (t) = lj (t+1) and v = lj (t) = li (t+1) [both success- for ai ful agents traverse the same edge simultaneously in opposite 26 N 0 .cost ← Munsucc in N 0 .plan 27 OPEN ← OPEN ∪ {N 0 } directions]. A solution is a plan without collisions. The objective of MAPF-DL is to maximize the number of successful agents Msucc = |{ai |li (Tend ) = gi }|, that is, the number of paths in the solution, or, equivalently, minimize which is similar to the reductions of MAPF and a MAPF the number of unsuccessful agents Munsucc = M − Msucc . variant, TAPF, to multi-commodity flow problems [Yu and The cost of a plan is thus the number of unsuccessful agents LaValle, 2013a; Ma and Koenig, 2016]. It then encodes the Munsucc . It can also be defined as the sum of the costs of latter problem using a compact integer linear programming all agents since a Boolean cost can be defined for each agent (ILP) formulation on a reduced abstracted multi-commodity where each successful agent incurs cost 0 and each unsuc- flow network. See our preliminary work [Ma et al., 2018] for cessful agent incurs cost 1. Obviously, every MAPF-DL in- more details on this algorithm. stance has a trivial solution where all agents are unsuccessful, namely with cost Munsucc = M . 3 Search-Based MAPF-DL Algorithms Intractability In this section, we present a spectrum of optimal combi- Theorem 1. It is NP-hard to compute a MAPF-DL solution natorial search algorithms for solving MAPF-DL: Conflict- with the maximum number of successful agents. Based Search with Deadlines (CBS-DL), an adapted version The proof of the theorem reduces the ≤3,=3-SAT prob- of Conflict-Based Search (CBS) [Sharon et al., 2015]; Death- lem [Tovey, 1984], an NP-complete version of the Boolean Based Search (DBS), which reasons about sets of success- satisfiability problem, to MAPF-DL. The reduction is similar ful agents; and Meta-Agent DBS (MA-DBS), which incorpo- to the one used for proving the NP-hardness of approximat- rates the advantages of CBS-DL and DBS. ing the optimal makespan for MAPF [Ma et al., 2016b]. It constructs a MAPF-DL instance with deadline Tend = 3 that 3.1 CBS-DL has a zero-cost solution iff the given ≤3,=3-SAT instance is (Standard) CBS is a two-level MAPF algorithm that mini- satisfiable. Also see our preliminary work [Ma et al., 2018]. mizes the sum of the arrival times of all agents at their goal vertices. CBS-DL is an adaptation of CBS for MAPF-DL. ILP-Based MAPF-DL Algorithm Algorithm 1 shows its high-level search. Lines in red are Our ILP-based MAPF-DL algorithm first reduces MAPF-DL used in MA-DBS (presented in Section 3.3) only. CBS-DL to the maximum (integer) multi-commodity flow problem, uses the same framework as CBS but uses Munsucc as cost. 1 On the high level, CBS-DL performs a best-first search to re- Depending on the application, the unsuccessful agents can be removed at time step zero, wait at their start vertices, or move out of solve collisions among the agents and thus builds a constraint the way of the successful agents. We choose the first option in this tree (CT). Each CT node contains a set of constraints and a paper. If the unsuccessful agents are not removed, they can obstruct plan that obeys these constraints. CBS-DL always expands other agents. However, our proof of NP-hardness does not depend the CT node with the smallest cost Munsucc of its plan. The on this assumption, and our MAPF-DL algorithms can be adapted root CT node has no constraints [Line 1]. CBS-DL performs to other assumptions. a low-level search to find a path for each agent (without any

constraints). The plan of the root CT node thus contains paths to the induction assumption and the fact that N 0 .plan inherits for all agents [Line 2], and its cost is zero [Line 3]. When the paths of all agents different from agent ai from N.plan on CBS-DL expands a CT node N , it checks whether the CT Line 23. node contains a plan that has no collisions [Line 8]. If this is Lemma 4. CBS-DL chooses CT nodes on Line 6 in non- the case, N is a goal node and CBS-DL terminates success- decreasing order of their costs. fully [Line 10]. Otherwise, CBS-DL chooses a collision to Proof. CBS-DL performs a best-first search and the cost of a resolve [Line 11] and generates two child nodes N1 and N2 parent CT node N is at most the cost of any of its child CT that inherit all constraints and the plan from N [Line 21-23]. nodes N 0 since N 0 .plan contains at most as many paths as If the collision to resolve is a vertex collision (ai , aj , v, t), N.plan contains because (1) the plan of a CT node contains CBS-DL adds the vertex constraint (ai , v, t) to N1 to prohibit the largest possible number of paths (one for each success- agent ai from occupying v at time step t and similarly adds ful agent) that obey its constraints according to Lemma 3, the vertex constraint (aj , v, t) to N2 . If the collision to re- and thus (2) the set of all plans that obey N 0 .constraints is a solve is an edge collision (ai , aj , u, v, t), CBS-DL adds the subset of the set of all plans that obey N.constraints (since edge constraint (ai , u, v, t) to N1 to prohibit agent ai from N.constraints ⊂ N 0 .constraints due to Line 24). moving from u to v at time step t and similarly adds the edge Lemma 5. The cost of a CT node is at most the cost of any constraint (aj , v, u, t) to N2 [Line 24]. For each child CT solution that obeys its constraints. node, say N1 , CBS-DL performs a low-level search for agent Proof. The cost of the CT node is the cost Munsucc of its ai to compute a new path from its start vertex to its goal vertex plan, which in turn is the minimum among the costs of all within deadline Tend that obeys the constraints of N1 relevant plans that obey its constraints according to Lemma 3, which to agent ai and replaces the old path of agent ai in N1 .plan in turn is at most the cost of any solution that obeys its con- with the new path returned by the low-level search (it deletes straints since every solution that obeys its constraints is also the old path if no path is returned) [Line 25]. CBS-DL thus a plan that obeys its constraints. updates the cost of N1 accordingly and inserts it into OPEN Theorem 2. CBS-DL is complete and optimal. [Lines 26-27]. Proof. A solution always exists, for example, where all On the low level, CBS-DL performs an A* search to find agents are unsuccessful. Now assume that the cost of an a path for the agent from its start vertex to its goal vertex by optimal solution is x and, for a proof by contradiction, that pruning all nodes with time step > Tend . If it finds a path from CBS-DL does not terminate with a solution of cost x. There- the start vertex to the goal vertex of length exactly Tend time fore, whenever CBS-DL chooses a CT node with cost x on steps that obeys the constraints imposed by the high level, it Line 6, its plan has collisions (because otherwise CBS-DL returns the path for the agent and cost 0. Otherwise, it returns would correctly terminate with a solution of cost x according no path and cost 1. to Lemma 2 since it chooses CT nodes on Line 6 in non- decreasing order of their costs according to Lemma 4). Pick Theoretical Analysis an arbitrary optimal solution. A CT node whose constraints We now prove that CBS-DL is complete and optimal. the optimal solution obeys has cost ≤ x according to Lemma Lemma 1. CBS-DL generates only finitely many CT nodes. 5. The root CT node is such a node since the optimal solution Proof. The constraint added on Line 24 to a child CT node is trivially obeys its (empty) constraints. Whenever CBS-DL different from the constraints of its parent CT node since the chooses such a CT node on Line 6, its plan has collisions (as paths of its parent CT node do not obey it. The depth of the shown directly above since its cost is ≤ x). CBS-DL thus (binary) CT is finite because all paths are not longer than Tend generates the child CT nodes of this parent CT node, the con- and only finitely many different vertex and edge constraints straints of at least one of which the optimal solution obeys and exist. which CBS-DL thus inserts into OPEN with cost ≤ x. Since Lemma 2. Whenever CBS-DL chooses a CT node on Line CBS-DL chooses CT nodes on Line 6 in non-decreasing or- 6 and the plan of the node has no collisions, then CBS-DL der of their costs according to Lemma 4, it chooses infinitely terminates with a solution with finite cost. many CT nodes on Line 6 with costs ≤ x, which contradicts Proof. The cost of the CT node is Munsucc of its plan, which Lemma 1. is bounded by M . Lemma 3. The plan of a CT node has the largest possible 3.2 Death-Based Search number of paths (one for each successful agent) that obey its Death-Based Search (DBS) is also a two-level algorithm. constraints. Conceptually, instead of imposing vertex or edge constraints Proof (by induction). The statement holds for the root CT on agents, DBS marks individual agents as unsuccessful and node because its plan contains one path for each agent (since then searches for the minimal set of unsuccessful agents nec- each agent can reach its goal vertex in Tend time steps in the essary to produce a solution. absence of other agents). Assume that the statement holds for We define a group γ of agents to be consistent iff all agents the parent CT node N of any child CT node N 0 . When CBS- in it can simultaneously be successful, that is, the sub-MAPF- DL updates the plan of N 0 on Line 25, it changes the path DL instance with the agents in γ has a zero-cost solution (an for one agent only, say agent ai , by performing a low-level empty group is consistent). This condition is verified by a search with the constraints of N 0 (including the newly added special call to CBS-DL with deadline Tend , which reports that constraint hai , . . . i). Therefore, CBS-DL correctly updates the condition holds if all agents in γ are successful or reports the path for agent ai , and the statement holds also for N 0 due that the condition does not hold once CBS-DL expands a CT

Algorithm 2: High Level of DBS agent unsuccessful for the child DT nodes), which requires a Input: MAPF-DL instance procedure that can determine all consistent subgroups of the 1 Root.live ← {{ai }|i = 1 . . . M } (inconsistent) group γ of agents efficiently and might result 2 Root.cost ← 0 in a larger branching factor. In this paper, we chose to present 3 OPEN ← {Root} 4 while true do the version that is the easiest to understand and analyze. 5 N ← arg minN 0 ∈OPEN N 0 .cost 6 OPEN ← OPEN \ {N } Theoretical Analysis 7 Check whether all groups in N.live are consistent by calling CBS-DL 8 if all groups in N.live are consistent then We now prove that DBS is complete and optimal. 9 if |N.live| = 1 then Lemma 6. DBS generates only finitely many DT nodes. 10 return the zero-cost solution for the single group γ in N.live Proof. The branching factor of a DT node is bounded by M 11 else 0 due to Line 19. Due to Lines 14 and 21, when we consider 12 N ← new node 13 γ ← γ1 ∪ γ2 (γ1 and γ2 are the smallest groups in N.live) each DT node in a downward traversal of any branch of DT 14 N 0 .live ← (N.live \ {γ1 , γ2 }) ∪ {γ} from the root DT node, its set contains either one less group 15 N 0 .cost ← N.cost 16 OPEN ← OPEN ∪ {N 0 } (when merging two groups) or one less agent (when declaring an unsuccessful agent) than that of its parent CT node. Its set 17 else 18 γ ← first group in N.live that does not have a zero-cost solution is thus different from the sets of all its ancestor DT nodes. 19 foreach ai ∈ γ do Therefore, the depth of DT is also finite since there are finitely 20 N 0 ← new node 21 N 0 .live ← (N.live \ {γ}) ∪ {γ \ {ai }} many possible sets of disjoints groups of the M agents. 22 N 0 .cost ← N.cost + 1 Lemma 7. Whenever DBS chooses a DT node on Line 5 23 OPEN ← OPEN ∪ {N 0 } whose set contains one single consistent group of live agents, then DBS correctly terminates with a solution of finite cost. Proof. Its cost is the number of agents that have been de- clared unsuccessful, which is bounded by M . node with non-zero cost (that is, at least one agent in γ is not Lemma 8. DBS chooses DT nodes on Line 5 in non- successful). decreasing order of their costs. On the high level, DBS performs a best-first search on the Proof. DBS performs a best-first search, and the cost of a death tree (DT). Each DT node N contains a set N.live of parent DT node is at most the cost of any of its child DT disjoint groups of live agents (agents that have not been de- nodes due to Lines 15 and 22. clared unsuccessful) and a cost N.cost equal to the number Theorem 3. DBS is complete and optimal. of agents that have been declared unsuccessful. Algorithm 2 Proof. A solution always exists, for example, where all shows the high-level search of DBS. The root DT node con- agents are unsuccessful. Now assume that the cost of an opti- tains a set of M groups of live agents, each group containing mal solution is x and, for a proof by contradiction, that DBS a single unique agent [Lines 1] and its cost is zero [Line 2]. does not terminate with a solution of cost x. Therefore, when- DBS chooses the DT node N with the smallest cost N.cost ever DBS chooses a DT node with cost x on Line 5, its set and checks if all groups in its set N.live are consistent [Line does not contain one single consistent group (because other- 5-7]. If N.live contains a single consistent group γ, the DT wise DBS would correctly terminate with a solution of cost node N is a goal node, and DBS returns the zero-cost solu- x according to Lemma 7 since it chooses DT nodes on Line tion for γ [Line 10]. If all (more than one) groups in N.live 5 in non-decreasing order of their costs according to Lemma are consistent, DBS merges the two smallest groups γ1 and 8). Pick an arbitrary optimal solution with the set γunsucc of γ2 in N.live to form a new group γ and adds a child DT node x unsuccessful agents. Trivially, a DT node that has declared whose set contains all the groups in N.live but replaces γ1 and the agents in a subset of γunsucc unsuccessful has cost ≤ x. γ2 with γ [Lines 12-16]. Otherwise, there is an inconsistent The root DT node is such a node since it has not declared group γ in N.live [Line 18]. We know that at least one agent any agents unsuccessful. Whenever DBS chooses such a DT in γ must be declared unsuccessful, forcing a split. In this node on Line 5, its set does not contain one single consistent case, DBS adds |γ| child nodes, one for each agent ai ∈ γ, to group (as shown directly above since its cost is ≤ x). Its set DT, where each of these nodes declares its own unique agent thus contains (1) more than one consistent group or (2) an ai ∈ γ unsuccessful, and its cost is thus one larger than that inconsistent group (in which case the DT node has declared of its parent [Lines 20-23]. the agents in a strict subset of γunsucc unsuccessful). In case (1), DBS thus generates the only child DT node of this parent Other Versions of DBS DT node, which has declared the same agents unsuccessful as DBS could have started with a root DT node [Line 1] whose the parent DT node and which DBS thus inserts into OPEN set contains only a single group of all M agents, which does with cost ≤ x. In case (2), DBS thus generates the child DT not require merging groups of live agents but results in a nodes of this parent DT node, at least one of which has still larger branching factor for the root DT node. DBS could have declared the agents (including one additional agent) in a sub- chosen different groups to merge [Line 14], which might re- set of γunsucc unsuccessful and which DBS thus inserts into sult in an inconsistent group of larger size. Whenever DBS OPEN with cost ≤ x. Since DBS chooses DT nodes on Line splits a parent DT node [Lines 20-23], it could have generated 5 in non-decreasing order of their costs according to Lemma child DT nodes whose sets contain only consistent additional 8, it chooses infinitely many DT nodes on Line 5 with costs groups (and thus possibly declare more than one additional ≤ x, which contradicts Lemma 6.

1.0 60 3.3 Meta-Agent DBS 0.9 0.8 50 CBS may perform poorly when an environment contains 0.7 40 Success Rate many possible, but colliding, paths for the agents since the 0.6 Time(s) 0.5 ILP 30 ILP size of CT is exponential in the number of collisions resolved. 0.4 CBS-DL CBS-DL On the other hand, DBS may perform poorly for MAPF-DL 0.3 DBS 20 DBS MA-DBS(0) if the conflicting agents are not added to the same group early 0.2 10 MA-DBS(0) MA-DBS(10) 0.1 MA-DBS(10) in the search. We thus combine the power of CBS for weakly 0.0 MA-DBS(100) 0 MA-DBS(100) coupled agents and the power of DBS for identifying unsuc- 10 20 30 40 50 60 70 80 90 100 Number of Agents 10 20 30 40 50 60 70 80 90 100 Number of Agents cessful agents in a tightly coupled subset of agents using the agents #instances ILP CBS-DL DBS MA-DBS(0) MA-DBS(10) MA-DBS(100) 10 50 0.86 0.24 0.65 0.25 0.30 0.25 Meta-Agent CBS [Sharon et al., 2015] framework, which re- 20 50 2.18 1.31 2.80 1.07 0.58 0.91 30 42 3.76 1.57 4.24 3.23 1.76 1.85 sults in a new optimal MAPF-DL algorithm, called Meta- 40 26 6.74 3.02 7.16 8.54 2.82 2.35 50 14 11.88 9.23 11.76 19.43 4.82 5.46 Agent DBS (MA-DBS). 60 5 29.26 4.82 10.82 27.07 4.77 5.88 MA-DBS is a two-level algorithm: It uses the high-level search of CBS-DL on the high level and DBS on the low Figure 1: Success rates (top left), averaged running times over all in- level. Algorithm 1 shows its high-level search. MA-DBS stances (top right), and averaged running times over instances solved by all six algorithms (bottom) for different numbers of agents. is similar to CBS-DL but also keeps track of the number of times collisions between every pair of (simple) agents that it has considered thus far during the search in a conflict ma- Theorem 4. MA-DBS is complete and optimal. trix CM . Before MA-DBS expands a CT node N for the Proof. MA-DBS only merges two agents into one agent on colliding agents ai and aj , if the number of collisions be- Line 14 but never splits any agent. Therefore, MA-DBS tween the two agents exceeds a user-defined merge thresh- does the merge operation [Lines 14-19] finitely many times old B, MA-DBS merges them into a composite meta agent (bounded by M ) for each CT node. The rest of the proof is a{i,j} . To do so, whenever MA-DBS considers a collision the same as the proof of Theorem 2. between (meta) agents ai and aj [Line 11], because two sim- ple agents ak ∈ ai and ak0 ∈ aj collide, it increases the 0 4 Experiments value of CM [{k, Pk }] by one. Function shouldMerge(a i , aj ) returns true iff ak ∈ai ,ak0 ∈aj CM [{k, k 0 }] > B [Line 13]. In this section, we describe our experimental results on a Since DBS uses a low-level search that finds a plan for a 2.50 GHz Intel Core i5-2450M laptop with 6 GB RAM. We meta agent without any internal collisions between all (sim- tested six optimal MAPF-DL algorithms: the ILP-based algo- ple) agents in the meta agent, it only needs to store external rithm, CBS-DL, DBS, and MA-DBS with merge thresholds constraints resulting from (external) collisions between any 0, 10, and 100 (labeled as MA-DBS(0), MA-DBS(10), and two (simple) agents in different meta agents. Therefore, if MA-DBS(100), respectively). The ILP-based algorithm uses MA-DBS decides to merge ai and aj into a{i,j} , it updates CPLEX V12.7.1 [IBM, 2011] as the ILP solver. We experi- the constraints N.constraints of the CT node N accordingly mented on instances where the start and goal vertices of each [Line 15]. It then calls DBS to find new paths (without inter- agent are placed randomly so that the distance between them nal collisions) for all agents in a{i,j} subject to the constraints is close to the deadline. An instance becomes much easier to in N.constraints relevant to a{i,j} (by solving a MAPF-DL solve if this distance is much smaller than the deadline (since instance with agents in a{i,j} ) and updates the plan N.plan there is more leeway to plan a path for the agent). Specif- and cost N.cost of the CT node N according to the new paths ically, we use three sets of randomly generated MAPF-DL returned by DBS [Lines 16-17]. Then, instead of expanding instances with different numbers of agents (varied from 10 to N , MA-DBS inserts N back into OPEN [Line 18]. When 100 in increments of 10) labeled as SMALL, MEDIUM, and MA-DBS generates a new child CT node, it also calls DBS LARGE on 40 × 40, 80 × 80, and 120 × 120 4-neighbor 2-D to find an optimal solution for a meta agent that obeys the grids with deadlines Tend = 50, 100, and 150, respectively. constraints of the child CT node [Line 25]. The cells in each grid are blocked independently at random with 20% probability each. We generate 50 MAPF-DL in- Theoretical Analysis stances for each number of agents for each set. The start and Lemmas 1 and 4 hold for MA-DBS without change. Since goal vertices of each agent are randomly placed at distance the low-level search of MA-DBS, namely DBS, returns the 48, 49, or 50 for SMALL, 98, 99, or 100 for MEDIUM, and maximum number of paths for a meta agent that obey the 148, 149, or 150 for LARGE. Each algorithm is given a time constraints of a CT node, Lemma 3 also holds for MA-DBS limit of 60 seconds to solve each instance. We did not run an because (1), when it updates the plan of a CT node on Line 16, algorithm for some number of agents if it solved none of the the resulting plan contains the maximum number of paths for 50 instances for a smaller number of agents. the new meta agent and the original paths of the other agents, The 40×40 SMALL domain Figure 1 (top left) plots the suc- and (2), when it updates the plan of a child CT node on Line cess rates (numbers of instances solved within the time limit 25, the resulting plan contains the maximum number of paths divided by 50) for all algorithms for SMALL. ILP has the for the meta agent and inherits paths of other agents from the highest success rates, and they start to drop only at 50 agents. plan of the parent CT node, and thus the induction argument The success rates for the search-based algorithms start to drop for the proof of Lemma 3 holds. Consequently, Lemma 4 also at 30 agents. DBS and MA-DBS(0) have the highest success holds for MA-DBS. rates among all search-based algorithms. Figure 1 (top right)

1.0 60 0.9 The 120 × 120 LARGE domain Figure 3 reports the same 50 0.8 statistics for LARGE in the same format as reported for 0.7 40 SMALL. Figure 3 (top left) plots the success rates. CBS- Success Rate 0.6 Time(s) 0.5 ILP 30 DL, MA-DBS(10), and MA-DBS(100) have the best success ILP 0.4 CBS-DL CBS-DL 0.3 DBS 20 DBS rates. ILP has the worst success rates. Figure 3 (top right) 0.2 MA-DBS(0) 10 MA-DBS(0) plots the average running times over all 50 instances. MA- MA-DBS(10) 0.1 MA-DBS(10) DBS(10) seems to perform the best. ILP performs the worst. MA-DBS(100) 0.0 0 MA-DBS(100) 10 20 30 40 50 60 70 80 90 100 10 20 30 40 50 60 70 80 90 100 The table in Figure 3 reports the average running times over Number of Agents Number of Agents agents #instances ILP CBS-DL DBS MA-DBS(0) MA-DBS(10) MA-DBS(100) instances that are solved by all six algorithms. MA-DBS(10) 10 20 49 48 3.80 8.87 0.89 1.71 2.95 6.97 0.69 1.75 0.82 1.71 0.90 1.75 and CBS-DL perform the best (very close to each other) and 30 40 43 42 17.56 26.92 2.65 3.84 12.19 17.96 3.01 6.04 2.86 4.08 3.02 4.24 outperform ILP by up to a factor of 9. 50 60 37 22 37.46 48.80 5.92 7.26 25.57 31.89 14.33 21.31 6.36 7.29 6.12 7.33 Summary of Experimental Results For the same number 70 3 47.76 8.70 35.06 26.37 7.74 7.97 of agents, SMALL has higher agent density, more tightly- coupled agents, and shorter planning horizons than MEDIUM Figure 2: Results for MEDIUM. and LARGE. ILP outperforms the search-based algorithms 1.0 60 for SMALL because the size of the ILP formulation is small. 0.9 When Tend increases, the size of the ILP formulation and the 50 0.8 running time required to solve it increase significantly. 0.7 40 Success Rate 0.6 On the other hand, among all search-based algorithms, Time(s) ILP 0.5 CBS-DL 30 ILP there seems to be a spectrum where DBS and CBS-DL sit 0.4 CBS-DL 0.3 DBS 20 DBS at two extremes. DBS has higher success rates than CBS- MA-DBS(0) 0.2 MA-DBS(10) 10 MA-DBS(0) DL for SMALL. CBS-DL has significantly higher success 0.1 MA-DBS(10) 0.0 MA-DBS(100) 0 MA-DBS(100) rates than DBS for MEDIUM and LARGE. CBS-DL uses 10 20 30 40 50 60 70 80 90 100 10 20 30 40 50 60 70 80 90 100 much less times than DBS for instances that are solved by Number of Agents Number of Agents all algorithms. MA-DBS seems to balance between CBS-DL agents #instances ILP CBS-DL DBS MA-DBS(0) MA-DBS(10) MA-DBS(100) 10 50 9.52 1.38 5.31 1.55 1.33 1.36 and DBS: (a) MA-DBS(0) is more similar to DBS because it 20 48 25.49 3.04 14.15 2.98 2.91 3.03 merges agents into meta agents more frequently, which can 30 38 42.43 4.84 24.88 5.67 4.86 4.80 40 15 55.44 7.20 37.23 8.61 6.88 6.67 result in a large meta agent containing many agents that need to be solved by DBS on the low level; and, on the other Figure 3: Results for LARGE. hand, (b) MA-DBS(10) and MA-DBS(100) are more simi- lar to CBS-DL because they merge agents less frequently and plots the average running times over all 50 instances. 60 sec- their searches mostly remain in the CBS-DL framework. onds are used for an instance that is not solved. Therefore, the data points in the chart are lower bounds on the running times 5 Conclusions and Future Work in those cases when not all instances are solved. ILP performs We formalized MAPF-DL, a new variant of MAPF. Theoret- the best. Finally, the table in Figure 1 reports the average run- ically, we proved that MAPF-DL is NP-hard to solve opti- ning times over those “easy” instances that are solved by all mally. We presented two families of optimal MAPF-DL al- six algorithms (it also reports the numbers of those instances gorithms, one based on an ILP formulation and one based but does not show the rows where no instance is solved by on combinatorial search techniques. Our experimental results all the algorithms). The best entry in each row is shown in show that each of them performs the best in different scenar- bold. The search-based algorithms use less time to solve these ios. We suggest the following future directions: (1) develop “easy” instances than ILP. CBS-DL, MA-DBS(10), and MA- and compare new MAPF-DL algorithms, for example, A*-, DBS(100) seem to use the least times and outperform ILP ASP-, and SAT-based algorithms; (2) study important gen- by up to a factor of 6. In some cases, the running times are eralizations of MAPF-DL (for example, when agents have smaller for larger numbers of agents because fewer (and “eas- different priorities) more deeply; (3) study the combinatorial ier”) instances are solved by all algorithms. difference between MAPF-DL and MAPF; and (4) explore The 80 × 80 MEDIUM domain Figure 2 reports the same different merge criteria for MA-DBS. statistics for MEDIUM in the same format as reported for SMALL. Figure 2 (top left) plots the success rates. ILP has the highest success rates for small numbers (≤ 50) of agents References but the lowest success rates for large numbers of agents. MA- [Barer et al., 2014] M. Barer, G. Sharon, R. Stern, and A. Felner. DBS(10) seems to perform the best for large numbers of Suboptimal variants of the conflict-based search algorithm for the agents. Figure 2 (top right) plots the average running times multi-agent pathfinding problem. In SoCS, pages 19–27, 2014. over all 50 instances. ILP has the longest running times. MA- [Boyarski et al., 2015] E. Boyarski, A. Felner, R. Stern, G. Sharon, DBS(10) seems to perform the best in general. Finally, the D. Tolpin, O. Betzalel, and S. E. Shimony. ICBS: Improved table in Figure 2 reports the average running times over in- conflict-based search algorithm for multi-agent pathfinding. In stances that are solved by all six algorithms. ILP performs IJCAI, pages 740–746, 2015. the worst. CBS-DL seems to have the smallest running times [Cohen et al., 2016] L. Cohen, T. Uras, T. K. S. Kumar, H. Xu, and outperforms ILP by up to a factor of 7. N. Ayanian, and S. Koenig. Improved solvers for bounded-

suboptimal multi-agent path finding. In IJCAI, pages 3067–3074, congested video game environments. In AIIDE, pages 270–272, 2016. 2017. [de Wilde et al., 2013] B. de Wilde, A. W. ter Mors, and C. Wit- [Ma et al., 2018] H. Ma, G. Wagner, A. Felner, J. Li, T. K. S. Ku- teveen. Push and rotate: Cooperative multi-agent path planning. mar, and S. Koenig. Multi-agent path finding with deadlines: In AAMAS, pages 87–94, 2013. Preliminary results. In AAMAS, 2018. [Erdem et al., 2013] E. Erdem, D. G. Kisa, U. Oztok, and [Morris et al., 2016] R. Morris, C. Pasareanu, K. Luckow, W. Ma- P. Schueller. A general formal framework for pathfinding prob- lik, H. Ma, S. Kumar, and S. Koenig. Planning, scheduling and lems with multiple agents. In AAAI, pages 290–296, 2013. monitoring for airport surface operations. In AAAI-16 Workshop [Felner et al., 2017] A. Felner, R. Stern, S. E. Shimony, E. Bo- on Planning for Hybrid Systems, 2016. yarski, M. Goldenberg, G. Sharon, N. R. Sturtevant, G. Wagner, [Nguyen et al., 2017] V. Nguyen, P. Obermeier, T. C. Son, and P. Surynek. Search-based optimal solvers for the multi-agent T. Schaub, and W. Yeoh. Generalized target assignment and path pathfinding problem: Summary and challenges. In SoCS, pages finding using answer set programming. In IJCAI, pages 1216– 29–37, 2017. 1223, 2017. [Felner et al., 2018] A. Felner, J. Li, E. Boyarski, H. Ma, L. Cohen, [Sharon et al., 2013] G. Sharon, R. Stern, M. Goldenberg, and T. K. S. Kumar, and S. Koenig. Adding heuristics to conflict- A. Felner. The increasing cost tree search for optimal multi-agent based search for multi-agent path finding. In ICAPS, 2018. pathfinding. Artificial Intelligence, 195:470–495, 2013. [Goldenberg et al., 2014] M. Goldenberg, A. Felner, R. Stern, [Sharon et al., 2015] G. Sharon, R. Stern, A. Felner, and N. R. G. Sharon, N. R. Sturtevant, R. C. Holte, and J. Schaeffer. En- Sturtevant. Conflict-based search for optimal multi-agent hanced partial expansion A*. Journal of Artificial Intelligence pathfinding. Artificial Intelligence, 219:40–66, 2015. Research, 50:141–187, 2014. [Silver, 2005] D. Silver. Cooperative pathfinding. In AIIDE, pages [Hönig et al., 2016a] W. Hönig, T. K. S. Kumar, L. Cohen, H. Ma, 117–122, 2005. H. Xu, N. Ayanian, and S. Koenig. Multi-agent path finding with [Standley and Korf, 2011] T. S. Standley and R. E. Korf. Complete kinematic constraints. In ICAPS, pages 477–485, 2016. algorithms for cooperative pathfinding problems. In IJCAI, pages [Hönig et al., 2016b] W. Hönig, T. K. S. Kumar, H. Ma, N. Aya- 668–673, 2011. nian, and S. Koenig. Formation change for robot groups in oc- [Sturtevant and Buro, 2006] N. R. Sturtevant and M. Buro. Improv- cluded environments. In IROS, pages 4836–4842, 2016. ing collaborative pathfinding using map abstraction. In AIIDE, [IBM, 2011] IBM. IBM ILOG CPLEX Optimization Studio CPLEX pages 80–85, 2006. User’s Manual, 2011. [Surynek et al., 2016] P. Surynek, A. Felner, R. Stern, and E. Bo- [Luna and Bekris, 2011] R. Luna and K. E. Bekris. Push and Swap: yarski. Efficient SAT approach to multi-agent path finding under Fast cooperative path-finding with completeness guarantees. In the sum of costs objective. In ECAI, pages 810–818, 2016. IJCAI, pages 294–300, 2011. [Surynek, 2015] P. Surynek. Reduced time-expansion graphs and [Ma and Koenig, 2016] H. Ma and S. Koenig. Optimal target as- goal decomposition for solving cooperative path finding sub- signment and path finding for teams of agents. In AAMAS, pages optimally. In IJCAI, pages 1916–1922, 2015. 1144–1152, 2016. [Tovey, 1984] C. A. Tovey. A simplified NP-complete satisfiability [Ma and Koenig, 2017] H. Ma and S. Koenig. AI buzzwords ex- problem. Discrete Applied Mathematics, 8:85–90, 1984. plained: Multi-agent path finding (MAPF). AI Matters, 3(3):15– [Veloso et al., 2015] M. Veloso, J. Biswas, B. Coltin, and S. Rosen- 19, 2017. thal. CoBots: Robust symbiotic autonomous mobile service [Ma et al., 2016a] H. Ma, S. Koenig, N. Ayanian, L. Cohen, robots. In IJCAI, pages 4423–4429, 2015. W. Hönig, T. K. S. Kumar, T. Uras, H. Xu, C. Tovey, and [Wagner and Choset, 2015] G. Wagner and H. Choset. Subdimen- G. Sharon. Overview: Generalizations of multi-agent path find- sional expansion for multirobot path planning. Artificial Intelli- ing to real-world scenarios. In IJCAI-16 Workshop on Multi- gence, 219:1–24, 2015. Agent Path Finding, 2016. [Wang and Botea, 2011] K. Wang and A. Botea. MAPP: A scalable [Ma et al., 2016b] H. Ma, C. Tovey, G. Sharon, T. K. S. Kumar, multi-agent path planning algorithm with tractability and com- and S. Koenig. Multi-agent path finding with payload transfers pleteness guarantees. Journal of Artificial Intelligence Research, and the package-exchange robot-routing problem. In AAAI, pages 42:55–90, 2011. 3166–3173, 2016. [Wurman et al., 2008] P. R. Wurman, R. D’Andrea, and [Ma et al., 2017a] H. Ma, W. Hönig, L. Cohen, T. Uras, H. Xu, M. Mountz. Coordinating hundreds of cooperative, au- T. K. S. Kumar, N. Ayanian, and S. Koenig. Overview: A hi- tonomous vehicles in warehouses. AI Magazine, 29(1):9–20, erarchical framework for plan generation and execution in multi- 2008. robot systems. IEEE Intelligent Systems, 32(6):6–12, 2017. [Yu and LaValle, 2013a] J. Yu and S. M. LaValle. Planning optimal [Ma et al., 2017b] H. Ma, T. K. S. Kumar, and S. Koenig. Multi- paths for multiple robots on graphs. In ICRA, pages 3612–3617, agent path finding with delay probabilities. In AAAI, pages 3605– 2013. 3612, 2017. [Yu and LaValle, 2013b] J. Yu and S. M. LaValle. Structure and [Ma et al., 2017c] H. Ma, J. Li, T. K. S. Kumar, and S. Koenig. intractability of optimal multi-robot path planning on graphs. In Lifelong multi-agent path finding for online pickup and delivery AAAI, pages 1444–1449, 2013. tasks. In AAMAS, pages 837–845, 2017. [Ma et al., 2017d] H. Ma, J. Yang, L. Cohen, T. K. S. Kumar, and S. Koenig. Feasibility study: Moving non-homogeneous teams in

You can also read