ORIGINAL RESEARCH article

Front. Neurorobot., 20 May 2021
Volume 15 - 2021 | https://doi.org/10.3389/fnbot.2021.674949

Dynamic Task Allocation in Multi-Robot System Based on a Team Competition Model

Kai Jin1*, Pingzhong Tang2, Shiteng Chen3, Jianqing Peng4
  • 1Department of Computer Science and Engineering, The Hong Kong University of Science and Technology (HKUST), Hong Kong, China
  • 2Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
  • 3Institute of Software, Chinese Academy of Sciences, Beijing, China
  • 4School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen, China

In recent years, there has been a trend toward integrating ideas from game theory into research on multi-robot systems. In this paper, a team-competition model is proposed to solve a dynamic multi-robot task allocation problem. The allocation problem, which arises in many real-life applications, asks how to assign tasks to robots so that the most suitable robot is selected to execute each task. Specifically, we study multi-round team competitions between two teams, where each team simultaneously selects one of its players in each round and each player can play at most once; this defines an extensive-form game with perfect recall. We also study a common variant where one team always selects its player before the other team in each round. Regarding the robots as the players of the first team and the tasks as the players of the second team, the sub-game perfect strategy of the first team, computed by solving the team competition, gives a solution for allocating the tasks to the robots: it specifies how to select the robot (according to some probability distribution if the two teams move simultaneously) to execute the upcoming task in each round, based on the results of the matches in the previous rounds. We prove many properties of the sub-game perfect equilibria of the team competition game. We first show that the uniformly random strategy is a sub-game perfect equilibrium strategy for both teams when there are no redundant players. Second, a team can safely abandon its weak players if it has redundant players and the strength of players is transitive. We then focus on the more interesting case where there are redundant players and the strength of players is not transitive. In this case, we obtain several counterintuitive results. For example, a player might help improve the payoff of its team even if it is dominated by the entire other team. We also study the extent to which dominated players can increase the payoff. Very similar results hold for the aforementioned variant where the two teams take actions in turn.

1. Introduction

In the past two decades, intelligent multi-robot systems have been increasingly used in industrial manufacturing, agriculture, hospitals, fire rescue, cargo handling, entertainment, and many other settings. The efficiency of these systems is crucial to their applications and depends heavily on the collaboration between robots. One of the primary problems facing the designer of a multi-robot system is how to assign tasks to robots so that the most suitable robot is selected to execute each task. This is usually referred to as the task allocation problem, and it arises in all kinds of real-life applications.

It is well-known that game theory lays the mathematical foundation for research on collaboration in multi-robot systems, and there is a trend toward integrating the ideas and theoretical results of game theory into this research. For example, market-based approaches to task allocation are proposed in Botelho and Alami (1999), Gerkey and Mataric (2002), Wang et al. (2004), Dias et al. (2006), Zlot and Stentz (2006), and Wu and Shang (2020). In this paper, a team-competition model, which is of independent interest in game theory, is proposed to solve a Dynamic Multi-Robot Task Allocation (DMRTA) problem.

In our DMRTA problem, there are m robots and n pre-described tasks, and the tasks arrive in T rounds, where T ≤ min{m, n}. One task arrives in every round and must be assigned to one robot immediately. Note that when T < n, only T of the n tasks will be assigned (as there are only T rounds), which means that the set of tasks to be assigned is not fully determined at the beginning. For simplicity, assume that each pre-described task arrives at most once (this constraint can easily be removed by replicating the task) and that each robot can take at most one task (likewise, this constraint can be removed by replicating the robot). Different robots perform differently on different tasks, and to describe this diversity we make the following assumption: if the robot with index r (i.e., robot r) is to execute the task with index s (i.e., task s), there is a probability p_{r,s} that robot r completes its job successfully and a probability 1 − p_{r,s} that it fails. The m × n probability matrix (p_{r,s}), 1 ≤ r ≤ m, 1 ≤ s ≤ n, is prior information: it is given to us before the allocation mission starts. Roughly, our objective is to make the number of successful robots, or equivalently the number of tasks solved successfully, as high as possible. Details of the DMRTA problem will be elaborated right after we introduce our team competition model in what follows.1

We investigate a type of team competition in which two teams, each with a number of players, compete against each other. The competition proceeds in a fixed number of rounds. In each round, each team simultaneously sends out a player to a match. (We also consider a variant where one team, say Team 2 without loss of generality, always takes its action before the other team in each round. This variant is mainly discussed in section 3.) The result of the match is then revealed according to a probabilistic strength matrix between players. The selected players cannot compete in subsequent rounds. The competition proceeds to the next round if there is one, and terminates otherwise. The format of the competition and the strength matrix are common knowledge to both teams. The final payoff of each team is the number of matches it wins. To make the model more general, we also investigate another commonly seen form where each team gets payoff 1 if it wins strictly more matches than the other team, 0 if the teams tie, and −1 if it wins fewer matches. Clearly, this competition between two teams defines a standard extensive-form game, or more precisely, a stacked matrix game (Lanctot et al., 2014). We are interested in the sub-game perfect equilibria of the game, i.e., strategy profiles that specify for each team which player to play in each round. A formal description of our team competition model is given in the next section.

Our team competition model is first motivated by the Chinese horse race story described in Tang et al. (2009) (see also Wikipedia, 2015b). It represents one of the most popular forms of horse races, where each team ranks its horses to match sequentially. Moreover, the Swaythling Cup, also known as the World Table Tennis Championships, follows the same model described in our paper: each team adaptively selects a ranking of three players and brings two additional substitutes. In fact, this has been one of the most popular formats of team competition in table tennis. In addition, the card game Goofspiel (Lanctot et al., 2014; Wikipedia, 2015a) falls into nearly the same model. Last but not least, many military engagements (such as fights between two groups of drones) may also have this type of structure.

We can regard the m robots as the players of the first team and the n tasks as the players of the second team, so that the probability matrix (p_{r,s}), 1 ≤ r ≤ m, 1 ≤ s ≤ n, serves as the probabilistic strength matrix in the team competition. Then, the (sub-game perfect) strategy of the first team obtained by solving the team competition gives us a solution for allocating the tasks to the robots: it specifies how to select the robot (randomly, according to some probability distribution given by the strategy) to execute the upcoming task in each round, based on the results of the matches in the previous rounds.

At this point, it is necessary to point out a feature that our DMRTA problem inherits from the team competition model: since the two teams send out their players simultaneously in each round, in our DMRTA problem we must select the robot before the task of the corresponding round is revealed to us. Nevertheless, to handle the case where we can select the robot after the task of the corresponding round is revealed, we only need to deal with the variant where Team 2 acts before Team 1. Fortunately, we will see in section 3 that many results for this non-simultaneous variant are aligned with the results for the original simultaneous case (in particular, most of the results are the same; see Table 1 for a comparison). We discuss the simultaneous case first because it is typical and more difficult.

Table 1. Summary of the results for the simultaneous and non-simultaneous cases.

Besides the aforementioned application to the DMRTA problem, our team competition model is interesting in its own right in game theory. We are particularly interested in the situation where at least one team has more players than the number of rounds in the competition. As a result, some players will never have a chance to participate in any match. A main agenda of this paper is to understand to what extent the presence of additional players can affect the payoff of both teams. In particular, we ask the following questions: (1) Can the presence of an additional weakest teammate, i.e., a teammate whose row in the strength matrix is strictly dominated by every other row, help increase the payoff of the team? (2) Can the presence of an additional dominated teammate, i.e., a teammate that always loses to every player in the opposing team, help increase the payoff of the team? It might appear intuitive that the answers to both questions are negative. For the first question, it seems that the weakest teammate will never have a chance to participate in any match, since one can always replace it by a better teammate and increase the payoff. For the second question, the answer might seem even more obvious, since the dominated teammate loses every match and thus should be replaced by a better teammate. To our surprise, we find that the answers to both questions are affirmative.

Our contributions to the team competition model are summarized as follows. We first show that the uniformly random strategy is a sub-game perfect equilibrium strategy for both teams when there are no redundant players (i.e., the number of players in each team equals the number of rounds). The uniformly random strategy always picks an unmatched player uniformly at random in each round. Then, we consider the general case where at least one team has redundant players. We first study the case where the strength of players is transitive (see Definition 2), which means that the players can be arranged in a queue so that each of them is weaker than its successor. We prove that a team can safely abandon its weak players if it has redundant players and the strength of players is transitive. Therefore, this case reduces to the case where there are no redundant players. Finally, we focus on the case where there are redundant players and the strength of players is not transitive. In this case, we obtain a number of counterintuitive results. Most importantly, a player might help improve its team's payoff even if it is dominated by the entire opposing team. We give a necessary condition for a dominated player to be useful, which in turn suggests that a particular utility function (named UE below) is more reasonable in team competition. Our results imply that a team can increase its utility by recruiting additional dominated players. We further show that the optimal number of dominated players to recruit can scale with the number of rounds. More precisely, this number can be Θ(T) if there are T rounds. Last but not least, we study the limitations of dominated players. These results bring insights into playing and designing general team competitions.

Team competition has been studied for years. Tang et al. (2009, 2010) study a team competition setting where the number of players equals the number of rounds and both teams must determine the ordering of players upfront, before the competition starts. They put forward competition rules that are truthful while satisfying other desirable properties. The main difference between their work and ours is that we do not design new mechanisms but study game-theoretical properties of commonly used competition rules. The differences also lie in that the strategies are adaptive in our setting and each team can have more players than there are rounds. The strategic aspects of team competition have also come under the scrutiny of computer scientists due to a recent Olympic scandal in badminton, where several teams deliberately threw matches in order to avoid a strong opponent in the next round. The phenomenon has been discussed in depth in a series of algorithmic game theory blog posts by Kleinberg (2012) and Procaccia (2013). A parallel literature has been concerned with the strategic aspects of tournament seeding (Hwang, 1982; Rosen, 1986; Knuth, 1987; Schwenk, 2000; Altman et al., 2009; Vu et al., 2009). It is well-known that there are cases where, by strategic seeding and structuring, any player can become the winner of a knockout tournament. Various game-theoretical questions, such as player-optimal seeding, the complexity of manipulation, and incentives that guarantee strategyproofness, have been investigated in this literature.

The study of various kinds of multi-robot task allocation problems using game theory dates back to the early 2000s. Botelho and Alami (1999), Gerkey and Mataric (2002), Wang et al. (2004), Dias et al. (2006), Zlot and Stentz (2006), and Wu and Shang (2020) make use of the theory of market economies to determine how robots can negotiate responsibilities in task allocation. In particular, they discuss how to manage bids (robots communicate to bid for tasks according to their expected contribution to the tasks), how to handle bids in parallel, how to handle multiple tasks at once, and so on. Usually, heuristic assignments are made by assigning every task to the robot that can execute it with the highest utility.

2. Materials and Methods

Team 1 has a set of m players {A1, …, Am}. Team 2 has a set of n players {B1, …, Bn}. A competition between Team 1 and Team 2 is a tuple G(T, P, U), where:

1. T is the number of rounds.

2. In each round, each team simultaneously selects one of its players that have not been selected yet.

3. P is a probabilistic matrix that describes the relative strength between players, with P_{i,j} denoting the probability that Ai wins against Bj and 1 − P_{i,j} the probability that Ai loses to Bj.

4. U:[T] → R denotes the utility function of each team. The utility function only depends on the number of rounds t each team wins, i.e., it can be represented by U(t). This also implies that both teams have the same utility functions.

5. The parameters n, m, T, P, U are common knowledge to both teams, and historical plays are perfectly observable. It is assumed that P_{i,j} ∈ [0, 1] for all i, j and that m ≥ T and n ≥ T, so that there are enough players to complete the competition.

The following utility functions UE and UM are two commonly seen ones:

$$U_E(t) = t - \frac{T}{2}, \qquad U_M(t) = \begin{cases} 1 & t > T/2 \\ 0 & t = T/2 \\ -1 & t < T/2 \end{cases} \qquad (1)$$

In other words, UE describes a competition where a team's utility is exactly the number of rounds it wins (minus the constant T/2), while UM describes a competition where a team's utility is determined by whether it wins more rounds than its opponent. Notice that, when U = UE or U = UM, we have U(t) + U(T − t) = 0, hence both utility functions define a zero-sum game. In this paper we always assume that U(t) + U(T − t) = 0.
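The two utility functions and the zero-sum condition can be written down directly. The following is a minimal sketch; the function names `u_e`, `u_m`, and `is_zero_sum` are ours and do not appear in the paper.

```python
from fractions import Fraction


def u_e(t, T):
    """U_E(t) = t - T/2: the number of rounds won, shifted by the constant T/2."""
    return Fraction(2 * t - T, 2)


def u_m(t, T):
    """U_M(t): +1 for winning more than half of the rounds, 0 for a tie, -1 otherwise."""
    return (2 * t > T) - (2 * t < T)


def is_zero_sum(u, T):
    """Check the standing assumption U(t) + U(T - t) = 0 for every t in 0..T."""
    return all(u(t, T) + u(T - t, T) == 0 for t in range(T + 1))


print(is_zero_sum(u_e, 3), is_zero_sum(u_m, 4))
```

Exact rationals are used so that the half-integer values of UE for odd T are represented without rounding.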

2.1. Example: Simultaneous Card Games

The model above formulates the standard team competitions commonly seen in sports, but it is not limited to sports. The following is an instance of a card game that falls into our framework. Suppose that Alice and Bob each have a deck of three cards. In each deck, one card is in suit ♡ and two cards are in suit ♠. They play three rounds; in each round Alice and Bob each select one card and reveal the cards simultaneously. If they select cards of the same suit (both ♡ or both ♠), Alice wins the round; otherwise Bob wins the round. The player who wins two or three rounds gets utility 1; the other player, who wins zero or one round, gets utility −1.

This game can be conveniently represented in our model using the following parameters:

$$m = n = T = 3, \quad P = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 1 & 1 \end{pmatrix}, \quad U = U_M.$$

For this game, by our first theorem, a (sub-game perfect) equilibrium strategy exists for both players, and it is simply to play uniformly at random. It follows that Alice and Bob have utilities −1/3 and +1/3, respectively.
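The value −1/3 can be checked by brute force. The sketch below assumes what Lemma 2 establishes later in the paper, namely that under uniformly random play every perfect matching between the two decks is equally likely, and simply averages UM over the 3! matchings. The helper names are ours.

```python
from fractions import Fraction
from itertools import permutations

# P[i][j] = 1 if Alice's card i and Bob's card j share a suit (card 0 is the heart).
P = [[1, 0, 0],
     [0, 1, 1],
     [0, 1, 1]]
T = 3


def u_m(t):
    """U_M for T = 3: +1 if Alice wins at least two rounds, -1 otherwise."""
    return (2 * t > T) - (2 * t < T)


def alice_expected_utility():
    """Average U_M over all T! equally likely matchings of Alice's and Bob's cards."""
    matchings = list(permutations(range(T)))
    total = Fraction(0)
    for perm in matchings:  # perm[i] is the index of Bob's card facing Alice's card i
        wins = sum(P[i][perm[i]] for i in range(T))
        total += u_m(wins)
    return total / len(matchings)


print(alice_expected_utility())  # -1/3
```

Alice wins all three rounds only when the two hearts meet (2 of the 6 matchings) and wins exactly one round otherwise, which gives 2/6 − 4/6 = −1/3.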

2.2. Extensive-Form Game With Perfect Recall

Any particular instance G(T, P, U) of our team competition is an extensive-form game. In this game, a history can be described by a tuple (k, a, b, c) where:

k indicates the number of rounds that have been played;

a is a k-dimensional vector which stores the players selected by Team 1 in the past k rounds;

b is a k-dimensional vector which stores the players selected by Team 2 in the past k rounds;

c is a k-dimensional 0–1 vector which stores the results of the first k rounds, where 0 corresponds to a loss by Team 1 and 1 corresponds to a win by Team 1.

A behavioral strategy in this game is a mapping from every history to a probability distribution over actions. That is, at each history, the strategy of each team is to pick the next player according to a probability distribution. By Kuhn's Theorem (Kuhn, 1953; Osborne and Rubinstein, 1994), there is a sub-game perfect equilibrium (SPE) in which both teams use behavioral strategies. According to the SPE, a value V(H) can be defined for each history H, which indicates the expected utility that Team 1 would get at the end of the game if play is currently at history H. Note that each history is the root of a sub-game, so the value of a history is the same as the value of that sub-game.

2.3. Computing SPE

By backward induction, one can easily obtain the following lemma.

LEMMA 1. Two histories have the same value if the same players have been selected in both (possibly in different orders) and Team 1 has won the same number of rounds in both.

Based on the above lemma, the histories can be partitioned into equivalence classes, such that each equivalence class corresponds to a four-tuple (k, X, Y, w): k is a number in [T] which denotes the number of past rounds; X is a subset of A of size k; Y is a subset of B of size k; X and Y denote the players that have played; and w is a number in [k] which denotes how many rounds Team 1 has won so far.

In the following, we show in detail how to compute the value of each equivalence class via dynamic programming.

Let V[k, X, Y, w] denote the expected utility of Team 1 when the history belongs to class (k, X, Y, w).

Clearly, we have V[k, X, Y, w] = U(w) when k = T.

When k < T, computing V[k, X, Y, w] reduces to computing the value of the matrix game M(k, X, Y, w) where the matrix M(k, X, Y, w) is defined as follows:

It consists of m − k rows and n − k columns. Each row corresponds to a player in A − X, and each column corresponds to a player in B − Y. The cell corresponding to (Ai, Bj) equals the expected utility of Team 1 when Team 1 and Team 2 take actions Ai and Bj, respectively, at the current state (k, X, Y, w), which equals

$$V[k+1, X+\{A_i\}, Y+\{B_j\}, w+1] \cdot P_{i,j} + V[k+1, X+\{A_i\}, Y+\{B_j\}, w] \cdot (1 - P_{i,j}).$$

The reason behind the above definition of M(k, X, Y, w) is as follows. If the two teams select Ai, Bj in this round, Team 1 wins this round with probability P_{i,j}, in which case the history becomes (k + 1, X + {Ai}, Y + {Bj}, w + 1); and Team 1 loses this round with probability 1 − P_{i,j}, in which case the history becomes (k + 1, X + {Ai}, Y + {Bj}, w).

We can compute the values of all equivalence classes of histories according to the above induction. In fact, by computing these values, we also find a sub-game perfect behavioral strategy for both teams. To see this, suppose that (k, X, Y, w) is a non-terminal equivalence class of histories. By solving the matrix game M(k, X, Y, w), we find the strategies for all the histories in the class (k, X, Y, w).
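The recursion above can be sketched in a few lines for very small instances. The paper does not say how the matrix games are solved; the sketch below swaps in a brute-force solver that enumerates square supports and solves each linear system exactly with rationals, in place of a linear-programming routine, so it is only practical for a handful of players. All function names (`solve_linear`, `game_value`, `spe_value`, `u_e`, `u_m`) are ours, and the entries of `P` are assumed to be integers or `Fraction` objects.

```python
from fractions import Fraction
from functools import lru_cache
from itertools import combinations


def solve_linear(aug):
    """Gauss-Jordan elimination on an augmented matrix of Fractions.
    Returns the unique solution, or None if the system is singular."""
    n = len(aug)
    aug = [row[:] for row in aug]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = 1 / aug[col][col]
        aug[col] = [a * inv for a in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def game_value(M):
    """Value of the zero-sum matrix game M for the (maximizing) row player."""
    rows, cols = range(len(M)), range(len(M[0]))
    best = None
    for s in range(1, min(len(M), len(M[0])) + 1):
        for R in combinations(rows, s):
            for C in combinations(cols, s):
                # Unknowns: x_i (i in R) and v. The columns in C are made tight.
                aug = [[Fraction(M[i][j]) for i in R] + [Fraction(-1), Fraction(0)]
                       for j in C]
                aug.append([Fraction(1)] * s + [Fraction(0), Fraction(1)])
                sol = solve_linear(aug)
                if sol is None or any(x < 0 for x in sol[:s]):
                    continue
                x, v = sol[:s], sol[s]
                # Keep v only if x guarantees at least v against every column.
                if all(sum(xi * M[i][j] for xi, i in zip(x, R)) >= v for j in cols):
                    best = v if best is None else max(best, v)
    return best


def spe_value(T, P, U):
    """V[0, {}, {}, 0]: the value of G(T, P, U) for Team 1, by backward induction."""
    m, n = len(P), len(P[0])

    @lru_cache(maxsize=None)
    def V(X, Y, w):
        if len(X) == T:                      # terminal class: k = T
            return Fraction(U(w))
        M = [[P[i][j] * V(X | {i}, Y | {j}, w + 1)
              + (1 - P[i][j]) * V(X | {i}, Y | {j}, w)
              for j in range(n) if j not in Y]
             for i in range(m) if i not in X]
        return game_value(M)

    return V(frozenset(), frozenset(), 0)


def u_e(T):
    return lambda t: Fraction(2 * t - T, 2)


def u_m(T):
    return lambda t: (2 * t > T) - (2 * t < T)


# The instance of Example 1 (m = T = 2, n = 3): prints -1/3.
print(spe_value(2, [[0, 0, 1], [1, 1, 0]], u_e(2)))
```

Memoizing on (X, Y, w) rather than on full histories is exactly the saving that Lemma 1 permits. The enumeration in `game_value` is exponential in the matrix size; a linear-programming solver would be the natural replacement for larger instances.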

2.4. Uniformly Random Strategies

The next theorem states that, if there are no redundant players, the uniformly random strategy is an equilibrium strategy for both teams. It holds for an arbitrary utility function, including UE and UM.

DEFINITION 1. The uniformly random strategy is a behavioral strategy, in which a team always selects from the remaining players uniformly at random in each round.

THEOREM 1. When neither team has redundant players (i.e., n = m = T), it is an SPE for both teams to apply the uniformly random strategy.2

We apply the following lemma for proving Theorem 1.

LEMMA 2. Suppose that there are no redundant players. Let 𝕊 denote the set of all perfect matchings between {A1, …, AT} and {B1, …, BT}. If Team 1 or Team 2 applies the uniformly random strategy, then the probability that the competition ends with any fixed matching in 𝕊 is exactly 1/(T!).

PROOF OF LEMMA 2: We only prove that the statement holds when Team 1 applies the uniformly random strategy. Symmetrically, the statement holds when Team 2 applies the uniformly random strategy.

First, suppose that Team 1 applies the uniformly random strategy while Team 2 applies an arbitrary pure strategy.3 In this case, we claim that the probability that the competition ends with any fixed matching is exactly 1/(T!). This can be proved by induction on the number τ of remaining rounds. In the stage with τ remaining rounds, let M be any fixed matching between the τ unused players in Team 1 and the τ unused players in Team 2. In the next round, some edge e of M is chosen with probability 1/τ, because the player Bi selected by Team 2 is matched with a uniformly random player among the unused players in Team 1. If this occurs, denote by M′ = M − {e} the matching obtained by deleting e from M, which is chosen with probability 1/((τ − 1)!) by the induction hypothesis. Thus, M is chosen with probability (1/τ) · 1/((τ − 1)!) = 1/(τ!).

Finally, since a mixed strategy is a convex combination of pure strategies, the statement also holds when Team 2 applies a mixed strategy.
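Lemma 2 can be checked exactly on small instances. The sketch below enumerates Team 1's uniformly random choices against a deterministic Team 2 rule and accumulates the probability of each final matching. As a simplification of ours, the rule sees only the previously played pairs, not the match results; `adaptive_rule` is an arbitrary hypothetical example of such a rule.

```python
from collections import defaultdict
from fractions import Fraction


def matching_distribution(T, team2_pick):
    """Exact distribution over final matchings when Team 1 plays uniformly at random.

    team2_pick(history, remaining_b) returns Team 2's next player; history is the
    tuple of (a, b) pairs already played. Team 2 does not see Team 1's current pick.
    """
    dist = defaultdict(Fraction)

    def play(history, rem_a, rem_b, prob):
        if not rem_a:
            dist[frozenset(history)] += prob
            return
        b = team2_pick(history, rem_b)
        for a in rem_a:  # each unused player of Team 1 has probability 1/|rem_a|
            play(history + ((a, b),), rem_a - {a}, rem_b - {b}, prob / len(rem_a))

    players = frozenset(range(T))
    play((), players, players, Fraction(1))
    return dict(dist)


def adaptive_rule(history, rem_b):
    """A history-dependent pure strategy: react to Team 1's most recent pick."""
    ordered = sorted(rem_b)
    return ordered[history[-1][0] % len(ordered)] if history else ordered[-1]


dist = matching_distribution(3, adaptive_rule)
print(len(dist), set(dist.values()))
```

For T = 3 the output lists six matchings, each with probability 1/6, in line with the lemma. Swapping in any other deterministic rule for `adaptive_rule` should leave the distribution unchanged.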

PROOF OF THEOREM 1: Let 𝕊 be the same set as in Lemma 2. As there are no redundant players, a game will always end with some matching in 𝕊. For any matching s ∈ 𝕊, let Zs denote the event that the game ends with this matching. If Team 1 applies the uniformly random strategy, it has expected utility

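$$\sum_{s \in \mathbb{S}} E(\text{the utility of Team 1} \mid Z_s) \cdot \Pr\nolimits_{\text{Team 1 uniformly random}}(Z_s) = \sum_{s \in \mathbb{S}} E(\text{the utility of Team 1} \mid Z_s) / (T!)$$

The second equality follows from Lemma 2, which states that Pr_{Team 1 uniformly random}(Zs) = 1/(T!).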

Similarly, if Team 2 applies the uniformly random strategy, it will get

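$$\sum_{s \in \mathbb{S}} E(\text{the utility of Team 2} \mid Z_s) / (T!) = \sum_{s \in \mathbb{S}} -E(\text{the utility of Team 1} \mid Z_s) / (T!)$$

In both cases the expected utility does not depend on the opponent's strategy, so neither team can gain by deviating unilaterally. Therefore, it is a Nash equilibrium for both teams to apply the uniformly random strategy. The argument can be similarly extended to show that it is an SPE.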

In the remainder of this paper, we focus on the case where there are redundant players.

CLAIM 1. If there are redundant players, then the uniformly random strategy may not be a SPE strategy (For any team, a SPE strategy for this team requires that it is optimal in each subgame).

This claim is obvious for a team with redundant players; but less obvious for a team without redundant players. Here we give an example in which the uniformly random strategy is not a SPE strategy for a team with no redundant players.

EXAMPLE 1. Let m = T = 2, n = 3, U = UE = UM (UE = UM when T = 2), and

$$P = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}.$$

According to the method given in subsection 2.3, we can compute that

$$M(0, \emptyset, \emptyset, 0) = \begin{pmatrix} -1 & -1 & 1 \\ 0 & 0 & -1 \end{pmatrix}.$$

Therefore, in the SPE, the behavior of Team 1 at the initial state should be to select A1 with probability 1/3 and A2 with probability 2/3. This guarantees an expected utility of −1/3. If, on the contrary, Team 1 adopts the uniformly random strategy, it selects each of A1, A2 with probability 1/2, which only guarantees an expected utility of −1/2.
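The two guarantees in Example 1 can be reproduced with a short calculation. The sketch below is specific to this instance: it relies on P being 0/1-valued and on Team 1 having exactly one player left in the second round, so that Team 2 simply picks the worse remaining opponent for it. The names `root_matrix` and `guaranteed` are ours.

```python
from fractions import Fraction

# Example 1: P[i][j] is the probability that A_{i+1} beats B_{j+1}.
P = [[0, 0, 1],
     [1, 1, 0]]


def U(t):
    """U_E = U_M for T = 2: utility t - 1 for t rounds won."""
    return t - 1


def root_matrix(P):
    """Payoff matrix at the initial state: rows are Team 1's first pick, columns Team 2's."""
    M = []
    for i in range(2):
        other = 1 - i  # Team 1's only remaining player in round 2
        row = []
        for j in range(3):
            first = P[i][j]  # 0 or 1, so the first-round result is deterministic
            # Team 2 then picks the remaining player that minimizes Team 1's utility.
            row.append(min(U(first + P[other][j2]) for j2 in range(3) if j2 != j))
        M.append(row)
    return M


def guaranteed(M, p):
    """Worst-case expected utility when Team 1 plays A1 with probability p."""
    return min(p * M[0][j] + (1 - p) * M[1][j] for j in range(3))


M = root_matrix(P)
print(M)                               # [[-1, -1, 1], [0, 0, -1]]
print(guaranteed(M, Fraction(1, 3)))   # -1/3
print(guaranteed(M, Fraction(1, 2)))   # -1/2
```

With p = 1/3 all three columns yield −1/3, which is why Team 2 has no profitable response; with p = 1/2 the first two columns drop to −1/2.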

2.5. Transitive Strength

DEFINITION 2. Player Ai is weaker than its teammate Aj, denoted by Ai ≤ Aj, if for any opponent Bk, the probability of “Ai wins against Bk” is less than or equal to the probability of “Aj wins against Bk.” The definition for Team 2 is analogous. The players {A1, …, Am} of Team 1 are transitive if there is a permutation π of 1, …, m such that Aπ(1) ≤ … ≤ Aπ(m). The definition for Team 2 is analogous.

DEFINITION 3. A utility function U is monotone if U(t + 1) ≥ U(t) for all t ∈ {0, …, T − 1}.

THEOREM 2. Assume a monotone utility function. Then, (1) if Am ≤ … ≤ A1, Team 1 has an SPE strategy which only selects, in each round, one of the players in A1, …, AT. (2) Symmetrically, if Bn ≤ … ≤ B1, Team 2 has an SPE strategy which only selects, in each round, one of the players in B1, …, BT.

By combining Theorem 1 and Theorem 2, we immediately get the following corollary.

COROLLARY 1. When the players in each team are transitive and U is monotone, there is a simple SPE strategy for both teams as follows. Assume that Am ≤ … ≤ A1 and Bn ≤ … ≤ B1. Then, an SPE strategy for Team 1 is to select an unused player in A1, …, AT uniformly at random in each round; an SPE strategy for Team 2 is to select an unused player in B1, …, BT uniformly at random in each round.

Theorem 2 and Corollary 1 have many applications. In the real world, the utility function is typically monotone, and in many situations, such as board games or sports, it is indeed the case that the players are transitive.
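Definition 2 and the player selection in Corollary 1 can be sketched as follows for Team 1, whose players are the rows of P. The function names are ours. For Team 2 the same check would be applied to the rows of the transposed matrix with entries 1 − P_{i,j}, which is our reading of the symmetric definition rather than something the text spells out.

```python
def weaker_or_equal(P, i, j):
    """True if A_i <= A_j in the sense of Definition 2 (row i never beats row j)."""
    return all(a <= b for a, b in zip(P[i], P[j]))


def transitive_order(P):
    """Row indices from strongest to weakest if Team 1 is transitive, else None."""
    m = len(P)
    # In a chain, a stronger player weakly dominates strictly more teammates.
    score = [sum(weaker_or_equal(P, j, i) for j in range(m)) for i in range(m)]
    order = sorted(range(m), key=lambda i: score[i], reverse=True)
    if all(weaker_or_equal(P, w, s) for s, w in zip(order, order[1:])):
        return order
    return None


def top_players(P, T):
    """The T strongest players, over which the strategy of Corollary 1 randomizes."""
    order = transitive_order(P)
    return None if order is None else order[:T]


# A hypothetical transitive team: player 0 is strongest, then player 2, then player 1.
print(top_players([[0.9, 0.8], [0.5, 0.4], [0.7, 0.6]], 2))  # [0, 2]
```

When `transitive_order` returns None, Theorem 2 does not apply and the weakest players cannot be discarded on this basis alone.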

We prove Theorem 2 (1) next; claim (2) is symmetric.

We first introduce two pieces of terminology that are necessary for understanding the subsequent proof. Suppose that Am ≤ … ≤ A1 and that A′ is a subset of A with A′ = (A_{i[1]}, …, A_{i[|A′|]}), where i[1] < … < i[|A′|]. Then, for any 0 ≤ C ≤ |A′|, the top C players of A′ refers to {A_{i[1]}, …, A_{i[C]}}, and the rank C player of A′ refers to A_{i[C]}.

LEMMA 3. Suppose that U() is monotone.

1. Consider a pair of history classes H1 = (k, X1, Y, w1) and H2 = (k, X2, Y, w2). We claim that, if the top T − k players of A − X1 and the top T − k players of A − X2 are the same and w1 ≥ w2, then V(H1) ≥ V(H2).

2. Let H = (k, X, Y, w) be a non-terminal history class. Let Au be the rank T − k player in A − X and Av be any player in A − X that is not a top T − k player. Then, the row in M(k, X, Y, w) that corresponds to Au dominates the row that corresponds to Av. As a result, there is an equilibrium strategy at history H (for Team 1) which only selects the top T − k unmatched players to play.

PROOF. We prove it by backward induction. When k = T, Claim 1 holds according to the monotone property of U(); and Claim 2 naturally holds since it is a terminal history.

Now, we argue that, for 0 ≤ k < T, if the lemma holds for k + 1, it also holds for k.

First, we prove Claim 2. Let us compare the two rows corresponding to Au and Av. Let us fix a column, say the one corresponding to Br. The cell corresponding to (Au, Br) is

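$$M[u,r] = \underbrace{V(k+1, X+\{A_u\}, Y+\{B_r\}, w+1)}_{a} \cdot P_{u,r} + \underbrace{V(k+1, X+\{A_u\}, Y+\{B_r\}, w)}_{b} \cdot (1 - P_{u,r})$$

The cell corresponding to (Av, Br) is

$$M[v,r] = \underbrace{V(k+1, X+\{A_v\}, Y+\{B_r\}, w+1)}_{a'} \cdot P_{v,r} + \underbrace{V(k+1, X+\{A_v\}, Y+\{B_r\}, w)}_{b'} \cdot (1 - P_{v,r})$$

Notice that the top T − k − 1 players in A − X − {Au} and A − X − {Av} are the same. So, from the induction hypothesis (Claim 1 for k + 1), a ≥ a′, a′ ≥ a, b ≥ b′, b′ ≥ b, and a ≥ b, i.e., a = a′ ≥ b = b′.

Since Au is the rank T − k player while Av is not among the top T − k players, player Av is weaker than Au, which means that P_{u,r} ≥ P_{v,r}.

Combining the above arguments, we get that

$$M[u,r] - M[v,r] = (a - b) \cdot (P_{u,r} - P_{v,r}) \geq 0.$$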

Therefore, M[u, r] ≥ M[v, r], and thus Claim 2 holds.

Then, we prove Claim 1. Let M1 denote M(k, X1, Y, w1) and M2 denote M(k, X2, Y, w2) for short. Suppose that Au is a top T − k player in A − X1 (which is also a top T − k player in A − X2) and that Br is any player in B − Y.

We know

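$$M_1[u,r] = V(k+1, X_1+\{A_u\}, Y+\{B_r\}, w_1+1) \cdot P_{u,r} + V(k+1, X_1+\{A_u\}, Y+\{B_r\}, w_1) \cdot (1 - P_{u,r})$$

$$M_2[u,r] = V(k+1, X_2+\{A_u\}, Y+\{B_r\}, w_2+1) \cdot P_{u,r} + V(k+1, X_2+\{A_u\}, Y+\{B_r\}, w_2) \cdot (1 - P_{u,r})$$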

By induction hypothesis, it follows that M1[u, r] ≥ M2[u, r].

Now, let σ denote the equilibrium strategy at H2 that only selects the top T − k unmatched players to play (such a strategy exists according to Claim 2). Note that σ is also a legal strategy at H1. Let μ(H1, σ) and μ(H2, σ) denote the utility of Team 1 when it applies strategy σ at H1 and H2, respectively. Then,

$$\mu(H_1, \sigma) = \min_{r:\, B_r \notin Y} \sum_{u:\, A_u \in A - X_1} \sigma(A_u) \cdot M_1(u,r), \qquad \mu(H_2, \sigma) = \min_{r:\, B_r \notin Y} \sum_{u:\, A_u \in A - X_2} \sigma(A_u) \cdot M_2(u,r).$$

From the inequality M1(u, r) ≥ M2(u, r), we get μ(H1, σ) ≥ μ(H2, σ). Moreover, we also have V(H1) ≥ μ(H1, σ) and V(H2) = μ(H2, σ) (the equality holds because σ is the equilibrium strategy at H2). Together, V(H1) ≥ V(H2).

Finally, Claim 2 of Lemma 3 implies Theorem 2 (1).

2.6. Non-transitive Strength

DEFINITION 4. A player is said to be weakest, if it is weaker than all its teammates; and is said to be dominated, if it has 0 probability to win against any player in the opponent team.

Assume that the utility function is monotone. In the previous section, we showed that if there are redundant players in Team 1 and the strength of the players of Team 1 is transitive, then there is an SPE strategy for Team 1 which does not select the weakest player. In other words, Team 1 can abandon the weakest player without decreasing its utility. In this section, we show that transitivity is essential for this to hold. We start with the following claim.

CLAIM 2. Suppose that Team 1 has redundant players, and some player Au in Team 1 is weaker than all its teammates, and yet the players in Team 1 are not transitive. Then, Team 1 might decrease its utility by abandoning Au.

This is somewhat counterintuitive; one might expect that the weakest player has no chance to participate in any match, since one can always replace it by a better teammate and increase utility.

Perhaps even more surprisingly, we have the following claim:

CLAIM 3. Suppose that Team 1 has redundant players, and some player Au in Team 1 is dominated by the other team (i.e., has no chance to win at all), and the players in Team 1 are not transitive. Then, Team 1 might decrease its utility by abandoning the dominated player Au.

The above claims confirm that the weakest player, or even a dominated player, could help its team.

The remainder of this section is organized as follows. In subsection 2.6.1, we give examples that verify Claim 3, and we briefly explain why dominated players can be needed. In subsection 2.6.2, we identify a special case where the weakest player can be abandoned without changing the utility. In subsection 2.6.3, we consider the optimal number of dominated players that may be needed to achieve maximum utility. In subsection 2.6.4, we discuss the limitations of dominated players.

2.6.1. Dominated Teammates Can Be Helpful

Let V(T, P, U) denote the value of game G(T, P, U). Let P* denote the sub-matrix of P obtained by deleting the last row (thus G(T, P*, U) is the game where Team 1 has abandoned Am).

EXAMPLE 2. Let n = m = 3, T = 2, U = UE (recall that UE(t) = t − T/2), and

$$P = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$

In Example 2, there are redundant players and the players in each team are not transitive. Besides, A3 is a dominated player. We argue the following: (I) If A3 is abandoned, Team 2 can win both rounds and hence V(T, P*, U) = −1. (II) If A3 is in the team, Team 2 cannot win both rounds with certainty, which means V(T, P, U) > −1. Combining (I) and (II), we get V(T, P, U) > V(T, P*, U), which implies Claim 3.

PROOF OF (I): If A3 is abandoned, Team 2 can play as follows. It chooses B3 to win the first round. If B3 defeated A1, it chooses B1 in the second round to beat A2; otherwise, it chooses B2 in the second round to beat A1.

PROOF OF (II): If Team 2 wants to win with certainty in both rounds, it must select B3 to play the first round. However, if Team 1 selects the dominated player A3 to play the first round, Team 2 cannot win the second round with certainty anymore.
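The two arguments above can be checked mechanically. The following Python sketch is only an illustration: it encodes P from Example 2 with 0-based indices (an indexing convention assumed here), replays the strategy of Team 2 from the proof of (I) against every order in which Team 1 may field A1 and A2, and then checks for (II) that, once A3 has met B3, neither B1 nor B2 beats both A1 and A2. The helper name `team2_reply` is hypothetical.

```python
from itertools import permutations

# P[i][j] = probability that A_{i+1} beats B_{j+1} (Example 2, 0-based indices).
P = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 0]]


def team2_reply(first_a):
    """Second-round pick from the proof of (I): B1 if B3 met A1, otherwise B2."""
    return 0 if first_a == 0 else 1


# (I) Without A3, Team 2 opens with B3 and then replies adaptively.
outcomes = []
for first_a, second_a in permutations([0, 1], 2):
    wins_round1 = P[first_a][2] == 0          # B3 beats whoever Team 1 fields
    wins_round2 = P[second_a][team2_reply(first_a)] == 0
    outcomes.append(wins_round1 and wins_round2)
team2_sweeps_without_a3 = all(outcomes)

# (II) With A3 in the team, suppose A3 meets B3 in round 1.
# Team 2 would then need one of B1, B2 that surely beats both A1 and A2.
safe_second_picks = [j for j in (0, 1) if all(P[i][j] == 0 for i in (0, 1))]

print(team2_sweeps_without_a3, safe_second_picks)
```

The empty list of safe second-round picks is exactly why Team 2 can no longer guarantee a sweep once the dominated player is available.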

From this example, we see why a dominated player might be helpful for its team. The underlying reason is similar to the horse race story described at the beginning of Tang et al. (2010).

Next, we give one more example. To the best of our knowledge, it gives the largest decrease of the value of the game caused by abandoning a dominated player.

EXAMPLE 3. Let m = 4, n = T = 3. Let U = UE or U = UM. Let

$$P = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}.$$

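According to the method shown in subsection 2.3, we can compute the following values (see footnote 4):

$$V(T, P, U_M) = 0; \quad V(T, P^{*}, U_M) = -\tfrac{2}{3}; \quad V(T, P, U_E) = -\tfrac{1}{2}; \quad V(T, P^{*}, U_E) = -\tfrac{1}{2}.$$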

So, for the game G(T, P, UM), we lose as much as 2/3 in utility if we abandon the dominated player.
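The two values for P* can be reproduced by brute force. The sketch below assumes, as Theorem 1 states for games without redundant players, that both teams play uniformly at random, so that the value is the average utility over all perfect matchings. It takes U_E(t) = t − T/2 and, as an assumption made here, a majority rule for U_M (1 if Team 1 wins more than half of the rounds, −1 if fewer, 0 on a tie). The function names are hypothetical, and the code handles 0/1 matrices only.

```python
from fractions import Fraction
from itertools import permutations


def u_e(wins, T):
    """Utility U_E(t) = t - T/2."""
    return Fraction(wins) - Fraction(T, 2)


def u_m(wins, T):
    """Assumed majority utility: 1 for a majority of wins, -1 for a minority, 0 on a tie."""
    if 2 * wins > T:
        return Fraction(1)
    if 2 * wins < T:
        return Fraction(-1)
    return Fraction(0)


def value_no_redundant(P, utility):
    """Average utility over all perfect matchings of a square 0/1 matrix P."""
    T = len(P)
    total, count = Fraction(0), 0
    for matching in permutations(range(T)):
        wins = sum(P[i][matching[i]] for i in range(T))
        total += utility(wins, T)
        count += 1
    return total / count


# P* of Example 3: the 3x3 identity left after abandoning A4.
P_star = [[1, 0, 0],
          [0, 1, 0],
          [0, 0, 1]]

value_ue = value_no_redundant(P_star, u_e)
value_um = value_no_redundant(P_star, u_m)
print(value_ue, value_um)  # -1/2 -2/3
```

Among the six matchings, only the identity gives Team 1 a majority of wins, which is where the value −2/3 under U_M comes from.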

In the following, we explicitly state a SPE strategy for Team 1. In the first round it selects the dominated player A4. Without loss of generality, assume that it loses to B1. In the second round, it selects A2 or A3 uniformly at random. So, there is a 1/2 chance that Team 1 wins this round. Furthermore, if Team 1 wins the second round (say A2 beats B2), it can also win the next one (by letting A3 beat B3) and thus gets utility 1. By this strategy, there is a 1/2 chance to get utility 1 and a 1/2 chance to get utility −1, so the expected utility is 0.

However, for the game G(T, P, UE), we do not lose any utility by abandoning the dominated player. This is not a coincidence. In fact, this example belongs to a special case where the weakest player can indeed be abandoned. We show this in the next theorem.

2.6.2. A Case Where the Weakest Player Can Be Abandoned

We have seen that the weakest redundant player is useless when the players are transitive (as proved in Theorem 2) but might be useful when the players are not transitive (as shown in the previous subsection). So, the next question is:

QUESTION 1. If the players are not transitive, in what cases can the weakest player be abandoned?

We get the following result.

THEOREM 3. Suppose that Team 1 has redundant players but Team 2 does not, i.e., m > T and n = T. Moreover, suppose that each player in AT+1, …, Am is weaker than each player in A1, …, AT. Let U = UE. Then, Team 1 can abandon all the players in AT+1, …, Am without losing any utility.

REMARK 1. According to Example 3, the claim in Theorem 3 fails when U = UM. As a comparison, by recruiting extra dominated players, a team can gain more utility when U = UM, but cannot when U = UE. This may suggest that UE is more reasonable than UM in team competition.

The condition m > n = T is important. If both teams have redundant players, the claim in Theorem 3 fails.

We need the following lemma in proving Theorem 3. It is a technical statement of probability theory.

LEMMA 4. Assume that n = T and Ai, Bj are any pair of players from the two teams. Let $Q^{\sigma}_{i,j}$ denote the probability that Ai meets Bj in the game when Team 2 applies the uniformly random strategy and Team 1 applies some strategy σ. Then, $Q^{\sigma}_{i,j} \le 1/T$.

PROOF. We prove it by induction on n. The case n = 1 is trivial. Suppose that the lemma holds for n − 1; we now argue that it also holds for n. Assume that, by applying σ, Team 1 selects Ai in the first round with probability p. Then, the probability that Ai meets Bj in the game is at most $p \cdot \frac{1}{n} + (1-p)\left(1-\frac{1}{n}\right) \cdot \frac{1}{n-1}$ (the term $\frac{1}{n-1}$ is due to the induction hypothesis). Therefore, $Q^{\sigma}_{i,j} \le p \cdot \frac{1}{n} + (1-p) \cdot \frac{1}{n} = \frac{1}{n} = \frac{1}{T}$.

PROOF OF THEOREM 3: We call AT+1, …, Am the weak players. When the weak players are abandoned, there are T remaining players for each team. By Theorem 1, the uniformly random strategy is a SPE strategy for Team 2. To prove Theorem 3, the key idea is to show that even if Team 1 is allowed to select the weak players, it will not gain more utility as long as Team 2 keeps using the uniformly random strategy. On the other hand, it is obvious that Team 2 cannot gain more utility when Team 1 is allowed to select more players. Therefore, the value of the game does not change when the weak players are allowed to play.

First, we compute the utility U* of Team 1 when this team abandons its weak players. As an application of Lemma 4, any pair of players Ai, Bj (1 ≤ i, j ≤ T) meet with probability at most 1/T. It follows that this probability equals 1/T, as the sum of all these T·T probabilities equals T. Because Ai meets Bj with probability $\frac{1}{T}$ and Ai beats Bj with probability $P_{i,j}$ when they meet, the number of rounds t that Team 1 wins equals $\sum_{j=1}^{T}\sum_{i=1}^{T}\frac{1}{T}P_{i,j}$ in expectation. Therefore,

$$U^{*} = \left(\sum_{j=1}^{T}\sum_{i=1}^{T}\frac{1}{T}P_{i,j}\right) - \frac{T}{2}.$$

We now state a formula for the utility $U^{\sigma}$ of Team 1 when it does not abandon its weak players and applies some strategy σ against the uniformly random strategy of Team 2. Let $Q^{\sigma}_{i,j}$ be defined as in Lemma 4. As above, the number of rounds t that Team 1 wins equals $\sum_{j=1}^{T}\sum_{i=1}^{m}Q^{\sigma}_{i,j}P_{i,j}$ in expectation. Therefore,

$$U^{\sigma} = \left(\sum_{j=1}^{T}\sum_{i=1}^{m}Q^{\sigma}_{i,j}P_{i,j}\right) - \frac{T}{2}.$$

We only need to prove that $U^{\sigma} \le U^{*}$, which reduces to showing that for any fixed j in 1..T,

$$\sum_{i=1}^{m} P_{i,j}\,Q^{\sigma}_{i,j} \le \sum_{i=1}^{T} \frac{1}{T}P_{i,j}. \qquad (2)$$

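To prove (2), consider the following optimization problem:

$$\left\{\begin{array}{ll}
\text{Variables:} & x = (x_1, \ldots, x_m) \\
\text{Parameters:} & c = (c_1, \ldots, c_m) \\
\text{Guarantee:} & c_i \ge c_{i'} \quad (\forall (i, i') \text{ such that } i \le T < i') \\
\text{Constraint 1:} & 0 \le x_i \le \frac{1}{T} \quad (1 \le i \le m) \\
\text{Constraint 2:} & \sum_{i=1}^{m} x_i = 1 \\
\text{Objective:} & \max f(x) = \sum_{i=1}^{m} c_i x_i
\end{array}\right.$$

Clearly, f(x) is maximized at x*, where $x^{*}_i = \begin{cases} \frac{1}{T} & i \le T \\ 0 & i > T. \end{cases}$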

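Noticing the following facts, we see that inequality (2) is just an application of the above problem.

$$\begin{array}{ll}
Q^{\sigma}_{i,j} \le \frac{1}{T} & \text{(applying Lemma 4)} \\
\sum_{i=1}^{m} Q^{\sigma}_{i,j} = 1 & \text{(according to the definition)} \\
\forall\, i \le T < i':\; P_{i,j} \ge P_{i',j} & \text{(since } A_{i'} \text{ is weaker than } A_i\text{)}
\end{array}$$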
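The optimization problem used in this proof can be illustrated numerically. The sketch below relies on the fact that, under the constraints 0 ≤ x_i ≤ 1/T and Σ x_i = 1, every vertex of the feasible region puts weight 1/T on exactly T coordinates, so the optimum can be found by enumerating T-element index sets. The vector `c` is made-up data that satisfies the guarantee (each of the first T entries is at least as large as every later entry), and both function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def best_over_vertices(c, T):
    """Maximum of sum(c_i * x_i) over vertices: weight 1/T on exactly T coordinates."""
    return max(sum(c[i] for i in S) for S in combinations(range(len(c)), T)) / T


def front_loaded(c, T):
    """Objective value at x*, which puts weight 1/T on the first T coordinates."""
    return sum(c[:T]) / T


# Illustrative parameters: each of the first T = 3 entries is >= every later entry.
T = 3
c = [Fraction(3, 4), Fraction(1, 2), Fraction(1, 2),
     Fraction(1, 2), Fraction(1, 4), Fraction(0)]

optimum = best_over_vertices(c, T)
at_x_star = front_loaded(c, T)
print(optimum, at_x_star)  # 7/12 7/12
```

With c_i = P_{i,j} and x_i = Q^σ_{i,j}, this is the statement that Team 1 cannot beat the allocation that spreads B_j evenly over its T strongest players.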

2.6.3. Optimal Number of Dominated Players

Here we study the power of dominated players in another direction. As we saw in subsection 2.6.1, by abandoning a redundant dominated player, Team 1 may decrease its utility. In other words, Team 1 may increase its utility by recruiting more dominated players. Note that the utility of Team 1 never decreases by recruiting more dominated players. However, it is unclear whether the utility strictly increases by doing so. For example, recruiting T dominated players is the same as recruiting T − 1 dominated players: if a team uses T dominated players in a competition, it gets the lowest utility. So, a natural question is:

QUESTION 2. In order to maximize the expected utility of Team 1 (i.e., the value of the game), how many dominated players should we recruit at least? Is it possible that we need as many as Θ(T) such players?

The theorem below answers this question.

THEOREM 4.

1. Suppose U = UE. Recruiting T − 1 dominated players can be better than T − 2, but recruiting T dominated players is the same as T − 1. So, to achieve optimal utility, one may require T − 1 dominated players. This number is tight.

2. Suppose U = UM. Recruiting ⌊T/2⌋ dominated players can be better than ⌊T/2⌋ − 1, but recruiting ⌊T/2⌋ + 1 dominated players cannot be better than ⌊T/2⌋. So, to achieve optimal utility, one may require ⌊T/2⌋ dominated players, and this number is tight as well.

One direction of these claims is rather trivial: we should never use T dominated players when U = UE, or ⌊T/2⌋ + 1 dominated players when U = UM. To prove the other direction, we need to construct examples in which recruiting T − 1 (resp. ⌊T/2⌋) dominated players is better than recruiting T − 2 (resp. ⌊T/2⌋ − 1) when U = UE (resp. U = UM). To construct such examples, the intuition is that we should make the current players in Team 1 as weak as possible. Our construction is as follows:

EXAMPLE 4. T ≥ 1, m = T, n = T + (T − 1), U = UE, $P_{i,j} = \begin{cases} 1 & i = j \\ 0 & i \ne j. \end{cases}$

EXAMPLE 5. T ≥ 1, m = T, n = T + ⌊T/2⌋, U = UM, $P_{i,j} = \begin{cases} 1 & i = j \\ 0 & i \ne j. \end{cases}$

The following claims together prove Theorem 4.

C1. In Example 4, if Team 1 only recruits T − 2 dominated players, it wins no rounds and thus gets utility −T/2.

C2. In Example 4, if Team 1 recruits T − 1 dominated players, it wins a positive number of rounds in expectation and thus gains utility more than −T/2.

C3. In Example 5, if Team 1 only recruits ⌊T/2⌋ − 1 dominated players, it always loses at least ⌊T/2⌋ + 1 rounds and thus can only get utility −1.

C4. In Example 5, if Team 1 recruits ⌊T/2⌋ dominated players, it can sometimes win at least ⌈T/2⌉ rounds and thus can gain utility more than −1.

PROOF OF C1: In this case Team 2 can win all the rounds by playing as follows: in the first T − 1 rounds, it selects the players BT+1, …, B2T−1, and they all win. Then, since Team 1 only has T − 2 dominated players, at least one player in A1, …, AT has already played; denote it by Ai. In the last round, Team 2 selects Bi and definitely wins.

PROOF OF C3: In this case, by applying a strategy similar to the one in the proof of C1, Team 2 can win all of the first ⌊T/2⌋ + 1 rounds (see footnote 5).

PROOF OF C2: For convenience, we denote the T − 1 dominated players by AT+1, …, A2T−1. We argue that, if Team 1 applies the uniformly random strategy (that is, it selects one unused player in A1, …, A2T−1 uniformly at random in each round), then Team 2 has no strategy that wins all rounds with certainty. Suppose to the contrary that Team 2 has such a strategy. Then it must select a player from BT+1, …, B2T−1 in the first round; otherwise there is a chance that it loses the first round. Since Team 1 applies the uniformly random strategy, there is a chance that Team 1 selects a dominated player in the first round. If this happens, Team 2 must again select a player from BT+1, …, B2T−1 in the second round. Once again, Team 1 might select a dominated player in the second round. By induction, there is a chance that Team 1 selects all its dominated players in the first T − 1 rounds while Team 2 consumes all its T − 1 invincible players BT+1, …, B2T−1. Then, Team 2 cannot win the last round with certainty.
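Claims C1 and C2 can be checked for small T by exhaustive search. The sketch below is an illustration under stated assumptions: P is a 0/1 matrix, and "Team 2 can win every round with certainty" is modelled as the existence, in every round, of an unused B_j that beats every unused A_i, recursively for each player Team 1 might have fielded. The names `team2_can_sweep` and `example4` are hypothetical, and dominated recruits are modelled as extra all-zero rows.

```python
from functools import lru_cache


def team2_can_sweep(P, T):
    """True if Team 2 can win all T rounds with certainty against any play of Team 1."""
    m, n = len(P), len(P[0])

    @lru_cache(maxsize=None)
    def sweep(used_a, used_b, rounds_left):
        if rounds_left == 0:
            return True
        free_a = [i for i in range(m) if not used_a >> i & 1]
        for j in range(n):
            if used_b >> j & 1:
                continue
            # B_j must beat every player Team 1 could field, and the sweep
            # must remain possible whichever of them was actually fielded.
            if all(P[i][j] == 0
                   and sweep(used_a | 1 << i, used_b | 1 << j, rounds_left - 1)
                   for i in free_a):
                return True
        return False

    return sweep(0, 0, T)


def example4(T, dominated):
    """P of Example 4 (n = 2T - 1) with `dominated` extra all-zero rows for Team 1."""
    n = 2 * T - 1
    rows = [[1 if i == j else 0 for j in range(n)] for i in range(T)]
    rows += [[0] * n for _ in range(dominated)]
    return rows


for T in (2, 3, 4):
    print(T,
          team2_can_sweep(example4(T, T - 2), T),   # C1: Team 2 sweeps
          team2_can_sweep(example4(T, T - 1), T))   # C2: no guaranteed sweep
```

For each T tested, the search agrees with C1 and C2: one additional dominated recruit is what removes the guaranteed sweep.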

The claim C4 is the most non-trivial. To prove it, we first state the following definition and lemma.

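DEFINITION 5. For integers a, b, C such that

$$C \ge 1, \quad 0 \le a \le \lceil C/2 \rceil, \quad 0 \le b \le \lfloor C/2 \rfloor, \qquad (3)$$

let $\Gamma^{C}_{a,b}$ denote the following instance of team competition:

$$m = n = (C - a) + (\lfloor C/2 \rfloor - b), \quad T = C - a - b, \quad P_{i,j} = \begin{cases} 1 & i = j \le C - a \\ 0 & \text{otherwise.} \end{cases}$$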

The utility is as follows (see footnote 6): if Team 1 wins at least ⌈C/2⌉ − a rounds, it gets utility 1 and Team 2 gets −1; otherwise, Team 1 gets utility −1 and Team 2 gets 1.

LEMMA 5. For integers a, b, C satisfying condition (3), Team 1 can obtain utility larger than −1 in the game $\Gamma^{C}_{a,b}$.

PROOF. Consider three cases.

Case 1 a = ⌈C/2⌉. In this case, Team 1 always gets utility 1, and so $\Gamma^{C}_{a,b}$ has value 1, which is larger than −1.

Case 2 b = ⌊C/2⌋. The game $\Gamma^{C}_{a,b}$ can be restated as follows.

m = n = C − a, T = ⌈C/2⌉ − a.

Player Ai can only defeat Bi for i in 1..m.

Team 1 gets utility 1 if it wins all the rounds; and −1 otherwise.

We argue that the uniformly random strategy guarantees Team 1 an expected utility larger than −1. Equivalently, by applying the uniformly random strategy, Team 1 has a positive chance to win all the rounds. The proof is as follows. In the first round, there is a positive chance that Ai meets Bi for some i. Then, in the second round, the same thing happens with a positive chance, and so on for each round. When all these coincidences happen, Team 1 wins all the rounds.

Case 3 a < ⌈C/2⌉ and b < ⌊C/2⌋.

We use induction. Assume that $\Gamma^{C}_{a+1,b}$ and $\Gamma^{C}_{a,b+1}$ both have value larger than −1; we argue that so does $\Gamma^{C}_{a,b}$. The following facts follow from the definition of $\Gamma^{C}_{a,b}$.

Fact 1. If the two teams select Ai and Bi for i ≤ C − a in the first round, it becomes a sub-game that is equivalent to $\Gamma^{C}_{a+1,b}$.

Fact 2. If the two teams select Ai and Bi for i > C − a in the first round, it becomes a sub-game that is equivalent to $\Gamma^{C}_{a,b+1}$.

Combining them with the induction hypothesis, we get

Fact 3. If the two teams select players under the same index, it becomes a sub-game whose value is larger than −1.

The value of $\Gamma^{C}_{a,b}$ is equal to the value of the matrix game M, where M(i, j) denotes the value of the sub-game reached when Team 1 selects Ai and Team 2 selects Bj in the first round. Fact 3 implies that all the entries on the diagonal of M are larger than −1. So, by using the uniformly random strategy over its players in the first round, Team 1 can obtain a utility larger than −1. Therefore, $\Gamma^{C}_{a,b}$ has value larger than −1.
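The last step can be spelled out as follows. We write N = m = n for the number of players per team (a symbol introduced here only for brevity) and use that every entry of M is at least −1, since all utilities in this game lie in {−1, 1}. If Team 1 selects each Ai with probability 1/N in the first round, then against any first-round choice Bj of Team 2 its expected utility satisfies

```latex
\frac{1}{N}\sum_{i=1}^{N} M(i,j)
\;\ge\; \frac{1}{N}\Bigl(M(j,j) - (N-1)\Bigr)
\;>\; \frac{1}{N}\Bigl(-1 - (N-1)\Bigr) \;=\; -1 .
```

The strict inequality uses M(j, j) > −1 from Fact 3; since it holds for every j, no first-round choice of Team 2 can hold Team 1 down to −1.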

PROOF OF C4: Let G denote the revised game of Example 5, in which Team 1 has recruited ⌊T/2⌋ dominated players. Observe that game G is almost the same as $\Gamma^{T}_{0,0}$. More specifically, when T is odd, G is exactly $\Gamma^{T}_{0,0}$; when T is even, the parameters m, n, T, P in G and $\Gamma^{T}_{0,0}$ are the same, but the utility U is slightly different.

When T is odd, the claim follows directly from Lemma 5, so assume that T is even. Suppose that the value of G is −1. Then, Team 2 has a strategy that guarantees it an expected utility of 1, i.e., holds Team 1 to −1. This means that Team 2 has a strategy that always wins T/2 + 1 or more rounds. When Team 2 applies this strategy, Team 1 can never win T/2 rounds. It further implies that the value of $\Gamma^{T}_{0,0}$ is also −1. However, this contradicts Lemma 5. Therefore, the value of G must be larger than −1.

2.6.4. Limitations of the Dominated Players

Although the presence of dominated players can affect the value of the game, we conjecture that the effect cannot be too large. A question is then:

QUESTION 3. By abandoning a dominated player, how much value might be lost in the worst case? In other words, how much extra (expected) utility can a team gain by recruiting dominated players?

According to our simulations, we have the following conjecture that we cannot prove at the moment.

CONJECTURE 1. If U = UM, we can gain at most 2/3 extra (expected) utility (in other words, the value of the game increases by at most 2/3) by recruiting an arbitrary number of dominated players. If U = UE, we can gain at most 1 extra (expected) utility by recruiting an arbitrary number of dominated players.

2.6.5. Throwing a Match and Discarding a Player

Recall the card game between Alice and Bob in subsection 2.1. We point out that recruiting a dominated player in this context can be thought of as applying a cheating action, namely throwing a match by not placing any card in that round. In the mentioned card game, if Alice and Bob are not allowed to throw a match, Alice can get expected utility −1/3; if Alice is allowed to throw a match, she can get expected utility 1/3. This can be computed according to the method shown in subsection 2.3. Therefore, throwing a match is profitable if permitted.

It may seem unnatural to let a team throw a match like this. The following alternative cheating action called discarding, which may seem more natural, is still profitable for the team.

Discarding is defined as follows. Alice (Team 1) is allowed to discard one of her cards and agrees to lose in that round; however, the discarded card is never revealed to Bob (Team 2). By discarding, Alice does not gain one more card in hand, unlike the case of throwing a match.

However, if discarding is allowed, it may still be beneficial. We give an instance in which one may gain extra utility by discarding. Formally, we have the following result.

CLAIM 4. For every integer K > 0, there exists a game G such that VK(G) > VK−1(G), where VK(G) denotes the value of game G in which Team 1 is allowed to discard at most K players.

EXAMPLE 6. m = K + 1, n = 2K + 1, T = K + 1, U = UE, $P_{i,j} = \begin{cases} 1 & i = j \\ 0 & i \ne j. \end{cases}$

The following claims together imply Claim 4.

C5 In Example 6, if Team 1 is only allowed to discard K − 1 times, it cannot win in any round.

C6 In Example 6, if Team 1 is allowed to discard K times, it can win some rounds in expectation.

PROOF OF C5: Suppose that Team 1 is only allowed to use discarding K − 1 times. Observe that, for any i in 1…m, after Ai has been played and revealed by Team 1, player Bi becomes invincible, i.e., he wins with certainty if he plays in a later round. Notice that there are K invincible players at the beginning (namely BK+2, …, B2K+1) and Team 1 has only K − 1 chances to hide a player by discarding. So, in any round, Team 2 has an invincible player at hand. Therefore, Team 2 can win all the rounds.

PROOF OF C6: Consider the following strategy for Team 1. First, Team 1 randomly chooses an order of its players (say, each order with probability 1/m!), and then randomly chooses exactly K of its players to be discarded when their turn comes. We argue that this strategy guarantees that Team 1 wins a positive number of rounds in expectation. It reduces to proving that no strategy of Team 2 can win all the rounds with certainty against this strategy. Suppose that Team 2 can do so. In the first round, it must select an invincible player (i.e., a player in BT+1, …, BT+K). Otherwise, there is a chance that it loses this round. There is also a chance that Team 1 discards in the first round. If this happens, Team 2 must also select an invincible player in the second round. Again, it is possible that Team 1 discards in this round as well. Continuing this process, we see that it could happen that in the first K rounds Team 2 uses all its K invincible players, while Team 1 uses discarding K times. Then, Team 2 could lose in the (K + 1)-th round.

3. Materials and Methods (for a Variant): Take Actions in Turn

We now turn to the aforementioned non-simultaneous variant where Team 2 sends its player before Team 1 in each round. As mentioned in section 1, our techniques for solving the original simultaneous variant easily extend to the non-simultaneous variant, and most of our results for the non-simultaneous variant are aligned with the results stated in the previous section; see the small difference in Table 1.

First, consider the easiest case where n = m = T. Recall that S denotes the set of all perfect matchings between the T players in Team 1 and the T players in Team 2 (defined in Lemma 2). A key observation is that no matter what Team 2 does, Team 1 can make sure that the matching result between the two teams equals the one in S that benefits Team 1 the most (for a given utility function U). Thus a simple SPE can be described easily based on this particular matching. (However, we do not claim that there is always an efficient algorithm for computing this perfect matching. For example, we are not aware of any good algorithms for computing it when U = UM. Yet there are efficient algorithms for U = UE.)

THEOREM 5. When both teams have no redundant players (i.e., n = m = T), then it is a SPE when Team 1 applies the strategy so that the matching result is the same as the one that benefits Team 1 the most.
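For small T, the matching referred to in Theorem 5 can be found by brute force. The sketch below is a minimal illustration, not an efficient algorithm: it enumerates all T! perfect matchings, computes the distribution of the number of wins of Team 1 for each (matches are treated as independent), and keeps the matching with the highest expected utility. The matrix `P` is made-up data, U_E(t) = t − T/2 is used as the utility, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def win_count_distribution(probs):
    """dist[t] = probability of exactly t wins in independent matches with win probs `probs`."""
    dist = [Fraction(1)]
    for p in probs:
        nxt = [Fraction(0)] * (len(dist) + 1)
        for t, q in enumerate(dist):
            nxt[t] += q * (1 - p)      # this match is lost
            nxt[t + 1] += q * p        # this match is won
        dist = nxt
    return dist


def best_matching(P, utility):
    """Brute-force search for the perfect matching maximizing Team 1's expected utility."""
    T = len(P)
    best_value, best = None, None
    for matching in permutations(range(T)):      # A_i is matched with B_{matching[i]}
        probs = [Fraction(P[i][matching[i]]) for i in range(T)]
        dist = win_count_distribution(probs)
        value = sum(q * utility(t, T) for t, q in enumerate(dist))
        if best_value is None or value > best_value:
            best_value, best = value, matching
    return best_value, best


def u_e(wins, T):
    """Utility U_E(t) = t - T/2."""
    return Fraction(wins) - Fraction(T, 2)


# Made-up 2x2 instance: P[i][j] is the probability that A_{i+1} beats B_{j+1}.
P = [[Fraction(1, 2), Fraction(3, 4)],
     [Fraction(1, 2), Fraction(1, 4)]]

value, matching = best_matching(P, u_e)
print(value, matching)  # 1/4 (1, 0)
```

In the non-simultaneous variant Team 1 can realize this matching by answering each Bj with its partner; passing a different `utility` function (for example a majority rule) changes only the objective, not the enumeration.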

For the case of transitive strength, we have the following result which aligns with Theorem 2.

THEOREM 6. Assume monotone utility function. Then, (1) If Am ≤ … ≤ A1, Team 1 has a SPE strategy which only selects, in each round, one of the players in A1, …, AT. (2) If Bn ≤ … ≤ B1, Team 2 has a SPE strategy which only selects, in each round, one of the players in B1, …, BT. (Be aware that (1) is not symmetric to (2) for the non-simultaneous variant as Team 2 is no longer symmetric to Team 1.)

Recall the history classes below Lemma 1 and the terminologies introduced above Lemma 3. In addition, when H = (k, X, Y, w) is a history class (as in the simultaneous case), let H(σ) (σ ∈ B − Y) denote the history class (in the non-simultaneous case) indicating that Team 2 has sent player σ after arriving at H.

Proving Theorem 6 (1) reduces to proving the following lemma which is similar to Lemma 3.

LEMMA 6. 1. Consider a pair of history classes H1 = (k, X1, Y, w1) and H2 = (k, X2, Y, w2), where the top T − k players of A − X1 and the top T − k players of A − X2 are the same. If w1 ≥ w2, we have V(H1) ≥ V(H2).

2. Consider a non-terminal history class H(σ) where H = (k, X, Y, w) and k < T. Let Au be the rank T − k player in A − X, and let Av be any player in A − X that is not a top T − k player. Then, for Team 1, selecting Au is at least as good as selecting Av to play against σ at H(σ).

Our proof of Lemma 6 is analogous to our proof of Lemma 3.

PROOF. We prove it by backward induction. For k = T, claim 1 holds obviously and claim 2 holds naturally. Assume the lemma holds for k + 1, we now prove that it also holds for k.

Proof of claim 1. Clearly, $V(H_1) = \min_{\sigma \in B - Y} V(H_1(\sigma))$ and $V(H_2) = \min_{\sigma \in B - Y} V(H_2(\sigma))$. Therefore, it reduces to proving that V(H1(σ)) ≥ V(H2(σ)) for any σ that belongs to B − Y.

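Applying claim 2 on H2(σ), we obtain that there exists a top T − k player Au in A − X2 such that

$$V(H_2(\sigma)) = \underbrace{V((k+1, X_2 + \{A_u\}, Y + \{\sigma\}, w_2 + 1))}_{a} \cdot P_{u,\sigma} + \underbrace{V((k+1, X_2 + \{A_u\}, Y + \{\sigma\}, w_2))}_{b} \cdot (1 - P_{u,\sigma}).$$

As the top T − k players in A − X1 and A − X2 are the same, Au is also a player in A − X1, and thus

$$V(H_1(\sigma)) \ge \underbrace{V((k+1, X_1 + \{A_u\}, Y + \{\sigma\}, w_1 + 1))}_{a'} \cdot P_{u,\sigma} + \underbrace{V((k+1, X_1 + \{A_u\}, Y + \{\sigma\}, w_1))}_{b'} \cdot (1 - P_{u,\sigma}).$$

By the induction hypothesis (claim 1 for k + 1, using w1 ≥ w2), a′ ≥ a and b′ ≥ b. Altogether, V(H1(σ)) ≥ V(H2(σ)).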

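Proof of claim 2. The utilities of selecting Au and Av at the history class H(σ) are respectively

$$V = \underbrace{V((k+1, X + \{A_u\}, Y + \{\sigma\}, w + 1))}_{a} \cdot P_{u,\sigma} + \underbrace{V((k+1, X + \{A_u\}, Y + \{\sigma\}, w))}_{b} \cdot (1 - P_{u,\sigma}),$$

$$V' = \underbrace{V((k+1, X + \{A_v\}, Y + \{\sigma\}, w + 1))}_{a'} \cdot P_{v,\sigma} + \underbrace{V((k+1, X + \{A_v\}, Y + \{\sigma\}, w))}_{b'} \cdot (1 - P_{v,\sigma}).$$

Notice that a = a′, b = b′, and a ≥ b according to the induction hypothesis. Therefore,

$$V - V' = a(P_{u,\sigma} - P_{v,\sigma}) + b(P_{v,\sigma} - P_{u,\sigma}) = (a - b)(P_{u,\sigma} - P_{v,\sigma}) \ge 0.$$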

 

Briefly, (for the current round) it is by definition that the player Au with rank T − k performs at least as well as any player Av with rank bigger than T − k, and (for the remaining rounds) the set of the top T − (k + 1) players is the same regardless of whether we choose Au or Av. Thus, Au dominates Av.

Theorem 6 (2) can be proved by a similar argument; we only state a key lemma and omit its proof.

LEMMA 7. 1. Consider a pair of history classes H1 = (k, X, Y1, w1) and H2 = (k, X, Y2, w2), where the top T − k players of B − Y1 and the top T − k players of B − Y2 are the same. If w1 ≥ w2, we have V(H1) ≥ V(H2).

2. Consider a non-terminal history class H = (k, X, Y, w) (where k < T). Let Bu be the rank T − k player in B − Y, and let Bv be any player in B − Y that is not a top T − k player. Then, for Team 2, selecting Bu is at least as good as selecting Bv in the subsequent (k + 1)-th round.

We now move on to the more challenging case where the strength of the players is not transitive. For this case, our first theorem is analogous to Theorem 3 in that it demonstrates a special condition under which we can abandon some weak players.

THEOREM 7. Assume the utility function is monotone.

1. Suppose m > T and each player in AT+1, …, Am is weaker than each player in A1, …, AT. If n = T, Team 1 can abandon all the players in AT+1, …, Am without losing its utility. However, if n > T, abandoning these weaker players may decrease the utility of Team 1.

2. Suppose n > T and each player in BT+1, …, Bn is weaker than each player in B1, …, BT. Regardless of whether m equals T, abandoning the players BT+1, …, Bn may decrease the utility of Team 2.

According to Theorem 7, when a team has redundant weaker players and its opponent team has no redundant players, whether the weaker players can be abandoned depends on which team takes action first.

PROOF OF THEOREM 7: 1. First, assume n = T. Among all possible matching results between the m players in Team 1 and the T players of Team 2 that give Team 1 the highest (expected) utility, there exists a matching result s that matches A1, …, AT to the T players of Team 2 (because AT+1, …, Am are weaker than A1, …, AT). Team 1 can gain the same utility (implied by s) even if AT+1, …, Am are abandoned.

If n > T and U ∈ {UE, UM}, abandoning the weaker players may decrease the utility of Team 1. We prove this by constructing an example below (this is basically Example 2, yet U is more general).

EXAMPLE 7. m = n = 3, T = 2. U ∈ {UE, UM}. $P = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$

For this example, Team 1 can win exactly 1 round and will lose all rounds if A3 is abandoned.

2. We give two examples to prove this claim (one for m = T and the other for m > T).

EXAMPLE 8. m = 2, n = 3, T = 2. U ∈ {UE, UM}. $P = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \end{pmatrix}.$

In this example, Team 2 can win exactly 1 round and will lose all rounds if B3 is abandoned.

EXAMPLE 9. m = 4, n = 5, T = 3. U = UE. $P = \begin{pmatrix} 1 & 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 \end{pmatrix}.$

In this example, Team 2 can win exactly 1 round and will lose all rounds if B4, B5 are abandoned.
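The claims about Examples 7, 8, and 9 can be checked with a small minimax recursion for the non-simultaneous variant. The sketch below is an illustration under stated assumptions: Team 2 reveals its player first and minimizes the utility of Team 1, Team 1 then answers and maximizes it, U_E(t) = t − T/2 is used as the utility, and players are encoded as bitmask positions (an implementation choice made here). The function name `turn_based_value` is hypothetical.

```python
from fractions import Fraction
from functools import lru_cache


def u_e(wins, T):
    """Utility U_E(t) = t - T/2."""
    return Fraction(wins) - Fraction(T, 2)


def turn_based_value(P, T, utility):
    """Team 1's value when Team 2 reveals its player first in every round."""
    m, n = len(P), len(P[0])

    @lru_cache(maxsize=None)
    def value(used_a, used_b, rounds_played, wins):
        if rounds_played == T:
            return utility(wins, T)
        worst = None
        for j in range(n):                         # Team 2 commits to B_j ...
            if used_b >> j & 1:
                continue
            best_reply = None
            for i in range(m):                     # ... then Team 1 answers with A_i
                if used_a >> i & 1:
                    continue
                p = Fraction(P[i][j])
                nxt = (used_a | 1 << i, used_b | 1 << j, rounds_played + 1)
                v = p * value(*nxt, wins + 1) + (1 - p) * value(*nxt, wins)
                if best_reply is None or v > best_reply:
                    best_reply = v
            if worst is None or best_reply < worst:
                worst = best_reply
        return worst

    return value(0, 0, 0, 0)


P7 = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]
P8 = [[0, 1, 1], [1, 0, 1]]
P9 = [[1, 0, 0, 1, 1], [0, 1, 0, 1, 1], [0, 0, 1, 1, 1], [0, 0, 0, 1, 1]]

print(turn_based_value(P7, 2, u_e), turn_based_value(P7[:2], 2, u_e))                   # 0 -1
print(turn_based_value(P8, 2, u_e), turn_based_value([r[:2] for r in P8], 2, u_e))      # 0 1
print(turn_based_value(P9, 3, u_e), turn_based_value([r[:3] for r in P9], 3, u_e))      # 1/2 3/2
```

Each pair compares the full game with the game after the weak players are abandoned (the last row for Team 1 in Example 7, the last columns for Team 2 in Examples 8 and 9); in all three cases the abandoning team ends up strictly worse off.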

We now study the optimal number of dominated players. The following theorem is a counterpart of Theorem 4. It says that for U = UE we need T − 1 in the worst case, and for U = UM we need ⌊T/2⌋ in the worst case. Interestingly, the same bounds hold for Team 1 and Team 2 and for the simultaneous case.

THEOREM 8. The following claims hold for both Team 1 and Team 2.

1. Suppose U = UE. Recruiting T − 1 dominated players can be better than T − 2, but recruiting T dominated players is the same as T − 1. So, to achieve optimal utility, one may require T − 1 dominated players. This number is tight.

2. Suppose U = UM. Recruiting ⌊T/2⌋ dominated players can be better than ⌊T/2⌋ − 1, but recruiting ⌊T/2⌋ + 1 dominated players cannot be better than ⌊T/2⌋. So, to achieve optimal utility, one may require ⌊T/2⌋ dominated players, and this number is tight as well.

PROOF. The proof of the two claims for Team 1 is easy and very similar to the proof of Theorem 4. Recall Examples 4 and 5 in the proof of Theorem 4. It can be observed that for Example 4 (where U = UE), Team 1 can win nothing when it recruits T − 2 dominated players, and can win exactly one round when it recruits T − 1 dominated players. This means that it needs T − 1 dominated players to achieve the optimum utility (and more than T − 1 dominated players are obviously not needed). For Example 5 (where U = UM), Team 1 can win nothing when it recruits ⌊T/2⌋ − 1 dominated players, and can win as many as T − ⌊T/2⌋ rounds when it recruits ⌊T/2⌋ dominated players. This means that it needs ⌊T/2⌋ dominated players to achieve the optimum utility (and more than ⌊T/2⌋ dominated players are clearly not needed).

The proof of the claims for Team 2 is also easy but requires different examples (note that these examples are not symmetric to Examples 4 and 5).

EXAMPLE 10. T ≥ 1, m = T + (T − 2), n = T, U = UE, $P_{i,j} = \begin{cases} 1 & i = j \\ 0 & i \ne j. \end{cases}$

EXAMPLE 11. T ≥ 1, m = T + ⌊T/2⌋ − 1, n = T, U = UM, $P_{i,j} = \begin{cases} 1 & i = j \\ 0 & i \ne j. \end{cases}$

For Example 10 (where U = UE), Team 2 can win nothing when it recruits T − 2 dominated players, and can win exactly one round when it recruits T − 1 dominated players. Therefore it needs T − 1 dominated players to achieve the optimum utility (and more than T − 1 dominated players is obviously not needed).

For Example 11 (where U = UM), Team 2 can win nothing when it recruits ⌊T/2⌋ − 1 dominated players, and can win as many as T − ⌊T/2⌋ rounds when it recruits ⌊T/2⌋ dominated players. Therefore it needs ⌊T/2⌋ dominated players to achieve the optimum utility (and more than ⌊T/2⌋ is clearly not needed).

4. Discussion

In this paper, we study a novel game-theoretic model of situations where two teams make sequential decisions about which of a set of exhaustible actions to select in each round. These actions can be interpreted as team members, cards in a hand, etc. This model has applications in solving the DMRTA problem we introduced at the beginning of this paper. We present a simple SPE for the case where there are no redundant players or the strength of players is transitive. For the other case, we exhibit evidence that the redundant dominated players cannot be easily discounted in their contribution to team performance, which may appear counterintuitive. We investigate the power of the dominated players in three directions: (1) When do they influence the value of the competition? (2) If additional dominated players can be recruited, how many are required to attain the maximum utility? (3) How much utility might be lost at most if we abandon them? We obtain several non-trivial results that fully or partially answer these questions. We believe that our results are of particular interest to both designers and players of team competitions.

Data Availability Statement

The original contributions presented in the study are included in the article/supplementary material, further inquiries can be directed to the corresponding author/s.

Author Contributions

KJ did the key analysis and experiments, wrote the paper, and generalized several preliminary results obtained during his discussions with the other authors. SC joined KJ in several important discussions, which helped KJ in obtaining many preliminary results. PT joined the discussion later, polished the paper, and especially helped in writing the introduction, including the related works. JP helped in finding some applications and joined after we made our conference version. All authors contributed to the article and approved the submitted version.

Funding

This work was supported in part by the National Natural Science Foundation of China under Grant 61972434, and the Fundamental Research Funds for the Central Universities, Sun Yat-sen University, under Grant 19LGPY292.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer VC declared a shared affiliation, with no collaboration, with one of the authors SC to the handling editor at the time of the review.

Footnotes

1. ^This team competition model was introduced in the conference version of this paper, which was accepted by the 2016 International Conference on Autonomous Agents and Multiagent Systems (Jin et al., 2016) [Yet the conference version did not discuss (1) its application in DMRTA and (2) the react-in-turn variant].

2. ^Note that there could be other SPEs. For example, when players in Team 1 always lose, any strategies for the two teams form a SPE.

3. ^A pure strategy does NOT mean it determines the entire order of players at the beginning. Instead, it means that at each possible history, some unmatched player will be selected deterministically in the upcoming round. In this way, any mixed strategy is a linear combination of the pure strategies.

4. ^The values of G(T, P*, UM) and G(T, P*, UE) can be computed directly from Theorem 1, since there are no redundant players in these games.

5. ^In this case Team 2 can actually win all the T rounds.

6. ^Here the utility functions for the two teams are not identical. However, since it is still a zero-sum game, SPE strategies for the teams exist as before. The requirement that the utility functions are identical is not necessary in our model.

References

Altman, A., Procaccia, A. D., and Tennenholtz, M. (2009). “Nonmanipulable selections from a tournament,” in IJCAI, ed C. Boutilier (Pasadena, CA), 27–32.

Botelho, S., and Alami, R. (1999). “M+: a scheme for multi-robot cooperation through negotiated task allocation and achievement,” in Proceedings 1999 IEEE International Conference on Robotics and Automation (Detroit, MI), Vol. 2, 1234–1239. doi: 10.1109/ROBOT.1999.772530

Dias, M. B., Zlot, R., Kalra, N., and Stentz, A. (2006). Market-based multirobot coordination: a survey and analysis. Proc. IEEE 94, 1257–1270. doi: 10.1109/JPROC.2006.876939

Gerkey, B., and Mataric, M. (2002). Sold! Auction methods for multirobot coordination. IEEE Trans. Robot. Autom. 18, 758–768. doi: 10.1109/TRA.2002.803462

Hwang, F. K. (1982). New concepts in seeding knockout tournaments. Am. Math. Monthly 89, 235–239. doi: 10.1080/00029890.1982.11995420

Jin, K., Tang, P., and Chen, S. (2016). “On the power of dominated players in team competitions,” in Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems, AAMAS 2016 (Richland, SC: International Foundation for Autonomous Agents and Multiagent Systems), 14–22.

Kleinberg, R. (2012). Olympic Badminton Is Not Incentive Compatible, Turing's Invisible Hand. Technical report, Cornell University.

Knuth, D. E. (1987). A random knockout tournament. Am. Math. Monthly 93, 127–129. doi: 10.1137/1029011

Kuhn, H. W. (1953). Extensive Games and the Problem of Information. Princeton University Press.

Lanctot, M., Lisý, V., and Winands, M. (2014). “Monte Carlo tree search in simultaneous move games with applications to goofspiel,” in Computer Games, Volume 408 of Communications in Computer and Information Science (Beijing), 28–43. doi: 10.1007/978-3-319-05428-5_3

Osborne, M. J., and Rubinstein, A. (1994). A Course in Game Theory. MIT Press.

Procaccia, A. (2013). Olympic Badminton Is Not Incentive Compatible-Revisited, Turing's Invisible Hand. Technical report, Carnegie Mellon University.

Rosen, S. (1986). Prizes and incentives in elimination tournaments. Am. Econ. Rev. 76, 701–715. doi: 10.3386/w1668

Schwenk, A. J. (2000). What is the correct way to seed a knockout tournament? Am. Math. Monthly 107, 140–150. doi: 10.1080/00029890.2000.12005171

Tang, P., Shoham, Y., and Lin, F. (2009). “Team competition,” in Proceedings of AAMAS (Budapest).

Tang, P., Shoham, Y., and Lin, F. (2010). Designing competitions between teams of individuals. Artif. Intell. 174, 749–766. doi: 10.1016/j.artint.2010.04.025

Vu, T., Altman, A., and Shoham, Y. (2009). “On the complexity of schedule control problems for knockout tournaments,” in AAMAS '09 (Budapest), 225–232.

Wang, G., Yu, H., Xu, J., and Huang, S. (2004). “A multi-agent model based on market competition for task allocation: a game theory approach,” in IEEE International Conference on Networking, Sensing and Control, 2004 (Taipei), Vol. 1, 282–286. doi: 10.1109/ICNSC.2004.1297449

Wikipedia (2015a). Goofspiel. Technical report, Wikipedia.

Wikipedia (2015b). Sun Bin. Technical report, Wikipedia.

Wu, H., and Shang, H. (2020). Potential game for dynamic task allocation in multi-agent system. ISA Trans. 102, 208–220. doi: 10.1016/j.isatra.2020.03.004

Zlot, R., and Stentz, A. (2006). Market-based multirobot coordination for complex tasks. Int. J. Robot. Res. 25, 73–101. doi: 10.1177/0278364906061160

Keywords: team competition, task allocation, multi-robot system, dominated players, sub-game perfect equilibrium

Citation: Jin K, Tang P, Chen S and Peng J (2021) Dynamic Task Allocation in Multi-Robot System Based on a Team Competition Model. Front. Neurorobot. 15:674949. doi: 10.3389/fnbot.2021.674949

Received: 02 March 2021; Accepted: 13 April 2021;
Published: 20 May 2021.

Edited by:

Zheng Wang, Southern University of Science and Technology, China

Reviewed by:

Xiaodong Li, The University of Hong Kong, Hong Kong
Arkadi Predtetchinski, Maastricht University, Netherlands
Vincent Chau, Chinese Academy of Sciences (CAS), China

Copyright © 2021 Jin, Tang, Chen and Peng. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Kai Jin, cscjjk@gmail.com
