Genbby Team
January 24, 2018

Genbby Technical Paper
Rating System and Matchmaking

1. Introduction

The rating system estimates the skill level of the players involved in a game. This allows teams to be suitably matched so that opposing teams have similar odds of winning; games become balanced and exciting, and every player competes at an appropriate level. The performance of a rating system is evaluated by the confidence that can be placed in the estimated player ratings. Matchmaking based on player ratings is an efficient way to form teams with similar skills, producing balanced games.

Currently, games such as DOTA2 and League of Legends (LOL) use variants of the Elo rating system. Elo has been used successfully in individual games (1vs1), but it is not suitable for team game modes (for example 2vs2 and 5vs5) despite the variations and adaptations made in recent years, which causes constant complaints in the global community of gamers.

The Elo rating system and other rating systems share two main deficiencies:

- Low confidence in the estimation of player ratings: this produces unbalanced matches and requires a large number of games to learn the real rating of each player, which demotivates players and causes many of them to stop playing.
- No consideration of the parameters time, zone and player performance history: this produces uncertainty about how to update each player's rating in scenarios that are not ideal for a rating system. For example, a low-level player who does not compete officially for several months but trains constantly will play much better than the current rating suggests. Likewise, when two players or teams with similar ratings face each other, one of them may be much better than the other because the opponents it previously faced had a higher average rating.

These two deficiencies are constantly criticized by millions of gamers worldwide, who as a result are matched in unbalanced teams where the winning team is very easy to predict, so millions of gamers get discouraged every day and in some cases stop playing.

To solve this problem, Genbby uses the Genbby rating system and Genbby Matchmaking to form more balanced games, where each player is competitive and is motivated to play, improve and aspire to be one of the best gamers in the world. The Genbby rating system is inspired by the Elo rating system, taking advantage of its best features and adding parameters
to have a dynamic and adaptive system for each player. Furthermore, Genbby Matchmaking matches teams that are as balanced as possible according to the estimated rating of each player.

Experiments were performed to verify the superiority of the Genbby rating system over the Elo rating system, using as the metric the average convergence of the players towards their real ratings after a certain number of games played.

2. Methodology

The Elo rating system is used as a baseline for the different game modes (for example 1vs1, 2vs2 and 5vs5). The Genbby rating system together with Genbby Matchmaking is then used to address the two main deficiencies of the Elo rating system and other rating systems:

- Low confidence in the estimation of player ratings: this produces unbalanced matches and requires a large number of games to learn the real rating of each player, which demotivates players and causes many of them to stop playing.
- No consideration of the parameters time, zone and player performance history.

The Genbby rating system handles the following dynamic parameters:

- Uncertainty of the player's rating (from 0 to 200).
- Number of games played by the player.
- Player rating, in the range of 0 to 4000 points (theoretical maximum).

These dynamic parameters allow the Genbby rating system to have greater confidence in the estimated player ratings and to take into account the time, zone and history of the player's performance.

Furthermore, Genbby Matchmaking forms teams only among players of the same category. A maximum rating difference of 200 is established within each category, which in terms of the Genbby rating system means that the strongest player has at most a 75% chance of winning against the weakest player in the same category.
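
To make the 75% bound concrete, the following minimal sketch evaluates the classic Elo logistic expectation at a rating gap of 200 points. The 400-point scale and the name `expected_score` are assumptions taken from the standard Elo formula; the paper does not state the exact formula used by the Genbby rating system.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Classic Elo expectation that A beats B (assumed formula, not Genbby's own)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


CATEGORY_WIDTH = 200  # maximum rating difference inside one category (from the text)

# Strongest possible player against the weakest possible player of one category.
p_max = expected_score(2000 + CATEGORY_WIDTH, 2000)
print(f"{p_max:.3f}")
```

Under this assumption the stronger player wins with a probability of about 0.76, which is in line with the roughly 75% bound quoted above.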

The maximum rating difference of 200 is only an initial value. It changes dynamically with the number of games played by each player: learning algorithms observe the result of each confrontation and choose, for each category, the maximum difference between players that maximizes the probability of obtaining balanced games.

The metric used to compare the rating methods is the average convergence of the players towards their real rating after a certain number of games played. For the matchmaking system, the metric is the average balance of the teams matched in a game.

3. Experiments

The experiments were performed for the game modes 1vs1, 2vs2 and 5vs5. All results are averages over 50000 simulations.

The balance of the matches produced by the matchmaking system is calculated from the probability of a team's victory, based on the average real rating of its players. If a team has a 50% chance of winning, we say that the game has perfect balance, that is, a 100% balance.

For the rating system, games were simulated among several players with real ratings evenly distributed between 0 and 4000. The players initially have a random rating within a specific range. In addition, the players are classified into categories of 200 rating points each: category 1 includes players with a rating between 0 and 199, category 2 players with a rating between 200 and 399, category 3 players with a rating between 400 and 599, and so on. The winner of a game is predicted from the probability of a team's victory based on the average real rating of its players, using a logistic function. Once the matchmaking system has formed teams among players of the same category and the result of the game is known, the rating system updates the players' ratings.

The specific configurations for each game mode are:

- 1vs1: games were simulated among 20 players, with a random initial rating between 1990 and 2010.
- 2vs2: games were simulated among 40 players, with a random initial rating between 1980 and 2020.
- 5vs5: games were simulated among 100 players, with a random initial rating between 1950 and 2050.
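
The simulation steps above can be sketched as follows. This is only an illustration: the function names and the Elo-style logistic with a 400-point scale are assumptions, since the paper says only that "a logistic function" of the average real ratings is used.

```python
import random


def category_of(rating: float, width: int = 200) -> int:
    """Category 1 covers 0-199, category 2 covers 200-399, and so on."""
    return int(rating // width) + 1


def team_win_probability(team_a, team_b, scale: float = 400.0) -> float:
    """Probability that team A wins, from the average real ratings.

    The logistic form and the 400-point scale are assumed (Elo-style).
    """
    avg_a = sum(team_a) / len(team_a)
    avg_b = sum(team_b) / len(team_b)
    return 1.0 / (1.0 + 10.0 ** ((avg_b - avg_a) / scale))


def simulate_game(team_a, team_b, rng: random.Random) -> bool:
    """Return True if team A wins one simulated game."""
    return rng.random() < team_win_probability(team_a, team_b)


# Two 2vs2 teams described by the real ratings of their players.
rng = random.Random(0)
team_a, team_b = [2100, 2150], [2120, 2180]
print(category_of(2100), team_win_probability(team_a, team_b))
print(simulate_game(team_a, team_b, rng))
```

Passing an explicit `random.Random` instance keeps a run reproducible, which is convenient when averaging over many simulations.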

4. Results and discussions

1 VS 1 GAME MODE

For the 1vs1 game mode, the Genbby rating system shows an average improvement of 2.32 percentage points over the Elo rating system, for a maximum of 90 games played by a player. Initially the Elo rating system has a slight advantage, but as a player plays more games the Genbby rating system is clearly superior. After 90 games played by a player, the convergence towards the real rating is 66.96% with the Genbby rating system, compared to 64.16% with the Elo rating system.

Graph No. 1. Convergence towards the real rating of the players according to the number of games played for the 1vs1 game mode. Series: Genbby Rating System (1vs1), Elo Rating System (1vs1).

2 VS 2 GAME MODE

For the 2vs2 game mode, the Genbby rating system shows a remarkable average improvement of 5.23 percentage points over the Elo rating system, for a maximum of 90 games played by a player. The more games a player plays, the closer the ratings in the Genbby rating system get to the real ratings, while in the Elo rating system the convergence remains more or less constant (below 63%). After 90 games played by a player, the convergence towards the real rating is 72.32% with the Genbby rating system, compared to 62.93% with the Elo rating system; that is, the Genbby rating system improves on the Elo rating system by 9.39 percentage points at 90 games.

Graph No. 2. Convergence towards the real rating of the players according to the number of games played for the 2vs2 game mode. Series: Genbby Rating System (2vs2), Elo Rating System (2vs2).

5 VS 5 GAME MODE

For the 5vs5 game mode, the Genbby rating system shows an average improvement of 1.63 percentage points over the Elo rating system, for a maximum of 90 games played by a player. The more games a player plays, the closer the ratings in the Genbby rating system get to the real ratings, while in the Elo rating system the convergence remains more or less constant (below 63%). After 90 games played by a player, the convergence towards the real rating is 65.26% with the Genbby rating system, compared to 62.92% with the Elo rating system; that is, the Genbby rating system improves on the Elo rating system by 2.34 percentage points at 90 games.

Graph No. 3. Convergence towards the real rating of the players according to the number of games played for the 5vs5 game mode. Series: Genbby Rating System (5vs5), Elo Rating System (5vs5).

SUMMARY OF THE RATING SYSTEM

The following graph summarizes the results of the experiments performed for the Genbby and Elo rating systems in the 1vs1, 2vs2 and 5vs5 game modes, for a maximum of 90 games played by a player. The Genbby rating system outperforms the Elo rating system on average in all game modes.

Graph No. 4. Convergence towards the real rating of the players according to the number of games played for the game modes 1vs1, 2vs2 and 5vs5. Series: Genbby Rating System (1vs1, 2vs2, 5vs5), Elo Rating System (1vs1, 2vs2, 5vs5).

BALANCE OF GAMES USING GENBBY MATCHMAKING

The initial balance is 74.02%, 85.99% and 94.27% for the game modes 1vs1, 2vs2 and 5vs5 respectively. As players play more games, the balance falls slowly to a global minimum, after which it increases again. The best results are obtained for the 5vs5 game mode, in which the balance of a game played by a player who has already played 90 games is 75.21%; that is, after 90 games a player who is matched again in the 5vs5 game mode has a win probability of at least 37.61%. Similarly, for the 1vs1 and 2vs2 game modes the balance is 70.32% and 71.23% respectively.

Graph No. 5. The balance of the games according to the number of games played by the player, for a maximum of 90 games. Series: Genbby Matchmaking (1vs1), Genbby Matchmaking (2vs2), Genbby Matchmaking (5vs5).
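
The relation between balance and win probability used above can be sketched as below. The linear mapping is an assumption inferred from the two reference points the paper gives (a 50% win chance is a 100% balance, and a 75.21% balance corresponds to a win probability of at least 37.61%); the function names are hypothetical.

```python
def balance(p_win: float) -> float:
    """Balance in percent: 100 when both sides have a 50% chance, 0 when the result is certain.

    Assumed linear form, inferred from the figures reported in the text.
    """
    return 200.0 * min(p_win, 1.0 - p_win)


def min_win_probability(balance_pct: float) -> float:
    """Win probability (in percent) of the weaker side for a given balance."""
    return balance_pct / 2.0


for mode, b in (("1vs1", 70.32), ("2vs2", 71.23), ("5vs5", 75.21)):
    print(mode, min_win_probability(b))
```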

5. Competitive platform version 3: technical description

Our rating system is responsible for updating the rating of players based on their previous and current results and other parameters. Our matchmaking system, in turn, divides players using dynamic parameters according to current needs. Within the partitions created, it first filters the players and then searches for the best balanced teams possible, taking into account mainly the rating. Because our rating system converges quickly to the real rating of each player, the pairings begin to be balanced practically from the first games.

5.1 Components interaction diagram

5.2 Competitive platform process diagram

Now, with the input JSON, the Matching_System API --version 2.0 can temporarily store the IDs in the way shown in the image (although there is no need to store Level 1, as the calls for Dota2 and LOL will use different routes). So what we need is a function form_teams that receives players and n_team_members, where:

- n_team_members indicates the number of members each team must have.
- players are the players in some mode of Level 3.

In this way, if the challenge of joining players from the same mode with different bets is solved, we can call form_teams with players set to the union of certain Level 3 modes and everything will work well. For this, let us first look at this image:

This image zooms in on the modes of Levels 4 and 5 that derive from an arbitrary Level 3 mode (note that when modes are joined, they still have this structure). Basically, it tells us that we are partitioning the players of each role according to their score into the intervals [0, 2δ), [2δ, 4δ), ..., [nδ, MAX_SCORE]. We then need a function that receives a partition, where:

- partition is one interval of the partition (e.g. [2δ, 4δ)) containing the player IDs in role_1, role_2, ..., role_5.

In this way, the IDs of the players within this partition have close scores, and the possible teams will have players with a score in a range [nδ, (n + 2)δ) or [nδ, MAX_SCORE]. The closer each player's score is to the real score, the better the matching will be. Since the rating is Genbby's rating, which is calculated in the API Rating System and converges quickly, we can consider the matchings to be balanced. The problem occurs when players are new to Genbby and their rating is still converging, but we can consider this inevitable.

Now we consider the problem of how to form teams with the players of an arbitrary partition. For this, we look at the following image:
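
The partitioning step can be sketched as follows. The input shape, the function name `partition_players` and the example values (δ = 100, MAX_SCORE = 4000) are assumptions for illustration; the original function signature is not available in the text.

```python
import math
from collections import defaultdict


def partition_players(scores_by_role, delta, max_score):
    """Group player IDs by score interval of width 2*delta, keeping roles separate.

    scores_by_role: {role_name: {player_id: score}} (hypothetical input shape).
    Returns {interval_index: {role_name: [player_id, ...]}}, where interval k is
    [2*k*delta, 2*(k+1)*delta) and the last interval is closed at max_score.
    """
    width = 2 * delta
    last_index = max(0, math.ceil(max_score / width) - 1)
    partitions = defaultdict(lambda: defaultdict(list))
    for role, scores in scores_by_role.items():
        for player_id, score in scores.items():
            # A score equal to max_score falls into the last (closed) interval.
            index = min(int(score // width), last_index)
            partitions[index][role].append(player_id)
    return {i: dict(roles) for i, roles in partitions.items()}


example = {
    "role_1": {"ID1": 150, "ID2": 390, "ID3": 4000},
    "role_2": {"ID4": 10},
}
print(partition_players(example, delta=100, max_score=4000))
```

Each value of the returned dictionary plays the part of one partition: a set of player IDs per role whose scores lie in the same interval.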

Suppose that A = {ID1, ID2, ID3} belong to Role_1, B = {ID4, ID5, ID6} belong to Role_3, and so on. Then the possible teams that we can form are:

Teams = {(a, b, c, d, e) | a ∈ A, b ∈ B, ..., e ∈ E}

That is, Teams = A × B × C × D × E (Cartesian product).

For simplicity we have assumed that A, B, C, D and E have 3 elements each. In fact, the number of elements of each set is variable and a set can even be empty. To visualize this better we look at the following code:
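
A minimal sketch of the Cartesian product of the role sets, using Python's `itertools.product`. The function name `candidate_teams` and the IDs assigned to C, D and E are hypothetical; only the sets A and B are given in the text.

```python
from itertools import product


def candidate_teams(players_by_role):
    """All teams with exactly one player per role (Cartesian product of the role sets)."""
    return list(product(*players_by_role))


roles = [
    ["ID1", "ID2", "ID3"],     # A
    ["ID4", "ID5", "ID6"],     # B
    ["ID7", "ID8", "ID9"],     # C (hypothetical IDs)
    ["ID10", "ID11", "ID12"],  # D (hypothetical IDs)
    ["ID13", "ID14", "ID15"],  # E (hypothetical IDs)
]
teams = candidate_teams(roles)
print(len(teams))  # 3**5 candidate teams
```

If any role set is empty, the product is empty as well, so no team can be formed from that partition.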

So we already have the teams formed, but how do we determine which ones are better balanced? Which matches are more likely to end in a tie? To answer this, we need a function that estimates the probability that one player beats another. But this works for 2 players and we need it for 2 teams, so we need an analogous function for teams. With it we have an order relation:

team_x > team_y if and only if abs(0.5 - team_x.score) < abs(0.5 - team_y.score)

In this way we can order the matchings from highest to lowest probability of a tie. With the matchings ordered, we can iterate over them, filtering to keep only the valid ones. To do this, we create a set with the players who are already in a matching; let us call it vis. For a matching to be valid we need:

- No player of that matching belongs to vis.
- No two players of the same team appear in each other's banned lists (it is enough for one of them to have banned the other).

If a matching is valid, we add all the IDs of that matching to vis and continue iterating over the ordered matchings.
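
The ordering and filtering described above can be sketched as a greedy selection. The Elo-style logistic on average ratings, the data shapes and all function names except `vis` are assumptions for illustration, not the platform's actual code.

```python
from itertools import combinations


def team_win_probability(team_a, team_b, ratings, scale=400.0):
    """Probability that team_a beats team_b from average ratings (assumed Elo-style logistic)."""
    avg_a = sum(ratings[p] for p in team_a) / len(team_a)
    avg_b = sum(ratings[p] for p in team_b) / len(team_b)
    return 1.0 / (1.0 + 10.0 ** ((avg_b - avg_a) / scale))


def has_banned_pair(team, banned):
    """True if any player on the team has a teammate in their banned list."""
    return any(
        b in banned.get(a, set()) or a in banned.get(b, set())
        for a, b in combinations(team, 2)
    )


def select_matchings(teams, ratings, banned):
    """Greedily accept the most balanced matchings whose players are still free."""
    # Candidate matchings: pairs of teams that share no player.
    candidates = [(a, b) for a, b in combinations(teams, 2) if not set(a) & set(b)]
    # Most balanced first: win probability closest to 0.5.
    candidates.sort(key=lambda m: abs(0.5 - team_win_probability(m[0], m[1], ratings)))
    vis = set()  # players already placed in an accepted matching
    accepted = []
    for team_a, team_b in candidates:
        players = set(team_a) | set(team_b)
        if players & vis:
            continue  # some player is already in a matching
        if has_banned_pair(team_a, banned) or has_banned_pair(team_b, banned):
            continue  # teammates who banned each other
        accepted.append((team_a, team_b))
        vis |= players
    return accepted


ratings = {"p1": 1000, "p2": 1200, "p3": 1000, "p4": 1200}
teams = [("p1", "p2"), ("p3", "p4"), ("p1", "p3"), ("p2", "p4")]
print(select_matchings(teams, ratings, banned={}))
print(select_matchings(teams, ratings, banned={"p1": {"p2"}}))
```

The greedy pass is simple and fast, but it does not guarantee the globally best set of matchings; it only guarantees that each accepted matching was the most balanced one available at the time it was chosen.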

6. Conclusions

- The Genbby rating system shows an average improvement of 2.32, 5.23 and 1.63 percentage points for the game modes 1vs1, 2vs2 and 5vs5 respectively, compared to the Elo rating system, for a maximum of 90 games played by the player.
- In all game modes, the ratings of players under the Genbby rating system move closer to the real ratings as more games are played. The convergence towards the real rating is 66.96%, 72.32% and 65.26% for the game modes 1vs1, 2vs2 and 5vs5 respectively, after 90 games played by the player.
- The Elo rating system is not suitable for the game modes 2vs2 and 5vs5, since its convergence towards the real rating of the players is almost constant, below 63% even after 90 games played. The Genbby rating system, in contrast, keeps improving as more games are played in all the game modes 1vs1, 2vs2 and 5vs5.
- Genbby Matchmaking produces an average balance of 70.32%, 71.23% and 75.21% for the game modes 1vs1, 2vs2 and 5vs5 respectively, after 90 games played by the player. This means that a player has a probability of winning of at least 35.16%, 35.62% and 37.61% respectively (half of the balance) when matched in a new game.
- The Genbby rating system and Genbby Matchmaking are suitable for the game modes 1vs1, 2vs2 and 5vs5, and in general for all game modes.