Minimize Repeated Pairs in 8 Player KO

RobertJasiek · **#21**

Ok, now we have two theories: NP-complete versus trivial:) Who can prove his theory - Joaz or Harleqin or neither?

Joaz Banbeck · **#22**

Harleqin wrote:

RobertJasiek wrote:

Harleqin wrote:

If you do not have a fixed tree, but instead re-pair each round after the previous has been completed, the proceeding is very simple: just do not repeat a pairing. The rounds are independent then, so you only have to look at the players in the round at hand.

This does not necessarily minimize.

I think it does, because the players' probability to reach a certain round is equal.

But the player's probability of reaching a certain round with optimally non-repetitive colleagues is not. And it is affected by pairing choices in the prior round.

The Q-final and the S-final rounds are not truly independent. For a player Z, starting in the Q-finals, it is possible to have three players, W, X, and Y, such that they are the only three players that Z has not played. So to try to be optimal, in the Q-final, Z is paired with one of them - let's choose W.
For the set of all possible Q-final pairings that include the Z vs W pairing, there are two possibilities: either X and Y are paired or they are not paired. If X plays Y, there is a 100% probability that there will be a non-repeat player to pair with Z if he makes it to the S-finals; if X does not play Y, there is a 75% probability that there will be such a person for Z to play.
The rounds are not independent, QED.

RobertJasiek · **#23**

From your proof about the implied lemma of the rounds being dependent, I learn that my assumptions were still insufficient. It does not suffice to ask for minimal numbers but rather one must "multiply" each pairing strategy's winning case by conditional probabilities of wins to get the probabilities for each pairing strategy to cause a certain expected number of repetitions in both rounds. Only then can one choose the overall minimum.

Cassandra · **#24**

Hi, Robert,

you have got E-Mail.

Thomas

Joaz Banbeck · **#25**

There are 105 possible first round pairings. They generate 70 possible quartets for for the semi-finals, leading to 28 possible finals pairs. Thus the upper bound for the total number of possibilities is 205800. That is well within brute-force range.

Cassandra · **#26**

Joaz Banbeck wrote:

There are 105 possible first round pairings. They generate 70 possible quartets for for the semi-finals, leading to 28 possible finals pairs. Thus the upper bound for the total number of possibilities is 205800. That is well within brute-force range.

The room to work with is not that big.

You have those 105 combinations for the pairing of the first round with 16 combinations of possible outcomes each.

The four players in each of this outcome-combination could be paired in 3 different ways in round 2. What does not matter, because you are watching for the minimum of the number of doubled pairs.

Take the average of the 16 second round (minimum) numbers and add it to the number of doubled pairs in the respective first round combination.

Check the minimum of the 105 concluding sums and chose one of the first round combinations with this minimum value.

willemien · **#27**

I am doubtfull about the number of 105 combinations possible

in my idea there are 315 different combinations:

There are 35 ( 8!/(2 x 4! x 4!) ways to split 8 players in two groups of 4

In every group of 4 there are 3 different ways to do the tournament

giving a total of 35 * 3 * 3 = 315 combinations.

but maybe I am wrong: "What makes 2 combinations different ?"

Matti · **#28**

willemien wrote:

:study: I am doubtfull about the number of 105 combinations possible

in my idea there are 315 different combinations:

There are 35 ( 8!/(2 x 4! x 4!) ways to split 8 players in two groups of 4

In every group of 4 there are 3 different ways to do the tournament

giving a total of 35 * 3 * 3 = 315 combinations.

but maybe I am wrong: "What makes 2 combinations different ?"

You are wrong. You count pairings multiple times.
(a-b, c-d), (e-f, g-h) is equivivalent to
(a-b, e-f), (c-d, g-h) and (a-b, g-h), (c-d, e-f)

willemien · **#29**

Matti wrote:

willemien wrote:

:study: I am doubtfull about the number of 105 combinations possible

in my idea there are 315 different combinations:

There are 35 ( 8!/(2 x 4! x 4!) ways to split 8 players in two groups of 4

In every group of 4 there are 3 different ways to do the tournament

giving a total of 35 * 3 * 3 = 315 combinations.

but maybe I am wrong: "What makes 2 combinations different ?"

You are wrong. You count pairings multiple times.
(a-b, c-d), (e-f, g-h) is equivivalent to
(a-b, e-f), (c-d, g-h) and (a-b, g-h), (c-d, e-f)

But (a-b, c-d), (e-f, g-h) , (a-b, e-f), (c-d, g-h), (a-b, g-h), (c-d, e-f) ARE different.

IF you see it as an immutable tree and think about the games that will happen after the first round.

Supposing the first mantioned player wins:

in (a-b, c-d), (e-f, g-h) the hext games will be a-c and e-g
in (a-b, e-f), (c-d, g-h) the hext games will be a-e and c-g
in (a-b, g-h), (c-d, e-f) the hext games will be a-g and c-e .
(and these are all different)

But if you see the rounds as independent of each other you are right. :salute:

Joaz Banbeck · **#30**

You guys are over-complicating it.

You have 8 people on 4 boards. Pick one player; there are 7 people that he can be paired with. Then there are 6 people left. Pick one of them; there are 5 people that he can be paired with. Now there are 4 people left. Pick one of them; there are 3 people that he can be paired with. You are now at the last board, with 2 people left and there is only 1 way to pair them.

7 * 5 * 3 * 1 = 105

Laman · **#31**

actually, if i assume fixed pairing tree from the first round to the finals, then i think there are 315 (= 3 * 105) possible pairings. 105, as was many times repeated, and 3 because after making four pairs, you have three different ways how to put them into the tree. correct me if i am wrong

then, from the 315 starting positions, 6 games are played to determine finalists (4 in quarterfinals and 2 in semifinals), so it makes 2^6 = 64 ways of possible tournament progress. 315 * 64 = 20 160 ways to be evaluated. again, correct me if i am wrong

third, i am not sure what were Robert Jasiek's preferences for avoiding repetitions, but i think it is natural to weigh repetition in quarterfinals (1st round) with 1 and in semifinals with 1/4 (repetition in finals is unavoidable), because for latter rounds the players causing repetition have to first win their games in former rounds (with 0.5 likelihood each). then we can simply take weighted sum of the repetitions in worst case scenario for each starting position and choose the tree with smallest value

fourth, it is not difficult to write one-purpose program that can search through all initial pairings and for each search through all possible scenarios and then choose the pairing (pairings) with lowest score. (i believe that even if i made some mistake with number 20 160)

fifth, as i checked last year EGC results, it was pretty easy to evaluate the situation even only with paper and pencil. i didn't save it, but if i recall correctly, taking situation after 7 rounds of main tournament, the best solution was 1 possible repetition in semifinal pairing. with 8 rounds it became slightly more complicated but still pretty manageable. i admit i should check also results few years back to see if it is usually easier of more difficult.

last, i don't fully understand why it is so important to avoid repeated pairings. in swiss and MM tournaments it is clear, because repeated pairing provides less information about the players performance than a not-repeated one. but in this case you use macmahon tournament only as qualification for the otherwise independent knock-out. repetitions are surely less fun for both players and fans (ok, this argument should be important enough), but in my opinion doesn't hurt this tournament system. actually, games of the main tournament seems to be even less important because the standings of 8 qualified players are not even mentioned as a seeding criterion for pairing of the knock-out. or, if they are considered as such, they are of less importance than avoiding repetitions.

sorry if i messed up terms quarterfinals and semifinals somewhere, i got confused several times during writing. otherwise i hope my long post is meaningful and will help the discussion

RobertJasiek · **#32**

Players prefer not to have repeated opponents. From an abstract POV, giving a player a second chance to win against the same opponent is undesirable (but this argument has counter-arguments).

Joaz Banbeck · **#33**

Cassandra wrote:

Joaz Banbeck wrote:

There are 105 possible first round pairings. They generate 70 possible quartets for for the semi-finals, leading to 28 possible finals pairs. Thus the upper bound for the total number of possibilities is 205800. That is well within brute-force range.

The room to work with is not that big.
...

True, but really not relevant. Once we have established an upper bound, and we conclude that this means that Jasiek should just solve it by brute force, then refining the upper bound doesn't add anything to the conclusion. ;-)

Laman wrote:

actually, if i assume fixed pairing tree from the first round to the finals, then i think there are 315 (= 3 * 105) possible pairings. 105, as was many times repeated, and 3 because after making four pairs, you have three different ways how to put them into the tree. correct me if i am wrong

then, from the 315 starting positions, 6 games are played to determine finalists (4 in quarterfinals and 2 in semifinals), so it makes 2^6 = 64 ways of possible tournament progress. 315 * 64 = 20 160 ways to be evaluated. again, correct me if i am wrong

third, i am not sure what were Robert Jasiek's preferences for avoiding repetitions, but i think it is natural to weigh repetition in quarterfinals (1st round) with 1 and in semifinals with 1/4 (repetition in finals is unavoidable), because for latter rounds the players causing repetition have to first win their games in former rounds (with 0.5 likelihood each). then we can simply take weighted sum of the repetitions in worst case scenario for each starting position and choose the tree with smallest value

fourth, it is not difficult to write one-purpose program that can search through all initial pairings and for each search through all possible scenarios and then choose the pairing (pairings) with lowest score. (i believe that even if i made some mistake with number 20 160)
...

Most of the comments above have assumed that elements on the tree can be be swapped around. But it can be solved with a fixed tree too, and your final answer is correct. :bow:

@Jasiek: Again, just brute force it.

RobertJasiek · **#34**

Cassandra appears to have written an Excel file for brute force minimization. Any1 to do it for OpenOfficeCalc?

Laman · **#35**

RobertJasiek wrote:

Players prefer not to have repeated opponents. From an abstract POV, giving a player a second chance to win against the same opponent is undesirable (but this argument has counter-arguments).

ok, i can agree with that. i was only under impression that repetitions are thought to affect results in a negative way (and i couldn't understand that)

brute force is always a solution ;-)

Joaz Banbeck · **#36**

Laman wrote:

...

brute force is always a solution ;-)

All too true. Unfortunately, it is not always a good solution.

Cassandra · **#37**

Joaz Banbeck wrote:

Laman wrote:

...
brute force is always a solution ;-)

All too true. Unfortunately, it is not always a good solution.

Fortunately, there will be more than 1 solution in many cases.

emeraldemon · **#38**

Here's a script to calculate the best by brute force, the website won't let me attach the file. The hard part is enumerating the combinations in a way with no redundancies. I wrote it so it should in theory work with any size, but for s=16 it takes too long to finish. s=8 finishes basically instantly. just change the list that says "previous" to include whatever pairs have already played each other, it will return all possible combos sorted by score.

Code:

from itertools import combinations

#how many ways are there to partition a set into 2 equal subsets?  n choose (n/2).
#then we just continually push into smaller & smaller sub-partitions
# At the top level, we break into a number of lists, i.e. lists of pairs [(brkt1, brkt2)]
# then for each sub bracket we do the same, i.e. [([(sub1, sub2)], [(sub3, sub4)]]
# finally at each pairing we need to do the cartesian product, i.e. for each comb. of sub1 & sub2, create a bracket.
def subpairs(parts):
   if len(parts) == 1:
      return [parts]
   l = []
   t = list(combinations(parts, len(parts)//2))
   for j in range(0, len(t)//2):
      for a in subpairs(t[j]):
         for b in subpairs(t[len(t) - 1 -j]):
            l.append(a + b)
   return l

players = range(0,8)

previous = [(0,1), (3,7), (0,2), (2,7), (5,6), (2,3)]

permutation_score = []

for perm in subpairs(players):
   rematch = 0.0
   s = map(lambda x : [x], perm)

   while len(s) > 2:
      t = []
      for i in range(0,len(s),2):
         for j in s[i]:
            for k in s[i+1]:
               if (j,k) in previous or (k,j) in previous:
                  rematch += 1.0/(len(s[i])**2)
         t.append(s[i] + s[i+1])
      s = t

   permutation_score.append((rematch, perm))

permutation_score.sort(reverse = True)
for out in permutation_score:
   print out

Cassandra · **#39**

RobertJasiek wrote:

Cassandra appears to have written an Excel file for brute force minimization. Any1 to do it for OpenOfficeCalc?

I suppose that it would be sufficient to let someone with OpenOffice convert the Excel file I have sent you.

May be that it would even work to just open the Excel file with OpenOffice. Because it is said that OpenOffice is compatible to Microsoft Office, so that one can at least open Microsoft Office files.

HermanHiddema · **#40**

I decided to also give it a shot.

The code at the bottom (hidden) is Python 3. Save it as e.g. "egcknockout.py"to use it. (BTW: Can an admin please relax the restrictions on attachments? I couldn't even attach it as txt).

The program takes as input a wall list as produced by common pairing programs (in a format compatible with the EGD), e.g. the top 8 Europeans from the EGC 2010 after round 7 (courtesy Asparagus):

Code:

 2    Ilja Shikshin    7d    RU    Kaza    16    105        6    11+/w    4+/b    9+/b    12+/w    1+/w    8+/b    3-/w    ~    ~    ~   12013342   2709 +13
 3    Artem Kachanovskyj    6d    UA    Rivn    16    103        6    13+/w    31+/w    7+/b    1-/b    15+/w    5+/b    2+/b    ~    ~    ~   12662870   2642 +29
 5    Cornel Burzo    6d    RO    BaMa    15    101        5    6+/b    1-/w    21+/b    36+/w    30+/w    3-/w    11+/b    ~    ~    ~   10325249   2599 +6
 6    Alexandre Dinerchtein    7d    RU    Kaza    15    100    1    5    5-/w    43+/b    31+/b    14+/w    7+/w    1-/b    13+/w    ~    ~    ~   10313237   2742 -19
 8    Csaba Mero    6d    HU    BuPe    15    99        5    38+/w    14+/b    1-/w    64+/b    11+/w    2-/w    16+/b    ~    ~    ~   10225875   2608 =0
 9    Ali Jabarin    5d    IL    TAv    15    97        5    30+/w    32+/b    2-/w    13+/b    57+/w    4-/b    17+/b    ~    ~    ~   14637810   2514 +43
 10    Cristian Pop    7d    RO    Bucu    15    95        5    14-/w    61+/b    22+/b    11-/w    28+/b    29+/w    18+/b    ~    ~    ~   10333389   2676 -1
 11    Antti Tormanen    6d    FI    Helsinki    14    103        4    2-/b    25+/w    27+/w    10+/b    8-/b    24+/w    5-/w    ~    ~    ~   12949310   2542 +24

Note: I have listed Antti Tormanen here, but with the current rules it could also be Catalin Taranu, depending on the result of a play-off

As output, it returns all possible pairings, ranked by desirability. Desirability is determined by three factors:

1. Minimize the number of repeat pairings in the Quarter Final
2. Minimize the average number of repeat pairings in the Semi Final
3. Try to stay as close as possible to a fold pairing.

Running the above list through the program returns (first 25 lines):

Code:

(0, 0.125, 12, 'Antti Tormanen - Alexandre Dinerchtein, Cristian Pop - Ilja Shikshin, Artem Kachanovskyj - Ali Jabarin, Cornel Burzo - Csaba Mero')
(0, 0.125, 22, 'Antti Tormanen - Artem Kachanovskyj, Cristian Pop - Cornel Burzo, Ilja Shikshin - Alexandre Dinerchtein, Ali Jabarin - Csaba Mero')
(0, 0.125, 44, 'Antti Tormanen - Alexandre Dinerchtein, Cristian Pop - Csaba Mero, Artem Kachanovskyj - Ali Jabarin, Ilja Shikshin - Cornel Burzo')
(0, 0.1875, 4, 'Antti Tormanen - Artem Kachanovskyj, Cristian Pop - Ilja Shikshin, Cornel Burzo - Csaba Mero, Alexandre Dinerchtein - Ali Jabarin')
(0, 0.1875, 34, 'Antti Tormanen - Artem Kachanovskyj, Cristian Pop - Alexandre Dinerchtein, Ilja Shikshin - Cornel Burzo, Ali Jabarin - Csaba Mero')
(0, 0.1875, 36, 'Antti Tormanen - Ali Jabarin, Cristian Pop - Ilja Shikshin, Artem Kachanovskyj - Alexandre Dinerchtein, Cornel Burzo - Csaba Mero')
(0, 0.1875, 36, 'Antti Tormanen - Artem Kachanovskyj, Cristian Pop - Csaba Mero, Ilja Shikshin - Cornel Burzo, Alexandre Dinerchtein - Ali Jabarin')
(0, 0.1875, 46, 'Antti Tormanen - Ali Jabarin, Cristian Pop - Cornel Burzo, Artem Kachanovskyj - Csaba Mero, Ilja Shikshin - Alexandre Dinerchtein')
(0, 0.1875, 68, 'Antti Tormanen - Ali Jabarin, Cristian Pop - Csaba Mero, Artem Kachanovskyj - Alexandre Dinerchtein, Ilja Shikshin - Cornel Burzo')
(0, 0.25, 2, 'Antti Tormanen - Artem Kachanovskyj, Cristian Pop - Ilja Shikshin, Cornel Burzo - Ali Jabarin, Alexandre Dinerchtein - Csaba Mero')
(0, 0.25, 14, 'Antti Tormanen - Alexandre Dinerchtein, Cristian Pop - Ilja Shikshin, Artem Kachanovskyj - Csaba Mero, Cornel Burzo - Ali Jabarin')
(0, 0.25, 26, 'Antti Tormanen - Artem Kachanovskyj, Cristian Pop - Csaba Mero, Ilja Shikshin - Alexandre Dinerchtein, Cornel Burzo - Ali Jabarin')
(0, 0.25, 34, 'Antti Tormanen - Artem Kachanovskyj, Cristian Pop - Ali Jabarin, Ilja Shikshin - Alexandre Dinerchtein, Cornel Burzo - Csaba Mero')
(0, 0.25, 38, 'Antti Tormanen - Alexandre Dinerchtein, Cristian Pop - Artem Kachanovskyj, Ilja Shikshin - Cornel Burzo, Ali Jabarin - Csaba Mero')
(0, 0.25, 42, 'Antti Tormanen - Artem Kachanovskyj, Cristian Pop - Ali Jabarin, Ilja Shikshin - Cornel Burzo, Alexandre Dinerchtein - Csaba Mero')
(0, 0.25, 54, 'Antti Tormanen - Alexandre Dinerchtein, Cristian Pop - Ali Jabarin, Artem Kachanovskyj - Csaba Mero, Ilja Shikshin - Cornel Burzo')
(0, 0.25, 58, 'Antti Tormanen - Ali Jabarin, Cristian Pop - Alexandre Dinerchtein, Artem Kachanovskyj - Csaba Mero, Ilja Shikshin - Cornel Burzo')
(0, 0.3125, 42, 'Antti Tormanen - Ali Jabarin, Cristian Pop - Artem Kachanovskyj, Ilja Shikshin - Alexandre Dinerchtein, Cornel Burzo - Csaba Mero')
(0, 0.3125, 50, 'Antti Tormanen - Ali Jabarin, Cristian Pop - Artem Kachanovskyj, Ilja Shikshin - Cornel Burzo, Alexandre Dinerchtein - Csaba Mero')
(1, 0.0, 12, 'Antti Tormanen - Artem Kachanovskyj, Cristian Pop - Cornel Burzo, Ilja Shikshin - Csaba Mero, Alexandre Dinerchtein - Ali Jabarin')
(1, 0.0, 14, 'Antti Tormanen - Ilja Shikshin, Cristian Pop - Cornel Burzo, Artem Kachanovskyj - Alexandre Dinerchtein, Ali Jabarin - Csaba Mero')
(1, 0.0, 20, 'Antti Tormanen - Alexandre Dinerchtein, Cristian Pop - Cornel Burzo, Artem Kachanovskyj - Ali Jabarin, Ilja Shikshin - Csaba Mero')
(1, 0.0, 44, 'Antti Tormanen - Ali Jabarin, Cristian Pop - Cornel Burzo, Artem Kachanovskyj - Alexandre Dinerchtein, Ilja Shikshin - Csaba Mero')
(1, 0.0625, 2, 'Antti Tormanen - Ilja Shikshin, Cristian Pop - Cornel Burzo, Artem Kachanovskyj - Ali Jabarin, Alexandre Dinerchtein - Csaba Mero')
(1, 0.0625, 6, 'Antti Tormanen - Ilja Shikshin, Cristian Pop - Alexandre Dinerchtein, Artem Kachanovskyj - Ali Jabarin, Cornel Burzo - Csaba Mero')

Where the first number is repeat pairings in the QF, the second is the average number of repeats in the SF, and the third is closeness to fold pairing (sum of squares). For each of them, lower is better.

The above means there are three pairings without repetitions that minimize SF repetitions to 12.5% (2 out of 16 possible results have a forced repeat pairing). The best has a fold quality of 12. It is possible to improve the fold quality to 4 (closer to fold), but at the cost of increasing the chance of a repetition in the SF to 18.75% (3/16).

The 20th to 23rd listed pairings reduce the chance of a repetition in the SF to zero, but at the cost of containing a repetition in the QF.

For determining the closeness to a fold pairing, the order of listing is used, not the order of placement as given in the data. That means you can swap lines around if you feel that the resulting listing gives a better ordering to the player, without having to renumber anything. (For example, in case of equal MMS/SOS, you might want order players according to Prior Rating, while the pairing program orders them by SOSOS)

To test the program, you can call it from the commandline and it will run on the top 8 from the EGC 2010 as listed above. The code also contains a version with Catalin instead of Antti. Choose the preset "egc2010b" for that (default is "egc2010a").

At the bottom is data from the EGC 2009 after 7 rounds, from which other test sets can be made.

To test under windows, you can also run IDLE (python GUI) and (if the file is save as egcknockout.py in python's Lib/site-packages directory) do:

Code:

import egcknockout
r = egcknockout.optimize(preset="egc2010a")
for x in r[:15]:
    print(x)

The code:

Code:

import re

"""Scan for players, one player per line.

Specify a file to read, a block of text to parse, or a list of lines to parse.

Each line should be white space separated, with the following fields:
<id> <firstname> <lastname> <other fields>

This format is compatible with the one used by wall lists for the EGD

The first three fields will be parsed as such, and are mandatory. 
The id is parsed as a number, any additional characters will be discarded.

Make sure there are no spaces in names (replace by underscores if needed).
First/last name can be in any order, the order will be retained

The other fields will be parsed for pairings. A pairing should be a number
(referring to an opponent) followed by a +, = or - (win, draw or loss).
It can optionally be followed by more characters, such as color and handicap
info, which will simply be discarded.

Each player will be assigned an order, which is used to determine the
quality of the pairing. The order is assigned according to the order of
the players in the input, not according to the player's id.
"""   

def parseplayers(filepath=None, text=None, lines=None):
   if filepath:
      try:
         with open(filepath) as f:
            text = f.read()
      except IOError:
         print("Unable to open file {0}".format(filepath))
         return False
   elif text:
      lines = text.splitlines()
   elif not lines:
      print("No argument given, please specify one of filepath, text or lines")
      return False
   players = {}
   order = 0
   pairing = re.compile(r"^(\d+)[+=-]")
   for line in lines:
      fields = line.split()
      if len(fields) > 3:
         p = {
            "id":fields[0],
            "name":"{0} {1}".format(fields[1], fields[2]),
            "order":order,
            "opponents":[]
            }
         for field in fields[3:]:
            match = pairing.match(field)
            if match:
               p["opponents"].append(match.group(1))
         order += 1
         players[p['id']] = p
   return players   

"""Creates a matrix of prior pairings"""

def priormatrix(plist):
   np = len(plist) #number of players
   priors = [[0]*np for i in range(np)] # create matrix
   for id, player in plist.items():
      for opp_id in player["opponents"]:
         if opp_id in plist: # this opponent is also in the group
            opponent = plist[opp_id]
            priors[player['order']][opponent['order']] = 1
            #the reverse matrix entry is not set, it will be set by the reverse entry
   return priors

"""Generator function, which yields all possible pairings within the given
list of players. Number of players must be even.
"""

def allpairings(plist):
   length = len(plist)
   if (length % 2) != 0 or length < 2:
      return # odd number of players or empty list
   if length == 2:
      yield [(plist[0], plist[1])]
   else:
      for i in range(1,length):
         head = (plist[0], plist[i])
         for tail in allpairings(plist[1:i] + plist[i+1:length]):
            yield [head] + tail

"""Determines pairing quality as a triple of numbers.
The first of these indicates the number of repeats in this pairing,
the second indicates the average number of repeats in the next round,
the third indicates how close to a fold pairing this pairing is. 

The arguments are a pairing and the prior pairings matrix.

How close to a fold pairing the pairing is is calculated as a sum of
the squares of the distances of each pairing from perfect.

E.g. with 8 players, 1-8 is a perfect pairing, as are 2-7, 3-6 and 4-5.
If the sum of the order of players is 9, it is perfect, and any such
pairings have distance zero. If the sum is not 9, the difference with 9
is the distance, and this is squared to calculate quality (e.g. the
pairing 1-7 is close to perfect, distance 1, squared to 1, the pairing
1-2 is far from perfect, distance 6, squared to 36.
"""

def pairingquality(pairing, priors):
   repeats = 0 # number of repeat pairings in this pairing
   avgreps = 0 # average number of repeat pairings in next round pairings
   quality = 0 # Pairing quality (sum of squares distance from fold pairing)
   np = len(pairing)
   optimal = 2 * np - 1 # with fold pairing, this is the optimal sum of positions (where first position is zero)
   for pair in pairing:
      repeats += priors[pair[0]['order']][pair[1]['order']]
      quality += (pair[0]['order'] + pair[1]['order'] - optimal) ** 2
   result = 0
   minimums = []
   while result < 2**np: # consider all possible results (encoded in bits)
      winners = []
      for i in range(np):
         if result & 2**i > 0:
            winners.append(pairing[i][1])
         else:
            winners.append(pairing[i][0])
      minreps = len(winners)
      for wpairing in allpairings(winners):
         reps = 0
         for wpair in wpairing:
            reps += priors[wpair[0]['order']][wpair[1]['order']]
         if reps < minreps:
            minreps = reps
      minimums.append(minreps)
      result += 1
   avgreps = sum(minimums)/len(minimums)
   return (repeats, avgreps, quality)

"""Evaluates all possible pairings within the given list of players,
and returns a sorted list of pairings. Each list items is a tuple of four
items. The first three are those returned by the pairingquality function,
the last is a textual representation of the pairing, using player names.
The list is first sorted by the first entry in the tuple, then the second,
then the third.
"""   

def evaluatepairings(plist, priors):
   pairings = []
   for pairing in allpairings(plist):
      pairstr = ""
      for pair in pairing:
         pairstr = pairstr + "{0} - {1}, ".format(pair[0]['name'], pair[1]['name'])
      pairstr = pairstr.rstrip(", ")
      quality = pairingquality(pairing, priors)
      pairings.append(quality + (pairstr,))
   return sorted(pairings)

def optimize(filepath=None, text=None, lines=None, preset=None):
   presets = {
"egc2010a":"""
 2    Ilja Shikshin    7d    RU    Kaza    16    105        6    11+/w    4+/b    9+/b    12+/w    1+/w    8+/b    3-/w    ~    ~    ~   12013342   2709 +13
 3    Artem Kachanovskyj    6d    UA    Rivn    16    103        6    13+/w    31+/w    7+/b    1-/b    15+/w    5+/b    2+/b    ~    ~    ~   12662870   2642 +29
 5    Cornel Burzo    6d    RO    BaMa    15    101        5    6+/b    1-/w    21+/b    36+/w    30+/w    3-/w    11+/b    ~    ~    ~   10325249   2599 +6
 6    Alexandre Dinerchtein    7d    RU    Kaza    15    100    1    5    5-/w    43+/b    31+/b    14+/w    7+/w    1-/b    13+/w    ~    ~    ~   10313237   2742 -19
 8    Csaba Mero    6d    HU    BuPe    15    99        5    38+/w    14+/b    1-/w    64+/b    11+/w    2-/w    16+/b    ~    ~    ~   10225875   2608 =0
 9    Ali Jabarin    5d    IL    TAv    15    97        5    30+/w    32+/b    2-/w    13+/b    57+/w    4-/b    17+/b    ~    ~    ~   14637810   2514 +43
 10    Cristian Pop    7d    RO    Bucu    15    95        5    14-/w    61+/b    22+/b    11-/w    28+/b    29+/w    18+/b    ~    ~    ~   10333389   2676 -1
 11    Antti Tormanen    6d    FI    Helsinki    14    103        4    2-/b    25+/w    27+/w    10+/b    8-/b    24+/w    5-/w    ~    ~    ~   12949310   2542 +24
""",
"egc2010b":"""
 2    Ilja Shikshin    7d    RU    Kaza    16    105        6    11+/w    4+/b    9+/b    12+/w    1+/w    8+/b    3-/w    ~    ~    ~   12013342   2709 +13
 3    Artem Kachanovskyj    6d    UA    Rivn    16    103        6    13+/w    31+/w    7+/b    1-/b    15+/w    5+/b    2+/b    ~    ~    ~   12662870   2642 +29
 5    Cornel Burzo    6d    RO    BaMa    15    101        5    6+/b    1-/w    21+/b    36+/w    30+/w    3-/w    11+/b    ~    ~    ~   10325249   2599 +6
 6    Alexandre Dinerchtein    7d    RU    Kaza    15    100    1    5    5-/w    43+/b    31+/b    14+/w    7+/w    1-/b    13+/w    ~    ~    ~   10313237   2742 -19
 8    Csaba Mero    6d    HU    BuPe    15    99        5    38+/w    14+/b    1-/w    64+/b    11+/w    2-/w    16+/b    ~    ~    ~   10225875   2608 =0
 9    Ali Jabarin    5d    IL    TAv    15    97        5    30+/w    32+/b    2-/w    13+/b    57+/w    4-/b    17+/b    ~    ~    ~   14637810   2514 +43
 10    Cristian Pop    7d    RO    Bucu    15    95        5    14-/w    61+/b    22+/b    11-/w    28+/b    29+/w    18+/b    ~    ~    ~   10333389   2676 -1
 12    Catalin Taranu    7d    RO    Bucu    14    100        4    29+/b    26+/b    18+/w    2-/b    4-/w    50+/w    7-/w    ~    ~    ~   10586785   2739 -13
"""
   }
   if preset in presets:
      plist = parseplayers(text=presets[preset])
   else:
      plist = parseplayers(filepath, text, lines)
   if not plist:
      print("Error parsing list of players.")
      return
   priors = priormatrix(plist)
   result = evaluatepairings(list(plist.values()), priors)
   return result

if __name__ == "__main__": ### if called from the commandline
   r = optimize(preset="egc2010a")
   for x in r:
      print(x)

### EGC2009   data
"""
 3    Alexandr Dinerstein    7d    RU    Kaza    16    101        6    19+/w    17+/b    2-/w    30+/b    8+/b    5+/w    12+/w    ~    ~    ~   10313237   2731 -1
 8    Rob van Zeijst    7d    NL    Toky    15    99        5    32+/w    1-/b    75+/b    19+/w    3-/w    22+/b    16+/w    ~    ~    ~   10337382   2668 -3
 11    Catalin Taranu    7d    RO    Bucu    15    97        5    18+/w    65+/b    1-/w    14-/b    36+/w    13+/b    15+/w    ~    ~    ~   10586785   2712 +1
 12    Geert Groenen    6d    NL    Dha    14    102        4    10-/b    29+/w    83+/b    9+/w    1-/w    19+/b    3-/b    ~    ~    ~   10501216   2543 -1
 13    Cornel Burzo    6d    RO    BaMa    14    100        4    1-/w    47+/b    24+/w    55+/w    2-/b    11-/w    49+/b    ~    ~    ~   10325249   2594 -22
 14    Thomas Debarre    5d    FR    67Se    14    100        4    9-/w    37+/b    23+/w    11+/w    6-/b    34+/w    7-/b    ~    ~    ~   12950872   2520 +27
 15    Csaba Mero    6d    HU    BuPe    14    99        4    33+/w    2-/b    22+/w    21+/w    16-/b    39+/w    11-/b    ~    ~    ~   10225875   2617 +1
 16    Cristian Pop    7d    RO    Bucu    14    98        4    24+/b    5-/w    35+/b    65+/w    15+/w    4-/b    8-/b    ~    ~    ~   10333389   2689 -5
 17    Ilja Shikshin    7d    RU    Kaza    14    98        4    38+/b    3-/w    44+/w    6-/b    31+/b    10-/w    54+/w    ~    ~    ~   12013342   2690 +3
 18    Dmytro Bogackyj    6d    UA    Kyiv    14    98        4    11-/b    111+/w    32+/w    10+/w    7-/b    21+/b    6-/w    ~    ~    ~   10250867   2540 +9
 19    Antoine Fenech    5d    FR    67Se    14    97        4    3-/b    40+/w    33+/w    8-/b    51+/b    12-/w    44+/b    ~    ~    ~   10349350   2516 +3
 20    Mykhajlo Galchenko    5d    UA    Kyiv    14    97        4    4-/b    51+/w    34+/b    7-/w    50+/w    33+/b    9-/w    ~    ~    ~   10433236   2498 +6
 21    Frank Janssen    6d    NL    Alme    14    96        4    23+/b    6-/w    49+/w    15-/b    41+/w    18-/w    36+/b    ~    ~    ~   10298596   2560 -7
 22    Javier-Aleksi Savolainen    4d    FI    Helsinki    14    93        5    59+/w    66+/b    15-/b    70+/b    23+/w    8-/w    38+/w    ~    ~    ~   14001537   2364 +54
 23    Benjamin Teuber    6d    DE    Hamb    14    91        4    21-/w    67+/b    14-/b    74+/b    22-/b    73+/w    40+/w    ~    ~    ~   10437174   2538 -8
 24    Viktor Lin    5d    AT    Wie    14    91        4    16-/w    56+/w    13-/b    36-/b    85+/w    67+/b    39+/w    ~    ~    ~   14501256   2461 +9
 25    Peter Brouwer    5d    NL    Amst    14    90        5    78+/b    31-/b    76+/w    66+/w    10-/b    58+/w    32+/w    ~    ~    ~   10986514   2472 +12
"""

Minimize Repeated Pairs in 8 Player KO

Who is online