BIT 30 (1990), 583-591

ON GENERATION OF PERMUTATIONS THROUGH DECOMPOSITION OF SYMMETRIC GROUPS INTO COSETS

ZBIGNIEW KOKOSIŃSKI

Institute of Electrical Engineering and Electronics, Technical University of Kraków, ul. Warszawska 24, 31-155 Kraków, Poland.

Received February 1990. Revised April 1990.

Abstract.

A hardware-oriented algorithm for generating permutations is presented that takes as its theoretical basis an iterative decomposition of the symmetric group $S_n$ into cosets. It generates permutations in a new order. Simple ranking and unranking algorithms are given. A permutation generator is proposed whose main component is a cellular permutation network. The application of the permutation generator to solving a class of combinatorial problems on parallel computers is suggested.

C.R. categories: F.1.2, G.2.1.

Keywords and Phrases: permutation generation, hardware-oriented algorithm, combinatorial problems, parallel computations.

1. Introduction.

Parallel processing of many combinatorial problems solvable by inspection will become possible if the problem of parallel generation of their instances is solved in a proper way. Two different approaches to the parallel generation of combinatorial objects are of interest: 1) design of parallel algorithms for a model of parallel computations with any number of processors, and 2) application of a sequential algorithm (or a parallel algorithm for a given number of processors) to a model of parallel computations in which disjoint subsets of combinatorial objects are distributed among any number of processors (parallel processors).

The parallel permutation generation method proposed in this paper belongs to the second group of solutions. A new sequential algorithm for generation of permutations results from an iterative coset decomposition of the symmetric group $S_n$. A linear order on the set of all $n!$ permutations is defined.
Ranking and unranking algorithms are given that allow the choice of any partition rule. A direct hardware implementation of the permutation generation algorithm is shown, with a triangular permutation network built from two-state cells as its main component. Data are stored in registers (RAM memory) connected to the network inlets. Data permutations can be obtained by a "read-through-the-network" operation.

The permutation generator is to be used for: 1) generation of data permutations in the form of sequences of items (for instance numbers from the set {1, 2, ..., n}, addresses, etc.), and 2) generation of column (row) permutations of a Boolean matrix written in RAM memory (parallel access to all bits of a given row (column) of the matrix is possible at any moment by addressing RAM in a proper way). This last form of data permutation provides fast parallel computations with Boolean matrices representing a class of combinatorial problems solvable only by inspection. Graph isomorphism, Hamiltonian circuit and Boolean function implementation with the aid of multiplexers and decoders are typical problems from this class.

Either an SIMD or an MIMD model of parallel computations is to be chosen. The number of working processors is unrestricted. "Initialization data" for every processor are obtained using the unranking algorithm.

In the permutation generator the regular structures of the control circuit and of the cellular permutation network overlap. This property seems to be very important from the point of view of VLSI design requirements.

Until now a large number of serial and parallel algorithms for permutation generation have been developed. An exhaustive list of early papers in this area may be found in [17]. More than ten new methods for the generation of permutations, including such specific classes of permutations as stack permutations and derangements, have been proposed in the intervening years: [1, 2, 3, 4, 6, 7, 11, 12, 14, 18, 19].
The originality of our solution consists both in treating permutation generation as a process of generating interconnection patterns in a permutation network and in introducing parallelism by splitting the generation task with the unranking algorithm.

2. Mathematical background.

In this section some concepts from the theory of groups [8, 9] and transversal theory [13] are presented. The main ideas of the paper are described formally in order to show the validity of our results. The uniform representation of combinatorial objects by "choice functions" is introduced in [11].

Let $G$ be a permutation group on a finite set $\Omega = \{1, 2, \ldots, n\}$. For a given group $G$ and its subgroup $H$, the set of all elements $\{hg : g \in G,\ g \text{ fixed},\ h \in H\}$ is called a right coset of $H$ in $G$. Similarly, the set of all elements $\{gh : g \in G,\ g \text{ fixed},\ h \in H\}$ is called a left coset of $H$ in $G$. Two right (left) cosets of $H$ in $G$ are either disjoint or identical sets of elements. We write $G = Hg_1 \cup Hg_2 \cup \ldots \cup Hg_r$, where $r$ denotes the number of right cosets of $H$ in $G$. We call $g_i$, $i = 1, 2, \ldots, r$, the right coset representatives. Similarly, we write $G = g_1H \cup g_2H \cup \ldots \cup g_lH$, where $l$ denotes the number of left cosets of $H$ in $G$. We call $g_i$, $i = 1, 2, \ldots, l$, the left coset representatives.

In [8] Hall and Paige have proposed two partitions of the symmetric group $S_n$ on the finite set $\Omega = \{1, \ldots, n\}$ into right and left cosets of $S_{n-1}$ in $S_n$:

$$S_n = S_{n-1}\tau_n^0 + S_{n-1}\tau_n^1 + \ldots + S_{n-1}\tau_n^{n-1} \tag{1}$$

$$S_n = \tau_n^0 S_{n-1} + \tau_n^1 S_{n-1} + \ldots + \tau_n^{n-1} S_{n-1} \tag{2}$$

where $+$ denotes the union of disjoint sets and $\tau_i^j$ denotes the transposition $(i\ j)$ (in particular $\tau_i^0$ is the identity permutation). The complete iterative decomposition of $S_n$ into right cosets resulting from equation (1) is given below:

$$
\begin{aligned}
S_n &= S_{n-1}\tau_n^0 + S_{n-1}\tau_n^1 + \ldots + S_{n-1}\tau_n^{n-1} \\
S_{n-1} &= S_{n-2}\tau_{n-1}^0 + S_{n-2}\tau_{n-1}^1 + \ldots + S_{n-2}\tau_{n-1}^{n-2} \\
&\ \,\vdots \\
S_3 &= S_2\tau_3^0 + S_2\tau_3^1 + S_2\tau_3^2 \\
S_2 &= \tau_2^0 + \tau_2^1
\end{aligned}
\tag{3}
$$

In a similar way the complete iterative decomposition of $S_n$ into left cosets is obtained from equation (2):

$$
\begin{aligned}
S_n &= \tau_n^0 S_{n-1} + \tau_n^1 S_{n-1} + \ldots + \tau_n^{n-1} S_{n-1} \\
S_{n-1} &= \tau_{n-1}^0 S_{n-2} + \tau_{n-1}^1 S_{n-2} + \ldots + \tau_{n-1}^{n-2} S_{n-2} \\
&\ \,\vdots \\
S_3 &= \tau_3^0 S_2 + \tau_3^1 S_2 + \tau_3^2 S_2 \\
S_2 &= \tau_2^0 + \tau_2^1
\end{aligned}
\tag{4}
$$

Both decomposition schemes (3) and (4) have been shown to be group-theoretic representations of triangular permutation networks [15].

Every set $T_i = \{\tau_i^0, \tau_i^1, \ldots, \tau_i^{i-1}\}$, for $i \in I = \{2, \ldots, n\}$, is a system of right (left) coset representatives and is called a complete right (left) transversal for $S_{i-1}$ in $S_i$. Obviously $|T_i| = i$ and $T_k \cap T_l = \emptyset$ for every $k \neq l$, $k, l \in I$. A mapping $F$: $F(i) = T_i$, for every $i \in I$, is called an indexed family of sets $T_i$ (denoted by $\langle T_i \rangle_{i \in I}$). The Cartesian product $\times_{i \in I} T_i$ is the set of all mappings $f$: $f(i) = \tau_i^j \in T_i$, for every $i \in I$. Any mapping $f$ which "chooses" one element from each of the sets $T_2, \ldots, T_n$ is called a "choice function" of the family $\langle T_i \rangle_{i \in I}$ [13]. The number of all distinct choice functions $f \in \times_{i \in I} T_i$ is $n! = |S_n|$. There is a one-to-one correspondence between every choice function $f$ and a certain permutation $\pi \in S_n$ for each decomposition scheme. On account of this, the terms "permutation" and "choice function" are used interchangeably throughout this paper.

Let us introduce a lexicographic order on the set of all choice functions $f \in \times_{i \in I} T_i$: we say that $f_a < f_b$ iff there exists an $i \in I$ such that $f_a(i) = \tau_i^k$ and $f_b(i) = \tau_i^l$ with $\tau_i^k < \tau_i^l$, and $f_a(j) = f_b(j)$ for every $j < i$ (where $\tau_i^k < \tau_i^l$ iff $k < l$, and $\tau_i^k = \tau_i^l$ iff $k = l$). For every decomposition scheme the list of choice functions in lexicographic order corresponds to the list of permutations of the set $\{1, \ldots, n\}$ in a new linear order.

3. The algorithm.

The algorithm PG (Permutation Generation) generates a sequence of choice functions of the family $\langle T_i \rangle_{i \in I}$ in accordance with the lexicographic order defined in the previous section. The initial choice function $f_1$ is given in the table A. Each transposition $\tau_i^j \in f_1$ is written in the form A(i) = j. The variable NUMBER holds the number of choice functions to be generated. The table NEXT is filled with the consecutive sequences of transpositions in lines 1 and 10 of the algorithm PG.

When the decomposition of $S_n$ into right (left) cosets has been carried out (according to the set of equations (3) or (4), respectively), the permutation generated by the algorithm may be easily computed by applying to the sequence of items (1, 2, ..., n) all transpositions $\tau_i^j$ from the table NEXT in the increasing (decreasing) order of indices $i$ (i.e. from $\tau_2^j$ ($\tau_n^j$) to $\tau_n^j$ ($\tau_2^j$)). All identity permutations $\tau_i^0$ may be omitted.

Algorithm PG.
Input: the integer value NUMBER, the number of permutations to be generated; the integer table A[n] with the initial choice function $f_1$;
Output: the integer table NEXT[n] with subsequent choice functions representing permutations.

     1. NEXT := A;
     2. NR := 1;
     3. print out NEXT
     4. repeat
     5.    i := n;
     6.    A(i) := (A(i) + 1) mod i;
     7.    if A(i) = 0
     8.    then i := i - 1;
     9.         goto 6
    10.    else NEXT := A;
    11.         NR := NR + 1;
    12.         print out NEXT
    13. until NR = NUMBER;

The sequence of choice functions generated by the algorithm PG and the two resulting sequences of permutations "a" and "b" are shown in Fig. 1. In this case the initial choice function is equal to (0, 0, 0) and NUMBER = n!.

    NR   NEXT      sequence "a"   sequence "b"
     1   (0,0,0)   1 2 3 4        1 2 3 4
     2   (0,0,1)   4 2 3 1        4 2 3 1
     3   (0,0,2)   1 4 3 2        1 4 3 2
     4   (0,0,3)   1 2 4 3        1 2 4 3
     5   (0,1,0)   3 2 1 4        3 2 1 4
     6   (0,1,1)   3 2 4 1        4 2 1 3
     7   (0,1,2)   3 4 1 2        3 4 1 2
     8   (0,1,3)   4 2 1 3        3 2 4 1
     9   (0,2,0)   1 3 2 4        1 3 2 4
    10   (0,2,1)   4 3 2 1        4 3 2 1
    11   (0,2,2)   1 3 4 2        1 4 2 3
    12   (0,2,3)   1 4 2 3        1 3 4 2
    13   (1,0,0)   2 1 3 4        2 1 3 4
    14   (1,0,1)   4 1 3 2        2 4 3 1
    15   (1,0,2)   2 4 3 1        4 1 3 2
    16   (1,0,3)   2 1 4 3        2 1 4 3
    17   (1,1,0)   2 3 1 4        3 1 2 4
    18   (1,1,1)   2 3 4 1        4 1 2 3
    19   (1,1,2)   4 3 1 2        3 4 2 1
    20   (1,1,3)   2 4 1 3        3 1 4 2
    21   (1,2,0)   3 1 2 4        2 3 1 4
    22   (1,2,1)   3 4 2 1        4 3 1 2
    23   (1,2,2)   3 1 4 2        2 4 1 3
    24   (1,2,3)   4 1 2 3        2 3 4 1

    i      = 2,  3, 4
    weight = 12, 4, 1

Fig. 1. Sequences of all n! choice functions and permutations for n = 4: "a" decomposition scheme (3); "b" decomposition scheme (4).

To illustrate the use of the algorithm, an example will be considered. The 12th permutation from the sequence "a", i.e. 1423, is obtained by applying to the sequence of items (1, 2, 3, 4) the transpositions $\tau_2^0, \tau_3^2, \tau_4^3$, according to the decomposition scheme (3). Similarly, the 12th permutation from the sequence "b", i.e. 1342, is computed with the transpositions $\tau_4^3, \tau_3^2, \tau_2^0$, according to the decomposition scheme (4).

Let us notice that the sequence of choice functions generated by the algorithm PG corresponds to a control structure called "factorial counting" that is essential for other permutation generation algorithms [17].

In the next section, techniques for ranking and unranking of choice functions are described. They can be widely used for the proper distribution of the permutation generation task in a multiprocessor system.
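The factorial counting of algorithm PG and the mapping from a choice function to a permutation can be sketched in Python as follows. This is a minimal illustration, not the paper's implementation: the names `choice_functions`, `swap_items` and `to_permutation` are hypothetical, and "applying a transposition" is read here as exchanging the two named items wherever they currently sit in the sequence, an assumption inferred from the worked example for the 12th permutation.

```python
from math import factorial


def choice_functions(n, start=None, number=None):
    """Yield choice functions (A(2), ..., A(n)) in lexicographic order.

    Position i counts modulo i and the last position changes fastest,
    which is the factorial counting performed by algorithm PG.
    """
    a = list(start) if start is not None else [0] * (n - 1)  # a[i - 2] is A(i)
    if number is None:
        number = factorial(n)
    for _ in range(number):
        yield tuple(a)
        i = n
        while i >= 2:  # carry propagation, as in lines 5-9 of PG
            a[i - 2] = (a[i - 2] + 1) % i
            if a[i - 2] != 0:
                break
            i -= 1


def swap_items(seq, x, y):
    """Exchange the items x and y wherever they currently sit in seq."""
    px, py = seq.index(x), seq.index(y)
    seq[px], seq[py] = seq[py], seq[px]


def to_permutation(choice, scheme="a"):
    """Map a choice function to a permutation of (1, ..., n).

    Assumed reading: scheme "a" applies tau_i^j for i = 2, ..., n,
    scheme "b" for i = n, ..., 2; tau_i^0 is the identity and is skipped.
    """
    n = len(choice) + 1
    seq = list(range(1, n + 1))
    indices = range(2, n + 1) if scheme == "a" else range(n, 1, -1)
    for i in indices:
        j = choice[i - 2]
        if j != 0:
            swap_items(seq, i, j)
    return tuple(seq)


if __name__ == "__main__":
    for nr, f in enumerate(choice_functions(4), start=1):
        print(nr, f, to_permutation(f, "a"), to_permutation(f, "b"))
```

Under this reading the sketch agrees with the worked example (rank 12 gives 1423 for "a" and 1342 for "b") and with Fig. 1 at every rank except 14 and 15, where it produces the "a" and "b" entries of the figure interchanged; whether that points to the figure or to the assumed reading is left open here.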
4. Ranking and unranking.

Ranking of the choice functions in lexicographic order follows the general ranking scheme for the list of permutations in this order [16]. The rank of the choice function from the table NEXT can be computed using the well-known formula:

$$NR = \sum_{i=2}^{n} NEXT(i)\,[n!/i!] + 1, \tag{5}$$

where the factor $[n!/i!]$ is the weight of the $i$th position of the table NEXT.

To compute the initial choice function $f_1$ with the rank NR1, which is an input for the algorithm PG, the following unranking algorithm is proposed:

Algorithm UNRANKING.
Input: the integer value n, the number of items to be permuted; the integer value NR1, the rank of the choice function $f_1$ ($1 \le NR1 \le n!$);
Output: the table A with the choice function $f_1$.

     1. NR1 := NR1 - 1;
     2. i := 1;
     3. repeat
     4.    j := i;
     5.    i := i + 1;
     6.    if NR1 < j(n!/i!)
     7.    then if j = 1
     8.         then A(i) := 0;
     9.         else j := j - 1;
    10.              go to 6
    11.    else A(i) := j;
    12.         NR1 := NR1 - j(n!/i!);
    13. until NR1 = 0;
    14. print out A

Since the loop stops as soon as NR1 reaches 0, the positions of A that have not been visited by then are assumed to be initialized to 0. The performance of the algorithm can be examined in Fig. 1. The algorithm UNRANKING has $O(n^2)$ time complexity, while ranking in accordance with formula (5) requires linear time.

The problem of finding the rank of any permutation written as a sequence of numbers or as a product of cycles may be solved in two steps. First, the choice function for this permutation has to be determined using the $O(n)$ algorithm given in [15]. Then, for ranking of the choice function, formula (5) is to be applied.
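Formula (5) and its inverse can be sketched in Python as below. The function names `rank` and `unrank` are hypothetical, and `unrank` deliberately swaps in integer division by the weight n!/i! for the repeated comparison and subtraction of algorithm UNRANKING; it is meant to return the same table A, not to mirror the algorithm line by line.

```python
from math import factorial


def rank(choice, n):
    """Rank NR of a choice function (A(2), ..., A(n)) by formula (5)."""
    return 1 + sum(
        choice[i - 2] * (factorial(n) // factorial(i)) for i in range(2, n + 1)
    )


def unrank(nr, n):
    """Choice function with rank nr, for 1 <= nr <= n!.

    Assumption: uses divmod by the weight n!/i! instead of the
    subtraction search of UNRANKING.
    """
    if not 1 <= nr <= factorial(n):
        raise ValueError("rank out of range")
    remainder = nr - 1
    digits = []
    for i in range(2, n + 1):
        weight = factorial(n) // factorial(i)
        digit, remainder = divmod(remainder, weight)  # digit is always < i
        digits.append(digit)
    return tuple(digits)


if __name__ == "__main__":
    print(unrank(12, 4))        # choice function of the 12th row of Fig. 1
    print(rank((0, 2, 3), 4))   # and its rank recovered by formula (5)
```

For n = 4 the weights are 12, 4 and 1, so rank 12 corresponds to 0·12 + 2·4 + 3·1 + 1, i.e. the choice function (0, 2, 3).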
5. Hardware implementation.

In this section a circuit generating permutations is described that can be used for the local generation of permutations in parallel systems. The general structure of the circuit is shown in Fig. 2. A permutation network and a programming device are the main components of the permutation generator.

Fig. 2. The structure of the generator of permutations (blocks: MEMORY, PERMUTATION NETWORK, "READ" DEVICE, PROGRAMMER).

The permutation network can realize any interconnection between its inlets $I_1, \ldots, I_n$ and outlets $O_1, \ldots, O_n$. Networks with this property are referred to in the literature as rearrangeable interconnection networks. After the programmer has established the initial state of the network, the first permutation is obtained. Then, under the pulse from the programmer, the network goes to next states and new permutations are realized.

After presenting the general idea of the permutation generator, a more detailed description of the circuit follows. The triangular permutation network is built from two-state cells (see Fig. 3). Each cell requires a separate control signal. The circuit realizing this control is organized in the following way. With every $i$th column of the triangular network ($2 \le i \le n$), the $i$th ring counter is associated, with the initial state from the "1-out-of-$i$" code. All column ring counters form the parallel counter with $n!$ different states. The asynchronous setting up of each ring counter and resetting of the whole parallel counter are provided. If the $j$th bit of the $i$th ring counter $b_i^j = 1$, for $1 \le j \le i - 1$, then in the $i$th column of the network only one cell, denoted by $\tau_i^j$, is activated to realize the "interchange" state. If $b_i^0 = 1$, then all cells in the $i$th column of the network are in the "identity" state. The initial states of the ring counters in the control unit are computed by the UNRANKING algorithm and set up according to table A.
Then, under the pulse from the programmer, the cells of the triangular network are activated in the lexicographic order defined on the set of all choice functions $f \in \times_{i \in I} T_i$. The output of the AND gate, which realizes the Boolean product of bits $b_n^{n-1}, \ldots, b_i^{i-1}$, forms the clock pulse for the $(i - 1)$st ring counter (triggering on the falling edge is assumed). The external clock pulse is connected to the clock input of the $n$th ring counter. The construction of the control circuit allows us to preserve the cellular structure of the circuit since flip-flops of the ring counters may be distributed among the cells of the triangular network. The regular topology of the generator is of great importance from the point of view of VLSI design.

Fig. 3. Triangular permutation network: a) cell in the "identity" state; b) cell in the "interchange" state; c) network for n = 4 and decomposition scheme (3).

6. Partitioning of the set of instances.

Results presented in the previous sections may be immediately applied to the local permutation generation for any number of parallel processors. In order to generate all $n!$ permutations in any SIMD system with $N$ processors, the subsets of permutations must be distributed uniformly among the processors: $N - 1$ generators will generate $\lceil n!/N \rceil$ permutations and the $N$th generator will generate only $n! - (N - 1)\lceil n!/N \rceil$ permutations. The initial choice functions for the sequence of numbers NR: $1,\ \lceil n!/N \rceil + 1,\ 2\lceil n!/N \rceil + 1,\ \ldots,\ (N - 1)\lceil n!/N \rceil + 1$ can be computed at the same time according to the UNRANKING algorithm. Then, in $\lceil n!/N \rceil$ steps all $n!$ permutations will be generated.

When an MIMD computer organization is assumed, the partition of $n!$ permutations among the processors need not be symmetric.
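The uniform SIMD partition can be sketched in Python as follows. The function name `partition_ranks` is hypothetical, and the clamping of block sizes to zero is an added assumption for processor counts at which $(N - 1)\lceil n!/N \rceil$ would exceed $n!$, a case the text does not discuss. Each starting rank would then be passed to the UNRANKING algorithm to obtain the initial table A of one generator.

```python
from math import factorial


def partition_ranks(n, processors):
    """Return (first_rank, count) for each of the N generators.

    Every generator but the last gets ceil(n!/N) permutations; the last
    gets the remainder. Counts are clamped to 0 (an assumption) when the
    blocks of earlier generators already cover all n! ranks.
    """
    total = factorial(n)
    block = -(-total // processors)  # ceil(n!/N) in integer arithmetic
    plan = []
    for k in range(processors):
        first_rank = k * block + 1
        count = max(0, min(block, total - k * block))
        plan.append((first_rank, count))
    return plan


if __name__ == "__main__":
    for first_rank, count in partition_ranks(4, 5):
        print(first_rank, count)
```

For n = 4 and N = 5 the block size is 5, so four generators start at ranks 1, 6, 11 and 16 with five permutations each, and the fifth starts at rank 21 with the remaining four.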
REFERENCES

1. S. G. Akl, A new algorithm for generating derangements, BIT 20 (1980), 2.
2. G. H. Chen, Maw-Sheng Chern, Parallel generation of permutations and combinations, BIT 26 (1986), 277-283.
3. M. Cosnard, A. G. Ferreira, Generating permutations on a VLSI suitable linear network, The Computer Journal 32, 6 (1989), 571-573.
4. M. C. Er, Efficient generation of stack permutations in lexicographical order, Journal of Information Processing 9, 1 (1985), 17-19.
5. M. Garey, D. S. Johnson, Computers and Intractability. A Guide to the Theory of NP-completeness, W. H. Freeman and Co., San Francisco (1979).
6. P. Gupta, G. P. Bhattacharjee, Parallel derangement generation algorithm, BIT 29 (1989), 14-22.
7. P. Gupta, G. P. Bhattacharjee, Parallel generation of permutations, The Computer Journal 26, 2 (1983), 97-105.
8. M. Hall, L. J. Paige, Complete mappings of finite groups, Pacific Journal of Math. (1955), 541-549.
9. M. Hall, The Theory of Groups, Macmillan, New York (1959).
10. W. H. Kautz, K. N. Levitt, A. Waksman, Cellular interconnection arrays, IEEE Transactions on Computers 17, 5 (1968), 443-451.
11. A. Kapralski, New methods for generation permutations, combinations and other combinatorial objects in parallel, to appear.
12. W. Lipski, More on permutation generation methods, Computing 23 (1979), 357-365.
13. L. Mirsky, Transversal Theory, Academic Press, N.Y. (1971).
14. M. Mor, A. S. Fraenkel, Permutation generation on vector processors, The Computer Journal 25, 4 (1982), 423-428.
15. A. Y. Oruç, A. M. Oruç, Programming cellular permutation networks through decomposition of symmetric groups, IEEE Transactions on Computers 36, 7 (1987), 802-809.
16. E. M. Reingold, J. Nievergelt, N. Deo, Combinatorial Algorithms: Theory and Practice, Prentice Hall Inc., Englewood Cliffs, N.J. (1977).
17. R. Sedgewick, Permutation generation methods, Computing Surveys 9, 2 (1977), 137-164.
18. I. Semba, Generation of stack sequences in lexicographical order, Journal of Information Processing 5, 1 (1982), 17-20.
19. S. Zaks, A new algorithm for generation of permutations, BIT 24 (1984), 196-204.