How To Create And Access A Table Based Upon The Binomial Coefficient

Introduction

The binomial coefficient is defined as the number of ways of uniquely picking n items in groups of k.  This is also known as n choose k.  For example, 4 choose 3 yields 4 unique sets of 3 items each: 123, 124, 134, and 234.

The formula for calculating the total number of unique combinations is:

N! / (K! (N – K)! )

where ! denotes a factorial. The binomial coefficients form the rows of Pascal’s triangle.  (1), (2).

This paper deals with how a model constrained by the binomial coefficient can be placed into a table and how that table can be accessed easily, efficiently, and without using large amounts of memory.  A previously unpublished mathematical relationship is explored that provides the basis for a generic C# class that implements a technique for creating and accessing elements within the table from the underlying indexes.  Software developers using models that employ the binomial coefficient should find the techniques discussed in this article as well as the BinCoeff class quite useful for solving the problem of how to create and access a table constrained by the binomial coefficient.

The initial motivation for this paper came about when analyzing how to efficiently evaluate poker hands for the game Texas Holdem.  Even though this technique can be applied to evaluating poker hands, it can also be applied to any other real world problem that uses the binomial coefficient.

One idea to efficiently evaluate poker hands is to pre-calculate them and store them inside of a table where they could be looked up during a simulation involving billions of hands.  So, for example, given a hand like AsTs8s8c5s5c2s the algorithm would need to calculate the index to the value for this hand inside the table.  The value inside the table would be pre-calculated and would specify an ace high flush since there are 5 spades in this hand.

Texas Holdem is played with a 52-card deck and each hand contains 7 cards, of which the best 5 form the hand.  If an evaluation table is constructed that is composed of 7 dimensions by 52, then the total number of entries in that table can be calculated as 52 ^ 7 = 1,028,071,702,528. Since there are several thousand types of poker hands, each entry in the table would require two bytes, which means that the total amount of memory required would be close to two terabytes, which is infeasible for most modern computers.

Using the binomial coefficient, the total number of unique combinations of 7-card poker hands can be calculated as 52 choose 7 = 52! / (7! (52 – 7)! ) = 133,784,560.  Since each entry would require two bytes, the total amount of memory is thus 267,569,120 bytes, which is not much of a burden for most modern computers.

So, the next question to solve is how to translate a 7-card hand like the example hand above into the correct index into the sorted evaluation table.  One idea would be to use a hash table.  The key for this hash table would be the 7-card hand and the value stored would be the index into the table.  However, for larger tables, the burden on memory would be too great. Another possibility would be to iterate through the elements, but that would take up too much cpu time.

To help solve this problem, it became clear that a simpler problem would be easier to work with.  So, the case 13 choose 5 was selected. 13 is the number of ranks in a deck and 5 is the number of cards of a poker hand. Each combination reflects a no-pair hand, including straights. The binomial coefficient yields 1287 unique combinations for the 13 choose 5 case, which is much easier to manage and understand.

To help analyze how to translate a 5 card hand into the correct index into the sorted table, each unique combination was printed out in order to a file.  A subset of that file is shown below:

T9854 T9853 T9852 T9843 T9842 T9832 T9765 T9764 T9763 T9762
T9754 T9753 T9752 T9743 T9742 T9732 T9654 T9653 T9652 T9643
T9642 T9632 T9543 T9542 T9532 T9432 T8765 T8764 T8763 T8762
T8754 T8753 T8752 T8743 T8742 T8732 T8654 T8653 T8652 T8643
T8642 T8632 T8543 T8542 T8532 T8432 T7654 T7653 T7652 T7643
T7642 T7632 T7543 T7542 T7532 T7432 T6543 T6542 T6532 T6432
T5432 98765 98764 98763 98762 98754 98753 98752 98743 98742
98732 98654 98653 98652 98643 98642 98632 98543 98542 98532
98432 97654 97653 97652 97643 97642 97632 97543 97542 97532
97432 96543 96542 96532 96432 95432 87654 87653 87652 87643
87642 87632 87543 87542 87532 87432 86543 86542 86532 86432
85432 76543 76542 76532 76432 75432 65432

After studying the above table for a while, it became clear that the relative position of a card within its column is constant regardless of the cards before or after it.  For example, the first time that a 5 is found in column 3 it is always 3 away from the base values of 432 in columns 1 through 3.  76532 is 3 away from the zero element 65432.  86532 is also 3 away from 85432.  Since this pattern is repeated throughout the sequence, it implies that an index table for each column can be created to hold the relative position of each card in the evaluation table.  To obtain the column indexes in the sorted evaluation table, the following formula can then be used:

Index = IndexTable1[Card1] + IndexTable2[Card2] + IndexTable3[Card3] + IndexTable4[Card4] + Card5.

Card1 – Card5 are limited to values 0 through 12 and must be in sorted order.

To create the 4 (k – 1) index tables, the following initial inefficient code, which can be found in the TestBinCoeff project, was used:

private void GetIndexesForVer()
{
   // This function gets the indexes for the 13 choose 5 case by looping through and counting the
   // position of each card.  This is an alternative way of calculating the indexes and is used
   // to verify that the BinCoeff class does this correctly.
   //
   int Rank1, Rank2, Rank3, Rank4, Rank5, Value = 0;
   // First, get the indexes for the most significant card.
   for (Rank1 = 4; Rank1 < RanksInDeck; Rank1++)
   {
      NoPair1[Rank1] = Value;
      for (Rank2 = 3; Rank2 < Rank1; Rank2++)
      {
         for (Rank3 = 2; Rank3 < Rank2; Rank3++)
         {
            for (Rank4 = 1; Rank4 < Rank3; Rank4++)
            {
               for (Rank5 = 0; Rank5 < Rank4; Rank5++)
               {
                  Value++;
               }
            }
         }
      }
   }
   // Next, get the index values for the next most significant card.
   Value = 0;
   for (Rank2 = 3; Rank2 < RanksInDeckM1; Rank2++)
   {
      NoPair2[Rank2] = Value;
      for (Rank3 = 2; Rank3 < Rank2; Rank3++)
      {
         for (Rank4 = 1; Rank4 < Rank3; Rank4++)
         {
            for (Rank5 = 0; Rank5 < Rank4; Rank5++)
            {
               Value++;
            }
         }
      }
   }
   // Next, get the relative value for the next most significant card.
   Value = 0;
   for (Rank3 = 2; Rank3 < RanksInDeckM2; Rank3++)
   {
      NoPair3[Rank3] = Value;
      for (Rank4 = 1; Rank4 < Rank3; Rank4++)
      {
         for (Rank5 = 0; Rank5 < Rank4; Rank5++)
         {
            Value++;
         }
      }
   }
   Value = 0;
   // Get the relative value for the next most significant card.
   for (Rank4 = 1; Rank4 < RanksInDeckM3; Rank4++)
   {
      NoPair4[Rank4] = Value;
      for (Rank5 = 0; Rank5 < Rank4; Rank5++)
      {
         Value++;
      }
   }
}

The values calculated within the 4 index tables have some interesting properties.  Below are the values for each of the 4 index tables that were calculated from the above code:

NoPair4 NoPair3 NoPair2 NoPair1
0 0 0 0
1 0 0 0
3 1 0 0
6 4 1 0
10 10 5 1
15 20 15 6
21 35 35 21
28 56 70 56
36 84 126 126
120 210 252
330 462
792


Pascal’s Triangle

Below is Pascal’s triangle.  Many diagrams of Pascal’s triangle start off with 1.  However, the 0 diagonals are implied and this is not the first diagram to show a zero based Pascal triangle (3).

As can be seen, the binomial coefficient values in each of the 4 index tables above match the shaded diagonal values in Pascal’s triangle starting with row 1.  The row 1 diagonal starts with an index of 1, row 2 starts with index 2, row 3 starts with index 3, and row 4 starts with index 4, which matches how the above table is also organized.  This means that the position of an entry in a sorted table of binomial coefficients can be inferred.  For example, take the hand AKQJT, which has K indexes of 12, 11, 10, 9, 8 and is the most significant value in the table.  The index of 12 corresponds to 792 in the table, which is obtained by starting with index 4 (which has a value of zero) in row 4 and counting diagonally downward until 12 is reached.  In a similar vein, the other values can be obtained from the table, which are 330, 120, and 36. To obtain the position of the table entry from the K indexes, simply add them all together:

Position in table = 792 + 330 + 120 + 36 + 8 = 1286.  Since we previously calculated the total number of unique combinations in the table via the binomial coefficient as 1287, the above number is correct since it is the last hand in the table and the table starts off with zero.

One other interesting property is that the sum of the indexes of an index table yields the value of the last index of the next most significant index table.  For example, the sum of the values in the NoPair4 index table is calculated as 1 + 3 + 6 + 10 + 15 + 21 + 28 + 36 = 120, which adds up to the last value in the NoPair3 index table.  This relationship can also be seen from Pascal’s triangle as well.

A More Efficient Way To Calculate The Index Tables

Each number in Pascal’s triangle is the sum of the two numbers directly above it (4).  This means that the code used to create the index tables can use this property to create the index tables much more efficiently.

This relationship makes it almost trivial to quickly initialize the index tables without having to write code with lots of loops like the GetIndexesForVer function.  The following function can be used to initialize the index tables and works with any n and k as long as the total number of combinations is less than or equal to two gigabytes.  It is used by the BinCoeff class for this purpose:

private void GetIndexes()
   {
   // This function creates each index that is used to obtain the index to the binomial
   // coefficient table based upon the underlying K indexes.
   //
   int LoopIndex, Loop, Value, IncValue, StartIndex, EndIndex;
   int[] IndexArray, IndexArrayPrev, IndexArrayLeast;
   //
   Indexes = new List(IndexTabNum);
   // Create the arrays used for each index.
   for (Loop = 0; Loop < IndexTabNum; Loop++)
   {
      IndexArray = new int[NumItems - Loop];
      Indexes.Add(IndexArray);
   }
   // Get the indexes values for the least significant index.
   IndexArrayLeast = Indexes[IndexTabNumM1];
   Value = 1;
   IncValue = 2;
   for (Loop = 2; Loop < IndexArrayLeast.Length; Loop++)
   {
      IndexArrayLeast[Loop] = Value;
      Value += IncValue++;
   }
   // Get the index values for the remaining indexes.
   Value = 1;
   IncValue = 2;
   StartIndex = 3;
   EndIndex = NumItems - IndexTabNumM2;
   for (LoopIndex = IndexTabNumM2; LoopIndex >= 0; LoopIndex--)
   {
      IndexArrayPrev = Indexes[(LoopIndex + 1)];
      IndexArray = Indexes[LoopIndex];
      IndexArray[StartIndex] = 1;
      for (Loop = StartIndex + 1; Loop < EndIndex; Loop++)
      {
         IndexArray[Loop] = IndexArray[Loop - 1] + IndexArrayPrev[Loop - 1];
      }
      StartIndex++;
      EndIndex++;
   }
}

The BinCoeff Class

The BinCoeff class is a generic class written in C#.  It can be used to optionally create a table that can contain up to two gigabytes of elements of any type of object.  The two gigabyte limitation is imposed by the .NET Framework – even for 64 bit applications (as of .NET Framework 4.0).  To get around this limitation, the Mono open source C# compiler (5, 6) could be used, or custom containers written.

The methods of most interest in this class are GetIndex and GetKIndexes.  GetIndex translates the k indexes into the correct position of the object within the table.  So for example, calling GetIndex with the KIndexes array set to value 12, 11, 10, 9, and 8 would return the value 1286 for the 13 choose 5 case.  If the KIndexes array is not in sorted order, then the Sorted flag should be set to false because the K indexes must be in sorted order prior to accessing the table:

public int GetIndex(bool Sorted, int[] KIndexes)
{
   // This function returns the proper index to an entry in the sorted binomial
   // coefficient table from the underlying values in KIndexes. For example, for the
   // 13 chooose 5 example which corresponds to 5 card poker hand ranks, then AKQJT
   // (which is the greatest hand in the table) would be passed as value 12, 11, 10,
   // 9, and 8, and the return value would be 1286, which is the highest
   // element.  Note that if the Sorted flag is false, then the values in KIndexes
   // will be put into sorted order and returned that way.  The sorted flag must be
   // set to false if KIndexes is not in descending order.
   //
   int LoopIndex, n, Index = 0;
   int[] IndexArray;
   if (!Sorted)
   {
      ArraySorter<int>.SortDescending(KIndexes);
   }
   for (LoopIndex = 0; LoopIndex < KIndexes.Length - 1; LoopIndex++)
   {
      IndexArray = Indexes[LoopIndex];
      n = KIndexes[LoopIndex];
      Index += IndexArray[n];
   }
   Index += KIndexes[KIndexes.Length - 1];
   return Index;
}

The GetKIndexes method performs the opposite function of GetIndex.  It takes as input the position of the object within the table and translates that to the underlying sorted K indexes.  For example, if 1286 is specified, then the KIndexes array would return 12, 11, 10, 9, and 8 respectively:

public void GetKIndexes(int Index, int[] KIndexes)
{
   // This function returns the proper K indexes from an index to the sorted binomial
   // coefficient table.  This is the reverse of the GetIndex function.
   // The correct K indexes are returned in descending order in KIndexes.
   int LoopIndex, Loop, End, RemValue = Index;
   int[] IndexArray;
   for (LoopIndex = 0; LoopIndex < KIndexes.Length - 1; LoopIndex++)
   {
      IndexArray = Indexes[LoopIndex];
      End = IndexArray.Length - 1;
      for (Loop = End; Loop &gt;= 0; Loop--)
      {
         if (RemValue >= IndexArray[Loop])
         {
            KIndexes[LoopIndex] = Loop;
            RemValue -= IndexArray[Loop];
            break;
         }
      }
   }
   KIndexes[KIndexes.Length - 1] = RemValue;
}

Two methods are provided to calculate the binomial coefficient.  The unsigned long version of GetBinCoeff should usually be used since the int version overflows and fails for the 52 choose 7 case.  Thanks to Mark Dominus (7) for this implementation. OutputKIndexes is a method that formats and outputs the KIndexes for each item in the table to a file.

Common container methods such as AddItem and GetItem are provided to round out the class.  Both of these methods are overloaded to work with both the K indexes and the index to the item within the table.

A test class called TestBinCoeff is provided that shows how to use the class and proves that the class works as intended.  This class contains several tests for the case 13 choose 5 and one test for the 52 choose 7 case to show that the 13 choose 5 case is not a fluke.

Conclusion

This paper has shown how a model constrained by the binomial coefficient can be created and easily accessed.  Analysis of the binomial coefficient and Pascal’s triangle as well as some new key properties were discussed leading to an efficient implementation of a C# generic class that makes it easy to create and use a table that can hold any type of object.  The class implements translation methods that make it easy to translate between the underlying K indexes and the position of that element within the table of sorted binomial coefficients as well as the reverse function that translates between an index and the corresponding K indexes.


Download and installation:

Click this link or the download button near the bottom of this page to download the file containing the source code.  This file is a zipped file containing 2 visual studio projects.  Extract it into the folder of your choice and leave the folder hierarchy intact.  The code was built and tested with Visual Studio 2008, but it should work with Visual Studio 2010 with little if any modification.  To make it work with previous versions of Visual Studio, you may have to open a new project and  add the individual files to each project.

The following code files are contained in the in the BinomialCoefficient folder:

  • BinCoeff.cs - contains the BinCoeff generic class that provides the functionality to calculate the binomial coefficient, creates a generic table of objects, and provides the functions to translate between the underlying indexes and the index of that object within the table and vice-a-versa.

The following code files are contained in the TestBinCoeff folder:

  • TestBinCoeff.cs - contains the class used to test out the BinCoeff generic class.  This shows how to use the class and proves that it works.
Posted in algorithm, Binomial Coefficient, C#, fitting a model into memory, minimizing memory, number theory, Sequence | Leave a comment

About the author

Software developer for over 25 years. Interested in efficient software methodology, user requirements, design, implementation, and testing. I have quite a bit to say about these topics. Experienced with C#, VC , VB, Sql Server, and office tools. MCSD.

Please feel free to check out my other blogs:

Designing Efficient Software – which contains file I/O benchmarks and a bitmap sort implemented in C#.

I can be reached by email at: RobertGBryanREMOVETHIS@yahoo.com.  In order to avoid automated spam messages, the above email address is spoofed.  Please remove REMOVETHIS from the email address.  For example, if my email address was abcREMOVETHIS@yahoo.com, then the actual email address would be abc@yahoo.com.
Posted in Uncategorized | Leave a comment