In this paper, we outline how we develop our Markov chain approach to studying the progress of runs for a batting order of non-identical players, along the lines of the work in baseball modelling by Bukiet et al., 1997. We describe the issues that arise in applying such methods to cricket and how we have addressed the difficulties particular to cricket. In a Markov process, it is not important to know how a given situation arose, only that one is in that situation. The probability of going from one situation to any other is known, and there is a finite number of situations. In the context of cricket, the dynamics of run production depend mainly on the interaction of the bowler and the batsman, so the game can be modelled as a sequence of one-on-one interactions. A batsman takes a turn, and then we stop and have a new situation. The probability of any occurrence depends only on the current situation (who the facing batsman is, who the bowler is, who the batsman at the other wicket (the non-striker) is, and how many balls are left), and possibly on only a small subset of that. For the most part, there are only 7 states to which a given situation will commonly transition on a single ball. Making the situation slightly more complex is the effect of the batsmen switching places. If an odd number of runs is scored, the batsmen will have switched ends. Similarly, if a multiple of six balls has been bowled, the bowler has finished the over and a new bowler will begin bowling from the other end, resulting in the non-striker becoming the facing batsman. (We note that it is possible, but uncommon, to score 5 runs. Similarly, it is possible to score runs when a batsman is run out. We disregard these events and other rare offensive possibilities, as well as rules concerning fielding restrictions. Our method could handle most of these at the cost of greatly increased computational time.
It appears, at least in the case of modelling baseball, that ignoring many rare situations makes little difference to the results, as there is much cancellation between including positive events (e.g., fives) and negative events (e.g., run outs on run-scoring balls).) Let the multidimensional matrix M have entries M(b,r,w,b1,b2), each giving the probability of a situation, where b, r, w, b1 and b2 represent the number of balls bowled, the runs scored, the wickets down, the next batsman and the batsman at the other wicket, respectively. For each number of balls bowled b, we can compute the probability of being in a given situation by multiplying the (multidimensional) matrix representing the set of probabilities after b-1 balls by the probability of each of the events listed above occurring. For example, the game begins with 0 balls bowled, 0 runs scored, 0 wickets gone and batsman number 1 about to hit, with batsman number 2 at the other wicket. Thus, M(0,0,0,1,2) = 1 and all other entries of M(0,r,w,b1,b2) are zero. After the first ball, M(1,0,1,3,2) = Pd, M(1,0,0,1,2) = P0, M(1,1,0,2,1) = P1, M(1,2,0,1,2) = P2 and so forth. These values are obtained in the general case (b balls) by multiplying each non-zero entry of M(b-1,r,w,b1,b2) by each of the probabilities Pd, P0-P6 and accumulating the result in the appropriate location in M(b,r,w,b1,b2). After the computation has considered 300 balls (with 10 wickets down causing no further balls to be bowled), we end up with the probability of any given number of runs having been scored (the runs distribution). The computation is actually simplified by looping through the number of balls and saving only the situation after b-1 balls to compute the situation after b balls. Also, one need not keep track of wickets fallen, since the batsmen currently in the game provide that information (if batsman number 6 in the order is in the game, but 7-11 are not, then 4 batsmen have been dismissed). Thus, an 11 x 11 x 1800 (batsmen x batsmen x runs) matrix needs to be maintained and updated.
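The ball-by-ball update described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the seven outcomes are read as a dismissal (key 'd') or 0, 1, 2, 3, 4 or 6 runs, the incoming batsman is taken to be the next unused number in the order and to arrive at the striker's end, the function and variable names are hypothetical, and the probabilities in the example are invented purely for demonstration.

```python
from collections import defaultdict

def runs_distribution(probs, n_balls=300, n_batsmen=11):
    """Return {runs: probability} after n_balls or ten wickets.

    probs[i] maps each outcome for batsman i (1-indexed) to its probability:
    'd' for a dismissal, or an integer number of runs (assumed 0, 1, 2, 3, 4, 6).
    """
    # Situation = (facing batsman, non-striker, runs). Wickets are implied by
    # the highest batsman number currently in the game, as noted in the text.
    current = {(1, 2, 0): 1.0}
    finished = defaultdict(float)  # innings that ended early (all out)

    for ball in range(1, n_balls + 1):
        nxt = defaultdict(float)
        for (striker, other, runs), p in current.items():
            for outcome, q in probs[striker].items():
                if outcome == 'd':
                    new_batsman = max(striker, other) + 1
                    if new_batsman > n_batsmen:  # tenth wicket has fallen
                        finished[runs] += p * q
                        continue
                    s, o, r = new_batsman, other, runs
                else:
                    r = runs + outcome
                    # An odd number of runs leaves the batsmen at swapped ends.
                    s, o = (other, striker) if outcome % 2 else (striker, other)
                if ball % 6 == 0:  # end of over: the non-striker faces next
                    s, o = o, s
                nxt[(s, o, r)] += p * q  # accumulate, several paths can meet
        current = nxt  # only the situation after the previous ball is kept

    dist = defaultdict(float, finished)
    for (_, _, r), p in current.items():
        dist[r] += p
    return dict(dist)

def expected_runs(dist):
    """Mean of a runs distribution: sum of runs times probability."""
    return sum(r * p for r, p in dist.items())

# Invented, illustrative probabilities (identical for all 11 batsmen).
example = {'d': 0.03, 0: 0.45, 1: 0.30, 2: 0.08, 3: 0.02, 4: 0.09, 6: 0.03}
probs = {i: dict(example) for i in range(1, 12)}
two_overs = runs_distribution(probs, n_balls=12)
mean_two_overs = expected_runs(two_overs)
```

The sketch stores situations in a dictionary rather than a dense 11 x 11 x 1800 array, which keeps it short; pure Python is slow for the full 300 balls, so the example stops after two overs.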
We note that the method automatically takes into account that the batsmen early in the line-up, if they are the best batters, will face more balls than the later batsmen. Summing the product of each possible number of runs and its probability of being the result in the game gives the expected number of runs for the batting order considered. This strategy is the same in philosophy as that of Bukiet et al., 1997; only the details are modified. Mathematically, the strategy involves only addition, multiplication and some logic. The method makes sense only because cricket has the following properties: Some aspects of One-Day Cricket make it more complicated and lengthy to model than baseball. The large computational time involved in using this straightforward approach (referred to later on as “the straightforward method”) makes it unattractive as a planning tool for coaches; however, some streamlining improves the performance. Instead of considering each batsman and each ball individually, we consider (in what we call our “streamlined method”) each pair of batsmen (11 x 10 pairs) and each over individually (50 overs). As a further simplification, we assume that a maximum of 1 wicket can fall in any given over; the result is about a 1 second improvement in processing time per line-up studied. To include bowling and defensive performance, one could scale the offensive characteristics in an appropriate way. For example, if a given bowler has a performance level, say, 2% worse than the average bowler by some measure, then opposing players would have their offensive performance (P1-P6) increased by 2% and P0 decreased accordingly. Ideally, one would like to have enough data on how well each bowler performs against each batsman (and vice versa), but that is not likely to be the case. Another method of scaling batsman performance might take into account his “handedness”, that of the bowler, and/or the type of bowler (e.g., a spin bowler) bowling.
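The bowler adjustment in the 2% example can be illustrated with a short Python sketch. It is one possible reading rather than a prescribed procedure: the scoring probabilities are multiplied by a factor, P0 absorbs the difference, and Pd is assumed to stay unchanged (the text does not say how Pd should be treated). The function name and the probabilities are hypothetical and chosen only for illustration.

```python
def scale_for_bowler(batting_probs, factor):
    """Scale the scoring probabilities (P1-P6) by `factor`, adjusting P0.

    batting_probs maps 'd' (dismissal) and run counts to probabilities.
    Assumption: Pd is left unchanged and P0 absorbs the whole adjustment.
    """
    scaled = dict(batting_probs)
    scoring = [k for k in batting_probs if k != 'd' and k != 0]
    for k in scoring:
        scaled[k] = batting_probs[k] * factor
    # P0 is whatever probability remains after Pd and the scaled P1-P6.
    scaled[0] = 1.0 - scaled['d'] - sum(scaled[k] for k in scoring)
    if scaled[0] < 0:
        raise ValueError("scaling factor too large: P0 would be negative")
    return scaled

# Invented, illustrative probabilities for one batsman.
batsman = {'d': 0.03, 0: 0.45, 1: 0.30, 2: 0.08, 3: 0.02, 4: 0.09, 6: 0.03}
# A bowler rated 2% worse than average: scoring probabilities go up by 2%.
vs_weak_bowler = scale_for_bowler(batsman, 1.02)
```

A factor below 1 would model a better-than-average bowler in the same way.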
One of the authors has looked into various methods of considering pitcher ability in baseball and found that considering such complications did not lead to improved results.