March 31, 2015
All Spin Is Not Alike
Alan M. Nathan
Ever since the early days of PITCHf/x, we have had unprecedented information about the movement of pitches. We now have a precise quantitative measure of how much and in what direction a pitch moves, i.e., deviates from a straight-line path. The movement is the result of the combined forces of gravity pulling the ball downward and the so-called Magnus force on a spinning baseball. It has become conventional to remove the effect of gravity, which is easily calculable, so that the resulting movement (pfx_x and pfx_z in PITCHf/x lingo) is due only to the Magnus force. I will use that convention in this article. It seems reasonable that there ought to be some simple relationship between the movement and the spin rate. For example, if a pitch is spinning at a higher rate, the expectation is that there will be more movement. But is that expectation correct? In fact, it is not, because, as the title of this article suggests, all spin is not alike. And that is the issue I want to discuss here.
So why is it that all spin is not alike? The reason has to do with the vector nature of the spin: It has a magnitude and a direction. The magnitude is pretty simple, since it is just the number of revolutions per minute, or rpm. Let’s talk about the direction. The easiest way to determine the direction of the spin is to use a right-hand rule: Wrap the fingers of your right hand around the ball so that they point in the direction that the ball is turning. Your thumb will then point in the direction of the spin axis.
Here are some examples. A straight overhand fastball has pure backspin and the spin axis points to the pitcher’s right. An overhand “12-6” curveball has pure topspin and the spin axis points to the pitcher’s left. A ball thrown with pure sidespin has its spin axis pointing up or down. In all these examples, the spin axis is perpendicular to the direction of motion. On the other hand, a gyroball is a pitch thrown with the spin axis perfectly aligned along the direction of motion, much like a spiral pass in football. Indeed, it is often called “bullet spin”, since that is how a bullet will spin when shot from a rifle. All of these pitches are special cases, since in general the spin axis could be pointing in any direction whatsoever.
Let’s talk about the gyroball some more. Back in 2007 when Daisuke Matsuzaka first burst onto the MLB scene, there were a lot of articles written about the gyroball. Much of what was written was pure nonsense, as writers were claiming all kinds of weird movement for that pitch. However, the key feature of the gyroball that makes it unique is that, since the spin is perfectly aligned with the direction of motion, there is no movement to the pitch. The gyroball is an extreme example of the following general principle:
Only the component of spin that is perpendicular to the direction of motion of the ball contributes to the movement.
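A minimal sketch of this principle, assuming the standard result (not spelled out in this article) that the Magnus deflection points along the cross product of the spin axis with the direction of motion. The coordinate convention and the names `cross`, `magnus_direction` and `SPIN_AXES` are choices made here for illustration only.

```python
# Coordinate convention assumed for this sketch, as seen by the pitcher:
#   x = toward the pitcher's right, y = toward home plate, z = up.

def cross(a, b):
    """Return the cross product a x b of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

VELOCITY_DIR = (0.0, 1.0, 0.0)  # ball travelling straight toward home plate

# Spin-axis directions (right-hand rule) for the special cases in the text.
SPIN_AXES = {
    "backspin (overhand fastball)": (1.0, 0.0, 0.0),  # axis to pitcher's right
    "topspin (12-6 curveball)": (-1.0, 0.0, 0.0),     # axis to pitcher's left
    "sidespin, axis up": (0.0, 0.0, 1.0),
    "gyroball (bullet spin)": (0.0, 1.0, 0.0),        # axis along the motion
}

def magnus_direction(spin_axis, velocity_dir=VELOCITY_DIR):
    """Direction of the Magnus deflection, taken as spin axis x velocity."""
    return cross(spin_axis, velocity_dir)

for name, axis in SPIN_AXES.items():
    print(f"{name:30s} -> {magnus_direction(axis)}")
```

In this convention the backspin case deflects upward, the topspin case downward, the sidespin case sideways, and the gyroball case gives the zero vector, because a spin axis parallel to the velocity contributes nothing to the cross product.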
In general, the spin vector can be written as the vector sum of two components. One is parallel to the direction of motion, and I will refer to it as the “gyrospin” component $\omega_G$. The other is perpendicular to the direction of motion, and I will refer to it as the “transverse spin” component $\omega_T$. Remember that only $\omega_T$ contributes to the movement. In general, if we can measure the movement, we can determine both the magnitude and direction of $\omega_T$, which is exactly what PITCHf/x does. But PITCHf/x has no way to determine anything about $\omega_G$. We may refer to the transverse spin as the “useful spin”, since it is directly related to the amount of movement, in the sense that increasing the transverse spin will increase the movement. However, increasing the gyrospin will not increase the movement. As the title says, all spin is not alike. Note that the total spin rate $\omega$ is the Pythagorean sum of $\omega_T$ and $\omega_G$:
$$\omega = \sqrt{\omega_T^2 + \omega_G^2}$$
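The decomposition into gyrospin and transverse spin can be sketched as a vector projection. This is an illustrative sketch only: the function names and the example pitch (a 3-4-5 triangle of spin components, with a made-up velocity) are assumptions, not data from the analysis.

```python
import math

def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def magnitude(v):
    """Euclidean length of a 3-vector."""
    return math.sqrt(dot(v, v))

def decompose_spin(spin, velocity):
    """Split spin into (gyro, transverse) vectors relative to the velocity."""
    speed = magnitude(velocity)
    unit_v = tuple(c / speed for c in velocity)
    gyro_signed = dot(spin, unit_v)                 # projection onto motion
    gyro_vec = tuple(gyro_signed * c for c in unit_v)
    transverse_vec = tuple(s - g for s, g in zip(spin, gyro_vec))
    return gyro_vec, transverse_vec

# Hypothetical pitch: spin vector in rpm, velocity in ft/s (values made up).
spin = (1600.0, 1200.0, 0.0)
velocity = (0.0, 130.0, 0.0)

gyro_vec, transverse_vec = decompose_spin(spin, velocity)
gyro = magnitude(gyro_vec)
transverse = magnitude(transverse_vec)
total = magnitude(spin)

print(f"gyrospin   = {gyro:.0f} rpm")
print(f"transverse = {transverse:.0f} rpm")
print(f"total      = {total:.0f} rpm")
# The two components add in quadrature to give the total spin.
print(math.isclose(total, math.hypot(gyro, transverse)))
```

For this made-up pitch, 1200 rpm of the 2000 rpm total is gyrospin and only the remaining 1600 rpm is useful spin.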
Now let’s fast-forward to more recent years, when many MLB teams have the Trackman radar system installed in their ballparks. Recall that Trackman, much like PITCHf/x, is a device that is used to measure the trajectory of a pitched baseball. But unlike PITCHf/x, Trackman can actually measure the magnitude of the total spin $\omega$. Moreover, since the trajectory is measured, the movement can be determined, so that the magnitude and direction of the transverse spin $\omega_T$ can also be determined, just as is done by PITCHf/x. Given that information, namely the magnitude of the total spin and the magnitude and direction of the transverse spin, it is possible to determine both the gyrospin $\omega_G$ and the direction of the total spin.
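As a rough sketch of that last step, the gyrospin magnitude follows from the Pythagorean relation between total and transverse spin, and the tilt of the spin axis away from the direction of motion follows from their ratio. The function names are hypothetical, and returning `None` when the measured transverse spin exceeds the total spin is a choice made for this sketch, not a procedure taken from the article.

```python
import math

def gyrospin_magnitude(total_rpm, transverse_rpm):
    """Gyrospin magnitude in rpm, or None if the inputs are inconsistent."""
    diff = total_rpm ** 2 - transverse_rpm ** 2
    if diff < 0:
        # Measured transverse spin exceeds total spin (possible with noise).
        return None
    return math.sqrt(diff)

def axis_angle_deg(total_rpm, transverse_rpm):
    """Angle between spin axis and direction of motion, in degrees.

    90 degrees means pure transverse spin; 0 degrees means pure gyrospin.
    """
    if total_rpm <= 0 or transverse_rpm > total_rpm:
        return None
    return math.degrees(math.asin(transverse_rpm / total_rpm))

# Made-up example values in rpm.
print(gyrospin_magnitude(2000.0, 1600.0))
print(axis_angle_deg(2000.0, 1600.0))
print(gyrospin_magnitude(1500.0, 1600.0))
```

Note that the two magnitudes fix only the size of the gyrospin component in this sketch, not its sign.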
Several years ago, I wrote an unpublished article outlining the procedure for doing that, and I refer you to it if you want to see some of the technical details. One important result from that article is the relationship between movement and transverse spin (see Eq. 8 and Fig. 1 in the linked article), since that relationship is essential for determining the transverse spin $\omega_T$ from the movement data. The relationship was established from a variety of controlled experiments done in the laboratory.
Since I happen to have some Trackman data, I decided to analyze two games of data from the 2012 season to see what could be learned about the direction of the spin axis. In particular, I was interested to learn which types of pitches had primarily transverse spin and which had a major gyrospin component. A total of 312 pitches were in the database, all of which had information on the full trajectory as well as the total spin $\omega$. The data were cross-correlated with PITCHf/x data, from which a pitch type could be assigned using tags borrowed from Harry Pavlidis. The techniques described in my unpublished article were used to determine the movement and the corresponding value of transverse spin $\omega_T$.
The results of the analysis are presented in Figures 1-3. First refer to Figure 1, which is a scatterplot of transverse spin $\omega_T$ vs. total spin $\omega$, with different symbols/colors representing different pitch types as follows.
Blue triangles are 4S fastballs (FA); black diamonds are sinkers (SI); azure diamonds are changeups (CH); black squares are splitters (FS); green diamonds are sliders (SL); red stars are curveballs (CU); and black asterisks are cutters (FC). The line represents $\omega_T = \omega$, i.e., transverse spin equal to total spin. The basic idea is that pitches whose total spin is all transverse (i.e., no gyrospin component) should be along the line, whereas pitches with a gyrospin component would be expected to fall below the line.
The first thing that jumps out at you in the plot is that there is considerable scatter in the data, so much so that the prospects for learning much from this plot do not look good. In fact, there are even points for which the transverse spin exceeds the total spin, a physical impossibility. The scatter is an inevitable consequence of random measurement error. I’ll return to a discussion of random error later in the article. But for now, take a closer look at the plot and you will see that despite the scatter, the pitches fall into two distinct categories. In one group are those pitches that follow the general trend of the line, albeit with random scatter about the line. These pitches are of type FA, SI, CH, and FS, but for ease of discussion I will hereafter refer to them as “fastballs/changeups”. In the other group are those pitches that lie systematically below the line. These pitches are of type SL, CU, and FC, and I’ll refer to them as “breaking pitches”. So despite the scatter, we nevertheless arrive at the following tentative conclusion: the pitches in the fastball/changeup group have little or no gyrospin, whereas the breaking pitches all have various degrees of significant gyrospin. Another way to say this is that essentially all the fastball/changeup spin is useful spin, meaning that it results in movement, and that a significant fraction of the breaking ball spin is not useful spin.
The conclusion about the pitches in the fastball/changeup group makes good sense, at least to me. Based on what is known about the grip and release of the pitchers who throw them, it is hard to see how a pitcher could put a significant amount of gyrospin on the ball. In that sense, we haven’t really learned anything that we didn’t know already, namely that the spin on these pitches is all useful spin. But it is good that the actual data support our suspicions. Moreover, it gives us confidence that the analysis and interpretation of the data are basically correct, particularly our techniques for determining the movement and for relating the movement to the transverse spin. I note that PITCHf/x uses a different algorithm for both movement and transverse spin.
The conclusion about breaking pitches is more interesting. While it is well known that sliders have a significant gyrospin component, I was surprised to learn that the same is true for curveballs. Just to state the point explicitly, not all the spin on curveballs is useful spin, a point that will be discussed further shortly. Since sliders and cutters are often confused, I was not particularly surprised about the gyrospin component of the latter.
Now refer to Figure 2, which is a histogram of spin differences for fastballs/changeups (blue) and breaking pitches (red).
The dotted curve is a fit to the fastballs/changeups data using a Gaussian distribution with approximately zero mean (actually the mean is 80 rpm) and a significant standard deviation of 500 rpm. Given what we know about these pitches, namely that little or no gyrospin is expected, the near-zero mean is another indication that the points in Figure 1 do indeed follow the line expected if the useful spin is identical to the total spin. Moreover it provides additional confirmation that the technique used to derive the movement and relate it to the useful spin is substantially correct.
The symmetry of the distribution about zero mean is strongly suggestive of random measurement error, with a standard deviation of 500 rpm. Given the simplicity of the plot, there are only two possible sources of measurement error: the total spin $\omega$ and/or the transverse spin $\omega_T$, so let’s look at each of them. In a measurement that my Washington State colleagues and I did last year, we compared the Trackman measurement of total spin to that of high-speed video. We found that over the range 1200-3200 rpm, the two measurements agreed to within a standard deviation of 35 rpm, which is far less than what is observed in our distribution. Therefore, the random error observed must be due entirely to the measurement of the transverse spin $\omega_T$, which in turn comes from the measurement of the movement.
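The claim that the total-spin error is negligible can be checked with a short calculation. This sketch assumes that the two errors are independent and Gaussian, so that they add in quadrature, and that the 35 rpm agreement with high-speed video can stand in for the Trackman total-spin error; both are assumptions of the sketch, and the function name is hypothetical.

```python
import math

def remaining_sigma(observed_sigma, known_sigma):
    """Error left over after removing a known independent contribution.

    Assumes independent errors that add in quadrature:
        observed^2 = known^2 + remaining^2
    """
    return math.sqrt(observed_sigma ** 2 - known_sigma ** 2)

SIGMA_DIFFERENCE = 500.0  # rpm, width of the fitted distribution (from the text)
SIGMA_TOTAL_SPIN = 35.0   # rpm, Trackman vs. high-speed video (from the text)

sigma_transverse = remaining_sigma(SIGMA_DIFFERENCE, SIGMA_TOTAL_SPIN)
share_of_variance = (SIGMA_TOTAL_SPIN / SIGMA_DIFFERENCE) ** 2

print(f"implied transverse-spin error: {sigma_transverse:.1f} rpm")
print(f"total-spin share of the variance: {share_of_variance:.2%}")
```

Under these assumptions the total-spin measurement accounts for about half a percent of the observed variance, so the implied error on the transverse spin is within a couple of rpm of the full 500 rpm.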
Now let’s ask whether a random measurement error of 500 rpm in the transverse spin $\omega_T$ makes sense based on what we know about the Trackman system. Given the relationship between transverse spin and movement, it can be shown that a random error of 500 rpm in $\omega_T$ would be expected if the random error in the movement is about 2 inches. By that I mean that if the distribution of differences between the measured and the “true” movement were plotted, it would have zero mean and a standard deviation of 2 inches. I have never actually investigated the question of random measurement error for the Trackman system. However, I investigated it for the PITCHf/x system in another unpublished article from 7 years ago and found that it has a random measurement error on movement of 2.0-2.5 inches. Based on comparisons I have made between PITCHf/x and Trackman data, the random error for the two systems is comparable. I conclude that a random error on the Trackman movement of 2 inches is reasonable. Therefore the scatter we see in Figure 1 (or the width of the distribution in Figure 2) is completely plausible based on what we know about the Trackman system.
When I did the random error analysis for PITCHf/x, there was a cautionary tale that went along with it. Namely, given the measurement error, one must be very careful about over-interpreting information about any particular pitch. For example, there is a non-negligible probability that the movement measured on a single pitch could differ by several inches from the actual movement, simply based on random measurement error. In the present context, fully half the fastballs/changeups in any given sample would apparently violate a fundamental law of physics by having their measured transverse spin larger than their actual spin. Of course no law is violated: It is just random error rearing its ugly head. But there also is a more optimistic tale that I hope I am demonstrating in this article: Despite the random errors, it is still possible to glean interesting information from the data. As Nate Silver might say, it is all in learning how to distinguish the signal from the noise.
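To put rough numbers on the cautionary tale, here is a minimal sketch that assumes the movement error is Gaussian with zero mean and a standard deviation of 2 inches, as estimated above. The function name and the chosen thresholds are illustrative only.

```python
import math

def prob_abs_error_exceeds(threshold, sigma):
    """P(|error| > threshold) for a zero-mean Gaussian error of width sigma."""
    return math.erfc(threshold / (sigma * math.sqrt(2.0)))

SIGMA_MOVEMENT_IN = 2.0  # inches, random error on movement (from the text)

for threshold_in in (2.0, 3.0, 4.0):
    p = prob_abs_error_exceeds(threshold_in, SIGMA_MOVEMENT_IN)
    print(f"P(|error| > {threshold_in:.0f} in) = {p:.3f}")

# For a pitch with no gyrospin, and ignoring the small total-spin error,
# a symmetric zero-mean error pushes the measured transverse spin above
# the total spin half of the time.
prob_transverse_exceeds_total = 0.5 * math.erfc(0.0)
print(prob_transverse_exceeds_total)
```

Under this assumption, roughly one pitch in seven or eight is mismeasured by more than 3 inches, and roughly one in twenty by more than 4 inches.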
I want to discuss briefly my surprise regarding the significant gyrospin component for curveballs. In doing so, I want to compare two different pitchers, labeled A and B, in the table below:
I have calculated the mean values of total spin and useful spin for curveballs thrown by these two pitchers. We see that B throws his curveball with 5% more spin than A; however A is more efficient in creating movement since his useful spin is 18% larger than B’s. So despite B having greater spin on his curveballs, A gets more movement. All spin is not alike! Would B be a more effective pitcher if he learned how to harness the spin on his curveball to make it more useful? I don’t know the answer. However, perhaps this type of analysis will allow one to at least address the question.
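The two percentages quoted above can be combined into a single comparison of how much of each pitcher's spin is useful. This is a sketch using only those two ratios, not the underlying spin values; the function name is hypothetical.

```python
def useful_fraction_ratio(total_b_over_a, useful_a_over_b):
    """Ratio of A's useful-spin fraction to B's.

    fraction = useful spin / total spin, so
    fraction_A / fraction_B = (useful_A / useful_B) * (total_B / total_A)
    """
    return useful_a_over_b * total_b_over_a

# B has 5% more total spin than A; A has 18% more useful spin than B.
ratio = useful_fraction_ratio(total_b_over_a=1.05, useful_a_over_b=1.18)
print(f"A's useful fraction is {ratio:.3f} times B's")
```

In other words, the fraction of A's curveball spin that is useful is about 24% larger than the corresponding fraction for B.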
Finally refer to Figure 3 in which the movement is plotted versus the quantity for the two groups of pitches.
The quantity is the relevant quantity for comparing movement to spin. To remove the scatter, the data have been grouped in buckets of , then averaged. The curve, which is the relationship from previous laboratory experiments that relates movement to , is an excellent representation of the fastball/changeup data. I note in passing that the curve is not linear: Doubling the transverse spin does not double the movement. If the curve were reduced by about 3%, the agreement would be even better, and such a reduction would be entirely consistent with the laboratory data upon which it is based. In fact, there are far more Trackman fastball/changeup data than there are laboratory data, so there is a real research project out there for anyone with access to those data.
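The grouping-and-averaging step can be sketched as follows. The bucket width, the function name and the sample points are all made up for illustration; the actual bucket size used for Figure 3 is not given here.

```python
from collections import defaultdict

def bin_average(xs, ys, bin_width):
    """Group (x, y) points into fixed-width x buckets and average each one.

    Returns a list of (mean_x, mean_y, count) tuples ordered by bucket.
    """
    buckets = defaultdict(list)
    for x, y in zip(xs, ys):
        buckets[int(x // bin_width)].append((x, y))

    averaged = []
    for index in sorted(buckets):
        points = buckets[index]
        mean_x = sum(p[0] for p in points) / len(points)
        mean_y = sum(p[1] for p in points) / len(points)
        averaged.append((mean_x, mean_y, len(points)))
    return averaged

# Made-up points: x is the horizontal-axis quantity, y is the movement.
xs = [1.0, 2.0, 11.0, 13.0]
ys = [2.0, 4.0, 10.0, 14.0]
print(bin_average(xs, ys, bin_width=10.0))
```

Averaging within buckets reduces the random scatter roughly in proportion to the square root of the number of pitches per bucket, which is why the trend is much easier to see than in a raw scatterplot.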
To summarize, here are the things I consider useful takeaways from this article:
The analysis presented here is an extension of the framework set up in my earlier unpublished article. However, the application of that framework to actual data, and in particular the crucial role of random measurement error, was the suggestion of a friend, who prefers to remain anonymous. I thank her/him. I also thank Prof. Dave Kagan for some helpful comments on the article.