Motion capture of piano performance

Research project “Movement Strategies and Sensory Feedback in Piano Playing”, Principal investigator: Dr Werner Goebl
Funded by an Erwin-Schrödinger Fellowship (J 2526, 2006–2008) of the Austrian Science Fund (FWF); carried out at the Sequence Production Lab at McGill University in Montreal, QC, Canada.

Capturing piano performance at the finger level

Using a Vicon 460 passive motion capture system with a frame rate of 250 Hz, we tracked 25 reflective 4-mm markers glued to the pianist’s hand and arm joints and 15 markers glued to the piano keys. This setup is technically challenging because the abundance of markers quickly pushes the triangulation software to its limits, but it delivers unique insights into the detailed kinematics of the fingers, hand, and lower arm during piano performance. The V460 consisted of 6 infrared cameras, each filming the hand from a different angle. From these 6 two-dimensional views, one three-dimensional representation is triangulated.

Figure 1. The pianists’ right hands were equipped with 25 reflective markers covering the joints of the 5 fingers, the back of the hand and the lower arm (left). The motion capture setup involved 6 infrared cameras arranged around the digital piano (middle). These cameras monitored the 15 markers placed on the keyboard (middle) as well as the hand markers (right).

Professional pianists from the Montreal area played several simple melodies at a wide range of tempi, timed by a metronome in a synchronization-continuation paradigm. Tempi ranged from a medium tempo of 500 ms inter-onset interval (IOI, i.e., 2 tones per second or 120 beats per minute with the quarter note as the beat) up to an extreme tempo of 62 ms IOI (16 tones per second or 240 beats per minute with 4 tones per beat). In addition to the movements, the performance data (i.e., MIDI data) were recorded and analyzed.
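The relation between IOI, tone rate, and metronome tempo quoted above can be checked with a few lines of Python. This is only an illustrative sketch; the function name `ioi_to_rates` is ours and not part of the project's analysis code. Note that 16 tones per second corresponds to exactly 62.5 ms, which the text rounds to 62 ms.

```python
def ioi_to_rates(ioi_ms, tones_per_beat):
    """Convert an inter-onset interval (ms) to (tones per second, beats per minute)."""
    tones_per_second = 1000.0 / ioi_ms
    beats_per_minute = 60.0 * tones_per_second / tones_per_beat
    return tones_per_second, beats_per_minute


# Medium tempo: one tone per quarter-note beat.
print(ioi_to_rates(500.0, 1))  # (2.0, 120.0)

# Fastest tempo: four tones per beat, using the unrounded IOI of 62.5 ms.
print(ioi_to_rates(62.5, 4))   # (16.0, 240.0)
```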

Functional Data Analysis of motion data

In the kinematic analysis of the motion data, we are interested not only in the position trajectories, but also in their derivatives, velocity and acceleration. This required considerably expanding existing analysis techniques based on Functional Data Analysis (FDA), a statistical framework for time-series data by Ramsay and Silverman (2005). To demonstrate this analysis technique, the raw motion-capture data of an exemplary struck keystroke are shown in Figure 2. The height of the index fingertip above the keyboard is shown over time in Figure 2a, top panel, together with the key position. Fingertip velocity and acceleration, estimated with a simple sample-wise numerical derivative, are plotted as well.
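A sample-wise numerical derivative of this kind can be sketched as follows. The exact differencing scheme used for Figure 2a is not specified here, so central differences are an assumption, and the height values are made up for illustration (a fingertip moving with a constant acceleration of 2 m/s²).

```python
FRAME_RATE_HZ = 250.0  # capture rate of the motion capture system


def finite_difference(samples, frame_rate_hz=FRAME_RATE_HZ):
    """Central differences for interior samples, one-sided differences at both ends."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    dt = 1.0 / frame_rate_hz
    last = len(samples) - 1
    derivative = [0.0] * len(samples)
    derivative[0] = (samples[1] - samples[0]) / dt
    derivative[last] = (samples[last] - samples[last - 1]) / dt
    for i in range(1, last):
        derivative[i] = (samples[i + 1] - samples[i - 1]) / (2.0 * dt)
    return derivative


# Made-up fingertip heights in metres: z(t) = t^2, i.e. acceleration 2 m/s^2.
z = [(i / FRAME_RATE_HZ) ** 2 for i in range(6)]
velocity = finite_difference(z)             # m/s
acceleration = finite_difference(velocity)  # m/s^2
print(round(velocity[2], 6), round(acceleration[2], 6))  # 0.016 2.0
```

Because each differentiation step amplifies measurement noise, the second derivative of real marker data is much noisier than the position itself.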

Figure 2. The fingertip kinematics in the height dimension above the keyboard (z) of an index finger striking a key from a distance above the key surface (struck touch). The plots show z position, velocity, and acceleration separately. The vertical lines denote kinematic landmarks: maximum height of the fingertip (mxH), finger-key impact (FK), and key-bottom impact by fingertip and key (KB). The four panels show the same keystroke excerpt, in part combined with different smoothing techniques: (a) raw data only with numerically estimated derivatives velocity and acceleration (black lines), (b) raw data (grey lines) smoothed with a Gaussian window (smoothing window corresponding to 5 data points either side, red lines), (c) raw data (grey lines) with FDA smoothing with evenly spaced knots (blue for fingertip, red for key position), and (d) raw data (grey lines) with FDA smoothing as in (c), but with additional knots at the instants of the kinematic landmarks.

Each keystroke contains certain typical kinematic landmarks, indicated by vertical lines in Figure 2:

  1. the maximum height of the fingertip (mxH) before the finger starts its descent towards the key surface,
  2. the finger-key impact (FK), when the fingertip hits the key surface and is (almost) stopped by it, and
  3. the key-bottom contact (KB), when the motion of the finger and the key is finally stopped by the key frame.

The two impact landmarks FK and KB are clearly represented by very steep and sudden peaks in the acceleration trajectories. FK landmarks in particular can have very high peaks ranging up to 150 m/s², or more than 15 times the gravitational acceleration. In the context of the sensory experience of pianists during a keystroke, we assume that these acceleration peaks represent prominent tactile cues to the pianist. Therefore, we analyze the temporal location and the height of these acceleration peaks as tactile markers. Moreover, the FK landmark disappears when a key is played with a pressed touch, because the finger is initially resting on the key (for a comparison of a pressed and a struck touch, see Figure 3). Therefore, the FK landmark serves as an indicator of the type of touch used for a particular tone (Goebl and Palmer 2008).
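The following sketch illustrates how an acceleration peak can be located, expressed in multiples of gravitational acceleration, and used as a touch indicator. The acceleration traces are made up, and the threshold of 20 m/s² is a hypothetical value chosen for illustration; it is not the criterion used in Goebl and Palmer (2008).

```python
STANDARD_GRAVITY = 9.80665  # m/s^2


def largest_peak(acceleration, start, stop):
    """Index and value of the largest-magnitude sample in acceleration[start:stop]."""
    index = max(range(start, stop), key=lambda i: abs(acceleration[i]))
    return index, acceleration[index]


def to_g(acceleration_ms2):
    """Express an acceleration magnitude in multiples of standard gravity."""
    return abs(acceleration_ms2) / STANDARD_GRAVITY


def is_struck(acceleration, start, stop, threshold_ms2=20.0):
    """Hypothetical rule: a touch counts as struck if the search window
    contains an FK-like acceleration peak above the threshold."""
    _, peak = largest_peak(acceleration, start, stop)
    return abs(peak) >= threshold_ms2


# Made-up acceleration traces in m/s^2 around the expected finger-key contact.
struck = [0.0, -3.0, -5.0, 150.0, 10.0, 2.0]
pressed = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0]

index, peak = largest_peak(struck, 0, len(struck))
print(index, round(to_g(peak), 1))                         # 3 15.3
print(is_struck(struck, 0, 6), is_struck(pressed, 0, 6))   # True False
```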

To remove the considerable noise in the raw motion-capture data, which becomes particularly apparent in the derivatives, and to interpolate occasional missing data points, the data are subjected to smoothing or filtering procedures. We show simple Gaussian filtering in Figure 2b (with a symmetric smoothing window of 5 data points either side). Figure 2c shows the same keystroke with trajectories smoothed using FDA with order-6 B-splines fit to the second derivative by applying the roughness penalty on the fourth derivative (λ = 10⁻¹⁸, see Ramsay and Silverman 2005). With both smoothing methods (Figure 2b and c), the acceleration peaks are reduced considerably. However, as we are interested in these landmarks, a data smoothing procedure should preserve those acceleration peaks. To this end, we added knots at the time instants of the kinematic landmarks, which allows the fitted curve to bend sharply in the acceleration trajectory at those points (Figure 2d). This method fits the data and retains the impact characteristics more closely than the fit without the additional knots, as also reflected in the smaller generalized cross-validation (GCV) values printed in the figure titles (Figure 2c and d). This method is also suited to other motion trajectories featuring impacts with rigid bodies, such as the motion of a mallet hitting a membrane or a key of a marimbaphone.
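To illustrate why plain filtering flattens the impact peaks, the sketch below applies a Gaussian window of 5 data points either side to a made-up trace containing a single sharp spike. The window width comes from the description of Figure 2b; the standard deviation (`sigma`, in samples) is not given there and is an assumption, as is the handling of missing samples by renormalising the remaining weights.

```python
import math


def gaussian_smooth(samples, half_width=5, sigma=2.0):
    """Smooth with a symmetric Gaussian window of `half_width` points either side.

    Missing samples (None) and positions beyond the ends are skipped and the
    remaining weights renormalised, so isolated gaps are filled from neighbours.
    """
    offsets = range(-half_width, half_width + 1)
    kernel = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    smoothed = []
    for i in range(len(samples)):
        weighted_sum = 0.0
        weight_total = 0.0
        for k, w in zip(offsets, kernel):
            j = i + k
            if 0 <= j < len(samples) and samples[j] is not None:
                weighted_sum += w * samples[j]
                weight_total += w
        smoothed.append(weighted_sum / weight_total if weight_total > 0 else None)
    return smoothed


# Made-up acceleration trace (m/s^2) with one impact-like spike.
trace = [0.0] * 21
trace[10] = 150.0
smoothed = gaussian_smooth(trace)
print(max(trace), max(smoothed))  # the smoothed peak is far lower than 150

# A single missing sample is interpolated from its neighbours.
filled = gaussian_smooth([1.0, None, 1.0, 1.0])
```

The spike is spread over the whole window, so its height drops to a fraction of the original value. This is the attenuation that the landmark-specific knots are meant to avoid.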

Struck and Pressed Touch

Piano touch, the way pianists approach the key surfaces of the piano with their fingers, has been a vibrant topic of debate and discussion among pianists, piano educators, piano students, and piano lovers throughout the three centuries of the piano's existence. As early as the 1920s, touch was defined by a variable other than the speed of the hammer arriving at the strings:

“One of the most interesting questions, from the musician’s standpoint perhaps the most interesting of all, is the effect of finger-stroke upon tone-quality. However fanciful our conception of the artistic phases of piano touch may be, whatever poetic qualities we assign to the piano tone, the fact remains that percussion and intensity are the only determinants.” (Ortmann 1929, p. 243, my emphasis)

In analogy to Ortmann’s antagonism of non-percussive and percussive keystrokes (Ortmann 1929), we use the antagonism of pressed and struck touch (Goebl and Bresin 2003, Goebl et al. 2005) to denote the same touch distinction as made by Ortmann and by Anders Askenfelt (e.g., Askenfelt and Jansson 1990). This struck–pressed antagonism is defined by the initial speed of the finger when hitting the key surface. In a percussive, struck touch, the finger has a certain speed when arriving at the key surface (thus coming from a distance above), while in a non-percussive, pressed touch, the speed is zero: the finger initially rests on the key surface and then presses the key down. In Figure 3, we show an index finger performing a struck touch and a pressed touch.


Figure 3. Index finger performing a struck touch (left) and a pressed touch (right). Blue lines indicate position (top panels), velocity (middle panels), and acceleration (bottom panels) in the height dimension above the keyboard plane. The red line shows the key position. Three kinematic landmarks are identified: maximum height of the finger before a keystroke (mxH), the finger-key contact (FK), when the finger impacts on the rigid surface of the key, and the key-bottom contact (KB), when the finger and the key are stopped by the key frame.

Figure 4. The two touch examples from Figure 3 in a 3-dimensional animated skeletal reconstruction in slow motion. The video files do not contain sound.


Movement properties change with performance tempo

In Goebl and Palmer (2008), we analyzed the acceleration peaks of the kinematic landmarks of over 18,000 performed keystrokes across 4 tempo conditions from slow to medium fast and related them to measures of timing accuracy and precision. Taking the presence of FK landmarks as an identification cue for touch, we could show that at slow tempi most pianists used pressed and struck touches about equally often, while at medium fast tempi struck touches were used for almost all keystrokes.

Moreover, we could show for most pianists a relationship between the magnitude of the finger-key surface impact and the timing accuracy: the larger the finger-key impact (and thus the stronger the tactile sensation of the pianist), the more accurately the subsequent time interval was executed (Goebl and Palmer 2008, p. 477). This finding seems intuitive from a musical perspective: hitting the keys harder with the fingers when trying to play a passage with more accurate timing.

Higher performance tempi also entail more kinetic energy in the fingers, which catapults them further away from the keyboard, resulting in larger movement amplitudes and, at the same time, more struck keystrokes at faster tempi than at slower tempi (see videos in Figure 5). In this light, the advice of piano educators to play fast passages close to the keyboard to conserve energy is not feasible in practice.

Figure 5. The same musical sequence played by the same pianist at the slowest tempo condition (2 tones per second or 500 ms inter-onset interval, left side) and at a medium-fast tempo (7 tones per second, or 143 ms IOI, right side). The original recordings were slowed down by different factors to match each other visually. The video file does not contain sound.

A detailed analysis of the timing of the kinematic landmarks (mxH, FK, KB, see above) relative to the performed events showed that even at medium fast tempi the mxH for the subsequent tone occurs just as the previous tone is played, so that the execution of the individual events starts to “overlap.” At the fastest tempi, even the FK of the next tone occurs simultaneously with the previous tone’s onset (see Goebl and Palmer 2009).


Piano technique and movement efficiency

Skilled piano performance requires considerable movement control to accomplish the high levels of timing and force precision common among professional musicians, who acquire piano technique over decades of practice. Finger movement efficiency in particular is an important factor when pianists perform at very fast tempi. We document the finger movement kinematics of highly skilled pianists as they performed a five-finger melody at very fast tempi. A three-dimensional motion-capture system tracked the movements of finger joints, the hand, and the forearm of twelve pianists who performed on a digital piano at successively faster tempi (7–16 tones/s) until they decided to stop. Joint angle trajectories computed for all adjacent finger phalanges, the hand, and the forearm (wrist angle) indicated that the metacarpophalangeal joint contributed most to the vertical fingertip motion while the proximal and distal interphalangeal joints moved slightly opposite to the movement goal (finger extension). An efficiency measure of the combined finger joint angles corresponded to the temporal accuracy and precision of the pianists’ performances: Pianists with more efficient keystroke movements showed higher precision in timing and force measures. Keystroke efficiency and individual joint contributions remained stable across tempo conditions. Individual differences among pianists supported the view that keystroke efficiency is required for successful fast performance. (Goebl and Palmer 2013)
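A joint angle between two adjacent segments can be derived from three marker positions, as in the minimal sketch below. The marker coordinates are made up, and the exact angle definition and marker placement of Goebl and Palmer (2013) are not reproduced here; the function name `joint_angle_deg` is ours.

```python
import math


def joint_angle_deg(proximal, joint, distal):
    """Angle in degrees at `joint` between the segments joint->proximal and joint->distal.

    Each argument is an (x, y, z) marker position; 180 degrees means a fully
    extended joint in this convention.
    """
    u = [p - j for p, j in zip(proximal, joint)]
    v = [d - j for d, j in zip(distal, joint)]
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / norms))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(cosine))


# Made-up marker positions (arbitrary units) for one finger joint.
extended = joint_angle_deg((0, 0, 0), (1, 0, 0), (2, 0, 0))
flexed = joint_angle_deg((0, 0, 0), (1, 0, 0), (1, 0, -1))
print(round(extended, 6), round(flexed, 6))  # 180.0 90.0
```

Applying such a function frame by frame to the smoothed marker data yields one angle trajectory per joint.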

Figure 6. Pianists S17 and S24 (see Goebl and Palmer 2013). The video shows hand reconstructions of the smoothed marker data of Pianist S17 on the left and Pianist S24 on the right. The white spheres in the foreground (and one next to the pianists’ little finger) are markers attached to the piano keys. The original recordings were slowed down by a factor of 5.4 and 5.8, respectively (as S17 performed at a slightly slower rate than S24), to match each other visually. The video file does not contain sound.



Askenfelt, A., and Jansson, E. V. (1990).
From touch to string vibrations. I. Timing in the grand piano action.
Journal of the Acoustical Society of America, 88(1), 52–63, doi: 10.1121/1.399933.

Goebl, W., and Bresin, R. (2003).
Measurement and reproduction accuracy of computer-controlled grand pianos.
Journal of the Acoustical Society of America, 114(4), 2273–2283, doi: 10.1121/1.1605387.

Goebl, W., Bresin, R., and Galembo, A. (2005).
Touch and temporal behavior of grand piano actions.
Journal of the Acoustical Society of America, 118(2), 1154–1165, doi: 10.1121/1.1944648.

Goebl, W., and Palmer, C. (2008).
Tactile feedback and timing accuracy in piano performance.
Experimental Brain Research, 186(3), 471–479, doi: 10.1007/s00221-007-1252-1.

Goebl, W., and Palmer, C. (2009).
Finger motion in piano performance: Touch and tempo.
International Symposium on Performance Science, ISPS 2009 (15–18 December 2009), Auckland, New Zealand. European Association of Conservatoires (AEC), Utrecht, NL, pp. 65–70.

Goebl, W., and Palmer, C. (2013).
Temporal control and hand movement efficiency in skilled music performance.
PLOS ONE, 8(1), e50901, doi: 10.1371/journal.pone.0050901. Open Access.

Ortmann, O. (1929). The Physiological Mechanics of Piano Technique. London, New York: Kegan Paul, Trench, Trubner, E. P. Dutton.