Movement represented by film score and sound design

27th TONMEISTERTAGUNG – VDT INTERNATIONAL CONVENTION, November, 2012

Movement represented by film score and sound design
(Repräsentation von Bewegung in Filmmusik und Sounddesign)
Michael Haverkamp, Ford Werke GmbH Köln, [email protected]

Abstract
Music and sound support the narrative, dramatic and emotional content of films. Since the early days of cinema, filmmakers have sought to align images and sounds by synchronizing visual and auditory elements, e.g. according to Eisenstein’s concept of vertical montage. Music and sound, however, refer to visual perception not only when both senses are stimulated in parallel. Even isolated auditory events represent sensations of other modalities, notably in the context of movement and emotion. This contribution first describes fundamentals of cross-sensory connections within the perceptual system and the representation of movement. The perceptual system derives motional information not only at the moment of auditory stimulation: sounds also point to processes of movement which existed in the past or will exist in the near future. Movement can also be perceived during the silence between specific auditory events. Implicit, explicit and anticipatory features of sound therefore deserve particular attention. These features account for the well-known impact of sound and music on the appearance of films: the perceptual representation of movement allows a variable temporal allocation of images and sounds, which goes far beyond the strict vertical montage of auditory and visual content.

1. Introduction
Since the early days of cinema, accompanying music and sound have been seen as essential elements of the show. The film director Sergei Michailowitsch Eisenstein (1898–1948) was one of the first to develop approaches to the optimum alignment of visual and auditory content. In analogy to the meaningful assembly of images in visual montage, he defined vertical montage as the simultaneous alignment of music score, image pattern and movement. The term vertical refers to the parallel adjustment of auditory and visual elements, in contrast to visual montage, which is mainly done in a time-sequential manner and thus follows the horizontal direction of an imagined time axis. Figure 1 shows an excerpt of that concept, applied to the historical epic Alexandr Newski, which premiered in 1938 [1]. The original music score was composed by Sergei Prokofjew. Eisenstein’s systematic configuration includes image patterns, music score and movement patterns, and all those elements of the film were intended to be presented synchronously [2]. Patterns of movement provide a link between visual elements, e.g. the gestures of the actors, and auditory movement suggested by music and sounds. In this regard, movement indicated by auditory content plays an important role in connecting the single streams of the final audio-visual Gesamtkunstwerk.

Movement suggested by sound is not a new topic; it has been approached from various directions in the past. Eduard Sievers analyzed the movement content of speech melody with an experimental method named Schallanalyse [3], which originally did not denote a physical analysis of sound but an optimization of text declamation by means of accompanying body movements. In a comparable manner, Gustav Becking analyzed musical rhythm and tried to find patterns of the individual style of a composer [4].

Figure 1: Vertical Montage. Alignment of music score, image pattern and movement. Sergei Michailowitsch Eisenstein: Александр Невский – Alexandr Newski (1938) [1]

A more in-depth investigation was presented by Alexander Truslit, who outlined a fundamental theory of genuine movement (Urbewegung) implied by music, and of its presumed biological roots [5]. The philosopher Hermann Schmitz interprets the motion suggested by sound (Bewegungssuggestion) as an essential part of everyday perception [6]. Detailed studies of correlations between music and motion have been published by various scientists, e.g. Zohar Eitan and Roni Granot [7] or HaiHong Zhou [8]. The approach presented in this paper shows that auditory signals mainly communicate movement by references to other sensory channels, here named modalities.

2. Monaural versus Binaural Representation
Movement of objects is indicated by sound via two main classes of parameters: binaural and monaural qualities (qualia). Binaural hearing provides information about the location and movement of sound sources via differences between the signals arriving at the two ears (see, for example, [9]). Specific interaction effects between perceived auditory and visual movement occur [10], which also influence the estimated further course of a movement [11, 12]. Whereas in that case auditory movement manifests itself as a change of the perceived location of a sound source, movement is also suggested by features of monaural signals. This raises the question of which sound features are specifically correlated with movement without binaural signal processing, thus forming the monaural qualia. Some examples of monaural qualia of movement are:
- temporal change of loudness (loudness ≈ nearness)
- temporal change of timbre (relation of direct sound to reverberation ≈ nearness)
- change of pitch (Doppler shift ≈ object passing the location of max. nearness)
- rhythm/modulation (periodicity ≈ rotational movement ≈ velocity)

Monaural qualia indicate specific parameters of an object's movement, such as the nearness of the perceived object to the observer, velocity, acceleration/deceleration, and whether the object is approaching or departing. The suggestion of movement is therefore based on the perceptual estimation of cross-sensory correlations between the auditory field on the one hand and the visual and tactile senses on the other. Figure 2 shows the major possibilities of the perceptual system to represent a process (e.g. of movement) via auditory stimuli. Three main strategies of cross-sensory connection extract visual and tactile aspects of the process from the auditory signal [13]. In the following, the most important perceptual strategies, cross-modal analogy and iconic connection, will be discussed.

Figure 2: Auditory representation of processes via multi-sensory strategies of the perceptual system
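As an illustration of the monaural qualia listed in this section, the following minimal sketch computes two of them, Doppler-shifted pitch and distance-dependent level, for a source passing the listener on a straight line. It is not taken from the paper: the function name `passby_cues`, the default values, the speed of sound of 343 m/s and the free-field 1/r level law are assumptions made for illustration, and the propagation delay between emission and arrival is ignored.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 °C


def passby_cues(t, speed=20.0, closest=5.0, f_source=440.0):
    """Return (distance_m, observed_hz, level_db) for a source on a straight path.

    The source passes the listener at time t = 0 at a distance of `closest`
    metres. All parameter names and defaults are hypothetical.
    """
    x = speed * t                            # position along the path in metres
    distance = math.hypot(x, closest)        # current source-listener distance
    radial_velocity = speed * x / distance   # positive while the source recedes
    # Doppler formula for a moving source and a listener at rest
    observed = f_source * SPEED_OF_SOUND / (SPEED_OF_SOUND + radial_velocity)
    # Level relative to the point of maximum nearness (idealized 1/r law)
    level = -20.0 * math.log10(distance / closest)
    return distance, observed, level


if __name__ == "__main__":
    for t in (-2.0, -1.0, 0.0, 1.0, 2.0):
        d, f, lvl = passby_cues(t)
        print(f"t={t:+.1f} s  distance={d:5.1f} m  pitch={f:6.1f} Hz  level={lvl:6.1f} dB")
```

In this idealization the observed pitch lies above the source frequency while the object approaches, equals it at the point of maximum nearness, and drops below it afterwards, while the level peaks at the same point.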

3. Cross-Modal Analogy
The term cross-modal analogy refers to the capability of the perceptual system to detect correlations of specific attributes and to analyze them for identification of physical objects and atmospheric features [14]. The analysis of analogies can include:
• generic attributes (loudness, pitch, timbre, sharpness, brightness, ...)
• movement (straight, rotational, irregular, expanding, ...)
• body perception (tense, relaxed, floating, ...)
• emotion (calm, troubled, angry, ...)

A common analogy is the subjective assignment of musical pitch and/or timbre to spatial features. The correlation of pitch and visual height is an especially important basis of classical (European) systems of musical notation. This effect is widely used for the acoustic simulation of vertical movement. Pitch is also suitable for representing movement and/or mobility within sounds, as it usually correlates with the largeness and sluggishness of animals. A specific periodicity of a sound signal can indicate a rotational movement, e.g. of a manual coffee grinder or a metal cylinder rolling over the floor. This effect can be used to indicate rotation within artificial signals, including changes of rotational speed (Figure 3). The periodicity, however, must not exceed a frequency of 20 Hz. Above that value, the impression of visual rotation is replaced by the sensation of a constant signal with a specific roughness.
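One possible way to synthesize such a periodic signal is amplitude modulation of a carrier tone with a slowly increasing modulation rate. The sketch below is an assumption for illustration only and does not reproduce the signal behind Figure 3: the names `rotation_signal` and `expected_impression`, the carrier frequency and the sweep range are hypothetical, and only the 20 Hz boundary is taken from the text.

```python
import math

ROTATION_LIMIT_HZ = 20.0  # boundary between rotation and roughness, as stated in the text


def rotation_signal(duration=2.0, sample_rate=8000, carrier_hz=300.0,
                    start_rate_hz=2.0, end_rate_hz=12.0):
    """Return samples of a tone whose loudness pulsates at an increasing rate."""
    samples = []
    phase = 0.0  # accumulated phase of the modulation, in radians
    n = int(duration * sample_rate)
    for i in range(n):
        t = i / sample_rate
        # Modulation rate rises linearly, suggesting increasing rotational speed
        rate = start_rate_hz + (end_rate_hz - start_rate_hz) * t / duration
        phase += 2.0 * math.pi * rate / sample_rate
        envelope = 0.5 * (1.0 - math.cos(phase))  # smooth pulse between 0 and 1
        samples.append(envelope * math.sin(2.0 * math.pi * carrier_hz * t))
    return samples


def expected_impression(modulation_rate_hz):
    """Classify a modulation rate according to the 20 Hz limit given in the text."""
    return "rotation" if modulation_rate_hz <= ROTATION_LIMIT_HZ else "roughness"


if __name__ == "__main__":
    signal = rotation_signal()
    print(len(signal), "samples generated")
    print(expected_impression(8.0), expected_impression(70.0))
```

Accumulating the modulation phase sample by sample keeps the envelope continuous while the rate changes, which avoids clicks that a direct `cos(2*pi*rate*t)` with a time-varying rate could introduce.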

Figure 3: Acoustic simulation of rotational movement with increasing rotational speed

Cross-modal analogies are suitable for ensuring the plausibility of abstract and/or innovative sounds, because their interpretation does not require former experience with those signals. Innovative film sound design should thus be based on this perceptual strategy. As a historical but nonetheless instructive example, the purely artificial sounds designed for “Forbidden Planet” provide various samples of plausible auditory movement. The film was directed by Fred McLeod Wilcox and premiered in 1956. The electronic sounds were designed by Bebe and Louis Barron. Those electronic tonalities were used as an atmospheric background in place of music as well as for the description of functional processes.

Whether auditory and visual streams are processed by the perceptual system in parallel or shortly after each other, an interaction between both can be verified: if a continuously shrinking visual object is presented, this can be interpreted as a movement away from the observer. If in this case a physically constant tone is presented after the visual object, an illusory sensation of increasing loudness can occur as a specific after-effect [12].

Results of experiments by Zohar Eitan and Roni Granot [7] demonstrate that there is more than one way of building analogies between auditory and visual features. Nevertheless, the number of connections which test persons indicate as preferred solutions is not unlimited but quite restricted. The test persons were instructed to steer a figure across a monitor screen while different acoustic signals with systematic variations were presented. The figure was to be moved in analogy to the change of the musical parameters. As an example, some of the preferred connections chosen are, in the order of nominations:
• crescendo -> figure approaching or speeding up
• decrescendo -> figure moving away or descending
• increase of pitch -> figure ascending or moving away
• decrease of pitch -> figure descending or moving to the left

In this case, both symmetric and asymmetric classifications occur. In the symmetric case, for example, the conception of an upward movement (ascending) corresponds to rising pitch, whereas, symmetrically, a downward movement (descending) corresponds to falling pitch. Other test persons, however, attribute a different type of movement to the same musical parameter, namely a movement away from the subject or, asymmetrically, a movement to the left. The results indicate that there are obvious connections which tend to be preferred by test persons.

The results correlate with those of other perceptual experiments in different cultural areas. According to an investigation conducted in China by Zhou HaiHong, increasing pitch is correlated with increasing visual brightness, greater height, the perception of objects as small and lightweight, and increased emotional involvement ([8], cited by Liu [2]).

The philosopher and founder of New Phenomenology, Hermann Schmitz, explains the importance of suggestions of motion, as caused by auditory signals, for perception and the initiation of emotions: by means of suggestions of motion, courses of Gestalt (Gestaltverläufe) emerge as dynamic expressions of perceptual forms of sound. The close connection of motion, gestures and such courses of Gestalt enables the dynamic content of stimuli to be perceived somatically [6]. This statement makes clear the importance of perceived motion within music and also within sounds for cinema applications. The close connection to body reactions provides an intuitive understanding of narrative and emotional content. Suggestions of movement are thus also essential for the design of warning signals.

Even in the perception of music, analogies between movement in the spectral and temporal dimensions and the spatial movement of objects play a major role. The human capability to establish this form of connection manifests itself in dancing and in conducting. In the construction of motifs in opera music, analogies to theatrical gestures are similarly of great importance. Gustav Becking sought to transcribe musical rhythm by means of accompanying movements (Mitbewegungen) of the body, similar to the movements of an orchestral conductor, and to deduce from them stylistic characteristics regarded as typical of the composer [4]. His intention, however, was not to correlate bodily movements and music with one another. Instead, he aimed at describing the movement which is perceived as a characteristic of the auditory signal. With similar intentions, the linguist Eduard Sievers recommended the use of bodily movement within the Schallanalyse he developed, in order to optimize text declamation and to assist interpretation [3]. Schallanalyse includes an investigation of the optimum intonation for a given text with regard to conceptions of movement, on the basis of articulation and manner of emphasis. For this purpose, Sievers developed a method for the experimental investigation of characteristic forms of movement which could be depicted as curved lines. Alexander Truslit used analogies of movement for musical analysis as well as for supporting interpretation, and from these efforts he deduced basic principles in which he emphasized movement as a decisive basis, and indeed even as the origin, of music [15]. He, too, used line figures to visualize the sensed movement, depicting rising and falling, acceleration and deceleration. Such a form of movement according to Truslit is included in Figure 3.

4. Synchronicity
Audio-visual synchronicity (also described as synchrony) is an important parameter for the degree to which visual and auditory streams are perceived as correlated. Small delays in the range of a few tens of milliseconds, however, are still perceived as synchronous. The delay thresholds tolerated by the perceptual system were examined by Armin Kohlrausch and Steven van de Par [16]. For this purpose, simple coupled audio and video stimuli with differing delays were judged as synchronous or asynchronous. Along the time axis of delays, the results exhibit a characteristic asymmetry: the threshold for perceived asynchrony amounts to -30 ms with a delayed video signal. With a delayed audio signal, on the other hand, it amounts to +120 ms (or +90 ms, according to the results of other experiments). A possible explanation for the asymmetry is that, in the perception of natural processes, the sound of a source reaches the observer later than the light, due to the different velocities of wave propagation. A source distance of 10 m already causes a delay of about 30 ms. The threshold values for the perception of synchrony depend on the kind of signals used. Vocal signals result in considerably greater values, which means that greater temporal differences between auditory and visual stimuli are tolerated. The results of the described experiment on synchrony can also be summarized as follows [16]:
• There is a range of approximately 150 ms in which delays between auditory and visual stimuli cannot be perceived.
• A characteristic asymmetry exists, which shifts the range of synchronous perception away from the zero position.
• The asymmetry can also be regarded as a symmetric phenomenon around a point of subjective similarity, which involves an audio delay of +40 to +50 ms with respect to the video signal.
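The arithmetic behind these figures can be checked with a short sketch. It uses the thresholds of -30 ms and +120 ms reported above; the function names, the sign convention (positive values mean that the audio lags the video) and the speed of sound of 343 m/s are assumptions for illustration, and the sharp cut-off is a simplification of what is in reality a gradual perceptual transition.

```python
SPEED_OF_SOUND = 343.0       # m/s, assumed value for air at about 20 °C
AUDIO_LEAD_LIMIT_MS = -30.0  # threshold with a delayed video signal, from the text
AUDIO_LAG_LIMIT_MS = 120.0   # threshold with a delayed audio signal, from the text


def acoustic_delay_ms(distance_m):
    """Delay of sound relative to light for a source at the given distance.

    The travel time of light is neglected.
    """
    return 1000.0 * distance_m / SPEED_OF_SOUND


def perceived_as_synchronous(audio_delay_ms):
    """True if the audio delay lies inside the tolerance window (hard cut-off)."""
    return AUDIO_LEAD_LIMIT_MS <= audio_delay_ms <= AUDIO_LAG_LIMIT_MS


def window_width_ms():
    """Width of the range in which delays are not perceived."""
    return AUDIO_LAG_LIMIT_MS - AUDIO_LEAD_LIMIT_MS


def window_center_ms():
    """Midpoint of the tolerance window."""
    return 0.5 * (AUDIO_LEAD_LIMIT_MS + AUDIO_LAG_LIMIT_MS)


if __name__ == "__main__":
    print(f"delay at 10 m: {acoustic_delay_ms(10.0):.1f} ms")
    print(f"window width:  {window_width_ms():.0f} ms")
    print(f"window center: {window_center_ms():.0f} ms")
```

With these thresholds the window is 150 ms wide and centered at +45 ms, which agrees with the summarized range of approximately 150 ms and the audio delay of +40 to +50 ms.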

5. Iconic Connection
The iconic strategy for establishing cross-modal connections is based on associations suitable for identifying known physical objects or processes. A single stimulus can refer to a multi-modal perceptual object stored in memory. Thus a specific sound stimulus can evoke an image of the sound source with all of its cross-sensory attributes, provided that this variety of features has been experienced before. In principle, even a purely visual montage can introduce new meaning by including references between single images or scenes. Those references can be based on visual analogies or iconic features. An example was presented by the German pioneer of experimental cinema Hans Richter. The correlation between the forms of loaves of bread and a bald head is intuitively plausible, without any need for object identification. By contrast, the iconic connection between a worker using a shovel and a smoking chimney is evident only with knowledge (i.e. previous experience) of the related environment, where both can be seen [17].

In the literature, those perceptual connections go by different names: iconic reference (primarily used within semiotics), Dingwahrnehmung [18], causal listening [19], semantics of first order [20], concrete association [21]. A sound can thus indicate an active sound source, an object touched, a process, or an ambience. Iconic elements of sounds are well known from former perceptual experience. Those elements can easily be recognized, and they support the plausibility and interpretation of sound. In specific cases, this plausibility can be much stronger than the comprehension of reality. Many science fiction films use sound to support the dynamic expression of space ships moving in space. Although this is physically incorrect, because sound waves cannot propagate through a vacuum, the accompanying sound is widely accepted by spectators all over the world. Indeed, experience within the virtual worlds of cinema and electronic games modifies perception and reinforces itself.

To date, it is not well understood whether the interpretation of cross-modal analogies (see Section 3) is a genuine, possibly inherited feature of the perceptual system, or whether single features are extracted from everyday listening. In the latter case, monaural qualities of movement would appear to be features which have been extracted from iconic connections. For example, the relation of pitch to visual height is observed while pouring water into a cup. The resonance frequencies of the air remaining in the upper part of the cup rise as the fluid level rises. Pitch is thus related to height, whereas pitch change refers to the upward movement of the fluid surface. The situation, however, is ambiguous: if the cup is placed upon a resonant mechanical structure, the increasing mass of the fluid can also cause a decreasing tone.
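The first case of the cup example, rising pitch with a rising fluid level, can be made concrete with a strongly idealized model. The sketch assumes that the air above the fluid behaves like a tube closed at the bottom and open at the top, whose lowest resonance is the quarter-wavelength frequency f = c / (4 L); end corrections and the real shape of a cup are ignored. The model, the function name `air_column_resonance_hz` and the numerical values are illustrative assumptions, not part of the original paper.

```python
def air_column_resonance_hz(cup_height_m, fluid_level_m, speed_of_sound=343.0):
    """Lowest resonance of the air column above the fluid (quarter-wave model).

    Idealization: a tube closed at the fluid surface and open at the rim,
    without end correction.
    """
    air_length = cup_height_m - fluid_level_m
    if air_length <= 0.0:
        raise ValueError("fluid level must stay below the rim of the cup")
    return speed_of_sound / (4.0 * air_length)


if __name__ == "__main__":
    cup_height = 0.10  # hypothetical cup of 10 cm height
    for level_cm in range(0, 10, 2):
        f = air_column_resonance_hz(cup_height, level_cm / 100.0)
        print(f"fluid level {level_cm} cm -> resonance about {f:.0f} Hz")
```

In this model, halving the remaining air column doubles the resonance frequency, so the pitch rises faster and faster as the cup fills up.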

6. Explicit/implicit Representation of Movement
A closer look at complex processes of movement reveals that movement is expressed in the temporal changes of auditory parameters such as pitch and timbre. Those explicit representations within the sound arise in perception by means of cross-sensory references via either iconic connections or cross-modal analogies. This applies to numerous processes, such as a rolling ball, writing with pen or chalk on a board, vehicle driving sounds, and many others. Many processes of movement, however, include periods of silence, which are also part of the whole activity. Those implicit representations occur in combination with sounds. They refer to soundless parts of complex dynamic processes. As an example, a bouncing ball excites sound only when it touches a surface. The movement in between two pulses must be estimated from the periodicity and sound quality of the pulses. This implicit reference to the total process is a quality which arises from a combination of sounds. It is only applicable if former perceptual experience enables iconic connections, i.e. knowledge of the complete process. Similar dynamic structures of sound can be found while listening to steps, a tennis match, the clicking of a ballpoint pen, a vehicle handbrake being applied, etc. Figure 4 shows the complicated combination of explicit and implicit elements during the launch of a firework rocket.

Figure 4: Implicit and explicit representation of movement
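The bouncing ball shows how much motional information the silent intervals carry. Under the assumption of free flight without air resistance, a silent interval T between two impact sounds fixes the apex height of the flight as h = g T² / 8. The following sketch applies this relation to a list of impact times; the function name `apex_heights` and the example times are hypothetical and serve only to illustrate the physical relation, not the perceptual process itself.

```python
G = 9.81  # m/s^2, gravitational acceleration


def apex_heights(impact_times_s):
    """Estimate the apex height of each silent flight phase between impacts.

    For a flight time T between two impacts, the ball rises for T / 2 and
    falls for T / 2, which gives h = G * T**2 / 8 (air resistance neglected).
    """
    intervals = [t1 - t0 for t0, t1 in zip(impact_times_s, impact_times_s[1:])]
    return [G * interval ** 2 / 8.0 for interval in intervals]


if __name__ == "__main__":
    impacts = [0.0, 1.0, 1.8, 2.44]  # hypothetical times of the impact sounds in seconds
    for height in apex_heights(impacts):
        print(f"estimated apex height: {height:.2f} m")
```

Shrinking intervals translate into decreasing heights, so the rhythm of the pulses alone outlines the decaying trajectory that is never heard directly.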

7. Conclusion
In the field of cinema, auditory representations of movement are suitable for preparing a scene, specifically for raising awareness of changes, e.g. upcoming danger and other incidents. This is well known and internalised by the perceptual system. Typical situations of daily life with strong suggestions of movement caused by sound are perceptions of a rolling stone, vehicles approaching on the road, the sound of boiling water or a rising storm. Movement is also expressed by music. The experience of auditory movement can easily be translated to the perceived flow of emotions. Therefore sound and music are also suitable for foreshadowing emotions. Iconic representations of movement within sound are essential for perceiving rotational movement, the soughing of wind, treatment of materials, writing, touching of materials, and other processes. The Doppler shift indicates sounding objects passing by the observer.

Additional information is provided implicitly, because interruptions of the sound represent specific ways of movement. This is obvious, for example, while listening to a bouncing ball, a series of steps, or a ping-pong match. The unreal but nonetheless plausible sound of space ships, as used in numerous science fiction films, demonstrates that iconic relations result from perceptual experience of both real and virtual worlds. If daily-life experience with specific sounds cannot be presumed for the spectator, cross-modal analogies are suitable for representing movement by connecting simple features of sound and images. As a typical example, increasing pitch may be used to indicate the acceleration of objects. Known as Mickey-Mousing, this method has been widely applied in animated films.

With a view to the method of vertical montage as proposed by Eisenstein, the following conclusions may be drawn: Synchronous montage of sound and images is an appropriate starting point for generating a plausible audio-visual correlation. An appropriate auditory representation of movement, however, enables the sound to speak for itself. Thus, sound and music can more or less be interpreted independently of the visual content. This enables the anticipation of scenes, action, emotions, etc. by auditory content. Even interruptions of a stream of sound can implicitly express movement. The capability of sound to transfer references to the visual sense, as well as to other senses, enables a much wider spectrum of possibilities than originally stated by Eisenstein.

8. References
[1] Eisenstein, S. M.: “Вертикальный монтаж (Vertical Montage)”. Искусство кино (Cinema Art), 1940, 9:16-25 & 12:27-3 and 1941, 1:29-38
[2] Liu, G.: “Die Macht der Filmmusik: Zum Verhältnis von musikalischem Ausdruck und Emotionsvermittlung im Film.” Tectum Verlag, Marburg, 2010
[3] Sievers, E.: “Ziele und Wege der Schallanalyse.” Carl Winter Verlag, Heidelberg, 1924
[4] Becking, G.: “Der musikalische Rhythmus als Erkenntnisquelle.” Benno Filser Verlag, Augsburg, 1928
[5] Repp, Bruno H.: “Music as Motion: a synopsis of Alexander Truslit's (1938) ‘Gestaltung und Bewegung in der Musik’.” Psychology of Music, 21, 1993, 48-72
[6] Schmitz, H.; Müllan, R.O.; Slaby, J.: “Emotions outside the box—the new phenomenology of feeling and corporeality.” Phenom Cogn Sci (2011) 10:241–259, DOI 10.1007/s11097-011-9195-1
[7] Eitan, Z.; Granot, R. E.: “Musical parameters and images of motion.” In: Parncutt, R. et al. (eds.): Proc. of the conference on interdisciplinary musicology, Graz, 2004
[8] Zhou, H.H.: “Musik und die durch sie dargestellte Welt. Die psychologische und ästhetische Forschung zur Beziehung zwischen dem Klang der Musik und der Darstellung von Gegenständen.” Central Conservatory of Music in China, Beijing, 2004
[9] Blauert, J.: “Spatial Hearing - The Psychophysics of Human Sound Localisation”. The MIT Press, Cambridge, Mass, 1996, ISBN 0-262 02413-6
[10] Fischer, R.L.; Getzmann, S.: “Wahrnehmung akustischer Bewegungen und visueller Positionen – ein Fall audiovisueller Integration?” Proc. DAGA’08. DEGA, Berlin, 2008, 363-64
[11] Getzmann, S.; Lewald, J.: “Repräsentationales Momentum bei der visuellen und auditiven Bewegungswahrnehmung.” Proc. DAGA’08. DEGA, Berlin, 2008, 365-66
[12] Kitagawa, N.; Ichihara, S.: “Hearing Visual Motion in Depth.” Nature 416 (2002) 172-74
[13] Haverkamp, M.: “Advanced description of noise perception by analysis of cross-sensory interactions within soundscapes.” Noise Control Eng. J. 58 (5), Sept-Oct 2010
[14] Haverkamp, M.: “Synesthetic Design. Handbook for a Multi-Sensory Approach.” Birkhäuser Verlag, Basel, 2012
[15] Truslit, A.: “Gestaltung und Bewegung in der Musik.” Christian Friedrich Vieweg, Berlin, 1938
[16] Kohlrausch, A.; Van de Par, S.: “Audio-Visual Interaction in the Context of Multi-Media Applications.” In: Blauert, J. (ed.): “Communication Acoustics”, Springer Verlag, Berlin, 2005, 109-38
[17] Richter, H.: “Filmgegner von heute - Filmfreunde von morgen.” Hermann Reckendorf Verlag, Berlin, 1929
[18] Voss, W.: “Das Farbenhören bei Erblindeten.” In: Anschütz, G. (ed.): “Farbe-Ton-Forschungen.” Vol.2. Psychologisch-Ästhetische Forschungsgesellschaft, Hamburg, 1936, 5-207
[19] Chion, M.: “Audio-Vision. Sound on Screen.” Columbia University Press, New York, 1994
[20] Flückiger, B.: “Sound Design. Die virtuelle Klangwelt des Films.” Schüren, Marburg, 2001
[21] Haverkamp, M.: “Audio-Visual Coupling and Perception of Sound-Scapes.” Proc. CFA/DAGA'04. DEGA, Oldenburg, 2004, 365-6