Title:
SOUND REPRODUCTION SYSTEM AND METHOD
Document Type and Number:
WIPO Patent Application WO/2023/274499
Kind Code:
A1
Abstract:
A sound reproduction system (110) and method are disclosed. The system (110) comprises a plurality of loudspeakers configured to generate a sound field in a sound reproduction zone and a processing circuitry configured to operate the loudspeakers based on an input audio content in a plurality of different sound reproduction modes, comprising a first reflection-based or a second direct-sound based sound reproduction mode. The processing circuitry is further configured to obtain information about the acoustic environment of the system (110) and to determine, based on the information, a performance measure of the first reflection-based sound reproduction mode. Moreover, the processing circuitry is configured to, based on the performance measure, select or recommend the first reflection-based sound reproduction mode or the second direct-sound based sound reproduction mode for operating the loudspeakers. Thus, the system (110) may select or recommend the sound reproduction mode most suited for a given acoustic environment.

Inventors:
KARAPETYAN ALEKSANDR (DE)
GROSCHE PETER (DE)
Application Number:
PCT/EP2021/067762
Publication Date:
January 05, 2023
Filing Date:
June 29, 2021
Assignee:
HUAWEI TECH CO LTD (CN)
KARAPETYAN ALEKSANDR (DE)
International Classes:
H04S7/00; H04R3/12
Foreign References:
US20080165979A1 2008-07-10
EP2208369A2 2010-07-21
EP3301947A1 2018-04-04
Attorney, Agent or Firm:
KREUZ, Georg (DE)
Claims:
CLAIMS

1. A sound reproduction system (110), comprising: a plurality of loudspeakers configured to generate a sound field in a sound reproduction zone; a processing circuitry configured to operate the plurality of loudspeakers based on an input audio content in a plurality of different sound reproduction modes, comprising a first reflection-based sound reproduction mode (410a) or a second direct-sound based sound reproduction mode (410b); wherein the processing circuitry is further configured to obtain information about the acoustic environment of the sound reproduction system (110) and to determine, based on the information about the acoustic environment of the sound reproduction system (110), a performance measure of the first reflection-based sound reproduction mode (410a); wherein the processing circuitry is further configured to, based on the performance measure, select or recommend the first reflection-based sound reproduction mode (410a) or the second direct-sound based sound reproduction mode (410b) for operating the plurality of loudspeakers.

2. The system (110) of claim 1, wherein the processing circuitry is configured to select or recommend the first reflection-based sound reproduction mode (410a) based on a comparison of a value of the performance measure with a performance measure value threshold.

3. The system of claim 1 or 2, wherein the processing circuitry is configured to determine the performance measure based on a plurality of quality indexes, wherein each quality index is indicative of a different acoustic performance of the sound reproduction provided by the sound reproduction system within the acoustic environment.

4. The system of claim 3, wherein at least one of the plurality of quality indexes is based on a plurality of quality sub-indices and wherein the at least one of the plurality of quality indexes is an average or a minimum of the plurality of quality sub-indices.

5. The system of claim 3 or 4, wherein the processing circuitry is configured to determine the performance measure as a weighted sum of the plurality of quality indexes.

6. The system of claim 5, wherein one or more weights of the weighted sum of the plurality of quality indexes depend on the type of the input audio content, and/or the number of the plurality of quality indexes.

7. The system of any one of claims 3 to 6, wherein the processing circuitry is further configured to adjust one or more sound reproduction parameters of the sound reproduction system for improving one or more of the plurality of quality indexes.

8. The system of any one of claims 3 to 7, wherein the plurality of quality indexes comprises a first quality index indicative of a virtual loudspeaker angle conformity of the first reflection-based sound reproduction mode.

9. The system of claim 8, wherein the information about the acoustic environment of the sound reproduction system comprises information about an actual angle of a virtual loudspeaker provided by the first reflection-based sound reproduction mode and wherein the processing circuitry is configured to determine the first quality index based on the actual angle and a target angle of the virtual loudspeaker.

10. The system of any one of claims 3 to 9, wherein the plurality of quality indexes comprises a second quality index indicative of a virtual loudspeaker angle symmetry of the first reflection-based sound reproduction mode.

11. The system of claim 10, wherein the information about the acoustic environment of the sound reproduction system comprises information about a first angle of a first virtual loudspeaker and a second angle of a second virtual loudspeaker provided by the first reflection-based sound reproduction mode and wherein the processing circuitry is configured to determine the second quality index based on the first angle and the second angle.

12. The system (110) of any one of claims 3 to 11, wherein the plurality of quality indexes comprises a third quality index indicative of a localization accuracy of one or more virtual loudspeakers of the first reflection-based sound reproduction mode (410a).

13. The system (110) of claim 12, wherein the information about the acoustic environment of the sound reproduction system comprises information about a delay and a ratio between a reflected sound signal and a direct sound signal provided by the first reflection-based sound reproduction mode (410a) and wherein the processing circuitry is configured to determine the third quality index based on the information about the delay and the ratio between the reflected sound signal and the direct sound signal provided by the first reflection-based sound reproduction mode.

14. The system (110) of any one of claims 3 to 13, wherein the plurality of quality indexes further comprises a quality index indicative of sound quality, a quality index indicative of speech intelligibility and/or a quality index indicative of sound envelopment of the first reflection-based sound reproduction mode.

15. The system (110) of any one of the preceding claims, wherein the processing circuitry is further configured to check, based on the information about the acoustic environment of the sound reproduction system (110), whether the acoustic environment of the sound reproduction system (110) meets one or more minimal requirements for the first reflection-based sound reproduction mode (410a) and to select or recommend the second direct-sound based sound reproduction mode (410b) for operating the plurality of loudspeakers, if the acoustic environment of the sound reproduction system (110) does not meet the one or more minimal requirements for the first reflection-based sound reproduction mode (410a).

16. The system (110) of any one of the preceding claims, wherein the processing circuitry is configured to implement one or more beamformers for operating the plurality of loudspeakers based on the input audio content in the first reflection-based sound reproduction mode (410a).

17. The system (110) of any one of the preceding claims, wherein the second direct-sound based sound reproduction mode (410b) for operating the plurality of loudspeakers comprises a mono sound reproduction mode (410b), a stereo sound reproduction mode (410b) or a cross-talk cancellation reproduction mode (410b).

18. The system (110) of any one of the preceding claims, wherein the sound reproduction system (110) further comprises a user interface and wherein the processing circuitry is configured to recommend the first reflection-based sound reproduction mode (410a) or the second direct-sound based sound reproduction mode (410b) for operating the plurality of loudspeakers to a listener (130) via the user interface.

19. The system (110) of any one of the preceding claims, wherein the sound reproduction system (110) further comprises one or more sensors configured to provide at least a portion of the information about the acoustic environment of the sound reproduction system (110), including information about a position of a listener (130) within the sound reproduction zone and/or information about a position, an orientation, and/or a reflection property of a reflecting surface (120a,b) within the acoustic environment of the sound reproduction system (110).

20. The system (110) of any one of the preceding claims, wherein the sound reproduction system (110) further comprises one or more microphones configured to provide at least a portion of the information about the acoustic environment of the sound reproduction system (110).

21. A sound reproduction method (1400), comprising: generating (1401) with a plurality of loudspeakers a sound field in a sound reproduction zone; operating (1403) the plurality of loudspeakers based on an input audio content in a plurality of different sound reproduction modes, comprising a first reflection-based sound reproduction mode (410a) or a second direct-sound based sound reproduction mode (410b); obtaining (1405) information about the acoustic environment; determining (1407), based on the information about the acoustic environment, a performance measure of the first reflection-based sound reproduction mode (410a); and based on the performance measure, selecting or recommending (1409) the first reflection-based sound reproduction mode (410a) or the second direct-sound based sound reproduction mode (410b) for operating the plurality of loudspeakers.

22. A computer program product comprising a computer-readable storage medium carrying program code which causes a computer or a processor to perform the method (1400) of claim 21 when the program code is executed by the computer or the processor.

Description:
Sound reproduction system and method

TECHNICAL FIELD

The present disclosure relates to audio processing and sound generation. More specifically, the present disclosure relates to a sound reproduction system and method.

BACKGROUND

In the last decades, surround sound audio formats such as 5.1, which are supported by e.g. Dolby Digital, DTS and THX and are commonly used in home theaters, have become more and more popular. It is almost natural that movies on a DVD/Blu-ray disc provide a surround format in addition to stereo. To reproduce a surround sound format, individual loudspeakers are usually placed in the room around the listener. The more accurate the positioning of the loudspeakers, the more accurate the reproduction of the surround sound. Some sound reproduction systems also provide calibration hardware/software to compensate for any errors in speaker positioning and/or negative influences of room acoustics on the sound characteristics of the reproduction. Due to advanced hardware and advanced audio processing algorithms, it has become possible to replace the often impractical setup of individual speakers with a single device, such as a soundbar. Using transaural or reflection-based techniques, virtual speakers are formed around the listener to reproduce surround or even 3D sound.

SUMMARY

An improved sound reproduction system and method are provided by the subject matter of the independent claims. Further implementation forms are apparent from the dependent claims, the description and the figures.

More specifically, according to a first aspect, a sound reproduction system is provided. The sound reproduction system comprises a plurality of loudspeakers configured to generate a sound field in a sound reproduction zone. Moreover, the sound reproduction system comprises a processing circuitry configured to operate the plurality of loudspeakers based on an input audio content in a plurality of different sound reproduction modes, comprising a first reflection-based sound reproduction mode or a second direct-sound based sound reproduction mode. The first reflection-based sound reproduction mode may be a sound reproduction mode using beamforming. The processing circuitry of the sound reproduction system is further configured to obtain information about the acoustic environment of the sound reproduction system and to determine, based on the information about the acoustic environment of the sound reproduction system, a performance measure of the first reflection-based sound reproduction mode. Based on the performance measure, the processing circuitry of the sound reproduction system is further configured to select or recommend the first reflection-based sound reproduction mode or the second direct-sound based sound reproduction mode for operating the plurality of loudspeakers. Thus, the sound reproduction system may select or recommend the sound reproduction mode most suited for a given acoustic environment.

In a further possible implementation form, the processing circuitry of the sound reproduction system is configured to select or recommend the first reflection-based sound reproduction mode based on a comparison of a value of the performance measure with a performance measure value threshold. For instance, if the value of the performance measure is larger than the performance measure value threshold, the processing circuitry may select or recommend the first reflection-based sound reproduction mode. Thus, the sound reproduction system may determine the most appropriate sound reproduction mode in a computationally efficient way.
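The threshold comparison described above can be sketched in a few lines of Python. This is only an illustration: the names `Mode` and `select_mode` and the default threshold of 0.5 are assumptions, since the disclosure does not prescribe a scale or a threshold value.

```python
from enum import Enum


class Mode(Enum):
    """The two kinds of sound reproduction mode named in the disclosure."""
    REFLECTION_BASED = "first reflection-based mode"
    DIRECT_SOUND = "second direct-sound based mode"


def select_mode(performance: float, threshold: float = 0.5) -> Mode:
    """Pick the reflection-based mode only if the measure exceeds the threshold.

    The default threshold is a hypothetical value chosen for this sketch.
    """
    if performance > threshold:
        return Mode.REFLECTION_BASED
    return Mode.DIRECT_SOUND
```

A value equal to the threshold falls back to the direct-sound mode here, which follows the "larger than" wording of the example above.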

In a further possible implementation form, the processing circuitry is configured to determine the performance measure based on a plurality of quality indexes, wherein each quality index is indicative of a different acoustic performance of the sound reproduction provided by the sound reproduction system within the acoustic environment. Thus, the sound reproduction system may determine the most appropriate sound reproduction mode taking into account a plurality of different aspects of the acoustic environment.

In a further possible implementation form, at least one of the plurality of quality indexes is based on a plurality of quality sub-indices, wherein the at least one of the plurality of quality indexes is an average or a minimum of the plurality of quality sub-indices. Thus, the sound reproduction system may determine the respective quality indexes in a computationally efficient way.
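As a minimal sketch of this aggregation, the following Python function combines quality sub-indices into one quality index by either averaging them or taking their minimum. The function name, the `use_minimum` flag and the idea of having, say, one sub-index per virtual loudspeaker are assumptions made for illustration.

```python
from statistics import fmean
from typing import Sequence


def quality_index(sub_indices: Sequence[float], use_minimum: bool = False) -> float:
    """Aggregate quality sub-indices into a single quality index.

    The minimum is the pessimistic choice (the worst sub-index dominates),
    while the average lets a good sub-index compensate for a poor one.
    """
    if not sub_indices:
        raise ValueError("at least one quality sub-index is required")
    return min(sub_indices) if use_minimum else fmean(sub_indices)
```

For the sub-indices `[0.9, 0.5, 0.7]`, the minimum variant yields 0.5 and the average variant yields approximately 0.7.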

In a further possible implementation form, the processing circuitry is configured to determine the performance measure as a weighted sum of the plurality of quality indexes. Thus, the sound reproduction system may determine the most appropriate sound reproduction mode in a computationally efficient way.

In a further possible implementation form, one or more weights (herein also referred to as relevance gains) of the weighted sum of the plurality of quality indexes depend on the type of the input audio content, and/or the number of the plurality of quality indexes. Thus, the processing circuitry may be configured to adjust the one or more weights based on the type of the input audio content (for instance, speech input audio content), and/or the number of the plurality of quality indexes. Thus, the system may adapt the selection to the properties of a reproduction mode that are beneficial for the selected input audio content type.
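The weighted sum with content-dependent relevance gains can be sketched as follows. The index names, the content types and all gain values in `RELEVANCE_GAINS` are hypothetical, and dividing by the sum of the gains is just one assumed way of making the weights depend on the number of quality indexes.

```python
from typing import Dict

# Hypothetical relevance gains per content type; the values are illustrative only.
RELEVANCE_GAINS: Dict[str, Dict[str, float]] = {
    "movie": {"angle_conformity": 1.0, "angle_symmetry": 1.0, "localization": 1.0},
    "speech": {"angle_conformity": 0.5, "angle_symmetry": 0.5, "localization": 2.0},
}


def performance_measure(indexes: Dict[str, float], content_type: str) -> float:
    """Weighted sum of the available quality indexes for one content type.

    The sum is normalised by the total gain of the indexes actually present,
    so the result stays on the same scale when fewer indexes are available.
    """
    gains = RELEVANCE_GAINS[content_type]
    total_gain = sum(gains[name] for name in indexes)
    if total_gain <= 0:
        raise ValueError("the relevance gains must sum to a positive value")
    weighted = sum(gains[name] * value for name, value in indexes.items())
    return weighted / total_gain
```

With the same quality indexes, the two content types can lead to different performance measures and therefore to different mode decisions.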

In a further possible implementation form, the processing circuitry is further configured to adjust one or more sound reproduction parameters of the sound reproduction system for improving one or more of the plurality of quality indexes. Thus, the sound reproduction system may try to improve an unfavorable acoustic environment for still being able to use the first reflection-based sound reproduction mode with a good performance.

In a further possible implementation form, the plurality of quality indexes comprises a first quality index indicative of a virtual loudspeaker angle conformity of the first reflection-based sound reproduction mode. Thus, the system is able to judge on the accuracy of the reproduced audio format in a computationally efficient way.

In a further possible implementation form, the information about the acoustic environment of the sound reproduction system comprises information about an actual angle of a virtual loudspeaker provided by the first reflection-based sound reproduction mode and the processing circuitry is configured to determine the first quality index based on the actual angle and a target angle of the virtual loudspeaker. Thus, the system is able to judge on the accuracy of the reproduced audio format in a computationally efficient way.

In a further possible implementation form, the plurality of quality indexes comprises a second quality index indicative of a virtual loudspeaker angle symmetry of the first reflection-based sound reproduction mode. Thus, the system is able to judge on the accuracy of the reproduced audio format in a computationally efficient way.

In a further possible implementation form, the information about the acoustic environment of the sound reproduction system comprises information about a first angle of a first virtual loudspeaker and a second angle of a second virtual loudspeaker provided by the first reflection-based sound reproduction mode and the processing circuitry is configured to determine the second quality index based on the first angle and the second angle. Thus, the system is able to judge on the accuracy of the reproduced audio format in a computationally efficient way.
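To make the angle-based quality indexes more concrete, the following Python sketch maps angles to values between 0 and 1. The linear falloff, the 30-degree tolerance and the sign convention (signed angles in degrees relative to the frontal direction, negative to the left) are purely illustrative assumptions and are not taken from the disclosure.

```python
def angular_error(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two angles, in degrees (0..180)."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def conformity_index(actual_deg: float, target_deg: float,
                     tolerance_deg: float = 30.0) -> float:
    """1.0 when the virtual loudspeaker sits at its target angle, falling
    linearly to 0.0 at the (hypothetical) tolerance."""
    return max(0.0, 1.0 - angular_error(actual_deg, target_deg) / tolerance_deg)


def symmetry_index(first_deg: float, second_deg: float,
                   tolerance_deg: float = 30.0) -> float:
    """1.0 when the two virtual loudspeakers mirror each other about the
    frontal direction, i.e. when first_deg equals -second_deg."""
    return max(0.0, 1.0 - angular_error(first_deg, -second_deg) / tolerance_deg)
```

For example, a virtual loudspeaker that ends up at 95 degrees instead of a target of 110 degrees gets a conformity index of 0.5 under these assumptions.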

In a further possible implementation form, the plurality of quality indexes comprises a third quality index indicative of a localization accuracy of one or more virtual loudspeakers of the first reflection-based sound reproduction mode. Thus, the system is able to judge on the accuracy of the reproduced audio format in a computationally efficient way.

In a further possible implementation form, the information about the acoustic environment of the sound reproduction system comprises information about a delay and a ratio between a reflected sound signal and a direct sound signal provided by the first reflection-based sound reproduction mode and the processing circuitry is configured to determine the third quality index based on the information about the delay and the ratio between the reflected sound signal and the direct sound signal provided by the first reflection-based sound reproduction mode. Thus, the system is able to judge on the accuracy of the reproduced audio format in a computationally efficient way.

In a further possible implementation form, the plurality of quality indexes further comprises a quality index indicative of sound quality, a quality index indicative of speech intelligibility and/or a quality index indicative of sound envelopment of the first reflection-based sound reproduction mode. Thus, the sound reproduction system may determine the most appropriate sound reproduction mode taking into account a plurality of different aspects of the acoustic environment.

In a further possible implementation form, the processing circuitry is further configured to check, based on the information about the acoustic environment of the sound reproduction system, whether the acoustic environment of the sound reproduction system meets one or more minimal requirements for the first reflection-based sound reproduction mode and to select or recommend the second direct-sound based sound reproduction mode for operating the plurality of loudspeakers, if the acoustic environment of the sound reproduction system does not meet the one or more minimal requirements for the first reflection-based sound reproduction mode. Thus, the sound reproduction system may determine in a computationally efficient way whether the acoustic environment of the sound reproduction system actually meets the one or more minimal requirements before determining the performance measure of the first reflection-based sound reproduction mode.

In a further possible implementation form, the processing circuitry is configured to implement one or more beamformers for operating the plurality of loudspeakers based on the input audio content in the first reflection-based sound reproduction mode. The one or more beamformers may comprise one or more add and delay beamformers.
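As a rough sketch of how a delay-based beamformer of the kind mentioned above (commonly called delay-and-sum) steers sound to the side, the following Python function computes per-loudspeaker delays for a uniform linear array. The array geometry, the spacing, the speed of sound of 343 m/s and the function name are assumptions made for illustration; the disclosure does not specify them.

```python
import math
from typing import List

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at room temperature


def steering_delays(num_speakers: int, spacing_m: float, angle_deg: float) -> List[float]:
    """Delays in seconds that steer a uniform linear array toward angle_deg.

    The angle is measured from the array's broadside direction; positive
    angles steer toward the higher-index loudspeakers. The delays are shifted
    so that the earliest loudspeaker has zero delay (all values non-negative).
    """
    sin_theta = math.sin(math.radians(angle_deg))
    raw = [n * spacing_m * sin_theta / SPEED_OF_SOUND_M_S for n in range(num_speakers)]
    offset = min(raw)
    return [delay - offset for delay in raw]
```

Each loudspeaker plays the same signal delayed by its entry in the returned list, so that the wavefronts add up in the chosen direction, for instance toward a side wall.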

In a further possible implementation form, the second direct-sound based sound reproduction mode for operating the plurality of loudspeakers comprises a mono sound reproduction mode, a stereo sound reproduction mode or a cross-talk cancellation reproduction mode. Thus, the sound reproduction system may operate the plurality of loudspeakers in a second direct-sound based sound reproduction mode appropriate for a less optimal acoustic environment.

In a further possible implementation form, the sound reproduction system further comprises a user interface and the processing circuitry is configured to recommend the first reflection-based sound reproduction mode or the second direct-sound based sound reproduction mode for operating the plurality of loudspeakers to a listener via the user interface. The user interface may issue a visual and/or acoustical signal to the listener for recommending the first reflection-based sound reproduction mode or the second direct-sound based sound reproduction mode.

In a further possible implementation form, the sound reproduction system further comprises one or more sensors configured to provide at least a portion of the information about the acoustic environment of the sound reproduction system, including information about a position of a listener within the sound reproduction zone and/or information about a position, an orientation, and/or a reflection property of a reflecting surface within the acoustic environment of the sound reproduction system.

In a further possible implementation form, the sound reproduction system further comprises one or more microphones configured to provide at least a portion of the information about the acoustic environment of the sound reproduction system.

According to a second aspect a sound reproduction method is provided. The method comprises: generating with a plurality of loudspeakers a sound field in a sound reproduction zone; operating the plurality of loudspeakers based on an input audio content in a plurality of different sound reproduction modes, comprising a first reflection-based sound reproduction mode or a second direct-sound based sound reproduction mode; obtaining information about the acoustic environment; determining, based on the information about the acoustic environment, a performance measure of the first reflection-based sound reproduction mode; and based on the performance measure, selecting or recommending the first reflection-based sound reproduction mode or the second direct-sound based sound reproduction mode for operating the plurality of loudspeakers. Thus, the sound reproduction method may select or recommend the sound reproduction mode most suited for a given acoustic environment.

The method according to the second aspect of the present disclosure can be performed by the sound reproduction system according to the first aspect of the present disclosure. Thus, further features of the method according to the second aspect result directly from the functionality of the sound reproduction system according to the first aspect as well as its different implementation forms described above and below. In other words, further features and implementation forms of the method according to the second aspect correspond to the features and implementation forms of the sound reproduction system according to the first aspect.

According to a third aspect, a computer program product is provided comprising a computer-readable storage medium for storing program code which causes a computer or a processor to perform the method according to the second aspect when the program code is executed by the computer or the processor.

Details of one or more embodiments are set forth in the accompanying drawings and the description below. Other features, objects, and advantages will be apparent from the description, drawings, and claims.

BRIEF DESCRIPTION OF THE DRAWINGS

In the following, embodiments of the present disclosure are described in more detail with reference to the attached figures and drawings, in which:

Fig. 1 shows a schematic top view illustrating a sound reproduction system according to an embodiment for operating in different sound reproduction modes;

Figs. 2a-c show schematic top views illustrating different acoustical environments of a sound reproduction system according to an embodiment for operating in different sound reproduction modes;

Fig. 3 shows a schematic top view illustrating different sound zones generated by a sound reproduction system according to an embodiment;

Fig. 4 shows a flow diagram illustrating processing stages implemented by a sound reproduction system according to an embodiment;

Fig. 5 shows a schematic top view illustrating the generation of a virtual loudspeaker by a sound reproduction system according to an embodiment;

Fig. 6 shows a schematic top view illustrating the generation of a further virtual loudspeaker by a sound reproduction system according to an embodiment;

Fig. 7 shows a function implemented by a sound reproduction system according to an embodiment for estimating its performance;

Fig. 8 shows a further function implemented by a sound reproduction system according to an embodiment for estimating its performance;

Figs. 9a-c show further functions implemented by a sound reproduction system according to an embodiment for estimating its performance;

Fig. 10 shows a further function implemented by a sound reproduction system according to an embodiment for estimating its performance;

Fig. 11 shows an exemplary measurement of a direct sound and a reflected sound generated by a sound reproduction system according to an embodiment;

Figs. 12a, b show further functions implemented by a sound reproduction system according to an embodiment for estimating its performance;

Figs. 13a-c show tables illustrating different performance measures of a sound reproduction system according to an embodiment; and

Fig. 14 shows a flow diagram illustrating a sound reproduction method according to an embodiment.

In the following, identical reference signs refer to identical or at least functionally equivalent features.

DETAILED DESCRIPTION OF THE EMBODIMENTS

In the following description, reference is made to the accompanying figures, which form part of the disclosure, and which show, by way of illustration, specific aspects of embodiments of the present disclosure or specific aspects in which embodiments of the present disclosure may be used. It is understood that embodiments of the present disclosure may be used in other aspects and comprise structural or logical changes not depicted in the figures. The following detailed description, therefore, is not to be taken in a limiting sense, and the scope of the present disclosure is defined by the appended claims.

For instance, it is to be understood that a disclosure in connection with a described method may also hold true for a corresponding device or system configured to perform the method and vice versa. For example, if one or a plurality of specific method steps are described, a corresponding device may include one or a plurality of units, e.g. functional units, to perform the described one or plurality of method steps (e.g. one unit performing the one or plurality of steps, or a plurality of units each performing one or more of the plurality of steps), even if such one or more units are not explicitly described or illustrated in the figures. On the other hand, for example, if a specific apparatus is described based on one or a plurality of units, e.g. functional units, a corresponding method may include one step to perform the functionality of the one or plurality of units (e.g. one step performing the functionality of the one or plurality of units, or a plurality of steps each performing the functionality of one or more of the plurality of units), even if such one or plurality of steps are not explicitly described or illustrated in the figures. Further, it is understood that the features of the various exemplary embodiments and/or aspects described herein may be combined with each other, unless specifically noted otherwise.

Figure 1 shows a schematic top view illustrating a sound reproduction system 110 according to an embodiment. The sound reproduction system 110 comprises a plurality of loudspeakers configured to generate a sound field in a sound reproduction zone, where a listener 130 is located within a room 120. As illustrated in figure 1, the sound reproduction system 110 may be a soundbar, a smart speaker or a component of a TV. As will be described in more detail below, the sound reproduction system 110 comprises a processing circuitry configured to operate the plurality of loudspeakers based on an input audio content in a plurality of different sound reproduction modes, comprising a first reflection-based sound reproduction mode or a second direct-sound based sound reproduction mode.

The processing circuitry of the sound reproduction system 110 may be implemented in hardware and/or software, such as by means of one or more processors. The hardware may comprise digital circuitry, or both analog and digital circuitry. Digital circuitry may comprise components such as application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), or general-purpose processors. In an embodiment, the sound reproduction system 110 may further comprise a communication interface for transmitting and receiving data. In an embodiment, the communication interface may comprise a wired or wireless communication interface, such as a WiFi interface. The sound reproduction system 110 may further comprise a memory configured to store executable program code which, when executed by the processing circuitry, causes the sound reproduction system 110 to perform the functions and operations described herein.

The first reflection-based sound reproduction mode may comprise a beamforming sound reproduction mode (also known as "Dipole Processing" or "Null Steering" sound reproduction mode). As will be appreciated, such a beamforming sound reproduction mode uses the plurality of loudspeakers, for instance, an array of loudspeakers, to redirect the direction of propagation of the sound waves generated by the loudspeakers to the side in order to create reflections at e.g. walls 120a, b of a room 120. This gives the listener 130 the impression that the sound is coming from a different direction. In an embodiment, the plurality of loudspeakers of the sound reproduction system 110 may comprise one or more speakers tilted to the side or upward to create pure reflections.

Since reflection-based sound reproduction modes make use of their environment, calibration to the given listening room 120 provides a better performance. For example, setting the correct angles, levels and delays of the outgoing sound will cause the reflections arriving at the listening position of the listener 130 to have equal magnitudes and delays. Furthermore, it is possible to correct the sound characteristics of the reflections to achieve a spectrally balanced impression. To do so, information about the acoustic environment of the sound reproduction system 110 is necessary, such as information about the geometry of the room 120, the position of the listener(s) 130 relative to the loudspeakers of the sound reproduction system 110, and other acoustic parameters of the room 120, such as the RT60 parameter, i.e. the reverberation time defining the time for the sound pressure level to be reduced by 60 dB. Thus, in an embodiment, the processing circuitry of the sound reproduction system 110 is configured to obtain this type of information about the acoustic environment of the sound reproduction system 110. In an embodiment, the sound reproduction system 110 may comprise one or more internal or external sensors, such as one or more optical sensors and/or one or more microphones for obtaining this information about the acoustic environment of the sound reproduction system 110 (e.g. by means of acoustic measurements).

Conventional calibration methods can quickly reach their limits if, for example, there are no reflective surfaces in certain directions or the position and/or orientation of these surfaces is unfavorable. This is illustrated in figures 2a-c, where an "ideal" acoustical environment of the sound reproduction system 110 is compared with two exemplary more realistic acoustic environments. For instance, as illustrated in figure 2b, depending on the surface properties, the reflection can be so strongly attenuated that it cannot be used for sound rendering. Thus, the quality of the surround sound reproduction can vary strongly so that a reflection-based reproduction method may not always be the best choice.

As will be described in the following in more detail, the processing circuitry of the sound reproduction system 110 is configured to determine, based on the information about the acoustic environment of the sound reproduction system 110, a performance measure of the first reflection-based sound reproduction mode. Based on the performance measure of the first reflection-based sound reproduction mode, the processing circuitry of the sound reproduction system 110 is further configured to select or recommend the first reflection-based sound reproduction mode or a second direct-sound based sound reproduction mode for operating the plurality of loudspeakers of the sound reproduction system 110. In an embodiment, the second direct-sound based sound reproduction mode for operating the plurality of loudspeakers may comprise a mono sound reproduction mode, a stereo sound reproduction mode or a cross-talk cancellation reproduction mode.

In an embodiment, prior to determining the performance measure of the first reflection-based sound reproduction mode, the processing circuitry of the sound reproduction system 110 may use the information about the acoustic environment for checking whether one or more minimum requirements for operating the plurality of loudspeakers in the first reflection-based sound reproduction mode are met. For instance, these minimum requirements may comprise whether or not reflective surfaces 120a,b are present in the sector in which a reflection is to be generated. In an embodiment, the sectors may be defined as shown in figure 3, e.g. into front/back and left/right directions.

As already mentioned above, the processing circuitry of the sound reproduction system 110 is configured to determine, based on the information about the acoustic environment of the sound reproduction system 110, a performance measure of the first reflection-based sound reproduction mode and based thereon select or recommend the first reflection-based sound reproduction mode or the second direct-sound based sound reproduction mode for operating the plurality of loudspeakers of the sound reproduction system 110. This will be described in more detail in the following under further reference to figure 4, which shows a flow diagram illustrating processing stages implemented by the processing circuitry of the sound reproduction system 110 according to an embodiment.

In a block 401 the processing circuitry of the sound reproduction system 110 according to an embodiment obtains the information about the acoustic environment of the sound reproduction system 110. As already described above, this information may be provided by one or more internal or external sensors, such as one or more optical sensors and/or one or more microphones, and may contain geometrical information about the room 120 or the reflective surfaces 120a, b relative to the sound reproduction system 110, information about the number and position(s) of listener(s) 130 relative to the sound reproduction system 110, room acoustic parameters (e.g. RT60, C50), and/or information about the characteristics or properties of reflective surfaces 120a,b within the room 120. The block 401 may also provide information about any parameters required for further calculations, such as angles of reflection.

As already described above, in a block 403 the processing circuitry of the sound reproduction system 110 may check whether one or more minimum requirements for operating the plurality of loudspeakers in the first reflection-based sound reproduction mode are met. In an embodiment, this check of the minimum requirements does not have to include all of the sectors illustrated in figure 3. However, usually the sectors "Front Left" and "Front Right" should be present for reproducing a wide scene in front of the listener 130.
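The check in block 403 can be pictured as a simple set comparison. The following is only a minimal sketch under stated assumptions: the sector names, the `REQUIRED_SECTORS` constant and the function `minimum_requirements_met` are hypothetical and are not taken from the figures; how the set of sectors containing a usable reflective surface is derived from the sensor data is outside the scope of the sketch.

```python
# Hypothetical sector labels, loosely following the front/back and
# left/right split described for figure 3.
REQUIRED_SECTORS = frozenset({"front_left", "front_right"})


def minimum_requirements_met(reflective_sectors, required=REQUIRED_SECTORS):
    """Return True if every required sector contains a reflective surface.

    reflective_sectors: iterable of sector labels in which a usable
    reflective surface was detected (assumed to come from block 401).
    """
    return set(required).issubset(set(reflective_sectors))


# Example: a wall on the left only is not enough for a wide front scene.
left_only = minimum_requirements_met({"front_left", "back_left"})
both_sides = minimum_requirements_met({"front_left", "front_right"})
```

With this sketch, `left_only` evaluates to `False` and `both_sides` to `True`, which corresponds to falling back to the direct-sound based mode in the first case.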

If the minimum requirements are not met in block 403 of figure 4, the processing circuitry of the sound reproduction system 110 selects or recommends the second direct-sound based sound reproduction mode 410b for operating the plurality of loudspeakers of the sound reproduction system 110. Whether the second direct-sound based sound reproduction mode 410b is a 3D (e.g. Crosstalk-Cancellation which usually is designed only for one listener) or a "classical" sound reproduction mode (e.g. mono, stereo) may depend on the number of listeners 130 and potentially the input audio format (e.g. 1.0, 2.0, 5.1, binaurally rendered content).

For determining the performance measure of the first reflection-based sound reproduction mode in the embodiment shown in figure 4 the processing circuitry of the sound reproduction system 110 is configured to determine a plurality of quality indexes 405a-n. As will be described in more detail in the following, these quality indexes 405a-n may relate, for instance, to a virtual loudspeaker angle conformity (indicative of whether the reflection angles with respect to the listener 130 are similar to those of the target layout, e.g. 5.1), to a virtual loudspeaker angle symmetry (indicative of whether both front (or back) channels are symmetrical with respect to the listener 130), and/or a localization accuracy of the virtual loudspeaker (depending on the radiation pattern of the sound the perceived angle of the virtual loudspeaker may differ from that of the reflection point). Moreover, the quality indexes 405a-n may relate, for instance, to the sound quality (depending on the characteristics of the reflective surface 120a,b the sound may become diffuse and result in a blurred, smeared and imprecise sound image), speech intelligibility (depending on the acoustical room parameters RT60 and C50 the speech intelligibility can be estimated), and/or envelopment (e.g. depending on whether or not or where the reflections for surround channels can be realized the perceived envelopment will differ).

Thus, for each of these acoustical aspects the processing circuitry of the sound reproduction system 110 may determine a quality index (QI) value 405a-n. If it is possible to improve the QI value by one or more adaptations 406a-n, the processing circuitry of the sound reproduction system 110 may implement the optional loop A*. For instance, the angle of the virtual sound source (usually it is the angle of the reflection with respect to the listener 130) may be psychoacoustically adjusted by the processing circuitry of the sound reproduction system 110 using amplitude panning between e.g. the non-modified virtual sound source and the plurality of loudspeakers.

In blocks 407a-n each of the resulting QI values may be weighted by a respective relevance gain (RG). Using the RG weights the processing circuitry of the sound reproduction system 110 may adjust the degree of influence of the respective aspect on the overall quality performance measure. For practical reasons, the value ranges of the QI values and RG weights may be normalized to the range [0, 1]. In block 409 of figure 4 the processing circuitry of the sound reproduction system 110 determines a performance measure value P of the first reflection-based sound reproduction mode as a weighted sum of the plurality of quality index values 405a-n. The resulting performance measure value P is then compared with a predefined threshold (see block 409 of figure 4) which determines whether the performance quality of the first reflection-based sound reproduction mode is sufficiently good or not.
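The weighting and thresholding of blocks 407a-n and 409 may be sketched as follows. This is an illustrative sketch only: the function names `performance_measure` and `select_mode` and the mode labels are hypothetical, and treating a value equal to the threshold as sufficient is an assumption, since the description does not specify the boundary case.

```python
def performance_measure(qi_values, rg_weights):
    """Weighted sum P of quality index (QI) values with relevance gains (RG).

    Both sequences are expected to be normalized to the range [0, 1].
    """
    if len(qi_values) != len(rg_weights):
        raise ValueError("need exactly one RG weight per QI value")
    for value in (*qi_values, *rg_weights):
        if not 0.0 <= value <= 1.0:
            raise ValueError("QI values and RG weights must lie in [0, 1]")
    return sum(qi * rg for qi, rg in zip(qi_values, rg_weights))


def select_mode(p, threshold):
    """Pick the reproduction mode from P (assumption: P == threshold passes)."""
    return "reflection_based" if p >= threshold else "direct_sound"


# Two aspects with equal relevance: P = 0.5 * 1.0 + 0.5 * 0.5 = 0.75
p = performance_measure([1.0, 0.5], [0.5, 0.5])
mode = select_mode(p, threshold=0.6)
```

If the RG weights are chosen to sum to 1, P stays within [0, 1] and can be compared directly with a threshold from the same range.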

Based on the comparison with the threshold in block 409, the processing circuitry of the sound reproduction system 110 selects or recommends either the first reflection-based sound reproduction mode 410a or the second direct-sound based sound reproduction mode 410b for operating the plurality of loudspeakers. In an embodiment, the sound reproduction system 110 may comprise an acoustical or visual user interface, such as a display, wherein the processing circuitry is configured to recommend the first reflection-based sound reproduction mode 410a or the second direct-sound based sound reproduction mode 410b for operating the plurality of loudspeakers to the listener 130 via the user interface.

In an embodiment, in case an acoustical performance aspect is used more than once, for example, because a QI value is calculated for each of the channels to be rendered, the QI values of this aspect may be grouped as quality subindexes into one QI value. For example, the mean or the smallest value can be taken from the set of all QI values belonging to a specific aspect.
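The grouping of per-channel quality subindexes may be sketched as follows. The function name `group_subindexes` and its `method` parameter are hypothetical; only the two grouping rules named above (mean and smallest value) are covered.

```python
from statistics import mean


def group_subindexes(sub_qis, method="mean"):
    """Combine per-channel QI values of one aspect into a single QI value."""
    values = list(sub_qis)
    if not values:
        raise ValueError("at least one quality subindex is required")
    if method == "mean":
        return mean(values)
    if method == "min":
        # The weakest channel determines the QI value of the aspect.
        return min(values)
    raise ValueError(f"unknown grouping method: {method!r}")


grouped_mean = group_subindexes([1.0, 0.5])          # 0.75
grouped_min = group_subindexes([1.0, 0.5], "min")    # 0.5
```

Taking the minimum is the more conservative choice, since a single poorly rendered channel then dominates the QI value of the aspect.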

As illustrated by the processing block 404 in figure 4, for some QI/RG combinations the processing circuitry of the sound reproduction system 110 may take into account metadata. For example, in an embodiment, the information about whether the content is dialog-heavy or not can influence the RG value for the QI value indicative of the speech intelligibility.
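One conceivable way for metadata to influence the RG values is sketched below. Everything specific in this sketch is an assumption made for illustration: the aspect names, the boost factor of 2 and the renormalization of the weights to a sum of 1 are not taken from the description, which only states that the metadata can influence the RG value.

```python
def adjust_relevance_gains(rg, dialog_heavy, boost=2.0):
    """Return a copy of the RG mapping, adapted to content metadata.

    rg: mapping from (hypothetical) aspect name to RG weight.
    dialog_heavy: metadata flag, assumed to be supplied by block 404.
    boost: hypothetical factor applied to the speech intelligibility weight.
    """
    adjusted = dict(rg)
    if dialog_heavy and "speech_intelligibility" in adjusted:
        adjusted["speech_intelligibility"] *= boost
    total = sum(adjusted.values())
    if total <= 0.0:
        raise ValueError("RG weights must not all be zero")
    # Assumption: keep the weights summing to 1 so that P stays in [0, 1].
    return {aspect: weight / total for aspect, weight in adjusted.items()}


base = {"speech_intelligibility": 0.25, "envelopment": 0.25, "angle_conformity": 0.5}
for_dialog = adjust_relevance_gains(base, dialog_heavy=True)
```

For the example weights, the speech intelligibility weight rises from 0.25 to 0.4 for dialog-heavy content, while the other weights shrink proportionally.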

As will be appreciated, the following parameters may be chosen and adjusted by the processing circuitry of the sound reproduction system 110 depending, for instance, on the use case, the hardware specifications of the sound reproduction system 110 as well as the content type: the number of QI values, the specific QI value types, the RG values and/or the predefined threshold value.

In the following, some exemplary quality indexes that may be used by the processing circuitry of the sound reproduction system 110 for determining the performance measure value P of the first reflection-based sound reproduction mode 410a will be described in more detail, such as the quality index indicative of the virtual loudspeaker angle conformity. For instance, the target angle for virtual loudspeaker 501a for the channel "Front Right" in a 5.1 configuration may be 30°. As illustrated in figures 5 and 6, the angle of the virtual loudspeaker 501b may vary dependent on the position and the orientation of the reflective surface 120b. For instance, in figure 5 the angle of the virtual loudspeaker 501b is 60°, while in figure 6 it is 110°.

Since both cases differ from the target angle of 30° to a different extent, the processing circuitry of the sound reproduction system 110 may determine different QI values for these cases. In an embodiment, in the first case the QI value will be significantly higher than in the second case. To this end, the processing circuitry of the sound reproduction system 110 may implement a QI gain curve, such as illustrated in figure 7. Based on the exemplary QI gain curve illustrated in figure 7, the processing circuitry of the sound reproduction system 110 may determine for the azimuth angle of 60° a QI value of about 0.5 and for the azimuth angle of 110° a QI value of about 0.05. This reflects that the virtual loudspeaker angle of 110° is essentially unsuitable for rendering the "Front Right" channel. In this scenario, an adaptation 406a-n may be realized by amplitude panning. The processing circuitry of the sound reproduction system 110 may change the position of the virtual speaker 501b by mixing its signal into the plurality of loudspeakers of the sound reproduction system 110. Thus, by means of the adaptation 406a-n the final angle may approach the target angle 501a and the corresponding QI value may increase. A further exemplary gain curve that may be implemented by the processing circuitry of the sound reproduction system 110 for determining the QI value indicative of the virtual loudspeaker angle conformity is illustrated in figure 8. As will be appreciated, the exemplary gain curve shown in figure 8 provides a value of 1 within a range of the target angle of 30° and a value of 0 otherwise.
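A gain curve of the rectangular kind described for figure 8 may be sketched as follows. The function name `box_gain` is hypothetical, and the tolerance of 10° is an assumed value chosen purely for illustration, since the width of the range around the target angle is not stated in the text.

```python
def box_gain(angle_deg, target_deg=30.0, tolerance_deg=10.0):
    """QI value for angle conformity using a rectangular gain curve.

    Returns 1.0 if the virtual loudspeaker angle lies within the assumed
    tolerance around the target angle, and 0.0 otherwise.
    """
    return 1.0 if abs(angle_deg - target_deg) <= tolerance_deg else 0.0


# The two virtual loudspeaker angles discussed for figures 5 and 6.
qi_fig5 = box_gain(60.0)    # 0.0, outside the assumed +/-10 degree range
qi_fig6 = box_gain(110.0)   # 0.0
qi_near = box_gain(35.0)    # 1.0, close enough to the 30 degree target
```

Compared with a gradually decaying curve, such a rectangular curve makes a hard accept/reject decision per channel, so an angle of 60° would receive no partial credit.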

In an embodiment, the processing circuitry of the sound reproduction system 110 is configured to perform the mode selection/recommendation in real-time or offline. Whether a rendering mode is changed may depend potentially on one or more of the following factors. If the content type (e.g. dialogue based content, movie, music) has an influence on the calculation of at least one quality index, the rendering mode selection/recommendation may be adjusted in real time depending on this. If the environment parameter estimation is based on optical sensor data and a real-time tracking of the listener 130 is available, the performance estimation may also be performed in real-time if the listener 130 is outside of the already calibrated position.

In the following, an embodiment of the sound reproduction system 110 will be described in more detail using a QI value indicative of the virtual loudspeaker angle conformity, a QI value indicative of the virtual loudspeaker angle symmetry, and a QI value indicative of a localization accuracy of the virtual loudspeaker. For the QI value indicative of the virtual loudspeaker angle conformity the processing circuitry of the sound reproduction system 110 may use "Front Left", "Front Right" and "Front Up" channels as well as the exemplary QI gain curves illustrated in figures 9a-c. These curves may be designed heuristically based on listening sessions, defining the desired accuracy of the virtual sound source angles. The target azimuth angles for the front channels are defined as +/- 50°. In an embodiment, the surround channels may not be used, thus in order to achieve a wider spatial image the target angles may be chosen larger than those of a standard 5.1 layout. By way of example, the target elevation angle for the "Front Up" channel may be defined as 50°.

For the QI value indicative of the virtual loudspeaker angle symmetry it has to be appreciated that in a multichannel audio layout, opposing speakers often work together. For example, the two "Front" channels define the width of the scene, which in most cases is expected to be symmetrical. It also may happen that phantom sound sources are created between the opposing pairs of loudspeakers, which should often appear centrally. For these reasons, the preservation of the loudspeaker angle symmetry is often very important. As can be taken from figure 10, which shows a heuristically obtained gain curve, small deviations from the perfect symmetry will achieve high gains. Large deviations (e.g. 50°) will be "penalized" more strongly. In an embodiment, the processing circuitry of the sound reproduction system 110 may determine the symmetry of both front channels as follows:

$$\Delta\varphi = \bigl|\,|\varphi_1| - |\varphi_2|\,\bigr|,$$

where $\varphi_1$ and $\varphi_2$ are the azimuth angles of the corresponding loudspeaker pair.
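The symmetry deviation, i.e. the absolute difference between the absolute azimuth angles of the loudspeaker pair, may be computed as in the following sketch. The function name `symmetry_deviation` and the example angles are hypothetical; the mapping of the resulting deviation to a QI value via the gain curve of figure 10 is not reproduced here because the curve is only given graphically.

```python
def symmetry_deviation(azimuth_1_deg, azimuth_2_deg):
    """Absolute difference of the absolute azimuth angles of a channel pair."""
    return abs(abs(azimuth_1_deg) - abs(azimuth_2_deg))


# Perfectly mirrored front channels at -50 and +50 degrees.
mirrored = symmetry_deviation(-50.0, 50.0)   # 0.0

# Hypothetical case: each channel is off its +/-50 degree target by
# 15 degrees, in opposite senses, which yields an asymmetry of 30 degrees.
skewed = symmetry_deviation(-65.0, 35.0)     # 30.0
```

Because only the absolute values of the angles enter, the measure is independent of which channel of the pair is on the left and which on the right.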

For the QI value indicative of a localization accuracy of the virtual loudspeaker it has to be appreciated that the basic principle of beamforming (or "Dipole Processing" or "Null-Steering") is to maximize the energy radiation in the desired direction and to minimize the energy radiation in the direction of the listener (null). Depending on how long the travel path of the reflection is but also how much it is attenuated at the reflection point, this ratio will tend to become smaller. In figure 11 an impulse response of a certain beam direction is shown. It can be seen that the direct sound, although attenuated, compared to the intended beam reflection still has a substantial amount of energy. Depending on the delay between the direct sound and the intended reflection ("DelayDiff") and the (intended) reflection to direct sound ratio ("RDSR"), the direct sound will influence the perceived angle of the reflected wave.

Thus, the localization accuracy of the virtual loudspeaker will decrease. An exemplary gain curve for the QI value indicative of a localization accuracy of the virtual loudspeaker is shown in figure 12. The value range of RDSR, DelayDiff and the weight itself are normalized. The original value ranges were limited to realistic value ranges. The shape of the surface was determined heuristically, based on informal listening tests. The highest localization accuracy is achieved when the RDSR is high while the DelayDiff is small. Conversely, when RDSR is small and DelayDiff is large, the localization accuracy is low.

Since all three QI values used in this embodiment are concerned with the localization accuracy of the virtual sound sources, the RG values may be set to 1/3 each. The threshold may be set to 0.6, as will be described in more detail in the following based on several exemplary acoustic environments of the sound reproduction system 110.

In a first example with a "good" performance of the first reflection-based sound reproduction mode 410a, assuming distances of the side walls 120a, b of +/-120cm, the reflection angles will have a value of +/-50° in azimuth. In addition, the ceiling reflection shall appear at +50° elevation. The RDSR and DelayDiff values for this example are listed in the table shown in figure 13a. As can be taken from the table shown in figure 13a, the RDSR values are rather high and the DelayDiff values rather small. Furthermore, since all reflection angles perfectly match the corresponding target angle, for each channel the QI_1^ch(i) is 1, where QI_1^ch(i) denotes the Quality Index of the first aspect for each of the three channels i. The final QI_1 can be calculated (grouped) as follows:

The resulting QI_1 thus is 1. The Quality Index for the angle symmetry of both front channels (Front Left and Front Right) is 1 as well. According to the gain curve of figure 12 the localization accuracy index is 0.99. Combining these results with the corresponding RGs leads to a performance of the first reflection-based sound reproduction mode 410a of P = 0.99 for this first example.

For a second example with a "bad" performance of the first reflection-based sound reproduction mode 410a the parameters are listed in the table shown in figure 13b. While in comparison with the first example the parameters for the channel "Front Left" did not change, the conditions for the "Front Right" channel have become more critical. The azimuth angle is not only further away from the target, it also results in an asymmetry with respect to the left channel. Furthermore, the DelayDiff is significantly higher and the RDSR lower. This has a strong impact on the localization accuracy (QI_3 = 0.05). Combining these results with the corresponding RGs leads to a performance of the first reflection-based sound reproduction mode 410a of P = 0.37 for this second example.

For a third example with an "acceptable" performance of the first reflection-based sound reproduction mode 410a the parameters are listed in the table shown in figure 13c. The angles of both channels, i.e. "Front Left" and "Front Right", deviate from their target angles by +/-15°. This results in an asymmetry of 30°, which is strongly penalized (QI_2 = 0.33). This third example illustrates that although both angle conformity (QI_1 = 0.83) and localization accuracy (QI_3 = 0.93) have acceptable values, the angle asymmetry greatly reduces the overall performance value P (P = 0.69). It can also be seen that even though the symmetry between the two front channels is not optimal, the overall performance is still acceptable. Thus, based on these examples, the processing circuitry of the sound reproduction system 110 may use a threshold value of 0.6.
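The arithmetic of the first and third examples can be retraced with the short sketch below, using the quality index values quoted above, RG values of 1/3 each and the threshold of 0.6. The variable names are hypothetical. The second example is not recomputed because its angle conformity and angle symmetry values are not quoted in the text. Note that the sketch yields approximately 0.997 and 0.697, which agree with the quoted values of 0.99 and 0.69 when truncated to two decimals.

```python
RG = (1 / 3, 1 / 3, 1 / 3)   # equal relevance gains for the three aspects
THRESHOLD = 0.6

# (angle conformity, angle symmetry, localization accuracy) per example,
# as quoted in the description.
EXAMPLES = {
    "first": (1.0, 1.0, 0.99),
    "third": (0.83, 0.33, 0.93),
}


def performance(qi_values, rg_weights=RG):
    """Weighted sum of the quality index values."""
    return sum(qi * rg for qi, rg in zip(qi_values, rg_weights))


results = {name: performance(qis) for name, qis in EXAMPLES.items()}
decisions = {name: p >= THRESHOLD for name, p in results.items()}

for name, p in results.items():
    print(f"{name} example: P = {p:.3f}, reflection-based mode usable: {decisions[name]}")
```

In both recomputed cases P exceeds the threshold, whereas the quoted value of P = 0.37 for the second example falls below it and would lead to the direct-sound based mode.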

Figure 14 shows a flow diagram illustrating a sound reproduction method 1400 according to an embodiment. In an embodiment, the sound reproduction method 1400 may be performed by the sound reproduction system 110 and its different embodiments described above.

The method 1400 comprises a step 1401 of generating with a plurality of loudspeakers a sound field in a sound reproduction zone. Moreover, the method 1400 comprises a step 1403 of operating the plurality of loudspeakers based on an input audio content in a plurality of different sound reproduction modes, comprising the first reflection-based sound reproduction mode 410a or the second direct-sound based sound reproduction mode 410b. The method 1400 further comprises a step 1405 of obtaining information about the acoustic environment. Moreover, the method 1400 comprises a step 1407 of determining, based on the information about the acoustic environment, a performance measure of the first reflection-based sound reproduction mode 410a. The method 1400 further comprises a step 1409 of, based on the performance measure, selecting or recommending the first reflection-based sound reproduction mode 410a or the second direct-sound based sound reproduction mode 410b for operating the plurality of loudspeakers.

The person skilled in the art will understand that the "blocks" ("units") of the various figures (method and apparatus) represent or describe functionalities of embodiments (rather than necessarily individual "units" in hardware or software) and thus describe equally functions or features of apparatus embodiments as well as method embodiments (unit = step). For the several embodiments disclosed herein, it should be understood that the disclosed system, apparatus, and method may be implemented in other manners. For example, the described embodiment of an apparatus is merely exemplary. For example, the unit division is merely a logical function division and may be another division in an actual implementation. For example, a plurality of units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the displayed or discussed mutual couplings or direct couplings or communication connections may be implemented by using some interfaces. The indirect couplings or communication connections between the apparatuses or units may be implemented in electronic, mechanical, or other forms.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.

In addition, functional units of the embodiments disclosed herein may be integrated into one processing unit, or each of the units may exist alone physically, or two or more units are integrated into one unit.