Title:
A HOME AUTOMATION SYSTEM
Document Type and Number:
WIPO Patent Application WO/2020/212644
Kind Code:
A1
Abstract:
A method for providing information for home automation and/or service provisioning in a space comprising at least a private subspace and a non-private subspace, the method comprising monitoring the non-private subspace by at least one camera sensor; recognizing at least one object moving in the non-private subspace; determining, based on sensing an activity of the at least one object, an event occurred in the non-private subspace; and providing information, based on the determined event, for adjusting a home automation system or for service provisioning at least in said space.

Inventors:
EDELMAN HARRY (FI)
HÄSTBACKA DAVID (FI)
PERTILÄ PASI (FI)
PARVIAINEN MIKKO (FI)
Application Number:
PCT/FI2019/050316
Publication Date:
October 22, 2020
Filing Date:
April 18, 2019
Assignee:
TAMPERE UNIV FOUNDATION SR (FI)
International Classes:
H04L12/28; F24F11/30; G05B13/02; G05B15/02; G05B19/048; G05B23/02; G06N3/04; G06N3/08; G06N20/00; G06Q50/16; G06V10/00
Domestic Patent References:
WO2018217665A1 (2018-11-29)
Foreign References:
US20160248847A1 (2016-08-25)
US20180323996A1 (2018-11-08)
US20170123391A1 (2017-05-04)
US20140084165A1 (2014-03-27)
US20180342329A1 (2018-11-29)
US20180357247A1 (2018-12-13)
US20030227439A1 (2003-12-11)
US20070288414A1 (2007-12-13)
US20040133533A1 (2004-07-08)
Attorney, Agent or Firm:
BERGGREN OY (FI)
Claims:

1. A method for providing information for home automation and/or service provisioning in a space comprising at least a private subspace and a non-private subspace, the method comprising

monitoring the non-private subspace by at least one camera sensor;

recognizing at least one object moving in the non-private subspace;

determining, based on sensing an activity of the at least one object, an event occurred in the non-private subspace; and

providing information, based on the determined event, for adjusting a home automation system or for service provisioning at least in said space.

2. The method according to claim 1, further comprising

monitoring the non-private subspace by a plurality of further sensors.

3. The method according to claim 1 or 2, further comprising

inputting data obtained from said one or more sensors into a first machine learning application for carrying out recognition of said object.

4. The method according to claim 3, further comprising

inputting data obtained from the first machine learning application and/or from said plurality of further sensors into a second machine learning application for determining the event occurred in the non-private subspace, wherein the determining further comprises

determining at least a first location of the recognized at least one object at a first time instance;

determining at least a second location of the recognized at least one object at a second time instance; and

determining said event occurred in the non-private subspace based on at least differences of the first and second locations of the recognized at least one object.

5. The method according to claim 4, further comprising

inputting event data obtained from the second machine learning application into a third machine learning application for determining a pattern comprising a sequence of events occurred in the non-private subspace, wherein the determining the pattern further comprises

combining the current event data with data of at least one previous event determined to have occurred in said non-private subspace;

comparing the combination of the current event data and the previous event data to training data of the third machine learning application, wherein the training data comprises a plurality of combinations of events, each combination determined as a pattern; and

in response to detecting a combination of events similar to the combination of said current event data and said previous event data, determining the current event to belong to a pattern.

6. The method according to claim 4 or 5, further comprising

using output data of the second and/or the third machine learning application for determining the information to be provided for adjusting the home automation system or for service provisioning at least in said space.

7. The method according to any preceding claim, further comprising

obtaining data from an external system for carrying out recognition of said object or for adjusting the home automation system.

8. A system comprising at least one camera sensor arranged to be positioned towards a non-private subspace of a space, a processor, a memory, computer program code residing in the memory, which when executed by the processor, causes the system to

monitor the non-private subspace by the at least one camera sensor;

recognize at least one object moving in the non-private subspace;

determine, based on sensing an activity of the at least one object, an event occurred in the non-private subspace; and

provide information, based on the determined event, for adjusting a home automation system or for service provisioning at least in said space.

9. The system according to claim 8, further comprising a plurality of further sensors for monitoring the non-private subspace.

10. The system according to claim 8 or 9, further comprising a first machine learning application configured to use data obtained from said one or more sensors as input for carrying out recognition of said object.

11. The system according to claim 10, further comprising

a second machine learning application configured to use data obtained from the first machine learning application and/or from said plurality of further sensors as input for determining the event occurred in the non-private subspace, wherein the second machine learning application is configured to determine at least a first location of the recognized at least one object at a first time instance;

determine at least a second location of the recognized at least one object at a second time instance; and

determine said event occurred in the non-private subspace based on at least differences of the first and second locations of the recognized at least one object.

12. The system according to claim 11, further comprising a third machine learning application configured to use event data obtained from the second machine learning application as input for determining a pattern comprising a sequence of events occurred in the non-private subspace, wherein the third machine learning application is configured to

combine the current event data with data of at least one previous event determined to have occurred in said non-private subspace;

compare the combination of the current event data and the previous event data to training data of the third machine learning application, wherein the training data comprises a plurality of combinations of events, each combination determined as a pattern; and

in response to detecting a combination of events similar to the combination of said current event data and said previous event data, determine the current event to belong to a pattern.

13. The system according to claim 11 or 12, wherein

output data of the second and/or the third machine learning application are configured to be used for determining the information to be provided for adjusting the home automation system or for service provisioning at least in said space.

14. The system according to any of claims 8 - 13, further comprising

a source of light for illuminating at least one non-private subspace.

15. The system according to any of claims 8 - 14, further comprising

means for adjusting camera angle of the at least one camera sensor.

16. The system according to any of claims 8 - 15, further comprising

means for obtaining data from an external system for carrying out recognition of said object or for adjusting the home automation system.

17. The system according to any of claims 8 - 16, wherein the functionalities of the system are integrated into one apparatus.

18. An apparatus comprising at least one camera sensor arranged to be positioned towards a non-private subspace of a space, a processor, a memory, computer program code residing in the memory, which when executed by the processor, causes the apparatus to

monitor the non-private subspace by the at least one camera sensor;

recognize at least one object moving in the non-private subspace;

determine, based on sensing an activity of the at least one object, an event occurred in the non-private subspace; and

provide information, based on the determined event, for adjusting a home automation system or for service provisioning at least in said space.

19. The apparatus according to claim 18, further comprising the system according to any of claims 9 - 17.

Description:
A HOME AUTOMATION SYSTEM

Field of the invention

The invention relates to home automation systems, especially to gathering sensor data for adjusting a home automation system and for enabling service provisioning.

Background of the invention

Current home automation and smart home solutions involve various services, including, for example, the fields of energy management and savings, security such as burglary and fire safety solutions, health and/or well-being such as security for independent living for the elderly, commercial solutions such as in-home delivery and/or e-commerce, space sharing and facility management services, among others.

The use of the current solutions requires one or more user interfaces for controlling the home automation or smart home, such as a mobile app, a wall control panel, a push button, or a computer interface. Many users, such as children or the elderly, fail to operate or forget to use these interfaces, or people are reluctant to use interfaces that require active human intervention for receiving the services.

The shortcomings in usability could be alleviated by providing information about the use and occupancy of various architectural spaces using sensor technology, such as audio and video sensors, and controlling an automation application based on said information. However, people are typically unwilling to reveal their personally identifiable information, such as sensor data describing their movements in private spaces, to any third-party applications. This may hinder or at least slow down the development of such applications. Therefore, there is a need for a home automation system where sensor data could be gathered without endangering the privacy of inhabitants.

Brief summary of the invention

Now, an improved arrangement has been developed to reduce the above-mentioned problems. As aspects of the invention, we present a method, a system and an apparatus, which are characterized in what will be presented in the independent claims.

The dependent claims disclose advantageous embodiments of the invention.

According to an aspect of the invention, there is provided a method for providing information for home automation and/or service provisioning in a space comprising at least a private subspace and a non-private subspace, the method comprising monitoring the non-private subspace by at least one camera sensor; recognizing at least one object moving in the non-private subspace; determining, based on sensing an activity of the at least one object, an event occurred in the non-private subspace; and providing information, based on the determined event, for adjusting a home automation system or for service provisioning at least in said space.

According to an embodiment, the method further comprises monitoring the non-private subspace by a plurality of further sensors.

According to an embodiment, the method further comprises inputting data obtained from said one or more sensors into a first machine learning application for carrying out recognition of said object.

According to an embodiment, the method further comprises inputting data obtained from the first machine learning application and/or from said plurality of further sensors into a second machine learning application for determining the event occurred in the non-private subspace, wherein the determining further comprises determining at least a first location of the recognized at least one object at a first time instance; determining at least a second location of the recognized at least one object at a second time instance; and determining said event occurred in the non-private subspace based on at least differences of the first and second locations of the recognized at least one object.

According to an embodiment, the method further comprises inputting event data obtained from the second machine learning application into a third machine learning application for determining a pattern comprising a sequence of events occurred in the non-private subspace, wherein the determining the pattern further comprises combining the current event data with data of at least one previous event determined to have occurred in said non-private subspace; comparing the combination of the current event data and the previous event data to training data of the third machine learning application, wherein the training data comprises a plurality of combinations of events, each combination determined as a pattern; and in response to detecting a combination of events similar to the combination of said current event data and said previous event data, determining the current event to belong to a pattern.

According to an embodiment, the method further comprises using output data of the second and/or the third machine learning application for determining the information to be provided for adjusting the home automation system or for service provisioning at least in said space.

According to an embodiment, the method further comprises obtaining data from an external system for carrying out recognition of said object or for adjusting the home automation system.

A second aspect relates to a system comprising at least one camera sensor arranged to be positioned towards a non-private subspace of a space, a processor, a memory, and computer program code residing in the memory, which when executed by the processor, causes the system to monitor the non-private subspace by the at least one camera sensor; recognize at least one object moving in the non-private subspace; determine, based on sensing an activity of the at least one object, an event occurred in the non-private subspace; and provide information, based on the determined event, for adjusting a home automation system or for service provisioning at least in said space.

According to an embodiment, the system is arranged to carry out any of the embodiments of the method.

According to an embodiment, the system further comprises a source of light for illuminating at least one non-private subspace.

According to an embodiment, the system further comprises means for adjusting the camera angle of the at least one camera sensor.

According to an embodiment, the system further comprises means for obtaining data from an external system for carrying out recognition of said object or for adjusting the home automation system.

A third aspect relates to an apparatus comprising at least one camera sensor arranged to be positioned towards a non-private subspace of a space, a processor, a memory, and computer program code residing in the memory, which when executed by the processor, causes the apparatus to monitor the non-private subspace by the at least one camera sensor; recognize at least one object moving in the non-private subspace; determine, based on sensing an activity of the at least one object, an event occurred in the non-private subspace; and provide information, based on the determined event, for adjusting a home automation system or for service provisioning at least in said space.

According to an embodiment, the apparatus comprises the system according to any of the embodiments.

These and other aspects, embodiments and advantages will be presented later in the detailed description of the invention.

Brief description of the drawings

The invention will now be described in more detail in connection with preferred embodiments with reference to the appended drawings, in which:

Fig. 1 shows a simplified example illustrating a categorization of an architectural space into private and non-private subspaces;

Fig. 2 shows a flow chart of a method for providing sensor information for home automation and/or service provisioning according to an embodiment;

Figs. 3a, 3b show some examples of positioning at least one camera sensor in the space according to some embodiments;

Fig. 4 shows a simplified example of a neural network for object recognition;

Fig. 5 shows an example of determining an event according to an embodiment;

Fig. 6 shows an example of the method for providing control information for adjusting the home automation system or for enabling service provisioning according to an embodiment;

Figs. 7a, 7b show examples of machine learning applications according to some embodiments; and

Fig. 8 shows a block chart of a system for providing sensor information for home automation and/or service provisioning according to an embodiment.

Detailed description of the embodiments

Conventionally, architectural spaces can be divided into private spaces, semi-private (a.k.a. semi-public) spaces, and public spaces. Private spaces may be considered as spaces which one or more persons regard psychologically as their own and in which another person’s access is permitted only by their consent. Private spaces typically include a possibility to limit or hinder visibility into the space from outside. Such spaces may include e.g. living rooms, bedrooms and bathrooms of an apartment or a house. Public spaces, in turn, may be considered as spaces which are generally open and accessible to anyone, such as street areas, parks and staircases of blocks of flats.

Semi-private or semi-public spaces are transitional spaces between private and public spaces in terms of privacy, and they include e.g. entry halls, foyers, private yards and other entrance areas. In many cases, there is a view from a semi-private or semi-public space to both a private space and a public space. This categorization can be used in support of placing sensors for the home automation system gathering non-intrusive data. The semi-private/semi-public spaces or their proximities are better suited for gathering sensor data than private spaces regarding privacy and the acceptability of data collection in a non-intrusive fashion. Figure 1 shows a simplified example illustrating this categorization. Figure 1 shows a floor plan view of a space 100 comprising an apartment 102 and a subspace 104 outside the apartment, which may be e.g. a corridor, a hall, yard etc., i.e. a subspace that may be categorized as a semi-private/semi-public or a public space. The apartment 102 may be divided into an entrance area 102a and private space 102b. From the private space 102b, there may be connections to further private spaces.

The entrance areas of spaces are crucial information sources regarding the use of the space, for example, at homes. Entry halls, foyers and other entrance areas are also transitional spaces in terms of privacy. Such spaces have connections to both private and semi-private and/or public spaces, such as staircases of blocks of flats or street areas. For example, when a door opens, a view opens up between the different spaces, providing a visual contact from a public space to a private space and vice versa. Accordingly, the entrance area 102a may be categorized as a semi-private/semi-public space. Thus, the space 100 comprises a private subspace 102b and a non-private subspace consisting of the entrance area 102a and the subspace 104 outside the apartment.

Herein below, a novel method and system are presented, in which the categorization of architectural spaces into private and non-private subspaces is utilized in providing sensor information for home automation and/or service provisioning in a non-intrusive manner.

Another aspect of the sensor data on architectural spaces involves creating a digital understanding about the use of space. Beyond resolving the issues with user interfaces, such a pool of real-time data enables the provision of services, such as indoor climate control based on demand and occupancy, and trustworthy in-home deliveries to apartments when nobody is at home.

The method, which is disclosed in Figure 2, can be operated in a system comprising at least one camera sensor arranged to be positioned towards a non-private subspace of a space, a processor, a memory, and computer program code residing in the memory, which when executed by the processor is arranged to carry out the method. In the method, information is provided for home automation and/or service provisioning in a space comprising at least a private subspace and a non-private subspace. The method comprises monitoring (200) the non-private subspace by at least one camera sensor; recognizing (202) at least one object moving in the non-private subspace; determining (204), based on sensing an activity of the at least one object, an event occurred in the non-private subspace; and providing (206) information, based on the determined event, for adjusting a home automation system or for service provisioning at least in said space.
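As a rough illustration only, the four steps 200 - 206 could be organized as a simple processing function. The sketch below is a hypothetical Python outline; the component names (camera, recognizer, event_detector, home_automation) and their methods are placeholders assumed for the example, not part of the application:

# Hypothetical outline of steps 200-206; all component interfaces are assumed.
def provide_information(camera, recognizer, event_detector, home_automation):
    frame = camera.capture()                   # 200: monitor the non-private subspace
    objects = recognizer.recognize(frame)      # 202: recognize at least one moving object
    event = event_detector.detect(objects)     # 204: determine an event from sensed activity
    if event is not None:
        home_automation.adjust(event)          # 206: provide information for adjustment
    return event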

Thus, the presence of users in the space and the ways the spaces are used are essential information for the adjustment of the home automation system or for the delivery of services. Herein, for privacy and acceptability reasons, the monitoring is focused only on the non-private subspace. For example, a video camera targeting only the entrance door inside an apartment is considered less intrusive than a video camera in a bedroom.

The presence of users in the space, including also the private subspace, is determined based on object recognition in the non-private subspace, and based on sensing an activity of the object in the non-private subspace, an event is determined to take place. Herein, an event refers to a spatial transition of a recognized object over time, either within the non-private subspace or from the non-private subspace to the private subspace. An event provides information regarding the occupancy and use of the space, including the non-private subspace and the private subspace. Examples of determined events may be a member of the family entering or exiting via a door, a courier delivering services, or an unknown person intruding into the space. Based on the determined event, information is provided for adjusting the home automation system or for service provisioning. For example, an event relating to a member of the family entering or exiting the space may provide information for adjusting the heating and/or lighting of the space. Various embodiments relating to recognizing objects in the non-private subspace and determining events based on the sensed activity of the object are disclosed further below.
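For instance, the mapping from determined events to adjustments could be as simple as a lookup table. The following sketch is illustrative only; the event names and actions are invented examples consistent with the description, not a defined interface:

# Invented example mapping: determined events to home automation adjustments.
EVENT_ACTIONS = {
    "family_member_entry": [("heating", "comfort"), ("lighting", "on")],
    "family_member_exit":  [("heating", "eco"), ("lighting", "off")],
    "courier_delivery":    [("door_camera", "record")],
    "unknown_intrusion":   [("alarm", "trigger")],
}

def actions_for(event):
    return EVENT_ACTIONS.get(event, [])        # no adjustment for unknown events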

It is noted that the information for adjusting the home automation system or for service provisioning, obtained based on the detected event, is not necessarily limited to being utilized only in the space where the event was detected, but the information may be delivered to further spaces and/or to stakeholder(s) acting upon said space. For example, in a block of flats at least some of the actuators may be (de)activated in neighboring apartments based on an event detected in one apartment, such as a fire alarm, and/or the fire alarm is sent to the fire department.

Figures 3a and 3b show some examples of positioning the at least one camera sensor in the space shown in Figure 1. Figure 3a shows a floor plan view of the space 100 illustrating three possible locations A, B, C for a camera sensor. Figure 3b shows a vertical view of the space 100 as a cross-sectional A-A’ view illustrating the same three possible locations A, B, C for the camera sensor in the vertical direction. The camera sensor may be installed to the ceiling of the non-private subspace (location A), to the wall of the non-private subspace (location B), or as a free-standing solution in the non-private subspace (location C). In each case, the capturing angle of the camera sensor is directed towards the non-private subspace, preferably towards the entrance between the entrance area 102a and the subspace 104 outside the apartment, to protect the privacy of the private areas of the apartment. It is also possible to install a (further) camera sensor in the subspace 104 outside the apartment (location D), wherein the capturing angle of the camera sensor is preferably directed towards said entrance, preferably such that the capturing does not extend to the private subspace of the apartment.

According to an embodiment, the method further comprises monitoring the non-private subspace by a plurality of further sensors. Thus, the system may comprise, in addition to the at least one camera sensor, one or more further sensors, such as audio, video, CO2 (Carbon Dioxide), CO (Carbon Monoxide), VOC (Volatile Organic Compound), infrared, smoke, gas, temperature, humidity, and/or air pressure sensors. The data obtained from the plurality of sensors may also be referred to as multimodal (sensor) data.

The multimodal sensor data provides a more comprehensive understanding about the occupancy and use of the space and/or conditions requiring adjustment of the home automation system. For example, the audio sensor may comprise one or more microphone arrays, each including e.g. 5 or more microphones enabling rather accurate determination of a direction of sound e.g. through beamforming. Thus, the audio sensor may provide further information e.g. for determining that a family member is within the space even if not detected by the camera sensor for a certain time. On the other hand, any of the CO2, CO, VOC, infrared, smoke, gas, temperature, humidity, and/or air pressure sensors may be used, instead of or in addition to the camera sensor, for sensing an activity of an object, and they may provide information for adjusting the home automation system or e.g. providing an alert.

For example, the home automation system may be configured to monitor the audio context of the space for providing a safety service for elderly person(s) living in the space. If a person is located in the private subspace, i.e. not visible to the camera sensor, the sensing of an activity may be carried out solely using the audio sensor. Based on the volume, frequency, direction and/or movement of a sound, the sensed activity may be categorized e.g. as “normal” or “abnormal”, wherein an abnormal activity may cause an alarm to be sent to a service provider.
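A minimal sketch of such a categorization, assuming simple hand-picked features and thresholds (all invented for illustration; the application does not specify them), could look like this:

# Illustrative only: categorize audio activity as "normal" or "abnormal"
# from volume, frequency and movement features. Thresholds are invented.
def categorize_audio_activity(volume_db, dominant_freq_hz, movement_speed_mps):
    if volume_db > 85 or movement_speed_mps > 3.0:      # e.g. a crash or a fall
        return "abnormal"
    if dominant_freq_hz > 2000 and volume_db > 70:      # e.g. a scream or alarm tone
        return "abnormal"
    return "normal"

def monitor_audio(service_provider, features):
    if categorize_audio_activity(*features) == "abnormal":
        service_provider.send_alarm(features)           # alarm to the service provider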

According to an embodiment, the method further comprises inputting data obtained from said one or more sensors into a first machine learning application for carrying out recognition of said object. The developments in computer vision, machine learning and artificial intelligence allow for recognizing and detecting objects in images or videos with great accuracy. For several years, it has been known to use computer vision applications where the image objects are recognized from pre-trained (hand-engineered) image features (e.g., SIFT features). Recent developments in deep learning techniques, especially neural networks, have enabled the development of image recognition techniques that learn to recognize image objects directly from the raw data. During the training stage, deep learning techniques build hierarchical layers which extract image features of an increasingly abstract level.

According to an embodiment, the first machine learning application is implemented as a pre-trained first neural network for carrying out recognition of said object. It is noted that while the fast recent development of neural networks has made them advantageous platforms for artificial intelligence, the embodiments are not limited to neural networks solely. Other types of machine learning applications, such as Bayesian networks or support vector machines, may be used as well. Artificial neural networks, or simply neural networks, are parametric computation graphs comprising nodes (or “neurons”) and connections between the nodes. The nodes may be arranged in successive layers, and in some neural network architectures only nodes in adjacent layers are connected. Each connection can transmit information, a.k.a. a signal, from one node to another. From one node, there may be connections to a plurality of other nodes.

Each connection has an associated parameter or weight, which defines the strength of the connection. A node that receives a signal can process it by multiplying the incoming signal by the weight, and then supply the weighted signal to further nodes connected to it. The weight increases or decreases the strength of the signal at a connection. Nodes may have a threshold such that the signal is only sent if the aggregate signal reaches the threshold. In common neural network implementations, the signal at a connection between nodes is a real number, and the output of each node is computed by some non-linear function of the sum of its inputs.

Typically, nodes are aggregated into layers. Different layers may perform different kinds of transformations on their inputs. Signals travel from the first layer (the input layer) to the last layer (the output layer), possibly after traversing the layers multiple times. The input layer receives the input data, for example, in the case of object recognition, image data from a camera sensor, and the output layer is task-specific and outputs an estimate of the desired data, for example a vector whose values represent a class distribution in the case of image classification.
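As a concrete, if simplified, example of the node computation described above: the output of a single node is a non-linear function of the weighted sum of its inputs. The weights and the choice of tanh below are arbitrary illustrations:

import numpy as np

# A single node: non-linear function of the weighted sum of incoming signals.
def node_output(inputs, weights, bias=0.0):
    return np.tanh(np.dot(weights, inputs) + bias)   # tanh is one common non-linearity

signals = np.array([0.2, -0.5, 1.0])                 # incoming signals (illustrative)
weights = np.array([0.8, 0.1, -0.4])                 # connection weights (illustrative)
print(node_output(signals, weights))                 # weighted, squashed output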

Training takes advantage of the paramount property of neural networks and other machine learning applications that they are able to learn properties from input data. The training may be implemented as a training algorithm, or as a meta-level neural network providing the training signal.

Training a neural network may be regarded as an optimization process, where the goal is to make the neural network learn the properties of the data distribution. Hence, the goal is to train the neural network to generalize to previously unseen data, i.e., data which was not used for training the neural network. The quality of the neural network is evaluated by comparing the neural network’s output to ground-truth output data. The comparison may include a cost function, which is run on both the neural network’s output and the ground-truth data, wherein the comparison provides a loss value.

The neural network may then be trained based on at least the loss value. The nodes and connections typically have weights that are adjusted as a part of the training process. The weights of the connections represent the biggest part of the learnable parameters of a neural network.
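A single training step of the kind outlined above could be sketched as follows; this hand-derived gradient update for one tanh node with a squared-error cost is a toy illustration under those assumptions, not the training algorithm of the application:

import numpy as np

# Toy training step: compare output to ground truth, compute a loss,
# and adjust the weights against the gradient of the loss.
def train_step(weights, inputs, target, lr=0.01):
    output = np.tanh(np.dot(weights, inputs))
    loss = (output - target) ** 2                              # squared-error cost function
    grad = 2 * (output - target) * (1 - output ** 2) * inputs  # chain rule
    return weights - lr * grad, loss                           # updated weights, loss value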

Figure 4 shows a simplified example of a neural network for object recognition. Convolutional Neural Networks (CNNs) have recently been used in various image processing applications, such as object recognition, since CNNs are easier to train than other neural networks and have fewer learnable parameters. As shown in Figure 4, a CNN is composed of one or more convolutional layers with fully connected layers on top. In the case of object recognition, the input to a CNN is image data obtained e.g. from the camera sensor. The purpose of each layer of a CNN is to provide a higher level of abstraction of the input data than the previous layer. For this purpose, each layer extracts multiple feature maps from the input data at its own abstraction (or semantic) level. Thus, each layer aims to find features of a certain semantic level from the input image data received from the previous layer.

It is noted that the CNN in Figure 4 has only three abstraction layers C1, C2, C3 for the sake of simplicity, but the number of abstraction layers in CNNs is not limited and there may be tens of layers. The first convolution layer C1 of the CNN consists of extracting 4 feature maps from the first layer (i.e. from the input image). These maps may represent low-level features found in the input image, such as edges and corners. The second convolution layer C2 of the CNN, consisting of extracting 6 feature maps from the previous layer, increases the semantic level of the extracted features. Similarly, the third convolution layer C3 may represent more abstract concepts found in images, such as combinations of edges and corners, shapes, etc. The last layer of the CNN (a fully connected MLP) does not extract feature maps. Instead, it usually consists of using the feature maps from the last feature layer in order to predict (recognize) the object class. For example, it may predict that the object in the image is a family member. A minimal code sketch of such a network is given after the next paragraph.

According to an embodiment, the method further comprises obtaining data from an external system for carrying out recognition of said object or for adjusting the home automation system. Thus, the system may benefit from external sensor data from any external sensors, such as the video feed of wearable cameras. For example, a courier may have a wearable camera, which provides recognition data for the system even before the courier enters the space. On the other hand, the courier may have an identification tag providing further identification data for the system, or he/she may be prompted to enter a security code to an external system before entering the space. Likewise, any form of big data, such as weather information, can be adopted to contribute to the services and the adjustment of the home automation system.
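As referenced above, the network of Figure 4 could be sketched in PyTorch with the stated layer counts (4 and 6 feature maps in C1 and C2, a third convolutional layer C3, and a fully connected MLP head). The kernel sizes, the 8 maps of C3, the 64x64 RGB input, and the four example classes are assumptions made only for the illustration:

import torch.nn as nn

# Sketch of the Figure 4 CNN; kernel size, C3 width, input resolution and
# class list are assumptions, not given in the text.
figure4_cnn = nn.Sequential(
    nn.Conv2d(3, 4, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # C1: 4 maps
    nn.Conv2d(4, 6, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # C2: 6 maps
    nn.Conv2d(6, 8, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # C3: 8 maps
    nn.Flatten(),
    nn.Linear(8 * 8 * 8, 32), nn.ReLU(),   # fully connected MLP on top (64x64 input)
    nn.Linear(32, 4),                      # e.g. family member / courier / pet / unknown
)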

According to an embodiment, the method further comprises inputting data obtained from the first machine learning application and/or from said plurality of further sensors into a second machine learning application for determining the event occurred in the non-private subspace, wherein the determining further comprises determining at least a first location of the recognized at least one object at a first time instance; determining at least a second location of the recognized at least one object at a second time instance; and determining said event occurred in the non-private subspace based on at least differences of the first and second locations of the recognized at least one object.

The starting point for detecting an event is having recognized the presence of an object at a plurality of time instances as input data for machine learning. Based on the detected sequence of the object in a range of time, an event matching a designated task of the event recognition may be determined, such as "a man entering from a corridor to the entrance room of an apartment", wherein "man" is an object and "entry" is an event, while the rest of the content defines at least a spatial context. In other words, the entry event includes an object of detection for the machine learning task. It is noted that the object may refer to any biotic or abiotic element that still constitutes an entry event, for example, "a rolling ball pushes the door open and rolls in". An event may provide real-time information for the home automation and/or provision of services, such as "a courier has entered the apartment".

Additionally, the events may have cross-objects. A cross-object may be any biotic or abiotic element related to an event, such as "a man entered with a dog", comprising an object, an event and a cross-object. Cross-events in turn may occur when two or more events occur simultaneously at a range of time, for example, "a man entered with a dog and a courier came out".

Figure 5 shows an example of determining an event by collecting multimodal sensor data in a spatial context similar to Figure 3b, where an apparatus implementing the system (A, B, C) may be attached to a surface such as a wall, ceiling, or floor, including a free-standing movable lighting fixture. If the monitored space (102b) is spatially connected to multiple exterior doors or other entrances, each door or entrance may be monitored with an apparatus (A, B, C) that may collect and exchange data about the occupancy of the space. The defined locations of the apparatus (A, B, C) enable predicting the content and quality of the data for machine learning purposes.

The spatial transition (500) of any object (502) and a possible cross-object (504) is defined as the starting point of an event at a point of time t = 0 (506) for collecting the sensor data, with the point of time t = x (508) as the end of the collection, the collected sensor data comprising an event E₀..ₓ. The endpoint 508 of the event E₀..ₓ may define a point for additional set(s) of data collections as later events related to the previous event. In spatial terms of detecting objects and related events, the spatial transition takes place in a range of time instances t = [0, x] located in spatial contexts, such as moving from a public space to a private space, defining an entry event. The entry events, and accordingly the reversed exit events, may be used for determining the occupancy of a space. In addition to that, an in-depth qualitative and quantitative analysis of the events may be used for further understanding of the nature of the event, if desirable for the automation or the provision of the services enabled by the system.
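In code, the core of this event logic reduces to comparing the locations of a recognized object at the start and end of the time range t = [0, x]. The zone names and the track format below are hypothetical:

# Sketch of determining an entry/exit event from a recognized object's track.
# A track is a list of (timestamp, zone) observations; zone names are invented.
def determine_event(track):
    (t0, zone0) = track[0]                   # first location, time t = 0
    (tx, zonex) = track[-1]                  # second location, time t = x
    if zone0 == "public" and zonex in ("entrance", "private"):
        return "entry"                       # spatial transition into the space
    if zone0 in ("entrance", "private") and zonex == "public":
        return "exit"
    return None                              # no event determined

print(determine_event([(0.0, "public"), (2.5, "entrance")]))   # -> entry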

According to an embodiment, the method further comprises inputting event data obtained from the second machine learning application into a third machine learning application for determining a pattern comprising a sequence of events occurred in the non-private subspace, wherein the determining the pattern further comprises combining the current event data with data of at least one previous event determined to have occurred in said non-private subspace; comparing the combination of the current event data and the previous event data to training data of the third machine learning application, wherein the training data comprises a plurality of combinations of events, each combination determined as a pattern; and in response to detecting a combination of events similar to the combination of said current event data and said previous event data, determining the current event to belong to a pattern.

Accordingly, the third machine learning application carries out behavioral analysis of various sequences of events and tries to identify patterns, i.e. sequences of two or more events, that may have significance upon adjusting the home automation system or for enabling service provisioning in said space. Thus, the combinations of events may form patterns that convey information, such as whether the apartment is occupied or not, for the home automation and for the provision of services, for example, "a man enters an apartment each day at 22:30 and leaves at 9:30". Similarly, any events and/or cross-events may form patterns. Cross-patterns may take place when two or more events occur at known ranges of time of the patterns. Objects, cross-objects, events, cross-events, patterns, and cross-patterns may be deployed for classifications, such as identification of families in accordance with specific apartments.
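A heavily simplified stand-in for this pattern matching, with the trained combinations reduced to a lookup table of invented event sequences, could read:

# Toy stand-in for the third application: trained event combinations as patterns.
TRAINED_PATTERNS = {
    ("entry", "exit"): "daily_routine",              # invented example patterns
    ("entry", "entry", "exit"): "delivery_visit",
}

def detect_pattern(previous_events, current_event):
    combined = tuple(previous_events) + (current_event,)
    return TRAINED_PATTERNS.get(combined)            # None if no similar combination

print(detect_pattern(["entry"], "exit"))             # -> daily_routine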

According to an embodiment, the method further comprises using output data of the second and/or the third machine learning application for determining the information to be provided for adjusting the home automation system or for service provisioning at least in said space. Hence, detected events and/or sequences of events, i.e. patterns, are used as control information for adjusting one or more parameters of the home automation system or for enabling one or more services to be provided in the space.

The second and/or the third machine learning applications may be implemented as neural networks. Thus, the neural networks are trained to detect events, for example using a training algorithm, or a meta-level neural network providing the training signal. The second and/or the third neural networks may be provided with initial training data e.g. regarding the parameters of the space, the number of the persons living and constantly visiting the space and the sensor data required for carrying out their object recognition. The second and/or the third neural networks may be trained with said initial training data to detect at least some basic events, such as a member of a family entering or exiting the space, and to provide information about the occupancy of the space as control information to the home automation system.

Upon detecting real-world events occurring in the non-private subspace through the multimodal sensor data of the system, the second and/or the third neural networks aim to classify the detected event to match with a learned event. However, if no learned event sufficiently corresponds to the multimodal sensor data obtained from the occurrence, the multimodal sensor data may be stored for further training of the neural network. Hence, if a similar occurrence with substantially similar multimodal sensor data repeatedly takes place, the training algorithm or the meta-level neural network providing the training signal may train the neural network, e.g. by adjusting the weights of the neural network, to learn to identify the occurrence with said substantially similar multimodal sensor data as a new event, and thereafter classify similar occurrences as the new event.
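The classify-or-store behavior described above might be sketched like this; the classifier interface and the confidence threshold are assumptions for the example:

# Sketch: classify against learned events, or buffer unmatched data for
# further training. The classifier's predict() interface is assumed.
unmatched_buffer = []

def classify_or_buffer(sensor_data, classifier, threshold=0.8):
    label, confidence = classifier.predict(sensor_data)
    if confidence >= threshold:
        return label                         # occurrence matches a learned event
    unmatched_buffer.append(sensor_data)     # store for retraining; a repeatedly
    return None                              # seen occurrence may become a new event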

It is evident that the learning process is even more important for the third machine learning application or neural network aiming to detect the sequences of events, i.e. the patterns. It can be concluded that the number of various events taking place in the non-private subspace and being pertinent for adjusting the home automation system or for enabling service provisioning in said space is still reasonably limited. Thus, the second machine learning application or neural network aiming to detect single events needs to learn to detect a limited number of events, such as a couple of dozen. However, a much larger, even an unlimited, number of various permutations of a sequence of two or more detected events may take place in the non-private subspace. Consequently, if more clever control information for adjusting the home automation system or for enabling service provisioning in said space based on the detected patterns is desired, the importance of the training of the third machine learning application or neural network is emphasized.

Figure 6 shows an example of the method for providing control information for adjusting the home automation system or for enabling service provisioning on a general level. As a starting point, multimodal data 600 obtained from the plurality of sensors is provided 602 for the machine learning 604. The form of data used for machine learning, such as audio or video, depends on the machine learning task in question. For example, for detecting entry and exit events, video data may be provided for the machine learning. However, for detecting the events and/or patterns and the actions to be taken based on the events and/or patterns, additional or alternative machine learning tasks may exist using also other complementary or alternative sources of data, such as audio.

In an exemplifying manner, Figure 6 shows the machine learning 604 as involving three different learning tasks, such as object recognition 606, event detection 608 and pattern detection 610. These tasks may be implemented e.g. as three different neural networks connected in series, as illustrated in Figure 6. Alternatively, there may be one or more machine learning applications configured to carry out the whole sequence of machine learning tasks. It is noted that it may not be necessary to carry out all learning tasks 606, 608, 610 on each data processing round, but the output 612 of the machine learning 604 may include an output of the object recognition 606 or the event detection 608 or both. It is further noted that the number of machine learning tasks and machine learning applications, such as neural networks, is not limited to the examples shown in the figures, but they may vary depending on the desired machine learning tasks and their preferred implementation.

The information based on the multimodal sensor data 600 and the machine learning 604 is used for control purposes based on the trained logic of the machine learning applications. The output 612 of the machine learning task, which may comprise one or more of a recognized object, a detected event or a detected pattern, is used as an input for the automation tasks 614. The automation tasks 614 may further involve configuring the output of the machine learning applications for the desired objectives of automation. Such configuration may occur, for example, if the machine learning 604 based on the initial data 600 fails to provide an adequate outcome for automation. In such a case, additional training of one or more of the machine learning tasks is needed. Nevertheless, as a result of the configuration carried out by the automation tasks 614, the control of the automation tasks is achieved without complicated user interfaces and without the necessity of active involvement of the user managing the automation.

The output 616 of the automation tasks 614 may involve one or more messages or commands sent to actuators or implementors 618 of the services in question. Similarly, the actuators or implementors 618 of services may send one or more messages or commands back to a control unit of the automation 614. Such communication may comprise a direct exchange of control data, such as a temperature reading from an external sensor or a voice command through a smart speaker. However, the data on service implementation, such as a video feed of behavior, a certain act or a gesture, may also be directed to the machine learning. For gathering such data on service implementation, the sensors 600 of the system may be used, or any other external sensor, such as wearable sensors. From the actions 618 carried out by the actuators or implementors according to the control data 616, feedback data 620 may be supplied via the sensor data 600 to the machine learning 604 for enhancing the automation and service delivery.

Figures 7a and 7b show an example of a hierarchy of the machine learning applications according to an embodiment. They show nested machine learning applications, but a skilled person appreciates that the machine learning applications could also be arranged e.g. in series, as shown in Figure 6.

According to an embodiment, the machine learning task can be divided into main task(s) (Figure 7a) and related advanced machine learning task(s) (Figure 7b). Both types of tasks contain a hierarchy, but they are also hierarchical between themselves. An advanced machine learning task may contribute to the main machine learning task with more detailed information for further enhancing the automation process.

In Figure 7a, the innermost machine learning application comprises an object predictor 700 (a.k.a. an object recognition application), which uses the sensor data obtained from one or more of the multimodal sensors as an input 702. The task of the object predictor is to identify at least one object from the sensor data at a time instance t = tᵢ in accordance with the task. The output 704 of the object predictor is a recognition of any biotic or abiotic object. The output 704 of the object predictor may be used as such, for example for providing identification data of an object to the automation tasks 614 in Figure 6.

However, for further utilizing the recognition of the object, the recognition data is supplied to an input 708 of an event predictor 706 (a.k.a. an event detection application). In addition to the recognition data of the object, the input 708 of the event predictor comprises a sensor data sequence over a range of time from tᵢ to tₙ. The output 710 of the event predictor, derived from the sensor data sequence between times tᵢ and tₙ, is a detected event Eᵢ. The output 710 of the event predictor may be used as such, for example for providing a detected event to the automation tasks 614 in Figure 6, which may further provide one or more control signals for adjusting the home automation system based on the detected event.

Again, for further utilizing the detection of the event, the event data is supplied to an input 714 of a pattern predictor 712 (a.k.a. a pattern, or sequence of events, detection application). In addition to the event data Eᵢ, the input 714 of the pattern predictor comprises at least one further event Eⱼ and possibly sensor data from a time range covering both events Eᵢ and Eⱼ. It is noted that the event Eⱼ may take place before (t = tₐ … tₙ₋₁) or after (t = tₙ₊₁ … tₓ) the event Eᵢ, and there may be one or more still further events, which may form a pattern to be detected. The output 716 of the pattern detector is a pattern related to the sequence of detected events Eᵢ, Eⱼ, … A pattern comprises a sequence of events related to the detected objects during a period of time.

Figure 7b illustrates the advanced machine learning task(s), which may provide more detailed information, especially relating to recognized cross-objects and/or detected cross-events or cross-patterns, for further enhancing the automation process. Herein, cross-objects may relate to a plurality of objects recognized from the sensor data at the same time instance t = tᵢ. Similarly, cross-events may relate to a plurality of events detected at least partly simultaneously during the same range of time from tᵢ to tₙ. Thus, cross-events do not take place sequentially, but at least partly concurrently. Further, cross-patterns may relate to a plurality of patterns detected at least partly simultaneously during the same ranges of time t = tₐ … tₙ₋₁, t = tᵢ … tₙ, and t = tₙ₊₁ … tₓ. Hence, cross-patterns do not take place sequentially, but at least partly concurrently.

The hierarchy of recognizing cross-objects and detecting cross-events and cross-patterns may be described in Figure 7b similarly to the recognition of objects and the detection of events and patterns in Figure 7a. Herein, the output 704B of the object predictor 700B is a cross-object COᵢ, the output 710B of the event predictor 706B is a cross-event CEᵢ, and the output 716B of the pattern predictor 712B is a cross-pattern comprising cross-events CEᵢ, CEⱼ, … Thus, a number of cross-patterns may occur simultaneously related to the main machine learning task. Accordingly, the number and complexity of the detection tasks are not limited, and the exemplified hierarchy as disclosed in Figures 7a and 7b provides a linkage between the machine learning tasks.

The system, or at least a subset of the features of the system, is preferably integrated into one apparatus. Figure 8 shows an apparatus 800 according to an embodiment, wherein the apparatus may comprise a plurality of sensors 802 for collecting and delivering multimodal sensor data; one or more processing units (CPU) 804 for carrying out at least parts of the machine learning and automation tasks; a connectivity unit 806, such as but not limited to WLAN, 4G, 5G, Bluetooth, ZigBee, EnOcean, Wirepas, or Ethernet; a positioning unit 808, such as WLAN, Bluetooth or another indoor positioning system; and a memory 810 for storing at least a part of the computer code of the machine learning and automation tasks.
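Purely as a hypothetical data model, the components of the apparatus 800 could be captured in configuration code along these lines (field names mirror the reference numerals; the defaults are invented):

from dataclasses import dataclass, field

# Hypothetical configuration mirroring Figure 8; values are invented defaults.
@dataclass
class Apparatus800:
    sensors: list = field(default_factory=lambda: ["camera", "audio", "CO2"])  # 802
    connectivity: str = "WLAN"          # 806: e.g. WLAN, 4G, 5G, Bluetooth, ZigBee
    positioning: str = "Bluetooth"      # 808: indoor positioning unit
    memory_mb: int = 512                # 810: holds ML and automation program code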

The apparatus may comprise a light 812, such as a LED light in any form or shape, and power supply means 814, which may be connected, for example, to a power outlet 816 that is commonly needed for any kind of lighting fixture, for example, in the ceiling of the non-private subspace, such as an entrance room of an apartment. Thus, the apparatus according to the embodiment may replace any form of other lighting fixture and integrate a range of sensors of a monitored space, for example, in the ceiling or other surface of the space.

The sensors 802 may be present in the apparatus in various embodiments, and they may comprise e.g. at least one camera sensor and one or more further sensors, such as audio, video, CO2, CO, VOC, infrared, smoke, gas, temperature, humidity, and/or air pressure sensors. For example, the capabilities of the camera sensor may cover infrared and photogrammetry for the aims of precise measurements on monitored objects and cross-objects and their distances to the surrounding space. For enabling a context for the measurements, the apparatus 800 may act as the origin (0,0,0) of the Euclidean coordinate system or any other system of coordinates. Alternatively or in addition, the embodiments may use signal-, WLAN-, Bluetooth-, 4G-, 5G-, image-, video-, optics- or proximity-based positioning detection, or any form of indoor navigation or other means for establishing the positioning capacity of the system.

According to an embodiment, the system comprises means for adjusting the camera angle of the at least one camera sensor. As described above, the camera angle of the one or more camera sensors is preferably directed towards the non-private subspace. Thus, the camera angle may be adjusted e.g. manually or via a user interface (UI) of the system. The UI may be provided in the apparatus or it may be a remote UI, e.g. an application operated over a connection to the connectivity unit.

The system may also utilize data from external sources 818, such as sensor data from sensors located outside the space or data provided by various services, such as weather forecasts.

The information on the spatial position can be used, for example, in support of the safety and well-being services supported by the system through the feedback from actuators and service providers, such as providing an automated message for the fire squad about the occupancy of an apartment or a group of apartments, for example in the case of high-rise buildings. From the audio point of view, the microphones of the apparatus may be located on the surface so that the known behavior of sound waves impinging on the surface of a rigid body can be used to determine the direction of the sound wave.

The automated services enabled by the system include but are not limited to indoor air control, heating, cooling, safety and well-being, logistics, in-home deliveries, or any other services. The system is applicable in any form of housing and can be adopted to other spatial contexts as well.

According to an embodiment, at least a part of the machine learning and home automation tasks are arranged to be carried out remotely from the space. Thus, the system may be arranged to function autonomously through a network 820, such as a cloud-connected, gateway-mediated or mesh-connected network, without a necessity to use a user interface for the maintenance and control of the system. Thus, at least a part of the machine learning and home automation tasks related to the automated services 822 may be implemented e.g. on a remote server, wherein the system provides the control of the actuators and the services based on the collected multimodal sensor data. On the other hand, feedback from the actuators and the service providers based on the detected use patterns and other behavioral data may be supplied via the network 820 to the remote machine learning and home automation tasks to train them further.

The system may benefit from and/or contribute to the skills of any third-party device, such as a smart speaker, for changing preset modes, such as user profiles, of the system. For example, indoor air control, heating, and/or cooling may include a variety of preset modes for controlling the automated services, such as "eco", "comfort", "health", "basic", or "wellness", that can be changed at will through a computer interface, an app, or any third-party device.

In general, the various embodiments may be implemented in hardware or special purpose circuits or any combination thereof. While various embodiments may be illustrated and described as block diagrams or using some other pictorial representation, it is well understood that these blocks, apparatus, systems, techniques or methods described herein may be implemented in, as non-limiting examples, hardware, software, firmware, special purpose circuits or logic, general purpose hardware or controller or other computing devices, or some combination thereof.

A skilled person appreciates that any of the embodiments described above may be implemented as a combination with one or more of the other embodiments, unless there is explicitly or implicitly stated that certain embodiments are only alternatives to each other.

The various embodiments can be implemented with the help of computer program code that resides in a memory and causes the relevant apparatuses to carry out the invention. Thus, the implementation may include a computer-readable storage medium with code stored thereon for use by an apparatus, which code, when executed by a processor, causes the apparatus to perform the various embodiments or a subset of them. In addition, or alternatively, the implementation may include a computer program embodied on a non-transitory computer-readable medium, the computer program comprising instructions causing, when executed on at least one processor, at least one apparatus to perform the various embodiments or a subset of them. For example, an apparatus may comprise circuitry and electronics for handling, receiving and transmitting data, computer program code in a memory, and a processor that, when running the computer program code, causes the apparatus to carry out the features of an embodiment.

It will be obvious for a person skilled in the art that with technological developments, the basic idea of the invention can be implemented in a variety of ways. Thus, the invention and its embodiments are not limited to the above-described examples but may vary within the scope of the claims.