Title:
METHOD AND SYSTEM FOR DETECTING AND QUANTIFYING COMPUTER MALWARE ACTIVITY IN NETWORKS
Document Type and Number:
WIPO Patent Application WO/2024/119246
Kind Code:
A1
Abstract:
A method for estimating a risk of computer malware infections in a computer network comprising a plurality of nodes. Data features are extracted from a network traffic dataset of the computer network, where the extracted data features comprise computer network data features and malware data features of different types of computer malware infections. Risk values of at least infectivity of the computer malware are derived from the malware data features. A stochastic model is generated comprising probability distributions of each node of the computer network transitioning between compartments. A compartmental model is generated including the compartments, the stochastic model, and an indication of one or more of the derived risk values that are associated with transitioning at least one node between the compartments, where the compartmental model accounts for recursive relationships between the different types of computer malware. The risk of computer malware infections is estimated using the compartmental model.

Inventors:
LYNAR TIMOTHY MICHAEL (AU)
MODINI JESSEMYN MARJORIE (AU)
Application Number:
PCT/AU2023/051280
Publication Date:
June 13, 2024
Filing Date:
December 08, 2023
Assignee:
NEWSOUTH INNOVATIONS PTY LTD (AU)
International Classes:
G06F21/57; G06F18/2415; G06F21/55; G06F21/56; H04L9/40; H04W12/128
Attorney, Agent or Firm:
FB RICE PTY LTD (AU)
Claims:
CLAIMS:

1. A method for estimating a risk of computer malware infections in a computer network that comprises a plurality of nodes, the method comprising: extracting data features from a network traffic dataset of the computer network, wherein the extracted data features comprise computer network data features and malware data features of different types of computer malware infections; deriving risk values of at least infectivity of the different types of computer malware from the malware data features; generating a stochastic model comprising Bayesian probability distributions, a time-based reproduction number, and a node-level dispersion number between each node of the computer network transitioning between a plurality of compartments; generating a spatio-temporal compartmental model including: the compartments; the stochastic model; and an indication of one or more of the derived risk values that are associated with transitioning at least one node of the computer network between at least some of the compartments, wherein the compartmental model accounts for one or more recursive relationships between the different types of computer malware; and estimating the risk of computer malware infections in the computer network using the compartmental model.

2. The method of claim 1, wherein the one or more recursive relationships between the different types of computer malware are determined by processing the malware data features.

3. The method of claim 1 or 2, wherein Bayesian model averaging is used, on a plurality of heterogeneous compartment models with a plurality of compartments and nodes, to correct for a data shift between a training data set and an execution data set.

4. The method of claim 2 or 3, wherein the processing of the malware data features to determine the one or more recursive relationships includes identifying: a first type of computer malware infection in at least one node of the computer network; and a second type of computer malware infection in the at least one node that occurs sequentially following the first type of computer malware infection.

5. The method of any of claims 1 to 4, wherein the risk values associated with transitioning at least one node of the computer network between at least some of the compartments are determined at least in part from the one or more recursive relationships between the different types of computer malware.

6. The method of any of claims 1 to 5, wherein the probability distributions of the stochastic model are determined at least in part from the one or more recursive relationships between the different types of computer malware.

7. The method of any of claims 1 to 6, wherein the one or more of the determined risk values are processed to indicate a relative propensity of the malware type to cause an infection of the node within the computer network, wherein the relative propensity to cause an infection is correlated with an intentional behaviour of the malware type.

8. The method of any of claims 1 to 7, wherein generating the compartmental model further includes processing the data features to calculate one or more dispersion parameters.

9. The method of claim 8, wherein the one or more dispersion parameters are specific to each type of computer malware, and wherein generating the compartmental model further includes processing the dispersion parameter of each type of computer malware and the one or more recursive relationships between the different types of computer malware to determine an indication of the overall transmissibility of malware in the computer network.

10. The method of any of claims 6 to 9, wherein the intentional behaviour of the malware type comprises one or more of: spreading one or more types of computer malware among the computer network; introducing further computer malware to the computer network according to the one or more recursive relationships; establishing a botnet using the computer network; exfiltrating host information from the nodes of the computer network; controlling the nodes of the computer network; and disabling one or more functionalities of the nodes of the computer network.

11. The method of any of claims 1 to 10, wherein for each respective type of the computer malware, the set of compartments comprise: a susceptible (S) compartment indicating a number of the nodes that are susceptible to the respective type of the computer malware; an unsusceptible (U) compartment indicating a number of the nodes that are unsusceptible to the respective type of the computer malware; an exposed (E) compartment indicating a number of the nodes that are exposed to the respective type of the computer malware; an infected (I) compartment indicating a number of the nodes that are infected by the respective type of the computer malware; a recovered (R) compartment indicating a number of the nodes that are recovered from the respective type of the computer malware; a carrier (C) compartment indicating a number of the nodes that carry the respective type of computer malware; and a non-recoverable (N) compartment indicating a number of the nodes that are unrecoverable from the respective type of the computer malware.

12. The method of claim 11, wherein for each respective type of the computer malware, the infected (I) compartment comprises: an incubation compartment indicating a number of incubating nodes that are infected but not attacked by the respective type of the computer malware; and a symptomatic compartment indicating a number of symptomatic nodes that are both infected and attacked by the respective type of the computer malware, wherein, in response to a given incubation node remaining in the incubation compartment for an incubation period that is greater than or equal to an incubation threshold, the given incubation node transitions from the incubation compartment to the symptomatic compartment.

13. The method of claim 12, wherein the incubation threshold is determined based on at least one of: the incubation node; and the malware type(s) infecting, but not attacking, the incubation node.

14. The method of any of claims 10 to 13, wherein the probability distributions of the stochastic model comprise: an exposed-to-infected probability distribution of the nodes in the exposed (E) compartment transitioning to the infected (I) compartment; an exposed-to-carrier probability distribution of the nodes in the exposed (E) compartment transitioning to the carrier (C) compartment; an exposed-to-recovered probability distribution of the nodes in the exposed (E) compartment transitioning to the recovered (R) compartment; an infected-to-recovered probability distribution of the nodes in the incubation compartment transitioning to the recovered (R) compartment; a recovered-to-unsusceptible probability distribution of the nodes in the recovered (R) compartment transitioning to the unsusceptible (U) compartment; and a recovered-to-carrier probability distribution of the nodes in the recovered (R) compartment transitioning to the carrier (C) compartment.

15. The method of any of claims 10 to 14, wherein the risk values comprise: a susceptible-to-exposed risk value being a rate of the nodes in the susceptible (S) compartment transitioning to the exposed (E) compartment; an unsusceptible-to-exposed risk value being a rate of the nodes in the unsusceptible (U) compartment transitioning to the exposed (E) compartment; a symptomatic-to-non-recoverable risk value being a rate of the symptomatic nodes transitioning to the non-recoverable (N) compartment; a symptomatic-to-recovered risk value being a rate of the nodes in the symptomatic compartment transitioning to the recovered (R) compartment; and a carrier-to-recovered risk value being a rate of the nodes in the carrier (C) compartment transitioning to the recovered (R) compartment.

16. The method of any of claims 1 to 15, wherein the computer network data features comprise one or more of: an indication of a number of nodes of the computer network, wherein the number of nodes is dynamically changeable over time; an indication of a number of infected nodes at an initial time; historical data of the computer network; or real-time data of the computer network.

17. The method of claim 16, wherein the compartments include a disconnected compartment indicating a number of nodes that are disconnected from the computer network at a current time (disconnected nodes), wherein any node not in the disconnected compartment has a disconnection probability of transitioning to the disconnected compartment at the current time.

18. The method of any of claims 1 to 17, wherein the malware data features comprise: a target operating system of one or more of the different types of computer malware; time-related values indicating one or more time intervals during which one or more of the different types of malware affect a respective node; a historical and statistical data set for the different types of the computer malware, the historical and statistical data set comprising one or more of: expected values of the incubation period caused by the different types of computer malware; expected values of a symptomatic period caused by the different types of computer malware; a set of infection rates of the different types of computer malware; a set of hospitalisation rates of the different types of computer malware; and a set of fatality rates of the different types of computer malware.

19. The method of claim 18, wherein the time-related values include time-related values for each node and for each type of malware, and comprise: a hospitalisation period from a first time at which the node demonstrates one or more indicators of compromise (IOCs) to a second time at which the node is infected by the malware; the incubation period from the second time to a third time at which the malware commences attacking; a symptomatic period from the third time and during which the malware is undertaking attacking; an infected-to-non-recovered period from the second time to a fourth time at which the node is transitioned to the non-recoverable (N) compartment; and an infected-to-recovery period from the second time to a fifth time at which the node is transitioned to the recovered (R) compartment.

20. The method of any of claims 1 to 19, further comprising, in response to the estimated risk of computer malware infections being greater than or equal to one or more predefined risk thresholds, controlling the computer network, wherein controlling the computer network comprises one or more of: generating one or more intimations; resetting one or more credentials of the computer network; disconnecting one or more nodes in the computer network; running anti-malware software against the malware; and disabling the computer network.

21. The method of any of claims 11 to 20, wherein for each respective type of the computer malware, the carrier (C) compartment comprises: a carrier natural host (Cnh) compartment indicating a number of carrier natural host (Cnh) nodes that are not infected by the type of computer malware but can spread the type of computer malware to other nodes in the computer network; a carrier natural reservoir (Cnr) compartment indicating a number of carrier natural reservoir (Cnr) nodes that do not actively spread but allow the persistence of the type of computer malware in the computer network; a carrier vector (Cv) compartment indicating a number of carrier vector (Cv) nodes that are developed, by the type of computer malware, into an intermediate means for spreading the type of computer malware to other nodes in the computer network; a carrier dead end host (Cdeh) compartment indicating a number of carrier dead end host (Cdeh) nodes that are infected with the type of computer malware but are unable to spread the type of computer malware to other nodes due to the isolation or limited connectivity of the nodes; and a carrier amplifier (Ca) compartment indicating a number of carrier amplifier (Ca) nodes that are particularly effective at spreading the type of computer malware to other nodes in the computer network.

22. A computer program comprising machine-readable instructions that, when executed by a computer, cause the computer to perform the method of any of claims 1 to 21.

23. A computer system for estimating and controlling risk of one or more computer malware infections in a computer network that comprises a plurality of nodes, the computer system comprising: a processor configured to: extract data features from a network traffic dataset of the computer network, wherein the extracted data features comprise computer network data features and malware data features of different types of computer malware infections; derive risk values of at least infectivity of the different types of computer malware from the malware data features; generate a stochastic model comprising Bayesian probability distributions, a time-based reproduction number, and a node-level dispersion number between each node of the computer network transitioning between a plurality of compartments; generate a spatio-temporal compartmental model including: the compartments; the stochastic model; and an indication of one or more of the derived risk values that are associated with transitioning at least one node of the computer network between at least some of the compartments, wherein the compartmental model accounts for one or more recursive relationships between the different types of computer malware; and estimate the risk of computer malware infections in the computer network using the compartmental model.

Description:
"Method and system for detecting and quantifying computer malware activity in networks"

Technical Field

[0001] This disclosure relates to analysing, estimating and controlling malware infections in a computer network that comprises multiple nodes by using epidemiological models of malware spread.

Background

[0002] Cyber security is one of the most significant challenges in contemporary society due to the prevalence of computer systems and devices, and the ever-increasing reliance on data and computing capability. Specifically, the increasing degree of interconnectivity between computing devices (e.g., over wide area networks, local networks, and the Internet-of-everything) has provided increased opportunity for conducting malicious activities. The targets, tools, techniques, tactics and procedures of malicious cyber actors are advancing with the development of computing and network technology. As a result, cyber security risk factors are dynamic and increasingly complex, and encompass hardware, software, configuration, user and environment vulnerabilities.

[0003] Malware is a threat to providing cyber security in many computing domains and applications. Malware refers to any software executable by a computer device that is intentionally designed to cause disruption to the computer device, or another device associated with the computer device (e.g., a server, client, or other device on a common network). For example, malware may leak private information, gain unauthorized access to data or systems, deprive access to data, or otherwise interfere with the security and privacy of a user. The threat posed by malware is pervasive and dynamic, increasing with actor capability and intent.

[0004] Malware-related phenomena within computer networks have been described according to the epidemiology of human populations. For example, a “computer virus” describes a program that can “infect” other programs by modifying the program code to include a possibly evolved copy of itself. According to its infection characteristics, a virus can spread throughout a computer network using the authorization credentials of the network users. Like a virus attacking human cells, every program that becomes infected may also act as a virus, thereby propagating the infection in a computer device and its associated network.

[0005] Just as securing public health requires the assessment of, and preparation for, biological disaster events, such as an epidemic or pandemic, providing effective cyber security requires the detection of, response to, and recovery from malicious activity caused by malware.

[0006] Epidemiological principles have been applied to characterize the behaviour and attributes of malware in a computer network. Analogously to disease propagation in humans, malware spreads through a computer network based on a variety of factors, including the software and hardware configurations of the computers (also referred to as “nodes”) of the network, the communication behaviour of the nodes (e.g., the degree of connectedness of each node to one or more other nodes), and the use cases of the nodes within the network.

[0007] It is desirable to develop systems, devices, and methods that utilize epidemiological models to analyse, estimate and control malware infections in a computer network.

[0008] Throughout this specification the word "comprise", or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps.

[0009] Any discussion of documents, acts, materials, devices, articles or the like which has been included in the present specification is not to be taken as an admission that any or all of these matters form part of the prior art base or were common general knowledge in the field relevant to the present disclosure as it existed before the priority date of each of the appended claims.

Summary

[0010] There is provided a method for estimating a risk of computer malware infections in a computer network that comprises a plurality of nodes, the method comprising: extracting data features from a network traffic dataset of the computer network, wherein the extracted data features comprise computer network data features and malware data features of different types of computer malware infections; deriving risk values of at least infectivity of the different types of computer malware from the malware data features; generating a stochastic model comprising Bayesian probability distributions, a time-based reproduction number, and a node-level dispersion number between each node of the computer network transitioning between a plurality of compartments; generating a spatio-temporal compartmental model including: the compartments; the stochastic model; and an indication of one or more of the derived risk values that are associated with transitioning at least one node of the computer network between at least some of the compartments, wherein the compartmental model accounts for one or more recursive relationships between the different types of computer malware; and estimating the risk of computer malware infections in the computer network using the compartmental model.

[0011] In some embodiments, the one or more recursive relationships between the different types of computer malware are determined by processing the malware data features.

[0012] In some embodiments, Bayesian model averaging is used, on a plurality of heterogeneous compartment models with a plurality of compartments and nodes, to correct for a data shift between a training data set and an execution data set.

[0013] In some embodiments, the processing of the malware data features to determine the one or more recursive relationships includes identifying: a first type of computer malware infection in at least one node of the computer network; and a second type of computer malware infection in the at least one node that occurs sequentially following the first type of computer malware infection.
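As an illustration only, the sequential identification described in paragraph [0013] might be sketched in Python as follows. The event layout `(node_id, timestamp, malware_type)`, the function name `sequential_pairs`, and the malware type labels are assumptions introduced for this sketch; the specification does not prescribe a data format or a particular counting procedure.

```python
from collections import Counter, defaultdict


def sequential_pairs(events):
    """Count ordered (first_type, second_type) pairs seen on the same node.

    `events` is an iterable of (node_id, timestamp, malware_type) tuples.
    This record layout is hypothetical and used for illustration only.
    """
    by_node = defaultdict(list)
    for node_id, timestamp, malware_type in events:
        by_node[node_id].append((timestamp, malware_type))

    pairs = Counter()
    for infections in by_node.values():
        infections.sort()  # chronological order for this node
        # Compare each infection with the one that immediately follows it.
        for (_, first), (_, second) in zip(infections, infections[1:]):
            if first != second:
                pairs[(first, second)] += 1
    return pairs


# Hypothetical observations: a dropper followed by ransomware on two nodes.
events = [
    ("n1", 10, "dropper"), ("n1", 25, "ransomware"),
    ("n2", 12, "dropper"), ("n2", 40, "ransomware"),
    ("n3", 5, "worm"),
]
pairs = sequential_pairs(events)
```

A pair that recurs across many nodes is a candidate for a relationship in which the first malware type tends to introduce the second; how such counts feed the compartmental model is left open here.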

[0014] In some embodiments, the risk values associated with transitioning at least one node of the computer network between at least some of the compartments are determined at least in part from the one or more recursive relationships between the different types of computer malware.

[0015] In some embodiments, the probability distributions of the stochastic model are determined at least in part from the one or more recursive relationships between the different types of computer malware.

[0016] In some embodiments, the one or more of the determined risk values are processed to indicate a relative propensity of the malware type to cause an infection of the node within the computer network, wherein the relative propensity to cause an infection is correlated with an intentional behaviour of the malware type.

[0017] In some embodiments, generating the compartmental model further includes processing the data features to calculate one or more dispersion parameters.

[0018] In some embodiments, the one or more dispersion parameters are specific to each type of computer malware, and generating the compartmental model further includes processing the dispersion parameter of each type of computer malware and the one or more recursive relationships between the different types of computer malware to determine an indication of the overall transmissibility of malware in the computer network.
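One possible way to obtain a per-malware-type dispersion parameter is sketched below in Python. It assumes, purely for illustration, that the number of secondary infections caused by each infected node follows a negative binomial distribution and estimates the dispersion parameter k by the method of moments, k = m² / (v − m), where m is the mean and v the variance. The specification does not state which estimator is used, and the function name and sample counts are hypothetical.

```python
import math
from statistics import mean, pvariance


def dispersion_k(secondary_counts):
    """Method-of-moments estimate of the negative binomial dispersion k.

    `secondary_counts` lists how many further nodes each infected node
    went on to infect. A small k indicates that a few nodes cause most
    of the spread; math.inf is returned when no overdispersion is seen.
    """
    m = mean(secondary_counts)
    v = pvariance(secondary_counts)  # population variance
    if v <= m:
        return math.inf
    return m * m / (v - m)


# Hypothetical secondary-infection counts per malware type.
counts_by_type = {
    "worm": [0, 0, 0, 0, 8],    # one node causes all onward infections
    "trojan": [1, 2, 1, 2, 2],  # spread is shared evenly
}
k_by_type = {mtype: dispersion_k(c) for mtype, c in counts_by_type.items()}
```

Both hypothetical types have the same mean number of secondary infections, yet only the worm shows a low k, which is the kind of distinction a dispersion parameter is intended to capture.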

[0019] In some embodiments, the intentional behaviour of the malware type comprises one or more of: spreading one or more types of computer malware among the computer network; introducing further computer malware to the computer network according to the one or more recursive relationships; establishing a botnet using the computer network; exfiltrating host information from the nodes of the computer network; controlling the nodes of the computer network; and disabling one or more functionalities of the nodes of the computer network.

[0020] In some embodiments, for each respective type of the computer malware, the set of compartments comprise: a susceptible (S) compartment indicating a number of the nodes that are susceptible to the respective type of the computer malware; an unsusceptible (U) compartment indicating a number of the nodes that are unsusceptible to the respective type of the computer malware; an exposed (E) compartment indicating a number of the nodes that are exposed to the respective type of the computer malware; an infected (I) compartment indicating a number of the nodes that are infected by the respective type of the computer malware; a recovered (R) compartment indicating a number of the nodes that are recovered from the respective type of the computer malware; a carrier (C) compartment indicating a number of the nodes that carry the respective type of computer malware; and a non-recoverable (N) compartment indicating a number of the nodes that are unrecoverable from the respective type of the computer malware.
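A minimal Python sketch of how the seven compartments of paragraph [0020] could be represented as per-malware-type node counts is given below. The class name `Compartment`, the helper `empty_state`, and the choice to start every node in the susceptible compartment are assumptions for illustration and are not taken from the specification.

```python
from enum import Enum


class Compartment(Enum):
    S = "susceptible"
    U = "unsusceptible"
    E = "exposed"
    I = "infected"
    C = "carrier"
    R = "recovered"
    N = "non-recoverable"


def empty_state(malware_types, total_nodes):
    """Build one set of compartment counts per malware type.

    Placing all nodes in S initially is an illustrative assumption.
    """
    state = {}
    for mtype in malware_types:
        counts = {compartment: 0 for compartment in Compartment}
        counts[Compartment.S] = total_nodes
        state[mtype] = counts
    return state


state = empty_state(["worm", "trojan"], total_nodes=50)
```

Keeping a separate set of counts for each malware type mirrors the wording "for each respective type of the computer malware", so that a node may be susceptible to one type while recovered from another.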

[0021] In some embodiments, for each respective type of the computer malware, the infected (I) compartment comprises: an incubation compartment indicating a number of incubating nodes that are infected but not attacked by the respective type of the computer malware; and a symptomatic compartment indicating a number of symptomatic nodes that are both infected and attacked by the respective type of the computer malware, wherein, in response to a given incubation node remaining in the incubation compartment for an incubation period that is greater than or equal to an incubation threshold, the given incubation node transitions from the incubation compartment to the symptomatic compartment.

[0022] In some embodiments, the incubation threshold is determined based on at least one of: the incubation node; and the malware type(s) infecting, but not attacking, the incubation node.
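The threshold-driven transition from the incubation compartment to the symptomatic compartment can be sketched in Python as follows. The `InfectedNode` record, the use of hours as the time unit, the default threshold, and the per-type threshold table are all hypothetical; the specification only requires that the transition occurs when the incubation period is greater than or equal to the incubation threshold.

```python
from dataclasses import dataclass


@dataclass
class InfectedNode:
    node_id: str
    malware_type: str
    incubation_elapsed: float  # hours spent in the incubation compartment
    symptomatic: bool = False


def incubation_threshold(node, thresholds_by_type, default=24.0):
    """Look up the threshold for the malware type infecting this node.

    A per-type table with a default value is an illustrative assumption.
    """
    return thresholds_by_type.get(node.malware_type, default)


def advance(nodes, thresholds_by_type):
    """Move nodes whose incubation period meets or exceeds the threshold."""
    for node in nodes:
        limit = incubation_threshold(node, thresholds_by_type)
        if not node.symptomatic and node.incubation_elapsed >= limit:
            node.symptomatic = True
    return nodes


nodes = advance(
    [
        InfectedNode("n1", "worm", incubation_elapsed=6.0),
        InfectedNode("n2", "worm", incubation_elapsed=2.0),
        InfectedNode("n3", "trojan", incubation_elapsed=30.0),
    ],
    thresholds_by_type={"worm": 6.0, "trojan": 48.0},
)
```

The comparison uses `>=` so that a node whose incubation period exactly equals the threshold transitions, matching the "greater than or equal to" wording.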

[0023] In some embodiments, the probability distributions of the stochastic model comprise: an exposed-to-infected probability distribution of the nodes in the exposed (E) compartment transitioning to the infected (I) compartment; an exposed-to-carrier probability distribution of the nodes in the exposed (E) compartment transitioning to the carrier (C) compartment; an exposed-to-recovered probability distribution of the nodes in the exposed (E) compartment transitioning to the recovered (R) compartment; an infected-to-recovered probability distribution of the nodes in the incubation compartment transitioning to the recovered (R) compartment; a recovered-to-unsusceptible probability distribution of the nodes in the recovered (R) compartment transitioning to the unsusceptible (U) compartment; and a recovered-to-carrier probability distribution of the nodes in the recovered (R) compartment transitioning to the carrier (C) compartment.
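By way of a hedged example, the three transitions out of the exposed (E) compartment could be given Bayesian probabilities as sketched below in Python. The sketch assumes a symmetric Dirichlet prior over the destinations I, C and R and reports the posterior mean after observing transition counts. The Dirichlet choice, the function name, and the counts are assumptions; the specification refers to Bayesian probability distributions without naming a family.

```python
def dirichlet_posterior_mean(observed_counts, prior=1.0):
    """Posterior mean transition probabilities under a symmetric Dirichlet prior.

    `observed_counts` maps a destination compartment to the number of nodes
    observed moving there; `prior` is the pseudo-count per destination.
    """
    total = sum(observed_counts.values()) + prior * len(observed_counts)
    return {dest: (n + prior) / total for dest, n in observed_counts.items()}


# Hypothetical observations of where exposed nodes moved next.
exposed_exits = {"I": 6, "C": 1, "R": 0}
probs = dirichlet_posterior_mean(exposed_exits)
```

With these counts the prior keeps the exposed-to-recovered probability above zero even though no such transition was observed, which is one reason a Bayesian treatment can be preferable to raw frequencies on sparse traffic data.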

[0024] In some embodiments, the risk values comprise: a susceptible-to-exposed risk value being a rate of the nodes in the susceptible (S) compartment transitioning to the exposed (E) compartment; an unsusceptible-to-exposed risk value being a rate of the nodes in the unsusceptible (U) compartment transitioning to the exposed (E) compartment; a symptomatic-to-non-recoverable risk value being a rate of the symptomatic nodes transitioning to the non-recoverable (N) compartment; a symptomatic-to-recovered risk value being a rate of the nodes in the symptomatic compartment transitioning to the recovered (R) compartment; and a carrier-to-recovered risk value being a rate of the nodes in the carrier (C) compartment transitioning to the recovered (R) compartment.
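The five risk values can be read as per-unit-time rates that move nodes between compartments. The Python sketch below applies such rates over a single discrete time step to expected node counts. The rate values, the compartment key `I_sym` for the symptomatic compartment, and the simple linear update are illustrative assumptions rather than the update rule of the specification.

```python
from collections import defaultdict


def apply_rates(counts, rates, dt=1.0):
    """Advance expected compartment counts by one time step of length dt.

    `rates` maps (source, destination) pairs to per-unit-time rates.
    """
    # Guard against a step so large that a compartment would go negative.
    outflow = defaultdict(float)
    for (src, _dst), rate in rates.items():
        outflow[src] += rate * dt
    for src, fraction in outflow.items():
        if fraction > 1.0:
            raise ValueError(f"step too large: {src} would lose more nodes than it holds")

    updated = dict(counts)
    for (src, dst), rate in rates.items():
        moved = rate * dt * counts[src]  # based on counts at the start of the step
        updated[src] -= moved
        updated[dst] += moved
    return updated


# Hypothetical counts and risk values for one malware type.
counts = {"S": 100, "U": 20, "E": 0, "I_sym": 10, "C": 5, "R": 0, "N": 0}
rates = {
    ("S", "E"): 0.10,      # susceptible-to-exposed
    ("U", "E"): 0.05,      # unsusceptible-to-exposed
    ("I_sym", "N"): 0.02,  # symptomatic-to-non-recoverable
    ("I_sym", "R"): 0.20,  # symptomatic-to-recovered
    ("C", "R"): 0.10,      # carrier-to-recovered
}
after = apply_rates(counts, rates)
```

The update conserves the total number of nodes, since every node removed from a source compartment is added to a destination compartment.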

[0025] In some embodiments, the computer network data features comprise one or more of: an indication of a number of nodes of the computer network, wherein the number of nodes is dynamically changeable over time; an indication of a number of infected nodes at an initial time; historical data of the computer network; or real-time data of the computer network.

[0026] In some embodiments, the compartments include a disconnected compartment indicating a number of nodes that are disconnected from the computer network at a current time (disconnected nodes), wherein any node not in the disconnected compartment has a disconnection probability of transitioning to the disconnected compartment at the current time.

[0027] In some embodiments, the malware data features comprise: a target operating system of one or more of the different types of computer malware; time-related values indicating one or more time intervals during which one or more of the different types of malware affect a respective node; a historical and statistical data set for the different types of the computer malware, the historical and statistical data set comprising one or more of: expected values of the incubation period caused by the different types of computer malware; expected values of a symptomatic period caused by the different types of computer malware; a set of infection rates of the different types of computer malware; a set of hospitalisation rates of the different types of computer malware; and a set of fatality rates of the different types of computer malware.

[0028] In some embodiments, the time-related values include time-related values for each node and for each type of malware, and comprise: a hospitalisation period from a first time at which the node demonstrates one or more indicators of compromise (IOCs) to a second time at which the node is infected by the malware; the incubation period from the second time to a third time at which the malware commences attacking; a symptomatic period from the third time and during which the malware is undertaking attacking; an infected-to-non-recovered period from the second time to a fourth time at which the node is transitioned to the non-recoverable (N) compartment; and an infected-to-recovery period from the second time to a fifth time at which the node is transitioned to the recovered (R) compartment.

[0029] In some embodiments, the method further comprises, in response to the estimated risk of computer malware infections being greater than or equal to one or more predefined risk thresholds, controlling the computer network, wherein controlling the computer network comprises one or more of: generating one or more intimations; resetting one or more credentials of the computer network; disconnecting one or more nodes in the computer network; running anti-malware software against the malware; and disabling the computer network.
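The threshold-based control of paragraph [0029] might be sketched in Python as shown below. The mapping of each control action to its own threshold, the threshold values, and the action names are hypothetical; the specification states only that control occurs when the estimated risk is greater than or equal to one or more predefined risk thresholds.

```python
def select_actions(estimated_risk, thresholds):
    """Return every control action whose threshold the risk meets or exceeds.

    `thresholds` maps an action name to the risk level that triggers it.
    Actions are returned in order of increasing threshold.
    """
    ordered = sorted(thresholds.items(), key=lambda item: item[1])
    return [action for action, limit in ordered if estimated_risk >= limit]


# Hypothetical escalation ladder on a risk scale from 0 to 1.
thresholds = {
    "generate_intimation": 0.2,
    "reset_credentials": 0.4,
    "disconnect_nodes": 0.6,
    "run_anti_malware": 0.6,
    "disable_network": 0.9,
}
actions = select_actions(0.6, thresholds)
```

Ordering the thresholds as an escalation ladder is a design choice of this sketch: milder responses such as generating an intimation trigger at lower risk, while disabling the network is reserved for the highest risk level.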

[0030] In some embodiments, for each respective type of the computer malware, the carrier (C) compartment comprises: a carrier natural host (Cnh) compartment indicating a number of carrier natural host (Cnh) nodes that are not infected by the type of computer malware but can spread the type of computer malware to other nodes in the computer network; a carrier natural reservoir (Cnr) compartment indicating a number of carrier natural reservoir (Cnr) nodes that do not actively spread but allow the persistence of the type of computer malware in the computer network; a carrier vector (Cv) compartment indicating a number of carrier vector (Cv) nodes that are developed, by the type of computer malware, into an intermediate means for spreading the type of computer malware to other nodes in the computer network; a carrier dead end host (Cdeh) compartment indicating a number of carrier dead end host (Cdeh) nodes that are infected with the type of computer malware but are unable to spread the type of computer malware to other nodes due to the isolation or limited connectivity of the nodes; and a carrier amplifier (Ca) compartment indicating a number of carrier amplifier (Ca) nodes that are particularly effective at spreading the type of computer malware to other nodes in the computer network.

[0031] In some embodiments, the method further comprises re-extracting data features as re-extracted data features; re-generating an updated compartmental model derived from the re-extracted data features and the compartmental model; and re-estimating the risk of computer malware infections in the computer network using the updated compartmental model.

[0032] There is also provided a computer program comprising machine-readable instructions that, when executed by a computer, cause the computer to perform any of the methods described herein.

[0033] There is further provided a computer system for estimating and controlling risk of one or more computer malware infections in a computer network that comprises a plurality of nodes, the computer system comprising: a processor configured to: extract data features from a network traffic dataset of the computer network, wherein the extracted data features comprise computer network data features and malware data features of different types of computer malware infections; derive risk values of at least infectivity of the different types of computer malware from the malware data features; generate a stochastic model comprising Bayesian probability distributions, a time-based reproduction number, and a node-level dispersion number between each node of the computer network transitioning between a plurality of compartments; generate a spatio-temporal compartmental model including: the compartments; the stochastic model; and an indication of one or more of the derived risk values that are associated with transitioning at least one node of the computer network between at least some of the compartments, wherein the compartmental model accounts for one or more recursive relationships between the different types of computer malware; and estimate the risk of computer malware infections in the computer network using the compartmental model.

Brief Description of Drawings

[0034] Some embodiments are described herein below with reference to the accompanying drawings, wherein:

[0035] Figure 1a is a schematic diagram of an example computer network comprising a plurality of nodes that have a risk of computer malware infections and a controller computer for analysing, estimating and controlling the computer network;

[0036] Figure 1b is a block diagram of an exemplary configuration of a controller computer system for analysing, estimating and controlling computer malware infections in a computer network according to one embodiment;

[0037] Figure 2a is a block diagram illustrating a SUEICRN (Susceptible-Unsusceptible-Exposed-Infected-Carrier-Recovered-Non-recoverable) compartmental model for estimating a risk of malware infections in a computer network;

[0038] Figure 2b is a block diagram illustrating a variety of risk values for each type of computer malware according to one embodiment;

[0039] Figure 2c is a block diagram illustrating a variety of probability distributions for nodes transitioning between respective compartments according to one embodiment;

[0040] Figure 3a is a flow diagram of a process for estimating a risk of computer malware infections in a computer network and, optionally, controlling the computer network;

[0041] Figure 3b is a flow diagram of a process for updating the generated compartmental model;

[0042] Figure 3c is a block diagram illustrating the control of the computer network in response to the estimated risk of computer malware infections being greater than or equal to one or more predefined risk thresholds according to one embodiment;

[0043] Figure 4a is a block diagram illustrating extracted data features according to one embodiment;

[0044] Figure 4b is a block diagram illustrating time values according to one embodiment;

[0045] Figure 4c is a block diagram illustrating time values corresponding to a node transitioning between respective compartments according to one embodiment;

[0046] Figure 5 is a block diagram of malware infections to a node where there are one or more recursive relationships between the different types of computer malware;

[0047] Figure 6a is an example of raw packet capture data from a real-world campus’ DNS network traffic.

[0048] Figure 6b is an example of a JavaScript Object Notation (json) output of a conversion of the raw packet capture data of Figure 6a.

[0049] Figure 7 is a block diagram illustrating the spread of computer malware when the basic reproduction number R0 equals 2.

[0050] Figure 8a is a first set of graphs illustrating simulation results for traditional SEIR compartmental model with varying time intervals;

[0051] Figure 8b is a second set of graphs illustrating simulation results for a basic SUEICRN compartmental model with varying time intervals, wherein the incubation threshold for each node is either fixed or randomised;

[0052] Figure 8c is a third set of graphs illustrating simulation results for a SUEICRN compartmental model with varying time intervals, wherein the incubation threshold for each node is randomised;

[0053] Figure 8d is a fourth set of graphs illustrating simulation results for an extended SUEICRN compartmental model with varying time intervals by malware infections caused by different types of malware; and

[0054] Figure 8e is a fifth set of graphs illustrating simulation results for a SUEICRN compartmental model with varying time intervals considering the recursive relationships of the different types of malware in Figure 8d.

Description of Embodiments

[0055] Epidemiological compartmental models provide a theoretical framework for which malware activity (e.g., propagation) can be detected and/or predicted as a function of the technical and environmental factors of the computer network. Each node (analogous to a human in a population) is assigned to a compartment at any given time, where the compartments are labelled to represent corresponding infection states of the malware. Nodes may progress between compartments over time, and the order of the labels usually indicates the flow patterns between the compartments.

[0056] For example, the SIR model comprises three variables: the number of susceptible individuals (S), the number of infected individuals (I) and the number of recovered individuals (R). Another variant is the SEIR model (Susceptible-Exposed-Infected-Removed), which introduces a fourth variable, ‘exposed’ (E), to consider incubation rates where a host might be infected yet not infectious. The SI model (Susceptible-Infectious) and SIS model (Susceptible-Infectious-Susceptible) are additional variants of conventional compartmental models. The SI model does not account for a recovery phase, and the SIS model accounts for infection which can occur after initial recovery.

[0057] Traditional epidemiological compartmental models such as SIR and SEIR are limited in their ability to profile malicious activity of a cyber-threat, such as malware. Specifically, the conventional models are limited in their ability to account for various factors that are specific to the behaviour of malware and the operation of computing devices connected over a network. For example, conventional compartmental models implicitly assume that the population of computing nodes is static. However, this does not accurately reflect a computer network where devices (nodes) can frequently be removed, turned off, patched or otherwise disconnected.

[0058] Conventional compartmental models also assume that the entire population is susceptible to exposure and therefore infection. However, this does not reflect the operation of most computer networks in which the devices have a range of hardware and software configurations (e.g., operating systems) fundamentally altering their susceptibility to infection. For example, a network with both Windows and Linux operating systems will have a reduced population susceptibility to Windows malware.

[0059] Further, conventional compartmental models also do not differentiate between infectious and infected states, which limits the ability to distinguish or model malware that is designed to propagate other forms of discrete malware. This prevents the effective detection and subsequent control of malware that is designed for persistence and spread like Trojans, rather than conquering the host like Ransomware.

[0060] Furthermore, conventional models do not account for a node or device acting as a carrier, where the node is not necessarily infected itself (e.g., due to its immunity as represented by the unsusceptible state) but is infectious (i.e., is able to propagate malware) to other nodes. The ability of a node to act as a carrier of malware in a network may depend on one or more characteristics of the computer malware (e.g., the targeted operating system of the malware) and the node (e.g., immunity to certain malware types), and the connectivity of the node to the network 110 (e.g., the node has high connectivity to other nodes in the network 110, or the node is isolated or removed from the network as a precaution to hinder the further spread of the malware).

[0061] The ability to address the above drawbacks is limited due to the deterministic definition of the compartmental models (i.e., where node behaviour is represented by a set of differential equations). That is, there is a lack of dynamic models for assessing and predicting malware spread dynamics which account for the specific propagation characteristics of the malware, as observed from a real-time analysis of the operation of the computing network (i.e., based on analysis of the network traffic).

[0062] It is desired to provide methods and systems for detecting and quantifying computer malware activity that ameliorate one or more of the aforementioned drawbacks, or any other drawbacks of the prior art, or that at least provide a useful alternative.

Overview

[0063] Disclosed herein are methods and systems for detecting and quantifying computer malware activity in a computer network. A controller computer is configured to receive input network traffic data and to extract respective features of the network (e.g., network configuration and communications) and the malware residing on the network nodes. A network-specific compartmental model is generated that enables the controller computer to detect and profile the malware infection and propagation in the computer network for the purpose of controlling and/or mitigating the malicious effect on the nodes.

[0064] In some embodiments, the systems and methods estimate a risk of computer malware infections in the network, for example by quantifying one or more determined or predicted propagation characteristics of one or more malware types affecting the nodes (e.g., a number of nodes that are exposed or infected by a malware type). In response to determining the estimated risk of infection in the network, the controller computer may be configured to execute one or more actions to ameliorate or correct malicious activity associated with the infection (e.g., by isolating or deactivating one or more nodes).

[0065] The propagation characteristics of the malware are modelled based on the expected behaviour of the nodes of the network (e.g., as communications between connected devices), and on the properties of the malware types. Specifically, an enhanced compartmental model is proposed that combines (i) stochastic modelling of node behaviour, characterized by a compartmental definition of the security state of each node and time-dependent transitions between respective compartments, with (ii) a determination of a degree of the inherent infectiousness of the malware type within the network (i.e., a “risk value”).

[0066] For example, in some embodiments a Markov chain is generated to probabilistically represent transitions in the node compartments over time based on historical communications between the nodes. The risk values are specific to the malware type and indicate a relative propensity of the malware type to cause further infection of hosts within the network according to an intentional behaviour of the malware (e.g., a Trojan malware has a higher propensity to infect compared to Ransomware).
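As a minimal sketch of the kind of Markov chain described in paragraph [0066], the following Python fragment estimates first-order transition probabilities by counting observed compartment-to-compartment moves. It is illustrative only: the function name `estimate_transition_matrix`, the per-node compartment histories, and the use of simple maximum-likelihood counting are assumptions made for this example and are not taken from the disclosure, which derives the chain from historical communications between nodes.

```python
from collections import Counter, defaultdict


def estimate_transition_matrix(histories):
    """Estimate first-order Markov transition probabilities.

    `histories` is a list of per-node compartment sequences, one label
    per time interval (hypothetical input format for this sketch).
    """
    counts = defaultdict(Counter)
    for sequence in histories:
        # Count each observed move from one interval to the next.
        for src, dst in zip(sequence, sequence[1:]):
            counts[src][dst] += 1

    matrix = {}
    for src, destinations in counts.items():
        total = sum(destinations.values())
        # Normalise counts so each row sums to 1.
        matrix[src] = {dst: n / total for dst, n in destinations.items()}
    return matrix


# Hypothetical histories for three nodes over four time intervals.
histories = [
    ["S", "E", "I", "R"],
    ["S", "S", "E", "C"],
    ["S", "E", "I", "N"],
]
P = estimate_transition_matrix(histories)
print(P["S"])  # probabilities of leaving (or staying in) compartment S
```

Each row of the resulting dictionary is a probability distribution over the next compartment, which is the form a stochastic model of node transitions would need before any risk values are applied.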

[0067] The compartmental model is further configured to account for the existence of recursive relationships between types of malware affecting the network. This includes the detection and prediction of malware propagation that results from an earlier occurring malware attack on the network (i.e., by another type of malware). This provides an ability to model, for the given computing network, malicious effects of malware that is designed for persistence and spread (e.g., Trojans) rather than malware that is intended to disable the network nodes.

[0068] A unique compartmental model definition is proposed that enables modelling of computer node specific behaviors in the context of a malware infection of the network. The compartmental model definition extends traditional epidemiological compartmental models, such as SIR and SEIR, for example by allowing for explicit modeling of “susceptible” and “unsusceptible” states. This accounts for variation in device characteristics (e.g., operating system, program, and/or hardware configurations) that may translate into relative vulnerability or invulnerability to particular types of malware.
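The explicit split between “susceptible” and “unsusceptible” states can be illustrated with a short Python sketch. It assumes, purely for illustration, that susceptibility is decided by the operating system alone; the node identifiers, the `partition_by_target_os` helper, and the inventory below are hypothetical, and a fuller model would also consider program and hardware configurations.

```python
def partition_by_target_os(node_os, target_os):
    """Split nodes into susceptible (S) and unsusceptible (U) sets.

    `node_os` maps a node identifier to its operating system;
    `target_os` is the operating system the malware type targets.
    """
    susceptible = {n for n, os_name in node_os.items() if os_name == target_os}
    unsusceptible = set(node_os) - susceptible
    return susceptible, unsusceptible


# Hypothetical inventory of a mixed Windows/Linux/macOS network.
node_os = {
    "114a": "Windows",
    "114b": "Linux",
    "114c": "Windows",
    "114e": "macOS",
}
S, U = partition_by_target_os(node_os, "Windows")
susceptible_fraction = len(S) / len(node_os)
print(sorted(S), sorted(U), susceptible_fraction)
```

In this toy inventory only half of the nodes start in the S compartment for Windows malware, which is the reduced population susceptibility that a conventional model with a single susceptible population cannot express.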

[0069] Furthermore, the proposed compartmental model represents computer malware infections with explicitly defined incubation and symptomatic sub-states. The ability to distinguish between a node that is infected with malware, but not yet infectious to other nodes (i.e., in the incubation state), may simulate the unique behaviour of particular malware within the network (e.g., inducing a predetermined delay in incubation in order to model the malware performing a download or other time-based activity prior to becoming infectious).

[0070] The methods and systems disclosed herein are advantageous in that they provide a platform for flexible and adaptive self-assessment of a malware afflicted computer network, in the form of providing real-time feedback and prediction of malicious activity of one or more nodes (i.e., as a result of a malware infection). The proposed techniques can thereby support rapid and accurate identification and prediction of malware spread and infection.

[0071] The use of a compartmental model where transitions are driven by a combination of probability distribution information and pre-determined risk values integrates the modelling of historical network behaviour with malware type-specific propagation characteristics. By weighing up the probability of transitioning between compartments, the proposed compartmental model is enhanced by considering the stochasticity and changing of conditions of malware infection in the computer network over time. Furthermore, the probability distributions and/or risk values may be adapted as the configuration of the network changes (e.g., hardware and software characteristics of the nodes) in real-time. This enables the computer network to dynamically self-regulate the operation of individual nodes in order to control or limit the impact of malicious activity resulting from malware infections.

[0072] Compared to malware detection and treatment systems and methods which are based on conventional epidemiology models of biological disease transmission, the proposed techniques are advantageous in: (i) providing a framework for the modelling of malware attacks that is uniquely adapted to the properties of the malware programs and the computer network in which they reside; (ii) enabling the modelling of non-static populations representing the dynamic state of computing nodes in a network (cf. disease carrying individuals or animals); and (iii) explicitly accounting for recursive relationships between the different types of computer malware, thereby addressing a unique issue with securing complex computing networks in that the malware susceptibility and malicious behaviour of a node may vary dynamically in time depending on the malware infection(s) of the node.

[0073] For example, the spreading and infection of one malware may change the conditions of the spreading and infections of other malware among individual nodes, and the computer network. As a result, the security of a computer network against malware is advantageously improved by generating a cybersecurity specific compartmental model that accounts for recursive relationships established from data features of different types of computer malware.

Computer network and controller computer

[0074] Fig. 1a illustrates an example of a platform 100 including a controller computer 120 configured for determining a risk of computer malware infections in a computer network 110. The computer network 110 comprises a plurality of nodes that are interconnected, wherein the connection(s) between devices (depicted as solid lines between devices in Fig. 1a) are each a physical (i.e., wired) connection, an optical connection, or a wireless connection (e.g., using radio frequency techniques, WiFi, Bluetooth and/or cellular telecommunication methods). In some examples, the computer network 110 is a local area network (LAN) (e.g., the network of a campus, a company or a building containing nodes that are located in close physical proximity), or a wide area network (WAN) where the nodes are located remotely over a large geographic area.

[0075] As shown in Fig. 1a, the plurality of nodes of the computer network 110, labelled as 114a-114h, may each be implemented as any arbitrary computing device including hardware and software configurations that enable the node to operate by at least: connecting to the computer network 110; and transmitting and/or receiving signals carrying information over the computer network 110. For example, a node device may be a desktop or laptop computer, a tablet computer, a mobile phone or any other smart device (e.g., a TV). One or more software operating systems may be implemented on a node device depending on the characteristics of the device and users’ needs, such as Windows, Linux, macOS, DOS, Unix, iOS and Android.

[0076] In some embodiments, the number of nodes in the computer network 110 dynamically fluctuates (i.e., is dynamically changeable) over time. That is, any node can be removed, deactivated (e.g., turned off), patched, or otherwise disconnected from the computer network 110. For example, in computer networks with a “bring your own device” policy, a node may be disconnected/removed when the user of the node relocates their device elsewhere. In some cases, nodes can remain disconnected permanently, for example, in response to a device being permanently damaged beyond repair.

[0077] Nodes can also be added to or reconnected to the computer network 110. For example, node 114d was initially connected to nodes 114a, 114f and 114e, but is temporarily disconnected or removed from the computer network 110 at a given time instant (as depicted by the dashed outline of the node 114d and its connections in Fig. 1a). Therefore, in response to its disconnection from the network 110, the disconnected node 114d can no longer be exposed or susceptible to the malware spread by nodes 114a, 114f and 114e. In response to a reconnection of node 114d back to the computer network 110, node 114d may be exposed or susceptible to the malware spread by nodes 114a, 114f and 114e and has a chance of becoming infected.

[0078] In an example, nodes in the computer network 110 may have a risk of being infected by one or more types of malware (e.g., malware 116 and malware 118) from a malware carrier device 112. In the example, the nodes 114a and 114b are both infected by malware 116, wherein the malware 116 is attacking the node 114a and has not commenced attacking node 114b. Node 114c is under attack by malware 118, wherein a recursive relationship between malware 116 and malware 118 exists. For example, node 114c may become infected by malware 118 after being attacked by malware 116. The infected nodes 114a-114c may further spread the malware 116, 118 to the nodes in connection with them, e.g., nodes 114e-114g.

[0079] The malware 116, 118 can be any type of malware aiming to attack a variety of operating systems. For example, for Windows systems, the malware can be any one of, but not limited to, the following types: Virut, Necurs, Conficker, Pitou, Suppobox, Tofsee, Modpack and Nymaim. The features of each type of malware are explicated in Table 1.

[0080] As shown in Fig. 1a, platform 100 further includes controller computer 120 configured to analyze and estimate a risk of malware infections and to control the computer network 110. Although a singular form is used in this disclosure, the controller computer 120 may include one or more computing devices.

[0081] The controller computer 120 is configured to extract data features from a network traffic dataset 130 (e.g., a Domain Name System (DNS)) via a data flow 111. The controller computer 120 further processes the extracted data features to analyze and estimate the risk of malware infections in the computer network 110 using the proposed compartmental model and, optionally, control the computer network 110 via data flow 115 in response to the estimated risk (i.e., in order to limit the impact of malicious activity resulting from malware infections).

[0082] The data features extracted from the network traffic dataset 130 may comprise historical and real-time data features, including computer network data features and malware data features of different types of computer malware infections.

[0083] In some embodiments, the controller computer 120 may dynamically re-extract data features and update the extracted data features via data flow 113. In this way, the generated compartmental model can be updated in terms of dynamic changes of the computer malware infections and computer network characteristics (e.g., the number of nodes connected to the computer network 110). Accordingly, the assessment of the computer network with respect to computer malware infections is also dynamically updated in synchronization with the changes to the network configuration. For example, machine learning techniques may be implemented to track behavioural trends in malware as they evolve, and/or reinforcement learning may support a recalculation of compartmental model distributions to enhance the model in accordance with the trends.

Table 1. Features of malware types for Windows systems.

[0084] Fig. 1b illustrates a block diagram of an exemplary configuration of a controller computer system 120. The controller computer system 120 comprises a processor 121 connected to a memory 122 configured to store program instructions 122a and data 122b. The memory 122 is a computer-readable medium, such as a hard drive, a solid state disk or CD-ROM. An executable computer program, embodied by instructions 122a, stored on memory 122 causes the processor 121 to perform operations for the analysis and estimation of the risk of malware infections and, optionally, controlling the computer network 110.

[0085] The memory 122 is configured to exchange data with the processor 121 and may store the historical and empirical data features of the computer network and malware. The processor 121 may generate and store the generated compartmental model, as well as the quantified one or more determined or predicted propagation characteristics of one or more types of malware as data 122b, such as within RAM or a processor register of the memory 122.

[0086] In some embodiments, the data 122b may further include data of one or more machine learning classifiers, models and/or recognizers, such as for example a trained neural network, in the form of network parameters that have been optimised for generating the compartmental model. In one example, the processor 121 performs the data training and stores the learned parameters of the machine learning method in memory 122.

[0087] The processor 121 may receive data through different interfaces, including from an access to one or more parts of memory 122, including volatile memory, such as cache or RAM, or non-volatile memory, such as an optical disk drive, hard disk drive, storage server or cloud storage. The controller computer system 120 may further be implemented within a cloud computing environment, such as a managed group of connected servers hosting a dynamic number of virtual machines. In such cases, the processor 121 may send the data via communication port 124 to a server, such as an internet server 126.

[0088] A monitor 127, in the form of a computing device including hardware and software components, is configured to present data generated by one or more analysis, estimation and/or prediction operations performed by processor 121 (e.g., to perform a risk estimation on the computer network 110). The monitor 127 receives data via communication port 125 in relation to the real-time and/or historical risk estimation results and then presents the information through images, sounds or videos. In some embodiments, the monitor 127 is configured to present additional data such as for example a status, or an indication of progress, in relation to the one or more analysis, estimation and/or prediction operations.

Enhanced (SUEICRN) compartmental model

[0089] The processor 121 is configured to generate, as part of the data 122b, a compartmental model representing a degree of malware infection and propagation in the computer network 110. The compartmental model represents a security state or “infection state” of each node 114a-h in the network 110, with respect to a given type of malware, by categorizing each node into one of a plurality of ‘compartments’. That is, each compartment of the model ‘contains’ zero, one, or more nodes of the network at any given time, thereby providing an indication of a number of nodes that share the security state associated with the compartment in relation to the particular malware. For example, an “exposed” compartment indicates a number of “exposed nodes” (i.e., nodes that are exposed to a given type of the computer malware) of the network at a given time.

[0090] In some embodiments, the compartmental model is a time-varying model of the state of the nodes, where the set of compartments are defined by a compartmental model definition. Over successive time intervals, each node is assigned to a compartment that represents its state of security against a malware, in terms of whether the node is vulnerable to, or infected by, the malware.
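The notion that each compartment “contains” a number of nodes at a given time can be sketched in Python as a simple tally over a per-node assignment. The node identifiers follow Fig. 1a, but the assignment itself, the single-letter labels used as dictionary values, and the helper name `compartment_counts` are assumptions made only for this illustration.

```python
from collections import Counter

# Top-level compartments of the SUEICRN definition.
COMPARTMENTS = ("S", "U", "E", "I", "C", "R", "N")


def compartment_counts(assignment):
    """Return the number of nodes in each compartment for one malware type.

    `assignment` maps a node identifier to its compartment label at the
    current time interval.
    """
    tally = Counter(assignment.values())
    # Report every compartment, including those holding zero nodes.
    return {label: tally.get(label, 0) for label in COMPARTMENTS}


# Hypothetical snapshot for one malware type at one time interval.
snapshot = {"114a": "I", "114b": "I", "114c": "E", "114e": "S"}
counts = compartment_counts(snapshot)
print(counts)
```

Repeating the tally at each successive time interval gives the time-varying view of the network described above; a separate assignment would be kept for each malware type.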

[0091] In some embodiments, the compartmental model is a spatio-temporal compartmental model that describes how malware spreads and evolves in space and time and how to make predictions for the future number of infections across one or more computer networks. For example, the spatio-temporal model may be generated by integrating compartmental modelling in time domain and a point process modelling approach in space-time domain. The mechanistic approach that drives the temporal dynamics into the model that accounts for spatial dependence may be further incorporated into the generation of the spatio-temporal compartmental model.

[0092] In some embodiments, data features from the network traffic dataset may include inbound and outbound IP addresses. Geographical locations of nodes can be determined from the IP addresses (e.g., through a geolocation resolver) at each time step. This spatio-temporal data, including the geographical locations of the nodes at each time step, can be further used to generate the spatio-temporal compartmental model. For example, the derivation of risk values and generation of the stochastic model may be dependent on the spatio-temporal data. By incorporating the spatial information (e.g., geographical locations) and temporal dynamics of the nodes, the spatio-temporal compartmental model is generated, providing comprehensive insights about the computer network and enhancing the estimation of the risk of computer malware infections in the computer network.

[0093] The processor 121 is configured to execute one or more compartmental model generation and analysis routines (e.g., as part of the program instructions 122a) to quantify one or more determined or predicted propagation characteristics of one or more malware types affecting the nodes. In some embodiments, as described below, the processor 121 processes the propagation characteristics determined or predicted by the compartmental model to estimate a risk of computer malware infections in the network 110.

[0094] The generation of the compartmental model by the processor 121 is performed based on an enhanced compartmental model definition that is adapted to account for the behaviour of computing network devices in the presence of malware. In contrast to the traditional SEIR model definition, the proposed SUEICRN compartmental model definition includes additional cybersecurity-specific compartments. As shown in Fig. 2a, the SUEICRN compartmental model includes the following compartments: a susceptible (S) compartment 202, an unsusceptible (U) compartment 204, an exposed (E) compartment 208, an infectious (I) compartment 210, a carrier (C) compartment 220, a recovered (R) compartment 230 and a non-recoverable (N) compartment 240. The infectious (I) compartment 210 further includes an incubation compartment 212 and a symptomatic (Is) compartment 214. The carrier (C) compartment 220 further includes a carrier natural host (Cnh) compartment, a carrier vector (Cv) compartment, a carrier dead end host (Cdeh) compartment, a carrier amplifier (Ca) compartment and a carrier natural reservoir (Cnr) compartment. Table 3 below describes the compartments of the SUEICRN model for each type of computer malware infection.

Table 3: Compartments of the SUEICRN model

[0095] Computer malware requires some form of exposure and communication to spread, whether directly by an infected node or indirectly through a carrier. A node usually cannot be infected by computer malware if the node’s own operating system is different to a target operating system of the malware (i.e., which is required to provide an execution environment for the malware). For example, an iOS device may carry but not be susceptible to Android malware.

[0096] Conventional compartment models for modelling infectious disease dynamics, such as the SEIR model, do not account for carrier behaviour of computer devices in a network. However, in the proposed SUEICRN compartmental model a node can become a carrier of the malware as represented by a transition to the carrier compartment 220.

[0097] In some embodiments, a node may transition to a sub-compartment of the carrier (C) compartment 220 (e.g., the carrier natural host (Cnh) compartment, the carrier vector (Cv) compartment, the carrier dead end host (Cdeh) compartment, the carrier amplifier (Ca) compartment and the carrier natural reservoir (Cnr) compartment) depending on the characteristics of the computer malware, the node, and/or the connectivity of the node to the network 110. A node in one sub-compartment of the carrier (C) compartment may be transitioned to another sub-compartment of the carrier (C) compartment.

[0098] In some embodiments, the time period in which a node remains a carrier (the carrier time period) is independent of an incubation period (i.e., a period in which the node is exhibiting symptoms of being infected).

[0099] In some embodiments, a node may transition from the incubation compartment 212, indicating a number of incubating nodes that are infected but not attacked by each type of the computer malware, to the symptomatic compartment 214, indicating a number of symptomatic nodes that are both infected and attacked by each type of the computer malware. This transition is defined by the incubation period before the malware commences attacking. For example, a given incubation node may transition from the incubation compartment 212 to the symptomatic compartment 214 in response to the node remaining in the incubation compartment 212 for an incubation period that is greater than or equal to an incubation threshold.
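The threshold rule in paragraph [0099] can be expressed as a small Python sketch. The `InfectedNode` class, the sub-state strings, and the numeric time step and threshold are hypothetical and chosen only to show the rule; how the incubation threshold itself is obtained is a separate matter.

```python
from dataclasses import dataclass


@dataclass
class InfectedNode:
    """A node in the infectious (I) compartment (illustrative structure)."""
    name: str
    sub_state: str = "incubating"     # or "symptomatic"
    time_in_incubation: float = 0.0   # elapsed incubation period


def advance(node, dt, incubation_threshold):
    """Advance one time interval of length `dt`.

    An incubating node moves to the symptomatic sub-compartment once its
    incubation period is greater than or equal to the threshold.
    """
    if node.sub_state == "incubating":
        node.time_in_incubation += dt
        if node.time_in_incubation >= incubation_threshold:
            node.sub_state = "symptomatic"
    return node


node = InfectedNode("114b")
states = []
for _ in range(4):
    advance(node, dt=1.0, incubation_threshold=3.0)
    states.append(node.sub_state)
print(states)
```

With a unit time step and a threshold of 3, the node stays in incubation for the first two intervals and becomes symptomatic on the third, after which the rule no longer changes its sub-state.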

[0100] In some embodiments, the incubation threshold is determined by processing the computer network data features 410 and/or computer malware data features 420, including at least one of: the incubation node; and the malware type(s) that are infecting, but not attacking, the incubation node (e.g., malware behaviour sighted in the control data set). For computer malware that is scripted or pre-defined, the incubation threshold is a static time value determined based on the knowledge of the malware behaviour (e.g., a downloader malware may take hours to access required resources, take control of the host system, and demonstrate “symptoms” (e.g., IOCs) of infection).

[0101] In some embodiments, the incubation threshold dynamically changes based on the one or more infections of a node by respective computer malware(s) in the time interval of relevance, and/or based on one or more recursive relationships between different types of computer malware causing the vulnerabilities or infections.

[0102] As discussed earlier, a node can become disconnected from the computer network 110 (e.g., in networks with a “bring your own device” policy, or caused by automated patching, undergoing malware cleaning or system repairing/restoring, and machines in “off state”). Therefore, in some embodiments, the proposed SUEICRN compartmental model further includes a disconnected compartment 250.

[0103] In some embodiments, the disconnected compartment 250 indicates a number of nodes that are disconnected from the computer network 110 at a current time interval (disconnected nodes). At each time interval, any node not in the disconnected compartment (e.g., in S, U, E, I, C, R and N compartments) has a disconnection probability of transitioning to the disconnected compartment 250. The disconnection probability can be either static or dynamic over time based on the configuration of the network and the individual nodes. A node may become disconnected irrespective of whether or not it is infected with malware.
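A minimal Python sketch of the per-interval disconnection step follows. It assumes a single static disconnection probability shared by all nodes and uses the label "D" for the disconnected compartment 250; both the label and the helper name `step_disconnections` are assumptions for this example, and a dynamic or per-node probability could be substituted.

```python
import random


def step_disconnections(states, p_disconnect, rng):
    """Apply one time interval of disconnections.

    Every node that is not already disconnected moves to "D" with
    probability `p_disconnect`, regardless of its infection state.
    """
    updated = {}
    for node, state in states.items():
        if state != "D" and rng.random() < p_disconnect:
            updated[node] = "D"
        else:
            updated[node] = state
    return updated


rng = random.Random(0)  # seeded only to make the sketch repeatable
states = {"114a": "I", "114d": "S", "114e": "E", "114f": "D"}

unchanged = step_disconnections(states, 0.0, rng)   # nobody disconnects
all_gone = step_disconnections(states, 1.0, rng)    # everybody disconnects
sampled = step_disconnections(states, 0.1, rng)     # stochastic case
print(sampled)
```

The two boundary cases show the rule's behaviour without relying on a particular random draw: a probability of 0 leaves the assignment unchanged, and a probability of 1 moves every connected node to the disconnected compartment.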

[0104] The proposed SUEICRN compartmental model accounts for a malware specific security state of each network node. That is, in a given time interval, a node may be assigned to one compartment for one malware type but may be assigned to another compartment for a different malware type. For example, as illustrated in Table 2, at the fifth time interval, a node is in the infected (I) compartment for Malware A and B, the exposed (E) compartment for Malware C and the unsusceptible (U) compartment for Malware D.

[0105] The proposed SUEICRN compartmental model further includes: a stochastic model; and an indication of one or more risk values of at least infectivity of the different types of computer malware that are associated with transitioning at least one node of the computer network between at least some of the compartments.

[0106] Fig. 2b illustrates risk values 260 for varying malware types including: i) a susceptible-to-exposed risk value 262 being a rate of the nodes in the susceptible (S) compartment transitioning to the exposed (E) compartment via path 235; ii) an unsusceptible-to-exposed risk value 264 being a rate of the nodes in the unsusceptible (U) compartment transitioning to the exposed (E) compartment via path 203; iii) a symptomatic-to-non-recoverable risk value 265 being a rate of the symptomatic nodes transitioning to the non-recoverable (N) compartment via path 219; iv) a symptomatic-to-recovered risk value 266 being a rate of the nodes in the symptomatic compartment transitioning to the recovered (R) compartment via path 215; and v) a carrier-to-recovered risk value 268 being a rate of the nodes in the carrier (C) compartment transitioning to the recovered (R) compartment via path 223.

[0107] Fig. 2c illustrates probability distributions 280 of a stochastic model generated for the compartmental model including: i) an exposed-to-infected probability distribution 282 of the nodes in the exposed (E) compartment transitioning to the infected (I) compartment via path 209; ii) an exposed-to-carrier probability distribution 284 of the nodes in the exposed (E) compartment transitioning to the carrier (C) compartment via path 221; iii) an exposed-to-recovered probability distribution 285 of the nodes in the exposed (E) compartment transitioning to the recovered (R) compartment via path 235 (e.g., in response to the computer malware being cleaned); iv) an incubation-to-recover probability distribution 286 of the nodes in the infected (I) compartment transitioning to the recovered (R) compartment via path 217; v) a recovered-to-unsusceptible probability distribution 288 of the nodes in the recovered (R) compartment transitioning to the unsusceptible (U) compartment via path 233; vi) a recovered-to-carrier probability distribution 289 of the nodes in the recovered (R) compartment transitioning to the carrier (C) compartment via path 231; and vii) a recovered-to-susceptible probability distribution 287 of the nodes in the recovered (R) compartment transitioning to the susceptible (S) compartment (pathway not shown).

Data features, risk values, and stochastic model

[0108] Fig. 3a illustrates a process 300 for estimating a risk of computer malware infections in a computer network 110 and, optionally, controlling the computer network 110, as executed by the controller computer system 120. Embodiments of process 300 described herein relate to the extraction of data features, the derivation of risk values, the generation of a stochastic model, the generation of a compartmental model and the estimation of the risk.

[0109] At step 301, processor 121 of the controller computer system 120 extracts data features of the computer network 110 from a network traffic dataset 130 and stores the extracted data features in memory 122, as depicted in Figs. 1a and 1b. The extracted data features may include inherent parameters, historical data and dynamic real-time data of the computer network 110 and different types of malware.

[0110] Fig. 4a shows an example of extracted data features 400 that comprise one or more representative values of computer network data features 410, including the features of each node in the computer network and the overall features of the computer network 110, such as the scalability, speed of communication, data sharing, reliability, software and hardware compatibility and security settings. The extracted data features 400 further comprise one or more values of malware data features 420 of different types of computer malware infections, which may include intrinsic features of the malware, or dynamic features changing with the infections and computer network status.

[0111] In some embodiments, the computer network data features 410 include any one or more of but not limited to: an indication of the number of nodes 412 in the computer network 110, an indication of a number of infected nodes at an initial time 414, historical data 416, and real-time data 418 of the computer network 110.

[0112] In some embodiments, the number of nodes 412 in the computer network 110 is static. That is, each of the nodes 114a-h remains connected to the network and no new nodes are added to the computer network 110, such as in a configuration of a LAN for a data center.

[0113] In other embodiments, the number of nodes 412 in the computer network 110 dynamically fluctuates over time (e.g., as assessed over a relevant interval or period). For example, in some computer networks (e.g., the network of a campus or a company having a “bring your own device” policy), each node device can be inactive (e.g., in a “sleep” mode), turned off, damaged, disconnected, or removed from the computer network 110 (e.g., for malware cleaning or system repairing/restoring). Meanwhile, new nodes can be added, and previously disconnected or removed nodes can be reconnected or reconfigured to the computer network 110.

[0114] In some embodiments, the indication of the number of infected nodes 414 is obtained by calculating the number of nodes that demonstrate indicators of compromise (IOCs), such as unusual traffic, dubious log-ins or access, tampered file and system settings, and suspiciously large numbers of files. For example, for the malware Necurs, the IOCs can be malicious queries sent from the nodes and activities of nodes browsing BitTorrent and uTorrent sites.

[0115] The historical data 416 of the computer network 110 may include any one or more of but not limited to: the historical traffic/service demands over time, inbound and outbound IP addresses, the organizationally unique identifier (OUI), the payload and the hostnames of nodes, previous infections and recovered records by different types of malware, typical activities of nodes, and operating system and security settings of each node.

[0116] The real-time data 418 of the computer network 110 may include any one or more of but not limited to: the real-time traffic/service demands, the current inbound and outbound IP addresses of connected nodes, the operating system, and the OUI, the payload, the hostname and current activities of each connected node and the speed of communication.

[0117] In some embodiments, the malware data features 420 of different types of computer malware infections include any one or more of: a target operating system 422 of one or more of the different types of computer malware; time-related values 424 indicating one or more time intervals during which one or more of the different types of malware affect a respective node; and a historical and statistical data set 426 for the different types of the computer malware.

[0118] The target operating system 422 of malware is the operating system that the malware aims to infect. That is, computer malware developed for one operating system usually is not able to infect a device using a different operating system. For example, a Windows binary cannot execute within a Macintosh operating system. Therefore, malware designed to infect Windows systems is typically not adapted to also cause infection in a Macintosh or Unix device. Nevertheless, devices can act as carriers for malware targeted at alternative operating systems, e.g., Windows devices can serve as carriers for Macintosh malware, spreading upon contact.

[0119] Referring to Figs. 4b and 4c, time-related values 424 indicate the length of time that a node stays in one state. The time-related values 424 include a hospitalisation period 450, an incubation period 452, an infectious period 454, an infected-to-non-recovered period 456 and an infected-to-recovery period 458.

[0120] For each type of malware, the hospitalisation period 450 starts from a first time 451a at which the node demonstrates one or more IOCs (e.g., malware downloaded on a device) to a second time 451b at which the node is confirmed infected by the malware.

[0121] The incubation period 452 starts from the second time 451b to a third time 451c, at which the malware commences attacking (i.e., executing unauthorized and malicious actions). During the incubation period 452, malware may perform preparations for commencing attacking, such as gaining access to a computer system (e.g., through backdoors built into software, through unintentional software vulnerabilities, or through flash drives), tampering with system settings to enable further malicious actions and downloading more malware from a server.

[0122] The symptomatic period 455 is the period during which the malware is performing an attack, such as exfiltrating host information (e.g., credentials, payment information and privacy information of end-users), defrauding or blackmailing end-users, disrupting one or more normal operations on the computer system of the node, disabling one or more functionalities of the node, establishing a botnet and spreading unwanted and aggressive advertising (e.g., pop-up ads).

[0123] The infectious period 456 includes an infected-to-non-recovered period 456a and an infected-to-recovery period 456b. During the infectious period, the node may undertake actions to infect other hosts. The infected-to-non-recovered period 456a starts from the second time 451b to a fourth time 451d, at which the node becomes irreversibly compromised or non-recoverable. The infected-to-recovered period 458 starts from the second time 451b to a fifth time 451e, at which the node recovers back to normal (e.g., indicating a status of “not infected”).

[0124] Referring to Fig. 4a, in some embodiments the historical and statistical data set 426 of the malware data features 420 includes any one or more of but not limited to expected values 428 and a variety of heuristic rates 430. The expected values 428 and heuristic rates 430 may be specific to one or more different types of computer malware. The expected values 428 and the heuristic rates 430 may be derived by applying mathematical methods in probability theory (e.g., Lebesgue integration) to process the statistical and historical data.

[0125] The expected values 428 may include expected values of incubation period 423 caused by one or more different types of computer malware. In some embodiments, the expected value of the incubation period 423 is the weighted average of the length of the period from the time when the node is infected by the malware to the time when the malware commences attacking. The symptomatic period 425 is the weighted average of the length of the period during which the malware is undertaking attacking.

[0126] For each type of computer malware, the variety of heuristic rates 430 include the infection rate 431, the hospitalisation rate 432 and the fatality rate 433, as probabilities in a real number interval between 0 and 1. The heuristic rates 430 may be determined based on extant data sourced on historical spread analysis. The infection rate 431 is a statistical percentage of nodes that would be confirmed as infected after being exposed to malware. The hospitalisation rate 432 is the statistical percentage of nodes that would demonstrate IOCs. The fatality rate 433 is the statistical percentage of nodes that are likely to be irreversibly compromised, unable to perform their functionalities, and/or damaged beyond repair.

[0127] Referring back to Fig. 3a, at step 302, the controller computer 120 derives risk values of one or more different types of computer malware from the malware data features 400 (as extracted at step 301). The risk values indicate the infectious and malicious potential of a malware type to nodes of the computer network, which is based on historical data and malware characteristics defining its propagation behaviours (i.e., malware data features). The risk values may be malware-specific transition rates between 0 and 1, with one or more risk values relating to a state transition to a negative state. For example, a higher risk value indicates that a malware type is more likely to result in propagation transitions (e.g., transition from susceptible state to exposed state) for the majority of interfacing nodes.

[0128] In some embodiments, the risk values quantify at least infectivity (i.e., the capacity of malware to infect a computer system) of the different types of computer malware. The risk values may also quantify the morbidity (i.e., the rate of infections of a computer network) and/or the fatality of each type of computer malware (i.e., the capacity of malware to compromise a computer system irreversibly).

[0129] In some embodiments, the risk values indicate a relative propensity of a malware type to cause an infection of a node within the network. The relative propensity to cause an infection is correlated with an intentional behaviour of the malware type. That is, the risk value may be determined based on the intentional behaviour of the malware type. For example, one type of malware may be determined to have a high risk of infectivity and a low risk of fatality as its intentional behaviour is to persist and spread within a computer network to maximize the number of nodes of the computer network through which the malware can launch attacks (e.g., a Trojan-Downloader). In contrast, another type of malware may be determined to have a high risk of fatality as the intentional behaviour of the malware is to conquer the end-point device. However, this other type of malware may have a limited risk of infectivity (i.e., a limited capability of spreading) due to a decreased priority on achieving network persistence.

[0130] The intentional behaviour of a malware type includes one or more types of behaviours, such as, for example, spreading one or more types of malware among the computer network (e.g., a worm that can reproduce itself and spread to other nodes such as the notorious “ILOVEYOU” worm) and introducing further computer malware to the computer network (e.g., a Trojan-Downloader such as Nymaim). The intentional behaviour may also include establishing a botnet using the computer network to carry out further malicious activities (e.g., Distributed denial of service (DDoS) attacks) as a swarm and exfiltrating host information from the nodes of the computer network (e.g., spyware such as a keylogger). The intentional behaviour may further include controlling the nodes of the computer network (e.g., the Norton virus that can introduce frequent crashes, open undesirable programs, send out unauthorised emails and decrease the performance of a computer system) and disabling one or more functionalities of the nodes of the computer network (e.g., ransomware).

[0131] In some embodiments, the risk values of a type of malware vary for different nodes, as the end-point devices may have different operating systems (e.g., Windows, Linux, macOS, etc.) installed. For example, malware designed for Windows systems may have high risk values on Windows devices but low (even zero) risk values on Macintosh or Unix devices.

[0132] In some embodiments, the risk values are constants determined by the characteristics and definitions of the malware type. In such cases, the risk values can be directly obtained from the malware data features 420 by the controller computer 120.

[0133] In other embodiments, the risk values for a malware type depend on both the characteristics of the malware type and the infection status of a node (i.e., the compartment that the node belongs to, indicating exposure to or infection by the malware type). For example, the risk value of infectivity of a malware type for one node may be influenced by conditions such as the number of infected nodes and carrier nodes to which the node is connected.

[0134] Further, the risk value of infectivity of a malware type may also be influenced by other types of malware. For example, assuming a first node is connected to a second node infected by malware A, the risk value of the first node being infected by malware A may be increased if the first node is simultaneously infected by malware B which may enable the first node to further download malware A (from the second node).

[0135] In such cases, the risk values of each malware type may be modelled as conditional probability distributions, which may be functions accounting for the intentional behaviour (as discussed earlier) and one or more present and/or historical malware infection events. For example, if the risk values are modelled by Markov chains that map complex variables including transitions and dependencies towards an outcome, the next-state conditional probability distributions (i.e., risk values) depend on the malware infection events at the present state.
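As one way to picture a risk value that is conditional on the present infection events, the sketch below combines a base infectivity with the number of connected infectious nodes and an optional co-infection factor. The functional form (independent per-contact draws) and the names `infection_risk`, `base_risk` and `coinfection_boost` are assumptions made for illustration; the source does not prescribe a particular formula.

```python
def infection_risk(base_risk, infectious_neighbours, coinfection_boost=1.0):
    """Illustrative conditional risk of a node becoming infected in one interval.

    base_risk: assumed per-contact infectivity of the malware type (0 to 1).
    infectious_neighbours: number of connected infected or carrier nodes.
    coinfection_boost: assumed multiplier (>= 1) applied when another malware
        type already on the node makes infection easier.
    """
    per_contact = min(1.0, base_risk * coinfection_boost)
    # Probability that at least one of the independent contacts succeeds.
    return 1.0 - (1.0 - per_contact) ** infectious_neighbours

print(infection_risk(0.5, 2))  # 0.75
```

With no infectious neighbours the risk is zero, and a boost greater than 1 raises the risk for the same number of neighbours, which mirrors the Malware A and Malware B example above.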

[0136] At step 304, the controller computer 120 generates a stochastic model comprising probability distributions of each node transitioning between different states (i.e., compartments) or remaining at the present state. For example, a probability distribution $P_{ab}$ defines a linear transition probability of a node from state a (e.g., susceptible) to state b (e.g., exposed).

[0137] The probability distributions are influenced by the characteristics of the node (e.g., whether the end-device has installed anti-malware software) and the interactions between nodes (e.g., whether a node is connected to the computer network 110 or receiving information from other nodes).

[0138] Similar to the risk values, the probability distributions may be modelled as conditional probability distributions subject to one or more conditions related to the characteristics and behaviour of the nodes. The probability distributions may be based on historical and/or present data. In some embodiments, the stochastic model is generated using supervised or unsupervised learning.

[0139] In some embodiments, the probability distributions determining transitions between each state are based on Markov chains. In some embodiments, the probability distributions are Markov Bayesian distributions. In one embodiment, discrete time Markov chains (DTMC) are implemented to support the modelling of the transition probabilities between discrete states. In some embodiments, node-to-node interaction data features are informed by epidemiological literature, including nosocomial infection dynamics, co-morbidity models, HIV spread dynamics for human risk-taking behaviours, immune suppression and emerging diseases, to determine Bayesian inference techniques.

[0140] In some embodiments, a stochastic model in the form of a transition matrix is used to define the probability distribution of state transitions. The size of the matrix is defined by the number of possible states (e.g., the size of the matrix would be n x n for n states) where the entries in each row add up to a maximum of 1 (probability distribution). The probability distribution of transitioning from state a to state b is defined as (a, b). There are two possible types of state transitions for each node: remaining in a current state a (denoted as $P_{aa}$), or transitioning from state a to state b (denoted as $P_{ab}$). Applying a time-based Markov model, for seven exemplary states 0-6, the probability of transition can be defined as:

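$$
\mathrm{Probability}(P) =
\begin{bmatrix}
P_{00} & P_{01} & P_{02} & P_{03} & P_{04} & P_{05} & P_{06} \\
P_{10} & P_{11} & P_{12} & P_{13} & P_{14} & P_{15} & P_{16} \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
P_{60} & P_{61} & P_{62} & P_{63} & P_{64} & P_{65} & P_{66}
\end{bmatrix}
$$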
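A minimal sketch of how such a transition matrix might drive a discrete time Markov chain is given below. For brevity it uses three states rather than seven, assumes each row sums to exactly 1, and uses purely illustrative matrix values; the function names `validate_rows`, `next_state` and `simulate` are hypothetical.

```python
import random

def validate_rows(P, tol=1e-9):
    """Check that P is square, non-negative, and that each row sums to 1."""
    n = len(P)
    for row in P:
        if len(row) != n:
            raise ValueError("transition matrix must be square")
        if any(p < 0 for p in row):
            raise ValueError("probabilities must be non-negative")
        if abs(sum(row) - 1.0) > tol:
            raise ValueError("each row must sum to 1")

def next_state(P, current, rng):
    """Draw the next state index using row `current` of P as the distribution."""
    return rng.choices(range(len(P)), weights=P[current], k=1)[0]

def simulate(P, start, steps, rng):
    """Return the list of visited state indices, including the start state."""
    path = [start]
    for _ in range(steps):
        path.append(next_state(P, path[-1], rng))
    return path

# Illustrative values only: state 0 tends to stay put, state 2 is absorbing.
P = [
    [0.8, 0.2, 0.0],
    [0.1, 0.6, 0.3],
    [0.0, 0.0, 1.0],
]
validate_rows(P)
print(simulate(P, start=0, steps=10, rng=random.Random(42)))
```

The diagonal entries play the role of $P_{aa}$ (remaining in the current state) and the off-diagonal entries the role of $P_{ab}$.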

[0141] In some embodiments, a sensitivity analysis is conducted on the stochastic model to determine how probability distributions are affected based on input variables.

[0142] In some embodiments, the stochastic model includes one or more Bayesian probability distributions, a time-based reproduction number, and a node-level dispersion number between each node of the computer network transitioning between a plurality of compartments.

[0143] In epidemiology, the determination of the infectiousness of a disease is calculated based on its ability to spread and reproduce without hindrance. That is, if the average number of secondary infections caused by an average infective, called the basic reproduction number $R_0$, is less than one, a disease will die out, whereas if $R_0$ exceeds one, there will be an epidemic. The basic reproduction number $R_0$ can be expressed as a product of three factors:
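The equation itself is not reproduced in this text. For orientation, one decomposition commonly given in epidemiology textbooks is shown here; it is an assumption, not necessarily the form intended in the source, and the symbols $\tau$, $\bar{c}$ and $d$ are introduced only for this illustration:

$$
R_0 = \tau \cdot \bar{c} \cdot d
$$

where $\tau$ is the probability of transmission per contact, $\bar{c}$ is the average rate of contact between susceptible and infected individuals, and $d$ is the duration of infectiousness.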

[0144] In epidemiology, the basic reproduction number $R_0$ accounts for both human behaviours and the characteristics of the disease. $R_0$ is defined by environmental and population behaviour and is therefore not a biological constant for estimating how fast an infection will spread within a population. $R_0$ does, however, indicate the contagiousness or transmissibility of infectious agents. This is exceptionally useful when profiling new ‘strains’ of disease, such as COVID-19. $R_0$ also helps quantify the threat of each disease type to populations. Typically, the common influenza has an $R_0$ of 2, measles has an $R_0$ of 17, and COVID-19 has an $R_0$ (estimated) of 3.38±1.40, with a range of 1.90 to 6.49 (see the publication “Alimohamadi, Yousef, Maryam Taghdir, and Mojtaba Sepandi. ‘Estimate of the basic reproduction number for COVID-19: a systematic review and meta-analysis.’ Journal of Preventive Medicine and Public Health 53.3 (2020): 151”).

[0145] Analogously, $R_0$ is directly applicable for profiling malware in cybersecurity, where a range of human, environmental and machine behaviours are relevant factors in determining the spread and impact of malware within the computer network 110. For example, Fig. 7 illustrates an example of the spread of a type of computer malware where $R_0 = 2$. As shown in Fig. 7, at each time step, the average number of secondary infections caused by an infected node is 2.
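The growth implied by a reproduction number of 2 can be made concrete with a short sketch. It assumes an idealised case in which every infected node causes exactly $R_0$ new infections in the next time step and the supply of susceptible nodes never runs out; the function name `new_infections` is hypothetical.

```python
def new_infections(r0, generations):
    """Newly infected nodes per generation, starting from one infected node."""
    return [r0 ** g for g in range(generations + 1)]

counts = new_infections(2, 4)
print(counts)       # [1, 2, 4, 8, 16]
print(sum(counts))  # 31 nodes infected in total after four time steps
```

In a finite network the counts would level off as susceptible nodes are depleted, which this idealised sketch does not model.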

[0146] Basic modelling estimates that malware $R_0$ can be upwards of 27 (see the publication “Ariffin, Muhammad Rezal Kamel, et al. ‘Mathematical epidemiologic and simulation modelling of first wave COVID-19 in Malaysia.’ Scientific reports 11.1 (2021): 20739”). Predictive modelling demonstrates that when $R_0 < 1$, the malware spread decreases in a time period, and, when $R_0 > 1$, the malware spread will continue within its epidemic boundary. Typically, $R_0$ is limited in its application, especially in cyber security, as $R_0$ assumes that the entire population is susceptible to infection. ICT, IoT and OT networks often comprise a range of hardware and software configurations, which can result in diverse susceptibility based on factors including operating system, patch status, and internet connectivity.

[0147] The time-based reproduction number, denoted as $R_e$ or $R_t$, is an effective parameter used in epidemiology to determine whether an epidemic is growing, reducing or remaining at the same case size (see the publication “Gostic, Katelyn M., et al. ‘Practical considerations for measuring the effective reproductive number, $R_t$.’ PLoS computational biology 16.12 (2020): e1008409”, “Gostic et al., 2020” hereinafter). The time-based reproduction number $R_t$ informs epidemic growth behaviours, including the impact of interventions such as policy changes, and population susceptibility and immunity status. The time-based reproduction number $R_t$ can present delays, therefore predictive modelling techniques can be applied to ascertain real-time estimates of epidemic growth (Gostic et al., 2020).

[0148] In some embodiments, the processor 121 applies a Nowcasting approach (see the publication “Wu, Joseph T., Kathy Leung, and Gabriel M. Leung. ‘Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study.’ The lancet 395.10225 (2020): 689-697”) to predict $R_t$ based on key pathogenic, epidemiologic, clinical and socio-behavioural characteristics to generate the compartmental model. The Nowcasting approach can provide near real-time information about malware threats and communicating risks to populations by correctly estimating spread dynamics.

[0149] In some embodiments, the stochastic model comprises a time-based reproduction number generated by using the time-based reproduction number generation approaches in epidemiology with adaptions as necessary for estimating the risk of computer malware infections in the computer network 110.

[0150] In some embodiments, a deterministic calculation of a node-level dispersion number is performed in association with the one or more probability distributions (e.g., Bayesian probability distributions) of the stochastic model. For example, one or more dispersion numbers/parameters may be derived for one or more of the nodes indicating the propensity for the propagation of malware infection between particular nodes, and/or over the computer network. In some embodiments, the node-level dispersion number or dispersion parameter is used to re-calculate the Bayesian probability distributions based on macro-observations of spread variability. In some embodiments, the risk value is generated based on the node-level dispersion number. The node-level dispersion number can quantify the rate at which the nodes transition between different compartments of the compartmental model and/or the rate at which malware propagates between different nodes.

[0151] In some embodiments, a dispersion parameter (k value) is calculated for each type of computer malware. The k value represents the degree of variability in the propagation of malware or other cyber threats throughout the computer network 110. Specifically, the k value represents the dispersion of the number of secondary infections caused by an infected node. The dispersion parameter can be estimated by analysing the propagation patterns of different types of computer malware through the computer network 110 over time. The estimation of the dispersion parameter may be conducted by examining the number of infections caused by each type of computer malware and the time it takes for each type of computer malware to propagate through the computer network 110. By analysing the k values over time, insights into the spread of each type of computer malware across the network, such as propagation patterns of the computer malware, can be obtained.

[0152] In some embodiments, the k value of a particular type of computer malware can be used to determine its level of contagiousness or how quickly it spreads in a computer network 110. A higher value of k indicates a more heterogeneous distribution and a higher degree of overdispersion, indicating that a limited number of nodes are responsible for a disproportionate number of secondary infections. Therefore, the dispersion parameter can be used to perform risk assessments and develop mitigation strategies.

[0153] A high value of dispersion parameter also indicates that a small number of highly infectious malware types are responsible for the majority of infections in the computer network 110, while the other types of computer malware are relatively benign and have minimal impact on the computer network. In this case, network defenders can strategically allocate the available resources by concentrating mitigation efforts on the highly infectious malware types to ensure that the most significant threats to the computer network’s integrity and security are addressed with priority.

[0154] Conversely, a low dispersion parameter suggests a more homogeneous distribution, that is, a more uniform distribution of malware infections across the computer network, where all types of computer malware pose a similar threat. In this case, network defenders might adopt a broader approach for mitigating malware infections, such as implementing network-wide security measures or enhancing overall network hygiene.

[0155] The calculation of the k value requires determining the distribution of the number of secondary infections that are caused by an infected node within the computer network. In some embodiments, this is achieved by analysing the propagation data of the malware in the network traffic dataset 130. For example, one approach for this analysis is to apply the negative binomial distribution, which is defined by two parameters: the mean $\mu$ and the dispersion parameter $k$.

[0156] Specifically, the mean $\mu$ can be calculated as the average number of secondary infections caused by an infected node. The dispersion parameter (number) $k$ can then be estimated using the following formula: $k = \mu^2 / (\sigma^2 - \mu)$, where $\sigma$ is the sample standard deviation of the number of secondary infections. In some embodiments, the mean $\mu$ and the standard deviation $\sigma$ can be calculated as follows:

$$
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad \sigma = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(x_i - \mu\right)^2}
$$

where $x_i$ is the sample value and $N$ is the number of values. In some embodiments, the k value is calculated for each type of computer malware.
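The estimate of k described above can be sketched as follows. The sketch assumes the sample variance uses an N - 1 denominator (as computed by Python's `statistics.variance`), and the function name `dispersion_k` and the sample data are hypothetical.

```python
from statistics import mean, variance

def dispersion_k(secondary_infections):
    """Estimate k = mu^2 / (sigma^2 - mu) from counts of secondary infections.

    secondary_infections: one count per infected node (at least two values).
    Raises ValueError when the variance does not exceed the mean, because the
    formula is then undefined or negative.
    """
    mu = mean(secondary_infections)
    var = variance(secondary_infections)  # sample variance, N - 1 denominator
    if var <= mu:
        raise ValueError("variance must exceed the mean to estimate k")
    return mu ** 2 / (var - mu)

# Illustrative counts: three nodes infect nobody, one node infects eight others.
counts = [0, 0, 0, 8]
print(round(dispersion_k(counts), 4))  # 0.2857
```

Here the mean is 2 and the sample variance is 16, so k = 4 / 14.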

[0157] In some embodiments, the node-level dispersion number is based on a known k value for each type of computer malware, and/or the other characteristics of the node, such as the protective measures in place or the connectivity of the node in the computer network.

[0158] In some embodiments, the network traffic data is analysed using a rolling window method applied to the time-series compartmental model. The window size is chosen to balance the need for sufficient data points to calculate reliable node-level dispersion number or dispersion parameters while still capturing the dynamics of malware propagation over time. The analysis of network traffic data is limited by the window size used. That is, changing the window size or the quality of the data may result in different node-level dispersion numbers or dispersion parameters and, consequently, different interpretations of the malware’s behavior and spread. In one embodiment, a 3-hour rolling window is selected.
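A rolling window analysis of this kind might be sketched as below. The input format (pairs of a timestamp in hours and a secondary infection count), the step size, and the function name `rolling_k` are assumptions for illustration; only the 3-hour window length comes from the text.

```python
from statistics import mean, variance

def rolling_k(samples, window_hours=3.0, step_hours=1.0):
    """Estimate k in each rolling window.

    samples: iterable of (time_in_hours, secondary_infection_count) pairs.
    Returns a list of (window_start, k) pairs; k is None when the window has
    fewer than two samples or the variance does not exceed the mean.
    """
    samples = sorted(samples)
    if not samples:
        return []
    results = []
    start, t_end = samples[0][0], samples[-1][0]
    while start <= t_end:
        counts = [c for t, c in samples if start <= t < start + window_hours]
        k = None
        if len(counts) >= 2:
            mu, var = mean(counts), variance(counts)
            if var > mu:
                k = mu ** 2 / (var - mu)
        results.append((start, k))
        start += step_hours
    return results

samples = [(0, 0), (1, 0), (2, 0), (2.5, 8), (5, 1)]
for window_start, k in rolling_k(samples, window_hours=3.0, step_hours=3.0):
    print(window_start, k)
```

Shrinking the window leaves fewer samples per estimate, which illustrates the trade-off between reliability and responsiveness noted above.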

[0159] The concept of node-level dispersion number and dispersion parameter offers tangible and actionable information for malware analysis and response, which can be used to inform effective mitigation and cyber defence strategies. For example, the node-level dispersion number can measure how fast the nodes are transitioning between the plurality of compartments, and the dispersion parameter can measure the reach and impact of malware. The node-level dispersion number and dispersion parameter can determine the variability of cyber threats and their impact on the computer network 110.

Determining recursive malware relationships

[0160] At step 306, the controller computer 120 determines one or more recursive relationships between different types of malware affecting the computer network 110. As discussed earlier, a node’s vulnerability to one malware type may be influenced by the infection of one or more other malware types. For example, a malware attack by one type of malware (e.g., Conficker and Suppobox, as shown in Table 1) may result in one or more further malware attacks by one or more other types of malware. A recursive relationship between two types of malware models how a node’s vulnerability to a malware type, as represented by its security state or compartment designation, is influenced by its security state or compartment designation with respect to one or more other malware types. That is, the propensity of a given node to become susceptible to, exposed to, or infected by one type of malware may be influenced by an infection from another malware type, over time.

[0161] Fig. 5 illustrates malware infections to a node 114a where there are one or more recursive relationships between the different types of computer malware affecting the node. Table 2 describes an example of the behaviour of the individual malwares of the recursive malware infection for the node of Fig. 5 during a period (i.e., from time intervals 1 to 10) by different types of malware (Malware A-D):

Table 2. An example of a recursive malware infection of a node by malware types A to D.

[0162] In this example, the node 114a is unsusceptible to Malware B to D until it is infected by Malware A. For example, Malware A may be a Trojan-Downloader that would download malicious files from a remote server and execute the malicious files to install further types of malware (e.g., Malware B).

[0163] At time intervals 3 and 4, during which the node 114a is infected by Malware A via data flow 501, the node that is initially unsusceptible to Malware B becomes exposed and susceptible to Malware B. This may be because the infection by Malware A causes the downloading of an installation package for installing Malware B via data flow 505. At time interval 5, the node 114a is infected by Malware B, which causes the exposure of node 114a to Malware C via data flow 505.

[0164] A node may be infected by multiple types of malware at the same time, and there is usually no maximum number of infections by different types of malware. Any node in any compartment (e.g., in compartment A) for a specific type of malware may also be in any compartment (e.g., either in compartment A or another compartment) for another type of malware. For example, at time interval 5, the node 114a is simultaneously infected by Malware A and B; and at time interval 8, the node 114a is simultaneously infected by Malware A, B and C.

[0165] In the case of a node becoming infected by multiple types of malware, the modelling of the recursive relationships enables the controller computer 120 to determine how a change in the security state of one type of malware may cause infections by, or vulnerabilities to, other types of malware (i.e. as represented by a subsequent progression of the security state from Unsusceptible to Exposed to Infected for at least one of the other types of malware). For example, at time intervals 9 and 10, the node 114a is non-recoverable to Malware A. Nevertheless, the node 114a can still recover from infections of Malware B and C at time interval 10.

[0166] The one or more recursive relationships can be affected by the dynamically fluctuating number of nodes in the computer network, i.e., whether an infected node is connected to the computer network. For example, if a node is disconnected from the computer network, the recursive infections caused by different types of malware may be suspended (e.g., the downloading of further malware executable files may be paused).

[0167] The above example illustrates the effect of recursive relationships of four types of malware on the security state of one node in network 110. It is to be understood that the number of types of malware and the recursive relationships that exist between any combination of the malware types are not limited, and that the controller computer 120 is configurable to dynamically account for the relationships according to any arbitrary configuration of the network 110 and the individual nodes 114a-h.

[0168] In some embodiments, the controller computer 120 is configured to determine the one or more recursive relationships by processing the malware data features 420. This can be achieved by detecting the IOCs of malware infection, or by applying signal processing methods to the extracted raw data features including, for example, feature extraction and classification algorithms to identify different types of malware.

[0169] In some embodiments, the controller computer 120 processes the malware data features to determine the one or more recursive relationships by: i) identifying a first type of computer malware infection (e.g., by Malware A in the example of Table 2) in at least one node of the computer network 110; and ii) identifying a second type of computer malware infection (e.g., by Malware B in the example of Table 2) in the at least one node that occurs sequentially following the first type of computer malware infection. The controller computer 120 may further identify other types of computer malware infections in a similar process. That is, the controller computer 120 may identify whether one or more types of malware infections occur sequentially (e.g., by Malware C and D in the example of Table 2) following the infection by the first and/or second type of malware.
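The sequential identification described in [0169] can be illustrated with a minimal sketch. It assumes a hypothetical per-node infection log of (time interval, malware type) records; the function name `sequential_pairs`, the log layout and the sample values are illustrative assumptions and are not part of the disclosure.

```python
from collections import Counter


def sequential_pairs(infection_log):
    """Count ordered pairs (first, second) of malware types where the second
    type first infects a node strictly after the first type did.

    infection_log: dict mapping node id -> list of (time_interval, malware_type).
    """
    pair_counts = Counter()
    for node, events in infection_log.items():
        # Earliest infection interval per malware type on this node.
        first_seen = {}
        for t, malware in events:
            if malware not in first_seen or t < first_seen[malware]:
                first_seen[malware] = t
        # Every ordered pair where type a precedes type b on this node.
        for a, t_a in first_seen.items():
            for b, t_b in first_seen.items():
                if a != b and t_a < t_b:
                    pair_counts[(a, b)] += 1
    return pair_counts


# Hypothetical log (assumed values) for two nodes.
log = {
    "114a": [(3, "A"), (5, "B"), (8, "C")],
    "114b": [(2, "A"), (6, "B")],
}
counts = sequential_pairs(log)
```

A sequential count of this kind only flags candidate relationships. Whether an ordered pair reflects a genuine recursive relationship would still need the correlation and regression analysis mentioned in [0170].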

[0170] The detection and identification of the one or more recursive relationships may include performing data collection and analysis processes at one or more time intervals, such as: determining the total number of infections by each type of malware, calculating one or more correlation coefficients of the occurrence of each type of malware infection, and conducting regression analysis for malware data features of each type of malware.

[0171] In some embodiments, the one or more recursive relationships determined at step 306 are processed to determine the risk values associated with transitioning at least one node of the computer network 110 between at least some of the compartments for each type of malware. For example, in the cases where the risk values are modelled as conditional probability distributions, the one or more recursive relationships may become the condition(s) that influence the probability density function(s) of one or more risk values.

[0172] In some embodiments, the one or more recursive relationships are also processed to determine the probability distributions of the stochastic model generated at step 304. For example, similarly to the risk values, when the probability distributions are modelled as conditional probability distributions, the one or more recursive relationships may become the condition(s) that influence the probability density function value(s) of one or more probability distributions. For example, in embodiments where the time-based Markov model is used as the stochastic model, the next-state (e.g., $n+1$) probability distributions (e.g., $p_{ij}^{n+1}$) will depend on the one or more recursive relationships derived at the present state $n$ (e.g., $c_{rr}^{n}$, as one or more conditions) as $p_{ij}^{n+1} \mid c_{rr1}^{n}, c_{rr2}^{n}, \ldots$.

[0173] In some embodiments, the k value is used to estimate the overall transmissibility of different types of computer malware having recursive relationships. For example, consider two malware types A and B, where infection by malware type A may cause infection by malware type B, and vice versa. In this case, the k value for each type of computer malware can be calculated separately. It should be noted that the overall transmissibility of the computer malware in the computer network 110 depends on both k values of the respective malware types A and B. Specifically, the overall transmissibility would be high if both malware types A and B have high respective k values, as the computer network 110 would be more likely to encounter a chain of infections. On the other hand, if one or both k values are low, the computer network 110 may be less likely to sustain a chain of infections, potentially resulting in a less severe outbreak of malware.

[0174] In some embodiments, a graph-based approach with recursive traversal is used to model the recursive relationships between different types of computer malware in the computer network 110. For example, consider a scenario where a node is initially susceptible to malware type A, and malware type B can only infect nodes that have already been infected with malware type A. The node becomes a target for malware type B following its infection by malware type A. To implement this logic, a recursive function is used to determine the susceptibility of each node to different malware types. The function can take the node and each malware type as input, and recursively traverse the graph to determine if the node is susceptible to the malware type. If the node is susceptible to a malware type (e.g., malware type A), the function then checks whether the malware type can cause the infection of another malware type (e.g., malware type B). The recursive function further checks whether each malware type can infect other nodes in the computer network 110 and will recursively traverse those nodes as well.
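The malware-dependency part of the recursive traversal in [0174] can be sketched as follows. This is a minimal illustration under stated assumptions: the `enables` mapping (which malware type can lead to which other type), the `susceptible_to` mapping and both function names are hypothetical, and the traversal of neighbouring nodes mentioned in the paragraph is left out for brevity.

```python
def infection_chain(malware, enables, visited=None):
    """Recursively collect every malware type reachable from `malware`
    by following 'enables' edges (e.g., A enables B, B enables C)."""
    if visited is None:
        visited = set()
    if malware in visited:
        # Already handled; this also guards against cycles (A <-> B).
        return visited
    visited.add(malware)
    for follow_on in enables.get(malware, ()):
        infection_chain(follow_on, enables, visited)
    return visited


def node_exposure(node, susceptible_to, enables):
    """Malware types a node could eventually face, starting from the
    types it is initially susceptible to."""
    reached = set()
    for malware in susceptible_to.get(node, ()):
        infection_chain(malware, enables, reached)
    return reached


# Hypothetical dependency graph: A can lead to B, B to C, C to D.
enables = {"A": ["B"], "B": ["C"], "C": ["D"]}
susceptible_to = {"114a": ["A"], "114b": []}

exposure_114a = node_exposure("114a", susceptible_to, enables)
exposure_114b = node_exposure("114b", susceptible_to, enables)
```

The `visited` set keeps the recursion finite when two malware types can each cause infection by the other, as in the example of malware types A and B in [0173].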

Generating the compartmental model and estimating computer malware infections

[0175] At step 308, the controller computer 120 generates a compartmental model according to the SUEICRN compartmental model definition, as described above. The compartmental model includes a plurality of compartments, the stochastic model, and an indication of one or more of the derived risk values.

[0176] The controller computer 120 is configured to generate the one or more risk values based, at least in part, on the one or more recursive relationships that are determined to exist between one or more of the malware types. The recursive relationships are determined with respect to the SUEICRN compartmental model definition which uniquely encapsulates the behaviour of computer network devices that are affected by malware with particular characteristics (i.e., as determined dynamically from a network traffic analysis). Accordingly, the compartmental model generated by the controller computer 120 to control network 110 inherently accounts for the recursive relationships that exist between the different types of computer malware that are dynamically determined to impact on the network 110.

[0177] At step 310, the controller computer 120 is configured to estimate a risk of malware infections for the network 110 by processing the compartmental model generated at step 308. In some embodiments, the controller computer 120 calculates the risk from one or more determined or predicted propagation characteristics of one or more malware types affecting the nodes of the computer network 110.

[0178] In some embodiments, the estimated risk is determined from a weighted summation of one or more risk factor values, such as a number of active malware types on the computer network, a number/ratio of nodes that are exposed by one or more malware, a number/ratio of nodes that are infected by one or more malware, a number/ratio of non-recovered nodes and a number/ratio of vulnerable nodes (e.g., the devices without anti-malware software installation).

[0179] In some embodiments, the risk factors further include one or more malware specific risk factors (e.g., the values quantifying infectivity, morbidity and fatality of the particular malware). In some embodiments, as the risk values for each node and each type of malware are derived at step 302, the individual risk values are further normalized and averaged to derive the malware specific risk factor of one or more of the malware types, and/or of the network 110 (e.g., by processing the malware specific risk factors to generate an aggregated value representing a generalized risk against the known/determined malware type).

[0180] In some embodiments, the estimated risk R is calculated by

$R = w_1 r_1 + w_2 r_2 + \cdots + w_N r_N$, wherein $r_i$ denotes each type of risk factor, such as, for example, a risk factor associated with a malware type $i$, and $w_i$ is the corresponding weight indicating the significance of the risk factor $r_i$. The weight $w_i$ is pre-determined and, optionally, updated based on the characteristics of the computer network 110, the real-time infection status of the computer network 110, or the practical needs/demands for the computer network 110.
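As an illustration of the weighted summation, the following minimal sketch computes R from a list of risk factor values and weights. The function name `estimated_risk` and the sample values are assumptions chosen for the example only.

```python
def estimated_risk(risk_factors, weights):
    """Weighted summation R = sum_i w_i * r_i over N risk factors."""
    if len(risk_factors) != len(weights):
        raise ValueError("each risk factor needs exactly one weight")
    return sum(w * r for w, r in zip(weights, risk_factors))


# Hypothetical normalised risk factors and weights (assumed values).
factors = [0.2, 0.5, 0.1]
weights = [0.5, 0.3, 0.2]
R = estimated_risk(factors, weights)  # 0.10 + 0.15 + 0.02
```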

[0181] In some embodiments, the controller computer 120 further analyses the risk factors and estimates the intentional behaviour of the one or more types of malware on the computer network. The analysis and estimation may consider the recursive relationships between different types of malware. For example, if the infection by a malware type causes a considerable number of subsequent infections by other types of malware, the intentional behaviour of the malware type may be estimated as to propagate further types of malware.

[0182] In some embodiments, the controller computer 120 dynamically re-extracts data features at step 315 based on the real-time malware propagation status of the computer network 110. At this step, the previously extracted data features are updated by the re-extracted data features. Based on the re-extracted data features and the compartmental model generated previously at step 316, the compartmental model is readily re-generated as an updated compartmental model.

[0183] In some embodiments, based on steps 315 and 316, machine learning techniques are implemented to enhance the generated compartmental model and improve the accuracy with which malware propagation and infection risk is determined for the computer network 110. For example, by implementing machine learning techniques (e.g., deep learning), the extracted data features may be analysed (e.g., labelled and classified), and a model can be trained to analyse and predict the behavioural trends in malware as they evolve. Further, reinforcement learning may support the re-generating of the compartmental model (i.e., step 316) to enhance the model.

[0184] In some embodiments, at step 318, the controller computer 120 re-calculates the risk of computer malware infections in the computer network 110 by processing the updated compartmental model. This step further improves the accuracy of the determined risk based on the dynamic demands and real-time functionality of the computer network and malware (e.g., as reflected in the data traffic of the network). The controller computer 120 utilizes the risk predictions to enhance the security of the network 110 by controlling the operation of one or more nodes 114a-h in a manner that mitigates or eliminates malicious effects associated with the determined malware infection risks.

[0185] In some embodiments, heterogeneous data retrieved from different sources and/or in various formats, such as DNS, PCAP, Network Sensor/IDP/IPS data, is used to generate the compartmental model. In some embodiments, in a complex multi-model system, heterogeneous data may be used to generate a plurality of heterogeneous compartment models with a plurality of compartments and nodes.

[0186] A data shift between a training data set associated with the compartmental and/or stochastic models and an execution data set (i.e., a network traffic dataset of the computer network) may occur due to changes in the joint probability distribution of the underlying input variables and targets.

[0187] In some embodiments, Bayesian model averaging is used on the plurality of heterogeneous compartment models to combine the estimates according to the posterior probabilities (e.g., the Bayesian probability distributions) of the heterogeneous compartment models. By applying Bayesian model averaging to the plurality of heterogeneous compartment models, the data shift between a training data set and an execution data set can be corrected.
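The combination step of Bayesian model averaging can be sketched as a posterior-weighted mean of the per-model risk estimates. The sketch below assumes the posterior model probabilities are already available; how they are obtained is not shown, and the function name and sample values are illustrative assumptions.

```python
def bayesian_model_average(estimates, posteriors):
    """Combine per-model estimates using posterior model probabilities.

    The posteriors are re-normalised so that they sum to one.
    """
    if len(estimates) != len(posteriors):
        raise ValueError("one posterior probability is needed per model")
    total = sum(posteriors)
    if total <= 0:
        raise ValueError("posterior probabilities must have a positive sum")
    return sum(e * p for e, p in zip(estimates, posteriors)) / total


# Hypothetical risk estimates from three heterogeneous compartment models.
model_estimates = [0.30, 0.50, 0.40]
model_posteriors = [0.2, 0.5, 0.3]
combined = bayesian_model_average(model_estimates, model_posteriors)
```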

Controlling the computer network

[0188] In some embodiments, at step 312, the controller computer 120 compares the estimated risk (e.g., $R$) with one or more predefined risk thresholds (e.g., $T_{r1}$, $T_{r2}$, etc.). At step 314, the controller computer 120 is configured to, in response to the estimated risk being greater than or equal to the one or more predefined risk thresholds, take one or more actions to control the computer network 110.

[0189] Referring to Fig. 3c, the process of controlling the computer network 314 may include one or more of: i) generating one or more intimations 332 (e.g., a warning of malware infection risk from an administrator) to one or more nodes of the computer network; ii) resetting one or more credentials of the computer network 334 (e.g., passwords for administrator and other system accounts); iii) disconnecting one or more nodes in the computer network 336 to impede the propagation of malware (e.g., cutting off the internet and/or other core network connections of the infected devices); iv) running anti-malware software 338 against the detected types of malware (e.g., either through centralised operations by the administrator or individually on each infected device); and v) when the estimated risk is considerably high (e.g., larger than a pre-determined threshold), disabling the computer network 339 to prevent further damage.

[0190] Operation of the controller computer 120 to control the computer network 110 further enhances the security of the computer network 110 against malware infections. For example, the controller computer 120 can perform real-time detection of malware propagation and attacks, and eliminate or at least reduce the damage caused by the malware (e.g., information leakage, defrauding or blackmailing, disabling device functionalities, etc.).

[0191] Figs. 3a, 3b and 3c are to be understood as a blueprint for the one or more software programs executed by the controller computer 120, which may be implemented step-by-step, such that each step in Figs. 3a, 3b and 3c is represented by a function in a programming language, such as C++ or Java. The resulting source code is then compiled and stored as computer executable instructions 122a in memory 122.

Experimental evaluation of the SUEICRN based compartmental model

[0192] Experimental evaluations including a comparative simulation and analysis were conducted to assess the performance of the disclosed SUEICRN compartmental model, against conventional compartmental models such as the SEIR compartmental model. The limitations of traditional epidemiological compartmental models are illustrated and discussed. The results demonstrate the efficacy of the enhanced SUEICRN compartmental model for cyber security epidemiology.

[0193] Four types of compartmental models were used for the comparison simulations as follows:

• Traditional SEIR compartmental model;

• Basic SUEICRN compartmental model with a fixed number of nodes;

• Extended SUEICRN compartmental model which allows for non-static and non-homogenous network nodes (population); and

• Enhanced SUEICRN compartmental model, which supports recursive relationships between malware (disease variants).

[0194] The above models divide the network nodes up into mutually exclusive groups or “compartments”, as previously discussed. The simulations examined the dynamics of movement of network nodes between these compartments or states based on the network traffic dataset as a control dataset for demonstrating malware traversing a network.

Setup

[0195] To analyse the disclosed compartmental model, experiments were performed based on a network traffic dataset (i.e., a DNS dataset in the described simulations) that has been used to extract empirical data, the number of nodes, state transitions and malware data features. This data was used to simulate malware propagation through compartments for each compartmental model. Node communications were generated at each time interval since the proposed SUEICRN compartmental models are stochastic models that comprise node communications to capture the complexity between the node relationships, in contrast to a deterministic SEIR compartmental model.

[0196] In the simulations, network data (Packet Capture) was used as it provides a rich source of data features to describe epidemiological dynamics to input into a compartmental model. This dataset has been used as an exemplar for rich DNS data. The dataset comprises a campus’ DNS network traffic consisting of more than 4000 active users (in peak load hours) for random days in the months of April-May 2016. The dataset comprises DNS data (.pcap) from 23 April 2016 to 9 May 2016. A preliminary analysis was conducted on the dataset to identify the categories of risk factors and the relationships between these risk factors.

[0197] Fig. 6a is an example of raw packet capture (.pcap) data of a simulation at day 0. Fig. 6b is an example of the data of Fig. 6a converted to a JavaScript Object Notation (.json) format. With such data, the relevant data features, including inbound and outbound IP addresses, time, OUI, payload and hostname, are then extracted. In one example, Host name (e.g. scontent.finaal-l.fha.fbcdn.net) is sent to the Google Safety Lookup API which will return information regarding the threats. This process provides a “tag” to categorise malicious and non-malicious queries. In some examples, the IP Addresses are also sent to a geolocation resolver, such as the IP Geo API (e.g., https://db-ip.com/api/), to confirm the geographical locations of nodes.

[0198] A random computer network was generated and used by all the models in the simulations so that the computer network (structure) is consistent across all simulations. Communications (between neighbours) were generated based on the connections in the computer network. This supports consistency for comparing different compartmental models, as the same assumptions apply to simulations across the compartmental models. In addition, the computer network can also be randomly generated by updating the network variables.

[0199] Different malware parameter combinations were used to run the simulations. These were determined based on malware profiles from the sample DNS data set. Other simulation variables, namely the number of nodes, the simulated timeframe, the number of time intervals, the maximum number of nodes a node can communicate with during a time interval, the initial percentage of nodes in the S compartment, and the initial percentage of nodes in the I compartment, were defined based on the DNS data set sample. In the simulations, the status of each node was recorded per time interval in a dataset export.

[0200] Node communications were generated at each time interval to simulate a stochastic model instead of a deterministic model, as node communications can capture the complexity between the node relationships. The following conditions also apply:

a) A node communicates only with its direct neighbours (i.e., the one or more nodes to which it is directly connected).

b) At a given time interval, a node can communicate with up to a maximum number of nodes, $T_{cmax}$.

c) The actual number of nodes that a node communicated with at a given time interval is a random value between 0 and $T_{cmax}$.

d) Any number of communications with the same node is considered as a single communication. That is, only the unique nodes a node communicated with are recorded, and the number of unique nodes should be less than $T_{cmax}$.

e) No communication happens with non-recoverable nodes.
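Conditions a) to e) can be sketched as a single function that generates the contacts for one time interval. This is a minimal illustration, not the simulation code used in the evaluation: the adjacency-list layout, the function name `generate_communications` and the inclusive bounds on the random contact count are assumptions.

```python
import random


def generate_communications(neighbours, non_recoverable, t_cmax, rng):
    """Generate the unique contacts of every node for one time interval.

    neighbours: dict mapping node -> list of directly connected nodes.
    non_recoverable: set of nodes in the N compartment.
    t_cmax: maximum number of nodes a node may contact per interval.
    """
    contacts = {}
    for node, adjacent in neighbours.items():
        if node in non_recoverable:
            contacts[node] = set()  # e) N nodes do not communicate
            continue
        # a) direct neighbours only; e) never contact N nodes
        candidates = [n for n in adjacent if n not in non_recoverable]
        # b), c) random contact count in [0, t_cmax] (inclusive bounds assumed)
        count = min(rng.randint(0, t_cmax), len(candidates))
        # d) sampling without replacement keeps the contacts unique
        contacts[node] = set(rng.sample(candidates, count))
    return contacts


# Hypothetical four-node network in which node 3 is non-recoverable.
neighbours = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
interval_contacts = generate_communications(neighbours, {3}, 2, random.Random(0))
```

Passing the random generator in as an argument keeps a run reproducible, which helps when the same communications are to be replayed across several compartmental models.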

[0201] The following Table 4 explicates the general assumptions applied in the simulations.

Table 4: General assumptions.

[0202] The following Table 5 lists the simulation settings.

Table 5: Simulation settings.

[0203] In one example, “Virut” (see Table 1), a form of botnet used for cybercrime activities such as fraud, DDoS and data theft, is prevalent in the DNS dataset. The overall mean μ is calculated as 5.642297650130549, and the overall standard deviation σ is calculated as 7.208923131609546. Accordingly, the overall dispersion parameter k is 0.6872022999130206.
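The passage does not state how k was computed from the mean and standard deviation. The reported value is, however, reproduced by the method-of-moments estimator for a negative binomial distribution, $k = \mu^2 / (\sigma^2 - \mu)$, so the following sketch assumes that estimator; the function name is illustrative.

```python
def dispersion_k(mean, std):
    """Method-of-moments dispersion parameter of a negative binomial
    distribution: k = mean^2 / (variance - mean).

    Assumption: this estimator is inferred from the reported figures and
    is not named in the text.
    """
    variance = std ** 2
    if variance <= mean:
        raise ValueError("k is undefined unless the variance exceeds the mean")
    return mean ** 2 / (variance - mean)


# Values reported for Virut in the DNS dataset.
k = dispersion_k(5.642297650130549, 7.208923131609546)
```

A k value below 1, as obtained here, indicates strong overdispersion under this estimator: a small share of nodes accounts for a large share of the onward infections.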

SEIR model

[0204] The SEIR model is a deterministic model based on a series of Ordinary Differential Equations (ODE), with stochastic states simulated through node communications. The SEIR model was simulated using a randomly generated computer network. Each node was assigned a value between a pre-determined minimum and maximum value. As the network is connected, all nodes are connected in a linear way.

[0205] The following parameter values were initially selected. The number of nodes S, E, I, R in each time interval were recorded in a data file accessible to the simulation routines. The parameter values were predefined as shown in the following Table 6 based on the control dataset:

Table 6: Simulation input values for SEIR compartmental model.

[0206] The Exposed (E) period in the SEIR compartmental model is different for each node rather than a fixed value. In the simulations, the randomly assigned values were less than or equal to the control dataset values. This maintained a curved rather than stepped graphical representation of the simulation. The sub-figures in Fig. 8a show the difference between defined and randomised exposed (E) state compartments in the SEIR compartmental model according to the simulations.

[0207] The simulations for the SEIR compartmental model and the basic compartmental SUEICRN model proceeded according to the following assumptions regarding the number of nodes and the behaviour of each node in the network:

Table 7. Simulated number of nodes for the SEIR compartmental model and the basic compartmental SUEICRN model.

[0208] Fig. 8a illustrates simulation results in the form of graphs 800, 802, 804 demonstrating the flow and movement of the host nodes between S-E-I-R compartments as malware propagates through the network. It can be seen that almost the entire network population (3998 nodes) commences in the susceptible state. Fig. 8a-i shows a graph 800 of the initial time interval compartment breakdowns. At the initial time interval (interval 0), 3998 nodes are in the Susceptible compartment, one node is Exposed, and one node is Infectious. This infectious node is “patient 0”, that is, the first node to be infected in the network.

[0209] Over time, a percentage of the susceptible population becomes exposed, and then another percentage of the exposed population becomes infectious. Eventually, the nodes move to the recovered compartment. Fig. 8a-i shows a graph 802 that demonstrates the compartments in which the population is divided at approximately the 40th time interval (i.e., 40 hours after the initial time).

[0210] Figs. 8a-i, 8a-ii and 8a-iii illustrate the difference between randomised, fixed and defined exposed (E) compartments in the simulation modelling. Defined E periods demonstrate an unnatural movement pattern, which may be applicable to some cybersecurity scenarios where the behaviour of malware is defined. Such scenarios are seen in the DNS dataset where nodes are in the incubation period as introduced earlier.

[0211] The limitations of the SEIR compartmental model are evident in the lack of compartment state definitions to consider the fluctuation of the number of nodes, the heterogeneous risk factors for each node (e.g., the different target operating systems of various malware types), or recursive relationships between different nodes. In epidemiology, the majority of the human population is susceptible to most communicable diseases, which is unlikely to be applicable to computer networks, as susceptibility or vulnerability to malware is often based on operating system type, and/or other computer network configurations and behaviour of nodes.

Basic SUEICRN compartmental model

[0212] The basic SUEICRN compartmental model also assumes that the number of nodes 412 is static and the nodes are homogenous in the computer network 110. In contrast to the SEIR model, which is a fully interconnected network, a second evaluation of the SUEICRN model introduces carrier and non-recoverable nodes. This means that N nodes were effectively removed from the computer network 110 and not regarded as active contacts.

[0213] Both SEIR and SUEICRN compartmental models consider node interactions or communications that may spread malware. The following behavioural characteristics were programmed for evaluation of the models. A node in the S or U compartment will transition to the E compartment if the node has communicated with at least one node in the C or I compartment, and based on the node’s risk value.

[0214] At the initial time interval, there will be nodes in S, U and C compartments. Each node in S and U compartments is assigned a risk value between 0 and 1 for moving into the E compartment. The nodes in the S and U compartments having communicated with at least one node in the C or I compartments also have increased risk values of transitioning to the E compartment (i.e., risk values that are greater than the corresponding risk value of transitioning to the E compartment if there was no communication with any nodes in the C or I compartments). Nodes transitioning from the U compartment to the E compartment will remain in the E compartment, as they are unsusceptible to infection despite being exposed. Nodes transitioning from the S compartment to the E compartment will remain in the E compartment or transition into the I or C compartments. These transition probabilities are determined based on a percentage-based probability distribution (e.g., 10% stay in E, 40% transition to C, and 50% transition to I). These percentages are predetermined based on malware behaviour observed in the dataset. From here, nodes transitioning into the C compartment will either remain in the C compartment or transition to the R compartment, based on a probability distribution.
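The exposure and post-exposure transitions described in [0213] and [0214] can be sketched as two small functions. This is a minimal illustration under stated assumptions: the function names are hypothetical, the 10%/40%/50% split is the example split given in the text, and treating the risk value directly as the probability of exposure after a C or I contact is an assumption, since the text does not specify how the risk value is applied.

```python
import random

# Example split from the text: 10% stay in E, 40% move to C, 50% move to I.
EXPOSED_OUTCOMES = (("E", 0.10), ("C", 0.40), ("I", 0.50))


def becomes_exposed(contact_states, risk_value, rng):
    """An S or U node moves to E only after contact with a C or I node.
    Assumption: the risk value is used as the exposure probability."""
    has_risky_contact = any(state in ("C", "I") for state in contact_states)
    return has_risky_contact and rng.random() < risk_value


def next_compartment_from_exposed(origin, rng):
    """origin is the compartment ('S' or 'U') the node occupied before E."""
    if origin == "U":
        return "E"  # unsusceptible nodes remain in E despite exposure
    states, probabilities = zip(*EXPOSED_OUTCOMES)
    return rng.choices(states, weights=probabilities, k=1)[0]


rng = random.Random(7)
outcome = next_compartment_from_exposed("S", rng)
```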

[0215] All nodes that transition to the I compartment from E will transition into the incubation compartment and remain in the incubation compartment for a defined incubation threshold (e.g., $t_{inc}$). In some simulations, the incubation threshold is randomly assigned, so that nodes incubate for a time between 0 and a maximum incubation threshold (e.g., $t_{incmax}$), depending on malware characteristics. The $t_{inc}$ can be defined as any non-negative integer (e.g., {0, 1, 2, 3, ...}). If the incubation threshold is zero, the node will transit to the I compartment immediately. Once the incubation period of a node reaches the incubation threshold, it could transit to the I or R compartments. Accordingly, a random probability value ($r_{inc2R}$) is assigned to the current nodes in the incubation compartment satisfying $t_{inc} = 0$. The node will transition to the R compartment if $r_{inc2R}$ is higher than or equal to a predetermined incubation-to-recovered threshold. Otherwise, the node will transition to the I compartment. The incubation-to-recovered threshold value is fixed throughout the simulations.
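The incubation rule can be sketched as a single step function. This is a minimal illustration: the function name, the "INC" label for the incubation compartment and the use of an elapsed-interval counter are assumptions. The sketch applies the I-or-R decision whenever the incubation period has reached the threshold, including the case of a zero threshold.

```python
def incubation_step(elapsed, t_inc, r_inc2r, inc2r_threshold):
    """Advance one node by one time interval.

    elapsed: intervals already spent incubating.
    t_inc: incubation threshold of the node (non-negative integer).
    r_inc2r: random probability value drawn for the node.
    inc2r_threshold: fixed incubation-to-recovered threshold.
    Returns (compartment, elapsed) after the step.
    """
    if elapsed < t_inc:
        return "INC", elapsed + 1  # still incubating
    # Threshold reached: recover if the drawn value meets the threshold.
    if r_inc2r >= inc2r_threshold:
        return "R", elapsed
    return "I", elapsed


state, elapsed = incubation_step(elapsed=0, t_inc=2, r_inc2r=0.9, inc2r_threshold=0.8)
```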

[0216] Nodes transitioning to the I compartment are assigned a random risk value $r_{I2N}$. If $r_{I2N}$ is less than a risk threshold, they will transition to the N compartment. Otherwise, they will continue to stay in the I compartment or transition to the R compartment according to one or more probability distributions for the infected-to-recovered risk value. All R nodes will transition back to the S compartment. In the simulations, at each time interval, the susceptible-to-exposed risk value of each node in the S compartment and the infected-to-non-recoverable risk value of each node vary, i.e., new values are assigned.

[0217] Fig. 8b illustrates the flow and movement of the host nodes between SUEICRN compartments as malware propagates through the network. Fig. 8b-i shows a graph 806 of the simulation results when the incubation thresholds for each node are fixed. Fig. 8b-ii shows a graph 808 of the simulation results for a randomised incubation period, with numbers generated between 0 and $t_{incmax}$. Fig. 8b-ii shows a graph 810 of the simulation results for a more stable spread of infection over nodes compared to Fig. 8b-i. In cyber security practice, many types of malware have defined incubation thresholds, as they are often created with intent.

Extended SUEICRN compartmental model

[0218] Traditional compartmental models like SIR and SEIR support pandemic, endemic and epidemic use cases. However, these traditional compartmental models are based on the constraining assumption that population numbers are static (no births, deaths, or people leaving the boundary of the disease area). This assumption is not practical or correct in most cyber security applications, as most large-scale networks involve connections between multiple hosts that can be turned off, disconnected, damaged, or reconfigured to be non-communicative. The DNS dataset used in the evaluation demonstrates a fluctuation of the number of nodes over time (at each time interval), which could be attributed to a range of environmental and behavioural factors (e.g., in a “bring your own device” environment). Accordingly, the simulated disconnected compartment was necessary to capture these fluctuations.

[0219] Traditional epidemiological models (e.g., SIR and SEIR compartmental models) assume that the population is homogenous. In cyber security applications, this is not necessarily the case, as the use case behaviour of the computer device(s) is one of the largest risk factors in malware spread. That is, the way in which a human user operates the computing device is often deemed the highest risk to cyber resilience and can intentionally or unintentionally compromise cyber security.

[0220] The evaluation demonstrates that the proposed extended SUEICRN compartmental model improves on the basic SUEICRN compartmental model in cases where the number of nodes is not fixed and the node population is not homogeneous. The extended SUEICRN compartmental model demonstrates the requirement to consider nodes having dynamic behaviour. This more accurately simulates computer networks where devices are frequently added to or disconnected from the network.

[0221] In the evaluation, the overall population consisted of known nodes throughout the simulation (i.e., the same set of devices). In the simulations, 0% - 10% of the total number of nodes (i.e., 4000) could be removed at any time interval. The removed nodes could also be added back at any time interval. Accordingly, the number of nodes at any time interval varied between 3600 and 4000. In the simulations, only nodes in E, I, C, R or N compartments were removed at any time interval.

[0222] Nodes removed were marked as A-away. If they returned to the computer network, they returned to the same location in the network structure and to the compartment they were already in. Accordingly, the returning nodes had the same neighbours and status as before being removed. The incubation period was also paused while the node was away (e.g., a user not using the laptop after taking it home). In some scenarios, which were not simulated, the incubation period can be continued while the node is away. In the simulations, there was no communication between removed nodes (A nodes and N nodes) and the nodes in other compartments.
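The node churn described above can be sketched as follows. The sketch is illustrative only: the `Node` class, the return probability `p_return`, the uniform choice of how many nodes are away, and the random initial compartment assignment are assumptions, not details from the disclosure. Only the 10% cap, the set of removable compartments, and the paused incubation follow the text.

```python
import random
from dataclasses import dataclass

REMOVABLE = {"E", "I", "C", "R", "N"}  # compartments from which nodes may leave

@dataclass
class Node:
    compartment: str
    incubation_elapsed: int = 0
    away: bool = False  # the "A-away" marker; the compartment is retained

def churn(nodes, max_away_fraction, p_return, rng):
    """Return some away nodes, then remove nodes without exceeding the cap."""
    for node in nodes:
        if node.away and rng.random() < p_return:
            node.away = False  # returns with the same compartment and state
    away_now = sum(n.away for n in nodes)
    target = rng.randint(0, int(max_away_fraction * len(nodes)))
    candidates = [n for n in nodes if not n.away and n.compartment in REMOVABLE]
    rng.shuffle(candidates)
    for node in candidates[:max(0, target - away_now)]:
        node.away = True

def tick_incubation(nodes):
    """Advance incubation only for exposed nodes that are present."""
    for node in nodes:
        if node.compartment == "E" and not node.away:
            node.incubation_elapsed += 1

rng = random.Random(42)
nodes = [Node(compartment=rng.choice("SUEICRN")) for _ in range(4000)]
active_counts = []
for _ in range(20):
    churn(nodes, 0.10, 0.5, rng)
    tick_incubation(nodes)
    active_counts.append(sum(not n.away for n in nodes))
```

Keeping the compartment on the node while it is away is one simple way to let a returning node resume with the same status it had before removal.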

[0223] The other compartment transition definitions and assumptions were held to be the same as those for the basic SUEICRN compartmental model.

[0224] Fig. 8c is a graph 810 illustrating the resulting compartment occupancy over time when fluctuation in the node population was simulated according to the network behaviours discussed above. Although the graph 810 of Fig. 8c demonstrates a similar overall trend in the compartment occupancy as graph 808 of Fig. 8b, it is noted that the extended SUEICRN compartmental model is able to track the corresponding changes in the number of nodes that are Exposed, Infectious, and Non-recoverable at any given time with increased fidelity.

Enhanced SUEICRN compartmental model

[0225] The enhanced SUEICRN compartmental model is an enhancement to the basic SUEICRN and extended SUEICRN compartmental models. The enhanced SUEICRN compartmental model simulates recursive relationships between different malware types, which supports the simulation of multiple malware attacks that occur in sequence, or in which one or more malware types introduce infections by other malware types. The disclosed recursive relationships are unique to the field of cybersecurity.

[0226] Using the DNS dataset as a sample of these malware spread dynamics, a malware type named “modpack” was simulated as a Trojan-Downloader that initiates the download of other malware variants, e.g., Necurs and Virut. Accordingly, the recursive relationships were simulated by this enhanced SUEICRN compartmental model.
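A recursive relationship of this kind, where a downloader introduces further malware types, can be sketched as below. The mapping of “modpack” to Necurs and Virut follows the example above; the `DOWNLOADS` table, the drop probability `p_drop`, and the function name are hypothetical and serve only to illustrate the idea.

```python
import random

# Downloader relationships; the single entry follows the example in the text.
DOWNLOADS = {"modpack": ["Necurs", "Virut"]}

def apply_attack(node_infections, malware, p_drop, rng):
    """Infect a node with `malware` and, recursively, with payloads it downloads.

    node_infections is the set of malware types already present on the node.
    p_drop (an assumed parameter) is the chance that each payload is delivered.
    """
    pending = [malware]
    while pending:
        current = pending.pop()
        if current in node_infections:
            continue  # already infected; also guards against cyclic downloads
        node_infections.add(current)
        for payload in DOWNLOADS.get(current, []):
            if rng.random() < p_drop:
                pending.append(payload)
    return node_infections

rng = random.Random(3)
infected = apply_attack(set(), "modpack", p_drop=1.0, rng=rng)
```

A per-node set of malware types is used here so that one node can hold several infections at once, which is what the recursive relationship requires.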

[0227] The following Table 8 explicates the assumptions for simulating the enhanced SUEICRN compartmental model.

Table 8. Simulation assumptions for enhanced SUEICRN compartmental model.

[0228] The following Table 9 shows the simulated data features of the typical Windows malware as explicated in Table 1.

Table 9. Simulated data features of the typical Windows malware.

[0229] A malware attack sequence was randomly generated for the simulation of n time intervals. As shown in Table 10 below, an example attack sequence was generated for a simulation spanning 10 time intervals, where four types of malware attack the computer network at several predetermined time intervals. Each malware attack caused the attacked node to experience at least one compartment transition which enabled the node to be simultaneously infected by at least one other malware type.

Time Interval: 1 2 3 4 5 6 7 8 9 10

Table 10. Malware attack sequence in a period (10 time intervals).
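One way to generate a random attack sequence over n time intervals is sketched below. The per-interval attack probability `p_attack` and the function name are assumptions made for illustration; the malware names are those used in the simulations, and the generated sequence does not reproduce the entries of Table 10.

```python
import random

def generate_attack_sequence(malware_types, n_intervals, p_attack, rng):
    """Map each time interval (1-based) to the malware types attacking in it."""
    return {
        t: [m for m in malware_types if rng.random() < p_attack]
        for t in range(1, n_intervals + 1)
    }

rng = random.Random(7)
malware_types = ["Virut", "Conficker", "Suppobox", "Nymaim"]
sequence = generate_attack_sequence(malware_types, 10, 0.3, rng)
```

An interval with an empty list corresponds to a time interval in which no malware type attacks the network.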

[0230] The simulations were run with the 1st, 3rd, 5th and 8th Malware Types as explicated in Table 1, that is, malware Virut, Conficker, Suppobox and Nymaim, respectively. This demonstrates the different intentional behaviours based on the malware characteristics.

[0231] Fig. 8d includes graphs 812, 814, 816, 818 of the respective flow and movement of nodes between compartments for each malware type. Fig. 8e is a graph 820 of the flow and movement of nodes between compartments accounting for the four types of malware together, which demonstrates the recursive relationships. The improved results of Figs. 8d and 8e illustrate the benefit of the simulation of recursive relationships for detecting security threats in computer networks where nodes can be susceptible to multiple types of malware at any time.

[0232] To summarise, the evaluation results illustrate the benefit of generating and using the proposed compartmental models to enable a controller computer system to predict and detect malware behaviour, and to thereby mitigate its spread, through a connected computer network with a complex configuration and a diverse node population.

[0233] The simulations of the basic SUEICRN compartmental model, extended SUEICRN compartmental model and enhanced SUEICRN compartmental model demonstrate the efficiency of the additional compartments, in contrast to traditional SIR and SEIR compartmental models used in traditional epidemiology. The extended SUEICRN compartmental model and enhanced SUEICRN compartmental model also demonstrate the relevance of disconnected nodes. The enhanced SUEICRN compartmental model detects the recursive relationships between different malware types for additional benefits.

In the simulations, the behavioural assumptions are modelled on average. The assumptions in the examples are not to be interpreted as limiting, but merely as a representative basis for teaching one skilled in the art to variously employ the methods and structures of the present disclosure. For example, additional datasets can be considered for estimating a risk of computer malware infections, and further characteristics in node and node interactions can be taken into account.

[0234] It will be appreciated by persons skilled in the art that numerous variations and/or modifications may be made to the above-described embodiments, without departing from the broad general scope of the present disclosure. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive.