

Title:
DEVICE AND METHOD FOR FILE RECOVERY AT A PRODUCTION SITE USING A FILE LEVEL CONTINUOUS DATA PROTECTION SYSTEM
Document Type and Number:
WIPO Patent Application WO/2024/132184
Kind Code:
A1
Abstract:
A device and a method for file recovery at a production site using a continuous data protection (CDP) system are provided. The CDP system comprises a plurality of snapshots of a replica copy of a file system, each snapshot generated at a different point in time, and a journal comprising one or more sets of file operations, each set of file operations performed in a time range between two snapshots. The device is configured to: receive a ransomware notification indicating one or more infected files of user data at the production site; determine a point in time at which the files are not infected, and determine a notification time range; replay one or more file operations from the set of file operations performed in the notification time range; and restore the file system by restoring the files of the user data at the production site to the corresponding latest version.

Inventors:
NATANZON, Assaf (Riesstr. 25, Munich, DE)
Application Number:
PCT/EP2022/087768
Publication Date:
June 27, 2024
Filing Date:
December 23, 2022
Assignee:
HUAWEI TECHNOLOGIES CO., LTD. (Longgang District, Shenzhen, Guangdong 9, CN)
NATANZON, Assaf (Riesstr. 25, Munich, DE)
International Classes:
G06F21/55
Attorney, Agent or Firm:
HUAWEI EUROPEAN IPR (Riesstr. 25, Munich, DE)
Claims:
CLAIMS

1. A device (100) for file recovery at a production site (110) using a file level continuous data protection, CDP, system (120), wherein the CDP system (120) comprises a plurality of snapshots (122) of a replica copy of a file system, each snapshot (122) generated at a different point in time, and a journal (124) comprising one or more sets of file operations (126), each set of file operations (126) performed in a time range between two snapshots (122), and wherein the device (100) is configured to: receive a ransomware notification (101) at a notification point in time, wherein the ransomware notification (101) indicates that one or more files of user data (103) at the production site (110) are infected by a ransomware; determine a point in time at which the one or more files (103) are not infected by the ransomware, wherein the determined point in time is earlier than the notification point in time, and determine a notification time range, wherein the notification time range is a time difference between the determined point in time and the notification point in time; replay, starting from the snapshot (122) of the replica copy of the file system generated at the determined point in time, one or more file operations from the set of file operations (126) performed in the notification time range to obtain a latest version of the one or more files of user data (105); and restore the file system by restoring the one or more files of the user data (103) at the production site (110) to the corresponding latest version (105).

2. The device (100) according to claim 1, wherein the latest version of a file of user data (105) comprises a respective clear file obtained after replaying the one or more file operations from the set of file operations (126) performed in the notification time range, wherein the clear file is the file of the user data (103) that is not infected by the ransomware.

3. The device (100) according to claim 1 or 2, wherein replaying the one or more file operations from the set of file operations (126) performed in the notification time range to obtain a latest version of the file of user data (105), comprises: determining whether one or more file operations of the set of file operations (126) performed in the notification time range are legal operations; ignoring each file operation that is not a legal operation; and replaying the determined one or more legal operations.

4. The device (100) according to claim 3, wherein a legal operation is a file operation that is not performed by the ransomware.

5. The device (100) according to claims 2 to 4, wherein when a determined non legal operation comprises that the ransomware deleted the one or more files of user data (103) after writing them in an infected file, the replaying the determined one or more legal operations comprises: replaying the one or more legal operations performed in the notification time range before the one or more files (103) are deleted by the ransomware; and the latest version of the file of user data (105) comprises the file of user data obtained after replaying the one or more legal operations before the file was deleted by the ransomware.

6. The device (100) according to one of the claims 1 to 5, wherein the device (100) is further configured to retrieve, from the CDP journal (124), the set of file operations (126) performed in the notification time range.

7. The device (100) according to any one of claims 1 to 6, wherein the device (100) is further configured to determine a traversal pattern (307) of the ransomware.

8. The device (100) according to claim 7, wherein the traversal pattern (307) of the ransomware comprises at least one of an order of access to file directories, an order of access to each of the one or more files of the user data (103), information that indicates if each of the one or more files of the user data (103) is encrypted, and information indicating if one or more files of the user data (103) at the production site (110) is created with an unknown data format.

9. The device (100) according to one of the claims 1 to 8, further configured to receive the ransomware notification (101) at the notification point in time from a ransomware scanning entity (430).

10. The device (100) according to claim 9, wherein the ransomware scanning entity (430) is configured to: detect the ransomware in the production site (110); and generate the ransomware notification (101) when the ransomware is detected in the production site (110).

11. The device (100) according to one of the claims 9 to 10, further configured to send the traversal pattern (307) of the ransomware to the ransomware scanning entity (430).

12. The device (100) according to one of the claims 9 to 11, wherein the ransomware scanning entity (430) is updated based on the received traversal pattern (307) of the ransomware.

13. The device (100) according to one of the claims 9 to 12, wherein the ransomware scanning entity (430) is implemented inside the production site (110) or outside the production site (110).

14. The device (100) according to one of the claims 1 to 13, wherein the device (100) is implemented inside the production site (110) or outside the production site (110).

15. A method (500) for file recovery at a production site (110) using a file level continuous data protection, CDP, system (120), wherein the CDP system (120) comprises a plurality of snapshots (122) of a replica copy of a file system, each snapshot (122) generated at a different point in time; and a journal (124) comprising one or more sets of file operations (126), each set of file operations (126) performed in a time range between two snapshots (122), and wherein the method (500) comprises: receiving (S501) a ransomware notification (101) at a notification point in time, wherein the ransomware notification (101) indicates that one or more files (103) of user data at the production site (110) are infected by a ransomware; determining (S502) a point in time at which the one or more files (103) are not infected by the ransomware, wherein the determined point in time is earlier than the notification point in time, and determining a notification time range, wherein the notification time range is a time difference between the determined point in time and the notification point in time; replaying (S503), starting from the snapshot (122) of the replica copy of the file system generated at the determined point in time, one or more operations from the set of file operations (126) performed in the notification time range to obtain a latest version of the one or more files of user data (105); and restoring (S504) the file system by restoring the one or more files of the user data at the production site (110) to the corresponding latest version (105).

16. A computer program product comprising a program code for carrying out, when implemented on a processor, the method (500) according to claim 15.

Description:
DEVICE AND METHOD FOR FILE RECOVERY AT A PRODUCTION SITE USING A FILE LEVEL CONTINUOUS DATA PROTECTION SYSTEM

TECHNICAL FIELD

The present disclosure relates to a device for file recovery at a production site using a file level continuous data protection system (CDP). The disclosure further provides a corresponding method for file recovery at a production site using a file level CDP system and a computer program product to perform the method.

BACKGROUND

Ransomware is a type of malicious software (malware) that threatens to publish or block access to data or a computer system, usually by encrypting it, until the victim pays a ransom fee to the attacker. In many cases, a ransom demand comes with a deadline. If the victim does not pay in time, the data is lost forever or the ransom increases.

The two predominant types of ransomware are screen lockers and encryptors. The former block access to a system with a "lock" screen, asserting that the system is encrypted. Encryptors, on the other hand, encrypt data on a system, making the content useless without a decryption key.

For both encryptors and screen lockers, victims are often notified on a lock screen to purchase a cryptocurrency to pay the ransom fee. Once the fee is paid, the victims receive the decryption key and may attempt to decrypt their files. However, decryption is not guaranteed, as multiple sources report varying degrees of success with decryption after paying ransom fees. Sometimes the victims never receive the decryption keys. Some attacks install malware on the computer system even after the ransom fee is paid and the data is released.

Enterprise ransomware infections or viruses usually start with a malicious email. An unsuspecting user opens an attachment or clicks on a URL that is malicious or has been compromised. At that point, a ransomware agent is installed and begins encrypting key files on the victim’s PC and any attached file shares. After encrypting the data, the ransomware displays a message on the infected device. The message explains what has occurred and how to pay the attackers. If the victims pay, the ransomware promises they will get a code to unlock their data. In order to prevent ransomware attacks, users may:

• Defend their email against ransomware. Secure Email Gateways with targeted attack protection are crucial for detecting and blocking malicious emails that deliver ransomware. These solutions protect against malicious attachments, malicious documents, and URLs in emails delivered to user computers.

• Defend their mobile devices against ransomware. Mobile attack protection products, when used in conjunction with mobile device management (MDM) tools, can analyze applications on users’ devices and immediately alert users and IT to any applications that might compromise the environment.

• Defend their web surfing against ransomware. Secure web gateways can scan users’ web surfing traffic to identify malicious web ads that might lead them to ransomware.

• Monitor their servers and networks, and back up key systems. Monitoring tools can detect unusual file access activities, viruses, network C&C traffic and CPU loads, possibly in time to block ransomware from activating. Keeping a full image copy of crucial systems can reduce the risk of a crashed or encrypted machine causing a crucial operational bottleneck.

Like most malware, ransomware is designed to infect a computer and remain undetected until its objective is achieved. In the case of ransomware, the attacker’s goal is for the victims to be aware of the infection only when they receive the ransom demand.

Conventional anti-ransomware solutions are designed to identify the infection earlier in the ransomware infection process. To that end, a number of ransomware detection techniques are used to overcome ransomware’s stealth and defense evasion functionality.

Early ransomware detection is of utmost importance since the damage performed by the ransomware may be irreversible. If ransomware encrypts data that is not included in a secure backup, then it may be irrecoverable even if the victim pays the ransom fee. Identifying and eradicating the ransomware infection before encryption begins, thus, is essential to minimize its impact.

Modern ransomware variants commonly exfiltrate a company’s sensitive data before encrypting it. If the ransomware can be detected before this data theft occurs, then the company avoids a data breach that could be expensive. There are multiple software solutions available aiming to detect ransomware in a run time environment, in which a tracker looks at the number of changes occurring to files, at delete operations, and at the entropy of the data. In most cases, however, these solutions detect ransomware after it has already encrypted at least a portion of the files. As a result, a recovery of the files is needed.

Moreover, once a ransomware is detected, it is not easy to find the latest images containing the non-encrypted data. In particular, if the encryption process is long, the data of several files may change during the encryption process. Consequently, reverting to an image taken before the encryption began means losing important data.

Continuous data protection (CDP) is a method for creating a copy of user data with the ability to restore the user data to any previous time.

File level CDP provides continuous data protection by logging all file operations, such as write, delete, link, unlink, copy file range and rename, and by replaying said file operations on a target file system. Further, file level CDP keeps a continuous journal of the file operations and makes it possible to restore the file system to any previous point in time.
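The logging and replaying of file operations can be illustrated with a minimal Python sketch. It assumes an in-memory file system modelled as a mapping from path to contents and supports only a subset of operations (write, delete, rename). The names `FileOp`, `apply_op` and `restore_to`, and the numeric timestamps, are hypothetical and are not taken from any particular CDP product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileOp:
    ts: float            # time at which the operation was performed
    kind: str            # "write", "delete" or "rename" (illustrative subset)
    path: str
    data: bytes = b""    # payload of a "write"
    offset: int = 0      # byte offset of a "write"
    new_path: str = ""   # destination of a "rename"


def apply_op(fs: dict[str, bytes], op: FileOp) -> None:
    """Apply one journaled operation to the target file system (path -> contents)."""
    if op.kind == "write":
        old = fs.get(op.path, b"")
        if len(old) < op.offset:
            # Writing past the end of the file: pad the gap with zero bytes.
            old += b"\x00" * (op.offset - len(old))
        fs[op.path] = old[:op.offset] + op.data + old[op.offset + len(op.data):]
    elif op.kind == "delete":
        fs.pop(op.path, None)
    elif op.kind == "rename":
        if op.path in fs:
            fs[op.new_path] = fs.pop(op.path)
    else:
        raise ValueError(f"unsupported operation: {op.kind}")


def restore_to(base: dict[str, bytes], journal: list[FileOp], point_in_time: float) -> dict[str, bytes]:
    """Replay the journal on a copy of `base` up to and including `point_in_time`."""
    fs = dict(base)
    for op in sorted(journal, key=lambda o: o.ts):
        if op.ts > point_in_time:
            break
        apply_op(fs, op)
    return fs


journal = [
    FileOp(1.0, "write", "a.txt", b"hello"),
    FileOp(2.0, "write", "a.txt", b"J", offset=0),
    FileOp(3.0, "rename", "a.txt", new_path="b.txt"),
    FileOp(4.0, "delete", "b.txt"),
]
```

Calling `restore_to({}, journal, 2.0)` in this sketch yields the file system as it stood after the second write, before the rename and the delete were logged.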

In an exemplary solution, a block level CDP system is used to perform selective recovery of the files of user data after a ransomware attack has been detected. Each file is recovered from a different point in time by mounting multiple points in time and detecting the files that are clear (i.e., not infected by ransomware). However, this method requires very intensive I/O and CPU operations to perform the recovery.

SUMMARY

In view of the above, this disclosure aims to improve conventional solutions for restoring user data using a CDP system after a ransomware is detected. An objective is to provide a device and a corresponding method for file recovery at a production site using a CDP system that leverages the ability of file level CDP to track the full life cycle of each file, from creation until deletion, including all the changes performed to the file in between, and to restore the clean (i.e., non-infected by the ransomware) files of the user data. These and other objectives are achieved by the solutions of this disclosure as described in the independent claims. Advantageous implementations are further defined in the dependent claims.

According to a first aspect, a device for file recovery at a production site using a file level CDP system is provided. The CDP system includes a plurality of snapshots of a replica copy of a file system, each snapshot generated at a different point in time, and a journal comprising one or more sets of file operations, each set of file operations performed in a time range between two snapshots. The device is configured to receive a ransomware notification at a notification point in time, where the ransomware notification indicates that one or more files of user data at the production site are infected by a ransomware; determine a point in time at which the one or more files are not infected by the ransomware and determine a notification time range; replay, starting from the snapshot of the replica copy of the file system generated at the determined point in time, one or more file operations from the set of file operations performed in the notification time range to obtain a latest version of the one or more files of user data; and restore the file system by restoring the one or more files of the user data at the production site to the corresponding latest version. The determined point in time is earlier than the notification point in time and the notification time range is a time difference between the determined point in time and the notification point in time.

This provides the advantage of avoiding the loss of important data due to a ransomware attack by leveraging the ability of file level CDP to track the full life cycle of each file, from its creation until its deletion, including also all the changes performed to the file. Further advantageously, the device makes it possible to restore each of the one or more files of user data in an automatic and efficient manner without data loss, by generating the latest version of each of the one or more files of the user data obtained by replaying one or more file operations performed in the notification time range.
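The relationship between the determined point in time, the notification point in time and the notification time range can be illustrated with a short Python sketch. It assumes that an estimate of when the infection started is already available (how such an estimate is obtained is not covered by this sketch), and it selects the latest snapshot generated strictly before that estimate. The function names, the tuple-based journal layout and the numeric timestamps are hypothetical.

```python
from bisect import bisect_left


def determine_time_range(snapshot_times, suspected_infection_start, notification_time):
    """Return (determined_point_in_time, notification_time_range).

    `snapshot_times` must be sorted in ascending order.
    """
    # Index of the first snapshot that is NOT strictly earlier than the suspected start.
    i = bisect_left(snapshot_times, suspected_infection_start)
    if i == 0:
        raise LookupError("no snapshot predates the suspected infection")
    determined = snapshot_times[i - 1]
    return determined, notification_time - determined


def ops_in_range(journal, determined, notification_time):
    """Select journal entries (ts, description) performed in the notification time range."""
    return [(ts, op) for ts, op in journal if determined < ts <= notification_time]


snapshots = [100, 200, 300]
journal = [
    (150, "write a.txt"),
    (210, "write a.txt"),
    (260, "write a.txt.locked"),
    (330, "write b.txt"),
]
determined, time_range = determine_time_range(snapshots, 250, 320)
selected = ops_in_range(journal, determined, 320)
```

In this example the snapshot taken at time 300 is skipped because it was generated after the suspected start of the infection, so the replay would start from the snapshot taken at time 200.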

In this disclosure, the terms clear file and clean file may be used interchangeably. Additionally, the terms file level CDP and CDP may be used interchangeably.

In an implementation form of the first aspect, the latest version of a file of user data includes a respective clear file obtained after replaying the one or more file operations from the set of file operations performed in the notification time range, where the clear file is the file of the user data that is not infected by the ransomware.

This provides the advantage that each file of the one or more files of the user data can be individually restored from its corresponding clean version in which the file is not infected by the ransomware. In other words, the device makes it possible to individually recover the most up-to-date version of each file of user data.

In an implementation form of the first aspect, replaying the one or more file operations from the set of file operations performed in the notification time range to obtain the latest version of the file of user data, includes: determining whether one or more file operations of the set of file operations performed in the notification time range are legal operations; ignoring each file operation that is not a legal operation; and replaying the determined one or more legal operations.

In an implementation form of the first aspect, a legal operation is a file operation that is not performed by the ransomware.

That is, a legal operation may refer to any operation performed by the user on the file in the notification time range.

This makes it possible to selectively replay the one or more file operations from the set of file operations performed in the notification time range; thereby, only the operations that were not performed by the ransomware are replayed, ensuring that a clean version of the file is obtained after replaying the operations. This further enhances the efficiency and accuracy of file recovery.
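The selective replay can be sketched in Python as follows. The sketch assumes that each journaled operation carries an `actor` field and that a predicate `is_legal` is supplied by the caller; how an operation is attributed to the ransomware is not specified here, and the field names, the whole-file write semantics and the actor name `mal.exe` are hypothetical.

```python
def replay_legal(snapshot, ops, is_legal):
    """Replay only legal operations on a copy of the snapshot.

    Returns the resulting files (path -> contents) and the list of ignored operations.
    """
    files = dict(snapshot)
    ignored = []
    for op in sorted(ops, key=lambda o: o["ts"]):
        if not is_legal(op):
            ignored.append(op)  # operations attributed to the ransomware are skipped
            continue
        if op["kind"] == "write":
            files[op["path"]] = op["data"]  # whole-file write, for simplicity
        elif op["kind"] == "delete":
            files.pop(op["path"], None)
        else:
            raise ValueError(f"unsupported operation: {op['kind']}")
    return files, ignored


snapshot = {"report.doc": b"v1"}
ops = [
    {"ts": 10, "actor": "editor", "kind": "write", "path": "report.doc", "data": b"v2"},
    {"ts": 20, "actor": "mal.exe", "kind": "write", "path": "report.doc.locked", "data": b"\x8f\x02"},
    {"ts": 21, "actor": "mal.exe", "kind": "delete", "path": "report.doc"},
    {"ts": 25, "actor": "editor", "kind": "write", "path": "notes.txt", "data": b"todo"},
]
restored, ignored = replay_legal(snapshot, ops, lambda op: op["actor"] != "mal.exe")
```

Here the user's edit at time 10 and the new file written at time 25 survive, while the encrypted copy and the deletion of the original are ignored, so `report.doc` is recovered in its latest clean version rather than in its snapshot version.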

In an implementation form of the first aspect, when a determined non legal operation includes that the ransomware deleted the one or more files of user data after writing them in an infected file, the replaying the determined one or more legal operations includes: replaying the one or more legal operations performed in the notification time range before the one or more files are deleted by the ransomware; and the latest version of the file of user data comprises the file of user data obtained after replaying the one or more legal operations before the file was deleted by the ransomware.

This provides the advantage of preventing the deletion of the original file by the ransomware, as the illegal delete operation is ignored, and may further prevent overwriting a file with encrypted data.

In an implementation form of the first aspect, the device is further configured to retrieve, from the CDP journal, the set of file operations performed in the notification time range.

In an implementation form of the first aspect, the device is further configured to determine a traversal pattern of the ransomware.

By leveraging the ability of the file level CDP system to track the full life cycle of each file, the ransomware behavior may be traced back.

In an implementation form of the first aspect, the traversal pattern of the ransomware comprises at least one of: an order of access to file directories; an order of access to each of the one or more files of the user data; information that indicates if each of the one or more files of the user data is encrypted; and information indicating if one or more files of the user data in the CDP system is created with an unknown data format.

This makes it possible to understand the manner in which the ransomware works while attacking the one or more files of the user data at the production site.
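A traversal pattern of this kind could, for example, be derived from the journaled operations attributed to the ransomware. The following Python sketch assumes that such operations have already been identified and that an allow-list of known file extensions is available; the dictionary layout, the `KNOWN_EXTENSIONS` set and the function name are hypothetical.

```python
import posixpath

KNOWN_EXTENSIONS = {".txt", ".doc", ".pdf"}  # hypothetical allow-list of known formats


def traversal_pattern(ransomware_ops):
    """Summarise the order in which directories and files were accessed."""
    dir_order, file_order, unknown_format = [], [], []
    for op in sorted(ransomware_ops, key=lambda o: o["ts"]):
        path = op["path"]
        directory = posixpath.dirname(path)
        if directory not in dir_order:
            dir_order.append(directory)   # first access to this directory
        if path not in file_order:
            file_order.append(path)       # first access to this file
        extension = posixpath.splitext(path)[1].lower()
        if op["kind"] == "create" and extension not in KNOWN_EXTENSIONS:
            if path not in unknown_format:
                unknown_format.append(path)  # file created with an unknown format
    return {
        "dir_order": dir_order,
        "file_order": file_order,
        "unknown_format": unknown_format,
    }


ops = [
    {"ts": 3, "kind": "create", "path": "/home/b/y.pdf.locked"},
    {"ts": 1, "kind": "create", "path": "/home/a/x.doc.locked"},
    {"ts": 2, "kind": "delete", "path": "/home/a/x.doc"},
    {"ts": 4, "kind": "delete", "path": "/home/b/y.pdf"},
]
pattern = traversal_pattern(ops)
```

The sketch covers the access orders and the unknown-format indication only; an indication of whether a file is encrypted would need an additional content check that is not shown.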

In an implementation form of the first aspect, the device is further configured to receive the ransomware notification at the notification point in time from a ransomware scanning entity.

This may make it possible to integrate ransomware detection tools with the capabilities of the file level CDP system for file recovery.

In an implementation form of the first aspect, the ransomware scanning entity is configured to detect the ransomware in the production site, and generate the ransomware notification when the ransomware is detected in the production site.

In an implementation form of the first aspect, the device is further configured to send the traversal pattern of the ransomware to the ransomware scanning entity.

In an implementation form of the first aspect, the ransomware scanning entity is updated based on the received traversal pattern of the ransomware.

This provides the advantage of enhancing the available information and understanding of known ransomwares or of a new ransomware, which can be used by the scanning entity in order to avoid and/or detect future infections by the same ransomware or by a similar ransomware.

In an implementation form of the first aspect, the ransomware scanning entity is implemented inside the production site or outside the production site.

In an implementation form of the first aspect, the device is implemented inside the production site or outside the production site.

In the first aspect and its implementations, the functions described may be implemented in hardware, software, firmware, or any combination thereof.

According to a second aspect, a method for file recovery at a production site using a file level CDP system is provided. The CDP system includes a plurality of snapshots of a replica copy of a file system, each snapshot generated at a different point in time, and a journal comprising one or more sets of file operations, each set of file operations performed in a time range between two snapshots. The method includes: receiving a ransomware notification at a notification point in time, where the ransomware notification indicates that one or more files of user data at the production site are infected by a ransomware; determining a point in time at which the one or more files are not infected by the ransomware and determining a notification time range; replaying, starting from the snapshot of the replica copy of the file system generated at the determined point in time, one or more operations from the set of file operations performed in the notification time range to obtain a latest version of the one or more files of user data; and restoring the file system by restoring the one or more files of the user data at the production site to the corresponding latest version. The determined point in time is earlier than the notification point in time and the notification time range is a time difference between the determined point in time and the notification point in time.

This provides the advantage of avoiding the loss of important data due to a ransomware attack by leveraging the ability of file level CDP to track the full life cycle of each file, from its creation until its deletion, including also all the changes performed to the file. Further advantageously, it is possible to restore each of the one or more files of user data in an automatic and efficient manner without data loss, by generating the latest version of each of the one or more files of the user data obtained by replaying one or more file operations performed in the notification time range.

In an implementation form of the second aspect, the latest version of a file of user data includes a respective clear file obtained after replaying the one or more file operations from the set of file operations performed in the notification time range, where the clear file is the file of the user data that is not infected by the ransomware.

This provides the advantage that each file of the one or more files of the user data can be individually restored from its corresponding clean version in which the file is not infected by the ransomware. In other words, the method makes it possible to individually recover the most up-to-date version of each file of user data.

In an implementation form of the second aspect, replaying the one or more file operations from the set of file operations performed in the notification time range to obtain the latest version of the file of user data, includes: determining whether one or more file operations of the set of file operations performed in the notification time range are legal operations; ignoring each file operation that is not a legal operation; and replaying the determined one or more legal operations.

In an implementation form of the second aspect, a legal operation is a file operation that is not performed by the ransomware.

That is, a legal operation may refer to any operation performed by the user on the file in the notification time range.

This makes it possible to selectively replay the one or more file operations from the set of file operations performed in the notification time range; thereby, only the operations that were not performed by the ransomware are replayed, ensuring that a clean version of the file is obtained after replaying the operations. This further enhances the efficiency and accuracy of file recovery.

In an implementation form of the second aspect, when a determined non legal operation includes that the ransomware deleted the one or more files of user data after writing them in an infected file, the replaying the determined one or more legal operations includes: replaying the one or more legal operations performed in the notification time range before the one or more files are deleted by the ransomware; and the latest version of the file of user data comprises the file of user data obtained after replaying the one or more legal operations before the file was deleted by the ransomware.

This provides the advantage of preventing the deletion of the original file by the ransomware, as the illegal delete operation is ignored, and may further prevent overwriting a file with encrypted data.

In an implementation form of the second aspect, the method further includes retrieving, from the CDP journal, the set of file operations performed in the notification time range.

In an implementation form of the second aspect, the method further includes determining a traversal pattern of the ransomware.

By leveraging the ability of the file level CDP system to track the full life cycle of each file, the ransomware behavior may be traced back.

In an implementation form of the second aspect, the traversal pattern of the ransomware comprises at least one of: an order of access to file directories; an order of access to each of the one or more files of the user data; information that indicates if each of the one or more files of the user data is encrypted; and information indicating if one or more files of the user data in the CDP system is created with an unknown data format.

This makes it possible to understand the manner in which the ransomware works while attacking the one or more files of the user data at the production site.

In an implementation form of the second aspect, the method further includes receiving the ransomware notification at the notification point in time from a ransomware scanning entity.

This may make it possible to integrate ransomware detection tools with the capabilities of the file level CDP system for file recovery.

In an implementation form of the second aspect, the ransomware scanning entity is configured to detect the ransomware in the production site, and generate the ransomware notification when the ransomware is detected in the production site.

In an implementation form of the second aspect, the method further includes sending the traversal pattern of the ransomware to the ransomware scanning entity.

In an implementation form of the second aspect, the ransomware scanning entity is updated based on the received traversal pattern of the ransomware.

This provides the advantage of enhancing the available information and understanding of known ransomwares or of a new ransomware, which can be used by the scanning entity in order to avoid and/or detect future infections by the same ransomware or by a similar ransomware.

In an implementation form of the second aspect, the ransomware scanning entity is implemented inside the production site or outside the production site.

The method according to the second aspect comprises the features of the corresponding implementation forms of the device of the first aspect.

According to a third aspect, a computer program product is provided, including a program code for carrying out, when implemented on a processor, the method according to the second aspect and its implementation forms.

The computer program product according to the third aspect comprises the features of the corresponding implementation forms of the method of the second aspect. The method according to the second aspect and the computer program product according to the third aspect and their implementation forms provide the same advantages and effects as described above for the device of the first aspect and its respective implementation forms.

It should be noted that all devices, elements, units and means described in the present application could be implemented in software or hardware elements or any kind of combination thereof. All steps which are performed by the various entities described in the present application as well as the functionalities described to be performed by the various entities are intended to mean that the respective entity is adapted to or configured to perform the respective steps and functionalities. Even if, in the following description of specific embodiments, a specific functionality or step to be performed by external entities is not reflected in the description of a specific detailed element of that entity which performs that specific step or functionality, it should be clear for a skilled person that these methods and functionalities can be implemented in respective software or hardware elements, or any kind of combination thereof.

BRIEF DESCRIPTION OF DRAWINGS

The above described aspects and implementation forms will be explained in the following description of specific embodiments in relation to the enclosed drawings, in which:

FIG. 1 shows a device for file recovery at a production site using a file level CDP system according to this disclosure;

FIG. 2 schematically shows the replaying of one or more file operations, according to this disclosure;

FIG. 3 shows a device for file recovery at a production site using a CDP system according to this disclosure;

FIG. 4 shows a device for file recovery at a production site using a CDP system according to this disclosure;

FIG. 5 shows a method for file recovery at a production site using a CDP system according to this disclosure.

DETAILED DESCRIPTION OF EMBODIMENTS

FIG. 1 shows an exemplary embodiment of a device 100 for file recovery at a production site 110 according to this disclosure. The device 100 uses a file level CDP system 120 that comprises a plurality of snapshots 122 of a replica copy of a file system, where each snapshot 122 is generated at a different point in time, and a journal 124. The journal 124 comprises one or more sets of file operations 126, and each set of file operations 126 is performed in a time range spanned between two snapshots 122.
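As a rough illustration of the data layout described above, the following Python sketch models the snapshots 122 and the journal 124. All class, field and method names (`FileOp`, `Snapshot`, `CdpSystem`, `ops_between`) are hypothetical and are not taken from this disclosure. File contents are held as in-memory bytes purely for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class FileOp:
    timestamp: float      # when the operation was journaled
    kind: str             # e.g. "write", "rename", "unlink", "read"
    path: str
    data: bytes = b""


@dataclass
class Snapshot:
    timestamp: float
    files: Dict[str, bytes]   # path -> contents in the replica copy


@dataclass
class CdpSystem:
    snapshots: List[Snapshot] = field(default_factory=list)
    journal: List[FileOp] = field(default_factory=list)

    def ops_between(self, start: float, end: float) -> List[FileOp]:
        """Return journaled operations with start < timestamp <= end, oldest first."""
        selected = [op for op in self.journal if start < op.timestamp <= end]
        return sorted(selected, key=lambda op: op.timestamp)


cdp = CdpSystem(
    snapshots=[Snapshot(10.0, {"A.txt": b"v1"}), Snapshot(20.0, {"A.txt": b"v2"})],
    journal=[FileOp(15.0, "write", "A.txt", b"v2"), FileOp(25.0, "read", "A.txt")],
)
between = cdp.ops_between(10.0, 20.0)   # operations journaled between the two snapshots
```

Keeping the journal as a flat, timestamped list is only one possible choice. The disclosure describes sets of file operations per snapshot interval, which `ops_between` recovers by filtering on time.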

Hereinafter in this disclosure, the terms clear file and clean file may be used interchangeably. Additionally, the terms file level CDP and CDP may be used interchangeably.

The device 100 may be included in, or may be part of, the production site 110. Alternatively, the device 100 may be implemented outside the production site 110.

The device 100 is configured to receive a ransomware notification 101 at a notification point in time. The ransomware notification 101 indicates that one or more files of user data 103 at the production site 110 are infected by a ransomware.

The device 100 may receive the ransomware notification 101, for example but not as a limitation, from the production site 110. That is, the production site 110 may support ransomware detection in user data, or the production site 110 may be in communication with a ransomware scanning entity. As a further example, the device 100 may receive the ransomware notification 101 from a ransomware scanning entity, as will be explained later in this disclosure.

Since a ransomware customarily creates encrypted copies of one or more files 103 of the user data and subsequently deletes the original (non-encrypted) files 103, the ransomware notification 101 may indicate that a ransomware attack was performed in the user data if, for example, one or more files 103 of the user data were found to be encrypted. In another example, the ransomware notification 101 may indicate that a ransomware attack was performed in the user data by detecting that the one or more files 103 have a suspicious suffix or a suspicious file extension.

The suspicious suffix or the suspicious file extension may comprise a suffix or a file extension that is known for conventional ransomwares. Alternatively, the suspicious suffix or the suspicious file extension may comprise an unknown suffix or an unknown file extension.

The device 100 is configured to determine a point in time at which the one or more files 103 are not infected by the ransomware. The determined point in time is earlier than the notification point in time. Next, the device 100 is configured to determine a notification time range, where the notification time range is a time difference between the determined point in time and the notification point in time. That is, the notification time range may refer to a time range spanned between the point in time at which the file 103 was not infected by the ransomware and the time at which it was found to be infected.

Then, the device 100 may be configured to retrieve, from the CDP journal 124, the set of file operations 126 that were performed in the notification time range.
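The determination of the clean point in time, the notification time range and the retrieval of the journaled operations could be sketched as follows. This disclosure does not prescribe how the clean point is found. The sketch assumes, purely for illustration, that snapshots are checked from newest to oldest with a caller-supplied predicate. The names `find_clean_point`, `ops_in_range` and `looks_infected` are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Snapshot = Tuple[float, Dict[str, bytes]]   # (timestamp, path -> contents)
JournalOp = Tuple[float, str, str]          # (timestamp, kind, path)


def find_clean_point(
    snapshots: List[Snapshot],
    path: str,
    notification_time: float,
    looks_infected: Callable[[bytes], bool],
) -> Optional[float]:
    """Return the newest snapshot time before the notification at which
    `path` exists and is not flagged by `looks_infected`, or None."""
    for ts, files in sorted(snapshots, key=lambda s: s[0], reverse=True):
        if ts >= notification_time:
            continue
        content = files.get(path)
        if content is not None and not looks_infected(content):
            return ts
    return None


def ops_in_range(
    journal: List[JournalOp], clean_time: float, notification_time: float
) -> List[JournalOp]:
    """Journaled operations performed within the notification time range."""
    ordered = sorted(journal, key=lambda op: op[0])
    return [op for op in ordered if clean_time < op[0] <= notification_time]


snapshots = [
    (10.0, {"A.txt": b"hello"}),
    (20.0, {"A.txt": b"hello again"}),
    (30.0, {"A.txt": b"ENCRYPTED"}),
]
journal = [(12.0, "write", "A.txt"), (25.0, "write", "A.txt"), (28.0, "write", "A.txt")]

notification_time = 35.0
clean_time = find_clean_point(
    snapshots, "A.txt", notification_time, lambda c: c.startswith(b"ENC")
)
time_range = notification_time - clean_time   # the notification time range
selected = ops_in_range(journal, clean_time, notification_time)
```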

Further, the device 100 is configured to replay, starting from the snapshot 122 of the replica copy of the file system that was generated at the determined point in time, one or more file operations from the set of file operations 126 that were performed within the notification time range to obtain a latest version of the one or more files of user data 105.

The latest version of each file of user data 105 comprises a respective clear file obtained after replaying the one or more file operations from the set of file operations 126 performed in the notification time range, where the clear file is the file of the user data 103 that is not infected by the ransomware.

The device 100 is further configured to restore the file system by restoring the one or more files of the user data 103 at the production site to the corresponding latest version 105.

Thereby, the device 100 leverages the ability of file level CDP to track the full life cycle of each file and replays, on a replica copy, the file operations that may have been performed by the user on each file, starting from a point in time at which the file is not infected until the point in time at which the ransomware attacked it. In this manner, this disclosure makes it possible to generate and recover the latest copy of the clean (i.e., not infected by the ransomware) file, instead of simply choosing an earlier clean copy of the file from a snapshot 122 generated at an earlier point in time, which could be much older than the point in time at which the ransomware infection occurred.

Notably, as the CDP system 120 comprises a plurality of snapshots 122 of a replica copy of the file system that are generated at different points in time, the latest version 105 of each file of user data can alternatively be generated and recovered starting from a point in time that may differ from the point in time used for another file. This further allows specific files, rather than the whole system, to be recovered, enhancing the efficiency and speed of the device 100.

For example, the device 100 may determine that a first file of the user data is not infected by the ransomware at a first point in time, and a second file of the user data is not infected by the ransomware at a second point in time, where the first point in time and the second point in time are different from each other, and the first point in time and the second point in time are earlier than the notification point in time.

Then, the device 100 may determine a first notification time range, being the time difference between the determined first point in time and the notification point in time. Further, the device 100 may determine a second notification time range, being the time difference between the determined second point in time and the notification point in time.

Next, the device 100 may be configured to replay, starting from a first snapshot 122 of the replica copy of the file system generated at the determined first point in time, one or more file operations from the set of file operations 126 performed in the first notification time range to obtain the latest version of the first file of user data 105, as explained above. The device 100 may be further configured to replay, starting from a second snapshot 122 of the replica copy of the file system generated at the determined second point in time, one or more file operations from the set of file operations 126 performed in the second notification time range to obtain the latest version of the second file of user data 105. This process can be performed for each of the files of user data, so that a single file can be restored at a time.
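The per-file recovery described above, in which each file starts from its own clean snapshot, might look like the following sketch. The function `recover_file`, the tuple layout of the operations and the `legal` predicate are assumptions made for illustration. A write is modeled as replacing the whole file, which is a simplification.

```python
from typing import Callable, Dict, List, Optional, Tuple

Op = Tuple[float, str, str, bytes]   # (timestamp, kind, path, data)


def recover_file(
    path: str,
    snapshot_files: Dict[str, bytes],
    journal: List[Op],
    clean_time: float,
    notification_time: float,
    is_legal: Callable[[Op], bool],
) -> Optional[bytes]:
    """Replay the legal operations on one file, starting from its clean snapshot."""
    content = snapshot_files.get(path)
    for op in sorted(journal, key=lambda o: o[0]):
        ts, kind, op_path, data = op
        if op_path != path or not (clean_time < ts <= notification_time):
            continue
        if not is_legal(op):
            continue                  # operations attributed to the ransomware are skipped
        if kind == "write":
            content = data            # simplification: a write replaces the whole file
        elif kind == "delete":
            content = None
    return content


snapshots = {
    10.0: {"first.txt": b"f1", "second.txt": b"s1"},
    20.0: {"first.txt": b"f2", "second.txt": b"BAD"},
}
journal = [
    (15.0, "write", "first.txt", b"f2"),
    (16.0, "write", "second.txt", b"s2"),
    (18.0, "write", "second.txt", b"BAD"),   # hypothetical ransomware write
    (25.0, "write", "first.txt", b"f3"),
    (27.0, "write", "first.txt", b"BAD"),    # hypothetical ransomware write
]
legal = lambda op: op[3] != b"BAD"

# Each file has its own clean point in time, hence its own notification time range.
clean_points = {"first.txt": 20.0, "second.txt": 10.0}
restored = {
    p: recover_file(p, snapshots[t], journal, t, 30.0, legal)
    for p, t in clean_points.items()
}
```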

The device 100 makes it possible to restore the full file system from a clean copy of the file system (i.e., from a snapshot 122) at its corresponding point in time, by subsequently replaying the one or more file operations from the set of file operations 126 performed in the notification time range to obtain the latest version of the files of user data 105.

Thus, this disclosure leverages the ability of file level CDP to track the full life cycle of each file.

To replay the one or more file operations from the set of file operations 126 performed in the notification time range and obtain a latest version of the one or more files of user data 105, the device 100 is configured to determine whether one or more file operations of the set of file operations 126 performed in the notification time range are legal operations. Further, the device 100 is configured to ignore each file operation that is not a legal operation. The device 100 is further configured to replay the determined one or more legal operations.

Notably, a legal operation is a file operation that is not performed by the ransomware. Accordingly, an illegal operation is a file operation performed by the ransomware.

Thereby, the device 100 according to this disclosure selectively replays the file operations performed in the notification time range. That is, the device 100 only replays the file operations that were not performed by the ransomware during the time range spanned between the determined point in time and the notification point in time. This ensures that a clean copy of the one or more files of user data 105 is indeed generated by replaying the file operations.
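A minimal sketch of such a selective replay is given here, assuming that the file system is modeled as a dictionary and that a caller-supplied predicate decides which operations are legal. The name `selective_replay` and the operation encoding are hypothetical. For a rename, the payload is assumed to carry the new path.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Op = Tuple[str, str, bytes]   # (kind, path, payload)


def selective_replay(
    snapshot_files: Dict[str, bytes],
    ops: Iterable[Op],
    is_legal: Callable[[Op], bool],
) -> Tuple[Dict[str, bytes], List[Op]]:
    """Apply only the legal operations to a copy of the snapshot.

    Returns the resulting file system state and the list of ignored operations."""
    fs = dict(snapshot_files)          # the snapshot itself is never mutated
    ignored: List[Op] = []
    for op in ops:
        if not is_legal(op):
            ignored.append(op)
            continue
        kind, path, payload = op
        if kind == "write":
            fs[path] = payload
        elif kind == "delete":
            fs.pop(path, None)
        elif kind == "rename" and path in fs:
            fs[payload.decode()] = fs.pop(path)   # payload holds the new path
        # reads and unknown kinds leave the state unchanged
    return fs, ignored


snapshot = {"doc.txt": b"v1"}
ops = [
    ("write", "doc.txt", b"v2"),
    ("write", "doc.enc", b"\x8f\x02"),        # hypothetical ransomware write
    ("delete", "doc.txt", b""),               # hypothetical ransomware delete
    ("rename", "doc.txt", b"report.txt"),
]
# Toy predicate for this example only: treat ".enc" targets and deletes as illegal.
is_legal = lambda op: not op[1].endswith(".enc") and op[0] != "delete"
state, ignored = selective_replay(snapshot, ops, is_legal)
```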

Each set of file operations 126 may comprise, for example but not limited to, at least one of the following file operations:

• Write

• Rename

• Link

• Unlink

• Copy_file_range

• Create file/directory/symlink

• Truncate

• Change attributes/xattributes

• Read

Conventional file level CDP systems do not track read operations. However, as noted above, each set of file operations 126 may also comprise read operations. This is beneficial, as the accuracy of file recovery is enhanced.

As mentioned above, conventional ransomwares create (write) an encrypted copy of a file of user data 103 and afterwards delete the file 103, leaving only the encrypted version at the production site 110. Further, ransomwares typically encrypt specific file types, and traverse the files in a well-defined order following, for example, a Breadth First Search (BFS) algorithm or a Depth First Search (DFS) algorithm. Further, in some cases, a ransomware writes the encrypted copies of the files with a specific suffix, or uses random suffixes.

Thus, when replaying the one or more file operations from the set of file operations 126 performed in the notification time range, the device 100 may be configured to selectively apply the legal file system operations in the following exemplary manner:

1. Delete: If a file delete looks suspicious, i.e., if it may be suspected to be performed by a ransomware, the delete operation may be identified as a non-legal operation and, thus, it may not be replayed. That is, the file 103 may not be deleted. The device 100 may be configured to move the file 103 to a suspected deletion. Alternatively, the device 100 may keep the file 103 in place.

2. Write: If the write is of data that is encrypted, then the write operation may be determined as non-legal and may be therefore ignored. That is, the device 100 may not replay this write operation.

3. File create: If a created file is suspected to be an encrypted copy of a file of user data 103, the create operation may be determined as non-legal and may be ignored. Thereby, the device 100 may block the file creation and, moreover, can ignore subsequent write operations, performed on the encrypted file, if any.
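The three rules above can be approximated with simple heuristics. The sketch below is one possible interpretation and not the method of this disclosure: it assumes that encrypted data can be flagged by a known suffix or by high byte entropy, and that a delete is suspicious when an encrypted-looking file with the same base name was written earlier. The suffix set, the threshold values and all function names are hypothetical. Compressed data also has high entropy, so a real detector would need further signals.

```python
import math
from collections import Counter
from pathlib import PurePosixPath
from typing import List, Tuple

Op = Tuple[str, str, bytes]                 # (kind, path, data)
SUSPICIOUS_SUFFIXES = {".enc", ".locked"}   # hypothetical examples
ENTROPY_THRESHOLD = 7.5                     # bits per byte, hypothetical
MIN_SAMPLE = 256                            # entropy is unreliable on tiny buffers


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte."""
    if not data:
        return 0.0
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())


def looks_encrypted(path: str, data: bytes) -> bool:
    if PurePosixPath(path).suffix in SUSPICIOUS_SUFFIXES:
        return True
    return len(data) >= MIN_SAMPLE and shannon_entropy(data) > ENTROPY_THRESHOLD


def classify(ops: List[Op]) -> List[bool]:
    """Return one verdict per operation: True if it is treated as legal."""
    encrypted_stems = set()
    verdicts = []
    for kind, path, data in ops:
        stem = str(PurePosixPath(path).with_suffix(""))
        if kind in ("create", "write") and looks_encrypted(path, data):
            encrypted_stems.add(stem)      # rules 2 and 3: ignore create/write
            verdicts.append(False)
        elif kind == "delete" and stem in encrypted_stems:
            verdicts.append(False)         # rule 1: suspicious delete
        else:
            verdicts.append(True)
    return verdicts


ops = [
    ("write", "notes.txt", b"plain text"),
    ("create", "notes.enc", b""),
    ("write", "notes.enc", bytes(range(256)) * 2),
    ("delete", "notes.txt", b""),
    ("delete", "other.txt", b""),
]
verdicts = classify(ops)
```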

In the exemplary embodiment of FIG. 1, when a determined non-legal operation comprises that the ransomware deleted the one or more files of user data 103 after writing each one of them in an infected file, for example but not limited to, in an encrypted file, the replaying the determined one or more legal operations comprises replaying the one or more legal operations performed in the notification time range before the one or more files 103 are deleted by the ransomware. Further, the latest version of the one or more files of user data 105 comprises the file of user data obtained after replaying the one or more legal operations before the file was deleted by the ransomware.

Thereby, the device 100 may prevent the deletion of the one or more original files 103 by the ransomware, as the illegal delete operations are ignored, and may further prevent overwriting a file 103 with encrypted data. The device 100 may replay the one or more file operations from the set of file operations 126 that were performed within the notification time range by using, for example and not as a limitation, a re-player. The re-player may be part of the device 100 or the re-player may be part of the file level CDP 120.

In case that the re-player is part of the file level CDP 120, the device 100 may be configured to enable the CDP system 120 to replay, starting from the snapshot 122 of the replica copy of the file system generated at the determined point in time, the one or more file operations from the set of file operations 126 performed in the notification time range to obtain a latest version of the one or more files of user data 105, as explained before.

Further, the device 100 can be implemented inside the production site 110 or outside the production site 110.

The device 100 according to this disclosure may comprise a processor or processing circuitry (not shown) configured to perform, conduct or initiate the various operations of the device 100 described herein. The processing circuitry may comprise hardware and/or the processing circuitry may be controlled by software. The hardware may comprise analog circuitry or digital circuitry, or both analog and digital circuitry. The digital circuitry may comprise components such as application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), or multi-purpose processors. The device 100 may further comprise memory circuitry, which stores one or more instruction(s) that can be executed by the processor or by the processing circuitry, in particular under control of the software. For instance, the memory circuitry may comprise a non-transitory storage medium storing executable software code which, when executed by the processor or the processing circuitry, causes the various operations of the device 100 to be performed. The processing circuitry may comprise one or more processors and a non-transitory memory connected to the one or more processors. The non-transitory memory may carry executable program code which, when executed by the one or more processors, causes the device 100 to perform, conduct or initiate the operations or methods described herein.

The present disclosure is discussed with respect to the following example. In this example, the user data at the production site may comprise three exemplary files 103:

1. A.txt

2. B.txt

3. C.txt

The ransomware notification 101 may indicate that one or more files 103 of the user data were encrypted by a ransomware at the notification point in time. For example, the ransomware notification 101 may indicate that exemplary files A.enc, B.enc and C.enc are found at the production site 110. In this disclosure, the file extension "enc" indicates that the respective file is encrypted, and the file extension "txt" indicates that the respective file may not be encrypted. Although in this example it is considered that a file with an extension different than "enc" may not be encrypted by the ransomware, it is to be mentioned that some ransomwares may encrypt the content of a file while keeping its original name and respective suffix.

Then, the device 100 may determine a point in time, earlier than the notification point in time, at which each of the files 103 corresponding to A.enc, B.enc and C.enc is not infected by the ransomware. Further, the device 100 may determine a notification time range, where the notification time range is the time difference between the determined point in time and the notification point in time.

A first snapshot 122 of a replica copy of the file system was generated by the CDP system 120 at the determined point in time, and a second snapshot 122 of a replica copy of the file system was generated by the CDP system 120 at the notification point in time. Thus, the device 100 may retrieve, from the CDP journal 124, the set of file operations 126 performed in the notification time range. That is, the device 100 may determine the set of file operations 126 performed on the files of user data 103 between the first snapshot 122 and the second snapshot 122.

The set of file operations 126 performed in the notification time range may comprise, for example and not as a limitation, the following operations:

1. Read A.txt

2. Write A.enc

3. Write to B.txt

4. Delete A.txt

5. Write to C.txt

6. Read B.txt

7. Write B.enc

8. Delete B.txt

9. Read C.txt

10. Write C.enc

11. Delete C.txt

It can be noticed that, in this example, there is a ransomware writing the three exemplary encrypted files A.enc, B.enc and C.enc and subsequently deleting the original files A.txt, B.txt and C.txt. Thus, the device 100 may determine that the file operations write A.enc, write B.enc and write C.enc are not legal operations, which may be ignored. That is, the device 100 may not replay the operations that result in the encrypted (infected) files.

Moreover, the device 100 may determine that the later file operations delete A.txt, delete B.txt and delete C.txt are also not legal operations, as they may be performed by the ransomware after encrypting the respective files. Then, the device 100 may not replay said illegal delete operations.

Notably, while the ransomware is running, new write operations arrive at the files B.txt and C.txt (operations 3 and 5 above).

Then, in this example, the device 100 may be configured to ignore all the operations performed to the "enc" files, and may replay only the write operations to the clear files, so that the file operations that may be replayed are:

1. Write to B.txt

2. Write to C.txt

The rest of the exemplary file operations may be ignored.
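The selection in this example can be reproduced with a few lines of Python. The sketch assumes, as the example does, that the "enc" extension marks an encrypted file, and that a delete is illegal when an "enc" file with the same base name was written earlier. Read operations are skipped because they do not change the file system state. The function name `operations_to_replay` is hypothetical.

```python
from pathlib import PurePosixPath
from typing import List, Tuple

Op = Tuple[str, str]   # (kind, path)

# The eleven journaled operations of the example, in order.
journal: List[Op] = [
    ("read", "A.txt"), ("write", "A.enc"), ("write", "B.txt"), ("delete", "A.txt"),
    ("write", "C.txt"), ("read", "B.txt"), ("write", "B.enc"), ("delete", "B.txt"),
    ("read", "C.txt"), ("write", "C.enc"), ("delete", "C.txt"),
]


def operations_to_replay(ops: List[Op]) -> List[Op]:
    encrypted_stems = set()
    replayable = []
    for kind, path in ops:
        p = PurePosixPath(path)
        if p.suffix == ".enc":
            encrypted_stems.add(p.stem)    # any operation on an "enc" file is ignored
            continue
        if kind == "delete" and p.stem in encrypted_stems:
            continue                       # delete that follows the encrypted copy
        if kind == "read":
            continue                       # reads do not alter the replica
        replayable.append((kind, path))
    return replayable


replayable = operations_to_replay(journal)
print(replayable)   # [('write', 'B.txt'), ('write', 'C.txt')]
```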

FIG. 2 schematically depicts an example of the replaying of one or more file operations, according to this disclosure. Same elements are labelled with the same reference signs.

In FIG. 2, a set of file operations 126 that have been performed on one or more of the files of user data 103 in the determined notification time range is shown. The set of file operations 126 comprises, by way of example, a write operation, a delete operation, a rename operation and a further write operation. Then, the device 100 may selectively replay, on the snapshot of the replica copy of the file system taken at the production site 110, the set of file operations 126. This is depicted as "target FS" in FIG. 2. Thereby, the latest version of each file of user data 103 may be generated and further restored at the production site 110. By restoring the one or more files of user data 103, the entire file system is also restored.

In the example of FIG. 2, the device 100 may replay the set of file operations 126 by using an exemplary re-player 228.

FIG. 3 shows an exemplary embodiment of a device 100 for file recovery at a production site 110 according to this disclosure, which builds on the device 100 shown in FIG. 1. Same elements are labelled with the same reference signs.

In the exemplary embodiment of FIG. 3, the device is further configured to determine, based on the one or more file operations from the set of file operations 126 in the notification time range, a traversal pattern 307 of the ransomware that infected the one or more files 103.

The traversal pattern 307 of the ransomware comprises at least one of: an order of access to file directories, an order of access to each of the one or more files 103 of the user data infected by the ransomware, information that indicates if each of the one or more files 103 of the user data is encrypted, and information indicating if one or more files 103 of the user data in the CDP system 120 is created with an unknown data format.

The order of access to directories may comprise, for example, whether the ransomware scans a file system following a BFS algorithm or a DFS algorithm, or another algorithm.

The order of access to each of the one or more files 103 of the user data infected by the ransomware may indicate, for example, whether the one or more files 103 are accessed by the ransomware by file size, by file name and/or by file type.

That is, the device 100 may be configured to determine the traversal pattern 307 of the ransomware by analyzing, for each file of user data 103 infected by the ransomware, the one or more file operations from the set of file operations 126 performed in the notification time range. Notably, the device 100 may additionally or alternatively determine the traversal pattern 307 of the ransomware by analyzing, for each file of user data 103 infected by the ransomware, the one or more file operations from the set of file operations 126 performed in the notification time range that are not legal operations.
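One way to estimate the order of access to directories from the journaled operations is sketched here. It is a rough heuristic under stated assumptions, not the analysis prescribed by this disclosure: directories are taken in order of first access, an order with non-decreasing depth is treated as BFS-like, and an order in which the parent of each newly visited directory lies on the path of the previously visited directory is treated as DFS-like (a necessary property of a pre-order walk). All function names are hypothetical.

```python
from pathlib import PurePosixPath
from typing import List


def first_access_order(paths: List[str]) -> List[str]:
    """Directories in the order in which they were first touched."""
    seen: List[str] = []
    for p in paths:
        d = str(PurePosixPath(p).parent)
        if d not in seen:
            seen.append(d)
    return seen


def is_bfs_like(dirs: List[str]) -> bool:
    depths = [len(PurePosixPath(d).parts) for d in dirs]
    return all(a <= b for a, b in zip(depths, depths[1:]))


def is_dfs_like(dirs: List[str]) -> bool:
    for prev, cur in zip(dirs, dirs[1:]):
        parent = PurePosixPath(cur).parent
        prev_path = PurePosixPath(prev)
        # The parent must be the previous directory or one of its ancestors.
        if parent != prev_path and parent not in prev_path.parents:
            return False
    return True


def classify_traversal(paths: List[str]) -> str:
    dirs = first_access_order(paths)
    bfs, dfs = is_bfs_like(dirs), is_dfs_like(dirs)
    if bfs and dfs:
        return "ambiguous"
    if bfs:
        return "bfs"
    if dfs:
        return "dfs"
    return "other"


# Paths of the files touched by the illegal operations, in journal order.
dfs_paths = ["/r/a.txt", "/r/x/b.txt", "/r/x/y/c.txt", "/r/z/d.txt"]
bfs_paths = ["/r/a.txt", "/r/x/b.txt", "/r/z/d.txt", "/r/x/y/c.txt"]
dfs_result = classify_traversal(dfs_paths)
bfs_result = classify_traversal(bfs_paths)
```

Short journals often satisfy both tests, which is why the sketch reports "ambiguous" rather than guessing.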

Thereby, this exemplary embodiment may leverage the ability of the file level CDP to track the full life cycle of each file of user data 103 in order to trace back the behavior of the ransomware.

For example, in case that one or more files 103 of user data are created in the production site 110, the device 100 may determine whether the files 103 are suspected as being infected by a ransomware by detecting one or more write operations of an encrypted file performed in the notification time range, or by detecting one or more write operations of a file having a suspicious suffix or a suspicious file extension performed in the notification time range. The suspicious suffix or the suspicious file extension may comprise a suffix or a file extension that is known for conventional ransomwares, or may comprise an unknown suffix or an unknown file extension.

Thereby, for each of the files 103 infected by the ransomware, the device 100 may determine changes performed by the ransomware in the notification time range, and may further determine a pattern in which the ransomware accesses the file system.

By determining the traversal pattern 307 of the ransomware, this embodiment may allow future detection and blocking of the ransomware by the ransomware scanning entity to be performed effectively and quickly.

FIG. 4 shows an exemplary embodiment of a device 100 for file recovery at a production site 110 according to this disclosure, which builds on the device 100 shown in FIG. 3. Same elements are labelled with the same reference signs.

The device 100 of FIG. 4 further comprises a ransomware scanning entity 430. The ransomware scanning entity 430 may be, for example but not limited to, an online ransomware detection system. Further, the ransomware scanning entity 430 may be included or be part of the production site 110, or may be included or be part of the device 100. The ransomware scanning entity 430 is configured to detect a ransomware in the production site 110. Further, the ransomware scanning entity 430 is configured to generate the ransomware notification 101 when the ransomware is detected in the production site 110, and to send it to the device 100.

That is, in the exemplary embodiment of FIG. 4, the device 100 is configured to receive the ransomware notification 101 from the ransomware scanning entity 430.

Further, the device 100 of FIG. 4 is configured to send the traversal pattern 307 of the ransomware to the ransomware scanning entity 430. Then, the ransomware scanning entity 430 is updated based on the received traversal pattern 307 of the ransomware.

Updating the ransomware scanning entity 430 may comprise, for example, determining relevant information and/or parameters of the ransomware based on the determined traversal pattern 307, and subsequently storing them in the scanning entity 430, so that the ransomware scanning entity 430 may detect and block all the ransomware operations in a further attack by the same ransomware or by a similar ransomware.

In some cases, only an updated version of the ransomware scanning entity 430 may detect the ransomware that attacks the production site 110. Thus, if the ransomware attack occurred before the ransomware scanning entity 430 is able to detect it, the attack will be successful and the user will need to recover the data from the attack, which implies that the user may potentially suffer some damages and/or data loss. Thus, knowing the traversal pattern 307 and updating the ransomware scanning entity 430 is of paramount importance in order to detect and block the ransomware.

Upon updating the ransomware scanning entity 430, the device 100 may be further configured to enable the ransomware scanning entity 430 to re-start its operation at a point in time before the ransomware started the attack (i.e., at a point in time earlier than the notification time). Thereby, the ransomware scanning entity 430 may be able to efficiently detect the ransomware and to block all the ransomware operations.

Thus, this exemplary embodiment provides the advantage of supplying an existing ransomware scanning entity 430 with the traversal pattern 307 and the parameters of a ransomware, and of subsequently re-running the scanning entity 430, allowing immediate ransomware detection. Additionally or alternatively, the device 100 may also use the updated ransomware scanning entity 430 on a retroactively attacked system.

Accordingly, this disclosure may make it possible to integrate ransomware detection tools with the capabilities of the file level CDP system 120 for file recovery at the production site 110.

The advantages of the solutions according to this disclosure may be summarized as follows:

• The device 100 can selectively replay one or more file operations to prevent deletion of files which have been encrypted by the ransomware, and prevent overwriting a file with encrypted data.

• The solutions enable an efficient and automatic recovery of the file system by recovering all the files of user data to their latest versions.

• The solutions may further enable a fast recovery of the files by allowing each file of user data to be recovered and restored individually instead of recovering the whole system.

• The ransomware behavior can be traced, allowing an easier and efficient detection in the future.

FIG. 5 shows an exemplary embodiment of a method 500 for file recovery at a production site 110 using a CDP system 120. The CDP system 120 comprises a plurality of snapshots 122 of a replica copy of a file system, each snapshot 122 generated at a different point in time, and a journal 124 comprising one or more sets of file operations 126, each set of file operations 126 performed in a time range between two snapshots 122.

The method 500 may be performed by the exemplary embodiments of the device 100 as disclosed above.

The method 500 comprises a step S501 of receiving a ransomware notification 101 at a notification point in time. The ransomware notification 101 indicates that one or more files 103 of the user data at the production site 110 are infected by a ransomware.

Further, the method 500 comprises a step S502 of determining a point in time at which the one or more files 103 are not infected by the ransomware, and determining a notification time range. The determined point in time is earlier than the notification point in time. The notification time range is a time difference between the determined point in time and the notification point in time.

The method 500 further comprises a step S503 of replaying, starting from the snapshot 122 of the replica copy of the file system generated at the determined point in time, one or more file operations from the set of file operations 126 performed in the notification time range to obtain a latest version of the one or more files of user data 105.

Further, in step S504, the method 500 comprises restoring the file system by restoring the files of the user data 103 at the production site 110 to the corresponding latest version 105.
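Steps S501 to S504 can be tied together in a short end-to-end sketch. It is illustrative only and rests on several assumptions: snapshots and the production site are dictionaries, the notification lists the original paths to recover, a write replaces the whole file, and the predicates `is_clean` and `is_legal` are supplied by the caller. None of these names come from this disclosure. Cleanup of the encrypted copies left at the production site is not shown.

```python
from typing import Callable, Dict, List, Tuple

Op = Tuple[float, str, str, bytes]   # (timestamp, kind, path, data)


def recover(
    notification: dict,
    snapshots: Dict[float, Dict[str, bytes]],
    journal: List[Op],
    production: Dict[str, bytes],
    is_clean: Callable[[bytes], bool],
    is_legal: Callable[[Op], bool],
) -> Dict[str, bytes]:
    t_notify = notification["time"]                       # S501: notification received
    for path in notification["files"]:
        # S502: newest snapshot before the notification holding a clean copy
        candidates = [
            ts for ts, files in snapshots.items()
            if ts < t_notify and path in files and is_clean(files[path])
        ]
        if not candidates:
            continue                                      # no clean copy available
        t_clean = max(candidates)
        content = snapshots[t_clean][path]
        # S503: replay the legal operations of the notification time range
        for op in sorted(journal, key=lambda o: o[0]):
            ts, kind, op_path, data = op
            if op_path != path or not (t_clean < ts <= t_notify):
                continue
            if is_legal(op) and kind == "write":
                content = data
        production[path] = content                        # S504: restore latest version
    return production


snapshots = {10.0: {"A.txt": b"a1"}, 20.0: {"A.txt": b"a2"}}
journal = [
    (22.0, "write", "A.txt", b"a3"),
    (24.0, "write", "A.txt", b"ENC:xx"),   # hypothetical ransomware write
    (25.0, "delete", "A.txt", b""),        # hypothetical ransomware delete
]
production = {"A.enc": b"ENC:xx"}
notification = {"time": 26.0, "files": ["A.txt"]}
is_clean = lambda c: not c.startswith(b"ENC:")
is_legal = lambda op: op[1] != "delete" and not op[3].startswith(b"ENC:")
result = recover(notification, snapshots, journal, production, is_clean, is_legal)
```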

The method 500 may further comprise actions according to the exemplary embodiments of the device 100. Hence, the method 500 achieves the same advantages as the device 100.

The present disclosure further provides a computer program product comprising a program code for carrying out, when implemented on a processor, the method 500 shown in FIG. 5. The computer program may be included in a computer readable medium of the computer program product. The computer readable medium may comprise essentially any memory, such as a ROM (Read-Only Memory), a PROM (Programmable Read-Only Memory), an EPROM (Erasable PROM), a Flash memory, an EEPROM (Electrically Erasable PROM), or a hard disk drive.

The present disclosure has been described in conjunction with various embodiments as examples as well as implementations. However, other variations can be understood and effected by those persons skilled in the art and practicing the claimed matter, from a study of the drawings, this disclosure and the independent claims. In the claims as well as in the description the word "comprising" does not exclude other elements or steps and the indefinite article "a" or "an" does not exclude a plurality. A single element or other unit may fulfill the functions of several entities or items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used in an advantageous implementation.