Title:
METHOD AND APPARATUS FOR DETERMINING EXISTENCE OF DEPENDENCE VIOLATION, ELECTRONIC DEVICE, MEDIUM AND PROGRAM
Document Type and Number:
WIPO Patent Application WO/2020/058120
Kind Code:
A1
Abstract:
The present invention relates to a method and apparatus for determining existence of a dependence violation, an electronic device, a medium and a program. The method includes: inputting an architecture design document for developing a source code into an entity extraction model to extract a word vector of each word included in the architecture design document, where words include entity words and non-entity words; converting each sentence into sequence data represented by the word vectors of the words according to an order of the words in each sentence of the architecture design document, and respectively inputting the sequence data, obtained by conversion, of each sentence into a relationship extraction model to extract a relationship between all entities in the architecture design document; generating a dependence design rule that represents a relationship between the entities included in the architecture design document based on the extracted relationship between the entities; converting the source code developed based on the architecture design document into a first dependence tree; and comparing the first dependence tree with the dependence design rule to determine whether a dependence violation exists in the source code.

Inventors:
HAN KE (CN)
GAO LIANG (CN)
Application Number:
PCT/EP2019/074490
Publication Date:
March 26, 2020
Filing Date:
September 13, 2019
Assignee:
SIEMENS AG (DE)
International Classes:
G06F17/27
Other References:
SAMRONGSAP PARINYA ET AL: "A tool for detecting dependency violation of layered architecture in source code", 2014 INTERNATIONAL COMPUTER SCIENCE AND ENGINEERING CONFERENCE (ICSEC), IEEE, 30 July 2014 (2014-07-30), pages 130 - 133, XP032697161, DOI: 10.1109/ICSEC.2014.6978182
LI YANG ET AL: "Extracting features from requirements: Achieving accuracy and automation with neural networks", 2018 IEEE 25TH INTERNATIONAL CONFERENCE ON SOFTWARE ANALYSIS, EVOLUTION AND REENGINEERING (SANER), IEEE, 20 March 2018 (2018-03-20), pages 477 - 481, XP033343033, DOI: 10.1109/SANER.2018.8330243
"RATIONAL ROSE/C++", RATIONAL ROSE/CC++, XX, XX, 1 January 1996 (1996-01-01), pages 5 - 51, 53, XP002941063
Attorney, Agent or Firm:
ISARPATENT - PATENT- UND RECHTSANWÄLTE BEHNISCH BARTH CHARLES HASSA PECKMANN UND PARTNER MBB (DE)
Claims:
CLAIMS

1. A method for determining existence of a dependence violation in a source code, comprising:

inputting an architecture design document for developing a source code into a pre-stored entity extraction model to extract a word vector of each word included in the architecture design document, wherein words include entity words and non-entity words; converting each sentence into sequence data represented by the word vectors of the words according to an order of the words in each sentence of the architecture design document, and respectively inputting the sequence data, obtained by conversion, of each sentence into a pre-stored relationship extraction model to extract a relationship between all entities in the architecture design document;

generating a dependence design rule that represents a relationship between the entities included in the architecture design document based on the extracted relationship between the entities;

converting the source code developed based on the architecture design document into a first dependence tree; and

comparing the first dependence tree with the dependence design rule to determine whether a dependence violation exists in the source code.

2. The method according to claim 1, wherein

after the step of generating a dependence design rule that indicates a relationship between the entities included in the architecture design document, the method further comprises: converting the dependence design rule into a second dependence tree;

the step of comparing the first dependence tree with the dependence design rule comprises: comparing the first dependence tree with the second dependence tree.

3. The method according to claim 1, wherein the dependence design rule is formatted into a triple form, and the triple form comprises an entity 1, an entity 2 and a relationship between the entity 1 and the entity 2.

4. The method according to any one of claims 1 to 3, wherein the step of converting the source code developed based on the architecture design document into a first dependence tree comprises:

scanning the source code by using a static code analysis tool to obtain a dependence structure matrix, and building the dependence tree that represents a dependence relationship between the entities included in the architecture design document based on the dependence structure matrix.

5. The method according to claim 4, wherein the step of scanning the source code comprises: scanning the source code stored in a file system or scanning the source code stored in a version control system.

6. The method according to any one of claims 1 to 5, wherein the entity extraction model and the relationship extraction model are obtained by pre-training a plurality of tagged architecture design documents serving as training data sets by using a neural network.

7. An apparatus (500, 600) for determining existence of a dependence violation in a source code, comprising:

an entity extraction unit (502, 602), configured to input an architecture design document for developing a source code into a pre-stored entity extraction model to extract a word vector of each word included in the architecture design document, wherein words include entity words and non-entity words;

a relationship extraction unit (504, 604), configured to convert each sentence into sequence data represented by the word vectors of the words according to an order of the words in each sentence of the architecture design document, and respectively input the sequence data, obtained by conversion, of each sentence into a pre-stored relationship extraction model to extract a relationship between all entities included in the architecture design document;

a dependence design rule generation unit (506, 606), configured to generate a dependence design rule that represents a relationship between the entities included in the architecture design document based on the extracted relationship between the entities;

a first dependence tree conversion unit (508, 608), configured to convert the source code developed based on the architecture design document into a first dependence tree; and

a dependence violation determination unit (510, 610), configured to compare the first dependence tree with the dependence design rule to determine whether a dependence violation exists in the source code.

8. An electronic device (700), comprising

at least one processor (702); and a memory (704) coupled to the at least one processor (702), wherein the memory (704) has an instruction stored therein, and when the instruction is executed by the at least one processor (702), the electronic device (700) is enabled to implement the method according to any one of claims 1 to 6.

9. A non-transient machine readable medium, having a computer executable instruction stored therein, wherein when the computer executable instruction is executed, at least one processor is enabled to implement the method according to any one of claims 1 to 6.

10. A computer program, comprising a computer executable instruction, wherein when the computer executable instruction is executed, at least one processor is enabled to implement the method according to any one of claims 1 to 6.

Description:
METHOD AND APPARATUS FOR DETERMINING EXISTENCE OF DEPENDENCE VIOLATION, ELECTRONIC DEVICE, MEDIUM AND PROGRAM

BACKGROUND

Technical Field

The present invention generally relates to the field of software engineering, in particular, to a method and apparatus for determining existence of a dependence violation in a source code, an electronic device, a computer readable medium and a program.

Related Art

Software architecture design documents generally use a natural language to define dependence relationships between entities such as modules, components, layers, categories, methods and subsystems. However, these dependence relationships may be changed in the software code developed based on the software architecture design documents.

It is desirable to provide a method for checking whether the software code developed based on the software architecture design documents has a dependence violation in comparison with the software architecture design documents.

SUMMARY

A brief summary of the present invention is set forth below in order to provide basic understandings of certain aspects of the present invention. It should be understood that this summary is not an exhaustive overview of the present invention. It is neither intended to determine key or crucial portions of the present invention, nor intended to limit the scope of the present invention. The objective of the summary is only to provide some concepts in a simplified way, and this is used as an introduction of more detailed descriptions of subsequent discussions.

According to one aspect of the present invention, a method for determining existence of a dependence violation in a source code includes: inputting an architecture design document for developing a source code into a pre-stored entity extraction model to extract a word vector of each word included in the architecture design document, where words include entity words and non-entity words; converting each sentence into sequence data represented by the word vectors of the words according to an order of the words in each sentence of the architecture design document, and respectively inputting the sequence data, obtained by conversion, of each sentence into a pre-stored relationship extraction model to extract a relationship between all entities in the architecture design document; generating a dependence design rule that represents a relationship between the entities included in the architecture design document based on the extracted relationship between the entities; converting the source code developed based on the architecture design document into a first dependence tree; and comparing the first dependence tree with the dependence design rule to determine whether a dependence violation exists in the source code.

In this way, by comparing the dependence tree converted based on the source code with the dependence design rule generated based on the architecture design document, whether the relationship between the entities implemented by the source code and the relationship between the entities defined in the architecture design document are inconsistent can be determined. That is, whether the dependence violation exists is determined.

Preferably, in one example of the above aspect, after the step of generating a dependence design rule that indicates a relationship between the entities included in the architecture design document, the method further includes: converting the dependence design rule into a second dependence tree. The step of comparing the first dependence tree with the dependence design rule includes: comparing the first dependence tree with the second dependence tree.

In this way, converting the dependence design rule into the second dependence tree and then comparing the first dependence tree with the second dependence tree makes the comparison more visual and convenient.

Preferably, in one example of the above aspect, the dependence design rule is formatted into a triple form (an entity 1, an entity 2 and a relationship between the entity 1 and the entity 2).

In this way, the dependence design rule in the architecture design document can be expressed in a clearer form.

Preferably, in one example of the above aspect, the step of converting the source code developed based on the architecture design document into a first dependence tree includes: scanning the source code by using a static code analysis tool to obtain a dependence structure matrix, and building the dependence tree that represents a dependence relationship between the entities included in the architecture design document based on the dependence structure matrix. The entities may include modules, components, layers, categories, methods, subsystems and the like. In this way, the source code developed based on the architecture design document may be converted into the dependence tree, so as to further facilitate the comparison.

Preferably, in one example of the above aspect, the step of scanning the source code includes: scanning the source code stored in a file system or scanning the source code stored in a version control system.

In this way, source codes stored in different file systems can be scanned, and source codes in different versions can be scanned as needed.

Preferably, in one example of the above aspect, the entity extraction model and the relationship extraction model are obtained by pre-training a plurality of tagged architecture design documents serving as training data sets by using a neural network.

In this way, the entity extraction model and the relationship extraction model can be obtained by training by using the neural network.

According to another aspect of the present invention, an apparatus for determining existence of a dependence violation in a source code is provided, including: an entity extraction unit, configured to input an architecture design document for developing a source code into a pre-stored entity extraction model to extract a word vector of each word included in the architecture design document, where words include entity words and non-entity words; a relationship extraction unit, configured to convert each sentence into sequence data represented by the word vectors of the words according to an order of the words in each sentence of the architecture design document, and respectively input the sequence data, obtained by conversion, of each sentence into a pre-stored relationship extraction model to extract a relationship between entities included in the architecture design document; a dependence design rule generation unit, configured to generate a dependence design rule that represents a relationship between the entities included in the architecture design document based on the extracted relationship between the entities; a first dependence tree conversion unit, configured to convert the source code developed based on the architecture design document into a first dependence tree; and a dependence violation determination unit, configured to compare the first dependence tree with the dependence design rule to determine whether a dependence violation exists in the source code.

In this way, by comparing the dependence tree converted based on the source code with the dependence design rule generated based on the architecture design document, whether the relationship between the entities implemented by the source code and the relationship between the entities defined in the architecture design document are inconsistent can be determined. That is, whether the dependence violation exists is determined.

Preferably, in one example of the above aspect, the apparatus also includes: a second dependence tree conversion unit, configured to convert the dependence design rule into a second dependence tree. The dependence violation determination unit is further configured to compare the first dependence tree with the second dependence tree.

In this way, converting the dependence design rule into the second dependence tree and then comparing the first dependence tree with the second dependence tree makes the comparison more visual and convenient.

Preferably, in one example of the above aspect, the dependence design rule generation unit is further configured to: format the dependence design rule into a triple form (an entity 1, an entity 2 and a relationship between the entity 1 and the entity 2).

In this way, the dependence design rule in the architecture design document can be expressed in a clearer form.

Preferably, in one example of the above aspect, the first dependence tree conversion unit is further configured to: scan the source code by using a static code analysis tool to obtain a dependence structure matrix, and build the dependence tree that represents a dependence relationship between the entities included in the architecture design document based on the dependence structure matrix. The entities may include modules, components, layers, categories, methods, subsystems and the like.

In this way, the source code developed based on the architecture design document may be converted into the dependence tree, so as to further facilitate the comparison.

Preferably, in one example of the above aspect, the first dependence tree conversion unit is further configured to: scan the source code stored in a file system or scan the source code stored in a version control system.

In this way, source codes stored in different file systems can be scanned, and source codes in different versions can be scanned as needed.

Preferably, in one example of the above aspect, the entity extraction model and the relationship extraction model are obtained by pre-training a plurality of tagged architecture design documents serving as training data sets by using a neural network.

In this way, the entity extraction model and the relationship extraction model can be obtained by training by using the neural network.

According to another aspect of the present invention, an electronic device is provided, including: at least one processor; and a memory coupled to the at least one processor. The memory has an instruction stored therein. When the instruction is executed by the at least one processor, the electronic device is enabled to implement the aforementioned method for determining existence of a dependence violation in a source code.

According to another aspect of the present invention, a non-transient machine readable storage medium is provided, having an executable instruction stored therein. When the instruction is executed, a machine is enabled to implement the aforementioned method for determining existence of a dependence violation in a source code.

A computer program is provided, including a computer executable instruction. When the computer executable instruction is executed, at least one processor is enabled to implement the aforementioned method.

BRIEF DESCRIPTION OF THE DRAWINGS

The essence and advantages of the content of the present disclosure may be more comprehensible from the accompanying drawings below. In the accompanying drawings, similar components or features may have same or similar reference numerals.

Fig. 1 is a flow chart of a method 100 for determining existence of a dependence violation in a source code according to one embodiment of the present invention;

Fig. 2 is a schematic diagram of a neural network structure for training an entity extraction model;

Fig. 3 is a schematic diagram of a neural network structure for training a relationship extraction model;

Fig. 4 is a flow chart of a method 400 for determining existence of a dependence violation in a source code according to another embodiment of the present invention;

Fig. 5 illustrates a block diagram of an apparatus 500 for determining existence of a dependence violation in a source code according to one embodiment of the present invention;

Fig. 6 illustrates a block diagram of exemplary configuration of an apparatus 600 for determining existence of a dependence violation in a source code according to another embodiment of the present invention; and

Fig. 7 illustrates a block diagram of an electronic device 700 for determining existence of a dependence violation in a source code according to one embodiment of the present invention.

Reference numerals in the drawings:

100, 400: method for determining existence of a dependence violation in a source code

S102, S104, S106, S108, S110, S402, S404, S406, S407, S408, S410: step

202: input layer

204: hidden layer W (V*N dimension)

206: output layer W’ (N*V dimension)

302: input layer

304: convolutional layer

306: max-pooling layer

308: fully connected layer

310, 312: entity word vector

500, 600: apparatus for determining existence of a dependence violation in a source code

502, 602: entity extraction unit

504, 604: relationship extraction unit

506, 606: dependence design rule generation unit

508, 608: first dependence tree conversion unit

510, 610: dependence violation determination unit

607: second dependence tree conversion unit

700: electronic device

702: processor

704: memory

DETAILED DESCRIPTION

The subject matter of the present disclosure will be described through exemplary implementations. It should be understood that, these implementations are merely for the purpose of helping a person skilled in the art to better understand the subject matter of the present disclosure, and are not intended to limit the protection scope, applicability, or examples set forth in the claims. Functions and arrangement of the discussed elements may be changed without departing from the protection scope of the content of the present disclosure. For each example, various processes or components may be omitted, replaced, or added as required. For example, the described method may be performed in an order different from that described herein, and steps may be added, omitted, or combined. In addition, features described in some examples may also be combined in other examples.

As used herein, terms such as "comprise", "include" and variants thereof are open terms, and have the meaning of "including, but not limited to". The phrase "based on" means "at least partially based on". The phrases "an embodiment", "some embodiment" and "one embodiment" have the meaning of "at least one embodiment". The phrase "another embodiment" has the meaning of "at least one other embodiment". The terms such as "first" and "second" are used to describe different or same objects. Other definitions, either explicit or implicit, may be included below. Unless indicated explicitly in the context, the definition of a term is consistent throughout the specification.

The present invention provides a solution for determining, according to an architecture design document, whether a source code developed based on the architecture design document has a dependence violation in comparison with the architecture design document. The architecture design document is usually processed by an NLP (Natural Language Processing) method. The method according to the present invention first extracts a dependence design rule between modules, components, layers, categories, methods, subsystems and the like from the architecture design document, then scans the software code developed according to the architecture design document to generate a dependence tree of the software code, and then performs mapping comparison on the dependence tree and the dependence design rule to determine whether the developed source code has a dependence violation in comparison with the architecture design document. Those skilled in the art can understand that determining a dependence violation as described herein means that the relationships between entities in the source code developed based on the architecture design document and the relationships between the entities defined in the architecture design document are compared one to one. If there is any inconsistency, a dependence violation exists; otherwise, no dependence violation exists.

Before the method according to the exemplary embodiment of the present invention is described, definitions of two related terms are provided at first.

Natural Language Processing (NLP): NLP is a field of computer science and artificial intelligence concerned with the interaction between computers and human (natural) languages, in particular with identifying, understanding and generating natural language and with programming computers to process and analyze large amounts of natural language data.

Version Control System (VCS): a VCS is usually operated as a stand-alone application, but it is also embedded in various types of software such as word processors, spreadsheet programs, collaborative web documents and various content management systems (e.g., the page histories of Wikipedia). A VCS can revert documents to previous versions, which is important for allowing editors to track one another's edits, correct mistakes, and guard against vandalism and spam.

A method and apparatus for determining whether a source code and an architecture design document have a dependence violation according to the embodiments of the present invention are described in combination with the accompanying drawings.

Fig. 1 is a flow chart of a method 100 for determining existence of a dependence violation in a source code according to one embodiment of the present invention.

As shown in Fig. 1, in block S102, an architecture design document for developing a source code is input into a pre-stored entity extraction model to extract a word vector of each word included in the architecture design document, where words include entity words and non-entity words.

A word vector of an UNKNOWN class may be set. All unknown words, the classes of which cannot be determined through the entity extraction model, correspond to this UNKNOWN word vector.

In block S104, each sentence is converted into sequence data represented by the word vectors of the words according to an order of the words in each sentence of the architecture design document, and the sequence data, obtained by conversion, of each sentence are respectively input into a pre-stored relationship extraction model to extract a relationship between all entities in the architecture design document.
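The conversion of a sentence into sequence data, including the fallback to the UNKNOWN word vector, might be sketched as follows. This is only an illustration: the three-dimensional vectors, the `WORD_VECTORS` table and the `sentence_to_sequence` helper are hypothetical, and in practice the vectors would come from the entity extraction model.

```python
# Hypothetical word vectors; real ones would be produced by the
# entity extraction model.
WORD_VECTORS = {
    "module": [0.9, 0.1, 0.0],
    "calls": [0.0, 0.8, 0.2],
    "layer": [0.7, 0.0, 0.3],
}
# Shared vector for every word whose class cannot be determined (assumed all zeros).
UNKNOWN = [0.0, 0.0, 0.0]


def sentence_to_sequence(sentence):
    """Return one word vector per word, in the order of the words in the sentence."""
    words = sentence.lower().split()
    return [WORD_VECTORS.get(word, UNKNOWN) for word in words]


seq = sentence_to_sequence("Module calls foo layer")
print(len(seq), "rows of", len(seq[0]), "dimensions")
```

Here the word "foo" is not known to the table, so its row is the UNKNOWN vector; the rows keep the word order of the sentence.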

The entity extraction model and the relationship extraction model which are used in the method according to one embodiment of the present invention may be obtained by training by using a neural network. A training method of the entity extraction model according to one embodiment of the present invention is specifically described below.

Firstly, tags are respectively defined for entity classes (such as modules, components, layers, categories, methods and subsystems) in several architecture design documents and relationships (including encompassing, calling, independence and the like) between the entities. Then, each word in the architecture design documents is annotated, where the words include words representing entities and words representing non-entities (i.e., all words except the entity words), each of the non-entity words is annotated with a name, the entity words are annotated with tags such as modules, components, layers, categories, methods and subsystems that represent the entity classes, and each sentence is annotated with a tag that represents the relationship such as encompassing, calling and independence between the included entities. The tagged words and sentences are used as a training data set to train the entity extraction model.

Specifically, the architecture design document is divided into a sentence set and a word set. The sentence set includes all sentences in the document, and the word set includes all words appearing in the document. Each word includes information of the tag of the word and also includes information of the tags of the contexts of sentences including this word.

Then, the word is converted into a vector by a one-hot encoding method. The dimension of the vector of the one-hot encoded word is expressed as V*1, and V is the number of all the words appearing in the training set.
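As a minimal sketch of one-hot encoding, assuming a toy word set (the helper names `build_vocab` and `one_hot` are illustrative, not part of the described method):

```python
def build_vocab(words):
    """Map each distinct word to an index, in order of first appearance."""
    vocab = {}
    for word in words:
        vocab.setdefault(word, len(vocab))
    return vocab


def one_hot(word, vocab):
    """Return a length-V vector with a single 1 at the word's index."""
    vec = [0] * len(vocab)
    vec[vocab[word]] = 1
    return vec


vocab = build_vocab("the ui layer calls the service layer".split())
v = one_hot("layer", vocab)
print(len(vocab), v)  # 5 [0, 0, 1, 0, 0]
```

The toy sentence contains five distinct words, so V is 5 and each encoded word has exactly one non-zero element.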

After an input of the entity extraction model is generated, the input is converted into a word embedding (a distributed representation of words), namely a numerical vector representing a word. The meaning of a word is inferred from its context (C represents the number of words before and after a target word), based on the hypothesis that words having similar contexts should have similar meanings.
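A small sketch of how context words could be collected around each target word. It assumes that C counts the words taken on each side of the target; the `context_windows` helper is hypothetical.

```python
def context_windows(words, c):
    """Yield (target, context) pairs using up to c words on each side."""
    for i, target in enumerate(words):
        left = words[max(0, i - c):i]
        right = words[i + 1:i + 1 + c]
        yield target, left + right


pairs = list(context_windows("the ui layer calls the service layer".split(), 1))
print(pairs[2])  # ('layer', ['ui', 'calls'])
```

Words at the start or end of a sentence simply receive a shorter context in this sketch.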

Preferably, a Continuous Bag of Words (CBOW) algorithm of Word2Vec may be used as a training algorithm to train the entity extraction model. Fig. 2 is a schematic diagram of a neural network structure for training an entity extraction model.

In Fig. 2, on an input layer 202, the word set is traversed. For each word (namely the target word) in the word set, one-hot encoded vectors of C context words of the word are used as inputs, namely the dimension of the inputs is C*V. The total number of the inputs is a total number of sentences including this target word.

On a hidden layer 204, the one-hot encoded vectors of the context words are multiplied by a weight matrix W having a dimension of V*N to obtain a C*N matrix. Here, N is a customized dimension of word embedding, and may be set as needed. The value of N represents the complexity of a word vector. All word vectors of a context transformed from the weight matrix W are summed to obtain a 1*N vector.

On an output layer 206, a result from the hidden layer is multiplied by another weight matrix W’ having a dimension of N*V to obtain a 1*V vector. This vector is a one-hot encoded vector of the target word. Preferably, a forecast result vector used as a representation of the target word may be generated by using Hierarchical Softmax. In this vector, the maximum element represents a probability that the target word possibly belongs to a certain entity.

In a back propagation process, the vector obtained based on forecasting is compared with a real tag of the word to calculate a loss function, and then the weights W and W’ are updated by a gradient descent algorithm based on the loss of the forecast vector. The above training process is repeated till the loss function is converged to obtain the entity extraction model.
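The forward pass and weight update described above can be sketched with NumPy as follows. This is a simplified illustration under stated assumptions: a plain softmax with a cross-entropy loss is swapped in for Hierarchical Softmax, the target is a word index rather than an entity tag, and the sizes V, N, C and the learning rate are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
V, N, C = 6, 4, 2                              # vocabulary size, embedding size, context words
W = rng.normal(scale=0.1, size=(V, N))         # hidden-layer weights W (V*N)
W_out = rng.normal(scale=0.1, size=(N, V))     # output-layer weights W' (N*V)


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


def cbow_step(W, W_out, context_ids, target_id, lr=0.1):
    """One forward/backward pass; updates W and W_out in place, returns the loss."""
    X = np.eye(V)[context_ids]                 # C*V one-hot context matrix
    h = (X @ W).sum(axis=0, keepdims=True)     # C*N matrix summed to a 1*N vector
    p = softmax(h @ W_out)                     # 1*V forecast vector
    loss = -np.log(p[0, target_id])            # cross-entropy against the real tag
    d_scores = p.copy()
    d_scores[0, target_id] -= 1.0              # gradient of the loss w.r.t. the scores
    d_W_out = h.T @ d_scores                   # N*V gradient
    d_h = d_scores @ W_out.T                   # 1*N gradient
    d_W = X.sum(axis=0)[:, None] @ d_h         # V*N gradient
    W -= lr * d_W                              # gradient descent updates
    W_out -= lr * d_W_out
    return loss


losses = [cbow_step(W, W_out, [1, 3], target_id=2) for _ in range(100)]
print(round(losses[0], 3), round(losses[-1], 3))
```

Repeating the step on the same example drives the loss down, which mirrors the statement that training continues until the loss function converges.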

Those skilled in the art can understand the specific process of training the entity extraction model by the above description. In addition, those skilled in the art can understand that, in the method of the present invention, an entity extraction model for converting an entity word in an architecture design document into a word vector may also be pre-stored, and the entity extraction model is not limited to being generated by the above training method.

A training method of the relationship extraction model according to one embodiment of the present invention is specifically described below.

In the method of the present invention, preferably, the relationship extraction model may be trained by using a Convolutional Neural Network (CNN). Fig. 3 is a schematic diagram of a neural network structure for training a relationship extraction model.

On an input layer 302, for each sentence in the architecture design document, each word included therein is represented by the word vector extracted by the entity extraction model, so that this sentence may be expressed as a matrix. Each row represents one word, and each column represents a dimension of the word vector. If the length (namely the number of the words included) of the sentence is expressed as M, and the dimension number of each word is expressed as N, the sentence can be expressed as an M*N matrix which is used as an input.

The reference numerals 310 and 312 in Fig. 3 schematically represent the word vectors of two entities.

On a convolutional layer 304, the input matrix is scanned by a convolution kernel, and order information of a sequence is integrated by the convolution kernel.

Then, on the max-pooling layer 306, the feature map generated by each convolution kernel is reduced to a single number by using a 1-max pooling layer.

Finally, on a fully connected layer 308, a feature vector is converted into a category. Preferably, the feature vector may be processed by a softmax function.

An index of the maximum element in the finally obtained feature vector is the class of the relationship of the sentence.
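The forward pass through the layers 304, 306 and 308 may be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the network of Fig. 3: the kernel shapes, the ReLU activation, the weight values and all function names are hypothetical, and each kernel is assumed to span the full width N of the input and to be no taller than the sentence is long.

```python
import math
from typing import List

Matrix = List[List[float]]


def conv_feature_map(x: Matrix, kernel: Matrix) -> List[float]:
    """Slide a kernel of h rows (full width N) down the M x N input."""
    h = len(kernel)
    feature_map = []
    for start in range(len(x) - h + 1):
        total = 0.0
        for i in range(h):
            for j in range(len(kernel[i])):
                total += x[start + i][j] * kernel[i][j]
        feature_map.append(max(0.0, total))  # ReLU activation (an assumption)
    return feature_map


def softmax(z: List[float]) -> List[float]:
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def classify(x: Matrix, kernels: List[Matrix],
             fc_weights: Matrix, fc_bias: List[float]) -> int:
    # 1-max pooling: keep only the largest value of each feature map.
    pooled = [max(conv_feature_map(x, k)) for k in kernels]
    # Fully connected layer: one row of weights per relationship class.
    logits = [sum(w * p for w, p in zip(row, pooled)) + b
              for row, b in zip(fc_weights, fc_bias)]
    probs = softmax(logits)
    # The index of the maximum element is the predicted relationship class.
    return probs.index(max(probs))


x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]            # M = 3, N = 2
kernels = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]]
fc_weights = [[0.5, 0.5], [1.0, 1.0], [0.0, -1.0]]  # three classes
predicted_class = classify(x, kernels, fc_weights, [0.0, 0.0, 0.0])
```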

In the back propagation process, the predicted probability vector is compared with the real tag of the sentence to calculate a loss function, and the weights are then updated by the gradient descent algorithm based on the prediction loss. The above training process is repeated until the loss function converges, yielding the relationship extraction model.
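The update loop described above can be illustrated for the fully connected layer alone. The sketch below assumes a cross-entropy loss, a fixed learning rate and a simple stopping tolerance, none of which is specified in the description; the names sgd_step and train are hypothetical.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def softmax(z: List[float]) -> List[float]:
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def sgd_step(weights: Matrix, features: List[float],
             label: int, lr: float) -> Tuple[Matrix, float]:
    """One gradient descent step; returns new weights and the loss before the step."""
    logits = [sum(w * f for w, f in zip(row, features)) for row in weights]
    probs = softmax(logits)
    loss = -math.log(probs[label])  # cross-entropy against the real tag
    # For softmax with cross-entropy, d(loss)/d(logit_c) = probs[c] - onehot[c].
    new_weights = [
        [w - lr * (probs[c] - (1.0 if c == label else 0.0)) * f
         for w, f in zip(row, features)]
        for c, row in enumerate(weights)
    ]
    return new_weights, loss


def train(weights: Matrix, features: List[float], label: int,
          lr: float = 0.1, tol: float = 1e-4,
          max_iters: int = 1000) -> Tuple[Matrix, float]:
    """Repeat the update until the loss stops decreasing by more than tol."""
    previous = float("inf")
    loss = previous
    for _ in range(max_iters):
        weights, loss = sgd_step(weights, features, label, lr)
        if previous - loss < tol:
            break
        previous = loss
    return weights, loss


initial = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
trained, final_loss = train(initial, [2.0, 2.0], label=1)
```

In a full implementation the same gradient would be propagated further back to the convolution kernels; that part is omitted here for brevity.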

Those skilled in the art can understand the specific process of training the relationship extraction model from the above description. In addition, those skilled in the art can understand that, in the method of the present invention, a relationship extraction model for extracting the relationship between the entities in the architecture design document may also be pre-stored, and the relationship extraction model is not limited to being generated by the above training method.

As mentioned above, the relationship between the entities included in the architecture design document can be extracted by using the entity extraction model and the relationship extraction model. Then, in block S106, a dependence design rule that represents a relationship between the entities included in the architecture design document can be generated based on the extracted relationship between the entities.

In one example, the dependence design rule may be formatted into a triple form (an entity 1, an entity 2 and a relationship between the entity 1 and the entity 2).
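A minimal sketch of such a triple in Python is given below. The type name DependenceRule, the helper build_rules and the relationship label "may_depend_on" are hypothetical; the description does not prescribe a concrete vocabulary of relationships.

```python
from typing import Iterable, List, NamedTuple, Tuple


class DependenceRule(NamedTuple):
    entity1: str
    entity2: str
    relationship: str  # relationship between entity 1 and entity 2


def build_rules(extracted: Iterable[Tuple[str, str, str]]) -> List[DependenceRule]:
    """Turn extracted relationships into rules, dropping exact duplicates."""
    seen = set()
    rules = []
    for entity1, entity2, relationship in extracted:
        rule = DependenceRule(entity1, entity2, relationship)
        if rule not in seen:
            seen.add(rule)
            rules.append(rule)
    return rules


rules = build_rules([
    ("DeviceConfigurationOperation", "ADCDriver", "may_depend_on"),
    ("DeviceConfigurationOperation", "ADCDriver", "may_depend_on"),
    ("DeviceConfigurationOperation", "BackgroundDiagnosis", "may_depend_on"),
])
```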

In block S108, the source code developed based on the architecture design document is converted into a first dependence tree.

An exemplary dependence tree based on source code conversion is illustrated below. Those skilled in the art can check the dependence tree to learn about a dependence relationship between the entities in the source code.

"Components": {
  "Component": {
    "name": "com.company.modules",
    "Component": [
      {
        "name": "com.company.modules.DeviceConfigurationOperation",
        "Component": {
          "name": "com.company.modules.DeviceConfigurationOperation.Interfaces",
          "Class": {
            "name": "com.company.modules.DeviceConfigurationOperation.IDeviceConfigurationOperation",
            "Property": {
              "name": "abstract",
              "value": "true"
            },
            "Refs": [
              {
                "targetName": "com.company.modules.ADCDriver.IADCDriver",
                "refType": "Include",
                "strength": "1",
                "Property": {
                  "name": "linenumber",
                  "value": "13"
                }
              },
              {
                "targetName": "com.company.modules.BackgroundDiagnosis.IBackgroundDiagnosis",
                "refType": "Include",
                "strength": "1",
                "Property": {
                  "name": "linenumber",
                  "value": "14"
                }
              }
            ]
          }
        }
      }
    ]
  }
}

Specifically, converting the source code developed based on the architecture design document into the first dependence tree may include: scanning the source code by using a static code analysis tool to obtain a dependence structure matrix, and building, based on the dependence structure matrix, the dependence tree that represents a dependence relationship between the entities included in the architecture design document.
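The second half of this step, building a tree from a dependence structure matrix, may be sketched as follows. The sketch assumes that the cell in row i and column j holds the strength with which entity i depends on entity j, and that entity names are dot-separated paths; both conventions, the node layout and the name build_dependence_tree are assumptions and do not reflect the output format of any particular analysis tool.

```python
from typing import Any, Dict, List


def build_dependence_tree(names: List[str], dsm: List[List[int]]) -> Dict[str, Any]:
    """Nest entities by their dotted names and attach outgoing references."""
    root: Dict[str, Any] = {"children": {}}
    for i, name in enumerate(names):
        node = root
        for part in name.split("."):
            node = node["children"].setdefault(part, {"children": {}})
        node["name"] = name
        # Assumption: dsm[i][j] > 0 means entity i depends on entity j.
        node["refs"] = [
            {"targetName": names[j], "strength": dsm[i][j]}
            for j in range(len(names))
            if j != i and dsm[i][j] > 0
        ]
    return root


names = ["app.ui.View", "app.core.Model"]
dsm = [
    [0, 2],  # View references Model twice
    [0, 0],  # Model references nothing
]
tree = build_dependence_tree(names, dsm)
```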

In the method according to the embodiment of the present invention, the source code stored in a file system may be scanned, and the source code stored in a Version Control System (VCS) may also be scanned.

Those skilled in the art can understand the specific process of scanning the source code by using the static code analysis tool such as LATTIX and converting it into the dependence tree, so that the descriptions thereof are omitted here.

Finally, in block S110, the first dependence tree is compared with the dependence design rule to determine whether a dependence violation exists in the source code.

Specifically, the relationships between the entities in the first dependence tree converted from the source code and the relationships between the entities in the dependence design rule generated based on the architecture design document can be compared one by one. If there is any inconsistency, a dependence violation exists; otherwise, no dependence violation exists.
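One possible reading of this comparison is sketched below. It assumes that the rules enumerate the permitted dependencies under a hypothetical label "may_depend_on", that a rule on a component also covers everything nested beneath it, and that the tree uses the "name", "Component", "Class" and "Refs" keys of the exemplary listing. The function names are illustrative only.

```python
from typing import Any, List, Set, Tuple

Edge = Tuple[str, str]
Rule = Tuple[str, str, str]  # (entity 1, entity 2, relationship)


def collect_edges(node: Any, edges: Set[Edge]) -> Set[Edge]:
    """Walk the tree and collect (source, target) pairs from every "Refs" list."""
    if isinstance(node, list):
        for item in node:
            collect_edges(item, edges)
    elif isinstance(node, dict):
        for ref in node.get("Refs", []):
            edges.add((node.get("name", ""), ref["targetName"]))
        for key in ("Components", "Component", "Class"):
            if key in node:
                collect_edges(node[key], edges)
    return edges


def within(name: str, entity: str) -> bool:
    """True if name is the entity itself or is nested beneath it."""
    return name == entity or name.startswith(entity + ".")


def find_violations(tree: Any, rules: List[Rule]) -> List[Edge]:
    # Assumption: only "may_depend_on" rules permit a dependency.
    allowed = [(e1, e2) for e1, e2, rel in rules if rel == "may_depend_on"]
    violations = []
    for source, target in sorted(collect_edges(tree, set())):
        if not any(within(source, a) and within(target, b) for a, b in allowed):
            violations.append((source, target))
    return violations


tree = {
    "Component": {
        "name": "com.company.modules.DeviceConfigurationOperation",
        "Class": {
            "name": "com.company.modules.DeviceConfigurationOperation.IDeviceConfigurationOperation",
            "Refs": [
                {"targetName": "com.company.modules.ADCDriver.IADCDriver"},
                {"targetName": "com.company.modules.BackgroundDiagnosis.IBackgroundDiagnosis"},
            ],
        },
    }
}
rules = [("com.company.modules.DeviceConfigurationOperation",
          "com.company.modules.ADCDriver", "may_depend_on")]
violations = find_violations(tree, rules)
```

With these inputs only the reference to BackgroundDiagnosis is reported, because no rule permits it.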

Fig. 4 is a flowchart illustrating a method 400 for determining existence of a dependence violation in a source code according to another embodiment of the present invention.

In block S402 of Fig. 4, an architecture design document for developing a source code is input into a pre-stored entity extraction model to extract a word vector of each word included in the architecture design document, where words include entity words and non-entity words. In block S404, each sentence is converted into sequence data represented by the word vectors of the words according to an order of the words in each sentence of the architecture design document, and the sequence data, obtained by conversion, of each sentence are respectively input into a pre-stored relationship extraction model to extract a relationship between entities included in the architecture design document. In block S406, a dependence design rule that represents a relationship between the entities included in the architecture design document is generated based on the extracted relationship between the entities. In block S408, the source code developed based on the architecture design document is converted into a first dependence tree.

It can be seen that the processing in the blocks S402, S404, S406 and S408 in the method 400 in Fig. 4 is similar to that in the blocks S102, S104, S106 and S108 in Fig. 1, so that the descriptions thereof are omitted herein.

After the processing of S406, in block S407, the dependence design rule obtained in step S406 is converted into a second dependence tree.

Those skilled in the art can understand the specific operation of converting the dependence design rule into the second dependence tree, so that the descriptions thereof are omitted herein.

In block S410, the first dependence tree generated in block S408 is compared with the second dependence tree generated in block S407 to determine whether the dependence violation exists in the source code.

Specifically, the relationships between the entities in the first dependence tree and the relationships between the entities in the second dependence tree can be compared one to one to determine whether the dependence violation exists.
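Blocks S407 and S410 may be sketched together as follows. The node layout (nested "children" plus a set of "targets"), the function names and the decision to ignore the relationship label are assumptions made for illustration. For brevity, the first tree in the example is built with the same helper, whereas in the described method it would come from the source code scan.

```python
from typing import Any, Dict, List, Set, Tuple

Rule = Tuple[str, str, str]  # (entity 1, entity 2, relationship)
Edge = Tuple[str, str]


def new_node() -> Dict[str, Any]:
    return {"children": {}, "targets": set()}


def rules_to_tree(rules: List[Rule]) -> Dict[str, Any]:
    """Nest entity 1 by its dotted name and record entity 2 as a target."""
    root = new_node()
    for entity1, entity2, _relationship in rules:
        node = root
        for part in entity1.split("."):
            node = node["children"].setdefault(part, new_node())
        node["targets"].add(entity2)
    return root


def tree_edges(node: Dict[str, Any], prefix: str = "") -> Set[Edge]:
    """Flatten a tree back into (source, target) pairs."""
    edges = {(prefix, target) for target in node["targets"]}
    for part, child in node["children"].items():
        name = f"{prefix}.{part}" if prefix else part
        edges |= tree_edges(child, name)
    return edges


def compare_trees(first: Dict[str, Any], second: Dict[str, Any]) -> Dict[str, List[Edge]]:
    first_edges, second_edges = tree_edges(first), tree_edges(second)
    return {
        # Present in the source code but absent from the design.
        "unexpected": sorted(first_edges - second_edges),
        # Present in the design but absent from the source code.
        "missing": sorted(second_edges - first_edges),
    }


second_tree = rules_to_tree([("app.ui", "app.core", "depends_on")])
first_tree = rules_to_tree([("app.ui", "app.core", "depends_on"),
                            ("app.core", "app.ui", "depends_on")])
result = compare_trees(first_tree, second_tree)
```

Whether a "missing" relationship should also count as a dependence violation is a design choice that the description leaves open; the sketch reports both directions separately so that either policy can be applied.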

In the present embodiment, comparing the first dependence tree generated based on the source code with the second dependence tree generated based on the architecture design document may make it easier to visualize whether the dependence violation exists in the source code.

Fig. 5 illustrates a block diagram of an apparatus 500 for determining existence of a dependence violation in a source code according to one embodiment of the present invention. As shown in Fig. 5, the apparatus 500 for determining existence of a dependence violation in a source code includes an entity extraction unit 502, a relationship extraction unit 504, a dependence design rule generation unit 506, a first dependence tree conversion unit 508 and a dependence violation determination unit 510.

The entity extraction unit 502 is configured to input an architecture design document for developing a source code into a pre-stored entity extraction model to extract a word vector of each word included in the architecture design document, where words include entity words and non-entity words.

The relationship extraction unit 504 is configured to convert each sentence into sequence data represented by the word vectors of the words according to an order of the words in each sentence of the architecture design document, and respectively input the sequence data, obtained by conversion, of each sentence into a pre-stored relationship extraction model to extract a relationship between entities included in the architecture design document.

The dependence design rule generation unit 506 is configured to generate a dependence design rule that represents a relationship between the entities included in the architecture design document based on the extracted relationship between the entities.

The first dependence tree conversion unit 508 is configured to convert the source code developed based on the architecture design document into a first dependence tree.

The dependence violation determination unit 510 is configured to compare the first dependence tree with the dependence design rule to determine whether a dependence violation exists in the source code.

The dependence design rule generation unit 506 is further configured to: format the dependence design rule into a triple form (an entity 1, an entity 2 and a relationship between the entity 1 and the entity 2).

The first dependence tree conversion unit 508 is further configured to: scan the source code by using a static code analysis tool to obtain a dependence structure matrix, and build the dependence tree that represents a dependence relationship between the entities included in the architecture design document based on the dependence structure matrix.

The first dependence tree conversion unit 508 is further configured to: scan the source code stored in a file system or scan the source code stored in a version control system.

The entity extraction model and the relationship extraction model are obtained by training a neural network on a plurality of tagged architecture design documents serving as training data sets.

Fig. 6 illustrates a block diagram of an exemplary configuration of an apparatus 600 for determining existence of a dependence violation in a source code according to another embodiment of the present invention.

In the example as shown in Fig. 6, the apparatus 600 includes an entity extraction unit 602, a relationship extraction unit 604, a dependence design rule generation unit 606, a second dependence tree conversion unit 607, a first dependence tree conversion unit 608 and a dependence violation determination unit 610.

The configurations of the entity extraction unit 602, the relationship extraction unit 604, the dependence design rule generation unit 606 and the first dependence tree conversion unit 608 in the apparatus 600 are similar to the configurations of the entity extraction unit 502, the relationship extraction unit 504, the dependence design rule generation unit 506 and the first dependence tree conversion unit 508 in the apparatus 500, so that the descriptions thereof are omitted herein.

The apparatus 600 for determining existence of a dependence violation in a source code as shown in Fig. 6 also includes the second dependence tree conversion unit 607, configured to: convert the dependence design rule generated in the dependence design rule generation unit 606 into a second dependence tree.

The dependence violation determination unit 610 is configured to compare the first dependence tree converted by the first dependence tree conversion unit 608 with the second dependence tree converted by the second dependence tree conversion unit 607 to determine whether the dependence violation exists in the source code.

The details of operations and functions of all portions of the apparatuses 500 and 600 for determining existence of a dependence violation in a source code may be, for example, the same as or similar to the related portions of the method for determining existence of a dependence violation in a source code described with reference to Figs. 1 to 4 according to the embodiment of the present invention, so that no more details will be described herein.
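When the units are implemented in software, their cooperation may be sketched as a small composition of callables. The class name, the field names and the stub units below are hypothetical, and the two extraction units are merged into a single callable purely to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Callable, List, Set, Tuple

Triple = Tuple[str, str, str]
Edge = Tuple[str, str]


@dataclass
class ViolationApparatus:
    # Entity and relationship extraction units, merged for brevity.
    extract_relationships: Callable[[str], List[Triple]]
    # Dependence design rule generation unit.
    generate_rules: Callable[[List[Triple]], Set[Edge]]
    # First dependence tree conversion unit (flattened to edges here).
    convert_source: Callable[[str], Set[Edge]]

    def has_violation(self, design_document: str, source_root: str) -> bool:
        """Dependence violation determination unit."""
        rules = self.generate_rules(self.extract_relationships(design_document))
        return bool(self.convert_source(source_root) - rules)


# Stub units standing in for the trained models and the static analysis tool.
apparatus = ViolationApparatus(
    extract_relationships=lambda doc: [("ui", "core", "depends_on")],
    generate_rules=lambda triples: {(a, b) for a, b, _ in triples},
    convert_source=lambda root: {("ui", "core"), ("core", "ui")},
)
violation_found = apparatus.has_violation("design text", "src/")
```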

It should be noted here that the apparatuses 500 and 600 for determining existence of a dependence violation in a source code as shown in Figs. 5 and 6 and the structures of constituting units thereof are merely exemplary, and those skilled in the art can modify the structural block diagrams shown in Figs. 5 and 6 as needed.

The above describes the embodiments of the method and apparatus for determining existence of a dependence violation in a source code according to the present invention with reference to Figs. 1 to 6. The aforementioned apparatus for determining existence of a dependence violation in a source code may be implemented via hardware, and may also be implemented via software or a combination of hardware and software.

In the present invention, the apparatuses 500 and 600 for determining existence of a dependence violation in a source code may be implemented by an electronic device. Fig. 7 illustrates a block diagram of an electronic device 700 for determining existence of a dependence violation in a source code according to one embodiment of the present invention. According to one embodiment, the electronic device 700 may include at least one processor 702. The processor 702 executes at least one computer readable instruction (e.g., an element implemented in the form of software) stored or encoded in a computer readable storage medium (e.g., a memory 704).

In one embodiment, the memory 704 stores a computer executable instruction. When the computer executable instruction is executed, the at least one processor 702 is enabled to implement the following steps: an architecture design document for developing a source code is input into a pre-stored entity extraction model to extract a word vector of each word included in the architecture design document, where words include entity words and non-entity words; each sentence is converted into sequence data represented by the word vectors of the words according to an order of the words in each sentence of the architecture design document, and the sequence data, obtained by conversion, of each sentence are respectively input into a pre-stored relationship extraction model to extract a relationship between entities included in the architecture design document; a dependence design rule that represents a relationship between the entities included in the architecture design document is generated based on the extracted relationship between the entities; the source code developed based on the architecture design document is converted into a first dependence tree; and the first dependence tree is compared with the dependence design rule to determine whether a dependence violation exists in the source code.

It should be understood that when the computer executable instruction stored in the memory 704 is executed, the at least one processor 702 is enabled to implement the above various operations and functions described in combination with Figs. 1 to 4 in the various embodiments of the present invention.

According to one embodiment, a program product such as a non-transient machine readable medium is provided. The non-transient machine readable medium may have an instruction (e.g., the above element implemented in the form of software). When the instruction is executed by a machine, the machine is enabled to implement the above various operations and functions described in combination with Figs. 1 to 4 in the various embodiments of the present invention.

According to one embodiment, a computer program is provided, including a computer executable instruction. When the computer executable instruction is executed, at least one processor is enabled to implement the above various operations and functions described in combination with Figs. 1 to 4 in the various embodiments of the present invention.

Although exemplary embodiments have been described above through the specific implementations set forth with reference to the accompanying drawings, they are not to be construed as all embodiments that can be implemented or fall within the protection scope of the claims. The term "exemplary" as used throughout the specification means "as an example", "for instance", or "for example", and does not indicate any "preference" or "advantage" over other embodiments. For the purpose of providing an understanding of the described technologies, the specific implementations include specific details. However, these technologies may also be implemented without these specific details. In some examples, to avoid difficulties in understanding the concept of the described embodiments, known structures and apparatuses are shown in the form of block diagrams.

The foregoing descriptions of the content of the present disclosure are provided to enable any person of ordinary skill in the art to implement or use the content of the present disclosure. It is obvious that a person of ordinary skill in the art may make various modifications to the content of the present disclosure, and may also apply the general principles defined in this specification to other variations without departing from the protection scope of the content of the present disclosure. Therefore, the content of the present disclosure is not limited to the examples and designs described in the present disclosure, but is intended to be interpreted in its broadest sense consistent with the principles and novel features of the present disclosure.