Title:
METHODS FOR PRODUCING PEPTIDES USING ENGINEERED INTEINS
Document Type and Number:
WIPO Patent Application WO/2014/191455
Kind Code:
A1
Abstract:
The present invention provides a method for producing peptides by recombinant means. The peptides are expressed as part of a fusion protein comprising the target peptide and an engineered intein. The invention also provides the engineered inteins, fusion proteins comprising these, and DNA constructs coding for these fusion proteins. Upon thiol-induced cleavage of the fusion protein the carboxy-terminal α-thioester of the target peptide is obtained. The carboxy-terminal α-thioester can in principle react with any nucleophile and the strategy therefore allows a wider range of carboxy-terminal modifications such as chemical ligation, bioconjugation, or amidation. The engineered inteins of the present invention are minimized in size and have a cysteine mutation in the position corresponding by alignment to position 3 of Mxe GyrA intein (SEQ ID NO: 1), leading to increased expression levels of the fusion protein and higher yields of the isolated target peptide, thus making the method of the invention suitable for production scale.

Inventors:
SHAW ALLAN CHRISTIAN (DK)
NORRILD JENS CHRISTIAN (DK)
ALBERTSEN LOUISE (DK)
Application Number:
PCT/EP2014/061048
Publication Date:
December 04, 2014
Filing Date:
May 28, 2014
Assignee:
NOVO NORDISK AS (DK)
International Classes:
A61K38/22; C07K7/00; C07K14/575; C12N15/09; C12N15/62
Domestic Patent References:
WO2004029245A12004-04-08
WO2012100176A22012-07-26
WO2013045632A12013-04-04
Other References:
CUI C ET AL: "Elimination of in vivo cleavage between target protein and intein in the intein-mediated protein purification systems", PROTEIN EXPRESSION AND PURIFICATION, ACADEMIC PRESS, SAN DIEGO, CA, vol. 50, no. 1, 1 November 2006 (2006-11-01), pages 74 - 81, XP024908862, ISSN: 1046-5928, [retrieved on 20061101], DOI: 10.1016/J.PEP.2006.05.019
P. L. STAROKADOMSKYY ET AL: "Utilization of protein splicing for purification of the human growth hormone", MOLECULAR BIOLOGY, vol. 42, no. 6, 1 December 2008 (2008-12-01), pages 966 - 972, XP055087672, ISSN: 0026-8933, DOI: 10.1134/S0026893308060198
IAN R. COTTINGHAM ET AL: "A method for the amidation of recombinant peptides expressed as intein fusion proteins in Escherichia coli.", NATURE BIOTECHNOLOGY, vol. 19, no. 10, 1 October 2001 (2001-10-01), pages 974 - 977, XP055087005, ISSN: 1087-0156, DOI: 10.1038/nbt1001-974
BJORKLUND M ET AL: "Use of intein-directed biosynthesis to improve serum stability and bioactivity of a gelatinase inhibitory peptide", COMBINATORIAL CHEMISTRY AND HIGH THROUGHPUT SCREENING, BENTHAM SCIENCE PUBLISHERS, NL, vol. 6, 1 January 2003 (2003-01-01), pages 29 - 35, XP002984195, ISSN: 1386-2073
GERRIT VOLKMANN ET AL: "Protein C-Terminal Labeling and Biotinylation Using Synthetic Peptide and Split-Intein", PLOS ONE, vol. 4, no. 12, 21 December 2009 (2009-12-21), pages e8381, XP055087008, ISSN: 1932-6203, DOI: 10.1371/journal.pone.0008381
CHEN Y Q ET AL: "Expression of a cytotoxic cationic antibacterial peptide in Escherichia coli using two fusion partners", PROTEIN EXPRESSION AND PURIFICATION, ACADEMIC PRESS, SAN DIEGO, CA, vol. 57, no. 2, 1 February 2008 (2008-02-01), pages 303 - 311, XP026864280, ISSN: 1046-5928, [retrieved on 20080201], DOI: 10.1016/J.PEP.2007.09.012
SHARMA S S ET AL: "Intein-mediated protein purification of fusion proteins expressed under high-cell density conditions in E. coli", JOURNAL OF BIOTECHNOLOGY, ELSEVIER SCIENCE PUBLISHERS, AMSTERDAM, NL, vol. 125, no. 1, 20 August 2006 (2006-08-20), pages 48 - 56, XP024956691, ISSN: 0168-1656, [retrieved on 20060820], DOI: 10.1016/J.JBIOTEC.2006.01.018
TELENTI ET AL: "The Mycobacterium xenopi GyrA protein splicing element: characterization of a minimal intein.", JOURNAL OF BACTERIOLOGY, vol. 179, no. 20, 1 October 1997 (1997-10-01), pages 6378 - 6382, XP055087035, ISSN: 0021-9193
HIRAGA ET AL: "Minimization and stabilization of the Mycobacterium tuberculosis recA intein", JOURNAL OF MOLECULAR BIOLOGY, ACADEMIC PRESS, UNITED KINGDOM, vol. 354, no. 4, 9 December 2005 (2005-12-09), pages 916 - 926, XP005216839, ISSN: 0022-2836, DOI: 10.1016/J.JMB.2005.09.088
LOUISE ALBERTSEN ET AL: "Recombinant Production of Peptide C -Terminal [alpha]-Amides Using an Engineered Intein", BIOCONJUGATE CHEMISTRY, 6 November 2013 (2013-11-06), XP055087207, ISSN: 1043-1802, DOI: 10.1021/bc4002689
Claims:
CLAIMS

1. A method for producing a peptide, which comprises the step of expressing the peptide as part of a fusion protein, wherein said fusion protein comprises a target peptide and an engineered intein, wherein said intein is minimized in size and carries a cysteine mutation in the position corresponding by alignment to position 3 of Mxe GyrA intein (SEQ ID NO: 1).

2. A method according to claim 1, wherein said intein is a GyrA intein.

3. A method according to claim 2, wherein said intein is a Mxe GyrA intein.

4. A method according to claim 3, wherein said intein is a Mxe GyrA intein minimized in size by excision of the residues corresponding to residues 107-164 of Mxe GyrA intein (SEQ ID NO: 1) or a part of the residues corresponding to residues 107-164 of Mxe GyrA intein (SEQ ID NO: 1) and where the excised residues are replaced by a linker comprising 1 to 10 amino acids.

5. A method according to claim 4, wherein the linker comprises from 6 to 10 amino acids, and wherein at least 6 of the amino acids of the linker are glycine and/or serine.

6. A method according to claim 5, wherein the sequence of said intein is SEQ ID NO: 20.

7. A method according to any of the preceding claims, wherein the target peptide is an α-amidated peptide, such as PYY, PP, α-CGRP, CT and amylin or analogues thereof.

8. A method according to any of the preceding claims, wherein the fusion protein further comprises a purification tag and optionally a protease site, which allows for identification and/or purification by affinity chromatography or other chromatographic methods, and where said protease site allows for detachment of such purification tag.

9. A method according to any of the preceding claims, further comprising thiol-induced cleavage of the fusion protein resulting in the α-thioester of the target peptide.

10. A method for producing an α-amidated peptide, which comprises the step of expressing the peptide as part of a fusion protein, wherein said fusion protein comprises a target peptide and an intein of SEQ ID NO: 20.

11. A method according to any of the preceding claims, wherein the fusion protein is expressed in bacteria, yeast, mammalian cells or in a body fluid of a transgenic mammal.

12. A fusion protein comprising a target peptide and an intein, wherein said intein is minimized in size and carries a cysteine mutation in the position corresponding by alignment to position 3 of Mxe GyrA intein (SEQ ID NO: 1).

13. An intein which is minimized in size and carries a cysteine mutation in the position corresponding by alignment to position 3 of Mxe GyrA intein (SEQ ID NO: 1).

14. An intein according to claim 13, wherein the sequence of said intein is SEQ ID NO: 20.

15. A DNA construct coding for a fusion protein as defined in claim 12.

Description:
DESCRIPTION

TITLE: METHODS FOR PRODUCING PEPTIDES USING ENGINEERED INTEINS

TECHNICAL FIELD

The present invention relates to methods of producing peptides comprising expression of the peptide as part of a fusion protein, wherein said fusion protein comprises a target peptide and an engineered intein.

BACKGROUND

Peptides are a rapidly growing class of therapeutics with more than 50 peptide-based products currently on the market and even more in development covering disease areas such as immunology, oncology, neurology and endocrinology. Peptides regulate a plethora of physiological functions, mainly by interactions with specific cellular receptors, whereby they induce cellular signalling events, such as neurotransmission and release of hormones. Endogenous peptides have been associated with challenges as therapeutics due to their limited in vivo stability and bioavailability. However, the high specificity and low toxicity of peptides, combined with an improved ability to selectively modify them and improve their therapeutic properties, have increased the relevance of peptides in drug development.

In endocrinology, diseases are often caused by or associated with an imbalance of the level of peptide hormones, as seen in diseases such as diabetes and obesity. Notably, about half the peptide hormones in the endocrine and nervous systems are α-amidated at their C-terminus and the α-amide moiety is often crucial for biological activity and stability. Certain therapeutic peptides including peptide hormones involved in obesity and diabetes (e.g. peptide YY (PYY), pancreatic polypeptide (PP), α-calcitonin gene related peptide (α-CGRP), calcitonin (CT), and amylin) require an α-amide moiety at the C-terminus to obtain full biological activity.

The most widely used technologies for production of peptide therapeutics are microbial expression systems and chemical synthesis. While a peptide C-terminal amide is easily achieved by chemical synthesis, it is not readily introduced into recombinant peptides derived from microbial hosts, which lack an α-amidating enzymatic machinery. Therefore, the α-amide has to be introduced as a post-translational modification.

Inteins are autocatalytic protein domains which are expressed in unicellular organisms with flanking protein sequences at both amino- and carboxy-termini. The amino- and carboxy-terminal sequences have been named exteins in keeping with the DNA nomenclature of exons and introns. A seemingly typical member of the emerging family of inteins is the GyrA gene product from Mycobacterium xenopi (Mxe GyrA). This is approximately 22 kDa in molecular mass and contains a number of crucial amino acids at the amino-terminus (cysteine) and at the carboxy-terminus (histidine and asparagine). In addition, the carboxy-terminal extein must start with a cysteine, serine or threonine. At some point after translation is completed, the peptide bond between the amino-terminal extein and the intein is converted into a thioester bond by an N-to-S acyl shift involving the cysteine at the amino-terminus of the intein. This bond is then exchanged with the nucleophilic residue (serine, threonine or cysteine) at the start of the carboxy-terminal extein and then, with participation of the asparagine at the C-terminus of the intein, the intein excises itself, while a second acyl shift generates a native peptide bond between the amino- and carboxy-terminal exteins. The overall effect of these concerted reactions is that the two exteins are seamlessly joined and the intein is released.

Mutant inteins have been designed where the self-splicing function has been disabled by a mutation to allow cleavage at either the amino- or carboxy-terminal splice junctions. For the Mxe GyrA intein, amino-terminal cleavage has been enabled by a N198A mutation. Replacement of the amino-terminal extein by another polypeptide sequence, the target peptide, enables preparation of the target peptide with a reactive carboxy-terminal α-thioester handle after cleavage of the resulting fusion protein with a nucleophilic chemical agent such as sodium 2-mercaptoethanesulfonate (MESNa). Such intein-derived α-thioesters can be reacted with any nucleophile and are useful as chemical handles for chemical ligation, bioconjugation or amidation. The intein-based approach has been used to generate α-amidated peptides recombinantly at laboratory scale (WO 98/50563 A1; WO 00/00625 A1; Cottingham I. R. et al., Nat. Biotechnol. 2001, 19, 974-977).

The primary limitation of using this technology for large scale production of C-terminally α-amidated peptides is the low yields generally observed, which may be ascribed to a combination of the large size of the intein and hydrolytic instability of the intein fusion protein. Introduction of a T3C mutation in the Mxe GyrA intein has been shown to be associated with reduced premature cleavage (Cui C. et al. Protein Expr. Purif. 2006, 50, 74-81). Furthermore, the size of the intein is large relative to that of the peptide hormones and a reduction in intein size could potentially improve the final yield of the peptide hormone by a more economical usage of the host protein synthesis machinery.

SUMMARY

The present invention provides a method for producing peptides by recombinant means. The peptides are expressed as part of a fusion protein comprising the target peptide and an engineered intein and upon thiol-induced cleavage of the fusion protein the carboxy-terminal α-thioester of the target peptide is obtained. The carboxy-terminal α-thioester can in principle react with any nucleophile and the strategy therefore allows a wider range of carboxy-terminal modifications such as chemical ligation, bioconjugation, or amidation. Another advantage of the intein-based strategy is that the peptide α-thioester is generated by thiolysis of the peptide-intein fusion protein, potentially avoiding the need for processing enzymes.

In one aspect, the invention provides a method for producing a peptide, which comprises the step of expressing the peptide as part of a fusion protein, wherein said fusion protein comprises a target peptide and an intein, wherein said intein is minimized in size and carries a cysteine mutation in the position corresponding by alignment to position 3 of Mxe GyrA intein (SEQ ID NO: 1).

In another aspect, the invention provides a fusion protein comprising a target peptide and an intein, wherein said intein is minimized in size and carries a cysteine mutation in the position corresponding by alignment to position 3 of Mxe GyrA intein (SEQ ID NO: 1). In another aspect, the invention provides an intein which is minimized in size and carries a cysteine mutation in the position corresponding by alignment to position 3 of Mxe GyrA intein (SEQ ID NO: 1).

In another aspect, the invention provides a DNA construct encoding a fusion protein comprising at least a target peptide and an intein, wherein said intein is minimized in size and carries a cysteine mutation in the position corresponding by alignment to position 3 of Mxe GyrA intein (SEQ ID NO: 1).

The recombinant production of peptide α-thioesters using an intein-based approach involving expression of the target peptide as an amino-terminal fusion to the intein has been limited to laboratory scale due to low yields of target peptides.

The engineered inteins of the present invention are minimized in size and have a cysteine mutation in the position corresponding by alignment to position 3 of Mxe GyrA intein (SEQ ID NO: 1), leading to increased expression levels of the fusion protein and higher yields of the isolated target peptide, thus making the method of the invention suitable for production scale.

BRIEF DESCRIPTION OF DRAWINGS

Figure 1: Representative pET11a vector map of plasmid 1A encoding [Gly4]-hPYY(4-36) (SEQ ID NO: 10) fused between an amino-terminal tag (SEQ ID NO: 4) containing a protease site (SEQ ID NO: 8) and a carboxy-terminal GyrA intein variant (SEQ ID NO: 17). The entire fusion protein is marked with the feature Protein 1A. NdeI, NheI, NsiI, XhoI and BamHI restriction enzyme sites are depicted. The T7 promoter region, ampicillin resistance gene, lacI repressor region and origin of replication site are also shown in the vector map.

Figure 2: Schematic representation of the basic principle of making carboxy-terminally modified target peptides. The target peptide is fused to a carboxy-terminal intein that can be liberated by nucleophilic cleavage, generating the target peptide as an α-thioester, α-amide or with a synthetic carboxy-terminal fragment. As shown here, the target peptide is further fused to an amino-terminal affinity tag, which can be removed by enzymatic cleavage. This purification tag could also be placed carboxy-terminally to the intein, avoiding the need for processing enzymes.

DESCRIPTION

Peptides are of increasing interest as therapeutics in a wide range of diseases, including metabolic diseases such as diabetes and obesity. Certain therapeutic peptides including peptide hormones involved in obesity and diabetes (e.g. PYY, PP, α-CGRP, CT, and amylin) require an α-amide moiety at the carboxy-terminus to obtain full biological activity. A challenge is to generate such peptides by recombinant means and particularly at production scale.

The invention provides engineered inteins, which have been minimized in size and carry a cysteine point mutation in the position corresponding by alignment to position 3 of Mxe GyrA intein (SEQ ID NO: 1). The inteins can be used for recombinant production of peptides, where the peptides are expressed as part of a fusion protein comprising the target peptide and an engineered intein of the invention (see Fig. 2). The combination of the cysteine mutation in the position corresponding by alignment to position 3 of Mxe GyrA intein (SEQ ID NO: 1) and the minimization of the intein increases the expression yield of intein fusion proteins and hence the yield of the target peptide. The latter is obtained by thiol-induced cleavage of the fusion protein leading to the α-thioester of the target peptide, which can be converted to the corresponding α-amide of the target peptide by addition of an ammonia nucleophile such as ammonium bicarbonate. The engineered inteins of the invention thus provide an optimised strategy for production of α-amidated peptide hormones such as PYY, PP, amylin, and α-CGRP. Due to the higher expression yields and higher yields of the isolated target peptides, the invention thus provides a method for recombinant production of α-amidated peptides at production scale.

The α-thioester generated by this method can also be used for chemical ligation or bioconjugations, and in one aspect, the invention provides a method for recombinant production of carboxy-terminal bioconjugated proteins and peptides.

In one aspect, the present invention relates to a method for producing a peptide, which comprises the step of expressing the peptide as part of a fusion protein, wherein said fusion protein comprises a target peptide and an intein, wherein said intein is minimized in size and has a cysteine mutation in the position corresponding by alignment to position 3 of Mxe GyrA intein (SEQ ID NO: 1).

In one aspect, the invention provides a fusion protein comprising a target peptide and an intein, wherein said intein is minimized in size and carries a cysteine mutation in the position corresponding by alignment to position 3 of Mxe GyrA intein (SEQ ID NO: 1). In one aspect, the invention provides an intein which is minimized in size and carries a cysteine mutation in the position corresponding by alignment to position 3 of Mxe GyrA intein (SEQ ID NO: 1). In one aspect, the invention provides a DNA construct encoding a fusion protein comprising a target peptide and an intein, wherein said intein is minimized in size and carries a cysteine mutation in the position corresponding by alignment to position 3 of Mxe GyrA intein (SEQ ID NO: 1).

Inteins are autocatalytic protein domains which are expressed in unicellular organisms with flanking protein sequences at both amino- and carboxy-termini. In one aspect, the intein is a GyrA intein. The GyrA inteins are a class of inteins that are inserted within the gene sequence coding for DNA gyrase subunit A of some bacteria, typically Mycobacterium species. This class of inteins consists of 16 different inteins that share high sequence homology and are characterized by having a Cys at the amino-terminus and His-Asn at the carboxy-terminus. Moreover, all of these inteins are recognized by having a Tyr at the carboxy-terminus of the amino-terminal extein and a Thr at the amino-terminus of the carboxy-terminal extein. Generally, these inteins have a size of approximately 420 amino acids and the amino- and carboxy-terminal splicing regions are interrupted by a DOD homing endonuclease domain. However, the Mxe GyrA intein lacks a DOD homing endonuclease domain and, compared to other GyrA inteins, has fewer residues between the amino- and carboxy-terminal splicing domains, resulting in an overall size of 198 amino acids. An X-ray crystal structure of this intein has shown that this region, which corresponds to amino acid residues 107-164, consists of two α-helices as well as an unstructured region that projects out from and is not part of the characteristic HINT splicing domains of inteins.

In one aspect, the intein is the Mxe GyrA intein, which has been extensively used at laboratory scale to generate protein α-thioesters for chemical ligation purposes. This intein possesses several advantageous properties such as the ability to be refolded from bacterial inclusion bodies, retention of activity in the presence of denaturants, and low sequence requirements for the carboxy-terminal residue of the amino-terminal extein. The amino acid sequence of Mxe GyrA intein is shown in SEQ ID NO: 1.

In one aspect, the intein is a Mxe GyrA intein minimized in size by excision of a part of the residues 107-164 and replacement of the excised residues with a linker of sequence GSGSGSGS.

In one aspect, the intein is a Mxe GyrA intein minimized in size by excision of residues 107-160 and replacement of the excised residues with a linker of sequence GSGSGSGS.

In one aspect, the intein is a minimised Mxe GyrA intein carrying a T3C mutation with the sequence of SEQ ID NO: 20.
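The two sequence operations described above (a point mutation written as T3C, and excision of a residue range with insertion of a linker) can be illustrated with a minimal Python sketch. The 198-residue string below is a placeholder and not SEQ ID NO: 1; only its length, the Cys at position 1, the Thr at position 3 and the carboxy-terminal His-Asn are taken from the text. The function names are hypothetical, and `linker_meets_claim_5` simply restates the linker limits of claim 5.

```python
def apply_point_mutation(seq: str, mutation: str) -> str:
    """Apply a mutation written as <old><position><new>, e.g. 'T3C' (1-based)."""
    old, pos, new = mutation[0], int(mutation[1:-1]), mutation[-1]
    if seq[pos - 1] != old:
        raise ValueError(f"expected {old} at position {pos}, found {seq[pos - 1]}")
    return seq[:pos - 1] + new + seq[pos:]


def minimize(seq: str, first: int, last: int, linker: str) -> str:
    """Excise residues first..last (1-based, inclusive) and insert the linker."""
    if not 1 <= first <= last <= len(seq):
        raise ValueError("excision range lies outside the sequence")
    return seq[:first - 1] + linker + seq[last:]


def linker_meets_claim_5(linker: str) -> bool:
    """6 to 10 residues, at least 6 of them glycine and/or serine."""
    gly_ser = sum(1 for aa in linker if aa in "GS")
    return 6 <= len(linker) <= 10 and gly_ser >= 6


# Placeholder only: NOT the real Mxe GyrA sequence (SEQ ID NO: 1).
PLACEHOLDER_INTEIN = "CAT" + "A" * 193 + "HN"
LINKER = "GSGSGSGS"

mutated = apply_point_mutation(PLACEHOLDER_INTEIN, "T3C")
minimized = minimize(mutated, 107, 160, LINKER)

print(len(PLACEHOLDER_INTEIN), len(minimized), linker_meets_claim_5(LINKER))
```

With these placeholder inputs, excising residues 107-160 (54 residues) and inserting the 8-residue linker shortens the sequence from 198 to 152 residues; excising the full 107-164 region would give 148. Residues after the excision site are renumbered in the minimized sequence, which is why positions in the text are given "by alignment" to the parent intein.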

In one aspect, the target peptide is a peptide hormone that is α-amidated at its carboxy-terminus. Around 50% of all mammalian peptide hormones are α-amidated at their carboxy-terminus and this α-amide functionality is often crucial for the biological activity. Several peptide hormones are today used as drugs in the treatment of metabolic diseases such as diabetes and/or obesity. An example is amylin (e.g. Symlin®, pramlintide acetate, an analogue of human amylin). Human amylin is a 37-residue α-amidated peptide hormone that can be used to treat or prevent diabetes and/or obesity. Accordingly, the carboxy-terminus of amylin needs to be α-amidated to obtain full biological activity. Likewise PYY, PP, CT and α-CGRP should be α-amidated to obtain full biological activity. The carboxy-terminal α-amide moiety may also serve as protection against proteolytic degradation. A comparison of carboxy-terminal α-amidated and non-amidated glucagon-like peptide-1 (GLP-1) indicates that the amidation does not affect overall biological effects observed on insulin and glucagon secretion. However, extensive carboxy-terminal degradation occurs in plasma for the non-amidated variant. Thus, for some peptides and proteins carboxy-terminal amidation may be beneficial to introduce as a means of prolonging the presence of the full-length and biologically active protein in plasma. The present invention therefore provides an alternative method for obtaining peptides or proteins, which are carboxy-terminally α-amidated for biological activity or to protect against carboxypeptidase degradation in plasma.

The target peptide is released from the fusion protein as a carboxy-terminal α-thioester, which can react with many nucleophiles via nucleophilic acyl substitution reactions and the invention therefore further provides a method for producing a wider range of peptides comprising carboxy-terminal modifications such as peptide elongation by chemical ligation, bioconjugation with biological or non-biological entities, and amidation.

Peptide YY (PYY) and pancreatic polypeptide (PP) both belong to a group of peptides of the PP-fold family to which neuropeptide Y (NPY) also belongs. They are all naturally secreted as 36 amino acid peptides with a carboxy-terminal amide. They are characterised by a common three-dimensional fold, the PP-fold, which is considered as a stabilizing element important for their biological function. The amino acid sequences of human PYY(1-36) and human PP(1-36) are shown in SEQ ID NO: 2 and SEQ ID NO: 14, respectively. PP-fold peptides or analogues thereof have been suggested for use in the treatment of obesity and associated diseases based on the demonstrated anorectic effects of certain of these peptides in animal models and in man.

By PYY is meant the human peptide hormone and species variants thereof. In one aspect, the target peptide is PYY or an analogue thereof. PYY is released during a meal from L-cells in the distal small intestine and the colon. PYY is released as PYY(1-36) but is cleaved by dipeptidyl peptidase IV (DPP IV) to PYY(3-36) which constitutes approximately 50% of the circulating PYY. PYY(3-36) is known to have peripheral effects in the GI tract and also act centrally as a satiety signal. The terms "human PYY" and "hPYY" are intended to mean hPYY(1-36) according to SEQ ID NO: 2, or alternatively hPYY(3-36) according to SEQ ID NO: 11, which has a deletion of the amino-terminal amino acids in position 1 and 2. In one aspect the term PYY is intended to refer to human PYY. In one aspect, the target peptide is PYY(3-36). In one aspect, the target peptide is hPYY(3-36). In one aspect the target peptide is an analogue of PYY. In one aspect the target peptide is an analogue of hPYY. In one aspect, the target peptide is [Gly4]-PYY(4-36). In one aspect, the target peptide is [Gly4]-hPYY(4-36) of SEQ ID NO: 10. In one aspect, the target peptide is [Arg4, Gln18, Lys30]-PYY(3-36). In one aspect, the target peptide is [Arg4, Gln18, Lys30]-hPYY(3-36) of SEQ ID NO: 15.
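The analogue names used in this paragraph combine bracketed substitutions (three-letter residue code plus position in the native sequence) with a retained residue range in parentheses. As a hedged illustration, the following Python sketch expands such a name into a sequence. The helper `build_analogue` and the regular expression are hypothetical, and the 36-residue native string is a placeholder: only its length and the first four residues (Tyr-Pro-Ile-Lys, as stated for hPYY elsewhere in this description) come from the text, so it must not be read as SEQ ID NO: 2.

```python
import re

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Matches e.g. "[Gly4]-hPYY(4-36)" or "hPYY(3-36)" once spaces are removed.
NAME_RE = re.compile(
    r"^(?:\[(?P<subs>[^\]]*)\]-)?(?P<peptide>\w+)\((?P<start>\d+)-(?P<end>\d+)\)$"
)


def build_analogue(name: str, native: str) -> str:
    """Apply the substitutions, then keep only residues start..end (1-based)."""
    match = NAME_RE.match(name.replace(" ", ""))
    if match is None:
        raise ValueError(f"unrecognised analogue name: {name!r}")
    residues = list(native)
    if match["subs"]:
        for sub in match["subs"].split(","):
            three_letter, position = sub[:3], int(sub[3:])
            residues[position - 1] = THREE_TO_ONE[three_letter]
    start, end = int(match["start"]), int(match["end"])
    return "".join(residues[start - 1:end])


# Placeholder native sequence: residues 1-4 from the text, the rest dummy alanines.
PLACEHOLDER_HPYY = "YPIK" + "A" * 32

print(build_analogue("hPYY(3-36)", PLACEHOLDER_HPYY))
print(build_analogue("[Gly4]-hPYY(4-36)", PLACEHOLDER_HPYY))
print(build_analogue("[Arg4, Gln18, Lys30]-hPYY(3-36)", PLACEHOLDER_HPYY))
```

Positions in the brackets always refer to numbering in the full-length native peptide, so substitutions are applied before truncation; this is a design choice of the sketch that follows the way the names are read in the text.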

In one aspect, the target peptide is PP or an analogue thereof. Pancreatic polypeptide (PP) is a hormone secreted from the endocrine cells in pancreatic islets and release is stimulated by food intake. PP is known to reduce food intake and potentially increase energy expenditure. By PP is meant the human peptide hormone and species variants thereof. In one aspect, the target peptide is PP. In one aspect the target peptide is hPP. In one aspect, the target peptide is hPP(1-36) according to SEQ ID NO: 14.

Calcitonin gene-related peptide (CGRP) is a peptide which, in several species, exists in two forms, designated α-CGRP and β-CGRP (or CGRP-I and CGRP-II, respectively). CGRP peptides are highly conserved within species. CGRP is released from, e.g., sensory, motor and enteric nerves. CGRP triggers various pharmacological effects, e.g.: 1) vasodilation, 2) muscle and liver AMP kinase (AMPK) activation and lipolysis and/or fat oxidation, 3) reduction in food intake, 4) inhibition of gastric emptying and modification of gut function and 5) increasing glycolysis and inhibition of glycogen synthesis. Whereas AMPK activation, fat oxidation and reduced food intake may be beneficial in metabolic diseases, glycolysis and inhibition of glycogen synthesis have been suggested to mediate insulin resistance.

By α-CGRP is meant the human peptide hormone and species variants thereof. In one aspect, the target peptide is α-CGRP or an analogue thereof. In one aspect, the target peptide is α-CGRP. In one aspect, the target peptide is hα-CGRP of SEQ ID NO: 12.

Calcitonin (CT) has been used clinically for treatment of disorders of calcium metabolism and pain, and its relationship to increased glucose levels in mammals has been the subject of varying reports. The use of calcitonins in the treatment of diabetes mellitus has also been described.

Calcitonin is a small peptide produced by the parafollicular cells of the thyroid gland in mammals and by the ultimobranchial glands of birds and fish. Many types of calcitonin have been isolated, such as human calcitonin, salmon calcitonin, eel calcitonin, elkatonin, porcine calcitonin, and chicken calcitonin. There is significant structural non-homology among the various calcitonin types. For example, there is only 50% identity between the amino acids making up human calcitonin and those making up salmon calcitonin. Human calcitonin (hCT) is a peptide hormone containing 32 amino acid residues which is produced primarily by the parafollicular (also known as C) cells of the thyroid. Salmon calcitonin is also a 32-mer polypeptide.

By "calcitonin" or "CT" is meant the human peptide hormone and species variants thereof, including human calcitonin (hCT) or salmon calcitonin (sCT). In one aspect, the target peptide is CT or an analogue thereof. In one aspect, the target peptide is CT. In one aspect, the target peptide is human CT. In one aspect, the target peptide is hCT of SEQ ID NO: 13.

In one aspect, the target peptide is amylin or an analogue thereof. Human amylin (hAmylin) is a 37 amino acid long polypeptide of SEQ ID NO: 3 that binds to two distinct receptor complexes. These two complexes contain the calcitonin receptor plus a receptor activity-modifying protein, RAMP1 or RAMP3. From the close relationship between the calcitonin receptor and the amylin receptor some cross-reactivity to the calcitonin receptor may be expected of amylin receptor agonists. The calcitonin receptor is found in many tissues throughout the body and it is believed to be involved in regulation of bone metabolism.

Polypeptides with activity at the calcitonin receptor could be useful in the treatment of hypercalcaemia, osteoporosis, Paget's disease, obesity or obesity related diseases as well as in the prevention of obesity related diseases.

By amylin is meant the human peptide hormone and species variants thereof. In one aspect, the target peptide is amylin. In one aspect, the target peptide is [Asp14, Arg17, Pro21, Pro27, Arg35]-Amylin. In one aspect, the target peptide is [Asp14, Arg17, Pro21, Pro27, Arg35]-hAmylin of SEQ ID NO: 16.

In one aspect, the intein sequence is located at the carboxy-terminus of the target peptide. The terms "polypeptide" and "peptide" as used herein mean a compound composed of at least five constituent amino acids connected by peptide bonds. All amino acids for which the optical isomer is not stated are to be understood to mean the L-isomer. However, also contemplated within the scope of the invention are D-amino acid residues of one or more of the amino acids.

The constituent amino acids of the peptides according to the invention may be from the group of the amino acids encoded by the genetic code and they may be natural amino acids which are not encoded by the genetic code, as well as synthetic amino acids. Natural amino acids which are not encoded by the genetic code are e.g., γ-carboxyglutamate, ornithine, phosphoserine, D-alanine and D-glutamine. Synthetic amino acids comprise amino acids manufactured by chemical synthesis, i.e. D-isomers of the amino acids encoded by the genetic code such as D-alanine and D-leucine, Aib (α-aminoisobutyric acid), Abu (α-aminobutyric acid), Tle (tert-butylglycine), β-alanine, 3-aminomethyl benzoic acid, anthranilic acid.

The 22 proteinogenic amino acids are: Alanine, Arginine, Asparagine, Aspartic acid, Cysteine, Cystine, Glutamine, Glutamic acid, Glycine, Histidine, Hydroxyproline, Isoleucine, Leucine, Lysine, Methionine, Phenylalanine, Proline, Serine, Threonine, Tryptophan, Tyrosine, Valine.

Thus a non-proteinogenic amino acid is a moiety which can be incorporated into a peptide via peptide bonds but is not a proteogenic amino acid. Examples are v- carboxyglutamate, ornithine, phosphoserine, the D-amino acids such as D-alanine and D- glutamine, synthetic non-proteogenic amino acids comprise amino acids manufactured by chemical synthesis, i.e. D-isomers of the amino acids encoded by the genetic code such as D-alanine and D-leucine, Aib (oaminoisobutyric acid), Abu (oaminobutyric acid), Tie (tert- butylglycine), 3-aminomethyl benzoic acid, anthranilic acid, des-amino-Histidine, the beta analogues of amino acids such as β-alanine etc., D-histidine, desamino-histidine, 2-amino- histidine, β-hydroxy-histidine, homohistidine, Noacetyl-histidine, ofluoromethyl-histidine, o methyl-histidine, 3-pyridylalanine, 2-pyridylalanine or 4-pyridylalanine, (1 -aminocyclopropyl) carboxylic acid, (1-aminocyclobutyl) carboxylic acid, (1 -aminocyclopentyl) carboxylic acid, (1- aminocyclohexyl) carboxylic acid, (1-aminocycloheptyl) carboxylic acid, or (1 - aminocyclooctyl) carboxylic acid.

Unnatural amino acids for use in the invention include, but are not limited to, - thiotyrosine, ornithine, 3-mercaptophenylalanine, 3- or 4-aminophenylalanine, 3- or 4-acetylphenylalanine, 2- or 3-hydroxyphenylalanine (o- or m-tyrosine), hydroxymethylglycine, aminoethylglycine, 1-methyl-1-mercaptoethylglycine, aminoethylthioethylglycine and mercaptoethylglycine. Many of the unnatural amino acids useful in the present invention are commercially available. Others may be prepared by methods known in the art.

The term "analogue" as used herein referring to a peptide means a peptide wherein at least one amino acid residue of the peptide has been substituted with another amino acid residue and/or wherein at least one amino acid residue has been deleted from the peptide and/or wherein at least one amino acid residue has been added to the peptide and/or wherein at least one amino acid residue of the peptide has been modified. Such addition, substitution or deletion of amino acid residues can take place at the amino- and/or carboxy-terminus of the peptide and/or within the polypeptide sequence. The term "substitution" is intended to mean the change of one amino acid in the native sequence with another amino acid.

The term "deletion" is intended to mean the removal of one or more amino acids from the native sequence.

The term "insertion" is intended to mean the addition of one or more amino acids into the native sequence.

The term "modification" is intended to mean alterations covalently attached to the side chain of one or more amino acids or the alpha nitrogen atom of one or more amino acids in the native peptide sequence.

A simple nomenclature is used to describe the peptides according to the invention, e.g., [Gly4]-hPYY(4-36) designates an analogue of the human PYY (hPYY; SEQ ID NO: 2) wherein the naturally occurring lysine in position 4 has been substituted with glycine and the naturally occurring tyrosine, proline and isoleucine in positions 1, 2 and 3, respectively, have been deleted.

In one aspect the target peptide may be derived from vertebrates, such as a mammal, including human, mouse, sheep, goat, cow or horse. The term "vertebrate" means members of the subphylum Vertebrata, a primary division of the phylum Chordata that includes the fish, amphibians, reptiles, birds, and mammals, all of which are characterized by a segmented spinal column and a distinct well-differentiated head. The term "mammal" means humans as well as all other warm-blooded members of the animal kingdom possessed of a homeostatic mechanism in the class Mammalia, e.g., companion mammals, zoo mammals, and food-source mammals. Some examples of companion mammals are canines (e.g., dogs), felines (e.g., cats) and horses; some examples of food-source mammals are pigs, cattle, sheep, and the like. In one aspect the mammal is a human or a companion mammal. In one aspect the mammal is a human, male or female.
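The peptide nomenclature described above can be illustrated with a short sketch that turns a name such as [Gly4]-hPYY(4-36) into a one-letter sequence. The function name `apply_nomenclature`, the regular expression and the `HPYY` constant are illustrative assumptions; in particular, the hPYY(1-36) sequence is the commonly cited one and should be checked against SEQ ID NO: 2, which is not reproduced in this section.

```python
import re

# Assumed sequence of human PYY(1-36); verify against SEQ ID NO: 2 before use.
HPYY = "YPIKPEAPGEDASPEELNRYYASLRHYLNLVTRQRY"

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Optional "[Xaa4, Yaa18]-" prefix, parent name, optional "(first-last)" range.
NAME_RE = re.compile(
    r"^(?:\[(?P<subs>[^\]]+)\]-)?"
    r"(?P<parent>[^()\[\]]+?)"
    r"(?:\((?P<first>\d+)-(?P<last>\d+)\))?$"
)


def apply_nomenclature(name: str, native: str) -> str:
    """Return the sequence described by a name such as '[Gly4]-hPYY(4-36)'.

    Substitutions are numbered according to the native (parent) sequence and
    are applied before the (first-last) range is cut out.
    """
    match = NAME_RE.match(name.strip())
    if match is None:
        raise ValueError(f"unrecognised peptide name: {name!r}")

    residues = list(native)
    if match.group("subs"):
        for item in match.group("subs").split(","):
            sub = re.fullmatch(r"\s*([A-Z][a-z]{2})(\d+)\s*", item)
            if sub is None:
                raise ValueError(f"unrecognised substitution: {item!r}")
            position = int(sub.group(2))
            if not 1 <= position <= len(native):
                raise ValueError(f"position {position} is outside the native sequence")
            residues[position - 1] = THREE_TO_ONE[sub.group(1)]

    first = int(match.group("first")) if match.group("first") else 1
    last = int(match.group("last")) if match.group("last") else len(native)
    return "".join(residues[first - 1:last])
```

Under these assumptions, `apply_nomenclature("[Gly4]-hPYY(4-36)", HPYY)` yields a 33-residue sequence beginning with glycine, in which residues 1 to 3 of the native peptide are absent.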

In a further refinement of the system the fusion protein also comprises a tag, which allows for identification and/or purification of the fusion protein, and thus the peptide, by affinity or other chromatographic methods. Examples of a suitable tag include a specific chitin-binding domain, or part thereof, a repeat of acidic or basic amino acids, glutathione transferase tags, tags recovered with antibodies such as FLAG tag, HA tag, MYC tag, biotin or streptavidin and a small polypeptide sequence containing at least five histidines, a His-tag, for immobilized metal affinity chromatography. The purification tag may also comprise a highly basic ribosomal protein derived from thermophilic bacteria as described in international patent applications published under number WO 2006/108826 and WO 2008/043847. For example, the fusion protein may include a His-tag that binds tightly to an immobilized metal ion affinity chromatography column that can be used for the affinity purification of the intact fusion protein.

In one aspect, the fusion protein further comprises a purification tag and optionally a protease site, which allows for identification and/or purification by affinity chromatography or other chromatographic methods. In one aspect, the purification tag is alkaline. In one aspect, the purification tag comprises a histidine tag. In one aspect, the purification tag and optionally protease site is located at the amino-terminus of the target peptide.

Any expression system which can operate on a commercial scale is suitable although the intein based vector described above is designed for use in E. coli. Other vectors can be designed for optimal use in a particular expression system. For example, if a mammalian expression system were chosen, then protein-encoding regions should have optimised codon usage for that particular system. Examples of expression systems which could be used to express peptide fusion proteins include bacteria (E. coli, B. subtilis etc.), yeast (S. cerevisiae, P. pastoris etc.), insect cells (S. frugiperda), mammalian expression systems (Chinese hamster ovary, baby hamster kidney etc.), transgenic mammalian expression in milk or other body fluids (preferably pig, cow, sheep, goat, rabbit etc.). In the case of an E. coli expression system, the initiator methionine may be retained in the expression product. One example is PYY(3-36) that has an amino-terminal isoleucine, which is not expected to result in efficient removal of the initiator methionine by methionine aminopeptidase after expression in E. coli. However, the native amino-terminus of the target peptide can be obtained by fusing the amino-terminus to a label such as a purification tag with an intervening linker containing a protease site. Subsequent removal of the purification tag by use of an appropriate protease (e.g. alp, enterokinase or human rhinovirus-14 3C (HRV14-3C)) or peptidase (e.g. dipeptidyl aminopeptidase 1 (DAP1)) will generate the native amino-terminus of the target peptide. Alternatively, the initiator methionine can be removed chemically by contacting with cyanogen bromide.
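The remark on initiator methionine retention can be made concrete with a small sketch. It encodes the widely used rule of thumb that E. coli methionine aminopeptidase removes the initiator methionine efficiently only when the following residue has a small side chain. The residue set and the function name `initiator_met_likely_removed` are assumptions introduced for illustration and are not taken from this application; the PYY(3-36) sequence is likewise assumed and should be checked against the sequence listing.

```python
# Rule of thumb (an assumption, not from this application): residues with small
# side chains directly after the initiator Met permit its removal by MetAP.
SMALL_SECOND_RESIDUES = frozenset("GASTCPV")


def initiator_met_likely_removed(mature_sequence: str) -> bool:
    """Predict whether Met would be cleaved from Met-<mature_sequence>.

    mature_sequence is the intended product WITHOUT the initiator methionine,
    in one-letter code. Only its first residue is inspected.
    """
    if not mature_sequence:
        raise ValueError("mature_sequence must not be empty")
    return mature_sequence[0].upper() in SMALL_SECOND_RESIDUES


# Assumed one-letter sequence of human PYY(3-36); it starts with isoleucine.
PYY_3_36 = "IKPEAPGEDASPEELNRYYASLRHYLNLVTRQRY"
print(initiator_met_likely_removed(PYY_3_36))  # amino-terminal Ile
```

For a peptide with an amino-terminal isoleucine the sketch predicts retention of the methionine, which is the situation the tag-plus-protease-site design described above is meant to avoid.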

In one aspect, the fusion protein is expressed in bacteria, yeast, mammalian cells or in a body fluid of a transgenic mammal. In one aspect, the fusion protein is expressed in bacteria or yeast. In one aspect, the fusion protein is expressed in bacteria. In one aspect, the fusion protein is expressed in E. coli.

Thioesters are relatively reactive chemical groups, compared to either peptide bonds or oxygen-esters, and are therefore readily converted to amides under mild reaction conditions. The preferred thiol reagent is MESNa but many other mercapto-group (sulfhydryl-, thiol-group) containing reagents could also function effectively. The released MES α-thioester is relatively stable to hydrolysis by water (which would irreversibly generate the unwanted free acid) and is suitable for reaction under any chemical conditions that will promote amide formation.

In one aspect, the method comprises thiol-induced cleavage of the fusion protein resulting in the α-thioester of the target peptide. In one aspect, the method further comprises conversion of the α-thioester to the corresponding α-amide of the target peptide. In one aspect, the amidation step occurs in the presence of an ammonium nucleophile. In one aspect, the ammonium nucleophile is provided as an aqueous solution of ammonium bicarbonate.
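As a worked arithmetic aid for the cleavage and amidation steps described above, the following sketch computes the monoisotopic mass differences expected between the free carboxylic acid, the MES α-thioester and the α-amide of the same peptide. It assumes that the MES thioester is counted in its neutral sulfonic acid form (HS-CH2-CH2-SO3H condensed with the carboxy-terminus with loss of water); the constant and function names are illustrative and no measured values from the application are implied.

```python
from typing import Mapping

# Monoisotopic atomic masses in Da.
MONOISOTOPIC = {
    "H": 1.00782503,
    "C": 12.0,
    "N": 14.0030740,
    "O": 15.9949146,
    "S": 31.9720707,
}


def mass(formula: Mapping[str, int]) -> float:
    """Monoisotopic mass of an elemental composition such as {'H': 2, 'O': 1}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


WATER = {"H": 2, "O": 1}
AMMONIA = {"N": 1, "H": 3}
# 2-mercaptoethanesulfonic acid, neutral form (assumption: sodium is not counted).
MES_ACID = {"C": 2, "H": 6, "O": 3, "S": 2}

# Shifts relative to the free acid (peptide-COOH) of the same peptide.
THIOESTER_SHIFT = mass(MES_ACID) - mass(WATER)  # peptide-CO-S-CH2-CH2-SO3H
AMIDE_SHIFT = mass(AMMONIA) - mass(WATER)       # peptide-CO-NH2

print(f"free acid -> MES thioester: {THIOESTER_SHIFT:+.4f} Da")
print(f"free acid -> amide:         {AMIDE_SHIFT:+.4f} Da")
```

The small difference of about one dalton between the amide and the hydrolysed free acid is the reason the two products are usually distinguished by high-resolution mass spectrometry or chromatographic retention rather than by nominal mass alone.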

In one aspect, the invention provides a method for producing the [Gly4]-hPYY(4-36) of SEQ ID NO: 10, which comprises the step of expressing the peptide as part of a fusion protein, wherein said fusion protein comprises a target peptide of SEQ ID NO: 10, an intein of SEQ ID NO: 20, a purification tag of SEQ ID NO: 4 and a protease site of SEQ ID NO: 8, wherein said intein sequence is located at the carboxy-terminus of the target peptide and said purification tag and protease site is located at the amino-terminus of the target peptide.

In one aspect, the invention provides a method for producing hPYY(3-36) of SEQ ID NO: 11, which comprises the step of expressing the peptide as part of a fusion protein, wherein said fusion protein comprises a target peptide of SEQ ID NO: 11, an intein of SEQ ID NO: 20, a purification tag of SEQ ID NO: 5 and a protease site of SEQ ID NO: 9, wherein said intein sequence is located at the carboxy-terminus of the target peptide and said purification tag and protease site is located at the amino-terminus of the target peptide.

In one aspect, the invention provides a method for producing hα-CGRP of SEQ ID NO: 12, which comprises the step of expressing the peptide as part of a fusion protein, wherein said fusion protein comprises a target peptide of SEQ ID NO: 12, an intein of SEQ ID NO: 20, a purification tag of SEQ ID NO: 5 and a protease site of SEQ ID NO: 9, wherein said intein sequence is located at the carboxy-terminus of the target peptide and said purification tag and protease site is located at the amino-terminus of the target peptide.

In one aspect, the invention provides a method for producing hPP of SEQ ID NO: 14, which comprises the step of expressing the peptide as part of a fusion protein, wherein said fusion protein comprises a target peptide of SEQ ID NO: 14, an intein of SEQ ID NO: 20, a purification tag of SEQ ID NO: 5 and a protease site of SEQ ID NO: 9, wherein said intein sequence is located at the carboxy-terminus of the target peptide and said purification tag and protease site is located at the amino-terminus of the target peptide.

In one aspect, the invention provides a method for producing [Arg4, Gln18, Lys30]-hPYY(3-36) of SEQ ID NO: 15, which comprises the step of expressing the peptide as part of a fusion protein, wherein said fusion protein comprises a target peptide of SEQ ID NO: 15, an intein of SEQ ID NO: 20, and a purification tag of SEQ ID NO: 6, wherein said intein sequence is located at the carboxy-terminus of the target peptide and said purification tag is located at the amino-terminus of the target peptide.

In one aspect, the invention provides a method for producing [Asp14, Arg17, Pro21, Pro27, Arg35]-hAmylin of SEQ ID NO: 16, which comprises the step of expressing the peptide as part of a fusion protein, wherein said fusion protein comprises a target peptide of SEQ ID NO: 16, an intein of SEQ ID NO: 20, and a purification tag of SEQ ID NO: 7, wherein said intein sequence is located at the carboxy-terminus of the target peptide and said purification tag is located at the amino-terminus of the target peptide.

Expression could be optimised for any of these systems, and for intracellular or extracellular production, by the appropriate selection of leader sequence, codon usage, intein or mutant thereof, and purification strategy. The skilled person will appreciate that this invention is not tied to any particular target peptide or any species as a source.

EMBODIMENTS OF THE INVENTION

Non-limiting embodiments of the invention are:

1. A method for producing a peptide, which comprises the step of expressing the peptide as part of a fusion protein, wherein said fusion protein comprises a target peptide and an intein, wherein said intein is minimized in size and carries a cysteine mutation in the position corresponding by alignment to position 3 of Mxe GyrA intein (SEQ ID NO: 1).

2. A method according to embodiment 1, wherein said intein is a GyrA intein.

3. A method according to any of the preceding embodiments, wherein said intein is a Mxe GyrA intein.

4. A method according to any of the preceding embodiments, wherein said intein is a Mxe GyrA intein minimized in size by excision of residues 107-164 or a part of residues 107-164 of the Mxe GyrA intein (SEQ ID NO: 1).

5. A method according to any of the preceding embodiments, wherein said intein is a Mxe GyrA intein minimized in size by excision of residues 107-164 or a part of residues 107-164 of the Mxe GyrA intein (SEQ ID NO: 1) and where the excised residues are replaced by a linker comprising from 1 to 10 amino acids.

6. A method according to any of the preceding embodiments, wherein said intein is a Mxe GyrA intein minimized in size by excision of residues 107-164 or a part of residues 107-164 of the Mxe GyrA intein (SEQ ID NO: 1) and where the excised residues are replaced by a linker comprising from 6 to 10 amino acids, wherein at least 6 of the amino acids of the linker are glycine and/or serine.

7. A method according to any of the preceding embodiments, wherein said intein is a Mxe GyrA intein minimized in size by excision of residues 107-164 of the Mxe GyrA intein (SEQ ID NO: 1) and where the excised residues are replaced by a linker of sequence GSGSGSGS.

8. A method according to any of the preceding embodiments, wherein the sequence of said intein is SEQ ID NO: 20.

9. A method according to any of the preceding embodiments, wherein said intein sequence is located at the carboxy-terminus of the target peptide.

10. A method according to any of the preceding embodiments, wherein the target peptide is a peptide hormone involved in diabetes or obesity control.

11. A method according to any of the preceding embodiments, wherein the target peptide is a peptide hormone involved in diabetes control.

12. A method according to any of the preceding embodiments, wherein the target peptide is a peptide hormone involved in obesity control.

13. A method according to any of the preceding embodiments, wherein the target peptide is an α-amidated peptide.

14. A method according to any one of the preceding embodiments, wherein an endogenous form of the target peptide is an α-amidated peptide.

15. A method according to any of the preceding embodiments, wherein the target peptide is a carboxy-terminal bioconjugated peptide.

16. A method according to any of the preceding embodiments, wherein the target peptide is selected from PYY, PP, α-CGRP, CT and amylin or analogues thereof.

17. A method according to any of the preceding embodiments, wherein the target peptide is selected from PYY, PP, α-CGRP, and amylin or analogues thereof.

18. A method according to any one of the preceding embodiments, wherein the target peptide is PYY or an analogue thereof.

19. A method according to any one of the preceding embodiments, wherein the target peptide is hPYY or an analogue thereof.

20. A method according to any one of the preceding embodiments, wherein the target peptide is [Gly4]-PYY(4-36).

21. A method according to any one of the preceding embodiments, wherein the target peptide is [Gly4]-hPYY(4-36).

22. A method according to any one of the preceding embodiments, wherein the target peptide is PYY(3-36).

23. A method according to any one of the preceding embodiments, wherein the target peptide is hPYY(3-36).

24. A method according to any one of the preceding embodiments, wherein the target peptide is [Arg4, Gln18, Lys30]-PYY(3-36).

25. A method according to any one of the preceding embodiments, wherein the target peptide is [Arg4, Gln18, Lys30]-hPYY(3-36).

26. A method according to any one of the preceding embodiments, wherein the target peptide is α-CGRP or an analogue thereof.

27. A method according to any one of the preceding embodiments, wherein the target peptide is α-CGRP.

28. A method according to any one of the preceding embodiments, wherein the target peptide is hα-CGRP.

29. A method according to any one of the preceding embodiments, wherein the target peptide is CT or an analogue thereof.

30. A method according to any one of the preceding embodiments, wherein the target peptide is CT.

31. A method according to any one of the preceding embodiments, wherein the target peptide is hCT.

32. A method according to any one of the preceding embodiments, wherein the target peptide is PP or an analogue thereof.

33. A method according to any one of the preceding embodiments, wherein the target peptide is PP.

34. A method according to any one of the preceding embodiments, wherein the target peptide is hPP.

35. A method according to any one of the preceding embodiments, wherein the target peptide is amylin or an analogue thereof.

36. A method according to any one of the preceding embodiments, wherein the target peptide is hAmylin or an analogue thereof.

37. A method according to any one of the preceding embodiments, wherein the target peptide is [Asp14, Arg17, Pro21, Pro27, Arg35]-Amylin.

38. A method according to any one of the preceding embodiments, wherein the target peptide is [Asp14, Arg17, Pro21, Pro27, Arg35]-hAmylin.

39. A method according to any of the preceding embodiments, wherein the fusion protein further comprises a purification tag and optionally a protease site, which allows for identification and/or purification by affinity or other chromatographic methods.

40. A method according to embodiment 39, wherein the purification tag is alkaline.

41. A method according to embodiment 39, wherein the purification tag comprises a histidine tag.

42. A method according to any one of embodiments 39-41, wherein the purification tag and optionally protease site is located at the amino-terminus of the target peptide.

43. A method according to any one of the embodiments 39-42, wherein the purification tag may be cleaved off with an appropriate protease to give the target peptide.

44. A method according to any of the preceding embodiments, wherein the fusion protein is expressed in bacteria, yeast, mammalian cells or in a body fluid of a transgenic mammal.

45. A method according to any of the preceding embodiments wherein the fusion protein is expressed in bacteria or yeast.

46. A method according to any of the preceding embodiments, wherein the fusion protein is expressed in bacteria.

47. A method according to any of the preceding embodiments, wherein the fusion protein is expressed in E. coli.

48. A method according to any of the preceding embodiments, further comprising thiol-induced cleavage of the fusion protein resulting in the α-thioester of the target peptide.

49. A method according to embodiment 48 further comprising conversion of the α-thioester to the corresponding α-amide of the target peptide.

50. A method according to embodiment 49, wherein the amidation step occurs in the presence of an ammonium nucleophile.

51. A method according to embodiment 50, wherein the ammonium nucleophile is ammonium bicarbonate.

52. A method for producing an α-amidated peptide, which comprises the step of expressing the peptide as part of a fusion protein, wherein said fusion protein comprises a target peptide and an intein of SEQ ID NO: 20.

53. A method according to embodiment 52, wherein said intein sequence is located at the carboxy-terminus of the target peptide.

54. A method according to any of embodiments 52-53, wherein the fusion protein further comprises a purification tag and optionally a protease site, which allows for identification and/or purification by affinity or other chromatographic methods.

55. A method according to embodiment 54 wherein the purification tag comprises a histidine tag.

56. A method according to embodiment 55 wherein the purification tag comprises an alkaline tag.

57. A method according to any one of embodiments 54-56, wherein the purification tag and optionally protease site is located at the amino-terminus of the target peptide.

58. A method according to any one of the embodiments 54-57, wherein the purification tag may be cleaved off with an appropriate protease to give the target peptide.

59. A method according to any of embodiments 52-58, wherein the fusion protein is expressed in bacteria, yeast, mammalian cells or in a body fluid of a transgenic mammal.

60. A method according to embodiment 59, wherein the fusion protein is expressed in bacteria or yeast.

61. A method according to embodiment 59, wherein the fusion protein is expressed in bacteria.

62. A method according to embodiment 59, wherein the fusion protein is expressed in E. coli.

63. A method for producing [Gly4]-hPYY(4-36), which comprises the step of expressing the peptide as part of a fusion protein, wherein said fusion protein comprises a target peptide of SEQ ID NO: 10, an intein of SEQ ID NO: 20, a purification tag of SEQ ID NO: 4 and a protease site of SEQ ID NO: 8, and wherein said intein sequence is located at the carboxy-terminus of the target peptide and said purification tag and protease site is located at the amino-terminus of the target peptide.

64. A method for producing hPYY(3-36), which comprises the step of expressing the peptide as part of a fusion protein, wherein said fusion protein comprises a target peptide of SEQ ID NO: 11, an intein of SEQ ID NO: 20, a purification tag of SEQ ID NO: 5 and a protease site of SEQ ID NO: 9, and wherein said intein sequence is located at the carboxy-terminus of the target peptide and said purification tag and protease site is located at the amino-terminus of the target peptide.

65. A method for producing hα-CGRP, which comprises the step of expressing the peptide as part of a fusion protein, wherein said fusion protein comprises a target peptide of SEQ ID NO: 12, an intein of SEQ ID NO: 20, a purification tag of SEQ ID NO: 5 and a protease site of SEQ ID NO: 9, and wherein said intein sequence is located at the carboxy-terminus of the target peptide and said purification tag and protease site is located at the amino-terminus of the target peptide.

66. A method for producing hPP, which comprises the step of expressing the peptide as part of a fusion protein, wherein said fusion protein comprises a target peptide of SEQ ID NO: 14, an intein of SEQ ID NO: 20, a purification tag of SEQ ID NO: 5 and a protease site of SEQ ID NO: 9, and wherein said intein sequence is located at the carboxy-terminus of the target peptide and said purification tag and protease site is located at the amino-terminus of the target peptide.

67. A method for producing [Arg4, Gln18, Lys30]-hPYY(3-36), which comprises the step of expressing the peptide as part of a fusion protein, wherein said fusion protein comprises a target peptide of SEQ ID NO: 15, an intein of SEQ ID NO: 20, and a purification tag of SEQ ID NO: 6, and wherein said intein sequence is located at the carboxy-terminus of the target peptide and said purification tag is located at the amino-terminus of the target peptide.

68. A method for producing [Asp14, Arg17, Pro21, Pro27, Arg35]-hAmylin, which comprises the step of expressing the peptide as part of a fusion protein, wherein said fusion protein comprises a target peptide of SEQ ID NO: 16, an intein of SEQ ID NO: 20, and a purification tag of SEQ ID NO: 7, and wherein said intein sequence is located at the carboxy-terminus of the target peptide and said purification tag is located at the amino-terminus of the target peptide.

69. A fusion protein comprising a target peptide and an intein, wherein said intein is minimized in size and carries a cysteine mutation in the position corresponding by alignment to position 3 of Mxe GyrA intein (SEQ ID NO: 1).

70. A fusion protein according to embodiment 69, wherein said intein is a GyrA intein.

71. A fusion protein according to embodiment 70 wherein said intein is a Mxe GyrA intein.

72. A fusion protein according to embodiment 71, wherein said intein is a Mxe GyrA intein minimized in size by excision of residues 107-164 or a part of residues 107-164 of the Mxe GyrA intein (SEQ ID NO: 1) and where the excised residues are replaced by a linker of sequence GSGSGSGS.

73. A fusion protein according to embodiment 72, wherein said intein is a Mxe GyrA intein minimized in size by excision of residues 107-160 of the Mxe GyrA intein (SEQ ID NO: 1) and where the excised residues are replaced by a linker of sequence GSGSGSGS.

74. A fusion protein according to embodiment 73, wherein the sequence of said intein is SEQ ID NO: 20.

75. A fusion protein according to any of embodiments 69-74, wherein said intein sequence is located at the carboxy-terminus of the target peptide.

76. A fusion protein according to any of embodiments 69-75, wherein the target peptide is a peptide hormone involved in diabetes or obesity control.

77. A fusion protein according to any of embodiments 69-76, wherein the target peptide is a peptide hormone involved in diabetes control.

78. A fusion protein according to any of embodiments 69-76, wherein the target peptide is a peptide hormone involved in obesity control.

79. A fusion protein according to any of embodiments 69-78, wherein the target peptide is an α-amidated peptide.

80. A fusion protein according to any of embodiments 69-78, wherein an endogenous form of the target peptide is an α-amidated peptide.

81. A fusion protein according to any of embodiments 69-78, wherein the target peptide is a carboxy-terminal bioconjugated peptide.

82. A fusion protein according to any of embodiments 69-78, wherein the target peptide is selected from PYY, PP, α-CGRP, and amylin or analogues thereof.

83. A fusion protein according to embodiment 82, wherein the target peptide is PYY or an analogue thereof.

84. A fusion protein according to embodiment 83, wherein the target peptide is hPYY or an analogue thereof.

85. A fusion protein according to embodiment 83, wherein the target peptide is [Gly4]-PYY(4-36).

86. A fusion protein according to embodiment 83, wherein the target peptide is [Gly4]-hPYY(4-36).

87. A fusion protein according to embodiment 83, wherein the target peptide is PYY(3-36).

88. A fusion protein according to embodiment 83, wherein the target peptide is hPYY(3-36).

89. A fusion protein according to embodiment 83, wherein the target peptide is [Arg4, Gln18, Lys30]-PYY(3-36).

90. A fusion protein according to embodiment 83, wherein the target peptide is [Arg4, Gln18, Lys30]-hPYY(3-36).

91. A fusion protein according to embodiment 82, wherein the target peptide is α-CGRP or an analogue thereof.

92. A fusion protein according to embodiment 91, wherein the target peptide is α-CGRP.

93. A fusion protein according to embodiment 91, wherein the target peptide is hα-CGRP.

94. A fusion protein according to embodiment 82, wherein the target peptide is CT or an analogue thereof.

95. A fusion protein according to embodiment 94, wherein the target peptide is CT.

96. A fusion protein according to embodiment 94, wherein the target peptide is hCT.

97. A fusion protein according to embodiment 82, wherein the target peptide is PP or an analogue thereof.

98. A fusion protein according to embodiment 97, wherein the target peptide is PP.

99. A fusion protein according to embodiment 97, wherein the target peptide is hPP.

100. A fusion protein according to embodiment 82, wherein the target peptide is amylin or an analogue thereof.

101. A fusion protein according to embodiment 100, wherein the target peptide is [Asp14, Arg17, Pro21, Pro27, Arg35]-Amylin.

102. A fusion protein according to embodiment 100, wherein the target peptide is [Asp14, Arg17, Pro21, Pro27, Arg35]-hAmylin.

103. A fusion protein according to any of embodiments 69-102 further comprising a purification tag and optionally a protease site, which allows for identification and/or purification by affinity or other chromatographic methods.

104. A fusion protein according to embodiment 103, wherein the purification tag is alkaline.

105. A fusion protein according to embodiment 104, wherein the purification tag comprises a histidine tag.

106. A fusion protein according to any one of embodiments 103-105, wherein the purification tag and optionally protease site is located at the amino-terminus of the target peptide.

107. A fusion protein according to any one of embodiments 103-106, wherein the purification tag may be cleaved off with an appropriate protease to give the target peptide.

108. A fusion protein comprising a target peptide of SEQ ID NO: 10, an intein of SEQ ID NO: 20, a purification tag of SEQ ID NO: 4 and a protease site of SEQ ID NO: 8, wherein said intein sequence is located at the carboxy-terminus of the target peptide and said purification tag and protease site is located at the amino-terminus of the target peptide.

109. A fusion protein comprising a target peptide of SEQ ID NO: 11, an intein of SEQ ID NO: 20, a purification tag of SEQ ID NO: 5 and a protease site of SEQ ID NO: 9, wherein said intein sequence is located at the carboxy-terminus of the target peptide and said purification tag and protease site is located at the amino-terminus of the target peptide.

110. A fusion protein comprising a target peptide of SEQ ID NO: 12, an intein of SEQ ID NO: 20, a purification tag of SEQ ID NO: 5 and a protease site of SEQ ID NO: 9, wherein said intein sequence is located at the carboxy-terminus of the target peptide and said purification tag and protease site is located at the amino-terminus of the target peptide.

111. A fusion protein comprising a target peptide of SEQ ID NO: 14, an intein of SEQ ID NO: 20, a purification tag of SEQ ID NO: 5 and a protease site of SEQ ID NO: 9, wherein said intein sequence is located at the carboxy-terminus of the target peptide and said purification tag and protease site is located at the amino-terminus of the target peptide.

112. A fusion protein comprising a target peptide of SEQ ID NO: 15, an intein of SEQ ID NO: 20, and a purification tag of SEQ ID NO: 6, wherein said intein sequence is located at the carboxy-terminus of the target peptide and said purification tag is located at the amino-terminus of the target peptide.

113. A fusion protein comprising a target peptide of SEQ ID NO: 16, an intein of SEQ ID NO: 20, and a purification tag of SEQ ID NO: 7, wherein said intein sequence is located at the carboxy-terminus of the target peptide and said purification tag is located at the amino-terminus of the target peptide.

114. An intein which is minimized in size and carries a cysteine mutation in the position corresponding by alignment to position 3 of Mxe GyrA intein (SEQ ID NO: 1).

115. An intein according to embodiment 114, wherein said intein is a GyrA intein.

116. An intein according to embodiment 115, wherein said intein is a Mxe GyrA intein.

117. An intein according to embodiment 116, wherein said intein is a Mxe GyrA intein minimized in size by excision of residues 107-164 or a part of residues 107-164 of the Mxe GyrA intein (SEQ ID NO: 1) and where the excised residues are replaced by a linker of sequence GSGSGSGS.

118. An intein according to embodiment 117, wherein said intein is a Mxe GyrA intein minimized in size by excision of residues 107-160 of the Mxe GyrA intein (SEQ ID NO: 1) and where the excised residues are replaced by a linker of sequence GSGSGSGS.

119. An intein according to embodiment 118, wherein the sequence of said intein is SEQ ID NO: 20.

120. An intein of SEQ ID NO: 20.

121. A DNA construct coding for a fusion protein as defined in any one of embodiments 69-113.

122. A DNA construct of embodiment 121 which is in the form of a vector.
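Several of the embodiments above describe minimizing the Mxe GyrA intein by excising a stretch of residues (for example 107-164) and replacing it with a short linker such as GSGSGSGS. The operation can be sketched as a simple string splice with 1-based, inclusive residue numbering. The function name `minimize_intein` is hypothetical, and because SEQ ID NO: 1 is not reproduced in this section the example runs on a 198-residue placeholder string (the length is assumed from the N198A mutation naming), not on the real intein sequence.

```python
def minimize_intein(intein: str, first: int, last: int, linker: str) -> str:
    """Excise residues first..last (1-based, inclusive) and insert a linker.

    The residues before `first` and after `last` are kept unchanged.
    """
    if not 1 <= first <= last <= len(intein):
        raise ValueError("excision range must lie within the intein sequence")
    return intein[:first - 1] + linker + intein[last:]


# Placeholder only: 'n' marks residues 1-106, 'x' residues 107-164 and
# 'c' residues 165-198. This is NOT the sequence of SEQ ID NO: 1.
PLACEHOLDER_INTEIN = "n" * 106 + "x" * 58 + "c" * 34

minimized = minimize_intein(PLACEHOLDER_INTEIN, 107, 164, "GSGSGSGS")
print(len(PLACEHOLDER_INTEIN), "->", len(minimized))
```

With these assumptions the excision removes 58 residues and the linker adds 8, so the placeholder shrinks from 198 to 148 positions. Whether the resulting sequence corresponds to SEQ ID NO: 20 cannot be checked from this section.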

Examples

The examples of the invention are based on human variants of PYY (hPYY), PP (hPP), amylin (hAmylin), CT (hCT), and α-CGRP (hα-CGRP). The Mxe GyrA intein variant referred to as the native intein (SEQ ID NO: 17) carries an N198A mutation, disabling its natural self-splicing function and rendering it susceptible to intermolecular thiol-induced cleavage. Seven sets of plasmids comprising a total of 35 vectors were designed (Table 1).

Table 1. List of fusion proteins with respective numbers as they are referred to in the examples.

| Protein no. | Plasmid no. | N-terminal tag | Protease site | Target peptide | Intein |
|---|---|---|---|---|---|
| 1A | p1A | SEQ ID NO: 4 | SEQ ID NO: 8 | SEQ ID NO: 10 | SEQ ID NO: 17 |
| 1B | p1B | SEQ ID NO: 4 | SEQ ID NO: 8 | SEQ ID NO: 10 | SEQ ID NO: 19 |
| 1C | p1C | SEQ ID NO: 4 | SEQ ID NO: 8 | SEQ ID NO: 10 | SEQ ID NO: 20 |
| 1D | p1D | SEQ ID NO: 4 | SEQ ID NO: 8 | SEQ ID NO: 10 | SEQ ID NO: 18 |
| 1E | p1E | SEQ ID NO: 4 | SEQ ID NO: 8 | SEQ ID NO: 10 | SEQ ID NO: 21 |
| 1F | p1F | SEQ ID NO: 4 | SEQ ID NO: 8 | SEQ ID NO: 10 | SEQ ID NO: 22 |
| 1G | p1G | SEQ ID NO: 4 | SEQ ID NO: 8 | SEQ ID NO: 10 | SEQ ID NO: 23 |
| 1H | p1H | SEQ ID NO: 4 | SEQ ID NO: 8 | SEQ ID NO: 10 | SEQ ID NO: 24 |
| 1I | p1I | SEQ ID NO: 4 | SEQ ID NO: 8 | SEQ ID NO: 10 | SEQ ID NO: 25 |
| 1J | p1J | SEQ ID NO: 4 | SEQ ID NO: 8 | SEQ ID NO: 10 | SEQ ID NO: 26 |
| 1K | p1K | SEQ ID NO: 4 | SEQ ID NO: 8 | SEQ ID NO: 10 | SEQ ID NO: 27 |
| 1L | p1L | SEQ ID NO: 4 | SEQ ID NO: 8 | SEQ ID NO: 10 | SEQ ID NO: 28 |
| 1M | p1M | SEQ ID NO: 4 | SEQ ID NO: 8 | SEQ ID NO: 10 | SEQ ID NO: 29 |
| 1N | p1N | SEQ ID NO: 4 | SEQ ID NO: 8 | SEQ ID NO: 10 | SEQ ID NO: 30 |
| 1O | p1O | SEQ ID NO: 4 | SEQ ID NO: 8 | SEQ ID NO: 10 | SEQ ID NO: 31 |
| 1P | p1P | SEQ ID NO: 4 | SEQ ID NO: 8 | SEQ ID NO: 10 | SEQ ID NO: 32 |
| 1Q | p1Q | SEQ ID NO: 4 | SEQ ID NO: 8 | SEQ ID NO: 10 | SEQ ID NO: 33 |
| 2A | p2A | SEQ ID NO: 5 | SEQ ID NO: 9 | SEQ ID NO: 11 | SEQ ID NO: 17 |
| 2B | p2B | SEQ ID NO: 5 | SEQ ID NO: 9 | SEQ ID NO: 11 | SEQ ID NO: 19 |
| 2C | p2C | SEQ ID NO: 5 | SEQ ID NO: 9 | SEQ ID NO: 11 | SEQ ID NO: 20 |
| 3A | p3A | SEQ ID NO: 5 | SEQ ID NO: 9 | SEQ ID NO: 12 | SEQ ID NO: 17 |
| 3B | p3B | SEQ ID NO: 5 | SEQ ID NO: 9 | SEQ ID NO: 12 | SEQ ID NO: 19 |
| 3C | p3C | SEQ ID NO: 5 | SEQ ID NO: 9 | SEQ ID NO: 12 | SEQ ID NO: 20 |
| 4A | p4A | SEQ ID NO: 5 | SEQ ID NO: 9 | SEQ ID NO: 13 | SEQ ID NO: 17 |
| 4B | p4B | SEQ ID NO: 5 | SEQ ID NO: 9 | SEQ ID NO: 13 | SEQ ID NO: 19 |
| 4C | p4C | SEQ ID NO: 5 | SEQ ID NO: 9 | SEQ ID NO: 13 | SEQ ID NO: 20 |
| 5A | p5A | SEQ ID NO: 5 | SEQ ID NO: 9 | SEQ ID NO: 14 | SEQ ID NO: 17 |
| 5B | p5B | SEQ ID NO: 5 | SEQ ID NO: 9 | SEQ ID NO: 14 | SEQ ID NO: 19 |
| 5C | p5C | SEQ ID NO: 5 | SEQ ID NO: 9 | SEQ ID NO: 14 | SEQ ID NO: 20 |
| 6A | p6A | SEQ ID NO: 6 | (none) | SEQ ID NO: 15 | SEQ ID NO: 17 |
| 6B | p6B | SEQ ID NO: 6 | (none) | SEQ ID NO: 15 | SEQ ID NO: 19 |
| 6C | p6C | SEQ ID NO: 6 | (none) | SEQ ID NO: 15 | SEQ ID NO: 20 |
| 7A | p7A | SEQ ID NO: 7 | (none) | SEQ ID NO: 16 | SEQ ID NO: 17 |
| 7B | p7B | SEQ ID NO: 7 | (none) | SEQ ID NO: 16 | SEQ ID NO: 19 |
| 7C | p7C | SEQ ID NO: 7 | (none) | SEQ ID NO: 16 | SEQ ID NO: 20 |

Materials and Methods

List of Abbreviations

Amp Ampicillin

E. coli Escherichia coli

GyrA DNA gyrase A intein from Mycobacterium xenopi

hα-CGRP Human α-calcitonin gene-related peptide

hAmylin Human amylin

hCT Human calcitonin

HPLC High performance liquid chromatography

hPP Human pancreatic polypeptide

hPYY Human peptide YY

IB Inclusion bodies

IPTG Isopropyl β-D-1-thiogalactopyranoside

LB Luria-Bertani

LC-MS Liquid chromatography mass spectrometry

MESNa Sodium 2-mercaptoethanesulfonate

OD600 Optical density at 600 nm

RP-UPLC Reversed phase ultra performance liquid chromatography

SDS-PAGE Sodium dodecylsulfate polyacrylamide gel electrophoresis

TAP Thermostable Alkaline Protein

TFA Trifluoroacetic acid

T. maritima Thermotoga maritima

General Methods of Preparation of Fusion Proteins

Cloning of constructs:

The plasmids were either generated by cloning synthetic gene fragments obtained from Geneart (Regensburg, Germany) into a pET11a vector (Novagen) using XhoI/BamHI, NheI/BamHI or NheI/NsiI restriction enzyme sites and T4 DNA ligase (New England BioLabs, Ipswich, MA) by methods described by the manufacturer, or were provided by Geneart as sub-cloned vectors. In either case the genes encoding the fusion proteins had been codon optimized for expression in E. coli. The vector map of the plasmids is exemplified in Figure 1.

The plasmids were used to transform competent TOP10 (Invitrogen) E. coli cells and incubated overnight on LB medium agar plates containing 100 μg/ml Ampicillin.

Plasmids encoding the respective proteins of interest were obtained from positive clones following plasmid propagation in liquid LB/Amp (100 μg/ml Ampicillin) medium and standard mini-preparations. Correctness of the nucleotide sequences was verified by DNA sequencing using T7 promoter/terminator sequence specific primers (performed by Eurofins MWG Operon, Ebersberg, Germany).

Expression of Fusion Proteins in Shake-Flasks:

Plasmids were transformed into E. coli expression strain BL21 (DE3), which was plated on LB/Amp plates and incubated at 37 °C overnight. BL21 (DE3) cells containing the appropriate plasmid were grown to OD600 of 0.4-0.6 at 37 °C in LB/Amp medium using shaker flasks. The cells were induced with 0.5 mM IPTG at 37 °C for 3 h. Following induction, the cells were harvested by centrifugation (4000 x g, 4 °C, 15 min, Heraeus Multifuge 3 S-R, DJB Labcare Ltd., Newport Pagnell, England). The harvested cells were lysed by sonication (40% power with pulses of 15 sec on and off for 20 min performed on ice, Bandelin Sonopuls, Buch og Holm) in 25 mM sodium phosphate pH 5 buffer. The insoluble proteins were spun down (4000 x g, 4 °C, 15 min) and the cell lysate decanted.

SDS-PAGE Analysis:

Coomassie-stained SDS-PAGE analysis was carried out on lysate samples of induced and uninduced cells, as well as on soluble and insoluble fractions of induced cells (obtained by ultrasonication and centrifugation in 25 mM sodium phosphate pH 5 buffer), all prepared in sample buffer (BioRad XT sample buffer + 4 mM TCEP). Band intensities of fusion proteins and inteins were estimated by densitometric analysis of SDS-PAGE gels using ImageJ.

General Methods of Detection and Characterization

UPLC method A: Quantification of Fusion Proteins

The protein concentration of the fusion proteins was quantified by RP-UPLC using the Acquity UPLC System (Waters) and reference Std 1 (purified fusion protein 1B, 0.169 mg/ml). The samples were analyzed using an Acquity UPLC BEH300 C4 (1.7 μm, 2.1 x 100 mm, Waters) column and a gradient elution using 0.1% TFA in water (Eluent A) and 0.08% TFA in acetonitrile (Eluent B) at 0.4 ml/min and a column temperature of 40 °C with 214 nm detection as follows:

Time (min)   % Eluent B
0            20
5            20
20           60
20.5         99
23           99
23.1         20
26           20
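The gradient programme of UPLC method A can be read as a piecewise-linear function of time. The following is a minimal sketch, assuming straight-line segments between the listed time points (the instrument's actual curve settings are not stated in the text); the names `GRADIENT_A` and `percent_b` are illustrative only.

```python
from bisect import bisect_right

# Gradient table of UPLC method A as (time in min, % Eluent B).
GRADIENT_A = [(0, 20), (5, 20), (20, 60), (20.5, 99), (23, 99), (23.1, 20), (26, 20)]


def percent_b(t, table=GRADIENT_A):
    """Return % Eluent B at time t (min), assuming linear segments between points."""
    times = [point[0] for point in table]
    if t < times[0] or t > times[-1]:
        raise ValueError("time outside the programmed gradient")
    i = bisect_right(times, t) - 1          # index of the segment start
    if i == len(table) - 1:                 # exactly at the final time point
        return float(table[-1][1])
    (t0, b0), (t1, b1) = table[i], table[i + 1]
    return b0 + (b1 - b0) * (t - t0) / (t1 - t0)


# Halfway through the 5-20 min ramp from 20% to 60% B.
print(percent_b(12.5))
```

Under this assumption the separating ramp rises by 40 percentage points over 15 min, followed by a wash at 99% B and re-equilibration at 20% B.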

A sample set was initiated with a single injection of Std 1 to control the system and ended with a double injection of Std 1 for calibration of samples.

UPLC method B: Analysis of Thiol-induced Cleavage

Aliquots collected from cleavage reactions of the fusion proteins were analysed by RP-UPLC using the Acquity UPLC System (Waters) and an Acquity UPLC BEH C18 (2.1 x 150 mm, 1.7 μm, Waters) column connected to a BEH C18 (2.1 x 5 mm, 1.7 μm, Waters) VanGuard™ pre-column. A two-step gradient elution using 0.1% TFA in water (Eluent A) and 0.08% TFA in acetonitrile (Eluent B) at 0.4 ml/min and a column temperature of 50 °C with 214 nm detection was applied as follows:

UPLC method C: Quantification of Peptide α-thioester

The concentration of α-thioester was quantified by RP-UPLC using the Acquity UPLC System (Waters) and reference Std 2 (purified peptide α-thioester of fusion protein 1A, 0.144 mg/ml). An Acquity UPLC BEH C18 (2.1 x 150 mm, 1.7 μm, Waters) column connected to a BEH C18 (2.1 x 5 mm, 1.7 μm, Waters) VanGuard™ pre-column was applied. A linear gradient elution of 0.1% TFA in water (Eluent A) and 0.08% TFA in acetonitrile (Eluent B) at 0.4 ml/min and a column temperature of 50 °C with 214 nm detection was used as follows:

A sample set was ended with a double injection of Std 2 for calibration of samples.

LC-MS Analysis A: Characterization of Fusion Proteins and Inteins

Characterisation of fusion proteins and cleavage mixtures was performed by LC-MS analysis using an LC-MSD-TOF (Agilent Technologies) instrument with a Zorbax 300SB-C18 rapid resolution (2.1 x 50 mm, 3.5 μm, Agilent Technologies) column and a column temperature of 40 °C. MS ionisation mode was set to positive ion. Scan 100-3000 amu. A linear gradient of 8.8 mM ammonium formate in 0.1% formic acid in water (Eluent A) and 0.1% formic acid in acetonitrile (Eluent B) with a flow of 0.3 ml/min and detection at 214 nm was used as follows:

LC-MS Analysis B: Analysis of Enzymatic Cleavages

The enzymatic digestion with proteolytic enzyme to remove purification tags was analyzed by LC-MS using an LC-MSD-TOF (Agilent Technologies) instrument and a Zorbax 300SB-C18 rapid resolution (2.1 x 50 mm, 3.5 μm, Agilent Technologies) column with a column temperature of 40 °C. MS ionisation mode was set to positive ion. Scan 100-3000 amu. A linear gradient of 8.8 mM ammonium formate in 0.1% formic acid in water (Eluent A) and 0.1% formic acid in acetonitrile (Eluent B) with a flow of 0.3 ml/min and detection at 214 nm was used as follows:

Example 1. Measured activities of engineered inteins

The objective was to generate a functional Mxe GyrA intein that was minimized in size and carried a T3C mutation. To achieve this, a set of seventeen fusion proteins that each carried a different Mxe GyrA intein variant were generated and the rate constants of their thiol-induced cleavage reactions were used as a measure for the intein activity.

All seventeen fusion proteins (proteins 1A-1Q, Table 1) comprised the target peptide [Gly4]-hPYY(4-36) (SEQ ID NO: 10) fused between an amino-terminal extension, attached via a protease site, and a carboxy-terminal Mxe GyrA intein variant. The amino-terminal extension was a variant of ribosomal protein L27 from Thermotoga maritima (RL27tm, SEQ ID NO: 4) that allows purification by cation exchange chromatography. The protease site (SEQ ID NO: 8) contained the EVLFQ sequence that is recognized by the HRV14-3C protease and allows removal of the amino-terminal extension. Protein 1A contained the native Mxe GyrA intein variant, protein 1D contained the [Cys3]-Mxe GyrA intein variant, proteins 1B and 1E-1Q contained a minimized Mxe GyrA intein variant, and protein 1C contained a minimized Mxe GyrA intein variant with the T3C mutation. The minimized inteins had 46-57 amino acids between the amino- and carboxy-terminal splicing regions removed or substituted with a small linker (Table 2). More specifically, the removed region comprised a part of residues 103-164 of the Mxe GyrA intein, which according to the crystal structure (Klabunde T. (1998) Nature Structural Biology, 5, 31-36) are located between β-strands 9 and 10 of the carboxy-terminal splicing region. The examined linkers were the LDRHGN sequence that links β-strands 4 and 5 of the N-terminal splicing domain of the Mxe GyrA intein, an α-helical ADNLALA linker sequence from the linker database (George RA. and Heringa J. (2002) Protein Engineering, 15, 871-879), the RDVETGE linker that links β-strands 9 and 10 of the hedgehog protein of Drosophila melanogaster (as described by Hiraga, K. et al. (2005) J. Mol. Biol. 354, 916-926), and a flexible GSGSGSGS linker.

Table 2. List of intein variants with respective deletion size, linker and mutations.

Both residues remain part of the truncated intein variants.

Thiol-induced cleavage of fusion proteins:

The fusion proteins 1A to 1Q were expressed in shake-flasks containing 200 mL LB medium as described in the general methods. The insoluble proteins were washed twice with MilliQ water, divided into smaller fractions and stored at -20 °C. The insoluble protein pellets were re-suspended in solubilisation buffer [100 mM sodium phosphate, pH 7.5, 5 M urea] and incubated on ice for 1 h followed by filtration (0.45 μm). The protein concentrations were determined by RP-UPLC method A, followed by dilution to a final concentration of 0.6 mg/ml with solubilisation buffer. Aliquots of the protein solutions (325 μL) were refolded by dilution into a mixture of 485 μL aqueous dilution buffer A [100 mM sodium phosphate, pH 7.5, 250 mM NaCl] and 190 μL buffer B [100 mM sodium phosphate pH 7.5, 2 M urea, 150 mM NaCl], resulting in a final concentration of 0.2 mg/ml protein in 100 mM sodium phosphate, pH 7.5, 2 M urea, 150 mM NaCl. Cleavage was induced by addition of MESNa to 100 mM from a 2 M stock in buffer B. The reaction was allowed to proceed at 5 °C in an Eppendorf tube overnight with agitation (300 rpm). Aliquots were collected over time, quenched with one volume of 1.7% HCl in 6 M guanidine hydrochloride to approximately pH 3, analyzed by RP-UPLC method B and characterized by LC-MS method A. Triplicate determinations were performed and product formation was calculated as $\mathrm{area}_{\alpha\text{-thioester product}} / (\mathrm{area}_{\alpha\text{-thioester product}} + \mathrm{area}_{\mathrm{protein}})$, with the α-thioester product being the amino-terminal tagged [Gly4]-hPYY(4-36) α-thioester and protein being the fusion protein. Product formation was plotted as a function of time. Pseudo-first-order conditions were achieved by using MESNa in great molar excess, and rate constants ($k_{obs}$) were determined by fitting the data to the equation $P = P_0 (1 - e^{-kt})$, where $P$ is the percentage of formed peptide α-thioester product at time $t$, $P_0$ is the maximum percentage of peptide α-thioester product obtained, and $k$ is the observed rate constant, using GraphPad Prism 5.01 (GraphPad Software Inc., La Jolla, CA).
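The fitting step can be illustrated outside GraphPad Prism. The sketch below is not the procedure used in the examples: it is a dependency-free least-squares fit of the same model, P = P0(1 - e^(-kt)), in which the best P0 for a given k is computed in closed form and k is found by a golden-section search. The function name, the search bounds and the time course are all hypothetical; the data are synthetic and generated from known parameters only to show that the fit recovers them.

```python
import math


def fit_first_order(times, product, k_lo=1e-3, k_hi=10.0, iters=200):
    """Least-squares fit of P = P0 * (1 - exp(-k * t)); returns (P0, k).

    Assumes the sum of squared errors has a single minimum in [k_lo, k_hi].
    """
    def best_p0(k):
        # For fixed k the model is linear in P0, so P0 has a closed-form optimum.
        f = [1.0 - math.exp(-k * t) for t in times]
        return sum(y * fi for y, fi in zip(product, f)) / sum(fi * fi for fi in f)

    def sse(log_k):
        k = math.exp(log_k)
        p0 = best_p0(k)
        return sum((y - p0 * (1.0 - math.exp(-k * t))) ** 2
                   for t, y in zip(times, product))

    # Golden-section search on log(k).
    a, b = math.log(k_lo), math.log(k_hi)
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = b - phi * (b - a), a + phi * (b - a)
    for _ in range(iters):
        if sse(c) < sse(d):
            b, d = d, c
            c = b - phi * (b - a)
        else:
            a, c = c, d
            d = a + phi * (b - a)
    k = math.exp((a + b) / 2.0)
    return best_p0(k), k


# Synthetic, noise-free time course (arbitrary time units), NOT measured data.
true_p0, true_k = 80.0, 0.30
times = [0, 1, 2, 4, 8, 16, 24]
product = [true_p0 * (1.0 - math.exp(-true_k * t)) for t in times]

p0_fit, k_fit = fit_first_order(times, product)
print(p0_fit, k_fit)
```

With real, noisy triplicate data a dedicated nonlinear regression routine that also reports confidence intervals would be the more appropriate choice.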

1.1. Measured activities of the functional minimized intein:

Initially, the Mxe GyrA intein was optimized with respect to its size. The truncated intein variants had as a minimum 46 residues removed between Ala103 and Ala164 of the Mxe GyrA intein sequence (corresponding to 23% of the overall intein size). Removal of residues 112-157 (46 amino acids) of the intein (SEQ ID NO: 21) in protein 1E reduced the splicing activity slightly compared to protein 1A, whereas removal of residues 107-160 (54 amino acids) of the intein (SEQ ID NO: 22) in protein 1F resulted in an inactive intein (Table 3). The difference between proteins 1E and 1F is that the intein of protein 1E has some unstructured amino acids remaining between β-strands 9 and 10, whereas the entire unstructured region between β-strands 9 and 10 has been removed in protein 1F. Introduction of a flexible GSGSGSGS linker between residues Gln106 and Phe161 of the intein in protein 1F, resulting in protein 1B containing an engineered intein sequence with SEQ ID NO: 19, restored activity to a level comparable to the native intein of protein 1A (Table 3). To examine whether the intein in protein 1B could be minimized even further, residues on either side of the GSGSGSGS linker were systematically removed, resulting in six fusion proteins, 1L-1Q. In proteins 1L, 1M and 1N, one, two and three amino acids, respectively, were removed amino-terminally to the GSGSGSGS linker of the intein (SEQ ID NO: 28-30), and this caused significant decreases in intein activity, with no cleavage observed for protein 1N (Table 3). In contrast, compared to protein 1B, moderate 1.1-, 1.2- and 1.7-fold drops in activity were observed for the inteins (SEQ ID NO: 31-33) in proteins 1O, 1P and 1Q, respectively, in which one to three amino acids had been removed carboxy-terminally to the GSGSGSGS linker of the intein.

To evaluate the effect of differences in structural integrity and charge of the linker region, a range of other linkers were inserted between the deletion sites. In proteins 1G and 1H, the LDRHGN linker was inserted between Ser111 and Asp158 (SEQ ID NO: 23) or Gln106 and Phe161 (SEQ ID NO: 24) of the intein, respectively. Compared to the corresponding inteins of proteins 1E and 1F, where no linker was present, the LDRHGN linker of proteins 1G and 1H did not alter the intein activity (Table 3). When the hedgehog RDVETGE linker (in SEQ ID NO: 25), an α-helical ADNLALA linker (in SEQ ID NO: 26) or an alkaline GRGSGRGS linker (in SEQ ID NO: 27) was introduced in place of the GSGSGSGS linker of protein 1B, resulting in proteins 1I, 1J and 1K, respectively, the intein activity was reduced by a moderate 1.8-, 4.8- and 1.5-fold (Table 3).

The most active minimized intein was the one in protein 1B, in which residues 107-160 had been substituted with a GSGSGSGS linker (SEQ ID NO: 19).
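The excision-and-linker bookkeeping behind these variants can be sketched as a simple string operation. The example below is an illustration only: it assumes a 198-residue intein (consistent with the N198A numbering), uses a placeholder string of `X` characters instead of the actual SEQ ID NO: 1 sequence, and the function name `minimize_intein` is hypothetical.

```python
def minimize_intein(seq, first_removed, last_removed, linker=""):
    """Excise residues first_removed..last_removed (1-based, inclusive) and insert a linker."""
    if not (1 <= first_removed <= last_removed <= len(seq)):
        raise ValueError("excision range outside the sequence")
    return seq[:first_removed - 1] + linker + seq[last_removed:]


# Placeholder only: 198 'X' residues stand in for the Mxe GyrA intein,
# whose real sequence (SEQ ID NO: 1) is not reproduced here.
placeholder = "X" * 198

# Residues 107-160 replaced by GSGSGSGS, as described for the intein of protein 1B.
variant_1b = minimize_intein(placeholder, 107, 160, "GSGSGSGS")
# Residues 112-157 removed without a linker, as described for the intein of protein 1E.
variant_1e = minimize_intein(placeholder, 112, 157)

print(len(variant_1b), len(variant_1e))
print(round(100 * 46 / len(placeholder)))  # share of the intein removed by a 46-residue deletion
```

Under the 198-residue assumption, a 46-residue deletion corresponds to about 23% of the intein, which agrees with the figure quoted in the text.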

1.2. Introducing a T3C mutation into the Minimized Intein:

Next, a T3C mutation was introduced into the minimized intein of protein 1B, resulting in protein 1C, which contained the [Cys3, 106(GS)₄161]-Mxe GyrA intein (SEQ ID NO: 20). The full length T3C intein variant, [Cys3]-Mxe GyrA (SEQ ID NO: 18), in protein 1D was used as a control. The T3C mutation of proteins 1D and 1C resulted in a decreased intein activity compared to the native intein of protein 1A (Table 3). The decrease in activity was more pronounced for the minimized intein in protein 1C than for the full length intein in protein 1D.

Several active Mxe GyrA intein variants that were reduced in size by approximately 25% were identified. These minimized inteins had 46-57 amino acids between Gln106 and Phe161 removed or substituted with a linker. The most active minimized intein was [106(GS)₄161]-Mxe GyrA (SEQ ID NO: 19), which had a flexible GSGSGSGS linker introduced instead of residues 107-160. The corresponding intein with a T3C mutation, [Cys3, 106(GS)₄161]-Mxe GyrA (SEQ ID NO: 20), was also active, but resulted in slower thiol-induced cleavage.

Table 3. List of molecular weights of fusion proteins and their corresponding intein variants. Reaction rates of thiol-induced cleavages are further listed.

No or negligible cleavage observed.

Example 2. Protein expression levels using an engineered intein

The objective was to determine how the engineered intein, [Cys3, 106(GS)₄161]-Mxe GyrA (SEQ ID NO: 20), affected the expression level of fusion proteins containing desired target peptides. For each of the target peptides, [Gly4]-hPYY(4-36) (SEQ ID NO: 10), hPYY(3-36) (SEQ ID NO: 11), hα-CGRP (SEQ ID NO: 12), hCT (SEQ ID NO: 13), hPP (SEQ ID NO: 14), [Arg4, Gln18, Lys30]-hPYY(3-36) (SEQ ID NO: 15), and [Asp14, Arg17, Pro21, Pro27, Arg35]-hAmylin (SEQ ID NO: 16), three different fusion proteins were constructed. All fusion proteins comprised an amino-terminal extension, the target peptide, and a Mxe GyrA intein variant. The A, B and C fusion proteins of each of the respective target peptides contained the native Mxe GyrA (SEQ ID NO: 17), [106(GS)₄161]-Mxe GyrA (SEQ ID NO: 19), and [Cys3, 106(GS)₄161]-Mxe GyrA (SEQ ID NO: 20) inteins, respectively.

Expression of fusion proteins:

BL21 (DE3) cells containing the appropriate plasmids were grown in LB media (100 μg/mL ampicillin) using TPP cell suspension tubes at 37 °C to OD600 of approximately 0.4. The cell cultures were cooled to 18 °C for 20-30 min and expression of fusion proteins was induced by adding 0.5 mM IPTG at OD600 of 0.4-0.6. Protein expression was allowed to proceed at 18 °C overnight. The cells were harvested in 1 mL aliquots by centrifugation. SDS-PAGE analysis was carried out as described in the general methods. Percentage hydrolysis and expression level were estimated from band intensities of the induced lysate samples (Table 4). Specifically, hydrolysis was calculated as $\mathrm{area}_{\mathrm{intein}} / (\mathrm{area}_{\mathrm{protein}} + \mathrm{area}_{\mathrm{intein}})$, and the relative expression level represents the area of fusion proteins A and B relative to the area of fusion protein C within each group. Percentage solubility was estimated from the soluble and insoluble fractions and was calculated as $\mathrm{area}_{\text{soluble protein}} / (\mathrm{area}_{\text{soluble protein}} + \mathrm{area}_{\text{insoluble protein}})$, with $\mathrm{area}_{\text{soluble protein}}$ and $\mathrm{area}_{\text{insoluble protein}}$ representing the total area of intein and fusion protein in the soluble and insoluble samples, respectively.

2.1. Expression of PYY-intein fusion proteins

The protein 1 family, comprising proteins 1A, 1B, and 1C, includes [Gly4]-hPYY(4-36) fused between an amino-terminal extension (containing an alkaline tag (SEQ ID NO: 4) linked via an HRV14-3C protease site (SEQ ID NO: 8)) and a carboxy-terminal Mxe GyrA intein variant being native Mxe GyrA (SEQ ID NO: 17), [106(GS)₄161]-Mxe GyrA (SEQ ID NO: 19), and [Cys3, 106(GS)₄161]-Mxe GyrA (SEQ ID NO: 20), respectively. Proteins 1A-1C were predominantly obtained in the soluble fraction (Table 4). Approximately 50% of protein 1A and protein 1B was hydrolysed, whereas no hydrolysis was observed for the engineered intein in protein 1C. The absence of hydrolysis was associated with an approximately 30% increased expression level for protein 1C compared to proteins 1A and 1B.

The protein 2 family, comprising 2A, 2B, and 2C, includes hPYY(3-36) fused between an amino-terminal extension (containing a His tag (SEQ ID NO: 5) linked via an enterokinase protease site (SEQ ID NO: 9)) and a carboxy-terminal Mxe GyrA intein variant being native Mxe GyrA, [106(GS)₄161]-Mxe GyrA, and [Cys3, 106(GS)₄161]-Mxe GyrA, respectively. Proteins 2A-2C were predominantly obtained in the soluble fraction (Table 4). Approximately 60% and 50% of protein 2A and 2B was hydrolysed, respectively, whereas no hydrolysis was observed for protein 2C. This was associated with a 30-40% increase in expression level of protein 2C compared to 2A and 2B.

The protein 6 family, comprising 6A, 6B, and 6C, includes [Arg4, Gln18, Lys30]-hPYY(3-36) fused between an amino-terminal histidine rich dodecapeptide extension that can be removed by DAP1 (SEQ ID NO: 6) and a carboxy-terminal Mxe GyrA intein variant being native Mxe GyrA, [106(GS)₄161]-Mxe GyrA, and [Cys3, 106(GS)₄161]-Mxe GyrA, respectively. All three fusion proteins were mainly obtained in the soluble fraction (Table 4). Approximately 50% of proteins 6A and 6B was hydrolysed, whereas no hydrolysis was observed for protein 6C. This was further associated with a 15-30% decreased expression level of protein 6A and 6B relative to 6C.

2.2. Expression of α-CGRP-intein fusion proteins

The protein 3 family, comprising 3A, 3B, and 3C, includes hα-CGRP fused between an amino-terminal extension (containing a His tag (SEQ ID NO: 5) and an enterokinase protease site (SEQ ID NO: 9)) and a carboxy-terminal Mxe GyrA intein variant being native Mxe GyrA, [106(GS)₄161]-Mxe GyrA, and [Cys3, 106(GS)₄161]-Mxe GyrA, respectively. All three fusion proteins were mainly obtained in the soluble fraction (Table 4). Approximately 50% of protein 3A and 3B was hydrolysed, whereas no hydrolysis was observed for protein 3C. This was further associated with an approximately 30% decreased expression level of protein 3A and 3B relative to 3C.

2.3. Expression of CT-intein fusion proteins

The protein 4 family, comprising 4A, 4B, and 4C, includes hCT fused between an amino-terminal extension (containing a His tag (SEQ ID NO: 5) and an enterokinase protease site (SEQ ID NO: 9)) and a carboxy-terminal Mxe GyrA intein variant being native Mxe GyrA, [106(GS)₄161]-Mxe GyrA, and [Cys3, 106(GS)₄161]-Mxe GyrA, respectively. All three proteins were obtained mainly in the soluble fraction without any hydrolysis (Table 4). In this case the expression levels of proteins 4A and 4B were higher compared to protein 4C.

2.4. Expression of PP-intein fusion proteins

The protein 5 family, comprising 5A, 5B, and 5C, includes hPP fused between an amino-terminal extension (containing a His tag (SEQ ID NO: 5) and an enterokinase protease site (SEQ ID NO: 9)) and a carboxy-terminal Mxe GyrA intein variant being native Mxe GyrA, [106(GS)₄161]-Mxe GyrA, and [Cys3, 106(GS)₄161]-Mxe GyrA, respectively. Proteins 5A-5C were obtained in the soluble fraction (Table 4). Whereas approximately 50% hydrolysis was observed for protein 5A and 5B, no hydrolysis was observed for protein 5C. This was associated with an approximately 10-20% increased expression level of protein 5C compared to 5A and 5B.

2.5. Expression of amylin-intein fusion proteins

The protein 7 family, comprising 7A, 7B, and 7C, includes [Asp14, Arg17, Pro21, Pro27, Arg35]-hAmylin fused between an amino-terminal histidine rich tridecapeptide extension that can be removed by alp protease (SEQ ID NO: 7) and a carboxy-terminal Mxe GyrA intein variant being native Mxe GyrA, [106(GS)₄161]-Mxe GyrA, and [Cys3, 106(GS)₄161]-Mxe GyrA, respectively. Whereas proteins 7A and 7B were both mainly obtained in the soluble fraction, protein 7C was obtained as a mixture of soluble and insoluble protein (Table 4). For both protein 7A and protein 7B extensive hydrolysis was observed, whereas the engineered intein of protein 7C was associated with no hydrolysis. This resulted in an increased expression level by 7-13% of protein 7C compared to 7A and 7B.

In conclusion, the [Cys3, 106(GS)₄161]-Mxe GyrA intein (SEQ ID NO: 20) generally resulted in decreased hydrolysis and a concomitant increase in the expression level of fusion proteins compared to the wild-type Mxe GyrA (SEQ ID NO: 17) and minimized [106(GS)₄161]-Mxe GyrA (SEQ ID NO: 19) inteins.

Table 4. Hydrolysis levels, expression levels and solubility of fusion proteins of the native Mxe GyrA, [106(GS)₄161]-Mxe GyrA, and [Cys3, 106(GS)₄161]-Mxe GyrA inteins.

       Induced sample      Hydrolysis  Expression  Soluble   Insoluble  Soluble
No.    Protein   Intein    [%]a        level [%]b  protein   protein    protein
       (area)    (area)                            (area)c   (area)c    [%]d
1A     13381     12560     48          71          13290     537        96
1B     12120     9299      43          64          10070     459        96
1C     18726     0         0           100         10277     4344       70
2A     9067      13958     61          58          11110     441        96
2B     10199     9631      49          73          10721     1361       89
2C     13892     0         0           100         9211      1381       87
3A     15504     18586     55          73          29018     777        97
3B     15516     14075     48          73          18273     1477       93
3C     21352     0         0           100         15325     2242       87
4A     15028     0         0           120         5492      809        87
4B     13137     0         0           105         6787      1736       80
4C     12523     0         0           100         6540      924        88
5A     13353     8285      38          80          8322      972        90
5B     15544     10519     40          93          17907     1719       91
5C     16778     0         0           100         11792     1241       90
6A     14438     16435     53          85          23601     761        97
6B     11385     9574      46          67          15269     895        94
6C     16938     0         0           100         8951      3202       74
7A     18941     25732     58          87          38419     2092       95
7B     19832     13501     41          91          12629     5249       71
7C     21811     0         0           100         6073      8547       42

a Estimated as $\mathrm{area}_{\mathrm{intein}} / (\mathrm{area}_{\mathrm{protein}} + \mathrm{area}_{\mathrm{intein}})$.

b Expression levels of fusion proteins C were set to 100% and those of fusion proteins A and B were calculated relative to C.

c Sum of areas of intein and fusion protein.

d Estimated as $\mathrm{area}_{\text{soluble protein}} / (\mathrm{area}_{\text{soluble protein}} + \mathrm{area}_{\text{insoluble protein}})$.
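The three derived quantities in Table 4 follow directly from the densitometric band areas. As a minimal sketch (function names are illustrative, not from the source), the calculations can be written as below and checked against the areas listed for proteins 1A and 1C.

```python
def percent_hydrolysis(area_protein, area_intein):
    """Share of the fusion protein already cleaved, from induced-lysate band areas."""
    return 100.0 * area_intein / (area_protein + area_intein)


def relative_expression(area_protein, area_protein_c):
    """Fusion protein band area relative to the C variant of the same family (C = 100%)."""
    return 100.0 * area_protein / area_protein_c


def percent_soluble(area_soluble, area_insoluble):
    """Soluble share of total (intein + fusion protein) band area."""
    return 100.0 * area_soluble / (area_soluble + area_insoluble)


# Band areas taken from the Table 4 rows for proteins 1A and 1C.
hydrolysis_1a = percent_hydrolysis(13381, 12560)
expression_1a = relative_expression(13381, 18726)
soluble_1a = percent_soluble(13290, 537)

print(round(hydrolysis_1a), round(expression_1a), round(soluble_1a))
```

Rounded to whole percentages, these reproduce the values tabulated for protein 1A.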

Example 3. Peptide α-thioester formation using engineered inteins

To examine the effect of the engineered [Cys3, 106(GS)₄161]-Mxe GyrA intein on the yield of the target peptide, the amino-terminal tagged [Gly4]-hPYY(4-36) α-thioester was isolated from fusion proteins 1A, 1C, and 1D containing native Mxe GyrA (SEQ ID NO: 17), [Cys3, 106(GS)₄161]-Mxe GyrA (SEQ ID NO: 20), and [Cys3]-Mxe GyrA (SEQ ID NO: 18), respectively. The fusion proteins 1A, 1C and 1D were expressed in shake-flasks containing 200 mL LB medium as described in the general methods. SDS-PAGE analysis was performed as described in the general methods and showed that >90% of the fusion proteins were in the insoluble fraction. The harvested cells were lysed in 25 mM sodium phosphate pH 5 buffer by sonication and the insoluble fraction was washed twice in wash buffer [25 mM sodium phosphate, 1 mM EDTA, 1% Triton X-100, pH 7] followed by one wash in 100 mM sodium phosphate pH 7.5 buffer. The isolated inclusion bodies were re-suspended in solubilization buffer [100 mM sodium phosphate, 8 M urea, 1 mM TCEP, 1 mM EDTA, pH 7.5] (20 mL) and incubated for 2 h at 5 °C with weak agitation. The expression yield was estimated by RP-UPLC method A. The sample was filtered (0.45 μm) and refolded by rapid dilution into 9 volumes of cold refolding buffer [100 mM sodium phosphate, 1.3 M urea, 150 mM NaCl, 1 mM EDTA, 1 mM TCEP, pH 7.5], resulting in a concentration of 135 mM NaCl and 2 M urea. Immediately after dilution, the cleavage was initiated by adding MESNa to a final concentration of 100 mM. Reactions were allowed to proceed at 5 °C for 20 h at pH 7.3. The thiol-induced cleavages were monitored by RP-UPLC method B and LC-MS method A. The observed masses of fusion proteins and their corresponding inteins agreed with the expected masses (Table 3). The thiol-induced reaction mixtures were diluted with three volumes of buffer A [50 mM sodium phosphate, 1 mM EDTA, pH 7] to decrease the conductivity to approximately 12 mS/cm before loading onto a HiTrap SP Sepharose HP (16 x 25 mm, 5 mL, GE Healthcare) column at 5 mL/min using an AKTA explorer 100 system (GE Healthcare). The column was washed with 9 column volumes of buffer A and the amino-terminal tagged [Gly4]-hPYY(4-36) α-thioester was eluted using a linear gradient of 0-100% buffer A containing 1 M NaCl over 20 column volumes at 3 mL/min. Fractions containing the α-thioester were pooled and the amount was estimated by RP-UPLC method C. The identity of the product was confirmed by ESI-MS, resulting in a deconvoluted mass of 12315.35 Da (expected: 12315.09). The yields are listed in Table 5.
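The urea and NaCl concentrations quoted for the refolding step follow from a volume-weighted average of the two buffers. The sketch below assumes ideal, additive volumes and ignores the small volume of the MESNa addition; the helper name `mix` and the dictionary keys are illustrative.

```python
def mix(components):
    """Volume-weighted final concentrations.

    components: list of (volume, {solute: concentration}) tuples.
    Assumes volumes are additive; solutes absent from a component count as 0.
    """
    total = sum(volume for volume, _ in components)
    solutes = {name for _, conc in components for name in conc}
    return {
        name: sum(volume * conc.get(name, 0.0) for volume, conc in components) / total
        for name in solutes
    }


# 1 volume of solubilised protein (8 M urea) into 9 volumes of refolding buffer
# (1.3 M urea, 150 mM NaCl), as described in Example 3.
final = mix([
    (1.0, {"urea_M": 8.0}),
    (9.0, {"urea_M": 1.3, "NaCl_mM": 150.0}),
])
print(final["urea_M"], final["NaCl_mM"])
```

The result, roughly 1.97 M urea and 135 mM NaCl, matches the stated 2 M urea and 135 mM NaCl after rounding.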

Table 5. Purification of N-tagged [Gly4]-hPYY(4-36) α-thioester using different intein variants a.

aThe yields obtained from 200 mL shake flask cultures were normalized to yield per liter culture. Data are presented as means from two independent experiments.

b Amount of α-acid by-product not included.

c Estimated as area / (area + area) by RP-UPLC.

In conclusion, the yield of amino-terminal tagged [Gly4]-hPYY(4-36) α-thioester obtained from protein 1A containing the wild-type Mxe GyrA intein (SEQ ID NO: 17) was similar to that of protein 1D containing the [Cys3]-Mxe GyrA intein (SEQ ID NO: 18), which indicated that the T3C mutation itself did not increase the amount of target peptide (Table 5). However, when isolated from protein 1C containing the [Cys3, 106(GS)₄161]-Mxe GyrA intein (SEQ ID NO: 20), the yield was increased by approximately 80% compared to protein 1A, indicating that the engineered intein combining a minimized size and a T3C mutation positively affected the target peptide yield. Importantly, the amount of amino-terminal tagged [Gly4]-hPYY(4-36) α-acid by-product was reduced for proteins 1C and 1D compared to 1A, indicating that the T3C mutation also has a beneficial effect during purification.

The [Cys3, 106(GS)₄161]-Mxe GyrA intein (SEQ ID NO: 20) increases the expression level of the fusion protein and enhances the yield of the target peptide. As this was not observed for the [Cys3]-Mxe GyrA intein (SEQ ID NO: 18), the minimized size contributes to these enhanced properties.

Example 4. Generation of an α-amidated peptide using the [Cys3, 106(GS)₄161]-Mxe GyrA intein

The potential usage of fusion proteins comprising the engineered intein for large scale production of α-amidated peptides depends on its ability to express under high-cell density conditions. This was demonstrated for [Gly4]-hPYY(4-36)-NH2, which was isolated from protein 1C after fed-batch fermentation in a bioreactor.

4.1. Fed-batch Fermentation of Fusion Protein 1 C

A pre-culture of BL21 (DE3) cells containing plasmid p1C was grown in LB medium supplied with 100 μg/mL ampicillin in a shaker flask for 6-8 h. Fermentations were carried out under aerobic conditions in 500 mL bioreactors (DASGIP Technology) with an initial volume of 200 mL of a defined fermentation medium with glucose and ammonia as carbon and nitrogen sources, respectively, and with 100 μg/mL ampicillin added after sterilisation (autoclaving at 121 °C for 30 min). The pre-culture was inoculated into a bioreactor to reach an initial OD600 of about 0.2. The pH was maintained at 7.0 by addition of 5 N NH4OH, the temperature was maintained at 37 °C and air was sparged through the culture broth at a flow rate of 0.4 L/min throughout the fermentation period. The agitation speed (400-1200 rpm) was controlled to reach a dissolved oxygen level of at least 30% O2 saturation. The initial glucose concentration in the fermentation medium was 10 g/L and from 5 h of fermentation a glucose feed supplied with magnesium and trace metals was added continuously in increasing steps until a final feed rate of 10 g glucose/L/h, which was kept until the fermentation stopped. The production of the fusion protein was induced by the addition of 0.5 mM IPTG at an OD600 of 50-60. The fermentation was stopped after 4 h induction. The cells were harvested by centrifugation in a Sorvall RC 6 Plus centrifuge with a F10-6x500y Rotor (13000 x g) and stored at -20 °C. The cells were re-suspended in lysis buffer [25 mM sodium phosphate, pH 5] to OD600 of 40-50 and stirred 1 h at 4 °C. Cells were lysed on a constant cell disruption system E615 at a pressure of 1.36 kbar. The lysed cells were spun down and the supernatant decanted. The insoluble inclusion bodies were washed in wash buffer [25 mM sodium phosphate, pH 7, 5 mM EDTA, 1% Triton X-100] and stirred at 5 °C for 2 h, followed by one wash in MilliQ water. The inclusion bodies were divided equally into 20 smaller fractions.

4.2. Isolation of [Gly4]-hPYY(4-36)-amide ([Gly4]-hPYY(4-36)-NH2)

The strategy is outlined in Figure 2. Inclusion bodies from 1/20 of the fed-batch culture were re-suspended in denaturation buffer [25 mM sodium phosphate, 8 M urea, 1 mM TCEP, 1 mM EDTA, pH 7.5] (30 mL) and incubated at 5 °C for 3.5 h with gentle stirring. The yield of fusion protein was estimated by RP-UPLC method A and the sample was diluted to a concentration of approximately 1.2 mg/mL with denaturation buffer. The protein was refolded by adding it over 5 min into 8.5 volumes of dilution buffer [25 mM sodium phosphate, 1 mM TCEP, 1 mM EDTA, 0.5 M NaCl, pH 7.5] with stirring. Thiol-induced cleavage was initiated immediately by adding MESNa to 100 mM from a 2 M stock in dilution buffer. The reaction was allowed to proceed in the resulting 0.8 M urea and 0.5 M NaCl solution at 5 °C and pH 7.3 overnight with stirring.
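The stated final urea concentration of 0.8 M follows from the two dilution steps above. The sketch below checks that arithmetic. It treats the denatured sample as 1 volume, assumes volumes are additive, and uses illustrative function names; the 8 M urea, 8.5 volumes of dilution buffer, 2 M MESNa stock and 100 mM target are taken from the text.

```python
def diluted_conc(stock_conc, stock_vol, diluent_vol):
    """Concentration after mixing stock_vol of stock with diluent_vol of diluent."""
    return stock_conc * stock_vol / (stock_vol + diluent_vol)


def stock_volume_for_target(target_conc, stock_conc, start_vol):
    """Volume of stock to add to start_vol so the final mixture is at target_conc.

    Solves stock_conc * v = target_conc * (start_vol + v) for v.
    """
    return target_conc * start_vol / (stock_conc - target_conc)


sample_vol = 1.0   # denatured sample in 8 M urea, taken as 1 volume
buffer_vol = 8.5   # dilution buffer (no urea)

urea_after_refolding = diluted_conc(8.0, sample_vol, buffer_vol)
refold_vol = sample_vol + buffer_vol

# MESNa added from a 2 M stock to a final concentration of 0.1 M.
mesna_vol = stock_volume_for_target(0.1, 2.0, refold_vol)
urea_final = urea_after_refolding * refold_vol / (refold_vol + mesna_vol)

print(f"Urea after refolding dilution: {urea_after_refolding:.2f} M")
print(f"MESNa stock to add: {mesna_vol:.2f} volumes")
print(f"Urea during thiolysis: {urea_final:.2f} M")
```

Under these assumptions the MESNa addition accounts for the last step from roughly 0.84 M down to 0.8 M urea.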

Amidation was performed by addition of solid NH4HCO3 to a final concentration of 1 M. The pH of the solution was adjusted to 8.5 with 1 N NaOH and the reaction was allowed to proceed at 5 °C overnight. Thiol-induced cleavage and amidation reactions were monitored by LC-MS method A (Table 6).
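When monitoring the two reactions by LC-MS, the expected mass difference between the MESNa α-thioester and the α-amide can be estimated from elemental composition alone. The sketch below is illustrative: it assumes the thioester is detected as the free sulfonic acid (the peptide acyl group on HS-CH2CH2-SO3H), uses standard average atomic masses, and compares the result with the difference between the two expected masses reported in Table 6. The names used are not from the source.

```python
# Standard average atomic masses (Da), rounded.
AVG_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}


def formula_mass(formula):
    """Average mass of a molecule given as {element: count}."""
    return sum(AVG_MASS[element] * count for element, count in formula.items())


# Relative to the free peptide acid R-COOH:
#   thioester R-CO-S-CH2CH2-SO3H = acid + C2H6O3S2 - H2O
#   amide     R-CO-NH2           = acid + NH3      - H2O
# so thioester - amide = C2H6O3S2 - NH3 (assumption: free sulfonic acid form).
MESNA_FREE_ACID = {"C": 2, "H": 6, "O": 3, "S": 2}
AMMONIA = {"N": 1, "H": 3}

calculated_shift = formula_mass(MESNA_FREE_ACID) - formula_mass(AMMONIA)
table_shift = 12315.09 - 12189.92  # expected average masses from Table 6

print(f"Calculated thioester - amide shift: {calculated_shift:.2f} Da")
print(f"Shift between Table 6 expected masses: {table_shift:.2f} Da")
```

The two values agree to within a few hundredths of a dalton, which is the level of difference expected from rounding of the atomic masses.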

Table 6. Masses after thiolysis and amidation of protein 1C.

Average masses (Da)

Amino-terminal tagged [Gly4]-hPYY(4-36)    Expected    Observed
α-thioester                                12315.09    12315.33
α-amide                                    12189.92    12189.93

The amidation mixture was filtered (0.22 μm) and the filter washed with 8 M urea. The urea wash was loaded on a Phenomenex Luna column (15 μm and 300 Å; 10 x 250 mm) followed by the amidation mixture using an ÄKTA Explorer 100 (GE Healthcare).

Amino-terminal tagged [Gly4]-hPYY(4-36)-NH2 was eluted using 0.1% TFA in water (solvent A) and 0.1% TFA in acetonitrile (solvent B) at a linear gradient of 25-45% B. After lyophilization, the protein was dissolved in 25 mM sodium phosphate buffer, pH 7.5, and the amino-terminal tag was removed by enzymatic cleavage using HRV14-3C protease, which specifically recognizes the sequence EVLFQ/GP. An enzyme to substrate ratio of 1:12 (w/w) was used and incubation was allowed to proceed at room temperature for at least 48 h. After adjusting the pH of the digest to 4.3, the [Gly4]-hPYY(4-36)-NH2 was recovered by loading at 3 mL/min onto a pre-packed HiTrap SP Sepharose HP column (5 mL) equilibrated with eluent A [10 mM NH4HCO3, pH 5.5], followed by a gradient of 0-100% eluent B [10 mM NH4HCO3, pH 8.5] over 5 column volumes (CV), and [Gly4]-hPYY(4-36)-NH2 was eluted using isocratic eluent B for 10 CV. Purity and identity of [Gly4]-hPYY(4-36)-NH2 were assessed by RP-UPLC method C and LC-MS method B, whereas yield was estimated by chemiluminescent nitrogen detection (Table 7).
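The quantities in the tag-removal and ion-exchange steps above translate into working amounts as follows. This is a minimal sketch: the 6.0 mg of tagged peptide is a hypothetical input, and the function names are illustrative. The 1:12 (w/w) ratio, the 5 mL column, and the 5 CV and 10 CV steps are from the text.

```python
def protease_mass_mg(substrate_mg, enzyme_parts=1.0, substrate_parts=12.0):
    """Mass of protease for a given enzyme:substrate ratio (w/w)."""
    return substrate_mg * enzyme_parts / substrate_parts


def cv_to_ml(column_volumes, column_volume_ml=5.0):
    """Convert a step length in column volumes (CV) to millilitres."""
    return column_volumes * column_volume_ml


# Hypothetical amount of tagged peptide after lyophilization.
substrate_mg = 6.0
print(f"Protease needed at 1:12 (w/w): {protease_mass_mg(substrate_mg):.2f} mg")

# Step lengths on the 5 mL HiTrap column.
print(f"Gradient 0-100% B over 5 CV: {cv_to_ml(5):.0f} mL")
print(f"Isocratic eluent B for 10 CV: {cv_to_ml(10):.0f} mL")
```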

The [Gly4]-hPYY(4-36)-NH2 was isolated with a purity of 99% and a yield of 184 mg/L (Table 7).

In conclusion, fusion protein 1C containing the [Cys3, 106(GS)4 161]-Mxe GyrA (SEQ ID NO: 20) was successfully expressed under high-cell-density conditions and converted into [Gly4]-hPYY(4-36)-NH2, demonstrating the applicability of the engineered intein for production of α-amidated peptides at large scale.

Table 7. Production of α-amidated [Gly4]-hPYY(4-36)-NH2 from fed-batch culture of fusion protein

Protein yield [g/L]                               4.4
[Gly4]-hPYY(4-36)-NH2 yield [mg/L]                184
Recovery [mol %]                                  31
Purity [%]                                        99
Monoisotopic mass of [M+5H]5+ ion, expected       773.59
Monoisotopic mass of [M+5H]5+ ion, observed       773.60

The yields obtained from 1/20 of the 200 mL fed-batch cultures were normalized to yield per liter culture. Data are presented as means from two independent cultivations and subsequent purifications.
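Two pieces of arithmetic sit behind Table 7: converting the [M+5H]5+ value to a neutral monoisotopic mass, and normalizing an amount obtained from 1/20 of a 200 mL culture to a per-liter yield. The sketch below illustrates both. The function names are illustrative, and the 1.84 mg input is back-calculated from the reported 184 mg/L purely as an example; it is not a value reported in the text.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton


def neutral_mass(mz, charge):
    """Neutral monoisotopic mass from the m/z of an [M+zH]z+ ion."""
    return charge * (mz - PROTON_MASS)


def ppm_error(observed, expected):
    """Relative deviation of observed from expected, in parts per million."""
    return (observed - expected) / expected * 1e6


def yield_per_liter(amount_mg, culture_volume_ml, fraction_processed):
    """Normalize an isolated amount to mg per liter of culture."""
    processed_liters = culture_volume_ml * fraction_processed / 1000.0
    return amount_mg / processed_liters


# Values for the [M+5H]5+ ion from Table 7.
expected_mz, observed_mz = 773.59, 773.60
print(f"Neutral mass (expected): {neutral_mass(expected_mz, 5):.2f} Da")
print(f"Mass error: {ppm_error(observed_mz, expected_mz):.1f} ppm")

# Illustrative only: 1.84 mg from 1/20 of a 200 mL culture.
print(f"Yield: {yield_per_liter(1.84, 200.0, 1 / 20):.0f} mg/L")
```

Processing 1/20 of a 200 mL culture corresponds to 10 mL of culture, so amounts are multiplied by 100 to obtain per-liter values.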

While certain features of the invention have been illustrated and described herein, many modifications, substitutions, changes, and equivalents will now occur to those of ordinary skill in the art. It is, therefore, to be understood that the appended claims are intended to cover all such modifications and changes as fall within the true spirit of the invention.