Title:
NOVEL MICROORGANISM AND ITS USE IN LIGNOCELLULOSE DETOXIFICATION
Document Type and Number:
WIPO Patent Application WO/2011/026906
Kind Code:
A2
Abstract:
The present invention relates to an isolated microorganism which is Cupriavidus basilensis strain HMF14 Deposit number DSM 22875, and its use in a process for the in-situ detoxification of lignocelluloses hydrolysate.

Inventors:
KOOPMAN FRANK WOUTER (NL)
RUIJSSENAARS HARALD JOHAN (NL)
WIERCKX NICK JOHANNES PETRUS (NL)
DE WINDE JOHANNES HENDRIK (NL)
Application Number:
PCT/EP2010/062885
Publication Date:
March 10, 2011
Filing Date:
September 02, 2010
Assignee:
SHELL INT RESEARCH (NL)
KOOPMAN FRANK WOUTER (NL)
RUIJSSENAARS HARALD JOHAN (NL)
WIERCKX NICK JOHANNES PETRUS (NL)
DE WINDE JOHANNES HENDRIK (NL)
International Classes:
C12N1/20; A62D3/02; C07K14/195; C12N9/00; C12N9/02; C12N9/16; C12N9/88; C12N15/31; C12N15/52; C12P19/32; C12R1/01
Domestic Patent References:
WO2009030713A12009-03-12
WO2009017441A12009-02-05
Foreign References:
US7067303B12006-06-27
Other References:
BIORESOURCE TECHNOLOGY, vol. 93, 2004, pages 1 - 10
MICROBIOL. BIOTECHNOL, 31 January 2009 (2009-01-31)
YU ET AL.: "Microbial utilization and biopolyester synthesis of bagasse hydrolysates", BIORESOURCE TECHNOLOGY, vol. 99, 2008, pages 8042 - 8048, XP023182703, DOI: 10.1016/j.biortech.2008.03.071
HAWKSWORTH ET AL.: "Ainsworth and Bisby's Dictionary of The Fungi, 8th edition,", 1995, CAB INTERNATIONAL, UNIVERSITY PRESS
NEEDLEMAN; WUNSCH, J. MOL. BIOL., 1970, pages 444 - 453
E. MEYERS; W. MILLER, CABIOS, vol. 4, 1989, pages 11 - 17
ALTSCHUL ET AL., J. MOL. BIOL., vol. 215, 1990, pages 403 - 10
ALTSCHUL ET AL., NUCLEIC ACIDS RES., vol. 25, no. 17, 1997, pages 3389 - 3402
SAMBROOK ET AL.: "Molecular Cloning, A Laboratory Manual", 1989, COLD SPRING HARBOR PRESS
AUSUBEL ET AL.: "Current Protocols in Molecular Biology", 1995, JOHN WILEY & SONS
GOEDDEL: "Gene Expression Technology: Methods in Enzymology", vol. 185, 1990, ACADEMIC PRESS
KOZAK, J. BIOL. CHEM., vol. 266, 1991, pages 19867 - 19870
ALTSCHUL, S.F. ET AL.: "Basic local alignment search tool", J. MOL. BIOL., vol. 215, 1990, pages 403 - 410, XP002949123, DOI: 10.1006/jmbi.1990.9999
Claims:
CLAIMS

1. An isolated microorganism which is Cupriavidus basilensis strain HMF14 Deposit number DSM 22875.

2. A bacterial culture comprising a Cupriavidus microorganism of Cupriavidus basilensis HMF14 strain Deposit number DSM 22875 according to claim 1.

3. An isolated microorganism or culture according to claim 1 or claim 2, which, when provided with HMF, Furfuryl alcohol, Furfural and/or Furoic acid as a carbon source, grows on said source.

4. An isolated microorganism or culture according to any one of claims 1 to 3, which expresses the following enzymes: a Furoyl-CoA dehydrogenase and/or a Furoyl-CoA synthetase and/or a 2-oxoglutaroyl-CoA hydrolase and/or a 2,5-furan-dicarboxylic acid decarboxylase 1 and/or a 2,5-furan-dicarboxylic acid decarboxylase 2 and/or a HMF/furfural oxidoreductase.

5. A polypeptide having aldehyde dehydrogenase activity which comprises the amino acid sequence set out in SEQ ID NO: 15 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 16 or a variant polypeptide thereof, wherein the variant has at least 64% sequence identity or more with the sequence set out in SEQ ID NO: 16.

6. A polynucleotide which comprises:

(a) the nucleotide sequence set out in SEQ ID NO: 16;

(b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 16;

(c) a nucleotide sequence having at least 64% sequence identity or more with the nucleotide sequence of SEQ ID NO: 16;

(d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length;

(e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d);

(f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).

7. A polypeptide having LysR family transcriptional regulator activity which comprises the amino acid sequence set out in SEQ ID NO: 17 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 18 or a variant polypeptide thereof, wherein the variant has at least 47% sequence identity or more with the sequence set out in SEQ ID NO: 17.

8. A polynucleotide which comprises:

(a) the nucleotide sequence set out in SEQ ID NO: 18;

(b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 18;

(c) a nucleotide sequence having at least 72% sequence identity or more with the nucleotide sequence of SEQ ID NO: 18;

(d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length;

(e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d);

(f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).

9. A polypeptide having 2,5-furan-dicarboxylic acid decarboxylase 1 activity which comprises the amino acid sequence set out in SEQ ID NO: 19 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 20 or a variant polypeptide thereof, wherein the variant has at least 54% sequence identity or more with the sequence set out in SEQ ID NO: 19.

10. A polynucleotide which comprises:

(a) the nucleotide sequence set out in SEQ ID NO: 20;

(b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 20;

(c) a nucleotide sequence having at least 66% sequence identity or more with the nucleotide sequence of SEQ ID NO: 20;

(d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length;

(e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d);

(f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).

11. A polypeptide having 2,5-furan-dicarboxylic acid decarboxylase 2 activity which comprises the amino acid sequence set out in SEQ ID NO: 21 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 22 or a variant polypeptide thereof, wherein the variant has at least 52% sequence identity or more with the sequence set out in SEQ ID NO: 21.

12. A polynucleotide which comprises:

(a) the nucleotide sequence set out in SEQ ID NO: 22;

(b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 22;

(c) a nucleotide sequence having at least 67% sequence identity or more with the nucleotide sequence of SEQ ID NO: 22;

(d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length;

(e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d);

(f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).

13. A polypeptide having HMF/furfural oxidoreductase activity which comprises the amino acid sequence set out in SEQ ID NO: 25 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 26 or a variant polypeptide thereof, wherein the variant has at least 45% sequence identity or more with the sequence set out in SEQ ID NO: 25.

14. A polynucleotide which comprises:

(a) the nucleotide sequence set out in SEQ ID NO: 26;

(b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 26;

(c) a nucleotide sequence having at least 66% sequence identity or more with the nucleotide sequence of SEQ ID NO: 26;

(d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length;

(e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d);

(f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).

15. A polypeptide having LysR type transcriptional regulator activity which comprises the amino acid sequence set out in SEQ ID NO: 33 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 34 or a variant polypeptide thereof, wherein the variant has at least 46% sequence identity or more with the sequence set out in SEQ ID NO: 33.

16. A polynucleotide which comprises:

(a) the nucleotide sequence set out in SEQ ID NO: 34;

(b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 34;

(c) a nucleotide sequence having at least 65% sequence identity or more with the nucleotide sequence of SEQ ID NO: 34;

(d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length;

(e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d);

(f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).

17. A polypeptide having Furoyl-CoA dehydrogenase (large subunit) activity which comprises the amino acid sequence set out in SEQ ID NO: 35 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 36 or a variant polypeptide thereof, wherein the variant has at least 54% sequence identity or more with the sequence set out in SEQ ID NO: 35.

18. A polynucleotide which comprises:

(a) the nucleotide sequence set out in SEQ ID NO: 36;

(b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 36;

(c) a nucleotide sequence having at least 66% sequence identity or more with the nucleotide sequence of SEQ ID NO: 36;

(d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length;

(e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d);

(f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).

19. A polypeptide having Furoyl-CoA dehydrogenase (FAD binding subunit) activity which comprises the amino acid sequence set out in SEQ ID NO: 37 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 38 or a variant polypeptide thereof, wherein the variant has at least 49% sequence identity or more with the sequence set out in SEQ ID NO: 37.

20. A polynucleotide which comprises:

(a) the nucleotide sequence set out in SEQ ID NO: 38;

(b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 38;

(c) a nucleotide sequence having at least 71% sequence identity or more with the nucleotide sequence of SEQ ID NO: 38;

(d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length;

(e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d);

(f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).

21. A polypeptide having Furoyl-CoA dehydrogenase 2Fe-2S iron sulfur subunit activity which comprises the amino acid sequence set out in SEQ ID NO: 39 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 40 or a variant polypeptide thereof, wherein the variant has at least 64% sequence identity or more with the sequence set out in SEQ ID NO: 39.

22. A polynucleotide which comprises:

(a) the nucleotide sequence set out in SEQ ID NO: 40;

(b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 40;

(c) a nucleotide sequence having at least 70% sequence identity or more with the nucleotide sequence of SEQ ID NO: 40;

(d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length;

(e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d);

(f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).

23. A polypeptide having Furoyl-CoA synthetase activity which comprises the amino acid sequence set out in SEQ ID NO: 41 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 42 or a variant polypeptide thereof, wherein the variant has at least 57% sequence identity or more with the sequence set out in SEQ ID NO: 41.

24. A polynucleotide which comprises:

(a) the nucleotide sequence set out in SEQ ID NO: 42;

(b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 42;

(c) a nucleotide sequence having at least 68% sequence identity or more with the nucleotide sequence of SEQ ID NO: 42;

(d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length;

(e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d);

(f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).

25. A polypeptide having 2-oxoglutaroyl-CoA hydrolase activity which comprises the amino acid sequence set out in SEQ ID NO: 43 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 44 or a variant polypeptide thereof, wherein the variant has at least 72% sequence identity or more with the sequence set out in SEQ ID NO: 43.

26. A polynucleotide which comprises:

(a) the nucleotide sequence set out in SEQ ID NO: 44;

(b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 44;

(c) a nucleotide sequence having at least 74% sequence identity or more with the nucleotide sequence of SEQ ID NO: 44;

(d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length;

(e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d);

(f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).
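
As a reading aid, the SEQ ID pairs and minimum identity percentages recited in claims 5 to 26 can be collected into one small table. The sketch below is only a summary of the numbers stated in those claims; the class name, field names and the helper function are illustrative assumptions and do not appear in the application.

```python
from typing import List, NamedTuple


class ClaimedPair(NamedTuple):
    """One polypeptide/polynucleotide pair as recited in claims 5 to 26."""
    activity: str
    polypeptide_seq_id: int
    polynucleotide_seq_id: int
    min_polypeptide_identity: int     # percent, from the polypeptide claim
    min_polynucleotide_identity: int  # percent, from item (c) of the polynucleotide claim


# Values transcribed from the claims; names are hypothetical.
CLAIMED_PAIRS: List[ClaimedPair] = [
    ClaimedPair("aldehyde dehydrogenase", 15, 16, 64, 64),
    ClaimedPair("LysR family transcriptional regulator", 17, 18, 47, 72),
    ClaimedPair("2,5-furan-dicarboxylic acid decarboxylase 1", 19, 20, 54, 66),
    ClaimedPair("2,5-furan-dicarboxylic acid decarboxylase 2", 21, 22, 52, 67),
    ClaimedPair("HMF/furfural oxidoreductase", 25, 26, 45, 66),
    ClaimedPair("LysR type transcriptional regulator", 33, 34, 46, 65),
    ClaimedPair("Furoyl-CoA dehydrogenase (large subunit)", 35, 36, 54, 66),
    ClaimedPair("Furoyl-CoA dehydrogenase (FAD binding subunit)", 37, 38, 49, 71),
    ClaimedPair("Furoyl-CoA dehydrogenase (2Fe-2S iron sulfur subunit)", 39, 40, 64, 70),
    ClaimedPair("Furoyl-CoA synthetase", 41, 42, 57, 68),
    ClaimedPair("2-oxoglutaroyl-CoA hydrolase", 43, 44, 72, 74),
]


def lowest_polypeptide_threshold(pairs: List[ClaimedPair]) -> ClaimedPair:
    """Return the pair with the lowest claimed polypeptide identity threshold."""
    return min(pairs, key=lambda p: p.min_polypeptide_identity)
```

In this summary the lowest polypeptide threshold belongs to the HMF/furfural oxidoreductase pair, which matches the "at least 45% identical" wording of claim 43.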

27. A polynucleotide which encodes a polypeptide according to any one of claims 5, 7, 9, 11, 13, 15, 17, 19, 21, 23 and 25.

28. A nucleic acid construct comprising the polynucleotide according to any one of claims 6, 8, 10, 12, 14, 16, 18, 20, 22, 24 and 26.

29. A vector incorporating a polynucleotide sequence according to any one of claims 6, 8, 10, 12, 14, 16, 18, 20, 22, 24 and 26 or a nucleic acid construct according to claim 27 or 28.

30. A cell comprising a polypeptide according to any one of claims 5, 7, 9, 11, 13, 15, 17, 19, 21, 23 and 25, a polynucleotide according to any one of claims 6, 8, 10, 12, 14, 16, 18, 20, 22, 24 and 26, a nucleic acid construct according to claim 28 or a vector according to claim 29.

31. A host microorganism cell transformed or transfected by the polynucleotide sequence according to any one of claims 6, 8, 10, 12, 14, 16, 18, 20, 22, 24 and 26 or a nucleic acid construct according to claim 27 or 28 or by a vector according to claim 29 under conditions conducive to express one or more of a Furoyl-CoA dehydrogenase, a Furoyl-CoA synthetase, a 2-oxoglutaroyl-CoA hydrolase, a 2,5-furan-dicarboxylic acid decarboxylase 1, a 2,5-furan-dicarboxylic acid decarboxylase 2 and a HMF/furfural oxidoreductase.

32. A cell according to claim 31, wherein the cell is a prokaryote cell, an eukaryote cell, a plant cell or an animal cell.

33. A cell according to claim 32, wherein the cell is a bacterium chosen from the group of Escherichia, Anabaena, Caulobacter, Gluconobacter, Rhodobacter, Pseudomonas, Paracoccus, Bacillus, Brevibacterium, Corynebacterium, Rhizobium (Sinorhizobium), Flavobacterium, Klebsiella, Enterobacter, Lactobacillus, Lactococcus, Methylobacterium, Staphylococcus or Streptomyces genus; or B. subtilis, B. amyloliquefaciens, B. licheniformis, B. puntis, B. megaterium, B. halodurans, B. pumilus, G. oxydans, Caulobacter crescentus CB 15, Methylobacterium extorquens, Rhodobacter sphaeroides, Pseudomonas zeaxanthinifaciens, Pseudomonas putida, Pseudomonas putida S12, Paracoccus denitrificans, E. coli, C. glutamicum, Staphylococcus carnosus, Streptomyces lividans, Sinorhizobium meliloti and Rhizobium radiobacter species.

34. A cell according to claim 32, wherein the cell is a yeast cell from the group consisting of the Candida, Hansenula, Kluyveromyces, Pichia, Saccharomyces, Schizosaccharomyces, or Yarrowia genus; or Kluyveromyces lactis, S. cerevisiae, Hansenula polymorpha, Yarrowia lipolytica and Pichia pastoris species or a filamentous fungal cell from the group Aspergillus, Chrysosporium, Penicillium, Talaromyces or Trichoderma genus; or Aspergillus niger, Aspergillus awamori, Aspergillus foetidus, Aspergillus sojae, Aspergillus fumigatus, Talaromyces emersonii, Aspergillus oryzae, Chrysosporium lucknowense, Trichoderma reesei or Penicillium chrysogenum species.

35. A cell according to any one of claims 33 to 34, wherein one or more gene is deleted, knocked-out or disrupted in full or in part, wherein optionally the gene encodes for a protease.

36. A cell extract from a Cupriavidus microorganism according to claim 1 or claim 2 or a host cell according to claims 32 to 34, comprising a Furoyl-CoA dehydrogenase and/or a Furoyl-CoA synthetase and/or a 2-oxoglutaroyl-CoA hydrolase and/or a 2,5-furan-dicarboxylic acid decarboxylase 1 and/or a 2,5-furan-dicarboxylic acid decarboxylase 2 and/or a HMF/furfural oxidoreductase.

37. A process for the in-situ detoxification of a lignocellulose hydrolysate containing furanic compounds, such as preferably one or more of HMF, Furfuryl alcohol, Furfural and/or Furoic acid, with a suitable host microorganism, the process comprising a host microorganism from the family of Burkholderiaceae and contacting the lignocellulose hydrolysate with the host microorganism under conditions facilitating the expression of one or more of a Furoyl-CoA dehydrogenase, a Furoyl-CoA synthetase, a 2-oxoglutaroyl-CoA hydrolase, a 2,5-furan-dicarboxylic acid decarboxylase 1, a 2,5-furan-dicarboxylic acid decarboxylase 2 and a HMF/furfural oxidoreductase from the host microorganism from the family of Burkholderiaceae to convert furanic compounds, such as HMF, Furfuryl alcohol, Furfural and/or Furoic acid, to non-toxic components to obtain a detoxified lignocellulose hydrolysate.

38. A process according to claim 37, wherein the Cupriavidus microorganism is Cupriavidus basilensis, preferably Cupriavidus basilensis HMF14 DSM 22875 according to claim 1.

39. A process according to claim 37, wherein the host microorganism is a microorganism according to any one of claims 32 to 34.

40. A process according to any one of claims 37 to 39, further comprising subjecting the detoxified lignocellulose hydrolysate to a simultaneous or subsequent fermentation step.

41. A process according to any one of claims 37 to 40, further comprising a step of pre-treating lignocellulose-containing material to obtain a lignocellulose hydrolysate, preferably under acidic conditions.

42. A process according to any one of claims 37 to 41, comprising introducing (a) a polynucleotide according to claim 27 in an appropriate host cell, cultivating the obtained host cell under conditions conducive to the detoxification of lignocellulose hydrolysate, and recovering a detoxified lignocellulose hydrolysate from the culture.

43. A process for the production of Furoyl-CoA dehydrogenase and/or Furoyl-CoA synthetase and/or 2-oxoglutaroyl-CoA hydrolase and/or 2,5-furan-dicarboxylic acid decarboxylase 1 and/or 2,5-furan-dicarboxylic acid decarboxylase 2 and/or a HMF/furfural oxidoreductase which are at least 45% identical to those expressed by Cupriavidus basilensis HMF14 DSM 22875, or a host microorganism according to claim 4, comprising

(a) culturing a microorganism in a nutrient medium containing carbon and nitrogen sources and inorganic salts; and

(b) isolating the enzymes produced from the microorganism.

44. A process for the conversion of 5-hydroxymethylfurfural (HMF), 2,5-dihydroxymethyl furan (HMF alcohol), 5-hydroxymethyl-2-furancarboxylic acid (HMF acid) and/or 2,5-furandicarboxylic acid to 2-furoyl CoA, comprising contacting furfuryl alcohol and/or furfural with a furoyl-CoA dehydrogenase, a furoyl-CoA synthetase, a 2-oxoglutaroyl-CoA hydrolase, and a 2,5-furan-dicarboxylic acid decarboxylase catalyst in the presence of one or more coenzyme cofactor(s).

45. A process according to claim 44, wherein the catalyst comprises a polypeptide according to claims 9, 11, 17, 19 and/or 21.

46. A method for the biocatalytic production of biofuels such as ethanol in a suitable host microorganism of the enzymatic pathway of Cupriavidus basilensis strain DSM 22875, under conditions facilitating the expression of the activity of the enzymes in the enzymatic pathway.

47. A process for the production of second-generation biofuels from a lignocellulose-containing material, which process comprises

a) pretreatment and hydrolysis of the lignocellulose-containing material to prepare a lignocellulosic hydrolysate comprising one or more furanic compounds and one or more fermentable sugars;

b) in-situ detoxification of the lignocellulose hydrolysate comprising contacting the lignocellulose hydrolysate with a host microorganism under conditions facilitating the expression of one or more of a Furoyl-CoA dehydrogenase, a Furoyl-CoA synthetase, a 2-oxoglutaroyl-CoA hydrolase, a 2,5-furan-dicarboxylic acid decarboxylase 1, a 2,5-furan-dicarboxylic acid decarboxylase 2 and/or a HMF/furfural oxidoreductase from the host microorganism to convert the one or more furanic compounds to non-toxic components to obtain a detoxified lignocellulose hydrolysate;

c) conversion of the fermentable sugars from the lignocellulose hydrolysate or the detoxified lignocellulose hydrolysate to produce a biofuel.

48. The process according to claim 47 wherein the one or more furanic compounds are selected from the group consisting of furfural, furfuryl alcohol, 5-(hydroxymethyl)-furfural (HMF) and furoic acid.

49. The process according to claim 47 or 48 wherein the host microorganism is a host microorganism from the family of Burkholderiaceae.

50. The process according to any one of claims 47 to 49 wherein the conversion of fermentable sugars in step c) comprises fermentation of the fermentable sugars.

51. The process according to any one of claims 47 to 50, wherein in step b) a stable fermentable sugar concentration is maintained before all furanic compounds are converted.

52. The process according to any one of claims 47 to 51 further comprising the step of catalytic cracking and/or hydrotreating of the biofuel to prepare a cracked and/or hydrotreated biofuel.

53. The process according to any one of claims 47 to 52 further comprising the step of blending the, optionally cracked or hydrotreated, biofuel with one or more other components to produce an automotive fuel.

Description:
NOVEL MICROORGANISM AND ITS USE IN LIGNOCELLULOSE DETOXIFICATION

Field of the invention

The present invention relates to the isolation of a novel Cupriavidus species, and to novel polynucleotide sequences encoding novel polypeptides derived from this organism. The enzymes described herein are useful in the detoxification of lignocellulose hydrolysates in recombinant organisms.

Background of the invention

The use of lignocellulose-containing material for the production of second-generation biofuels and biochemicals is well known in the art. Commonly, the process comprises the steps of pretreatment, hydrolysis, and fermentation.

A significant drawback of the pretreatment process step, in particular if it involves acidic conditions, is the release of degradation products from the lignocellulose-containing material that may evolve into fermentation inhibitors during the hydrolysis process step. Specifically, furanic compounds such as furfural, furfuryl alcohol, 5-(hydroxymethyl)-furfural (HMF) and/or furoic acid pose problems that are difficult to mitigate, since they inhibit the growth of, e.g., ethanologenic microorganisms employed in the fermentation step, thereby impairing the performance of these organisms and reducing yields.

Approaches to remove the degradation products at least in part by chemical, physical or biological methods have been discussed in detail in Bioresource Technology 93 (2004), 1-10, and Microbiol. Biotechnol, DOI 10.1007/s00253-009-1875-1, published 31st January 2009. Yu et al., in their article titled "Microbial utilization and biopolyester synthesis of bagasse hydrolysates", Bioresource Technology, vol 99 (2008), pages 8042-8048, investigated the utilization of hydrolysates by an aerobic bacterium, Ralstonia eutropha, to determine if organic inhibitors can be removed for potential recycling and reuse of the process water. In the article it is noted that R. eutropha has a high metabolic activity on furfural. R. eutropha, however, also showed high metabolic activity on glucose and fructose.

WO2009/030713 discloses the washing of pre-treated lignocellulose-containing material, further referred to as lignocellulose hydrolysate. However, the inhibitors have a comparatively low solubility in water, while recycling of the wash water is only possible to a very limited extent due to build-up of the inhibitors, thus requiring large amounts of water. This makes the disclosed process cumbersome and difficult to apply on a commercial scale.

WO2009/017441 discloses a process for biological detoxification of lignocellulosic hydrolysate with genetically modified yeasts, while US 7,067,303 discloses the use of a fungus for the same purpose.

While such microorganisms can metabolize the fermentation inhibitors or convert them into less toxic compounds, their preferred carbon sources are fermentable sugars. They therefore reduce the fermentable sugar content of the lignocellulosic hydrolysate, and thus the overall yield of the desired fermentation products.

Consequently, there is a need for providing a process for detoxifying pre-treated lignocellulose-containing material to obtain substrates suitable for hydrolysis, fermentation or thermochemical conversion, while preserving the fermentable sugar content for subsequent or simultaneous process steps.

SUMMARY OF THE INVENTION

The present invention relates to an isolated microorganism of the family Burkholderiaceae, genus Cupriavidus, and the species and strain designation basilensis HMF14. This strain was deposited on 19 August 2009 at the Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (German Collection of Microorganisms and Cell Cultures), Inhoffenstraße 7B, 38124 Braunschweig, GERMANY as strain HMF14, Deposit number DSM 22875.

The invention further relates to a bacterial culture comprising the Cupriavidus microorganism of Cupriavidus basilensis HMF14 strain Deposit number DSM 22875. This strain preferably grows when provided with HMF, Furfuryl alcohol, Furfural and/or Furoic acid as a carbon source. The invention further relates to an isolated microorganism or culture according to the invention, which expresses the following enzymes: a Furoyl-CoA dehydrogenase (composed of three polypeptide subunits), a Furoyl-CoA synthetase, a 2-oxoglutaroyl-CoA hydrolase, a 2,5-furan-dicarboxylic acid decarboxylase 1, a 2,5-furan-dicarboxylic acid decarboxylase 2 and a HMF/furfural oxidoreductase. These enzymes form a novel pathway to selectively metabolize furanic compounds.

The present invention further provides polynucleotides encoding polypeptides having the following activities: a 2-furoyl-CoA:acceptor 5-oxidoreductase (hydroxylating), EC 1.3.99.8 (further referred to herein as Furoyl-CoA dehydrogenase); a Furoyl-CoA synthetase, EC 6.2.1.31; a 2-oxoglutaroyl-CoA hydrolase (thioester hydrolase), EC 3.1.2; a first 2,5-furan-dicarboxylic acid decarboxylase (Carboxy-lyase), EC 4.1.1; a second 2,5-furan-dicarboxylic acid decarboxylase (Carboxy-lyase), EC 4.1.1; and a HMF/furfural oxidoreductase, EC 1.1.3 and 1.2.3.

A polynucleotide of the invention typically encodes a polypeptide having Furoyl-CoA dehydrogenase activity (large subunit, hmfA). Another polynucleotide of the invention typically encodes a polypeptide having Furoyl-CoA dehydrogenase activity (FAD binding subunit, hmfB). A further polynucleotide of the invention typically encodes a Furoyl-CoA dehydrogenase (2Fe-2S iron sulfur subunit, hmfC). Yet another polynucleotide of the invention typically encodes a polypeptide having Furoyl-CoA synthetase activity. A further polynucleotide of the invention typically encodes a polypeptide having 2-oxoglutaroyl-CoA hydrolase activity. A further polynucleotide of the invention typically encodes a polypeptide having 2,5-furan-dicarboxylic acid decarboxylase activity, while a further polynucleotide of the invention typically encodes a polypeptide having a second 2,5-furan-dicarboxylic acid decarboxylase activity.

This invention further provides a polypeptide having aldehyde dehydrogenase activity which comprises the amino acid sequence set out in SEQ ID NO: 15 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 16 or a variant polypeptide thereof, wherein the variant has at least 64% sequence identity or more with the sequence set out in SEQ ID NO: 16.
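
To make the "at least 64% sequence identity" criterion concrete, the following minimal sketch computes a percent identity from two sequences that have already been aligned. It rests on assumptions that this passage does not state: that identity is counted as identical positions divided by the number of aligned columns (columns with a gap in one sequence count as mismatches), and that the alignment itself comes from a separate tool. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns of two pre-aligned sequences.

    Assumption: '-' marks a gap; a column with a gap in exactly one
    sequence counts as a mismatch, and gap-gap columns are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, minimum_percent: float) -> bool:
    """True if the aligned pair reaches the stated minimum identity."""
    return percent_identity(aligned_a, aligned_b) >= minimum_percent


# Toy example with made-up residues: 4 identical columns out of 6.
print(meets_threshold("MKT-AL", "MKTGAV", 64.0))
```

Other conventions, for example dividing by the length of the shorter sequence, give different percentages for the same alignment, so the convention used should be fixed before comparing a value against a threshold.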

This invention further provides a polynucleotide which comprises:

(a) the nucleotide sequence set out in SEQ ID NO: 16;

(b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 16;

(c) a nucleotide sequence having at least 64% sequence identity or more with the nucleotide sequence of SEQ ID NO: 16;

(d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length;

(e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d);

(f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).
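
Items (d) and (f) of the list above refer to fragments and reverse complements. The sketch below illustrates both operations on plain DNA strings. It is an illustration under assumptions: only the unambiguous bases A, C, G and T are handled, "at least about 100 nucleotides" is simplified to a hard cutoff of 100, and a fragment is checked only as a contiguous substring of a single parent sequence. All function names and sequences are hypothetical.

```python
# Translation table for complementing unambiguous DNA bases (case preserved).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return sequence.translate(_COMPLEMENT)[::-1]


def is_candidate_fragment(fragment: str, parent: str, min_length: int = 100) -> bool:
    """True if `fragment` is a contiguous piece of `parent` of sufficient length.

    Assumption: the "about 100 nucleotides" wording is treated as a hard
    minimum of `min_length`; the text itself leaves the margin open.
    """
    return len(fragment) >= min_length and fragment.upper() in parent.upper()


# Made-up 200-nucleotide parent sequence for demonstration only.
parent = "ACGT" * 50
print(reverse_complement("ATGC"))
print(is_candidate_fragment(parent[10:120], parent))
```

Applying `reverse_complement` twice returns the original sequence, which is a quick sanity check when preparing strands for a hybridization comparison.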

This invention further provides a polypeptide having LysR family transcriptional regulator activity which comprises the amino acid sequence set out in SEQ ID NO: 17 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 18 or a variant polypeptide thereof, wherein the variant has at least 47% sequence identity or more with the sequence set out in SEQ ID NO: 17.

This invention further provides a polynucleotide which comprises:

(a) the nucleotide sequence set out in SEQ ID NO: 18;

(b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 18;

(c) a nucleotide sequence having at least 72% sequence identity or more with the nucleotide sequence of SEQ ID NO: 18;

(d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length;

(e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d);

(f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).
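Item (e) covers sequences that differ at the nucleotide level but encode the same polypeptide because of the degeneracy of the genetic code. The sketch below illustrates this with the standard genetic code; the helper names `translate` and `is_degenerate_variant` are hypothetical, and the short sequences in the usage note are toy examples rather than sequences of the invention.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def translate(dna):
    """Translate a coding DNA sequence codon by codon ('*' marks a stop codon)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_degenerate_variant(dna_a, dna_b):
    """True if two different DNA sequences encode the same amino acid sequence."""
    return dna_a.upper() != dna_b.upper() and translate(dna_a) == translate(dna_b)
```

For instance, `ATGGCCAAA` and `ATGGCGAAG` both translate to `MAK`, so `is_degenerate_variant("ATGGCCAAA", "ATGGCGAAG")` is true.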

This invention further provides a polypeptide having 2,5-furan-dicarboxylic acid decarboxylase 1 activity which comprises the amino acid sequence set out in SEQ ID NO: 19 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 20 or a variant polypeptide thereof, wherein the variant has at least 54% sequence identity or more with the sequence set out in SEQ ID NO: 19.

This invention further provides a polynucleotide which comprises:

(a) the nucleotide sequence set out in SEQ ID NO: 20;

(b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 20;

(c) a nucleotide sequence having at least 66% sequence identity or more with the nucleotide sequence of SEQ ID NO: 20;

(d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length;

(e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d);

(f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).

This invention further provides a polypeptide having 2,5-furan-dicarboxylic acid decarboxylase 2 activity which comprises the amino acid sequence set out in SEQ ID NO: 21 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 22 or a variant polypeptide thereof, wherein the variant has at least 52% sequence identity or more with the sequence set out in SEQ ID NO: 21.

This invention further provides a polynucleotide which comprises:

(a) the nucleotide sequence set out in SEQ ID NO: 22;

(b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 22;

(c) a nucleotide sequence having at least 67% sequence identity or more with the nucleotide sequence of SEQ ID NO: 22;

(d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length;

(e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d);

(f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).

This invention further provides a polypeptide having HMF/furfural oxidoreductase activity which comprises the amino acid sequence set out in SEQ ID NO: 25 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 26 or a variant polypeptide thereof, wherein the variant has at least 45% sequence identity or more with the sequence set out in SEQ ID NO: 25.

This invention further provides a polynucleotide which comprises:

(a) the nucleotide sequence set out in SEQ ID NO: 26;

(b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 26;

(c) a nucleotide sequence having at least 66% sequence identity or more with the nucleotide sequence of SEQ ID NO: 26;

(d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length;

(e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d);

(f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).

This invention further provides a polypeptide having LysR type transcriptional regulator activity which comprises the amino acid sequence set out in SEQ ID NO: 33 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 34 or a variant polypeptide thereof, wherein the variant has at least 46% sequence identity or more with the sequence set out in SEQ ID NO: 33.

This invention further provides a polynucleotide which comprises:

(a) the nucleotide sequence set out in SEQ ID NO: 34;

(b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 34;

(c) a nucleotide sequence having at least 65% sequence identity or more with the nucleotide sequence of SEQ ID NO: 34;

(d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length;

(e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d);

(f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).

This invention further provides a polypeptide having Furoyl-CoA dehydrogenase (large subunit) activity which comprises the amino acid sequence set out in SEQ ID NO: 35 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 36 or a variant polypeptide thereof, wherein the variant has at least 54 % sequence identity or more with the sequence set out in SEQ ID NO: 35.

This invention further provides a polynucleotide which comprises:

(a) the nucleotide sequence set out in SEQ ID NO: 36;

(b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 36;

(c) a nucleotide sequence having at least 66% sequence identity or more with the nucleotide sequence of SEQ ID NO: 36;

(d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length;

(e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d);

(f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).

This invention further provides a polypeptide having Furoyl-CoA dehydrogenase (FAD binding subunit) activity which comprises the amino acid sequence set out in SEQ ID NO: 37 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 38 or a variant polypeptide thereof, wherein the variant has at least 49% sequence identity or more with the sequence set out in SEQ ID NO: 37.

This invention further provides a polynucleotide which comprises:

(a) the nucleotide sequence set out in SEQ ID NO: 38;

(b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 38;

(c) a nucleotide sequence having at least 71% sequence identity or more with the nucleotide sequence of SEQ ID NO: 38;

(d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length;

(e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d);

(f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).

This invention further provides a polypeptide having Furoyl-CoA dehydrogenase 2Fe-2S iron sulfur subunit activity which comprises the amino acid sequence set out in SEQ ID NO: 39 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 40 or a variant polypeptide thereof, wherein the variant has at least 64% sequence identity or more with the sequence set out in SEQ ID NO: 39.

This invention further provides a polynucleotide which comprises:

(a) the nucleotide sequence set out in SEQ ID NO: 40;

(b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 40;

(c) a nucleotide sequence having at least 70% sequence identity or more with the nucleotide sequence of SEQ ID NO: 40;

(d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length;

(e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d);

(f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).

This invention further provides a polypeptide having Furoyl-CoA synthetase activity which comprises the amino acid sequence set out in SEQ ID NO: 41 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 42 or a variant polypeptide thereof, wherein the variant has at least 57% sequence identity or more with the sequence set out in SEQ ID NO: 41.

This invention further provides a polynucleotide which comprises:

(a) the nucleotide sequence set out in SEQ ID NO: 42;

(b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 42;

(c) a nucleotide sequence having at least 68% sequence identity or more with the nucleotide sequence of SEQ ID NO: 42;

(d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length;

(e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d);

(f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).

This invention further provides a polypeptide having 2-oxoglutaroyl-CoA hydrolase activity which comprises the amino acid sequence set out in SEQ ID NO: 43 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 44 or a variant polypeptide thereof, wherein the variant has at least 72% sequence identity or more with the sequence set out in SEQ ID NO: 43.

This invention further provides a polynucleotide which comprises:

(a) the nucleotide sequence set out in SEQ ID NO: 44;

(b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 44;

(c) a nucleotide sequence having at least 74% sequence identity or more with the nucleotide sequence of SEQ ID NO: 44;

(d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length;

(e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d);

(f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).

This invention further provides a vector incorporating a polynucleotide sequence as described above or a cell comprising a polypeptide sequence as described above.

This invention further provides a host microorganism cell transformed or transfected by a polynucleotide sequence as described above or by a vector as described above under conditions conducive to express one or more of a Furoyl-CoA dehydrogenase, a Furoyl-CoA synthetase, a 2-oxoglutaroyl-CoA hydrolase, a 2,5-furan-dicarboxylic acid decarboxylase 1, a 2,5-furan-dicarboxylic acid decarboxylase 2 and a HMF/furfural oxidoreductase.

This invention further provides a process for the in-situ detoxification of lignocellulose hydrolysate containing furanic compounds, such as preferably one or more of HMF, Furfurylalcohol, Furfural and/or Furoic acid, with a suitable host microorganism, comprising contacting the lignocellulose hydrolysate with the host microorganism under conditions facilitating the expression of one or more of a Furoyl-CoA dehydrogenase, a Furoyl-CoA synthetase, a 2-oxoglutaroyl-CoA hydrolase, a 2,5-furan-dicarboxylic acid decarboxylase 1, a 2,5-furan-dicarboxylic acid decarboxylase 2 and a HMF/furfural oxidoreductase from a microorganism from the family of Burkholderiaceae to convert furanic compounds, such as HMF, Furfurylalcohol, Furfural and/or Furoic acid, to non-toxic components to obtain a detoxified lignocellulose hydrolysate.

This invention further provides a process for the production of Furoyl-CoA dehydrogenase, Furoyl-CoA synthetase, 2-oxoglutaroyl-CoA hydrolase, 2,5-furan-dicarboxylic acid decarboxylase 1, 2,5-furan-dicarboxylic acid decarboxylase 2 and/or a HMF/furfural oxidoreductase which are at least 45% identical to those expressed by Cupriavidus basilensis HMF14 DSM 22875, or a host microorganism as described above, comprising

(a) culturing a microorganism in a nutrient medium containing carbon and nitrogen sources and inorganic salts; and

(b) isolating the enzymes produced from the microorganism.

This invention further provides a process for the conversion of 5-hydroxymethylfurfural (HMF), 2,5-dihydroxymethyl furan (HMF alcohol), 5-hydroxymethyl-2-furancarboxylic acid (HMF acid) and/or 2,5-furandicarboxylic acid to 2-furoyl CoA, comprising contacting furfuryl alcohol and/or furfural with a furoyl-CoA dehydrogenase, furoyl-CoA synthetase, 2-oxoglutaroyl-CoA hydrolase, and 2,5-furan-dicarboxylic acid decarboxylase catalyst in the presence of one or more coenzyme cofactors.

Brief Description of the Figures

Figure 1 shows the growth of C. basilensis HMF14 on mineral salts medium with furfural as the sole carbon source .

Figure 2 illustrates the growth of C. basilensis HMF14 on different concentrations of furfural (A) or HMF (B).

Figure 3 depicts the detection of PHA in cultures of C. basilensis HMF14 in minimal medium with 120 mM acetate.

Figure 4 illustrates the detoxification of lignocellulosic hydrolysate by C. basilensis HMF14.

Figure 5 illustrates a reaction scheme of reactions catalysed by Cupriavidus basilensis HMF14.

Figure 6 is a schematic representation of the genetic organization of the furfural and HMF metabolic genes in C. basilensis HMF14 (A) and other species (B) that were identified as potential furfural and / or HMF utilizers.

Description of the Sequences

Sequences SEQ ID NO: 1 to 14 set out the DNA sequences of several synthetic DNA primers used for PCR and/or for cloning or isolation of relevant genes. Underlined sequences indicate a restriction site.

Synthetic DNA

SEQ ID NO: 1

Hybridizes to hmfA, forward primer.

GCACGCGCCTGAGTTACGAC

Synthetic DNA

SEQ ID NO: 2

Hybridizes to hmfR2, forward primer.

CATGCTCGGCGCTGGTGAC

Synthetic DNA

SEQ ID NO: 3

Hybridizes near start site of hmfF gene, forward primer.

CATGAATTCCGACCCAGGAGTCACGCCAT

Synthetic DNA

SEQ ID NO: 4

Hybridizes near stop site of hmfF gene, reverse primer.

CGGCGGCCGCGGATATACCGACAATGATGCGTCTCT

Synthetic DNA

SEQ ID NO: 5

Hybridizes near stop site of hmfG gene, reverse primer.

CGGCGGCCGCTGTCTCCTGCCTGTTCAGCATTCA

Synthetic DNA

SEQ ID NO: 6

Hybridizes near stop site of hmfD gene, reverse primer.

GCGGGCCCCTTACTCCTTGATGGTATCGACAGG

Synthetic DNA

SEQ ID NO: 7

Hybridizes near start site of hmfA gene, forward primer.

GCGGTACCGGGAGGGCCGGTCATGAG

Synthetic DNA

SEQ ID NO: 8

Hybridizes near start site of hmfD gene, forward primer.

GCGGTACCGGCGTAGATACCCAGGAGGC

Synthetic DNA

SEQ ID NO: 9

Hybridizes near stop site of hmfC gene, reverse primer.

GCGGGCCCCCACGCTTTGCAGGAAGGTG

Synthetic DNA

SEQ ID NO: 10

Hybridizes near start site of hmfG gene, forward primer.

CGGAATTCCGGCGCATGTGTTCACGC

Synthetic DNA

SEQ ID NO: 11

Hybridizes near stop site of hmfE gene, reverse primer.

GCGCGGCCGCCGCCCGCGATTTCCATCAG

Synthetic DNA

SEQ ID NO: 12

Hybridizes near start site of hmfE gene, forward primer.

GCGGTACCCCATCAAGGAGTAAGACATGACCC

Synthetic DNA

SEQ ID NO: 13

Hybridizes near start site of hmfH gene, forward primer.

CGGAATTCCACATGACAAGGGGAGACCG

Synthetic DNA

SEQ ID NO: 14

Hybridizes near stop site of hmfH gene, reverse primer.

CGGAATTCGCTTCGGTCTTCAACTCGGATG
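The underlining that marks the restriction sites is not preserved in this text. As a rough sketch, the primers can be scanned for the recognition sequences of a few common restriction enzymes (EcoRI GAATTC, NotI GCGGCCGC, KpnI GGTACC, ApaI GGGCCC). Which enzymes were actually used for cloning is an assumption here and is not stated in this section; the names `SITES`, `PRIMERS` and `find_sites` are hypothetical.

```python
# Recognition sequences of four common restriction enzymes.
# Assumption: these are candidates only; the specification does not name the enzymes here.
SITES = {
    "EcoRI": "GAATTC",
    "NotI": "GCGGCCGC",
    "KpnI": "GGTACC",
    "ApaI": "GGGCCC",
}

# A few primers from the listing above, keyed by SEQ ID NO.
PRIMERS = {
    3: "CATGAATTCCGACCCAGGAGTCACGCCAT",
    4: "CGGCGGCCGCGGATATACCGACAATGATGCGTCTCT",
    6: "GCGGGCCCCTTACTCCTTGATGGTATCGACAGG",
    7: "GCGGTACCGGGAGGGCCGGTCATGAG",
}


def find_sites(primer):
    """Return (enzyme, 0-based position) for every recognition site found in the primer."""
    hits = []
    for enzyme, site in SITES.items():
        pos = primer.find(site)
        while pos != -1:
            hits.append((enzyme, pos))
            pos = primer.find(site, pos + 1)
    return sorted(hits, key=lambda hit: hit[1])
```

Under these assumptions, each of the four primers shown carries a single candidate site close to its 5' end, which is consistent with a cloning site added upstream of the gene-specific part of the primer.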

SEQ ID NO: 15 sets out the amino acid sequence of a putative furan aldehyde dehydrogenase (adh) of Cupriavidus basilensis HMF14.

SEQ ID NO: 16 sets out the coding sequence of a putative furan aldehyde dehydrogenase (adh).

SEQ ID NO: 17 sets out the amino acid sequence of LysR-type transcriptional regulator hmfR1.

SEQ ID NO: 18 sets out the coding sequence of hmfR1.

SEQ ID NO: 19 sets out the amino acid sequence of 5-hydroxymethyl-2-furoic acid decarboxylase 1 hmfF.

SEQ ID NO: 20 sets out the coding sequence of 5-hydroxymethyl-2-furoic acid decarboxylase 1 hmfF.

SEQ ID NO: 21 sets out the amino acid sequence of 5-hydroxymethyl-2-furoic acid decarboxylase 2 hmfG.

SEQ ID NO: 22 sets out the coding sequence of 5-hydroxymethyl-2-furoic acid decarboxylase 2 hmfG.

SEQ ID NO: 23 sets out the amino acid sequence of the extracytoplasmic solute receptor hmfH'.

SEQ ID NO: 24 sets out the coding sequence of extracytoplasmic solute receptor hmfH'.

SEQ ID NO: 25 sets out the amino acid sequence of HMF/furfural oxidoreductase hmfH.

SEQ ID NO: 26 sets out the coding sequence of HMF/furfural oxidoreductase hmfH.

SEQ ID NO: 27 sets out the amino acid sequence of fatty acid hydroxylase hyd.

SEQ ID NO: 28 sets out the coding sequence of fatty acid hydroxylase hyd.

SEQ ID NO: 29 sets out the amino acid sequence of truncated LysR-type transcriptional regulator LysR hmfRt.

SEQ ID NO: 30 sets out the coding sequence for truncated LysR-type transcriptional regulator LysR hmfRt.

SEQ ID NO: 31 sets out the amino acid sequence for the major facilitator superfamily transporter, a putative furanic MFS-type transporter, mfs1.

SEQ ID NO: 32 sets out the coding sequence for a putative furanic MFS-type transporter, the major facilitator superfamily transporter mfs1.

SEQ ID NO: 33 sets out the amino acid sequence for LysR-type regulator hmfR2.

SEQ ID NO: 34 sets out the coding sequence for LysR-type regulator hmfR2.

SEQ ID NO: 35 sets out the amino acid sequence for the furoyl-CoA dehydrogenase large subunit hmfA.

SEQ ID NO: 36 sets out the coding sequence for furoyl-CoA dehydrogenase large subunit hmfA.

SEQ ID NO: 37 sets out the amino acid sequence for the Furoyl-CoA dehydrogenase FAD binding subunit, hmfB.

SEQ ID NO: 38 sets out the coding sequence for Furoyl-CoA dehydrogenase FAD binding subunit hmfB.

SEQ ID NO: 39 sets out the amino acid sequence for the Furoyl-CoA dehydrogenase 2Fe-2S iron sulfur subunit, hmfC.

SEQ ID NO: 40 sets out the coding sequence for Furoyl-CoA dehydrogenase 2Fe-2S iron sulfur subunit hmfC.

SEQ ID NO: 41 sets out the amino acid sequence for the Furoyl-CoA synthetase hmfD.

SEQ ID NO: 42 sets out the coding sequence for Furoyl-CoA synthetase hmfD.

SEQ ID NO: 43 sets out the amino acid sequence for the 2-oxoglutaroyl-CoA hydrolase hmfE.

SEQ ID NO: 44 sets out the coding sequence for 2-oxoglutaroyl-CoA hydrolase hmfE.

SEQ ID NO: 45 sets out the amino acid sequence for a putative furanic MFS-type transporter, the major facilitator superfamily transporter mfs2.

SEQ ID NO: 46 sets out the coding sequence for a putative furanic MFS-type transporter mfs2.

Definition of Terms

The following terms will be understood as defined herein unless otherwise stated. Such definitions include without recitation those meanings associated with these terms known to those skilled in the art.

Within the context of the subject invention, the term "furanic compound" may be selected from one or more of the following compounds: furfural, furfurylalcohol, furoic acid, hydroxymethylfurfural, hydroxymethylfurancarboxylic acid, furandimethanol, diformylfuran, formylfuran carboxylic acid, furandicarboxylic acid, formylthiophene, thiophene methanol, thiophene carboxylic acid, hydroxymethyl formylthiophene, hydroxymethyl thiophene carboxylic acid, thiophene dimethanol, diformyl thiophene, formylthiophene carboxylic acid, thiophene dicarboxylic acid, formylpyrrole, pyrrolylmethanol, pyrrole carboxylic acid, hydroxymethyl formylpyrrole, hydroxymethyl pyrrole carboxylic acid, pyrroledimethanol, diformylpyrrole, formylpyrrole carboxylic acid and pyrrole dicarboxylic acid.

Preferred furanic compounds are furfural, furfurylalcohol, furoic acid, hydroxymethylfurfural, hydroxymethylfurancarboxylic acid, furandimethanol, diformylfuran, formylfuran carboxylic acid and furandicarboxylic acid, more preferably at least one of hydroxymethylfurfural (HMF), hydroxymethylfuran carboxylic acid (HMF acid) and 2,5-dihydroxymethylfuran (HMF alcohol). In the furanic compound, the furan ring or any of its substitutable side-groups may be substituted, e.g. with OH, alkyl (e.g. C1-C10 alkyl), allyl, aryl or RO-ether moiety, including cyclic groups, in the furan ring on any available position. More preferably, furanic compounds are herein understood to be any compound having a furan group that may be oxidized to 2,5-furan-dicarboxylic acid or a precursor thereof.

Expression of polypeptides according to the invention: Regardless of the exact mechanism utilized for expression of enzymes, it is contemplated that such expression is transferable by the introduction of genes encoding these enzymes into another host cell by methods known in the art. Genetic elements as herein defined include nucleic acids (generally DNA or RNA) having expressible coding sequences for products such as proteins, specifically enzymes, apoproteins or antisense RNA, which express or regulate expression of relevant enzymes. The expressed proteins can function as enzymes, repress or derepress enzyme activity or control expression of enzymes. Recombinant DNA encoding these expressible sequences can be either chromosomal (integrated into the host cell chromosome by, for example, homologous recombination) or extra-chromosomal (for example, carried by one or more plasmids, cosmids and other vectors capable of self replication). It is understood that the recombinant DNA utilized for transforming the host cell in accordance with this invention can include, in addition to structural genes and transcription factors, expression control sequences, including promoters, repressors and enhancers, that act to control expression or derepression of coding sequences for proteins, apoproteins or antisense RNA. For example, such control sequences can be inserted into wild-type host cells to promote overexpression of selected enzymes already encoded in the host cell genome, or alternatively they can be used to control synthesis of extrachromosomally encoded enzymes.

Recombinant DNA can be introduced into the host cell by any means, including, but not limited to, plasmids, cosmids, phages, yeast artificial chromosomes or other vectors that mediate transfer of genetic elements into a host cell. These vectors can include an origin of replication, along with cis-acting control elements that control replication of the vector and the genetic elements carried by the vector. Selectable markers can be present on the vector to aid in the identification of host cells into which genetic elements have been introduced.

Means for introducing genetic elements into a host cell (e.g. cloning) are well known to the skilled artisan. One can utilize an extrachromosomal multi-copy plasmid vector to insert the genetic elements in accordance with the present invention. Plasmid-borne introduction of the genetic element into host cells involves an initial cleaving of a plasmid vector with a restriction enzyme, followed by ligation of the plasmid and genetic elements encoding for the targeted enzyme species in accordance with the invention. Upon recircularization of the ligated recombinant plasmid, infection (e.g., packaging in phage lambda) or other mechanism for plasmid transfer (e.g., electroporation, microinjection, etc.) is utilized to transfer the plasmid into the host cell. Plasmids suitable for insertion of genetic elements into the host cell are well known to the skilled artisan.

Other gene cloning methods include, but are not limited to, direct integration of the genetic material into the chromosome. This can occur by a variety of means, including cloning the genetic elements described herein on non-replicating plasmids flanked by homologous DNA sequences of the host chromosome; upon transforming said recombinant plasmid into a host the genetic elements can be introduced into the chromosome by DNA recombination. Such recombinant strains can be recovered if the integrating DNA fragments contain a selectable marker, such as antibiotic resistance. Alternatively, the genetic elements can be directly introduced into the chromosome of a host cell without use of a non-replicating plasmid. This can be done by synthetically producing DNA fragments of the genetic elements in accordance to the present invention that also contain homologous DNA sequences of the host chromosome. Again if these synthetic DNA fragments also contain a selectable marker, the genetic elements can be inserted into the host chromosome.

A DNA fragment, as used herein, may encode regulatory and/or structural genetic information. A DNA fragment useful in the present invention shall also include: nucleic acid molecules encoding sequences complementary to those provided; nucleic acid molecules (DNA or RNA) which hybridize under stringent conditions to those molecules that are provided; or those nucleic acid molecules that, but for the degeneracy of the genetic code, would hybridize to the molecules provided or their complementary strands. "Stringent" hybridization conditions are those that minimize formation of double stranded nucleic acid hybrids from non-complementary or mismatched single stranded nucleic acids. In addition, hybridization stringency may be affected by the various components of the hybridization reaction, including salt concentration, the presence or absence of formamide, the nucleotide composition of the nucleic acid molecules, etc. The nucleic acid molecules useful in the present invention may be either naturally or synthetically derived.
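The influence of nucleotide composition on hybridization can be illustrated with a small sketch. It computes the GC fraction of an oligonucleotide and a rough melting temperature using the Wallace rule of thumb (2 °C per A or T, 4 °C per G or C), which is a common approximation for short oligonucleotides and is not a method taken from the specification. Salt and formamide effects are ignored, and the function names are hypothetical.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rough melting temperature in degrees Celsius by the Wallace rule.

    Assumption: only meaningful as a quick estimate for short oligonucleotides;
    it does not model salt concentration or formamide.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc
```

Applied to the primer of SEQ ID NO: 1 (GCACGCGCCTGAGTTACGAC), this gives a GC fraction of 0.65 and an estimate of 66 °C. A GC-rich sequence therefore tolerates more stringent conditions than an AT-rich sequence of the same length.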

A "heterologous" or "exogenous" DNA fragment has been introduced into the host microorganism by any process such as transformation, transfection, transduction, conjugation, electroporation, etc. Additionally, it should be noted that it is possible that the host cell into which the "heterologous" DNA fragment has been inserted may itself also naturally harbour molecules encoding the same or similar sequences. A molecule such as this is referred to as a "homologous" DNA molecule.

A stably transformed microorganism is one that has had one or more DNA fragments introduced such that the introduced molecules are maintained, replicated and segregated in a growing culture. Stable transformation may be due to multiple or single chromosomal integration(s) or by (an) extrachromosomal element(s) such as (a) plasmid vector(s). A plasmid vector is capable of directing the expression of polypeptides encoded by particular DNA fragments.

Expression may be constitutive or regulated by inducible (or repressible) promoters that enable high levels of transcription of functionally associated DNA fragments encoding specific polypeptides.

Regardless of the exact mechanism utilized for expression of enzymes necessary for detoxification of lignocellulose, it is contemplated that such expression is transferable by the introduction of genes encoding these enzymes into another host cell by methods known in the art.

A preferred embodiment of the invention is a host cell comprising one or more polypeptides, polynucleotides, nucleic acid constructs or vectors according to the invention. This may be a cell in which the polypeptides, polynucleotides, nucleic acid constructs or vectors can suitably be expressed. The enzymes according to the invention may be favourably expressed in a host cell. The host cell according to the invention may be any host cell. The cell may be a prokaryote cell, a eukaryote cell, a plant cell or an animal cell.

The cell may be a host microorganism, which may be an autonomous single-celled organism useful for microbial production of biofuels, such as ethanol, as well as production of chemicals, including both eukaryotic and prokaryotic microorganisms. Useful microorganisms may be prokaryotes or eukaryotes and include organisms like bacteria, yeast, and fungi and plants.

Such a host microorganism usually contains all DNA, either endogenous or heterologous, required for the digestion of furanic compounds from lignocellulose hydrolysate. It may further preferably also comprise all DNA, either endogenous or heterologous, required for the conversion of fermentable sugars from lignocellulose hydrolysate for the production of a biofuel component, such as for instance ethanol, n- or iso-butanol. In such a cell, one or more genes may be deleted, knocked-out or disrupted in full or in part.

According to an embodiment, the host cell according to the invention is a eukaryotic host cell. Preferably, the eukaryotic cell is a mammalian, insect, plant, fungal, or algal cell. Preferred mammalian cells include e.g. Chinese hamster ovary (CHO) cells, COS cells, 293 cells, PerC6 cells, and hybridomas. Preferred insect cells include e.g. Sf9 and Sf21 cells and derivatives thereof.

More preferably, the eukaryotic cell is a fungal cell, such as for instance a yeast cell, such as those of the Candida, Hansenula, Kluyveromyces, Pichia, Saccharomyces, Schizosaccharomyces, and/or Yarrowia strains. More preferably, it is a cell selected from Kluyveromyces lactis, S. cerevisiae, Hansenula polymorpha, Yarrowia lipolytica and Pichia pastoris, or a filamentous fungal cell. More preferably, the eukaryotic cell is a yeast cell.

"Filamentous fungi" include all filamentous forms of the subdivision Eumycota and Oomycota (as defined by Hawksworth et al., In, Ainsworth and Bisby's Dictionary of The Fungi, 8th edition, 1995, CAB International, University Press, Cambridge, UK). The filamentous fungi are characterized by a mycelial wall composed of chitin, cellulose, glucan, chitosan, mannan, and other complex polysaccharides. Vegetative growth is by hyphal elongation and carbon catabolism is obligately aerobic. Filamentous fungal strains include, but are not limited to, strains of Acremonium, Agaricus, Aspergillus, Aureobasidium, Chrysosporium, Coprinus, Cryptococcus, Filibasidium, Fusarium, Humicola, Magnaporthe, Mucor, Myceliophthora, Neocallimastix, Neurospora, Paecilomyces, Penicillium, Piromyces, Phanerochaete, Pleurotus, Schizophyllum, Talaromyces, Thermoascus, Thielavia, Tolypocladium, and Trichoderma.

Preferred filamentous fungal cells belong to a species of an Aspergillus, Chrysosporium, Penicillium, Talaromyces or Trichoderma genus, and most preferably a species of Aspergillus niger, Aspergillus awamori, Aspergillus foetidus, Aspergillus sojae, Aspergillus fumigatus, Talaromyces emersonii, Aspergillus oryzae, Chrysosporium lucknowense, Trichoderma reesei or Penicillium chrysogenum. When the host cell according to the invention is an Aspergillus host cell, the host cell preferably is CBS 513.88, CBS124.903 or a derivative thereof.

Several strains of filamentous fungi are readily accessible to the public in a number of culture collections, such as the American Type Culture Collection (ATCC), Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSM), Centraalbureau Voor Schimmelcultures (CBS), and Agricultural Research Service Patent Culture Collection, Northern Regional Research Center (NRRL), for example Aspergillus niger CBS 513.88, Aspergillus oryzae ATCC 20423, IFO 4177, ATCC 1011, ATCC 9576, ATCC14488-14491, ATCC 11601, ATCC12892, P. chrysogenum CBS 455.95, Penicillium citrinum ATCC 38065, Penicillium chrysogenum P2, Talaromyces emersonii CBS 124.902, Acremonium chrysogenum ATCC 36225 or ATCC 48272, Trichoderma reesei ATCC 26921 or ATCC 56765 or ATCC 26921, Aspergillus sojae ATCC11906, Chrysosporium lucknowense ATCC44006 and derivatives thereof.

According to another preferred embodiment, the host cell according to the invention is a prokaryotic cell. Preferably, the prokaryotic host cell is a bacterial cell. The term "bacterial cell" includes both Gram-negative and Gram-positive microorganisms. Suitable bacteria may be selected from e.g. Escherichia, Anabaena, Caulobacter, Gluconobacter, Rhodobacter, Pseudomonas, Paracoccus, Bacillus, Brevibacterium, Corynebacterium, Rhizobium (Sinorhizobium), Flavobacterium, Klebsiella, Enterobacter, Lactobacillus, Lactococcus, Methylobacterium, Staphylococcus or Streptomyces.

Preferably, the bacterial cell is selected from the group consisting of B. subtilis, B. amyloliquefaciens, B. licheniformis, B. puntis, B. megaterium, B. halodurans, B. pumilus, G. (Gluconobacter) oxydans, Caulobacter crescentus CB 15, Methylobacterium extorquens, Rhodobacter sphaeroides, Pseudomonas zeaxanthinifaciens, Pseudomonas putida, Pseudomonas putida S12, Paracoccus denitrificans, E. coli, C. glutamicum, Staphylococcus carnosus, Streptomyces lividans, Sinorhizobium meliloti and Rhizobium radiobacter.

Preferred host organisms are any genus/species that are able to produce and incorporate the molybdopterin cofactor required for the furoyl-CoA dehydrogenase, or comprising a molybdopterin cofactor independent furoyl-CoA dehydrogenase, flavin adenine dinucleotide (FAD) or nicotinamide adenine dinucleotide (NAD+).

For specific uses of a compound produced in a host cell according to the invention, the selection of the host cell may be made according to such use. Where e.g. the compound produced in a host cell according to the invention is to be used in food applications, a host cell may be selected from a food-grade organism such as Saccharomyces cerevisiae. Specific uses include, but are not limited to, food, (animal) feed, pharmaceutical, agricultural such as crop-protection, and/or personal care applications.

The invention further relates to a method for the preparation of polypeptides having various enzymatic activities. This method comprises cultivating a cell according to the invention under conditions which allow for expression of the appropriate polypeptide and, optionally, recovering the expressed polypeptide. The invention also relates to a polypeptide obtainable by that method.

A polypeptide according to the invention having aldehyde dehydrogenase activity comprises the amino acid sequence set out in SEQ ID NO: 15 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 16 or a variant polypeptide thereof, wherein the variant has at least 41% sequence identity or more with the sequence set out in SEQ ID NO: 15.

In one embodiment the variant nucleic acid molecule comprises a nucleotide sequence encoding a protein, wherein the variant nucleic acid molecule comprises a substantially homologous nucleotide sequence of at least 64%, 66%, 68%, 70%, 72%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to the nucleotide sequence shown in SEQ ID NO: 16.

In another embodiment the variant protein comprises a substantially homologous amino acid sequence of at least 41%, 45%, 47%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to the amino acid sequence shown in SEQ ID NO: 15.

An embodiment of the invention is a polynucleotide which comprises: (a) the nucleotide sequence set out in SEQ ID NO: 16; (b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 16; (c) a nucleotide sequence having at least 64% sequence identity or more with the nucleotide sequence of SEQ ID NO: 16; (d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length; (e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d); (f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).
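Items (b), (e) and (f) of the list above rely on two standard notions: the reverse complement of a nucleotide sequence, and degeneracy of the genetic code (different codons encoding the same amino acid). The following is a minimal illustrative sketch only; the function names and the small codon-table excerpt are assumptions chosen for the example and are not part of the specification.

```python
# Illustrative sketch; all names here are hypothetical helpers.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Small excerpt of the standard genetic code, enough for the demo below.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
    "TAA": "*", "TAG": "*", "TGA": "*",
}


def translate(seq: str) -> str:
    """Translate an in-frame DNA sequence using the excerpt table above."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def is_degenerate_equivalent(a: str, b: str) -> bool:
    """True if two coding sequences encode the same amino acid sequence."""
    return translate(a) == translate(b)


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
    print(is_degenerate_equivalent("ATGGCTAAA", "ATGGCGAAG"))
```

In this sketch, two sequences that differ at the nucleotide level but translate to the same protein are treated as degenerate equivalents in the sense of item (e).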

A polypeptide according to the invention having LysR family transcriptional regulator activity comprises the amino acid sequence set out in SEQ ID NO: 17 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 18 or a variant polypeptide thereof, wherein the variant has at least 47% sequence identity or more with the sequence set out in SEQ ID NO: 17.

In one embodiment the variant nucleic acid molecule comprises a nucleotide sequence encoding a protein, wherein the variant nucleic acid molecule comprises a substantially homologous nucleotide sequence of at least 72%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to the nucleotide sequence shown in SEQ ID NO: 18.

In another embodiment the variant protein comprises a substantially homologous amino acid sequence of at least 47%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to the amino acid sequence shown in SEQ ID NO: 17.

An embodiment of the invention is a polynucleotide which comprises: (a) the nucleotide sequence set out in SEQ ID NO: 18; (b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 18; (c) a nucleotide sequence having at least 72% sequence identity or more with the nucleotide sequence of SEQ ID NO: 18; (d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length; (e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d); (f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e), or encodes a polypeptide; and to the related polypeptides.

A polypeptide according to the invention having 2,5-furan-dicarboxylic acid decarboxylase 1 activity comprises the amino acid sequence set out in SEQ ID NO: 19 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 20 or a variant polypeptide thereof, wherein the variant has at least 54% sequence identity or more with the sequence set out in SEQ ID NO: 19.

An embodiment of the invention is a polynucleotide which comprises: (a) the nucleotide sequence set out in SEQ ID NO: 20; (b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 20; (c) a nucleotide sequence having at least 66% sequence identity or more with the nucleotide sequence of SEQ ID NO: 20; (d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length; (e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d); (f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).

In one embodiment the variant nucleic acid molecule comprises a nucleotide sequence encoding a protein, wherein the variant nucleic acid molecule comprises a substantially homologous nucleotide sequence of at least 66%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to the nucleotide sequence shown in SEQ ID NO: 20.

In another embodiment the variant protein comprises a substantially homologous amino acid sequence of at least 54%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to the amino acid sequence shown in SEQ ID NO: 19.

A polypeptide according to the invention having 2,5-furan-dicarboxylic acid decarboxylase 2 activity comprises the amino acid sequence set out in SEQ ID NO: 21 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 22 or a variant polypeptide thereof, wherein the variant has at least 52% sequence identity or more with the sequence set out in SEQ ID NO: 21.

An embodiment of the invention is a polynucleotide which comprises: (a) the nucleotide sequence set out in SEQ ID NO: 22; (b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 22; (c) a nucleotide sequence having at least 67% sequence identity or more with the nucleotide sequence of SEQ ID NO: 22; (d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length; (e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d); (f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).

In one embodiment the variant nucleic acid molecule comprises a nucleotide sequence encoding a protein, wherein the variant nucleic acid molecule comprises a substantially homologous nucleotide sequence of at least 67%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to the nucleotide sequence shown in SEQ ID NO: 22.

In another embodiment the variant protein comprises a substantially homologous amino acid sequence of at least 52%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to the amino acid sequence shown in SEQ ID NO: 21.

A polypeptide according to the invention having HMF/furfural oxidoreductase activity comprises the amino acid sequence set out in SEQ ID NO: 25 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 26 or a variant polypeptide thereof, wherein the variant has at least 45% sequence identity or more with the sequence set out in SEQ ID NO: 25.

An embodiment of the invention is a polynucleotide which comprises: (a) the nucleotide sequence set out in SEQ ID NO: 26; (b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 26; (c) a nucleotide sequence having at least 66% sequence identity or more with the nucleotide sequence of SEQ ID NO: 26; (d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length; (e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d); (f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).

In one embodiment the variant nucleic acid molecule comprises a nucleotide sequence encoding a protein, wherein the variant nucleic acid molecule comprises a substantially homologous nucleotide sequence of at least 66%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to the nucleotide sequence shown in SEQ ID NO: 26.

In another embodiment the variant protein comprises a substantially homologous amino acid sequence of at least 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to the amino acid sequence shown in SEQ ID NO: 25.

A polypeptide according to the invention having fatty acid hydroxylase (hyd) activity comprises the amino acid sequence set out in SEQ ID NO: 27, or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 28 or a variant polypeptide thereof, wherein the variant has at least 66% sequence identity or more with the sequence set out in SEQ ID NO: 28.

An embodiment of the invention is a polynucleotide which comprises: (a) the nucleotide sequence set out in SEQ ID NO: 28; (b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 28; (c) a nucleotide sequence having at least 66% sequence identity or more with the nucleotide sequence of SEQ ID NO: 28; (d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length; (e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d); (f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).

In one embodiment the variant nucleic acid molecule comprises a nucleotide sequence encoding a protein, wherein the variant nucleic acid molecule comprises a substantially homologous nucleotide sequence of at least 66%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to the nucleotide sequence shown in SEQ ID NO: 28.

In another embodiment the variant protein comprises a substantially homologous amino acid sequence of at least 31%, 33%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to the amino acid sequence shown in SEQ ID NO: 27.

A polypeptide according to the invention having truncated LysR-type transcriptional regulator LysR (hmfRt) activity comprises the amino acid sequence set out in SEQ ID NO: 29, or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 30 or a variant polypeptide thereof, wherein the variant has at least 65% sequence identity or more with the sequence set out in SEQ ID NO: 30.

An embodiment of the invention is a polynucleotide which comprises: (a) the nucleotide sequence set out in SEQ ID NO: 30; (b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 30; (c) a nucleotide sequence having at least 85% sequence identity or more with the nucleotide sequence of SEQ ID NO: 30; (d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length; (e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d); (f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).

In one embodiment the variant nucleic acid molecule comprises a nucleotide sequence encoding a protein, wherein the variant nucleic acid molecule comprises a substantially homologous nucleotide sequence of at least 85%, 88%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to the nucleotide sequence shown in SEQ ID NO: 30.

In another embodiment the variant protein comprises a substantially homologous amino acid sequence of at least 37%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to the amino acid sequence shown in SEQ ID NO: 29.

A polypeptide according to the invention having putative furanic MFS-type transporter (major facilitator superfamily transporter mfs1) activity comprises the amino acid sequence set out in SEQ ID NO: 31, or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 32 or a variant polypeptide thereof, wherein the variant has at least 79% sequence identity or more with the sequence set out in SEQ ID NO: 32.

An embodiment of the invention is a polynucleotide which comprises: (a) the nucleotide sequence set out in SEQ ID NO: 32; (b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 32; (c) a nucleotide sequence having at least 79% sequence identity or more with the nucleotide sequence of SEQ ID NO: 32; (d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length; (e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d); (f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).

In one embodiment the variant nucleic acid molecule comprises a nucleotide sequence encoding a protein, wherein the variant nucleic acid molecule comprises a substantially homologous nucleotide sequence of at least 79%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to the nucleotide sequence shown in SEQ ID NO: 32.

In another embodiment the variant protein comprises a substantially homologous amino acid sequence of at least 32%, 33%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to the amino acid sequence shown in SEQ ID NO: 31.

A polypeptide according to the invention having LysR type transcriptional regulator activity comprises the amino acid sequence set out in SEQ ID NO: 33 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 34 or a variant polypeptide thereof, wherein the variant has at least 46% sequence identity or more with the sequence set out in SEQ ID NO: 33.

An embodiment of the invention is a polynucleotide which comprises: (a) the nucleotide sequence set out in SEQ ID NO: 34; (b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 34; (c) a nucleotide sequence having at least 65% sequence identity or more with the nucleotide sequence of SEQ ID NO: 34; (d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length; (e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d); (f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).

In one embodiment the variant nucleic acid molecule comprises a nucleotide sequence encoding a protein, wherein the variant nucleic acid molecule comprises a substantially homologous nucleotide sequence of at least 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to the nucleotide sequence shown in SEQ ID NO: 34.

In another embodiment the variant protein comprises a substantially homologous amino acid sequence of at least 46%, 50%, 55%, 60%, 65%, 70%, 76%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to the amino acid sequence shown in SEQ ID NO: 33.

A polypeptide according to the invention having Furoyl-CoA dehydrogenase (large subunit) activity comprises the amino acid sequence set out in SEQ ID NO: 35 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 36 or a variant polypeptide thereof, wherein the variant has at least 54% sequence identity or more with the sequence set out in SEQ ID NO: 35.

An embodiment of the invention is a polynucleotide which comprises: (a) the nucleotide sequence set out in SEQ ID NO: 36; (b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 36; (c) a nucleotide sequence having at least 66% sequence identity or more with the nucleotide sequence of SEQ ID NO: 36; (d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length; (e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d); (f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).

In one embodiment the variant nucleic acid molecule comprises a nucleotide sequence encoding a protein, wherein the variant nucleic acid molecule comprises a substantially homologous nucleotide sequence of at least 66%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to the nucleotide sequence shown in SEQ ID NO: 36.

In another embodiment the variant protein comprises a substantially homologous amino acid sequence of at least 54%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to the amino acid sequence shown in SEQ ID NO: 35.

A polypeptide according to the invention having Furoyl-CoA dehydrogenase FAD binding subunit activity comprises the amino acid sequence set out in SEQ ID NO: 37 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 38 or a variant polypeptide thereof, wherein the variant has at least 49% sequence identity or more with the sequence set out in SEQ ID NO: 37.

An embodiment of the invention is a polynucleotide which comprises: (a) the nucleotide sequence set out in SEQ ID NO: 38; (b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 38; (c) a nucleotide sequence having at least 71% sequence identity or more with the nucleotide sequence of SEQ ID NO: 38; (d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length; (e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d); (f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).

In one embodiment the variant nucleic acid molecule comprises a nucleotide sequence encoding a protein, wherein the variant nucleic acid molecule comprises a substantially homologous nucleotide sequence of at least 71%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to the nucleotide sequence shown in SEQ ID NO: 38.

In another embodiment the variant protein comprises a substantially homologous amino acid sequence of at least 49%, 50%, 55%, 58%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to the amino acid sequence shown in SEQ ID NO: 37.

A polypeptide according to the invention having Furoyl-CoA dehydrogenase 2Fe-2S iron sulfur subunit activity comprises the amino acid sequence set out in SEQ ID NO: 39 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 40 or a variant polypeptide thereof, wherein the variant has at least 64% sequence identity or more with the sequence set out in SEQ ID NO: 39.

An embodiment of the invention is a polynucleotide which comprises: (a) the nucleotide sequence set out in SEQ ID NO: 40; (b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 40; (c) a nucleotide sequence having at least 70% sequence identity or more with the nucleotide sequence of SEQ ID NO: 40; (d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length; (e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d); (f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).

In one embodiment the variant nucleic acid molecule comprises a nucleotide sequence encoding a protein, wherein the variant nucleic acid molecule comprises a substantially homologous nucleotide sequence of at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to the nucleotide sequence shown in SEQ ID NO: 40.

In another embodiment the variant protein comprises a substantially homologous amino acid sequence of at least 64%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to the amino acid sequence shown in SEQ ID NO: 39.

A polypeptide according to the invention having Furoyl-CoA synthetase activity comprises the amino acid sequence set out in SEQ ID NO: 41 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 42 or a variant polypeptide thereof, wherein the variant has at least 57% sequence identity or more with the sequence set out in SEQ ID NO: 41.

An embodiment of the invention is a polynucleotide which comprises: (a) the nucleotide sequence set out in SEQ ID NO: 42; (b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 42; (c) a nucleotide sequence having at least 68% sequence identity or more with the nucleotide sequence of SEQ ID NO: 42; (d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length; (e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d); (f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).

In one embodiment the variant nucleic acid molecule comprises a nucleotide sequence encoding a protein, wherein the variant nucleic acid molecule comprises a substantially homologous nucleotide sequence of at least 68%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to the nucleotide sequence shown in SEQ ID NO: 42.

In another embodiment the variant protein comprises a substantially homologous amino acid sequence of at least 57%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to the amino acid sequence shown in SEQ ID NO: 41.

A polypeptide according to the invention having 2-oxoglutaroyl-CoA hydrolase activity comprises the amino acid sequence set out in SEQ ID NO: 43 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 44 or a variant polypeptide thereof, wherein the variant has at least 72% sequence identity or more with the sequence set out in SEQ ID NO: 43.

An embodiment of the invention is a polynucleotide which comprises: (a) the nucleotide sequence set out in SEQ ID NO: 44; (b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 44; (c) a nucleotide sequence having at least 74% sequence identity or more with the nucleotide sequence of SEQ ID NO: 44; (d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length; (e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d); (f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).

In one embodiment the variant nucleic acid molecule comprises a nucleotide sequence encoding a protein, wherein the variant nucleic acid molecule comprises a substantially homologous nucleotide sequence of at least 74%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to the nucleotide sequence shown in SEQ ID NO: 44.

In another embodiment the variant protein comprises a substantially homologous amino acid sequence of at least 72%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to the amino acid sequence shown in SEQ ID NO: 43.

A polypeptide according to the invention having putative furanic MFS-type transporter (major facilitator superfamily transporter mfs2) activity comprises the amino acid sequence set out in SEQ ID NO: 45, or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 46 or a variant polypeptide thereof, wherein the variant has at least 29% sequence identity or more with the sequence set out in SEQ ID NO: 45.

An embodiment of the invention is a polynucleotide which comprises: (a) the nucleotide sequence set out in SEQ ID NO: 46; (b) a nucleotide sequence which hybridizes selectively with a polynucleotide being the reverse complement of SEQ ID NO: 46; (c) a nucleotide sequence having at least 79% sequence identity or more with the nucleotide sequence of SEQ ID NO: 46; (d) a fragment of a nucleotide sequence as defined in (a), (b) or (c) which is at least about 100 nucleotides in length; (e) a sequence which is degenerate as a result of the genetic code to a sequence as defined in any one of (a), (b), (c) or (d); (f) a nucleotide sequence which is the reverse complement of a nucleotide sequence as defined in (a), (b), (c), (d) or (e).

In one embodiment the variant nucleic acid molecule comprises a nucleotide sequence encoding a protein, wherein the variant nucleic acid molecule comprises a substantially homologous nucleotide sequence of at least 79%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to the nucleotide sequence shown in SEQ ID NO: 46.

In another embodiment the variant protein comprises a substantially homologous amino acid sequence of at least 29%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to the amino acid sequence shown in SEQ ID NO: 45.

A further embodiment is a vector incorporating the polynucleotide sequences or nucleic acid constructs set out above.

The terms "homology", "sequence identity" and the like are used interchangeably herein. For the purpose of this invention, it is defined herein that in order to determine the degree of sequence identity shared by two amino acid sequences or by two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in the sequence of a first amino acid or nucleic acid sequence for optimal alignment with a second amino acid or nucleic acid sequence). Such alignment may be carried out over the full lengths of the sequences being compared. Alternatively, the alignment may be carried out over a shorter comparison length, for example over about 20, about 50, about 100 or more nucleic acids/bases or amino acids. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position. The degree of identity shared between sequences is typically expressed in terms of percentage identity between the two sequences and is a function of the number of identical positions shared by the sequences (i.e., % identity = number of identical positions / total number of positions (i.e. overlapping positions) x 100). Preferably, the two sequences being compared are of the same or substantially the same length.
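The percentage identity formula above can be illustrated with a short sketch. It assumes the two sequences have already been aligned and padded with a gap character, and it interprets "overlapping positions" as alignment columns in which neither sequence has a gap; that interpretation, the function name and the gap symbol are assumptions made for the example, not definitions taken from the specification.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over overlapping (gap-free) alignment columns.

    % identity = identical positions / overlapping positions x 100
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    # Keep only columns where both sequences have a residue (assumption:
    # this is what "overlapping positions" means).
    overlap = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != gap and y != gap]
    if not overlap:
        return 0.0

    identical = sum(1 for x, y in overlap if x == y)
    return 100.0 * identical / len(overlap)


if __name__ == "__main__":
    # Five overlapping columns, four of them identical.
    print(percent_identity("ACGT-A", "ACCTTA"))
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which is one reason reported identities can differ slightly between programs.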

The skilled person will be aware of the fact that several different computer programs are available to determine the homology between two sequences. For instance, a comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percentage identity between two amino acid sequences is determined using the Needleman and Wunsch (J. Mol. Biol. (48): 444-453 (1970)) algorithm which has been incorporated into the GAP program in the Accelrys GCG software package (available at http://www.accelrys.com/products/gcg/), using either a BLOSUM62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. The skilled person will appreciate that all these different parameters will yield slightly different results but that the overall percentage identity of two sequences is not significantly altered when using different algorithms.
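For orientation only, the global alignment principle of Needleman and Wunsch can be sketched in a few lines of Python. This is not the GAP program: instead of a BLOSUM62 or PAM250 matrix with separate gap and length weights, the sketch assumes a simple match/mismatch score and a single linear gap penalty, and all names and scoring values are illustrative.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Globally align a and b; return (aligned_a, aligned_b, score).

    Simplified scoring (assumed values): +match, +mismatch, linear gap penalty.
    """
    n, m = len(a), len(b)

    def sub(i: int, j: int) -> int:
        # Substitution score for a[i-1] against b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Dynamic programming table of best scores for the prefixes a[:i], b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
```

Changing the scoring values changes which alignment is optimal, which is the effect the paragraph above refers to when it notes that different parameters yield slightly different results.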

In yet another embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the Accelrys GCG software package (available at http://www.accelrys.com/products/gcg/), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. In another embodiment, the percent identity between two amino acid or nucleotide sequences is determined using the algorithm of E. Meyers and W. Miller (CABIOS, 4:11-17 (1989)) which has been incorporated into the ALIGN program (version 2.0) (available at the ALIGN Query using sequence data of the Genestream server IGH Montpellier France http://vega.igh.cnrs.fr/bin/align-guess.cgi) using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.

The nucleic acid and protein sequences of the present invention can further be used as a "query sequence" to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score = 100, wordlength = 12 to obtain nucleotide sequences for instance homologous to oxidoreductase encoding nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score = 50, wordlength = 3 to obtain amino acid sequences homologous to oxidoreductase protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25(17): 3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST, also known as BLASTn and BLASTx) can be used. See the homepage of the National Center for Biotechnology Information at http://www.ncbi.nlm.nih.gov/.

As used herein, the terms "selectively hybridizing", "hybridizes selectively" and similar terms are intended to describe conditions for hybridization and washing under which nucleotide sequences at least 70%, at least 75%, at least 80%, more preferably at least 85%, even more preferably at least 90%, preferably at least 95%, more preferably at least 98% or more preferably at least 99% homologous to each other typically remain hybridized to each other. That is to say, such hybridizing sequences may share at least 70%, at least 75%, at least 80%, more preferably at least 85%, even more preferably at least 90%, more preferably at least 95%, more preferably at least 98% or more preferably at least 99% sequence identity.

A preferred, non-limiting example of such hybridization conditions is hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45°C, followed by one or more washes in 1X SSC, 0.1% SDS at about 50°C, preferably at about 55°C, preferably at about 60°C and even more preferably at about 65°C.

Highly stringent conditions include, for example, hybridization at about 68°C in 5x SSC/5x Denhardt's solution/1.0% SDS and washing in 0.2x SSC/0.1% SDS at room temperature. Alternatively, washing may be performed at 42°C.

The skilled artisan will know which conditions to apply for stringent and highly stringent hybridization conditions. Additional guidance regarding such conditions is readily available in the art, for example, in Sambrook et al., 1989, Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press, N.Y.; and Ausubel et al. (eds.), 1995, Current Protocols in Molecular Biology, (John Wiley & Sons, N.Y.).

Of course, a polynucleotide which hybridizes only to a poly(A) sequence (such as the 3' terminal poly(A) tract of mRNAs), or to a complementary stretch of T (or U) residues, would not be included in a polynucleotide of the invention used to specifically hybridize to a portion of a nucleic acid of the invention, since such a polynucleotide would hybridize to any nucleic acid molecule containing a poly(A) stretch or the complement thereof (e.g., practically any double-stranded cDNA clone).

In a typical approach, gene libraries constructed from other organisms can be screened, e.g. from a bacterium, in particular from the micro-organism family Trichomaceae, for example from the genus Burkholderia, such as Burkholderia phytofirmans.

For example, Burkholderia strains can be screened for homologous encoding polynucleotides according to the invention by Southern blot analysis. Upon detection of homologous DNA restriction fragments according to the invention, gene libraries can be constructed from chromosomal fragments of the same size from the appropriate strain, utilizing standard techniques well known to those of skill in the art. Alternatively, if the microorganism is a eukaryote, the mRNA transcript of the respective genes according to the invention can be identified by Northern hybridization and, upon identification of the transcript, cDNA libraries can be prepared using total RNA isolated from the eukaryotic microorganism. Homologous gene sequences can be isolated, for example, by performing PCR using two degenerate oligonucleotide primer pools designed on the basis of nucleotide sequences as taught herein. The template for the reaction can be total chromosomal DNA from the strain known or suspected to express a polynucleotide according to the invention. The PCR product can be subcloned and sequenced to ensure that the amplified sequences represent the sequences of a new oxidoreductase nucleic acid sequence, or a functional equivalent thereof.
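To illustrate what a degenerate oligonucleotide primer pool is, the following minimal Python sketch expands a primer written with standard IUPAC nucleotide ambiguity codes into every concrete sequence it represents. The function name and the example primers are hypothetical and are not derived from the sequences taught herein.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate_primer(primer: str) -> list[str]:
    """Return all concrete sequences encoded by a degenerate primer."""
    try:
        choices = [IUPAC[base] for base in primer.upper()]
    except KeyError as exc:
        raise ValueError(f"not an IUPAC nucleotide code: {exc.args[0]}") from None
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    pool = expand_degenerate_primer("AYG")  # hypothetical primer
    print(len(pool), pool)
```

The size of the pool is the product of the number of alternatives at each position, so the degeneracy grows quickly with every ambiguous base.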

Alternatively, the template for the reaction can be cDNA obtained by reverse transcription of mRNA prepared from strains known or suspected to express a polynucleotide according to the invention. The PCR product can be subcloned and sequenced to ensure that the amplified sequences represent the sequences of a new oxidoreductase nucleic acid sequence, or a functional equivalent thereof.

The PCR fragment can then be used to isolate a full-length cDNA clone by a variety of known methods. For example, the amplified fragment can be labeled and used to screen a bacteriophage or cosmid cDNA library. Alternatively, the labeled fragment can be used to screen a genomic library.

PCR technology also can be used to isolate full-length cDNA sequences from other organisms. For example, RNA can be isolated, following standard procedures, from an appropriate cellular or tissue source. A reverse transcription reaction can be performed on the RNA using an oligonucleotide primer specific for the most 5' end of the amplified fragment for the priming of first strand synthesis.

The resulting RNA/DNA hybrid can then be "tailed" (e.g., with guanines) using a standard terminal transferase reaction, the hybrid can be digested with RNase H, and second strand synthesis can then be primed (e.g., with a poly-C primer). Thus, cDNA sequences upstream of the amplified fragment can easily be isolated. For a review of useful cloning strategies, see e.g., Sambrook et al., supra; and Ausubel et al., supra.

Another aspect of the invention pertains to vectors, including cloning and expression vectors, comprising a polynucleotide of the invention encoding an oxidoreductase protein or a functional equivalent thereof and methods of growing, transforming or transfecting such vectors in a suitable host cell, for example under conditions in which expression of a polypeptide of the invention occurs. As used herein, the term "vector" refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked.

Polynucleotides of the invention can be incorporated into a recombinant replicable vector, for example a cloning or expression vector. The vector may be used to replicate the nucleic acid in a compatible host cell. Thus in a further embodiment, the invention provides a method of making polynucleotides of the invention by introducing a polynucleotide of the invention into a replicable vector, introducing the vector into a compatible host cell, and growing the host cell under conditions which bring about replication of the vector. The vector may be recovered from the host cell. Suitable host cells are described below.

The vector into which the expression cassette or polynucleotide of the invention is inserted may be any vector which may conveniently be subjected to recombinant DNA procedures, and the choice of the vector will often depend on the host cell into which it is to be introduced. A vector according to the invention may be an autonomously replicating vector, i.e. a vector which exists as an extra-chromosomal entity, the replication of which is independent of chromosomal replication, e.g. a plasmid. Alternatively, the vector may be one which, when introduced into a host cell, is integrated into the host cell genome and replicated together with the chromosome(s) into which it has been integrated.

One type of vector is a "plasmid", which refers to a circular double stranded DNA loop into which additional DNA segments can be ligated. Another type of vector is a viral vector, wherein additional DNA segments can be ligated into the viral genome. Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome. Moreover, certain vectors are capable of directing the expression of genes to which they are operatively linked. Such vectors are referred to herein as "expression vectors". In general, expression vectors of utility in recombinant DNA techniques are often in the form of plasmids. The terms "plasmid" and "vector" can be used interchangeably herein as the plasmid is the most commonly used form of vector. However, the invention is intended to include other forms of expression vectors, such as cosmids, viral vectors (e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses) and phage vectors, which serve equivalent functions. Vectors according to the invention may be used in vitro, for example for the production of RNA, or used to transfect or transform a host cell.

A vector of the invention may comprise two or more, for example three, four or five, polynucleotides of the invention, for example for overexpression.

The recombinant expression vectors of the invention comprise a nucleic acid of the invention in a form suitable for expression of the nucleic acid in a host cell, which means that the recombinant expression vector includes one or more regulatory sequences, selected on the basis of the host cells to be used for expression, which are operably linked to the nucleic acid sequence to be expressed.

Within a vector, such as an expression vector, "operably linked" is intended to mean that the nucleotide sequence of interest is linked to the regulatory sequence(s) in a manner which allows for expression of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in a host cell when the vector is introduced into the host cell), i.e. the term "operably linked" refers to a juxtaposition wherein the components described are in a relationship permitting them to function in their intended manner. A regulatory sequence such as a promoter, enhancer or other expression regulation signal "operably linked" to a coding sequence is positioned in such a way that expression of the coding sequence is achieved under conditions compatible with the control sequences, or the sequences are arranged so that they function in concert for their intended purpose, for example transcription initiates at a promoter and proceeds through the DNA sequence encoding the polypeptide. The term "regulatory sequence" or "control sequence" is intended to include promoters, enhancers and other expression control elements (e.g., polyadenylation signal). Such regulatory sequences are described, for example, in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, CA (1990).

The term regulatory or control sequences includes those sequences which direct constitutive expression of a nucleotide sequence in many types of host cells and those which direct expression of the nucleotide sequence only in a certain host cell (e.g. tissue-specific regulatory sequences).

A vector or expression construct for a given host cell may thus comprise the following elements operably linked to each other in a consecutive order from the 5'-end to 3'-end relative to the coding strand of the sequence encoding the polypeptide of the first invention: (1) a promoter sequence capable of directing transcription of the nucleotide sequence encoding the polypeptide in the given host cell; (2) optionally, a signal sequence capable of directing secretion of the polypeptide from the given host cell into a culture medium; (3) a DNA sequence of the invention encoding a mature and preferably active form of a polypeptide having cellobiohydrolase activity; and preferably also (4) a transcription termination region (terminator) capable of terminating transcription downstream of the nucleotide sequence encoding the polypeptide.

Downstream of the nucleotide sequence according to the invention there may be a 3' untranslated region containing one or more transcription termination sites (e.g. a terminator). The origin of the terminator is less critical. The terminator can, for example, be native to the DNA sequence encoding the polypeptide. However, preferably a yeast terminator is used in yeast host cells and a filamentous fungal terminator is used in filamentous fungal host cells. More preferably, the terminator is endogenous to the host cell (in which the nucleotide sequence encoding the polypeptide is to be expressed). In the transcribed region, a ribosome binding site for translation may be present. The coding portion of the mature transcripts expressed by the constructs will include a translation initiating AUG at the beginning and a termination codon appropriately positioned at the end of the polypeptide to be translated.

Enhanced expression of the polynucleotide of the invention may also be achieved by the selection of heterologous regulatory regions, e.g. promoter, secretion leader and/or terminator regions, which may serve to increase expression and, if desired, secretion levels of the protein of interest from the expression host and/or to provide for the inducible control of the expression of a polypeptide of the invention.

It will be appreciated by those skilled in the art that the design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, etc. The vectors, such as expression vectors, of the invention can be introduced into host cells to thereby produce proteins or peptides, encoded by nucleic acids as described herein (e.g. in the case of hmfH, oxidoreductase proteins, mutant forms of oxidoreductase proteins, fragments, variants or functional equivalents thereof, fusion proteins, etc.). The same applies to the other polypeptides according to the invention. The vectors, such as recombinant expression vectors, of the invention can be designed for expression of suitable proteins in prokaryotic or eukaryotic cells. For example, oxidoreductase proteins can be expressed in bacterial cells such as E. coli, insect cells (using baculovirus expression vectors), filamentous fungi, yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, CA (1990). Representative examples of appropriate hosts are described hereafter. Appropriate culture media and conditions for the above-described host cells are known in the art.

As set out above, the term "control sequences" or "regulatory sequences" is defined herein to include at least any component which may be necessary and/or advantageous for the expression of a polypeptide. Any control sequence may be native or foreign to the nucleic acid sequence of the invention encoding a polypeptide. Such control sequences may include, but are not limited to, a promoter, a leader, optimal translation initiation sequences (as described in Kozak, 1991, J. Biol. Chem. 266:19867-19870), a secretion signal sequence, a pro-peptide sequence, a polyadenylation sequence and a transcription terminator. At a minimum, the control sequences typically include a promoter, and transcriptional and translational stop signals.

A stably transformed microorganism is one that has had one or more DNA fragments introduced such that the introduced molecules are maintained, replicated and segregated in a growing culture. Stable transformation may be due to multiple or single chromosomal integration(s) or by (an) extrachromosomal element(s) such as (a) plasmid vector(s). A plasmid vector is capable of directing the expression of polypeptides encoded by particular DNA fragments. Expression may be constitutive or regulated by inducible (or repressible) promoters that enable high levels of transcription of functionally associated DNA fragments encoding specific polypeptides.

Isolation of the polypeptides according to the invention: one or more polypeptides according to the invention or DNA material expressing the polypeptides according to the invention may be isolated from an organism, preferably a microorganism that expresses the oxidoreductase.

Preferably, the microorganism is capable of using furanic compounds, preferably HMF and/or furfural, as a substrate, more preferably not using other carbon sources such as C5 and/or C6 sugars. The microorganism preferably is chosen from the group consisting of: Cupriavidus (preferably Cupriavidus basilensis, Cupriavidus eutropha and/or Cupriavidus basilensis HMF14), Burkholderia (preferably Burkholderia phytofirmans and/or Burkholderia phytofirmans PsJN), Bradyrhizobium (preferably Bradyrhizobium japonicum and/or Bradyrhizobium japonicum USDA110), Methylobacterium (preferably Methylobacterium radiotolerans and/or Methylobacterium radiotolerans JCM2831).

Most preferred polypeptides useful in the present invention convert furfural and/or HMF as substrate, and are polypeptides according to the invention isolated from Cupriavidus basilensis HMF14 herein, deposited in accordance with the Budapest Treaty on International Recognition of the Deposits of Microorganisms for the Purpose of Patent Procedures at the Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (German Collection of Microorganisms and Cell Cultures), Inhoffenstraße 7B, 38124 Braunschweig, GERMANY as strain HMF14, having deposit number DSM 22875; deposition date of August 19, 2009.

The present invention thus also relates to the isolated HMF-utilizing bacterium, Cupriavidus basilensis strain HMF14, and its genes involved in the HMF degradative pathway.

One of the genes (herein defined as hmfH) encodes a 579-amino acid, 62 kDa FAD-dependent oxidoreductase that was found to oxidize furfuryl alcohol, furfural, HMF, and 5-hydroxymethyl-furoic acid.

The alcohol/aldehyde groups at C2 and C5 in these molecules are oxidized, to yield, i.e., furan-dicarboxylic acid from HMF and 5-furoic acid from furfuryl alcohol or furfural, respectively (see Figure 5).

The present invention thus provides polynucleotides encoding polypeptides having the following activity: a Furoyl-CoA dehydrogenase large subunit; a Furoyl-CoA dehydrogenase FAD binding subunit; a Furoyl-CoA dehydrogenase 2Fe-2S iron sulfur subunit; a Furoyl-CoA synthetase; a 2-oxoglutaroyl-CoA hydrolase; a 2,5-furan-dicarboxylic acid decarboxylase 1; a 2,5-furan-dicarboxylic acid decarboxylase 2; a HMF/furfural oxidoreductase; a LysR type transcriptional regulator 1 and a LysR type transcriptional regulator 2; an aldehyde dehydrogenase; and major family supertransporters 1 and 2.

Enzymes are herein understood as a subclass of polypeptides. A preferred furanic-compound-removing pathway useful in the present invention uses a novel Furoyl-CoA dehydrogenase, a novel Furoyl-CoA synthetase, a novel 2-oxoglutaroyl-CoA hydrolase, two novel 2,5-furan-dicarboxylic acid decarboxylases (1 and 2), and/or a novel HMF/furfural oxidoreductase isolated from Cupriavidus basilensis HMF14 herein, deposited in accordance with the Budapest Treaty on International Recognition of the Deposits of Microorganisms for the Purpose of Patent Procedures as Deposit number DSM 22875.

Lignocellulose hydrolysate is the result of subjecting lignocellulosic material to a pre-treatment step. Suitable lignocellulose-containing material according to the subject invention includes, but is not limited to, wood such as wood chips and saw dust; municipal waste containing lignocellulose; waste paper pulp; perennial grasses such as switchgrass (Panicum virgatum); Miscanthus species such as Miscanthus x giganteus, Miscanthus sinensis and Miscanthus sacchariflorus; energy cane, sugar cane; sweet sorghum; corn cobs and corn stover, wheat straw, rice straw and other sources of lignocellulosic material.

Lignocellulose is one of the most abundant plant material resources in the world. However, an effective pretreatment is needed to remove the rigid crystalline structure, enforced by lignin and hemicellulose, to render it accessible for a subsequent hydrolysis step.

The pretreatment preferably includes both physical and chemical pretreatment steps. Physical pretreatment is often called size reduction, as it reduces the physical size of the biomass. Chemical pretreatment removes chemical barriers so that the enzymes can access the cellulose for degradation.

Currently used pretreatment techniques include acid hydrolysis, steam explosion, ammonia fiber expansion, organosolv and sulfite pretreatment, alkaline wet oxidation and ozone pretreatment. Most pretreatment techniques result in the formation of degradation products that have inhibitory effects on subsequent hydrolysis and fermentation processes. The presence of inhibitors will not only further complicate the ethanol production but also increase the cost of production due to the detoxification steps entailed.

The method most often applied includes acid hydrolysis, where the lignocellulosic material is subjected to an acid such as sulphuric acid, whereby the sugar polymers cellulose and hemicellulose are partly or completely hydrolysed to their constituent sugar monomers. Another type of lignocellulose hydrolysis is steam explosion, a process comprising heating of the lignocellulosic material by steam injection to a temperature of 190-230°C. A third method is wet oxidation, wherein the material is treated with oxygen at 150-185°C. All methods may also be combined, e.g. steam explosion or oxidation in presence of acids. These pretreatments may be followed by enzymatic hydrolysis to complete the release of sugar monomers. The pretreatment steps result in the hydrolysis of cellulose into glucose, while hemicellulose is transformed into the pentoses xylose and arabinose and the hexoses glucose, galactose and mannose.

The pretreatment step may in certain embodiments be supplemented with treatment resulting in further hydrolysis of the cellulose and hemicellulose. The purpose of such an additional hydrolysis treatment is to hydrolyse oligosaccharide and possibly polysaccharide species of cellulose and/or hemicellulose origin produced during the acid hydrolysis, wet oxidation, or steam explosion, to form fermentable sugars (e.g. glucose, xylose and possibly other monosaccharides). Such further treatments may be either chemical or enzymatic. Chemical hydrolysis is typically achieved by treatment with an acid, such as treatment with aqueous sulphuric acid or formic acid, at a temperature in the range of about 100-150°C. Enzymatic hydrolysis is typically performed by treatment with one or more appropriate carbohydrase enzymes such as cellulases, glucosidases and hemicellulases including xylanases.

Detailed description of the invention

Without wishing to be bound to any particular theory, it is believed that the inhibitory effect of lignocellulosic hydrolysate on fermentation comes from a concerted effect of many toxic constituents, but that furanic derivatives have a key role in this effect. It has been surprisingly found that C. basilensis HMF14 according to the invention is able to selectively degrade furanic derivatives in lignocellulosic hydrolysate, while essentially not utilizing fermentable sugars.

C. basilensis HMF14 was isolated on HMF; however, it was found to also utilize furfural as sole carbon source. As a further beneficial effect, C. basilensis HMF14 is also able to remove the majority of non-furanic inhibitory compounds, such as acetate and formate, as illustrated in Figure 3, which shows that the furanic derivatives, acetate and formate were completely removed from wheat straw hydrolysate after only ten hours of cultivation. During this period, the glucose, xylose and arabinose concentrations, i.e. fermentable sugars in lignocellulose hydrolysate, were stable. Only when the incubation was prolonged after these inhibitor compounds were consumed did the sugar concentration decrease, by approximately 11% within 15 h. Without wishing to be bound to any particular theory, it is believed that this may indicate that the sugars are not converted, but absorbed in the mucous layer around the bacterial cells.

Accordingly, the ability to digest or convert furanic compounds without use of fermentable sugars is defined as the reduction of the inhibitor concentration while maintaining a stable fermentable sugar concentration before all furanic compounds are converted. The term "all furanic compounds are converted" refers to a concentration of furanic compounds of less than 3000 ppmw, more preferably less than 2000 ppmw, more preferably less than 1500 ppmw, more preferably less than 1000 ppmw, more preferably less than 500 ppmw, and yet more preferably less than 150 ppmw. In a further embodiment the term "all furanic compounds are converted" refers to a concentration of furanic compounds that is equal to or less than 50 wt%, more preferably equal to or less than 30 wt%, more preferably equal to or less than 10 wt%, more preferably equal to or less than 5 wt%, yet more preferably equal to or less than 1 wt%, and still more preferably equal to or less than 0.1 wt% of the concentration of furanic compounds present in the lignocellulose hydrolysate before digestion or conversion of the furanic compounds as described herein. Undiluted wheat straw hydrolysate was detoxified to completion as well, although an extended lag phase occurred that could be ameliorated by increasing the inoculum density.
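The two alternative definitions of "all furanic compounds are converted" given above (an absolute limit in ppmw, or a residual fraction of the starting concentration in wt%) can be expressed as a short Python sketch. The function names, default limits and example concentrations are illustrative assumptions; the defaults simply use the broadest limits stated above (less than 3000 ppmw, and equal to or less than 50 wt%).

```python
def converted_by_absolute_limit(residual_ppmw: float, limit_ppmw: float = 3000.0) -> bool:
    """True if the residual furanic concentration is strictly below the ppmw limit."""
    return residual_ppmw < limit_ppmw


def converted_by_relative_limit(residual_ppmw: float, initial_ppmw: float,
                                limit_percent: float = 50.0) -> bool:
    """True if the residual concentration is at most limit_percent of the initial one."""
    if initial_ppmw <= 0:
        raise ValueError("initial concentration must be positive")
    remaining_percent = 100.0 * residual_ppmw / initial_ppmw
    return remaining_percent <= limit_percent


if __name__ == "__main__":
    # Hypothetical measurements, not data from the examples herein.
    print(converted_by_absolute_limit(120.0, limit_ppmw=150.0))
    print(converted_by_relative_limit(40.0, 1000.0, limit_percent=5.0))
```

Note that the absolute definition uses "less than" while the relative definition uses "equal to or less than", which the two functions mirror.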

As set out above, C. basilensis HMF14 was demonstrated to metabolize individual inhibitors in minimal medium. In addition to individual compounds, complex mixtures of toxic inhibitors were also efficiently metabolized, as demonstrated by the detoxification of actual wheat straw hydrolysate.

Treatment of lignocellulosic hydrolysate with C. basilensis HMF14 resulted in a solution of glucose, xylose and arabinose that is essentially free from furan aldehydes, acetate and formate. The unique substrate profile of C. basilensis HMF14 according to the invention makes this bacterium ideally suited for biological detoxification of lignocellulosic hydrolysate.

Accordingly, the present invention further provides for an isolated culture of a Cupriavidus microorganism of Cupriavidus basilensis strain Deposit number DSM 22875 which, when provided with furanic compounds or derivatives thereof, preferably HMF and furfural-derived compounds, as a sole carbon source, grows on said source and expresses several novel enzymes, which act synergetically, comprising a novel 2-furoyl-CoA:acceptor 5-oxidoreductase (hydroxylating) (EC 1.3.99.8), further referred to as Furoyl-CoA dehydrogenase, a novel Furoyl-CoA synthetase, a novel 2-oxoglutaroyl-CoA hydrolase, two novel 2,5-furan-dicarboxylic acid decarboxylases, and a novel HMF/furfural oxidoreductase.

An embodiment of the present invention is a vector incorporating the polynucleotide sequences or nucleic acid constructs set out herein-above, wherein the nucleotide sequences encode one or more of Furoyl-CoA dehydrogenase, Furoyl-CoA synthetase, 2-oxoglutaroyl-CoA hydrolase, 2,5-furan-dicarboxylic acid decarboxylase 1, 2,5-furan-dicarboxylic acid decarboxylase 2 and a HMF/furfural oxidoreductase from Cupriavidus basilensis HMF14 DSM 22875 in a host cell under conditions conducive for their expression.

The invention also relates to a host microorganism transformed or transfected by the isolated DNA or by a vector or plasmid comprising the isolated DNA according to the invention under conditions conducive to express one or more of a Furoyl-CoA dehydrogenase, a Furoyl-CoA synthetase, a 2-oxoglutaroyl-CoA hydrolase, a 2,5-furan-dicarboxylic acid decarboxylase 1, a 2,5-furan-dicarboxylic acid decarboxylase 2 and a HMF/furfural oxidoreductase.

The present invention also provides for a cell extract from a Cupriavidus microorganism or a host cell comprising one or more of a Furoyl-CoA dehydrogenase, a Furoyl-CoA synthetase, a 2-oxoglutaroyl-CoA hydrolase, a 2,5-furan-dicarboxylic acid decarboxylase 1, a 2,5-furan-dicarboxylic acid decarboxylase 2 and a HMF/furfural oxidoreductase.

These enzymes act synergetically when provided with the relevant cofactors. The invention also relates to a composition comprising the cell extract, which preferably is a fractionated soluble cytosolic fraction.

The present invention also relates to a process for the in-situ detoxification of lignocellulose hydrolysate comprising furanic compounds, preferably HMF, HMF alcohol and HMF carboxylic acid, and furfuryl alcohol, furfural and/or furoic acid, with a suitable host microorganism, comprising contacting the lignocellulose hydrolysate with the host microorganism under conditions facilitating the expression of one or more of a Furoyl-CoA dehydrogenase, a Furoyl-CoA synthetase, a 2-oxoglutaroyl-CoA hydrolase, a 2,5-furan-dicarboxylic acid decarboxylase 1, a 2,5-furan-dicarboxylic acid decarboxylase 2 and a HMF/furfural oxidoreductase, as required for the conversion of the furanic compounds, from a microorganism from the family of Burkholderiaceae, preferably Cupriavidus, more preferably Cupriavidus basilensis, again more preferably Cupriavidus basilensis HMF14 according to the invention, to convert the furanic compounds to non-toxic components to obtain a detoxified lignocellulose hydrolysate. The term "toxic" refers to lignocellulose degradation products that are fermentation inhibitors that inhibit the growth of e.g. ethanologenic microorganisms employed in the fermentation step, thereby inhibiting a suitable performance of these organisms and reducing yields. The term "non-toxic" means that the conversion products of the toxic compounds are essentially not inhibiting the growth of microorganisms that ferment the cellulose components.

Preferably, the Cupriavidus microorganism is Cupriavidus basilensis HMF14 according to the invention. More preferably, the host microorganism comprises DNA encoding one or more of the following group: Furoyl-CoA dehydrogenase, Furoyl-CoA synthetase, 2-oxoglutaroyl-CoA hydrolase, 2,5-furan-dicarboxylic acid decarboxylase 1, 2,5-furan-dicarboxylic acid decarboxylase 2 and HMF/furfural oxidoreductase from the Cupriavidus microorganism Cupriavidus basilensis HMF14 according to the invention. The process further preferably comprises subjecting the detoxified lignocellulose hydrolysate to a simultaneous or subsequent fermentation step. It further preferably comprises a step of pre-treating lignocellulose-containing material to obtain a lignocellulose hydrolysate, more preferably under acidic conditions.

A further aspect of the subject invention is a process comprising introducing (a) isolated DNA sequences as set out herein above and/or (b) isolated DNA sequences which are at least sufficiently identical to these DNA sequences to encode polypeptides having the activity of one or more of the Furoyl-CoA dehydrogenase, a Furoyl-CoA synthetase, a 2-oxoglutaroyl-CoA hydrolase, a 2,5-furan-dicarboxylic acid decarboxylase 1, a 2,5-furan-dicarboxylic acid decarboxylase 2 and a HMF/furfural oxidoreductase, as required for the conversion of the furanic compounds, into an appropriate host cell, cultivating the obtained host cell under conditions conducive to the detoxification of lignocellulose hydrolysate, and recovering a detoxified lignocellulose hydrolysate from the culture. The term "detoxification" thus refers to the conversion of toxic compounds to non-toxic compounds.

The present invention also provides for a process for the production of Furoyl-CoA dehydrogenase, Furoyl-CoA synthetase, 2-oxoglutaroyl-CoA hydrolase, 2,5-furan-dicarboxylic acid decarboxylase 1, 2,5-furan-dicarboxylic acid decarboxylase 2 and/or a HMF/furfural oxidoreductase which are at least 45% identical to those expressed by Cupriavidus basilensis HMF14 DSM 22875 or a host microorganism as set out above, comprising

(a) culturing a microorganism in a nutrient medium containing carbon and nitrogen sources and inorganic salts; and

(b) isolating the enzymes produced from the microorganism.

Preferably, in the process according to the invention, furanic compounds such as HMF, furfuryl alcohol, furfural and/or furoic acid are converted to a polyhydroxyalkanoate (PHA), and/or to biofuels such as ethanol. The invention also provides for a process for the in-situ detoxification of lignocellulose hydrolysate with a suitable host microorganism, comprising cultivating the host microorganism in the presence of furfural, furfuryl alcohol, hydroxymethylfurfural and/or furoic acid under conditions facilitating the expression of the enzymes from a Cupriavidus microorganism. Preferably, the host microorganism is the Cupriavidus microorganism Cupriavidus basilensis HMF14 according to the invention. More preferably, the host microorganism comprises DNA encoding an enzyme complex comprising a Furoyl-CoA dehydrogenase, a Furoyl-CoA synthetase, a 2-oxoglutaroyl-CoA hydrolase, a 2,5-furan-dicarboxylic acid decarboxylase 1, a 2,5-furan-dicarboxylic acid decarboxylase 2 and a HMF/furfural oxidoreductase.

A further aspect of the subject invention preferably comprises subjecting the detoxified lignocellulose hydrolysate to a simultaneous or subsequent fermentation step.

Also described is a group of enzymes forming a synergistic enzyme pathway for the degradation of furanic compounds, further referred to as the enzyme complex, which may be isolated from Cupriavidus basilensis HMF14.

Furanic compounds such as HMF, furfural and furoic acid are the primary substrates for this enzyme complex expressed from C. basilensis HMF14, since this organism can grow on any of these compounds as the sole substrate providing carbon and energy.

Also a process for producing these enzymes is provided. Thus, there is provided an improved method for the biocatalytic production of PHA and/or biofuels such as ethanol in a suitable host microorganism of the enzymatic pathway of Cupriavidus basilensis strain DSM 22875, under conditions facilitating the expression of the activity.

This embodiment preferably includes modifications of Cupriavidus basilensis strain DSM 22875 to block conversion of fermentable sugars to compounds along its degradation pathway to compounds.

The present invention further pertains to a process for the conversion of furanic compounds, such as furfuryl alcohol and/or furfural and/or furoic acid, to 2-furoyl CoA, comprising contacting furfuryl alcohol and/or furfural and/or furoic acid with a furoyl-CoA dehydrogenase, furoyl-CoA synthetase, 2-oxoglutaroyl-CoA hydrolase, and 2,5-furan-dicarboxylic acid decarboxylase catalyst in the presence of one or more coenzyme cofactors. Without wishing to be bound to any particular theory, it is believed that the furoyl-CoA synthetase converts the furanic compounds to form furoyl-CoA, while the other enzymes degrade the thus obtained furoyl-CoA. The (bio)catalyst preferably comprises a polypeptide as set out herein before for each of the polypeptide enzymes.

The present invention also pertains to a process for the conversion of furanic compounds, preferably one or more of 5-hydroxymethylfurfural (HMF), 2,5-dihydroxymethyl furan (HMF alcohol), 5-hydroxymethyl-2-furancarboxylic acid (HMF acid) and/or 2,5-furandicarboxylic acid, to 2-furoyl CoA, comprising contacting the furanic compounds with a furoyl-CoA dehydrogenase, furoyl-CoA synthetase, HMF oxidoreductase and the decarboxylases catalyst in the presence of one or more coenzyme cofactors.

Preferably, the coenzyme cofactor is nicotinamide adenine dinucleotide (NAD+) and/or flavin adenine dinucleotide (FAD) and/or pyrroloquinoline quinone (PQQ).

Yet further, its capability of producing PHA may furthermore contribute to cost effectiveness since the biomass generated may be employed for the production of bioplastics.

Similarly, conditions may be suitable to also facilitate the expression of any one or more of the enzymatic activities necessary to convert the furanic compounds to polyhydroxyalkanoate.

The enzyme of the present invention is a multicomponent enzyme that can utilize nicotinamide adenine dinucleotide (NADH) or nicotinamide adenine dinucleotide phosphate (NADPH), requires flavin adenine dinucleotide (FAD), and its activity is stimulated by the presence of iron in a cell-free extract.

Also proposed is the cloning and sequencing of the genes encoding the preferred enzymes, Furoyl-CoA dehydrogenase, Furoyl-CoA synthetase, 2-oxoglutaroyl-CoA hydrolase, 2,5-furan-dicarboxylic acid decarboxylase 1, 2,5-furan-dicarboxylic acid decarboxylase 2 and HMF/furfural oxidoreductase.

Detailed description of the Figures

Figure 1 shows the growth of C. basilensis HMF14 on mineral salts medium with furfural as the sole carbon source. □, furfural; ▲, furfuryl alcohol; Δ, furoic acid; ■, OD600. Cultures were performed in triplicate. Variation between replicates was less than 10%.

Figure 2 illustrates the growth of C. basilensis HMF14 on different concentrations of furfural (A) or HMF (B). The concentrations used were ■, 3 mM; □, 6 mM; ▲, 9 mM; Δ, 12 mM; ·, 15 mM.

Figure 3 depicts the detection of PHA in cultures of C. basilensis HMF14 in minimal medium with 120 mM acetate. Left: phase contrast image. Middle: Fluorescence microscopic image of the same slide stained with Nile Blue A. Right: Overlay of the two previous images.

Figure 4 illustrates the detoxification of lignocellulosic hydrolysate by C. basilensis HMF14. The sugars concentration (▲) is the sum of the concentrations of glucose, xylose and arabinose. The furans concentration (Δ) is the sum of the concentrations of the alcohol, aldehyde and acid derivatives of furfural and HMF. The acids concentration (□) is the sum of the concentrations of acetic and formic acid. Biomass (■) was measured by the optical density at 600 nm.

Figure 5 is a graphical representation of the HMF (A) and furfural (B) metabolic pathway in C. basilensis HMF14. Coloured hexamers and triangles indicate enzymes with the following activities: orange, furfural/HMF oxidoreductase; green and red, 2,5-furan-dicarboxylic acid decarboxylase; blue, 2-furoyl-CoA synthetase; yellow, furoyl-CoA dehydrogenase; purple, 2-oxoglutaryl-CoA hydrolase. Colours correspond to the genes depicted in figure 6A. The black square indicates a lactone hydrolysis which may occur spontaneously, or may be catalyzed by a generic lactone hydrolase. Double-pointed arrows indicate keto-enol tautomerizations. Reactions marked with (*) can be catalysed either by HmfH or by (probably non-specific) dehydrogenases.

Figure 6 shows a schematic representation of the genetic organization of the furfural and HMF metabolic genes in C. basilensis HMF14 (A) and other species (B) that were identified as potential furfural and/or HMF utilizers. Colours correspond to enzyme activities in figure 5. Bold numbers (x/y) below arrows indicate the percentage identity (x) to the corresponding C. basilensis HMF14 protein in a y amino-acid stretch. Orthologous genes were identified by BLASTx homology searches in the non-redundant protein database of the National Center for Biotechnology Information. Hits for the furfural cluster were defined as relevant when orthologues for hmfA, B, C, D and E were present in a single genome, with the hmfA orthologue encoding an enzyme that was at least 50 % identical to HmfA. The same criterion was used to define hmfF and hmfG orthologues, whereas 40 % identity to HmfH was used as the criterion for hmfH orthologues. Numbers in italics indicate genome locus tags of the indicated strain. White arrows depict genes with no metabolic function. C: Overview of growth phenotype of tested strains on mineral salts medium with either furfural or HMF (3 mM) as the sole carbon source. ND: not determined.
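
The relevance criteria stated for Figure 6 can be sketched as a simple filter. This is only an illustration: the function names and the example identity values are hypothetical, and reading "the same criterion" for hmfF and hmfG as a 50 % identity threshold is an assumption.

```python
# Minimal sketch of the orthologue relevance criteria described for Figure 6.
# All identity values used below are made-up illustrations, not data from the text.

HMFA_MIN_IDENTITY = 50.0   # % identity required for the hmfA orthologue
HMFFG_MIN_IDENTITY = 50.0  # assumed: "the same criterion" applied to hmfF and hmfG
HMFH_MIN_IDENTITY = 40.0   # % identity required for hmfH orthologues
FURFURAL_CLUSTER = ("hmfA", "hmfB", "hmfC", "hmfD", "hmfE")


def furfural_cluster_relevant(hits):
    """hits maps gene name -> % identity of the best orthologue found in ONE genome."""
    has_all_genes = all(gene in hits for gene in FURFURAL_CLUSTER)
    return has_all_genes and hits["hmfA"] >= HMFA_MIN_IDENTITY


def hmf_gene_relevant(gene, identity):
    """Apply the per-gene identity threshold for hmfF, hmfG or hmfH."""
    threshold = HMFH_MIN_IDENTITY if gene == "hmfH" else HMFFG_MIN_IDENTITY
    return identity >= threshold


# Hypothetical genome: all five furfural genes present, hmfA orthologue 62 % identical.
example_hits = {"hmfA": 62.0, "hmfB": 48.0, "hmfC": 55.0, "hmfD": 51.0, "hmfE": 47.0}
print(furfural_cluster_relevant(example_hits))
print(hmf_gene_relevant("hmfH", 43.0))
```

Only the hmfA orthologue carries an identity threshold for the furfural cluster in this reading; the other four genes merely have to be present in the same genome.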

The following non-binding experiments illustrate the invention further.

Culture conditions

Cultures were performed either in Luria broth (LB) or in a minimal medium (MM). Enrichment cultures were performed in MM with 20 mM HMF as the carbon source (MMH20), supplemented as indicated with 0.1 g/l yeast extract as vitamin source (MMyH20). Solid media contained 1.5 % agar. Soil and water samples were collected from the botanical garden of the Delft University of Technology and from a peat lake named 'het Kootwijkerveen' near Apeldoorn, The Netherlands. The samples were mixed and approximately 1 g was used to inoculate 50 ml MMH20 or MMyH20 in 500-ml Erlenmeyer flasks that were incubated at 30 °C for two days in a rotary shaker. One-ml samples were transferred twice to fresh medium and incubated until bacterial growth was apparent (OD600 > 1). The final enrichment cultures were streaked onto solid MMH10 and MMyH10 and incubated at 30 °C until colonies appeared.

Wheat straw hydrolysate was produced by dilute-acid hydrolysis (obtained from TNO Quality of Life, Zeist, The Netherlands). The hydrolysate was neutralized by adding 37 mM phosphate buffer (pH 7) and adjusting the pH to 7.0 with 10 M NaOH, resulting in a brown precipitate. After addition of minimal medium components, the solution was centrifuged at 10 000 x g for 5 min. The supernatant was filter sterilized through a sterile PTFE filter with a pore size of 0.22 µm. The resulting medium (MMhyd) was inoculated with an overnight pre-culture of C. basilensis HMF14 in minimal medium with 3 mM HMF, 3 mM furfural and 12 mM sodium succinate.

Bacterial identification

Partial sequence analysis of the 16S rDNA gene was performed for preliminary identification of the bacteria isolated from the enrichment cultures. Total DNA was isolated with a FastDNA kit (QBioGene / MP Biomedicals) and the partial 16S gene was amplified by PCR using primers FD1/2, AGAGTTTGATCMTGGCTCAG, and RP1/2, ACGGYTACCTTGTTACGACTT. PCR products were purified with a Qiaquick PCR purification kit (Qiagen) and sequenced by MWG Biotech AG with the same primers used for amplification.
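
Both primers contain one IUPAC ambiguity code (M in FD1/2, Y in RP1/2), so each denotes a small pool of unambiguous oligonucleotides. The following sketch, which is an illustration and not part of the described method, expands the two primer sequences quoted above; the helper name `expand_degenerate` is hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer):
    """Return every unambiguous sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


FORWARD = "AGAGTTTGATCMTGGCTCAG"   # FD1/2, as quoted in the text
REVERSE = "ACGGYTACCTTGTTACGACTT"  # RP1/2, as quoted in the text

for name, primer in (("FD1/2", FORWARD), ("RP1/2", REVERSE)):
    for sequence in expand_degenerate(primer):
        print(name, sequence)
```

Each primer expands to two sequences, which is consistent with the "1/2" suffix in the primer names.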

In order to isolate the genes involved in the HMF metabolic pathway of C. basilensis HMF14, a transposon mutant library of C. basilensis HMF14 was constructed and screened for clones unable to grow on furfural and/or HMF. Twenty-five transposon mutants were selected from 14 000 clones and the chromosomal DNA flanking the transposon insertion sites was sequenced to identify the interrupted genes. Additional primer walking sequencing of up- and downstream regions of these genes revealed two distinct gene clusters. The first cluster contained five genes, designated hmfABCDE, whereas the other cluster contained four genes: hmfFGH'H. Insertion of a transposon in either of the two clusters corresponded to two distinct phenotypes. If the hmfABCDE cluster was interrupted, no growth occurred on either HMF or furfural, suggesting an at least partly shared degradation pathway for furfural and HMF. An insertion in the hmfFGH'H cluster resulted in loss of growth on HMF only. Mutant phenotypes of transposon mutants and BLASTx analysis (Altschul, S.F. et al. Basic local alignment search tool. J. Mol. Biol. 215, 403-410 (1990)) of the genes comprised in the two clusters are summarized in Table 1.

The resulting sequences were assigned the following functions (Table 1):

Table 1. Growth phenotype of selected C. basilensis HMF14 transposon mutants, and BLASTx analysis and assigned function of genes involved in furfural and HMF degradation.

Gene | Growth phenotype of transposon mutant (MM + citrate; MM + furfural; MM + HMF) | Best BLASTx hit (Acc. No) | Assigned function

hmfA (SEQ ID NO: 36) | | Aerobic-type carbon monoxide dehydrogenase homolog, subunits L and G (YP_726196) | Furoyl-CoA dehydrogenase large subunit

hmfB (SEQ ID NO: 38) | | Carbon-monoxide dehydrogenase (YP_293089) | Furoyl-CoA dehydrogenase FAD binding subunit

hmfC (SEQ ID NO: 40) | | Aerobic-type carbon monoxide dehydrogenase 2Fe-2S iron-sulfur subunit (YP_726194) | Furoyl-CoA dehydrogenase 2Fe-2S iron sulfur subunit

hmfD (SEQ ID NO: 42) | | Acyl-CoA synthetase (YP_726193) | Furoyl-CoA synthetase

hmfE (SEQ ID NO: 44) | | enoyl-CoA hydratase/isomerase (YP_293086) | 2-oxoglutaroyl-CoA hydrolase

hmfF (SEQ ID NO: 20) | | UbiD family decarboxylase (YP_001895811) | 2,5-furan-dicarboxylic acid decarboxylase 1

hmfG (SEQ ID NO: 22) | | 3-octaprenyl-4-hydroxybenzoate carboxy-lyase (ZP_02881560) | 2,5-furan-dicarboxylic acid decarboxylase 2

hmfH' (SEQ ID NO: 24) | | hypothetical protein (YP_293096) | NA

hmfH (SEQ ID NO: 26) | | glucose-methanol-choline oxidoreductase (YP_001895804) | HMF/furfural oxidoreductase

hmfR1 (SEQ ID NO: 18) | | LysR family transcriptional regulator (YP_001862747.1) | Putative LysR type transcriptional regulator

hmfR2 (SEQ ID NO: 34) | | LysR family transcriptional regulator (YP_293091.1) | Putative LysR type transcriptional regulator

a. The mutant phenotype was not determined since no transposon mutant was available. NA: no assigned function.

As such, a suitable host organism can preferably be transformed or transfected with DNA encoding one or more of the above sequences according to the invention.

Elucidation of the furfural catabolic pathway of C. basilensis HMF14

The enzyme functions encoded by the hmfABCDE cluster of C. basilensis HMF14 were in good agreement with the enzyme activities that were reported to constitute the furoic acid degradation pathway of Pseudomonas putida strains F2 and Fu1 (see Fig. 6). The first step of this proposed pathway involves an acyl-CoA synthetase to produce 2-furoyl-CoA from 2-furoic acid, an activity that matches the function of HmfD. This was supported by the accumulation of 2-furoic acid in hmfD-disrupted transposon mutants of C. basilensis HMF14 when cultured in the presence of furfuryl alcohol or furfural. Furthermore, it was established that 2-furoic acid is the substrate for ATP-dependent CoA ligation by HmfD. This activity was present in cell extracts of wildtype C. basilensis HMF14 and P. putida S12 expressing HmfD, whereas it was absent in C. basilensis HMF14 transposon mutants in which hmfD was disrupted.

In P. putida F2 and Fu1, 2-furoyl-CoA was converted into 5-hydroxy-2-furoyl-CoA by a molybdenum-dependent 2-furoyl-CoA dehydrogenase. The proteins encoded by hmfABC in C. basilensis HMF14 correspond to the three subunits that constitute a bacterial Mo-dependent dehydrogenase. The functionality of hmfABC was confirmed by demonstrating furoic-acid dependent Nitro Blue Tetrazolium (NBT) reducing activity in cell extracts of P. putida S12 co-expressing HmfABC and HmfD. The latter activity was required to generate 2-furoyl-CoA from 2-furoic acid as the substrate for HmfABC.

The HMF degradation route of C. basilensis HMF14 was reconstructed based on putative gene functions of the hmfFGH'H cluster. The hmfFG genes encode two putative decarboxylases of the UbiD/UbiX type which commonly operate concertedly (20, 21). C. basilensis HMF14 mutants with disrupted hmfFG genes accumulated HMF acid and 2,5-furan-dicarboxylic acid (FDCA) when cultured in the presence of HMF, which suggested that these carboxylic acids were the substrate for HmfFG. Cell extracts of both wildtype C. basilensis HMF14 and P. putida S12 expressing HmfFG formed 2-furoic acid when incubated with FDCA. HMF acid was not decarboxylated to furfuryl alcohol, demonstrating that FDCA was the actual substrate for HmfFG. Thus, HMF degradation in C. basilensis HMF14 proceeds obligately via its dicarboxylic acid form. No decarboxylase activity was observed in cell extract of P. putida S12 expressing HmfG only. When HmfF was expressed as a single enzyme, only slight decarboxylase activity was observed, demonstrating that both proteins are required for optimal FDCA decarboxylase activity.

The hmfH gene encodes a putative FAD-dependent oxidoreductase. C. basilensis HMF14 mutants with a disrupted hmfH gene accumulated HMF acid when cultured in the presence of HMF. Cell extracts of both wildtype C. basilensis HMF14 and P. putida S12 expressing HmfH formed FDCA when incubated with HMF acid, confirming that HmfH catalyzes the oxidation of the HMF-monocarboxylic acid to the dicarboxylic acid form. No FDCA was formed when oxygen was removed, demonstrating that HmfH is a true oxidase.

The hmfH' gene encodes a hypothetical protein with 49 % identity over a stretch of 296 amino acids to a probable extra-cytoplasmic solute receptor of Ralstonia eutropha H16. This gene may play a role in HMF transport, but a metabolic function was considered unlikely (22).

Analogous to the furfural pathway, no specific genes were identified for the oxidations in the upper HMF metabolic pathway leading from HMF-alcohol to HMF and HMF-acid. These oxidations were also concluded to be performed by non-specific, redundant dehydrogenases, whose activities were observed both in C. basilensis HMF14 and P. putida S12 (Table 2). However, it was observed that HmfH could also oxidize HMF and furfural to the corresponding acids, and furfuryl alcohol to furfural. Apparently, this oxidase is essential for the formation of FDCA from HMF-acid but also provides an oxidase alternative to the non-specific alcohol and aldehyde dehydrogenases that constitute the upper metabolic pathways for HMF and furfural.

Based on the above observations, the pathway depicted in Fig. 5 was constructed for HMF catabolism. First, HMF is oxidized to HMF acid, either by non-specific dehydrogenases or by HmfH. Subsequently, HMF acid is oxidized to FDCA, for which conversion HmfH is essential. The HMF and the furfural catabolic pathways converge at the level of 2-furoic acid upon decarboxylation of FDCA by HmfFG.
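
The conversions described in this section can be written down as a small directed graph, which makes the convergence of the two routes at 2-furoic acid easy to check. The sketch below is a simplified illustration: the dictionary layout and the `route` helper are hypothetical, and the step from 5-hydroxy-2-furoyl-CoA to 2-oxoglutaroyl-CoA is collapsed into a single edge as an assumption (the tautomerizations and lactone hydrolysis shown in Fig. 5 are not modelled individually).

```python
from collections import deque

# substrate -> list of (product, catalyst), following the textual pathway description.
PATHWAY = {
    "HMF alcohol": [("HMF", "non-specific dehydrogenases")],
    "HMF": [("HMF acid", "HmfH or non-specific dehydrogenases")],
    "HMF acid": [("FDCA", "HmfH")],
    "FDCA": [("2-furoic acid", "HmfFG")],
    "furfuryl alcohol": [("furfural", "HmfH or non-specific dehydrogenases")],
    "furfural": [("2-furoic acid", "HmfH or non-specific dehydrogenases")],
    "2-furoic acid": [("2-furoyl-CoA", "HmfD")],
    "2-furoyl-CoA": [("5-hydroxy-2-furoyl-CoA", "HmfABC")],
    # Assumption: intermediate steps collapsed into one edge for illustration.
    "5-hydroxy-2-furoyl-CoA": [("2-oxoglutaroyl-CoA", "tautomerization / lactone hydrolysis")],
    "2-oxoglutaroyl-CoA": [("2-oxoglutarate", "HmfE")],
}


def route(start, goal):
    """Breadth-first search; returns [(compound, catalyst forming it), ...] or None."""
    queue = deque([[(start, None)]])
    seen = {start}
    while queue:
        path = queue.popleft()
        compound = path[-1][0]
        if compound == goal:
            return path
        for product, catalyst in PATHWAY.get(compound, []):
            if product not in seen:
                seen.add(product)
                queue.append(path + [(product, catalyst)])
    return None


for compound, catalyst in route("HMF", "2-oxoglutarate"):
    print(compound, "<-", catalyst)
```

In this representation there is no route from furfural to FDCA, which mirrors the observation that disruption of the hmfFGH'H cluster abolishes growth on HMF only.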

Analytical methods

Bacterial growth was determined by measuring optical density at 600 nm (OD600) using a Biowave Cell Density Meter (WPA Ltd) or a µQuant MQX200 universal microplate spectrophotometer (Bio-tek), using flat-bottom 96-well microplates (Greiner). Furan derivatives were analyzed on an Agilent 1100 system equipped with a diode array detector set at 230 nm. The column used was a Zorbax Eclipse XDB-C8 (length, 150 mm; internal diameter, 4.6 mm; particle size, 5 µm; Agilent) operated at 25 °C. As eluent, a gradient of acetonitrile in 20 mM KH2PO4 (pH 2) with 1% acetonitrile was used at a flow of 1.2 ml/min, increasing from 0 to 5% in 3.5 min and from 5 to 40% in 2.5 min, set as smooth gradients.
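
For orientation, the two gradient segments can be written as a piecewise function of run time. This is a sketch under stated assumptions only: the segments are taken to be linear, to start at t = 0 and to follow each other directly, and the final composition is held afterwards; the text says only that they were "set as smooth gradients". The function and constant names are hypothetical.

```python
# (time in min, % of the acetonitrile gradient) breakpoints taken from the text:
# 0 -> 5 % in 3.5 min, then 5 -> 40 % in a further 2.5 min.
GRADIENT = [(0.0, 0.0), (3.5, 5.0), (6.0, 40.0)]
FLOW_ML_PER_MIN = 1.2


def acetonitrile_percent(t_min):
    """Gradient percentage at time t_min, assuming linear segments (an assumption)."""
    if t_min <= GRADIENT[0][0]:
        return GRADIENT[0][1]
    for (t0, p0), (t1, p1) in zip(GRADIENT, GRADIENT[1:]):
        if t_min <= t1:
            return p0 + (p1 - p0) * (t_min - t0) / (t1 - t0)
    return GRADIENT[-1][1]  # hold the final composition (assumption)


def eluent_volume_ml(t_min):
    """Total eluent volume pumped after t_min at the stated flow rate."""
    return FLOW_ML_PER_MIN * t_min


for t in (0.0, 1.75, 3.5, 4.75, 6.0):
    print(t, acetonitrile_percent(t), eluent_volume_ml(t))
```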

Glucose, xylose and arabinose were analyzed by ion chromatography (Dionex ICS3000 system), using a CarboPac PA20 column (length, 150 mm; internal diameter, 3 mm) with 10 mM NaOH at a flow rate of 0.5 ml/min as the eluent.

For production of PHA, C. basilensis HMF14 was cultured in minimal medium with 120 mM acetate as a carbon source and 6 mM (NH4)2SO4 as a nitrogen source. PHA was visualized by fluorescence microscopy using Nile Blue A staining, basically as described by Johnson et al. (8).

Chemicals

The analytical standard of 2,5-furandicarboxylic acid was purchased from Immunosource B.V. (Halle-Zoersel, Belgium). 5-Hydroxymethyl-furoic acid (HMF acid) was purchased from Matrix Scientific (Columbia SC, United States). This compound was found to be highly esterified. Therefore, immediately prior to use, a 10 mM solution of the esterified HMF acid was boiled for two hours in 2 M H2SO4, cooled, and adjusted to pH 7.0 with NaOH after addition of 50 mM of phosphate buffer (pH 7). All other chemicals were purchased from Sigma-Aldrich Chemie B.V. (Zwijndrecht, The Netherlands). 5-Hydroxy-2-methylfurfuryl alcohol was identified based on its UV-VIS spectrum (3).

Enrichment and characterization of HMF degrading bacteria

Enrichment cultures on minimal medium with HMF as the sole carbon source were inoculated with soil and water samples. After 2 transfers into fresh medium, the cultures were plated on solid HMF medium to isolate individual bacteria capable of degrading HMF. Fourteen individual colonies were selected and initial identification was performed by partial 16S rDNA sequencing. Only one isolate (CB HMF14) was incapable of utilizing glucose. In addition, HMF14 was easily culturable.

Cupriavidus sp. HMF14 (according to the invention)

Cupriavidus sp. HMF14 was able to grow on gluconate, succinate, citrate, acetate, benzene, toluene and phenol. No growth was observed on glucose, xylose, arabinose and mannose. Cells were short rods, either single, in pairs or in short chains. On LB agar plates round colonies were formed that had a mucous appearance, and formation of mucus was also observed in liquid cultures. Strain HMF14 could be cultured at temperatures up to 41 °C and did not show anaerobic nitrate respiration. The phenotypic characteristics of Cupriavidus sp. HMF14 best match the type species of Cupriavidus basilensis (DSMZ 11853T). Therefore, the strain was designated Cupriavidus basilensis HMF14.

The genus Cupriavidus is well known for its ability to efficiently produce PHA (24, 31). In order to verify PHA production by the newly isolated C. basilensis HMF14, this strain was cultivated in minimal medium with acetate as a carbon source until an OD600 of 2.4 was reached. Fluorescence microscopic analysis showed PHA granules within the cells of C. basilensis (Figure 3).

Degradation of furan derivatives by C. basilensis HMF14

In addition to HMF, other furan derivatives are present in lignocellulosic hydrolysates. In order to determine whether C. basilensis HMF14 was capable of utilizing furan derivatives other than HMF, growth was assessed on minimal medium with 3.5 mM HMF, furfural, furfuryl alcohol or furoic acid as sole carbon source. Growth was observed on all tested furan derivatives, with slightly different growth characteristics (Table 2). In cultures on furfural, furfural was initially rapidly converted to furfuryl alcohol, while a small amount of furoic acid was also formed (Fig. 1). Without wishing to be bound to any particular theory, conversion of furfural to its alcohol and/or acid form appears to be a common mechanism of furfural detoxification. At the onset of logarithmic growth, furfuryl alcohol production decreased in favour of biomass formation, which likely occurs via furoic acid. Similarly, HMF acid and HMF alcohol were formed in cultures with HMF as the carbon source. In addition, trace amounts of 2,5-furandicarboxylic acid and furoic acid were found in the HMF cultures (not shown).

Table 2. Growth characteristics of C. basilensis HMF14 on furan derivatives.

a This culture did not reach stable exponential phase, since the growth rate increased continuously.

C. basilensis HMF14 grew in the presence of 5 mM of furfural or HMF (0.48 g/l and 0.63 g/l, respectively). However, since the concentration of these toxic compounds is often higher in lignocellulosic hydrolysates, with values ranging from 0 to 3.5 g/l for furfural and from 0 to 5.9 g/l for HMF, the tolerance of C. basilensis HMF14 towards furfural and HMF was determined in shake-flask cultures with 3-15 mM furfural or HMF (Fig. 2, Table 2). The lag phase was found to increase with increasing concentrations of HMF or furfural (Fig. 2), likely as a result of substrate toxicity. Nevertheless, after 24 h of cultivation growth was observed at all concentrations tested (not shown). No stable exponential phase was reached at furfural concentrations above 6 mM, but increased apparent growth rates were found at higher concentrations of furfural (results not shown). In the HMF cultures, too, no stable exponential growth was observed, but the apparent growth rate decreased rather than increased with increasing HMF concentrations.
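
The paragraph above mixes molar and mass concentrations; the conversion between the two is sketched below for convenience. The molar masses (furfural, C5H4O2, 96.08 g/mol; HMF, C6H6O3, 126.11 g/mol) are standard values not stated in the text, and the function names are hypothetical.

```python
# Standard molar masses in g/mol (not given in the text; supplied as an assumption).
MOLAR_MASS_G_PER_MOL = {"furfural": 96.08, "HMF": 126.11}


def mm_to_g_per_l(compound, millimolar):
    """Convert a concentration in mM to g/l."""
    return millimolar / 1000.0 * MOLAR_MASS_G_PER_MOL[compound]


def g_per_l_to_mm(compound, grams_per_litre):
    """Convert a concentration in g/l to mM."""
    return grams_per_litre / MOLAR_MASS_G_PER_MOL[compound] * 1000.0


# 5 mM of each compound, as quoted in the text.
print(round(mm_to_g_per_l("furfural", 5), 2), round(mm_to_g_per_l("HMF", 5), 2))
# Upper ends of the hydrolysate ranges quoted in the text, expressed in mM.
print(round(g_per_l_to_mm("furfural", 3.5), 1), round(g_per_l_to_mm("HMF", 5.9), 1))
```

With these molar masses, 5 mM reproduces the quoted 0.48 g/l and 0.63 g/l, and the upper ends of the quoted hydrolysate ranges correspond to roughly 36 mM furfural and 47 mM HMF.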

Detoxification of lignocellulosic hydrolysate by C. basilensis HMF14

In addition to the furan derivatives, lignocellulosic hydrolysate also contains many other components that can inhibit fermentative production of biochemicals. While C. basilensis HMF14 is unable to degrade the sugars present in lignocellulosic hydrolysate, it preferably degrades many of the other toxic constituents, as illustrated in shake-flask cultures with each compound as a single carbon source (Table 3).

Table 3. Degradation of constituents of lignocellulosic hydrolysate by C. basilensis HMF14.

Compound | Substrate utilization by C. basilensis HMF14 (a)

Sugars
Glucose | N
Xylose | N
Arabinose | N
Mannose | N

Furans
Furfural | Y
Furfuryl alcohol | Y
Hydroxymethylfurfural | Y
Furoic acid | Y

Organic acids
Acetic acid | Y
Formic acid | Y (b)
Levulinic acid | Y
Ferulic acid | Y

Aromatics
4-hydroxybenzoic acid | Y
Vanillic acid | Y
Syringic acid | N
Phenol | Y
4-hydroxybenzaldehyde | Y
4-hydroxybenzyl alcohol | Y
Guaiacol | Y
Vanillin | Y
Vanillyl alcohol | Y
Syringol | N
Syringaldehyde | N

a. Y, Yes; N, No.

b. Formic acid was only co-utilized with a different carbon source.

This unique substrate specificity makes C. basilensis HMF14 ideally suited for the biological detoxification of lignocellulosic hydrolysate.

Heterologous expression of the furfural and HMF degradation pathways in P. putida S12

The functional characterization of the furfural and HMF catabolic genes of C. basilensis HMF14 enabled a reconstruction of the complete catabolic pathway for these furanic compounds. For a final verification of the functionality of the reconstructed pathway, the encoding genes were expressed in a heterologous host, P. putida S12.

First, the furfural cluster hmfABCDE was introduced into P. putida S12. As expected, the resulting strain, P. putida S12 pJT' hmfABCDE, was able to utilize furoic acid, furfural and furfuryl alcohol as sole carbon sources, although growth was initially poor. Therefore, strain S12 pJT' hmfABCDE was repeatedly transferred to fresh mineral salts medium with furfural as the sole carbon source. After 10 serial transfers, P. putida strain S12_fur was obtained, which showed a reproducible growth rate of 0.30 h-1 on furfural as a sole carbon source with a biomass yield of 51 % (C-mol biomass / C-mol substrate). P. putida S12 strains expressing only HmfABCD or HmfABC were also constructed, but these strains failed to grow on furoic acid. These results confirmed that all genes required for furfural metabolism are located in the furfural cluster hmfABCDE and that all genes in this cluster are essential for furfural metabolism, including the hmfE-encoded CoA thioester hydrolase.

Subsequently, the hmfFGH genes were cloned into P. putida S12_fur. The resulting strain, P. putida S12_HMF, was able to utilize either furfural or HMF as the sole carbon source, at a growth rate of 0.23 h-1 and a yield of 40 % (C-mol biomass / C-mol substrate). Gene hmfH' was apparently dispensable for growth on HMF, confirming that the encoded protein had no function in HMF metabolism. Nor was the gene essential for HMF transport in P. putida S12. Thus, all genes required for the utilization of HMF were also characterized, and their functionality was reconfirmed by functional expression in a heterologous host.
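
To put the reported specific growth rates in more familiar terms, they can be converted to doubling times with t_d = ln(2) / µ. The sketch below assumes balanced exponential growth, which the text does not state explicitly; the function name is hypothetical.

```python
import math


def doubling_time_h(mu_per_h):
    """Doubling time in hours for a specific growth rate mu (h^-1),
    assuming balanced exponential growth."""
    return math.log(2) / mu_per_h


# Growth rates quoted in the text for the engineered P. putida strains.
for strain, mu in (("S12_fur on furfural", 0.30), ("S12_HMF", 0.23)):
    print(strain, round(doubling_time_h(mu), 2), "h")
```

Under that assumption, 0.30 h-1 corresponds to a doubling time of about 2.3 h and 0.23 h-1 to about 3.0 h.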