Scientific report: Structure and function of KH domains


REVIEW ARTICLE

Structure and function of KH domains

Roberto Valverde 1, Laura Edwards 2 and Lynne Regan 1,3

1 Department of Molecular Biophysics & Biochemistry, Yale University, New Haven, CT, USA
2 Department of Molecular and Cellular Developmental Biology, Yale University, New Haven, CT, USA
3 Department of Chemistry, Yale University, New Haven, CT, USA

Keywords
fragile X mental retardation; interaction motif; KH domains; K homology domain; noncrystallographic symmetry; protein motif; RNA-binding; RNA-binding protein; RNA-recognition; solvent accessibility

Correspondence
L. Regan, Yale University, 266 Whitney Avenue, New Haven, CT 06520, USA
Fax: +1 203 432 3104
Tel: +1 203 432 9843
E-mail: lynne.regan@yale.edu

(Received 3 January 2008, revised 18 February 2008, accepted 14 March 2008)

doi:10.1111/j.1742-4658.2008.06411.x

FEBS Journal 275 (2008) 2712–2726 © 2008 The Authors Journal compilation © 2008 FEBS

The hnRNP K homology (KH) domain was first identified in the protein human heterogeneous nuclear ribonucleoprotein K (hnRNP K) 14 years ago. Since then, KH domains have been identified as nucleic acid recognition motifs in proteins that perform a wide range of cellular functions. KH domains bind RNA or ssDNA, and are found in proteins associated with transcriptional and translational regulation, along with other cellular processes. Several diseases, e.g. fragile X mental retardation syndrome and paraneoplastic disease, are associated with the loss of function of a particular KH domain. Here we discuss the progress made towards understanding both general and specific features of the molecular recognition of nucleic acids by KH domains. The typical binding surface of KH domains is a cleft that is versatile but that can typically accommodate only four unpaired bases. Van der Waals forces and hydrophobic interactions and, to a lesser extent, electrostatic interactions, contribute to the nucleic acid binding affinity. ‘Augmented’ KH domains or multiple copies of KH domains within a protein are two strategies that are used to achieve greater affinity and specificity of nucleic acid binding. Isolated KH domains have been seen to crystallize as monomers, dimers and tetramers, but no published data support the formation of noncovalent higher-order oligomers by KH domains in solution. Much attention has been given in the literature to a conserved hydrophobic residue (typically Ile or Leu) that is present in most KH domains. The interest derives from the observation that an individual with this Ile mutated to Asn, in the KH2 domain of fragile X mental retardation protein, exhibits a particularly severe form of the syndrome. The structural effects of this mutation in the fragile X mental retardation protein KH2 domain have recently been reported. We discuss the use of analogous point mutations at this position in other KH domains to dissect both structure and function.

Abbreviations
BPS, branchpoint sequence; dFXRP, Drosophila fragile X-related protein; FBP, FUSE-binding protein; FMRP, fragile X mental retardation protein; FUSE, ssDNA far-upstream element; FXRP, fragile X-related protein; hFMRP, human fragile X mental retardation protein; hnRNP K, human heterogeneous nuclear ribonucleoprotein K; KH, hnRNP K homology; KSRP, K homology splicing regulator protein; NCS, noncrystallographic symmetry; PCBP, poly(C)-binding protein; PSI, P-element somatic inhibitor protein; SF1, splicing factor 1; Y2H, yeast two-hybrid.

Introduction

The hnRNP K homology (KH) domain was named for the human heterogeneous nuclear ribonucleoprotein K (hnRNP K), the first protein in which the motif was identified [1]. The KH motif consists of approximately 70 amino acids, and is found in a diverse variety of proteins in archaea, bacteria and eukaryota [1,2].
Typically, KH domains are found in multiple copies: two in fragile X mental retardation protein (FMRP) [3–5], three in hnRNP K [1,6], and 14 in vigilin [7,8]. There are, however, a few examples of proteins with single KH motifs; Mer1p [1,9] and Sam68 [10] each have just one. The typical function of KH domains, whether they are present in single or multiple copies, is RNA or ssDNA recognition. When present in a protein in multiple copies, KH domains can function independently or cooperatively. In ssDNA far-upstream element (FUSE)-binding protein (FBP), for example, the KH3 and KH4 domains are separated by a flexible Gly linker with no interdomain contacts [11]. Each KH domain binds to a segment of ssDNA, with a linker of noncontacted ssDNA between [12]. By contrast, the two KH domains of NusA have an extensive interdomain contact area, and bind an extended segment of RNA that runs across both domains [13–15].

KH modules are found in many different proteins, which are involved in a myriad of biological processes, including splicing, transcriptional regulation, and translational control.

Two folds, one motif

Grishin pointed out that there are actually two different versions of the KH motif, which he named type I and type II KH folds (Fig. 1) [2]. The type I fold is typically found in eukaryotic proteins, whereas the type II fold is typically found in prokaryotic proteins. Although type I and type II folds both share a ‘minimal KH motif’ in the linear sequence, the three-dimensional arrangement of the secondary structural elements is different. In the type I fold, a β-sheet composed of three antiparallel β-strands is abutted by three α-helices (α1, α2, and α′). The β-sheet in type I KH domains consists of three β-strands in the order β1, β′ and β2. The β1-strand and β2-strand are parallel to each other, and the β′-strand is antiparallel to both (Fig. 1).
This all-antiparallel arrangement of strands distinguishes the type I KH fold from the type II KH fold, in which the β1-strand and β2-strand are adjacent and parallel to each other, and the β′-strand is adjacent and antiparallel to the β1-strand (Fig. 1). The length and sequence of the variable loop are different in different KH domains, be they type I or type II (the variable loop is shown as a dotted line in Fig. 1). Variable loop lengths from three to over 60 residues are known. All typical KH domains have a GXXG loop (shown in white in Fig. 1) [2], although this is sometimes altered or interrupted in divergent KH domains [16].

Not only is the order of secondary structural elements in individual eukaryotic type I KH domains different from that in prokaryotic type II KH domains, but the relative orientation of tandem type I versus type II KH domains is also quite different. The comparison is limited, however, because the structure of only one of each type of tandem KH domain has been published. Here we compare the structures of the tandem KH1–KH2 domains from protein NusA (Protein Data Bank entry 2ASB) [14,15] and from human FMRP (hFMRP) (Protein Data Bank entry 2QND) [17] as examples of tandem prokaryotic (type II) KH domains and tandem eukaryotic (type I) KH domains, respectively. In NusA, an unstructured six amino acid linker connects KH1 to KH2, and an area of approximately 1380 Å² is buried at the interface between the β-sheet of KH1 and the α-helices (α′ and α2) of KH2 (Fig. 2B). By contrast, in hFMRP(KH1–KH2Δ), the α′-helix of KH1 is linked to the β1-strand of KH2 by a single residue, Glu280, which adopts non-β non-α phi/psi angles to accomplish this tight connection. The connection contains minimal interdomain contacts between aliphatic residues from the α1-helix of KH1 and the β-sheet of KH2 [17,18].

Fig. 1. Type I and type II KH domain folds. Stylized representations of (A) the type I KH domain (eukaryotic) and (B) the type II KH domain (prokaryotic). The labeling of secondary structure elements is according to standard KH nomenclature [2]. The dotted line connecting the β2-strand and β′-strand represents the variable loop. The white line connecting the α1-helix and the α2-helix represents the GXXG loop.

Evolutionary relationships between KH domains

Type I domains are found in multiple copies in eukaryotic proteins, whereas type II KH domains are typically found as single copies in prokaryotic proteins. Here, therefore, we discuss eukaryotic proteins. Within a family of KH proteins with multiple KH domains (i.e. type I KH domains), the KH1 domain is always more similar to other KH1 domains in different proteins than to the KH2 and KH3 domains in the same protein. Similar relationships are seen for KH2 and KH3 domains: they are more similar to other KH2 and KH3 domains, respectively, than they are to each other or to KH1 domains (Fig. 3). This relationship holds true in all families and between species, from those within which the like-pairs of domains have very high identity [over 95% in the Nova and poly(C)-binding protein families], to those within which like-pairs of domains have much lower identity (around 50% in the FXR family; Fig. 3).

From this observation, a number of hypotheses about the origin and evolution of the KH domains may be proposed. If multiple KH domains arose as a result of a gene duplication event, the results cited above suggest that duplication occurred before the divergent evolution of the members of each protein family. Alternatively, one could speculate that the interdomain identities are a result of convergent evolution of different domains in a parent protein, before subsequent evolutionary divergence produced different members of the family.
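The same-position versus different-position comparison can be checked directly against the Nova block of Fig. 3. The Python sketch below is illustrative only: the dictionary holds the percent identities transcribed from that block, and the helper names (`kh_position`, `split_by_position`) are hypothetical, not part of the analysis reported in this review.

```python
# Percent sequence identities transcribed from the Nova block of Fig. 3.
# Keys are (row domain, column domain). Helper names are illustrative only.
NOVA_IDENTITY = {
    ("NOVA-1 KH2", "NOVA-1 KH1"): 35.3,
    ("NOVA-1 KH3", "NOVA-1 KH1"): 40.3,
    ("NOVA-1 KH3", "NOVA-1 KH2"): 37.3,
    ("NOVA-2 KH1", "NOVA-1 KH1"): 95.5,
    ("NOVA-2 KH1", "NOVA-1 KH2"): 36.8,
    ("NOVA-2 KH1", "NOVA-1 KH3"): 36.8,
    ("NOVA-2 KH2", "NOVA-1 KH1"): 32.4,
    ("NOVA-2 KH2", "NOVA-1 KH2"): 86.3,
    ("NOVA-2 KH2", "NOVA-1 KH3"): 34.3,
    ("NOVA-2 KH2", "NOVA-2 KH1"): 34.3,
    ("NOVA-2 KH3", "NOVA-1 KH1"): 38.8,
    ("NOVA-2 KH3", "NOVA-1 KH2"): 35.8,
    ("NOVA-2 KH3", "NOVA-1 KH3"): 90.9,
    ("NOVA-2 KH3", "NOVA-2 KH1"): 37.3,
    ("NOVA-2 KH3", "NOVA-2 KH2"): 35.4,
}


def kh_position(domain_name):
    """Return the positional label ('KH1', 'KH2', ...) of a domain name."""
    return domain_name.split()[-1]


def split_by_position(identities):
    """Split identities into same-position and different-position pairs."""
    same, different = [], []
    for (row, col), pct in identities.items():
        if kh_position(row) == kh_position(col):
            same.append(pct)
        else:
            different.append(pct)
    return same, different


same, different = split_by_position(NOVA_IDENTITY)
print("lowest same-position identity:", min(same))
print("highest different-position identity:", max(different))
```

For the Nova values, the lowest same-position identity (86.3%) is far above the highest different-position identity (40.3%), which is the pattern the text describes.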
Nucleic acid binding by KH domains – general features

The structures of KH domains in complex with their cognate nucleic acid ligand are mostly of type I domains from eukaryotic proteins, which function in transcriptional and translational regulation. The only structures of type II KH domains in complex with nucleic acid ligand are of the bacterial protein NusA [15] (Protein Data Bank entries 2ATW and 2ASB).

Although the total number of structures in the Protein Data Bank of KH domains bound to cognate nucleic acid ligand is small, some common features of nucleic acid recognition emerge among them. The RNA or DNA is bound in an extended, single-stranded conformation across one face of the KH domain, between the α1-helix, the α2-helix and the GXXG loop on the ‘left’, and the β2-strand and the variable loop on the ‘right’ (Fig. 4A). Together, these secondary structural elements form a binding cleft that accommodates four bases. Note that the secondary structure elements that shape the binding cleft comprise, in part, the core motif found in type I and type II domains. The variable loop in type II KH domains, however, is located at the bottom of the binding cleft (Fig. 4A). The center of the binding pocket tends to be hydrophobic, with a variety of additional specific interactions stabilizing the complex.

Nucleic acid base-to-protein aromatic side-chain stacking interactions, which are prevalent in other types of single-stranded nucleic acid binding motifs [19,20], are notably absent in KH domain nucleic acid recognition. In some complexes, the bases in the ssDNA or RNA bound by the KH domain stack with each other (Fig. 4B), whereas in other examples there is no base stacking.

An adenine–backbone interaction is a feature seen in some KH domain–nucleic acid structures (Fig. 4C). Examples are A42–G43–A44–A45 in NusA KH1, C48–A49–A50–U51 in NusA KH2 [15], U12–C13–A14–C15 in Nova-2 KH3 [21], and U6–A7–A8–C9 in splicing factor 1 (SF1) [22].
The adenine bases hydrogen bond to the protein backbone, mimicking a Watson–Crick base pairing pattern. Superimposing the NusA KH1 domain and ribonucleotides 42–46 on the NusA KH2 domain and ribonucleotides 48–53 reveals that the adenine bases of A44 and A50 make exactly equivalent hydrogen bonds to the protein backbone [15].

Fig. 2. The orientation of individual KH domains in tandem type I and type II arrays. Schematics are based on the crystal structures of the KH1–KH2 domains of NusA (type II) (Protein Data Bank entry 1KOR) and fragile X mental retardation protein [type I (B)] (Protein Data Bank entry 2QND). Each domain is represented as an oval with the β-sheet side colored solid black and the abutting α-helices striped.

KH domains bind ssDNA and RNA with low micromolar affinity. For example, the Kd values of the SF1 KH domain–RNA complex and the hnRNP K KH3 domain–DNA complex are 3 µM and 1 µM, respectively [22,23]. The clustering of KH domains increases nucleic acid recognition and specificity [24]; the four tandem KH domains of P-element somatic inhibitor protein (PSI), for example, bind ligand cooperatively [25]. The KH1–KH2 domains of NusA (Protein Data Bank entries 2ATW and 2ASB) form an uninterrupted recognition surface that binds RNA with nanomolar affinity [15]. Together, the third and fourth KH domains of the K homology splicing regulator protein (KSRP) bind RNA ligand more tightly than each does separately [26].

Finally, where the structures of both the KH–nucleic acid complex and the free KH domain have been determined, ligand binding produces little or no structural change in the protein, as determined by our analysis [27] and as concluded in [15,21,28].
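To give a feel for what these affinities mean, the sketch below evaluates the fraction of a KH domain occupied by ligand under a simple one-site equilibrium model, fraction bound = [L] / (Kd + [L]). The model itself, the assumption that free ligand is in excess over protein, the chosen ligand concentrations, and the function name `fraction_bound` are all illustrative assumptions rather than anything taken from the cited studies. Only the two Kd values (3 µM and 1 µM) come from the text.

```python
def fraction_bound(ligand_um, kd_um):
    """Fraction of protein bound in a one-site model (assumes ligand in excess).

    Both arguments are concentrations in micromolar.
    """
    if ligand_um < 0 or kd_um <= 0:
        raise ValueError("ligand must be >= 0 and Kd must be > 0")
    return ligand_um / (kd_um + ligand_um)


# Kd values quoted in the text (micromolar); the labels are shorthand.
KD_UM = {"SF1 KH": 3.0, "hnRNP K KH3": 1.0}

# Hypothetical free-ligand concentrations chosen only for illustration.
for name, kd in KD_UM.items():
    for ligand in (0.1, 1.0, 10.0):
        occupancy = fraction_bound(ligand, kd)
        print(f"{name}: [L] = {ligand} uM -> fraction bound = {occupancy:.2f}")
```

Under this model, half of the domain is occupied when the free ligand concentration equals Kd, so a domain with nanomolar affinity reaches half occupancy at a ligand concentration roughly a thousand-fold lower than one with micromolar affinity.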
Nucleic acid recognition by KH domains – specific examples

NMR structure of the KH3 domain of hnRNP K with ssDNA bound

The type I KH3 domain of the transcriptional regulator hnRNP K binds to a 10mer ssDNA, specifically recognizing the tetrad 5′-dTCCC (Fig. 5) [23] (Protein Data Bank entry 1J5K). The authors propose that the complex is stabilized by methyl-to-oxygen hydrogen bonds between three Ile side-chains and the O2 and N3 atoms of the two central cytosine bases. Methyl-to-oxygen hydrogen bonds are uncommon and weak, but not without precedent [29,30]. Additional interactions that stabilize the complex include protein backbone and side-chain hydrogen bonds to bases, and electrostatic interactions between positively charged side-chains on the protein and the phosphate backbone of the nucleic acid.

Fig. 3. Table showing sequence identities (%) of KH domains within protein families. Data for the FMRP, Nova and PCBP families are shown. For each family, the sequences of individual KH domains were aligned with KH domains at different positions in the same protein, and KH domains at the same position in different proteins. The highest percentage identities were consistently those between KH domains at the same position in different members of a protein family.

FMRP family
              FMRP    FMRP    FXR1    FXR1    FXR2    FXR2    dmFMR1  dmFMR1
              KH1     KH2     KH1     KH2     KH1     KH2     KH1     KH2
FMRP KH1      100.0   -       -       -       -       -       -       -
FMRP KH2      21.7    100.0   -       -       -       -       -       -
FXR1 KH1      82.0    23.0    100.0   -       -       -       -       -
FXR1 KH2      20.3    55.2    17.7    100.0   -       -       -       -
FXR2 KH1      64.3    20.0    68.6    19.0    100.0   -       -       -
FXR2 KH2      21.7    53.7    22.9    82.0    22.8    100.0   -       -
dmFMR1 KH1    54.4    22.1    58.8    23.1    58.8    25.6    100.0   -
dmFMR1 KH2    22.5    43.1    24.0    65.3    26.7    62.5    22.6    100.0

Nova family
              NOVA-1  NOVA-1  NOVA-1  NOVA-2  NOVA-2  NOVA-2
              KH1     KH2     KH3     KH1     KH2     KH3
NOVA-1 KH1    100.0   -       -       -       -       -
NOVA-1 KH2    35.3    100.0   -       -       -       -
NOVA-1 KH3    40.3    37.3    100.0   -       -       -
NOVA-2 KH1    95.5    36.8    36.8    100.0   -       -
NOVA-2 KH2    32.4    86.3    34.3    34.3    100.0   -
NOVA-2 KH3    38.8    35.8    90.9    37.3    35.4    100.0

PCBP family (columns PCB1 and PCB2)
              PCB1    PCB1    PCB1    PCB2    PCB2    PCB2
              KH1     KH2     KH3     KH1     KH2     KH3
PCB1 KH1      100.0   -       -       -       -       -
PCB1 KH2      33.8    100.0   -       -       -       -
PCB1 KH3      35.4    33.8    100.0   -       -       -
PCB2 KH1      95.2    33.8    32.3    100.0   -       -
PCB2 KH2      35.4    93.8    31.0    35.4    100.0   -
PCB2 KH3      33.8    35.4    92.1    30.8    32.4    100.0
PCB3 KH1      88.7    32.3    36.9    90.3    33.8    35.4
PCB3 KH2      35.4    84.6    35.4    33.8    89.2    36.9
PCB3 KH3      36.4    38.5    84.1    33.3    40.0    84.1
PCB4 KH1      74.2    35.4    35.4    69.4    33.8    35.4
PCB4 KH2      35.4    76.9    36.9    33.8    80.0    38.5
PCB4 KH3      33.8    33.8    66.7    30.8    35.4    71.4

PCBP family (columns PCB3 and PCB4)
              PCB3    PCB3    PCB3    PCB4    PCB4    PCB4
              KH1     KH2     KH3     KH1     KH2     KH3
PCB3 KH1      100.0   -       -       -       -       -
PCB3 KH2      33.4    100.0   -       -       -       -
PCB3 KH3      36.9    40.0    100.0   -       -       -
PCB4 KH1      75.8    35.4    36.9    100.0   -       -
PCB4 KH2      36.9    86.2    41.5    33.8    100.0   -
PCB4 KH3      35.4    36.9    68.3    33.8    33.8    100.0

Poly(C)-binding proteins

Poly(C)-binding proteins (PCBPs) contain three type I KH domains, which appear to function independently, because they are separated by long linkers: KH1–(16 amino acid spacer)–KH2–(67 to > 100 amino acid spacer)–KH3. They bind to poly(C)-rich DNA and RNA sequences and function in a diverse range of cellular processes, including mRNA stabilization, translational activation, and translational silencing [31,32].

Crystal structures have been solved of the PCBP2 KH1 in complex with a 12-nucleotide ssDNA and with its RNA equivalent (Protein Data Bank entries 2PQU and 2PQY, respectively) [33]. In both the ssDNA and RNA complexes, the 12 nucleotides correspond to two repeats of the human C-rich strand telomeric DNA, 5′-AACCCTAACCCT-3′.
The asymmetric unit of both ssDNA and RNA crystals contains two KH1 molecules tethered by one oligonucleotide ligand. The crystal structures of PCBP2 KH1 in complex with either 12-nucleotide ssDNA or its equivalent RNA are similar, with no indication that the 2′-hydroxyl groups of the RNA are involved in interactions with the protein (Fig. 6A). The CCCT/U tetranucleotide motif constitutes the core recognition sequence.

Interestingly, however, when PCBP2 KH1 was crystallized with a seven-nucleotide single repeat ssDNA ligand, 5′-AACCCTA-3′, a different ‘register’ of the nucleic acid–protein complex was observed [28] (Protein Data Bank entry 2AXY, shown in Fig. 6B). In all structures, the nucleic acid was in the ‘typical’ cleft, but its position relative to the protein was shifted up by one base in the 5′-direction in the seven-nucleotide structure (ACCC versus CCCT; Fig. 6A–C). The first position of the core recognition motif sits on top of the α1-helix, and then the phosphate backbone of the next two nucleotides interacts with the α1-helix and the GXXG motif on the left, and the β2-strand and the variable loop on the right. Base stacking is observed between the third and fourth position nucleotides of the core recognition sequence.

The recently solved high-resolution structure of the third KH domain of PCBP2 bound to ssDNA, 5′-dAACCCTA-3′ [34] (Protein Data Bank entry 2P2R), is similar to previous structures of the first KH domain of PCBP2. However, because the crystals diffracted to ultra-high resolution, hydrogen bonding and water molecules mediating protein–DNA contacts were observed that previously could not be resolved in other crystal structures. Specifically, the binding cleft is occupied by the tetrad 5′-CCCT-3′, with direct and water-mediated contacts stabilizing the last two bases, and protein–nucleic acid contacts to two additional bases beyond the binding cleft were seen. Also of interest is the observation that in different crystal forms, the KH domains of PCBP2 were either monomeric or were a crystal-contact-mediated dimer (see section on KH dimers).

Fig. 4. Common features of KH domain–nucleic acid interactions. (A) Type I KH domain; the binding cleft comprises the secondary structural elements α1-helix, GXXG loop, α2-helix, β2-strand, and variable loop (colored green), and recognizes four nucleotides (cyan sticks). The green dotted line represents the location of the variable loop in type II KH domains. (B) Nucleic acid bases of the ligand stacking with each other. Coordinates from Protein Data Bank entry 1J5K were used in (A) and (B), and coordinates from Protein Data Bank entry 2ASB were used in (C).

RNA recognition by a single KH domain in cooperation with a QUA2 domain

SF1 specifically recognizes the intron branchpoint sequence (BPS) UACUAAC in pre-mRNA transcripts [35], with KH domain binding augmented by additional interactions with a C-terminal helix known as the QUA2 domain (labeled in Fig. 7) [36]. The RNA adopts an extended single-stranded conformation, and is bound in a hydrophobic groove between QUA2, the GXXG loop and the variable loop of the KH domain [22] (Fig. 7; Protein Data Bank entry 1K1G). The QUA2 region recognizes the 5′-nucleotides of the BPS (ACU), with the α1-helix, the α2-helix and the β2-strand of the KH domain region interacting with the next nucleotides of the RNA in ‘typical’ fashion. A large surface area of predominantly aliphatic hydrophobic residues is buried at the protein–RNA interface. In addition, positively charged side-chains make electrostatic interactions with the solvent-exposed phosphate backbone. Protein contacts to the 3′-end of the RNA are provided by the variable loop and the β2-strand.
Binding of the seven-nucleotide RNA BPS requires both the QUA2 and KH regions.

Another example of an augmented KH domain is the fourth KH domain of KSRP [26], which contains a novel fourth β-strand that is located adjacent and angled to the β1-strand and contributes to the stability of the protein (Protein Data Bank entry 2HH2). It is not yet known whether the fourth β-strand is involved in contacts with RNA [26].

X-ray structure of Nova-2 KH3 plus SELEX RNA

The X-ray structure of the KH3 domain of Nova-2 bound to an in vitro selected stem–loop RNA containing the 5′-UCAC-3′ core recognition sequence has been solved [21] (Fig. 8). This structure is something of an ‘outlier’, because the nucleic acid has a double-stranded hairpin stretch (not shown in Fig. 8), which may be a consequence of stability requirements for selection in vitro [37].

The stem of the hairpin adopts the A-form double-helical conformation, with four Watson–Crick base pairs (G1–C20, A2–U19, G3–C18, G4–C17) and a single hydrogen bond between A5 and C16 (N1–O2 = 2.4 Å).

The extended target RNA (A11, U12, C13, A14, C15) lies upon a hydrophobic platform (formed by the α1-helix and the edge of the β2-strand), where it contacts both the invariant GXXG motif and the variable loop.

Nucleic acid binding by tandem but independent KH domains – NMR structure of the KH3 and KH4 domains of FBP in complex with FUSE ssDNA

FUSE-binding protein has four KH domains, which are separated by linkers of varying lengths [11]. FBP regulates c-myc expression by binding to FUSE [38]. The NMR structure of a complex between the KH3 and KH4 domains of FBP and a 29-base ssDNA fragment from FUSE [12] shows that each KH domain binds to a separate 9- to 10-base segment of ssDNA (Fig. 9). The KH domains are connected by a flexible Gly-rich linker, and behave independently. In addition, the two ssDNA segments to which the KH domains bind are themselves separated by a five-base linker of ssDNA. There are no protein contacts between the KH domains, and the linker DNA is not in contact with protein.

In both KH domains, the ssDNA is bound in the typical extended orientation, in the groove between the α1-helix and α2-helix plus the GXXG loop on one side, and the β2-strand and the variable loop on the other. The center of the groove is hydrophobic, and the edges are hydrophilic and charged, with the narrow binding site (10 Å) favoring pyrimidines over purines.

Fig. 5. Solution structure of the KH3 domain of hnRNP K bound to ssDNA. The third KH domain of hnRNP K (Protein Data Bank entry 1J5K) recognizes a tetrad of sequence 5′-dTCCC (purple sticks). Regions on the protein that are in contact with the nucleic acid ligand are colored green (hydrophobic) and cyan (polar). The sugar phosphate backbone curves around the α1-helix near the GXXG loop before proceeding parallel to the α2-helix. The first base sits on top of the α1-helix, and the 5′-dCCC bases of the tetrad fill the interior of the predominantly hydrophobic cleft and base stack with each other (see Fig. 4B). The ends of the ssDNA sugar backbone are stabilized by electrostatic interactions with positively charged residues that line the ridge of the cleft on the GXXG loop and α2-helix.

NusA – crystal structure of tandem type II KH domains

NusA regulates transcriptional elongation, pausing, termination and antitermination in prokaryotes [39–41]. The protein contains two tandem type II KH domains, which are connected by a short six-residue linker [14,15]. This short linker, combined with a tight turn between the domains, results in a structure in which the two KH domains are in contact and form an extended and continuous surface for RNA binding.
NusA binds with high affinity and specificity to BoxB–BoxA–BoxC antitermination sequences within the leader region of the rRNA operon [15]. Ligand binding produces no change in the structure or relative orientation of the KH domains (Protein Data Bank entries 1KOR and 2ASB) [27]. The ssRNA is bound in an extended conformation and is in contact with large areas on both KH domains (Fig. 10).

Despite having type II connectivity, each KH domain of NusA contains a ‘typical’ binding cleft. The variable loop, however, hangs at the bottom of the cleft (Fig. 4A) instead of up and across from the GXXG loop, as in type I KH domains. The 5′-end of the RNA (bases A42 through A45) is buried in and across the groove between the α1-helix, the α2-helix and the β2-strand of KH1. Intimate contacts between protein and RNA continue across the cusp of the KH1 and KH2 domains. C46 binds to the loop connecting the β′-strand and α′-helix of KH2, and U47 and C48 make contacts with the α1-helix and the GXXG loop of KH2. Finally, the nucleotides at the 3′-end of the RNA (A49–A52) pack against the groove comprising the α1-helix, the α2-helix and the β2-strand of KH2. Hydrogen bonds to both amino acid side-chains and the protein backbone, electrostatic and polar interactions and, to a lesser extent, hydrophobic interactions between bases and nonaromatic amino acid side-chains stabilize the protein–RNA complex.

The interaction of the NusA tandem KH domains with RNA is quite different from that seen in the double KH domain of FBP bound to ssDNA from FUSE, the only other structure of a double KH domain bound to a nucleic acid target. In FBP, the two KH domains are connected by a flexible 30-residue Gly-rich linker and behave like beads on a string [12]. In the protein–DNA complex, each KH domain interacts with a separate ssDNA recognition sequence, and a five-nucleotide noninteracting spacer separates the two bound DNA recognition sequences. Although in both examples the coupling of two nucleic acid-binding domains will effectively increase the specificity and affinity of the nucleic acid–protein interaction, the two different binding modes have very different consequences for the type and length of nucleic acid bound.

Fig. 6. Crystal structures of the first KH domain from PCBP2 in complex with ssDNA. The first KH domain of PCBP2 recognizes the tetrad sequence 5′-dCCCT [(A) Protein Data Bank entry 2PQU] and 5′-dACCC [(B) Protein Data Bank entry 2AXY]. Polar and hydrophobic residues that make contacts with nucleic acid (purple sticks) are colored cyan and green, respectively. Waters (gray spheres) that bridge protein and ssDNA contacts were unambiguously resolved in the high-resolution structure in (B). Both structures are representative molecules within the asymmetric unit. In (C), the tetrad sequence (purple letters) of each structure is aligned with respect to the seven-nucleotide single repeat ssDNA ligand. The register of the sequence is shifted in the 5′-direction in (A). In both structures, the nucleotide at the 5′-end of the ssDNA strand sits on the top of the α1-helix, and is stabilized by contacts that can recognize an adenine or cytosine nucleotide. The central cytosine bases of the tetrad sequence occupy the hydrophobic interior of the binding cleft. The last nucleotide at the 3′-end of the ssDNA strand (dC in 2AXY; dT in 2PQU) participates in base-stacking interactions with the preceding cytosine base.

KH crystal dimers – a tenuous relationship

Crystallographic data

Different KH domains crystallize as monomers, dimers, or tetramers. This and other observations have led to the proposal that the functional form of certain KH domains may involve noncovalent dimers or higher-order oligomers. Here we review the data.

Fig. 7. Solution structure of the QUA2 and KH domains of SF1 in complex with RNA. The QUA2 and KH domains of SF1, together, recognize RNA BPS 5′-UACUAAC (blue sticks; Protein Data Bank entry 1K1G). Protein side-chains making polar and hydrophobic contacts with RNA are colored cyan and green, respectively. The QUA2 domain (labeled) abuts the α2-helix of the KH domain, giving rise to an expanded contact with RNA, with the five nucleotides at the 5′-end of the RNA contacting the QUA2 domain exclusively. The base of Ura6 is buried between the α1-helix and the QUA2 helix. The RNA then continues in single-stranded, extended conformation into the ‘typical’ KH groove. Finally, the RNA loops over to the right and makes contact with the β2-strand. Note also the very long variable loop, 24 amino acids, which loops back over the RNA from the right.

Fig. 8. Crystal structure of Nova-2 KH3 bound to SELEX RNA. The third KH domain of the protein Nova-2 binds to the tetranucleotide sequence 5′-UCAC (blue sticks; Protein Data Bank entry 1EC6), which is part of the larger SELEX RNA. Protein side-chains making polar and hydrophobic contacts with RNA are shown in cyan and green, respectively. U12–C13–A14 rests on a hydrophobic platform formed by the α1-helix and the β2-strand. Electrostatic interactions between protein side-chains, nucleic acid bases and the sugar phosphate backbone further stabilize the complex. Bases A14 and C15 participate in base-stacking interactions with each other. The 2′-hydroxyl groups of the tetrad hydrogen bond with protein or other bases, making it unlikely that ssDNA could bind tightly to this KH domain.

Crystals of the single KH3 domain of the protein Nova-2 contain four KH molecules per asymmetric unit (Protein Data Bank entry 1DTJ) related by pseudo-222 noncrystallographic symmetry (NCS; Fig. 11A), with two different surfaces on each KH domain mediating protein–protein contacts (Fig. 11B,C). One protein–protein interface comprises primarily two β1-strands from two KH domains related by two-fold NCS. This arrangement creates an augmented antiparallel β-sheet stabilized by cross-strand side-chain interactions [42] and a buried surface area of 890 Å² [18,43] (reported as 950 Å² in [44]) (Fig. 12A). The other interface comprises the α′-helices of two NCS-related KH domains, which pack at an angle of approximately 50° [45] and bury 1000 Å² (reported as 1250 Å² in [44]; Fig. 11C).

Interestingly, the same KH domain in complex with a SELEX RNA crystallizes with only two KH molecules in the asymmetric unit related by NCS [21]. The two KH molecules interact through related α′-helices and bury 1000 Å² (Fig. 12B). This arrangement is identical to the protein–protein interactions observed in crystals of apo-Nova (Fig. 11C).

Crystals of the first KH domain of PCBP2 in complex with ssDNA contain two identical dimer complexes per asymmetric unit related by two-fold NCS [28] (Protein Data Bank entry 2AXY; Fig. 13A). The dimer buries 1890 Å², and as in the protein–protein interface depicted in Fig. 12A, an augmented antiparallel β-sheet is formed by symmetry-related β1-strands and further stabilized by interactions between α′-helices (Fig. 13B). This dimeric arrangement is reproduced in crystals of two PCBP2 KH1 molecules tethered by one ssDNA or RNA ligand [33] (Protein Data Bank entries 2PQU and 2PYQ). In the cocrystal structure of the third KH domain of human PCBP2 with DNA [34], however, no protein–protein contacts were observed in the crystal. Instead, crystal contacts were formed solely by base-stacking interactions of DNA molecules from adjacent asymmetric units. A1 of the heptanucleotide stacks on C3 of a symmetry-related DNA and vice versa.
For neither the apo nor the nucleic acid-bound forms of these KH domains are there published solution data supporting the idea that they exist as dimers or higher-order oligomers in solution [17,44], nor have dimers or higher-order oligomers been shown to be of functional significance in vivo.

Fig. 9. Solution structure of the FBP KH3–KH4 domain bound to ssDNA. The third and fourth KH domains of FBP recognize ssDNA 5′-dTTTT (A) and 5′-ATTC (B), respectively. In both domains, the binding cleft makes hydrophobic contacts with the ssDNA bases, and polar residues lining the edge of the cleft contact the sugar phosphate backbone. The bases of the DNA ligand stack with each other, with the methyl groups of thymine pointing away from the binding cleft. Both domains behave independently. Although both the KH domains and both the DNA-binding sites were present as a single unit, neither the Gly-rich protein linker nor the noncontacted ssDNA were resolved.

In crystals of the tandem KH domains from human FMRP, there are also two molecules in the asymmetric unit related by NCS [17] (Protein Data Bank entry 2QND). Contacts between NCS-related β2-strands and, to a lesser extent, α1-helices bury 2100 Å² (Fig. 14A). This β-sheet augmentation is similar to that seen with apo-Nova-2 KH3 and PCBP2 KH1, but its interface comprises primarily β2–β2 and not β1–β1 interactions (compare Figs 12 and 13 with Fig. 14B). When the C2 operation is applied to the asymmetric unit, another interface is formed between neighboring KH domains. This interface is mediated by symmetry-related α′-helices, as seen in crystals of RNA-bound Nova-2 KH3 (Figs 11C and 12B), and buries 1200 Å², significantly less than observed in the asymmetric unit.
In summary, two interfaces are commonly observed in the crystals: (a) helix–helix packing between symmetry-related α′-helices with a ~50° packing angle, as seen in the Nova-2 KH3–RNA structure; and (b) β-sheet augmentation achieved by contacts between β1 or β2 symmetry-related strands, as seen in Nova-2 KH3, hFMRP (KH1–KH2Δ), and PCBP2 KH1.

Caution is advised in extrapolating from crystal structures to predict the solution oligomeric state of KH domains. Although several KH domains form

Fig. 10. Crystal structure of tandem type II KH domains of NusA in complex with RNA. The tandem KH1–KH2 domains of NusA recognize RNA ligand 5′-GAACUCAAUAG. (A) The KH1–KH2 domains of NusA bound to cognate RNA ligand (Protein Data Bank entry 2ASB). The RNA–protein contact surface spans both domains. In particular, A45 makes contacts with residues in both KH1 and KH2. Additional polar contacts with 2′-hydroxyls specify RNA recognition. The KH1 and KH2 domains are shown separately in (B) and (C), respectively. Type II KH domains are connected differently. The variable loop, for example, is located at the bottom and to the left of the binding cleft. Although the connection of type II KH domains is different, the structural elements that comprise the binding cleft are the same as in type I domains, and accommodate four nucleotides as well.

Fig. 11. Protein–protein interfaces in Nova-2 KH3 in crystals. This figure is an adaptation of Figs 6 and 7 from Lewis et al. [44], using Protein Data Bank coordinates 1DTJ. (A) Contents of the asymmetric unit with the two-fold NCS axis labeled. The tetrameric arrangement of molecules produces two protein–protein interfaces. (B) One protein–protein interface generated by two-fold NCS. (C) Other protein–protein interfaces also generated by two-fold NCS.

[...]...
hydrophobic core, be solvent-exposed, and be involved in direct interactions with RNA [21,53,54]. The lack of a consensus can be attributed, at least in part, to the extrapolation of data from other KH domains to the KH1–KH2 domains of FMRP. The structure of the tandem KH1–KH2 domains of hFMRP provided the first crystallographic description of the structural environment of the Ile304 residue [17]. It revealed...

single-stranded DNA recognition by KH domains: solution structure of a complex between hnRNP K KH3 and single-stranded DNA. EMBO J 21, 3476–3485.
24 Lunde BM, Moore C & Varani G (2007) RNA-binding proteins: modular design for efficient function. Nat Rev Mol Cell Biol 8, 479–490.
25 Chmiel NH, Rio DC & Doudna JA (2006) Distinct contributions of KH domains...

Data Bank entries 1DTJ and 1EC6) [27], signifying that the protein backbone does not move in the presence of RNA ligand. The main hydrophobic core of Nova-2 KH3 comprises residues that are similar but not identical to the residues in the main core of fragile X KH domains. Analysis of the RNA-bound structure of Nova-2 KH3 reveals that the atoms of Leu28 are buried except for Cβ, Cγ, and Cδ1, whose combined...

Biochemical studies

Fig. 12. Schematic representation of protein–protein surfaces of free and RNA-bound Nova-2 crystals. The schematic in (A) and (B) is based on the protein–protein interactions shown in Fig. 11B,C, respectively. Salient secondary structure elements are labeled. Cross-strand side-chain interactions are shown in open and closed...
tandem KH domains, PSI KH1–KH4, that bind pre-mRNA cooperatively. As with dFXRP, introducing the Ile304 → Asn equivalent mutation into each KH domain has relatively subtle effects on secondary structure [25]. Leu28 in the KH3 domain of the protein Nova-2 is structurally equivalent to Ile304 in FMRP. A study by Lewis et al. [44] found that the Ile304 → Asn mutation perturbs the structure of Nova-2 KH3 and...

interactions. For hnRNP K, the N-terminal two-thirds of the protein, spanning KH1, KH2, and the junction between KH2 and KH3, was required for interactions with hnRNP E2, I, K, and L. Deletion of the junction sequence including the Pro-rich regions (but not the KH domain) abolished protein–protein interactions, and the region spanning the junction sequence and KH3 domain was not sufficient for the protein–protein...

(2005) Structure of a Mycobacterium tuberculosis NusA-RNA complex. EMBO J 24, 3576–3587.
16 Brykailo MA, Corbett AH & Fridovich-Keil JL (2007) Functional overlap between conserved and diverged KH domains in Saccharomyces cerevisiae SCP160. Nucleic Acids Res 35, 1108–1118.
17 Valverde R, Pozdnyakova I, Kajander T & Regan L (2007) Fragile X mental retardation: the structure of the KH1–KH2 domains of fragile...

of a single point mutation within a KH domain. Fragile X mental retardation syndrome is the most common form of inherited mental impairment in humans. For all fragile X individuals, the underlying cause of the syndrome is lack of functional FMRP. In the majority of cases, FMRP is not made because a CGG repeat expansion in the 5′-UTR of the gene encoding it is hypermethylated,...
destabilize the hydrophobic core of the KH2 domain of FMRP. This group subsequently reported that introducing an iso-structural Asn in place of Leu28 would alter the electrostatic properties of a hydrophobic platform, stabilizing the RNA ligand on the protein, without changing the hydrophobic interior of the domain [21]. The Cα backbones of the structures of both free and RNA-bound Nova-2 KH3 are essentially the...

coordinates 2QND). The strands of one chain are represented as open arrows, and the symmetry-related strands are shaded. Hydrophobic and polar side-chains are shown in closed and open circles, respectively. This orientation creates an augmented β-sheet composed of six antiparallel strands. This arrangement of KH molecules buries 2100 Å² of total buried surface area with cross-strand side-chain interactions.

the β-sheet of KH1 and the α-helices (α′ and α2) of KH2 (Fig. 2B). By contrast, in hFMRP (KH1–KH2Δ), the α′-helix of KH1 is linked to the β1-strand of KH2 by the

Date posted: 07/03/2014, 05:20
