Membrane Protein Engineering with Rosetta

Protein engineering can yield new molecular tools for nanotechnology and therapeutic applications by modulating physicochemical and biological properties. Engineering membrane proteins is especially attractive because they perform key cellular processes including transport, nutrient uptake, removal of toxins, respiration, motility, and signaling. In this chapter, we describe two protocols for membrane protein engineering with the Rosetta software: (1) ΔΔG calculations for single point mutations and (2) sequence optimization in different membrane lipid compositions. These modular protocols are easily adaptable for more complex problems and serve as a foundation for efficient membrane protein engineering calculations.

Keywords: Implicit membrane, lipid composition, Monte Carlo, Rosetta, protein design

Introduction

Membrane proteins perform key cellular processes including transport, nutrient uptake, removal of toxins, respiration, motility, and signaling. They constitute over 30% of proteins and are targets for over 60% of pharmaceutical drugs [1,2]. Understanding their structures will provide insight into cellular function, the molecular basis of disease, and enable design of proteins for therapeutic and nanotechnology applications. Protein engineering is a promising tool for this purpose [3]. However, experimental methods to characterize designs in a lipid bilayer are onerous, time consuming, and expensive [4]. Thus, computational tools provide a promising route to accelerate efforts in membrane protein engineering [5].

The past 20 years of computational membrane protein engineering have largely focused on reprogramming natural membrane protein functions [6–8]. Chen et al. [9] applied structural and bioinformatics calculations to generate G protein-coupled receptors that decrease agonist affinity via stabilization. To ease experimental difficulties, membrane proteins have been redesigned to assemble and function in the aqueous phase [10,11]. Young et al. [12] engineered the dopamine D2 receptor to favor different chemical pathways. Further, both α-helical and β-barrel proteins have been repurposed as pores for industrial-scale separation of water and other solutes including glucose, sucrose, and salt [13,14]. This handful of examples demonstrates the versatility of computational methods to yield new molecular tools.

Membrane protein engineers have also pursued the holy grail of de novo design: generating new proteins from physical principles with novel sequences [15]. De novo design can inform how proteins work [16] and create a wider array of tools outside of what biology has already provided such as drug delivery [17] and process control [18,19]. For membrane proteins, there have been two significant de novo milestones. DeGrado and coworkers designed the PRIME and ROCKR alternating access transporters formed by two tightly interacting pairs of helices [20,21]. Lu et al. also achieved de novo design of helical bundles that insert into the membrane [22].

The aforementioned successes leverage many approaches for scoring and sampling both sequences and structural conformations. A key point of differentiation is how these methods capture the variable affinity of side chains for the heterogeneous lipid membrane environment. Knowledge-based algorithms that derive depth-dependent side chain probabilities from known structures in the Protein Databank are realistic [23–25]; yet, they cannot capture different lipid compositions. Thermodynamic-based scales offer a first principles approach [26–28]; yet, until recently these scales were derived from side chain analogues in organic solvents which do not resemble the behavior of biological membranes [29]. Further, both approaches fell short of fine-tuning the lipid-facing regions [30], and were subject to over-designing hydrophobic residues [31]. In contrast, biological transmembrane domains can be marginally hydrophobic and depend on local sequence context for insertion [32].

Recently, we developed a thermodynamics-centered approach for membrane protein design [33]. This advance builds upon our prior work developing a framework for membrane protein modeling in Rosetta [34,35]. There were three key developments: (1) models of different lipid compositions, (2) efficient computation of pore dimensions, and (3) use of the Moon & Fleming side chain hydrophobicity scale, which was derived in phospholipid bilayers in the context of a real membrane protein [36]. This model extends the existing set of mathematical rules used to discriminate near-native from non-native conformations: the energy function. First, we establish a central term based on the Lazaridis formalism for the water-to-bilayer transfer energy (Eq. 1).

$$\Delta G_{\mathrm{memb}} = \left(1 - f_{\mathrm{hyd}}\right)\left(\Delta G_{\mathrm{lipid}} - \Delta G_{\mathrm{water}}\right) \tag{1}$$

In this framework, $\Delta G_{\mathrm{lipid}}$ is the free energy of a side chain in lipid and $\Delta G_{\mathrm{water}}$ is the free energy of a side chain in the aqueous phase. The term $f_{\mathrm{hyd}}$ captures the shape of the bilayer. Here, the membrane is modeled as an implicit continuum [37]. The equation for $f_{\mathrm{hyd}}$ again follows the Lazaridis formalism of a mixture model between the shape in the x, y, and z dimensions (Eq. 2).

$$f_{\mathrm{hyd}} = f_{\mathrm{thk}} + f_{\mathrm{pore}} - f_{\mathrm{thk}}\, f_{\mathrm{pore}} \tag{2}$$

The term $f_{\mathrm{thk}}$ models the thickness of the hydrocarbon core and the steepness of the transition between the aqueous and lipid phase for different phospholipid bilayer compositions. The second term $f_{\mathrm{pore}}$ represents cavities in the protein. Capturing cavities is key because a central function of membrane proteins is responding to extracellular signals or transporting materials by exposing the interior of a protein to water.
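As a worked illustration of Eqs. 1 and 2, the short sketch below evaluates the transfer energy at three idealized positions. It assumes, reading Eq. 1 literally, that a value of 1 for either shape term means fully hydrated and 0 means fully in the hydrocarbon core. The function names and the free energy values are hypothetical and are not taken from Rosetta or from any hydrophobicity scale.

```python
def hydration_fraction(f_thk: float, f_pore: float) -> float:
    """Eq. 2: combine the bilayer-thickness and pore terms into f_hyd."""
    return f_thk + f_pore - f_thk * f_pore


def membrane_transfer_energy(dg_lipid, dg_water, f_thk, f_pore):
    """Eq. 1: water-to-bilayer transfer energy scaled by (1 - f_hyd)."""
    f_hyd = hydration_fraction(f_thk, f_pore)
    return (1.0 - f_hyd) * (dg_lipid - dg_water)


# Illustrative numbers only (assumed, not from any published scale).
dg_lipid, dg_water = -2.0, 0.0

# Side chain in the hydrocarbon core, away from any pore.
buried = membrane_transfer_energy(dg_lipid, dg_water, f_thk=0.0, f_pore=0.0)
# Side chain at the same depth but inside a water-filled pore.
in_pore = membrane_transfer_energy(dg_lipid, dg_water, f_thk=0.0, f_pore=1.0)
# Side chain halfway through the water-to-lipid transition region.
halfway = membrane_transfer_energy(dg_lipid, dg_water, f_thk=0.5, f_pore=0.0)

print(buried, in_pore, halfway)
```

In this toy setting the pore term switches the transfer energy off entirely, which is the behavior the mixture form of Eq. 2 is meant to capture.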

Next, we integrated the water-to-bilayer term with the current Rosetta all-atom energy function: ref2015 [38]. The Rosetta energy function is a hybrid model that aims to approximate the Gibbs free energy through a combination of physics-based, empirically derived, and knowledge-based terms. Currently, these terms include van der Waals, Coulomb electrostatics, Lazaridis-Karplus solvation, hydrogen bonding, and knowledge-based torsion terms. The weights on each term were determined through a Nelder-Mead optimization scheme [39]. We created the membrane version, called franklin2019, by adding the water-to-bilayer term and assigning it a weight of 0.5. The weight was determined by iteratively running four scientific benchmark tests: peptide tilt angle, ΔΔG of mutation, sequence recovery, and native structure discrimination.

The energy function provides a foundation for choosing optimal sequences and conformations. The second key step of macromolecular modeling is to sample both the sequence and conformational space. To do so, we use Rosetta3’s existing Monte Carlo-plus-minimization sampling framework [34]. Here, the system degrees of freedom are perturbed, energy minimized, and then changes are accepted or rejected according to the Metropolis criterion: given the acceptance probability $A = \min\left(1, \exp(-\Delta E / k_B T)\right)$, accept the changes if $A \geq U(0,1)$, where $U(0,1)$ is a random number between zero and one (inclusive); otherwise, reject the changes. Importantly, Rosetta samples degrees of freedom as internal coordinates ($\phi$, $\psi$, $\omega$) rather than Cartesian coordinates ($x$, $y$, $z$). Bond lengths and angles can be sampled but are usually kept fixed. Finally, the calculation is stochastic, meaning the program is run many times and the ensemble of low energy models represents the final solution.
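The accept/reject step can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not Rosetta's implementation: the one-dimensional energy function, the step size, and the value of kT are all hypothetical, and the minimization step is omitted for brevity.

```python
import math
import random


def metropolis_accept(delta_e: float, kT: float, rng: random.Random) -> bool:
    """Return True if a move with energy change delta_e is accepted."""
    if delta_e <= 0.0:
        return True  # A = 1: downhill or neutral moves are always accepted
    acceptance = math.exp(-delta_e / kT)  # A = exp(-dE / kT) < 1 for uphill moves
    return acceptance >= rng.random()


def toy_energy(x: float) -> float:
    """Hypothetical one-dimensional energy landscape with a minimum at x = 2."""
    return (x - 2.0) ** 2


rng = random.Random(0)  # fixed seed so the run is reproducible
x, kT = 0.0, 0.5
for _ in range(2000):
    trial = x + rng.uniform(-0.5, 0.5)  # perturb the single degree of freedom
    if metropolis_accept(toy_energy(trial) - toy_energy(x), kT, rng):
        x = trial  # keep the accepted move; otherwise stay at the old value
print(f"final x = {x:.2f}")
```

Handling downhill moves before calling `math.exp` avoids a floating-point overflow for very large energy decreases.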

The basic recipe of Monte Carlo-plus-minimization forms a modular framework for creating different design protocols. In this chapter, we will discuss two membrane protein engineering protocols with franklin2019: (1) a ΔΔG calculation for single point mutations and (2) a fixed-backbone design protocol for optimizing sequences in bilayers with different lipid composition. These protocols can be adapted and expanded for wide-ranging membrane protein engineering calculations.

Materials

This chapter demonstrates membrane protein design calculations in the Rosetta macromolecular modeling package. The tutorial requires a Unix-based operating system (Linux or macOS). Windows users will need to install the Windows Subsystem for Linux. We have included examples for both the Unix command line and Python environments. Additionally, the tutorial requires Python version 3.6 or 3.7. The protocols were tested with PyRosetta revision #249 and Rosetta revision #61215.

Software

To run this tutorial, the user will need to download the following software:

Additionally, this tutorial uses the Orientations of Proteins in Membranes database (https://opm.phar.umich.edu). We have also provided the tutorial as an interactive Python notebook and standalone Python scripts on GitHub. To run the interactive session, the user will need to download the following software:

Jupyter Interactive Python notebooks (https://jupyter.org/install)

Methods

In this chapter, we will demonstrate two membrane protein engineering calculations: (1) predicting the ΔΔG of single point mutations and (2) sequence optimization in the context of bilayers with different lipid composition. Both protocols can be executed on standard computers and do not require specialized hardware. The execution time varies from minutes to hours depending on the hardware and protein size.

Protocol #1: ΔΔG of single point mutations

Accurately estimating the thermodynamic cost of a mutation is a fundamental step of protein engineering and design. This task is especially challenging for membrane proteins because the calculations must account for the heterogeneous membrane environment. Here, we will walk through the protocol for estimating the ΔΔG of point mutations at lipid facing positions using RosettaMP [35] and the franklin2019 energy function [33]. The protocol is available as an interactive Jupyter notebook at https://github.com/RosettaCommons/PyRosetta.notebooks (see Notebook #14.02).

As an example, we will examine mutations in the integral membrane enzyme PagP. PagP is a β-barrel protein that transfers a palmitoyl group from the sn1 position of a glycerophospholipid to the endotoxin of lipopolysaccharide [40]. This enzyme helps bacteria resist host defenses such as antimicrobial peptides [41]. Recently, Marx & Fleming measured the energetic cost of point mutations at site 111 on PagP [42]. In this tutorial, we will perform the same set of mutations with Rosetta and compare the predictions with the experimentally measured values. Text following the > symbol represents command lines that are entered in a Python interactive session or Jupyter notebook. Text following the $ symbol represents command lines that are entered into a shell terminal.

Preparing the PyRosetta working environment

To initialize the working environment, we first download the coordinates for PagP, clean the PDB file, and initialize the PyRosetta working environment. If using the PyRosetta notebooks GitHub repository, the initial PDB file is available in the inputs/ directory.

Download the coordinates for PagP (PDB: 3GP6) from the Orientations of Proteins in Membranes database (https://opm.phar.umich.edu/).

Clean the PDB file by removing all “non-ATOM” lines.
$ grep '^ATOM' inputs/3gp6.pdb > inputs/3gp6_clean.pdb
Open a Python 3 interactive environment or Jupyter notebook and load PyRosetta.
> from pyrosetta import *
> init(extra_options="-mp:lipids:has_pore false")

Load the PDB coordinate file into a Pose object. The Pose is a data structure that stores the coordinates and chemical information for the system.

> pose = pose_from_pdb("inputs/3gp6_clean.pdb")

Initialize the protein in the membrane using AddMembraneMover. The protein is already oriented in the bilayer, so we can estimate the transmembrane spans from the protein coordinates. For this reason, we use the from_structure option to initialize the transmembrane span information. The default lipid composition is a 1,2-dilauroyl-sn-glycero-3-phosphocholine (DLPC) bilayer, which matches the experimental context of a DLPC vesicle.

> from pyrosetta.rosetta.protocols.membrane import *
> add_memb = AddMembraneMover("from_structure")
> add_memb.apply(pose)

Computing the ΔΔG of mutation

Next, we will compute the ΔΔG for several point mutations in PagP. Note, PyRosetta residue numbering may differ from PDB numbering because PyRosetta requires continuous numbering for calculations. We obtain the PyRosetta residue number through the PDBInfo object.
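The idea behind the numbering conversion can be illustrated without PyRosetta. The sketch below is a simplified stand-in for what the PDBInfo lookup provides, not its actual implementation: it assigns a continuous, 1-based index to each residue in the order it appears in the ATOM records. The helper names and the toy records are hypothetical, and insertion codes are ignored.

```python
def pdb_to_pose_numbering(pdb_lines):
    """Map (chain, PDB residue number) to a continuous 1-based index."""
    mapping = {}
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue  # skip HETATM, REMARK, and other records
        chain = line[21]           # column 22: chain identifier
        resnum = int(line[22:26])  # columns 23-26: residue sequence number
        key = (chain, resnum)
        if key not in mapping:
            mapping[key] = len(mapping) + 1
    return mapping


def atom_line(serial, name, resname, chain, resnum):
    """Build a minimal fixed-width ATOM record (coordinates omitted)."""
    return f"ATOM  {serial:5d} {name:<4s} {resname:>3s} {chain}{resnum:4d}"


# Toy records, not a real structure: PDB numbering starts at 8 and has a gap.
lines = [
    "REMARK toy example",
    atom_line(1, "N", "MET", "A", 8),
    atom_line(2, "CA", "MET", "A", 8),
    atom_line(3, "N", "ALA", "A", 9),
    atom_line(4, "N", "VAL", "A", 15),
]
numbering = pdb_to_pose_numbering(lines)
print(numbering[("A", 15)])
```

Because the index simply counts residues in file order, an offset or gap in the PDB numbering shifts every downstream position, which is why the site is looked up rather than hard-coded.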

Create a ScoreFunction object with the franklin2019 energy function.
> sfxn = create_score_function("franklin2019")

Load the predict_ddG package and use the mutate_residue function to create the native conformation. Note: in the Marx & Fleming experiment, position 111 is first mutated from valine to alanine. Then, the V111A pose is used as the reference state for the remaining ΔΔG calculations.

> import predict_ddG as pd
> site = pose.pdb_info().pdb2pose("A", 111)  # chain A assumed; PDB residue 111
> reference_pose = pd.mutate_residue(pose, site, "A", 8.0, sfxn)
Score the alanine reference pose.
> score_A111 = sfxn.score(reference_pose)
Generate a tryptophan mutant and score the new pose.
> pose_W111 = pd.mutate_residue(pose, site, "W", 8.0, sfxn)
> score_W111 = sfxn.score(pose_W111)

Compute the ΔΔG of mutation as the difference between the score of the mutated conformation and the native conformation.

> ddG = score_W111 - score_A111
> print(ddG)

Ultimately, we would like to compute the ΔΔG for mutating alanine to each of the other 19 canonical amino acids. We will use a function, compute_ddG, that encapsulates this code. Then, we write a loop that computes the ΔΔG for every canonical amino acid and stores the results in a Python dictionary. This step takes approximately 10 minutes, depending on the available hardware and protein size.

> amino_acids = ['A', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'K', 'L', 'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'V', 'W', 'Y']
> ddG_data = {}
> for aa in amino_acids:
>     # 104 is the pose number of PDB residue 111
>     ddG = compute_ddG(reference_pose, "A", 104, aa, sfxn)
>     ddG_data[aa] = ddG

Comparison between predictions and experimentally measured values

The next step is to compare the ΔΔG predictions to the experimentally measured values from Marx & Fleming. The experimental data are located in a file called PagP_Marx_Fleming_set.dat. This file can be downloaded from the inputs/ directory in the PyRosetta notebook GitHub repository.

Parse the file and import the values into a Python Dictionary.
> with open('inputs/PagP_Marx_Fleming_set.dat', 'rt') as f:
>     data = f.readlines()
>     data = [x.strip() for x in data]
>     data = [x.split(' ') for x in data]
> exp_ddG_data = {}
> for i in range(1, len(data)):
>     exp_ddG_data[data[i][2]] = float(data[i][3])
Convert the dictionary format to numpy arrays.
> import numpy as np
> mutations = np.asarray( list(ddG_data.keys()) )
> pred_values = np.asarray( list(ddG_data.values()) )
> exp_values = np.asarray( list(exp_ddG_data.values()) )
Compute the correlation coefficient between the experimentally measured and predicted values.
> corr = np.corrcoef(exp_values, pred_values)
> print(corr[0,1])

We find that the correlation coefficient is low (0.376). Importantly, the Pearson correlation coefficient is easily skewed by outliers. Thus, we write and execute a function to identify any outliers.

> def find_outliers(x):
>     outliers = []
>     upper = np.percentile(x, 75)
>     lower = np.percentile(x, 25)
>     IQR = upper - lower
>     quartile_set = (lower - IQR, upper + IQR)
>     for y in x.tolist():
>         if (y < quartile_set[0]) or (y > quartile_set[1]):
>             outliers.append(y)
>     return outliers
> outliers = find_outliers(pred_values)

Using this function, we find that proline is an outlier. We will investigate this more later. For now, we remove it from the set and recompute the correlation coefficient. (Note: proline is the 13th of the 20 amino acids in the list.)

> outlier_idx = list(ddG_data.values()).index(outliers[0])
> exp_data_no_P = []
> pred_data_no_P = []
> for i in range(0, 20):
>     if (i != outlier_idx):
>         exp_data_no_P.append(list(exp_ddG_data.values())[i])
>         pred_data_no_P.append(list(ddG_data.values())[i])
> exp_data_no_P = np.asarray( exp_data_no_P )
> pred_data_no_P = np.asarray( pred_data_no_P )
> corr = np.corrcoef( exp_data_no_P, pred_data_no_P )
> print(corr[0,1])
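The sensitivity of the Pearson coefficient to a single outlier can be seen in a small self-contained example. The numbers below are made up for illustration and are unrelated to the PagP data; the `pearson` helper is a plain-Python stand-in for the correlation computed above.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Made-up values: the first five points lie on a line, the last is an outlier.
measured = [0.0, 1.0, 2.0, 3.0, 4.0, 2.0]
predicted = [0.1, 1.1, 2.1, 3.1, 4.1, 15.0]

r_all = pearson(measured, predicted)
r_trimmed = pearson(measured[:-1], predicted[:-1])
print(f"with outlier: {r_all:.2f}, without: {r_trimmed:.2f}")
```

One stray point is enough to pull an otherwise perfect linear relationship down to a weak correlation, which is why the outlier check precedes the final comparison.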

Examining contributions to the ΔΔG of mutation

Finally, we would like to use the models to learn why some mutations stabilized PagP, whereas other side chains did not.

We need a metric for identifying the most confident predictions, especially since the correlation coefficient is not perfect. Thus, we compute the residuals from the line of best fit and set an empirical cutoff of 1.5 REU. This value is aligned with the experimental uncertainty of 1.5 kcal/mol.

> import seaborn as sns
> resid = sns.residplot(x=exp_data_no_P, y=pred_data_no_P, color="b")
> resid.set_ylabel("Residual")
> resid.set_xlabel("Exp (kcal/mol)")
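The residual cutoff can also be applied numerically rather than read off the plot. The sketch below fits a least-squares line in plain Python and keeps the entries whose residual magnitude is within the 1.5 cutoff. The helper names, labels, and data values are hypothetical and serve only to show the arithmetic.

```python
def linear_fit(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return slope, mean_y - slope * mean_x


def confident_predictions(labels, x, y, cutoff=1.5):
    """Return labels whose residual from the best-fit line is within the cutoff."""
    slope, intercept = linear_fit(x, y)
    residuals = [b - (slope * a + intercept) for a, b in zip(x, y)]
    return [lab for lab, r in zip(labels, residuals) if abs(r) <= cutoff]


# Made-up example: the third point sits far from the trend of the others.
labels = ["m1", "m2", "m3", "m4", "m5"]
exp_vals = [0.0, 1.0, 2.0, 3.0, 4.0]
pred_vals = [0.0, 1.0, 7.0, 3.0, 4.0]

kept = confident_predictions(labels, exp_vals, pred_vals)
print(kept)
```

Note that the deviating point also shifts the fitted line itself, so the residuals of the remaining points are not zero even though they lie on a perfect line among themselves.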

Next, we will hypothesize a mechanism for an example correct prediction (lysine) and rationalize incorrect predictions for proline and leucine. First, we quantify which energy components make the largest contribution to the overall ΔΔG of mutation. The function get_energy_components() is defined in the Jupyter Notebook.

> mutant_tyr = pd.mutate_residue(pose, site, "Y", 8.0, sfxn)
> mutant_lys = pd.mutate_residue(pose, site, "K", 8.0, sfxn)
> mutant_leu = pd.mutate_residue(pose, site, "L", 8.0, sfxn)
> labels, tyr_ddGs = get_energy_components(reference_pose, mutant_tyr, sfxn)
> labels, lys_ddGs = get_energy_components(reference_pose, mutant_lys, sfxn)
> labels, leu_ddGs = get_energy_components(reference_pose, mutant_leu, sfxn)
Finally, we make bar graphs to visualize the individual contributions. These are shown in Fig. 1A.
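The per-term bookkeeping behind such a bar graph amounts to subtracting the reference value of each energy term from the mutant value and discarding negligible differences. The sketch below shows that arithmetic on plain dictionaries. It is not the notebook's get_energy_components() function; the helper name is hypothetical, the term names are used only as illustrative labels, and the values are invented.

```python
def term_differences(reference_terms, mutant_terms, threshold=0.01):
    """Return per-term (mutant - reference) differences larger than threshold."""
    diffs = {}
    for term in sorted(set(reference_terms) | set(mutant_terms)):
        delta = mutant_terms.get(term, 0.0) - reference_terms.get(term, 0.0)
        if abs(delta) > threshold:
            diffs[term] = delta
    return diffs


# Invented per-term energies for a reference and a mutant model.
reference = {"fa_atr": -10.0, "fa_rep": 2.0, "fa_sol": 5.0,
             "fa_water_to_bilayer": -1.0}
mutant = {"fa_atr": -11.5, "fa_rep": 2.0, "fa_sol": 5.005,
          "fa_water_to_bilayer": -2.0}

contributions = term_differences(reference, mutant)
for term, delta in contributions.items():
    print(f"{term:>22s} {delta:+.2f}")
```

The 0.01 threshold mirrors the cutoff used for the terms displayed in the figure.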


Fig. 1. Models for evaluating the mechanism underlying single point mutations in PagP. (A) Residuals between the predicted and experimentally measured ΔΔGmut for mutations from alanine to all canonical amino acids at position 111 (104). The dotted red line represents the accuracy cutoff of 1.5 kcal/mol. The complete structure of PagP (PDB 3gp6) is shown in the top left corner of the plot, and position 111 is highlighted in red. Models of the A111L, A111K, and A111Y mutants are shown in panels (B), (C), and (D), respectively. Within each panel, the top sub-panel shows the structural model of the mutated PagP with a focus on the mutated site. The bottom sub-panel shows the contribution of individual energy terms to the overall ΔΔG for terms that contribute more than 0.01 kcal/mol.

Further, we dump the models to PDB files for visualization in PyMOL (Fig. 1B–D).
> mutant_tyr.dump_pdb("PagP_A111Y.pdb")
> mutant_lys.dump_pdb("PagP_A111K.pdb")
> mutant_leu.dump_pdb("PagP_A111L.pdb")

Protocol #2: Sequence Optimization in different lipid compositions

The ultimate goal of membrane protein engineering is to search for a sequence that achieves a new protein stability, structure, or function. This task may involve changing a part of the sequence or allowing full protein redesign and optimization. Here, we will demonstrate a design calculation with a goal of optimizing the sequence of a membrane protein in different lipid compositions.

As an example, we will redesign the structure of the eukaryotic calcium/proton exchanger VDXC1 (PDB 4K1C) [43]. VDXC1 is part of the CAX family of proteins whose members maintain cytosolic calcium homeostasis during steep rises in intracellular calcium, or following signal transduction caused by hyperosmotic shock or hormone responses [44,45]. Because VDXC1 is a eukaryotic protein, we anticipate that the designed sequence will be better optimized in a thicker phospholipid bilayer, such as 1-palmitoyl-2-oleoyl-glycero-3-phosphocholine (POPC), than in a thinner, short-chain bilayer such as DLPC. To test this hypothesis, we will perform fixed-backbone redesign in both lipid compositions and compare the resulting sequences.

Preparing the Rosetta working environment

To initialize the working environment, we first download the coordinates of the crystal structure of VDXC1, clean the PDB file, generate a spanning topology file, and configure the working directory. This tutorial assumes that Rosetta is already installed in a directory that we will refer to as ROSETTA (see Materials for installation instructions). The “$” lines denote commands in a bash or other shell terminal environment.

Download the coordinates for VDXC1 (PDB 4K1C) from the Orientations of Proteins in Membranes database (https://opm.phar.umich.edu/).

Clean the PDB file by removing all “non-ATOM” lines.
$ grep '^ATOM' 4k1c.pdb > 4k1c_clean.pdb

Generate a file listing the transmembrane segments in the protein using the mp_span_from_pdb application. The resulting spanning topology file will be called out.span.

$ ROSETTA/main/source/bin/mp_span_from_pdb.linuxgccrelease \
    -in:file:s 4k1c_clean.pdb \
    -out:no_output true
Rename the spanning topology file to match the PDB name:
$ mv out.span 4k1c.span

Optimizing the sequence of VDXC1 in different lipid compositions

To perform sequence optimization, we use a Monte Carlo fixed-backbone design protocol that samples sequence space using full-protein rotamer-and-sequence optimization with a multi-cool simulated annealing protocol [46]. Each protein is initialized in the orientation from the Orientations of Proteins in Membranes database [47], and the orientation remains fixed during sequence optimization. Then, we compute properties of the sequence using additional Rosetta applications.

Run the fixed-backbone design protocol on the crystal structure of VDXC1 with a lipid composition of DLPC (a short-chain lipid). This step will require 1 to 2 hours of CPU time on a standard computer. We then repeat this step using a lipid composition of POPC.

$ ROSETTA/main/source/bin/fixbb.linuxgccrelease \
    -in:file:s 4k1c_clean.pdb \
    -mp:setup:spanfiles 4k1c.span \
    -score:weights franklin2019 \
    -in:membrane \
    -mp:lipids:composition DLPC \
    -mp:lipids:temperature 37 \
    -out:prefix 4k1c_DLPC_design_ \
    -out:file:scorefile 4k1c_DLPC_design.sc
$ ROSETTA/main/source/bin/fixbb.linuxgccrelease \
    -in:file:s 4k1c_clean.pdb \
    -mp:setup:spanfiles 4k1c.span \
    -score:weights franklin2019 \
    -in:membrane \
    -mp:lipids:composition POPC \
    -mp:lipids:temperature 37 \
    -out:prefix 4k1c_POPC_design_ \
    -out:file:scorefile 4k1c_POPC_design.sc
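Because the two runs differ only in the lipid composition and the output names, the commands can be generated from a single template. The sketch below assembles the argument lists in Python using only the flags shown in this step; the helper name `fixbb_command` is hypothetical, and the commands are printed rather than executed since running them requires a Rosetta build.

```python
from pathlib import Path


def fixbb_command(rosetta_dir, pdb_file, span_file, lipid, temperature=37):
    """Assemble the fixbb argument list from this protocol for one lipid type."""
    executable = Path(rosetta_dir) / "main/source/bin/fixbb.linuxgccrelease"
    return [
        str(executable),
        "-in:file:s", pdb_file,
        "-mp:setup:spanfiles", span_file,
        "-score:weights", "franklin2019",
        "-in:membrane",
        "-mp:lipids:composition", lipid,
        "-mp:lipids:temperature", str(temperature),
        "-out:prefix", f"4k1c_{lipid}_design_",
        "-out:file:scorefile", f"4k1c_{lipid}_design.sc",
    ]


# One command per lipid composition; ROSETTA stands for the install directory.
commands = [fixbb_command("ROSETTA", "4k1c_clean.pdb", "4k1c.span", lipid)
            for lipid in ("DLPC", "POPC")]
for cmd in commands:
    print(" ".join(cmd))
    # To execute: subprocess.run(cmd, check=True)  (requires a Rosetta build)
```

Passing the arguments as a list, rather than one shell string, avoids quoting problems if a path contains spaces.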

Generate a list of native and designed files to work with the format of the sequence recovery application.

$ ls 4k1c_DLPC_design_4k1c_clean_0001.pdb > 4k1c_DLPC.list
$ ls 4k1c_POPC_design_4k1c_clean_0001.pdb > 4k1c_POPC.list
$ ls 4k1c_clean.pdb > 4k1c_native.list

Compute the number of amino acids recovered overall and within each environment (e.g., buried vs. surface exposed, and lipid-facing vs. aqueous).

$ ROSETTA/main/source/bin/mp_seqrecov.linuxgccrelease \
    -mp:setup:spanfiles 4k1c.span \
    -native_pdb_list 4k1c_native.list \
    -redesign_pdb_list 4k1c_DLPC.list \
    -seq_recov_filename 4k1c_DLPC_fixbb.txt
$ ROSETTA/main/source/bin/mp_seqrecov.linuxgccrelease \
    -mp:setup:spanfiles 4k1c.span \
    -native_pdb_list 4k1c_native.list \
    -redesign_pdb_list 4k1c_POPC.list \
    -seq_recov_filename 4k1c_POPC_fixbb.txt

Compute key metrics including sequence recovery and Kullback-Leibler divergence for all residues, subsets of amino acids, and classes of amino acids using the process_protein_design_results.py script. The script is located in the PyRosetta notebooks GitHub repository in the directory called additional_scripts.

$ python process_protein_design_results.py \
    --energy_fxn f19 \
    --seqrecov_file 4k1c_DLPC_fixbb.txt \
    --prefix 4k1c_DLPC_ \
    --basedir $(pwd)
$ python process_protein_design_results.py \
    --energy_fxn f19 \
    --seqrecov_file 4k1c_POPC_fixbb.txt \
    --prefix 4k1c_POPC_ \
    --basedir $(pwd)
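For readers unfamiliar with the two metrics, the sketch below computes them for a pair of toy sequences. It is an illustration under stated assumptions rather than the logic of process_protein_design_results.py: the direction of the divergence (designed relative to native), the pseudocount of 1, and the sequences themselves are all choices made here for the example.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def sequence_recovery(native, designed):
    """Fraction of positions where the designed residue matches the native one."""
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    matches = sum(1 for n, d in zip(native, designed) if n == d)
    return matches / len(native)


def kl_divergence(native, designed, pseudocount=1.0):
    """D_KL(designed || native) over amino acid frequencies, in nats."""
    nat, des = Counter(native), Counter(designed)
    nat_total = len(native) + pseudocount * len(AMINO_ACIDS)
    des_total = len(designed) + pseudocount * len(AMINO_ACIDS)
    total = 0.0
    for aa in AMINO_ACIDS:
        p = (des[aa] + pseudocount) / des_total  # designed frequency
        q = (nat[aa] + pseudocount) / nat_total  # native frequency
        total += p * math.log(p / q)
    return total


# Toy sequences (not VDXC1): the design over-uses leucine.
native_seq = "LLAVGKDE"
designed_seq = "LLAIGKLL"

recovery = sequence_recovery(native_seq, designed_seq)
divergence = kl_divergence(native_seq, designed_seq)
print(f"recovery = {recovery:.3f}, KL divergence = {divergence:.3f}")
```

Recovery measures position-by-position agreement, whereas the divergence compares overall composition, so a design can score well on one and poorly on the other.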


Fig. 2. Optimizing the sequence of a eukaryotic calcium/proton exchanger in short- and long-chain phospholipid compositions.

(A) Sequence recovery after fixed-backbone redesign of the VDXC1 calcium/proton exchanger (PDB 4k1c) subdivided by exposure of side chains to the aqueous, interface, and lipid phases. Buried positions are excluded from the calculation. Dark grey bars represent recovery in DLPC, and light grey bars represent recovery in POPC. (B) and (C) show the redesigned models of VDXC1 in DLPC and POPC respectively. The color of each position indicates the exposure phase, with blue representing the aqueous phase, teal representing the interface, and grey representing the lipid phase.

Acknowledgements

R.F.A. is funded by a Hertz Foundation Fellowship and a National Science Foundation Graduate Research Fellowship. R.F.A. and J.J.G. are also funded by NIH Grant GM-078221. We also thank Priyamvada Prathima and Kathy Le for testing the protocols.

Footnotes

Several additional resources are available for building membrane protein engineering protocols within the Rosetta software suite:

Rosetta Forums (Q&A): https://www.rosettacommons.org/forum
Documentation: https://www.rosettacommons.org/docs/latest/Home
PyRosetta Workshops: http://www.pyrosetta.org/

References

1. Tan S, Tan HT, Chung MCM. Membrane proteins and membrane proteomics . Proteomics . 2008; 8 : 3924–3932. doi: 10.1002/pmic.200800597 [PubMed] [CrossRef] [Google Scholar]

2. Overington JP, Al-Lazikani B, Hopkins AL. How many drug targets are there? Nat Rev Drug Discov . 2006; 5 : 993–996. doi: 10.1038/nrd2199 [PubMed] [CrossRef] [Google Scholar]

3. Samish I, MacDermaid CM, Perez-Aguilar JM, Saven JG. Theoretical and Computational Protein Design . Annu Rev Phys Chem . 2011; 62 : 129–149. doi: 10.1146/annurev-physchem-032210-103509 [PubMed] [CrossRef] [Google Scholar]

4. Bill RM, Henderson PJF, Iwata S, Kunji ERS, Michel H, Neutze R, et al. Overcoming barriers to membrane protein structure determination. Nature Biotechnology . Nature Publishing Group; 2011. pp. 335–340. doi: 10.1038/nbt.1833 [PubMed] [CrossRef] [Google Scholar]

5. Koehler Leman J, Ulmschneider MB, Gray JJ. Computational modeling of membrane proteins . Proteins Struct Funct Bioinforma . 2015; 83 : 1–24. doi: 10.1002/prot.24703 [PMC free article] [PubMed] [CrossRef] [Google Scholar]

6. Perez-Aguilar JM, Saven JG. Computational design of membrane proteins. Structure . Cell Press; 2012. pp. 5–14. doi: 10.1016/j.str.2011.12.003 [PMC free article] [PubMed] [CrossRef] [Google Scholar]

7. Barth P, Senes A. Toward high-resolution computational design of the structure and function of helical membrane proteins. Nature Structural and Molecular Biology . Nature Publishing Group; 2016. pp. 475–480. doi: 10.1038/nsmb.3231 [PMC free article] [PubMed] [CrossRef] [Google Scholar]

8. Slusky JS. Outer membrane protein design. Current Opinion in Structural Biology . Elsevier Ltd; 2017. pp. 45–52. doi: 10.1016/j.sbi.2016.11.003 [PMC free article] [PubMed] [CrossRef] [Google Scholar]

9. Chen K-YM, Zhou F, Fryszczyn BG, Barth P. Naturally evolved G protein-coupled receptors adopt metastable conformations . 2012;109. doi: 10.1073/pnas.1205512109/-/DCSupplemental [PMC free article] [PubMed] [CrossRef] [Google Scholar]

10. Slovic AM, Kono H, Lear JD, Saven JG, DeGrado WF. Computational design of water-soluble analogues of the potassium channel KcsA . Proc Natl Acad Sci U S A . 2004; 101 : 1828–1833. doi: 10.1073/pnas.0306417101 [PMC free article] [PubMed] [CrossRef] [Google Scholar]

11. Perez-Aguilar JM, Xi J, Matsunaga F, Cui X, Selling B, Saven JG, et al. A Computationally Designed Water-Soluble Variant of a G-Protein-Coupled Receptor: The Human Mu Opioid Receptor. Zhang Y, editor . PLoS One . 2013; 8 : e66009. doi: 10.1371/journal.pone.0066009 [PMC free article] [PubMed] [CrossRef] [Google Scholar]

12. Young M, Dahoun T, Sokrat B, Arber C, Chen KM, Bouvier M, et al. Computational design of orthogonal membrane receptor-effector switches for rewiring signaling pathways . Proc Natl Acad Sci U S A . 2018; 115 : 7051–7056. doi: 10.1073/pnas.1718489115 [PMC free article] [PubMed] [CrossRef] [Google Scholar]

13. Kumar M, Grzelakowski M, Zilles J, Clark M, Meier W. Highly permeable polymeric membranes based on the incorporation of the functional water channel protein Aquaporin Z . 2007. Available: www.pnas.orgcgidoi10.1073pnas.0708762104 [PMC free article] [PubMed]

14. Chowdhury R, Ren T, Shankla M, Decker K, Grisewood M, Prabhakar J, et al. PoreDesigner for tuning solute selectivity in a robust and highly permeable outer membrane pore . Nat Commun . 2018; 9 : 1–10. doi: 10.1038/s41467-018-06097-1 [PMC free article] [PubMed] [CrossRef] [Google Scholar]

15. Huang PS, Boyken SE, Baker D. The coming of age of de novo protein design. Nature . Nature Publishing Group; 2016. pp. 320–327. doi: 10.1038/nature19946 [PubMed] [CrossRef] [Google Scholar]

16. Baker D What has de novo protein design taught us about protein folding and biophysics? Protein Sci . 2019; 28 : 678–683. doi: 10.1002/pro.3588 [PMC free article] [PubMed] [CrossRef] [Google Scholar]

17. King NP, Sheffler W, Sawaya MR, Vollmar BS, Sumida JP, André I, et al. Computational design of self-assembling protein nanomaterials with atomic level accuracy . Science (80-) . 2012; 336 : 1171–1174. doi: 10.1126/science.1219364 [PMC free article] [PubMed] [CrossRef] [Google Scholar]

18. Langan RA, Boyken SE, Ng AH, Samson JA, Dods G, Westbrook AM, et al. De novo design of bioactive protein switches . Nature . 2019; 572 : 205–210. doi: 10.1038/s41586-019-1432-8 [PMC free article] [PubMed] [CrossRef] [Google Scholar]

19. Humphris EL, Kortemme T. Design of multi-specificity in protein interfaces . PLoS Comput Biol . 2007; 3 : 1591–1604. doi: 10.1371/journal.pcbi.0030164 [PMC free article] [PubMed] [CrossRef] [Google Scholar]

20. Joh NH, Wang T, Bhate MP, Acharya R, Wu Y, Grabe M, et al. De novo design of a transmembrane zn2+-transporting four-helix bundle . Science (80- ) . 2014; 346 : 1520–1524. doi: 10.1126/science.1261172 [PMC free article] [PubMed] [CrossRef] [Google Scholar]

21. Korendovych I V, Senes A, Kim YH, Lear JD, Fry HC, Therien MJ, et al. De novo design and molecular assembly of a transmembrane diporphyrin-binding protein complex . J Am Chem Soc . 2010; 132 : 15516–15518. doi: 10.1021/ja107487b [PMC free article] [PubMed] [CrossRef] [Google Scholar]

22. Lu P, Min D, DiMaio F, Wei KY, Vahey MD, Boyken SE, et al. Accurate computational design of multipass transmembrane proteins . Science . 2018; 359 : 1042–1046. doi: 10.1126/science.aaq1739 [PMC free article] [PubMed] [CrossRef] [Google Scholar]

23. Koehler Leman J, Bonneau R, Ulmschneider MB. Statistically derived asymmetric membrane potentials from α-helical and β-barrel membrane proteins . Sci Rep . 2018; 8 : 4446. doi: 10.1038/s41598-018-22476-6 [PMC free article] [PubMed] [CrossRef] [Google Scholar]

24. Senes A, Chadi DC, Law PB, Walters RFS, Nanda V, DeGrado WF. Ez, a Depth-dependent Potential for Assessing the Energies of Insertion of Amino Acid Side-chains into Membranes: Derivation and Applications to Determining the Orientation of Transmembrane and Interfacial Helices . J Mol Biol . 2007; 366 : 436–448. doi: 10.1016/j.jmb.2006.09.020 [PubMed] [CrossRef] [Google Scholar]

25. Yarov-Yarovoy V, Schonbrun J, Baker D. Multipass membrane protein structure prediction using Rosetta . Proteins Struct Funct Bioinforma . 2005; 62 : 1010–1025. doi: 10.1002/prot.20817 [PMC free article] [PubMed] [CrossRef] [Google Scholar]

26. Lazaridis T Effective energy function for proteins in lipid membranes . Proteins Struct Funct Genet . 2003; 52 : 176–192. doi: 10.1002/prot.10410 [PubMed] [CrossRef] [Google Scholar]

27. Lazaridis T, Karplus M. Effective energy function for proteins in solution . Proteins . 1999; 35 : 133–52. Available: http://www.ncbi.nlm.nih.gov/pubmed/10223287 [PubMed] [Google Scholar]

28. Barth P, Schonbrun J, Baker D. Toward high-resolution prediction and design of transmembrane helical protein structures . Proc Natl Acad Sci . 2007; 104 : 15682–15687. doi: 10.1073/pnas.0702515104 [PMC free article] [PubMed] [CrossRef] [Google Scholar]

29. MacCallum JL, Bennett WFD, Tieleman DP. Distribution of amino acids in a lipid bilayer from computer simulations. Biophys J. 2008; 94: 3393–404. doi: 10.1529/biophysj.107.112805

30. Kroncke BM, Duran AM, Mendenhall JL, Meiler J, Blume JD, Sanders CR. Documentation of an Imperative To Improve Methods for Predicting Membrane Protein Stability. Biochemistry. 2016; 55: 5002–9. doi: 10.1021/acs.biochem.6b00537

31. Duran AM, Meiler J. Computational design of membrane proteins using RosettaMembrane. Protein Sci. 2018; 27: 341–355. doi: 10.1002/pro.3335

32. De Marothy MT, Elofsson A. Marginally hydrophobic transmembrane α-helices shaping membrane protein folding. Protein Science. Blackwell Publishing Ltd; 2015. pp. 1057–1074. doi: 10.1002/pro.2698

33. Alford RF, Fleming PJ, Fleming KG, Gray JJ. Protein structure prediction and design in a biologically realistic implicit membrane. Biophys J. 2020.

34. Leaver-Fay A, Tyka M, Lewis SM, Lange OF, Thompson J, Jacak R, et al. Rosetta3: An object-oriented software suite for the simulation and design of macromolecules. Methods in Enzymology. 2011. pp. 545–574. doi: 10.1016/B978-0-12-381270-4.00019-6

35. Alford RF, Koehler Leman J, Weitzner BD, Duran AM, Tilley DC, Elazar A, et al. An Integrated Framework Advancing Membrane Protein Modeling and Design. Livesay DR, editor. PLOS Comput Biol. 2015; 11: e1004398.

36. Moon CP, Fleming KG. Side-chain hydrophobicity scale derived from transmembrane protein folding into lipid bilayers. Proc Natl Acad Sci. 2011. doi: 10.1073/pnas.1103979108

37. Feig M. Implicit membrane models for membrane protein simulation. Methods Mol Biol. 2008; 443: 181–196. doi: 10.1007/978-1-59745-177-2_10

38. Alford RF, Leaver-Fay A, Jeliazkov JR, O’Meara MJ, DiMaio FP, Park H, et al. The Rosetta All-Atom Energy Function for Macromolecular Modeling and Design. J Chem Theory Comput. 2017; 13: 3031–3048. doi: 10.1021/acs.jctc.7b00125

39. Park H, Bradley P, Greisen P, Liu Y, Mulligan VK, Kim DE, et al. Simultaneous Optimization of Biomolecular Energy Functions on Features from Small Molecules and Macromolecules. J Chem Theory Comput. 2016; 12: 6201–6212. doi: 10.1021/acs.jctc.6b00819

40. Cuesta-Seijo JA, Neale C, Khan MA, Moktar J, Tran CD, Bishop RE, et al. PagP crystallized from SDS/Cosolvent reveals the route for phospholipid access to the hydrocarbon ruler. Structure. 2010; 18: 1210–1219.

41. Guo L, Lim KB, Poduje CM, Daniel M, Gunn JS, Hackett M, et al. Lipid A acylation and bacterial resistance against vertebrate antimicrobial peptides. Cell. 1998; 95: 189–198.

42. Marx DC, Fleming KG. Influence of Protein Scaffold on Side-Chain Transfer Free Energies. Biophys J. 2017; 113: 597–604.

43. Waight AB, Pedersen BP, Schlessinger A, Bonomi M, Chau BH, Roe-Zurz Z, et al. Structural basis for alternating access of a eukaryotic calcium/proton exchanger. Nature. 2013; 499: 107–110.

44. Shigaki T, Rees I, Nakhleh L, Hirschi KD. Identification of three distinct phylogenetic groups of CAX cation/proton antiporters. J Mol Evol. 2006; 63: 815–825.

45. Hirschi KD, Zhen RG, Cunningham KW, Rea PA, Fink GR. CAX1, an H+/Ca2+ antiporter from Arabidopsis. Proc Natl Acad Sci U S A. 1996; 93: 8782–8786.

46. Leaver-Fay A, O’Meara MJ, Tyka M, Jacak R, Song Y, Kellogg EH, et al. Scientific benchmarks for guiding macromolecular energy function improvement. Methods in Enzymology. Academic Press Inc.; 2013. pp. 109–143.

47. Lomize MA, Pogozheva ID, Joo H, Mosberg HI, Lomize AL. OPM database and PPM web server: resources for positioning of proteins in membranes.