All methods described below should use aseptic techniques; yeast are grown at 30°C, and Escherichia coli at 37°C.
1 Constructing and Transforming a Bait Protein
A prerequisite for an interactor hunt is the construction of plasmids that express the protein of interest as a fusion to the bacterial protein LexA (or some other DNA-binding domain). These plasmids are transformed into a yeast reporter strain to assess the suitability of the bait proteins for library screening. Comparison with established controls determines whether the baits are appropriately synthesized, not transcriptionally active, and not toxic. If any of these conditions is not met, strategies for modifying bait or screening conditions are suggested. To minimize the chance of artifactual results or other difficulties, it is a good idea to move rapidly through the suggested characterization steps before undertaking a library screen. Although plasmids will be retained for extended periods in yeast maintained at 4°C on stock plates, it is recommended to perform the next step in the protocol (e.g., mating or characterization) within a week of the initial introduction of the plasmid(s) into the yeast, to avoid variable protein expression and transcriptional activation problems.
1) Clone the DNA encoding the protein of interest into the polylinker of pMW103 to enable synthesis of an in-frame protein fusion to LexA.
2) Select a colony of SKY48 and prepare competent yeast.
3) Transform competent yeast cells with the following combinations of LexA fusion and pMW109 plasmids (100–500 ng each):
a) pBait + pMW109 (test for activation)
b) pEG202-hsRPB7 + pMW109 (weak positive control for activation)
c) pSH17-4 + pMW109 (strong positive control for activation)
d) pEG202-Ras + pMW109 (negative control for activation)
4) Plate each transformation mixture on Glu/CM –Ura–His dropout plates, and maintain for 2 to 3 d to select for yeast colonies containing transformed plasmids.
2 Assessing Bait Activation and Expression
2.1 Replica Technique/Gridding Yeast: Assessing Activation of Auxotrophic Reporter
For each combination of plasmids, assay at least six independent colonies for activation phenotype of auxotrophic and colorimetric reporters. Assessment of transcriptional activation requires the transfer of yeast from master plates to a variety of selective media. This transfer can be accomplished simply by using a sterile toothpick to move cells from individual patches on the master plate to each of the selective media. However, in cases in which large numbers of colonies and combinations of bait and prey are to be examined, and particularly in genomic-scale applications, it is useful to use a transfer technique that facilitates high-throughput analysis. The following technique, based on microtiter plates, is an example of such an approach.
1) Add approx 50 μL of sterile water to each well of one-half (wells A1–H6) of a 96-well microtiter plate (a syringe-based repeater or a multichannel pipettor can be used). Place an insert grid from a rack of pipet tips over the top of the microtiter plate and attach it with tape. The holes in the insert grid should sit exactly over the wells of the microtiter plate; the grid is essential to stabilize the tips in the plate and to allow their simultaneous removal.
2) Using sterile plastic pipet tips, pick six yeast colonies (1 to 2 mm diameter) from each of the transformation plates a–d (Subheading 1., step 3). Leave the tips supported by the insert grid until all the colonies have been picked.
3) Swirl the plate gently to mix the yeast into suspension, remove the tape attaching the grid to the plate, and lift the insert grid, thereby removing all the tips at once.
4) Use a replicator to plate yeast suspensions (each spoke will leave a drop approximately equal to a 3-μL vol) on the following plates: Glu/CM –Ura–His (a new master plate); Gal-Raff/CM –Ura–His–Leu; and Gal-Raff/CM –Ura–His. Incubate the plates for up to 4 d, and save the Glu/CM –Ura–His master plate at 4°C.
5) Yeast containing the strong positive control (c) should be detectably growing on the –Ura–His–Leu plate within 1–2 d; yeast with the weak positive control (b), within 4 d; and the negative control (d) should not grow. If the bait under test (a) shows no growth in this period, it is probably suitable for library screening. If it gives a profile similar to (b), it may be suitable but is likely to have a high background in library screening, so use of a different screening strain may be appropriate. If it is similar to (c), it must be reconfigured.
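The decision rule in step 5 can be written down as a small helper for record keeping. This is only a sketch: the function names and label strings are hypothetical, and the day thresholds are taken directly from the growth times quoted in step 5 (strong control within 1–2 d, weak control within 4 d, no growth for the negative control).

```python
from typing import Optional

# Thresholds taken from step 5 of the protocol text.
STRONG_MAX_DAYS = 2  # strong positive control (c) grows within 1-2 d
WEAK_MAX_DAYS = 4    # weak positive control (b) grows within 4 d


def classify_bait(days_to_growth: Optional[float]) -> str:
    """Classify a bait from the day growth first appears on -Ura-His-Leu.

    days_to_growth is None when no growth is seen during the 4 d incubation.
    The returned labels are illustrative, not part of the protocol.
    """
    if days_to_growth is None or days_to_growth > WEAK_MAX_DAYS:
        return "suitable"                # behaves like the negative control (d)
    if days_to_growth <= STRONG_MAX_DAYS:
        return "reconfigure"             # behaves like the strong control (c)
    return "high background likely"      # behaves like the weak control (b)


def controls_ok(strong: Optional[float],
                weak: Optional[float],
                negative: Optional[float]) -> bool:
    """Check that the three control transformations behaved as expected."""
    return (strong is not None and strong <= STRONG_MAX_DAYS
            and weak is not None and weak <= WEAK_MAX_DAYS
            and negative is None)
```

A bait should only be classified this way when `controls_ok` returns True for the same set of plates; otherwise the plates themselves are suspect.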
2.2 Assessing Activation of Colorimetric Reporter
Approx 24 to 30 h after the plating, overlay the Gal-Raff/CM –Ura–His plate with X-Gal agarose as follows:
1) Gently overlay each plate with chloroform, pipetting slowly in from the side so as not to smear colonies. Leave colonies completely covered for 5 min. Caution: CHCl3 is quite toxic, and should neither be inhaled nor come into contact with skin. Wear gloves and work in a chemical hood. Try to avoid extensive contact with the walls of the plate, as the plastic dissolved in CHCl3 may leave a film on the agar/colonies surface.
2) Briefly overlay the plates with another approx 5 mL chloroform (optional), drain, and let dry, uncovered, for another 5 min at 37°C or for 10 min in the chemical hood.
3) Overlay the plate with approx 10 mL of X-Gal-agarose, making sure that all yeast spots are completely covered.
4) Incubate plates at 30°C and monitor for color changes. It is generally useful to check the plates after 20 min, and again after 1 to 3 h. Strongly activating baits will be detectable as dark blue colonies in 20 to 60 min, whereas negative controls should remain as faint blue or white colonies; an optimal bait would either mimic the negative control or only develop faint blue color.
In an optimal result, all six colonies assayed representing the same transformation would have essentially the same phenotype. For a small number of baits, this is not the case. The most typical deviation is that of six colonies assayed for the bait, some fraction appears to be inactive (white in colorimetric assay, and not growing on auxotrophic selection medium), while the remaining fraction displays some degree of blueness and growth. Do not select the white, nongrowing colonies as the starting point in a library screen; often, these colonies are synthesizing little or no bait protein.
2.3 Detection of Bait Protein Expression
One excellent confirmation that a bait protein is correctly expressed would be its specific interaction with a known partner, expressed as an activation-domain fusion protein. In the absence of such confirmation, Western analysis of lysates of yeast containing DBD-fused baits is helpful in characterization of the bait's expression level and size. Some proteins (especially where the fusion domain is approx 60–80 kDa or larger) may either be synthesized at very low levels, or be posttranslationally clipped by yeast proteases. Proteins expressed at low levels, and apparently inactive in transcriptional activation assays, can be epigenetically upregulated to much higher levels under the auxotrophic selection and suddenly demonstrate a high background of transcriptional activation. Where proteins are proteolytically clipped, screens might inadvertently be performed with DBD fused only to the amino-terminal end of the larger intended bait. Either of the above two problems can lead to complications in library screens.
3 Transforming a Library, and Characterizing Interactors From a Screen
Currently, the most convenient source of libraries suitable for the interaction trap is commercial. The following protocols are designed with the goal of saturation screening of a cDNA library derived from a genome of mammalian complexity. Fewer plates will be required for screens with libraries derived from organisms with less complex genomes, and researchers should scale back accordingly. It is generally a good idea to additionally mate new bait strains with a negative control strain. The control strain is the same strain used for the library but containing the library vector with no cDNA insert. This control will provide a clear estimate of the frequency of cDNA-independent false positives, which is important to know when deciding how many positives to pick and characterize.
4 DNA Isolation and Second Confirmation of Positive Interactions
Execution of the above protocols for a given bait will result in the isolation of between zero and hundreds of potential "positive" interactors. These positives must next be evaluated for reproducible phenotype, and for specificity of interaction with the bait used to select them. If a large number of positives are obtained, these subsequent characterizations require prioritization. In this case, select up to approx 24–48 independent colonies with robust phenotype for the first round of characterization, while maintaining a master plate of additional positives at 4°C. This first analysis set will be tested for specificity, and screened by PCR/restriction analysis and/or sequencing to determine whether clusters of frequently isolated cDNAs are obtained: such clusters are generally a good indication of a specific interaction.
Two strategies are available for this characterization, one based on bulk yeast plasmid recovery and one based on bulk PCR. Both use similar methods, but the order in which the techniques are applied differs; the choice depends on whether the individual investigator would rather spend time and money on bulk yeast plasmid recovery or on bulk PCR. The PCR-based protocol is generally 1–3 d faster, but is not as reproducibly accomplished in some investigators' hands.
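Once the first analysis set has been sequenced or fingerprinted, the search for clusters of frequently isolated cDNAs mentioned above is a simple grouping task. The sketch below assumes each positive colony has already been assigned a cDNA identity (for example, a gene name from sequencing or a restriction-pattern label); the function name, colony IDs, and cDNA names are hypothetical.

```python
from collections import defaultdict


def cluster_positives(positives):
    """Group positive colonies by cDNA identity, largest cluster first.

    positives: dict mapping colony ID -> cDNA identity (assumed to be
    known already from sequencing or PCR/restriction analysis).
    Returns a list of (cDNA identity, [colony IDs]) tuples.
    """
    clusters = defaultdict(list)
    for colony, cdna in positives.items():
        clusters[cdna].append(colony)
    # Sort by descending cluster size, then by name for a stable order.
    return sorted(clusters.items(), key=lambda kv: (-len(kv[1]), kv[0]))


if __name__ == "__main__":
    example = {"c1": "geneX", "c2": "geneY", "c3": "geneX", "c4": "geneX"}
    for cdna, colonies in cluster_positives(example):
        print(cdna, len(colonies), colonies)
```

cDNAs recovered from several independent colonies appear at the top of the list and are natural candidates for the specificity tests.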
5 Reiterative Scale-Up
One approach to elaborating a protein network is to perform reiterative interactor hunts. Such hunts can start with one or several different baits known to be involved in the biology under study. Subsequent interactor hunts can then be performed using the newly isolated proteins as baits. Such a protein "interaction walk" can be performed using standard protocols like those described earlier in this chapter. If the goal is to elaborate a large protein network that may require many interactor hunts, the rate-limiting step can become subcloning and characterizing new baits. This process can be streamlined by making new bait-expressing plasmids from newly isolated library clones by PCR and in vivo recombination. In this approach, a single set of primers is used that have homology to the library vector immediately upstream and downstream of the cDNA insertion site and that also have 5' tails that are homologous to the bait vector. Amplification of library clones with these primers results in a product that can be co-transformed along with a linearized bait vector. The PCR product will be recombined into the bait vector by homologous recombination in the yeast. The resulting yeast strain can be used directly in an interactor hunt by mating with an aliquot of frozen pretransformed library strain, as described in Subheading 3.3.
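The primer layout described above (a 3' portion that anneals to the library vector next to the cDNA insertion site, plus a 5' tail homologous to the bait vector) can be sketched as follows. All sequences passed in are placeholders to be supplied by the user; no real vector sequence is assumed, and the function names are hypothetical. The sketch only concatenates sequences; it does not check melting temperature or that the fusion stays in frame with LexA.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def recombination_primers(bait_left: str, bait_right: str,
                          lib_upstream: str, lib_downstream: str):
    """Build forward and reverse primers for PCR and in vivo recombination.

    All four arguments are top-strand sequences written 5'->3':
      bait_left, bait_right     bait-vector sequence on either side of the
                                site where the vector is linearized
      lib_upstream, lib_downstream
                                library-vector sequence immediately
                                flanking the cDNA insertion site
    The expected PCR product (top strand) is
      bait_left + lib_upstream + cDNA + lib_downstream + bait_right.
    """
    forward = bait_left + lib_upstream                       # tail, then annealing part
    reverse = reverse_complement(lib_downstream + bait_right)
    return forward, reverse
```

Because the annealing portions match the library vector rather than any particular cDNA, the same primer pair can be reused for every library clone, which is what makes the approach suitable for reiterative hunts.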
Before the new bait strain is used, it should be tested to ensure that it expresses the new bait. The most efficient way to accomplish this is to mate the new bait strain (expressing bait B) with a strain expressing the original protein A (previously used as a bait) as an activation domain fusion (AD-A). This will require subcloning the original bait into the library plasmid. Successful interaction can streamline the approach by dispensing with some of the bait characterization steps, e.g., Western assay. For each subsequent iteration of the protein walk, the new bait strain can be tested by mating it with strains expressing the previously isolated library clones.
An added benefit to this approach is that each interaction will get tested twice, each time with the two proteins expressed with different fusion moieties. However, there are many documented cases in which a two-hybrid interaction is not detectable when the DNA-binding domain (DBD) and AD are swapped, especially when full-length proteins are used.
6 Array Screening
Arrays can also be used as a general tool for screening defined subsets of proteins. When combined with bioinformatic selection of candidate interactors, arrays are powerful tools to identify new interactions. For example, we know that SH3 domains usually bind to proline-rich sequences. We can therefore predict all proline-rich candidate interactors for SH3 domains from complete genome sequences. These open reading frames can be amplified from cDNA libraries or genome sequences and cloned into prey vectors by recombination. Instead of screening a complex library, it may be sufficient to screen such a defined subset of preys. Prey sets can be selected based on a variety of criteria, e.g., sequence/homology, subcellular location, known or presumed function, and so on. Ultimately, whole genomes can be cloned into prey vectors, arrayed, and screened, as has been done with the yeast and Drosophila genomes.
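As an illustration of bioinformatic prey selection, the sketch below scans protein sequences for the PxxP core motif, a commonly used minimal signature of SH3-binding proline-rich sequences. The choice of PxxP as the selection criterion, the function names, and the example sequences are assumptions made for this example; a real selection would likely use a more specific motif or profile.

```python
import re

# Lookahead so that overlapping motifs (e.g. in "PPPPP") are all reported.
PXXP = re.compile(r"(?=(P..P))")


def pxxp_positions(protein_seq: str):
    """Return 1-based start positions of every PxxP motif in a protein."""
    return [m.start() + 1 for m in PXXP.finditer(protein_seq.upper())]


def select_candidates(proteins, min_motifs=1):
    """Return names of proteins with at least min_motifs PxxP motifs.

    proteins: dict mapping protein name -> amino acid sequence
    (one-letter code).
    """
    return sorted(name for name, seq in proteins.items()
                  if len(pxxp_positions(seq)) >= min_motifs)


if __name__ == "__main__":
    preys = {"prey1": "MAPPLPRKA", "prey2": "MKTAYIAK"}  # made-up sequences
    print(select_candidates(preys))
```

The resulting list defines the subset of open reading frames to amplify and clone into the prey vector.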
The main advantage of an array screen is its parallel nature, which allows one to compare preys as well as baits. Different preys have different affinities for any given bait; this will be indicated by different colony sizes on selective plates, or different color intensities when lacZ assays are performed. Furthermore, different baits can be compared when they are screened against the same prey array: some will generate only a few or no positives, while others will cause even activation across the array. A certain group of baits will cause "random" activation, i.e., a relatively large number of positives with no or almost no background activation. In a "conventional" pretest for a library screen, such baits may not stand out because they are not tested against enough preys.
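The bait comparison described above can be made explicit by computing, for each bait, the fraction of preys on the array that score positive. In the sketch below the two cutoffs (0.9 and 0.2) are arbitrary illustrative values, not taken from the protocol, and the function names and labels are hypothetical; in practice the cutoffs would be set by inspecting the distribution across many baits.

```python
def positive_fraction(row):
    """Fraction of preys scored positive (truthy) for one bait."""
    if not row:
        raise ValueError("empty prey row")
    return sum(1 for hit in row if hit) / len(row)


def classify_baits(array_results, even_cutoff=0.9, random_cutoff=0.2):
    """Label each bait by how many preys on the same array it activates.

    array_results: dict mapping bait name -> list of 0/1 scores, one per
    prey, with all baits screened against the same prey array.
    even_cutoff and random_cutoff are illustrative assumptions.
    """
    labels = {}
    for bait, row in array_results.items():
        frac = positive_fraction(row)
        if frac >= even_cutoff:
            labels[bait] = "even activation"
        elif frac >= random_cutoff:
            labels[bait] = "random activation"
        else:
            labels[bait] = "few or no positives"
    return labels
```

Replacing the 0/1 scores with colony sizes or color intensities would allow the same table to be used for comparing preys against a single bait.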
7 Data Analysis and Bioinformatics Aspects
With the advent of large-scale interaction screening, systematic collection of interaction data in a database became essential. In examining such databases, besides the obvious goal of identifying which other interactors are known for the protein of interest, a number of other questions generally arise. Do the interacting proteins for the protein of interest have homologs in other species, and do these homologs interact as well? With what additional proteins do the interactors interact? Is an interaction chain containing several connected proteins part of a single protein complex, or an ordered signaling pathway?
Integration and visualization of interactions from different sources becomes increasingly challenging.
Interestingly, the increasing number of interactions recorded in the public domain has not only created a tremendous need for visualization tools, but has also prompted a surprisingly large number of studies that analyze interaction networks. For example, it has been shown that protein interaction networks appear to have a scale-free topology, i.e., they have a few hubs of highly connected proteins and many less-well-connected proteins.
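The hub structure mentioned above can be examined directly from a list of binary interactions by counting how many distinct partners each protein has. This is a minimal sketch using only the Python standard library; the function names are hypothetical, and the degree threshold that defines a "hub" is left to the user because the text does not specify one.

```python
from collections import Counter


def degrees(interactions):
    """Count distinct interaction partners per protein.

    interactions: iterable of (protein_a, protein_b) pairs. Duplicate
    pairs (including A-B reported as B-A) are counted once, and
    self-interactions are ignored.
    """
    unique = {frozenset(pair) for pair in interactions if pair[0] != pair[1]}
    deg = Counter()
    for pair in unique:
        for protein in pair:
            deg[protein] += 1
    return deg


def degree_distribution(deg):
    """Map each degree k to the number of proteins having that degree."""
    return Counter(deg.values())


def hubs(deg, min_degree):
    """Return proteins with at least min_degree partners, sorted by name."""
    return sorted(p for p, k in deg.items() if k >= min_degree)
```

In a scale-free network the degree distribution is heavy-tailed: most proteins have one or two partners, and a small number have many.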
Hubs in such networks are indeed more essential to a cell than peripheral proteins when mutated, as would be predicted by their central position. Interaction networks can be used to predict the function of previously uncharacterized proteins, because interacting proteins usually have related activities. Similarly, interactions can be used to predict protein complexes based on local clusters ("cliques") of interactions.
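One way to look for the local clusters mentioned above is to enumerate maximal cliques, i.e., sets of proteins in which every pair is reported to interact. The sketch below uses the basic Bron-Kerbosch algorithm without pivoting, which is adequate for small networks; treating "cliques" in this strict graph-theoretic sense is an assumption, since real complexes are rarely fully connected in two-hybrid data. Function names are hypothetical.

```python
def build_adjacency(interactions):
    """Build an undirected adjacency map from (a, b) interaction pairs."""
    adj = {}
    for a, b in interactions:
        if a == b:
            continue  # self-interactions do not contribute to cliques
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def maximal_cliques(adj, min_size=3):
    """Enumerate maximal cliques with at least min_size members."""
    found = []

    def expand(r, p, x):
        # r: current clique; p: candidates to extend it; x: already explored
        if not p and not x:
            if len(r) >= min_size:
                found.append(sorted(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return sorted(found)


if __name__ == "__main__":
    pairs = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
    print(maximal_cliques(build_adjacency(pairs)))
```

For genome-scale networks a pivoting variant or an established graph library would be a more practical choice.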
Two-hybrid interactions can also be used to predict interaction domains. When random libraries are used for screening, overlapping cDNA or genomic DNA fragments may inherently narrow down interacting fragments. However, even large-scale screens that used full-length clones contain enough information to select pairs of proteins that share common domains, subsequently allowing domain identification by sequence comparison.
Finally, with a growing database of interactions, comparative "interactomics" becomes possible. We can now analyze the evolution of protein interactions and even of whole interaction networks. Combined with data from structural proteomics, this should allow us to understand which amino acids are important for the structure and function of protein complexes. Eventually, this may even be exploited for the prediction of inhibitors or other compounds that affect protein function.
However, we are not quite there yet. We need more data, better software and databases, and most importantly, better integration of various data sources. Eventually, this will lead to a merger of many different areas of molecular biology into what is now called systems biology.
Reference
- Walker, J. M. (Ed.). (2005). The Proteomics Protocols Handbook. Humana Press.