BIO312 Final Exam List

Protein Structure and Function

Author

TC-tea

Published

2024.06.03

Tool

Courseware

Lecture

Exercise

Code

De Novo for M/Z plot

Example of M/Z analysis

# De Novo sequencing
def find_matching_indices(data, values):
    indices = []
    for i in range(len(data)):
        for j in range(i + 1, len(data)):
            diff = abs(data[i] - data[j])
            if any(abs(diff - value) < 5e-1 for value in values):  # keep pairs whose difference is within 0.5 Da of a residue mass
                indices.append((i + 1, j + 1, diff))
    return indices
  
# M/Z peak data
data = [175.12, 262.12, 274.19, 375.21, 446.23, 487.32, 602.32, 730.43, 859.45, 875.42, 916.49, 987.52, 1100.61,
        1247.69, 1360.9]
        
# Mass of amino acids
values = [57.02147, 71.03712, 87.03203, 97.05277, 99.06842, 101.04768, 103.00919, 113.08407, 114.04293, 115.02695,
          128.05858, 128.09497, 129.04260, 131.04049, 137.05891, 147.06842, 156.10112, 163.06333, 186.07932]
          
# Calculate the difference and find the same position as the given value
matching_indices = find_matching_indices(data, values)

# Results
for index in matching_indices:
    print(f"#{index[0]} - #{index[1]} = {index[2]:.2f}")

#1 - #2 = 87.00
#1 - #3 = 99.07
#2 - #4 = 113.09
#3 - #4 = 101.02
#4 - #5 = 71.02
#5 - #7 = 156.09
#6 - #7 = 115.00
#7 - #8 = 128.11
#8 - #9 = 129.02
#8 - #11 = 186.06
#9 - #11 = 57.04
#9 - #12 = 128.07
#11 - #12 = 71.03
#12 - #13 = 113.09
#13 - #14 = 147.08
#14 - #15 = 113.21
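
The differences above can be turned into residue assignments by looking each one up against the same mass table. This is a minimal sketch, assuming the entries of `values` are the standard monoisotopic residue masses in the order G, A, S, P, V, T, C, I/L, N, D, Q, K, E, M, H, F, R, Y, W. The names `RESIDUE_MASSES` and `label_difference` are illustrative and are not part of the original code.

```python
# Residue masses copied from `values`; the one-letter labels are an assumption
# based on standard monoisotopic residue masses.
RESIDUE_MASSES = {
    "G": 57.02147, "A": 71.03712, "S": 87.03203, "P": 97.05277, "V": 99.06842,
    "T": 101.04768, "C": 103.00919, "I/L": 113.08407, "N": 114.04293,
    "D": 115.02695, "Q": 128.05858, "K": 128.09497, "E": 129.04260,
    "M": 131.04049, "H": 137.05891, "F": 147.06842, "R": 156.10112,
    "Y": 163.06333, "W": 186.07932,
}


def label_difference(diff, tolerance=0.5):
    """Return every residue whose mass lies within `tolerance` Da of `diff`."""
    return [aa for aa, mass in RESIDUE_MASSES.items() if abs(diff - mass) < tolerance]


# Three of the peak pairs reported in the results above.
print(label_difference(274.19 - 175.12))  # pair #1 - #3
print(label_difference(730.43 - 602.32))  # pair #7 - #8
print(label_difference(859.45 - 730.43))  # pair #8 - #9
```

Q and K differ by only about 0.036 Da, so a 0.5 Da tolerance cannot separate them. That is why such a step is reported as Q/K.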

Molecular weight of mass spectrum

Example of mass spectrum

# Charge of the m/z 3390.4 peak, from two adjacent peaks of the series (19+ 18+ 17+ 16+ 15+)
z18 = (3212.1-1.0078)/(3390.4-3212.1)
z18

18.0094907459338
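
The expression above follows from writing two adjacent peaks of the same species as m/z_low = (M + (z+1)H)/(z+1) and m/z_high = (M + zH)/z, then eliminating M, which gives z = (m/z_low - H)/(m/z_high - m/z_low) for the higher-m/z peak. This is a minimal sketch that applies the same formula to every adjacent pair in the series. The function name `charge_of_higher_peak` is illustrative, and it assumes all five peaks belong to one charge-state envelope.

```python
PROTON = 1.0078  # proton mass used in the calculation above


def charge_of_higher_peak(mz_low, mz_high, proton=PROTON):
    """Charge of the peak at mz_high, given the adjacent peak at mz_low (charge z + 1)."""
    return (mz_low - proton) / (mz_high - mz_low)


series = [3212.1, 3390.4, 3589.8, 3814.1, 4068.3]
for mz_low, mz_high in zip(series, series[1:]):
    z = charge_of_higher_peak(mz_low, mz_high)
    print(f"m/z {mz_high}: z = {z:.2f} -> {round(z)}+")
```

Each estimate should land close to an integer. A value far from an integer would suggest that the two peaks are not neighbours in the same series.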

# Mass of protein A
mA = ((12*2128.1-12)+(11*2321.5-11)+(10*2553.5-10)+(9*2837.1-9))/4
mA

25525.149999999998

# Mass of AB complex
mAB = ((19*3212.1-19)+(18*3390.4-18)+(17*3589.8-17)+(16*3814.1-16)+(15*4068.3-15))/5
mAB

61009.76000000001
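
The two averages above can be written as one reusable function. This is a minimal sketch with an illustrative function name, `average_neutral_mass`. It keeps the proton mass as a parameter because the mass calculations above use 1 Da, while the charge calculation uses 1.0078 Da.

```python
def average_neutral_mass(peaks, proton=1.0):
    """Average of M = z * (m/z) - z * proton over (charge, m/z) pairs."""
    masses = [z * mz - z * proton for z, mz in peaks]
    return sum(masses) / len(masses)


protein_a = [(12, 2128.1), (11, 2321.5), (10, 2553.5), (9, 2837.1)]
complex_ab = [(19, 3212.1), (18, 3390.4), (17, 3589.8), (16, 3814.1), (15, 4068.3)]

print(f"{average_neutral_mass(protein_a):.2f}")
print(f"{average_neutral_mass(complex_ab):.2f}")
# Same data with the more precise proton mass, for comparison.
print(f"{average_neutral_mass(complex_ab, proton=1.0078):.2f}")
```

Switching the proton mass shifts each estimate by the average charge times 0.0078 Da. That is small compared with the spread between the individual charge states.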

Past exam paper

BIO312 final exam (2022-2023)

Question 1

Q1: 2D-DIGE gel & mass spectrometry

Solution:

They have different isoelectric points (pI, the pH at which each protein focuses on the gel), and mass spectrometry can be used to distinguish these three proteins.
Steps:
- Excise the protein spots from the gel.
- Digest the proteins into peptides using trypsin.
- Extract and purify the peptides from the gel pieces.
- Ionize the peptides using techniques like Electrospray Ionization (ESI) or Matrix-Assisted Laser Desorption/Ionization (MALDI).
- Analyze the peptides using a Mass Spectrometer to obtain their mass-to-charge ratios (m/z).
- Select the peptides of interest for fragmentation.
- Fragment the selected peptides using Collision-Induced Dissociation (CID) or Higher Energy Collisional Dissociation (HCD).
- Analyze the resulting fragment ions to identify characteristic signs of phosphorylation, such as loss of the phosphate group (e.g., neutral loss of H3PO4, about 98 Da).
- Use bioinformatics tools to match the peptide sequences and confirm the presence of phosphorylated tyrosine residues.

Question 2

Q2: MS spectrum

Solution:

(681.45*2)-2=1360.9
D+P | I+V | L+V
Same as CW1 Q1: N F I/L A G E Q/K D X X V R (XX could be NV, VN, RG, or GR).
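
The arithmetic behind this answer can be checked with a short script. This is a minimal sketch, assuming that the XX gap lies between the 274.19 and 487.32 peaks of the M/Z peak list used earlier in this document, and that the residue labels attached to the mass table are the standard ones. The helper names `neutral_mass` and `residue_pairs` are illustrative.

```python
from itertools import combinations_with_replacement

# Monoisotopic residue masses (same numbers as the `values` list earlier);
# the one-letter labels are an assumption.
RESIDUE_MASSES = {
    "G": 57.02147, "A": 71.03712, "S": 87.03203, "P": 97.05277, "V": 99.06842,
    "T": 101.04768, "C": 103.00919, "I/L": 113.08407, "N": 114.04293,
    "D": 115.02695, "Q": 128.05858, "K": 128.09497, "E": 129.04260,
    "M": 131.04049, "H": 137.05891, "F": 147.06842, "R": 156.10112,
    "Y": 163.06333, "W": 186.07932,
}


def neutral_mass(mz, charge, proton=1.0):
    """Neutral mass from m/z and charge, using the same proton mass as the answer."""
    return mz * charge - charge * proton


def residue_pairs(gap, tolerance=0.5):
    """Unordered residue pairs whose summed mass is within `tolerance` Da of `gap`."""
    return [
        (a, b)
        for (a, mass_a), (b, mass_b) in combinations_with_replacement(RESIDUE_MASSES.items(), 2)
        if abs(mass_a + mass_b - gap) < tolerance
    ]


print(round(neutral_mass(681.45, 2), 2))
print(residue_pairs(487.32 - 274.19))
```

The pairs are unordered, so each one stands for both possible orders of the two residues.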

Question 3

Q3: MS and MS/MS spectra

Solution:

m/z=693.1 & m/z=1422.0
Steps:
- Perform Spike-In Experiments: Introduce a known quantity of the NHL1 peptide into the sample to confirm that the selected transitions are detectable and accurately quantify the peptide.
- Optimize SRM Parameters: Adjust the collision energy and other SRM settings to ensure the selected transitions produce the most intense and specific signals.
- Analyse MS/MS Spectra: Confirm that the selected transitions match the expected fragment ions by comparing the experimental spectra to theoretical predictions or database entries.
- Evaluate Reproducibility: Assess the consistency of the transitions across multiple runs to ensure they are reliable for quantification purposes.
- Check for Cross-Contamination: Ensure that the transitions are unique to the NHL1 peptide and not shared with other peptides that could lead to false positives.
A triple quadrupole mass spectrometer is typically used for SRM assays due to its ability to selectively monitor specific peptide transitions.
List:
- Data-Dependent Acquisition (DDA): To identify and quantify peptides without prior knowledge of the sample composition.
- Data-Independent Acquisition (DIA): An unbiased approach that captures a wide range of peptides for comprehensive analysis.
- Isobaric Tagging: Such as tandem mass tag (TMT) or isobaric tags for relative and absolute quantitation (iTRAQ), for multiplexed comparative proteomics.
- Western Blot: A gel-based method to confirm protein expression levels.
- Enzyme-Linked Immunosorbent Assay (ELISA): For quantitative analysis of specific proteins using antibodies.
- Peptide Enrichment Techniques: Such as immunoprecipitation or affinity chromatography, to selectively enrich and quantify target peptides.

Question 4

Q4: X-ray Crystallography & NMR Spectroscopy

Solution:

X-ray Crystallography Limitations:
- Crystal Dependency: Requires high-quality crystals that are not always achievable.
- Radiation Sensitivity: Susceptible to radiation damage affecting the protein structure.
- Static Structure: Captures a single conformation, missing dynamic aspects.
NMR Spectroscopy Limitations:
- Size Restriction: Limited to smaller proteins due to spectral complexity.
- Sensitivity Issues: Requires larger sample amounts and longer times.
- Structural Detail: Provides less detailed structural information compared to X-ray crystallography.

Question 5

Q5: CD spectrum

Solution:

A: The 1BRR protein is composed mainly of alpha-helices and beta-sheets; such a structure generally presents a negative peak around 218 nm and a positive peak around 195 nm in the CD spectrum.
E: Urea is a denaturant that disrupts the non-covalent interactions that stabilize protein structure. Treating the protein with 8 M urea would be expected to cause a loss of the characteristic secondary structure peaks, shifting the spectrum towards that of an unfolded (random coil) protein, which is typically weak and featureless with a minimum around 200 nm.
In the absence of ligands, the mutant would have fewer alpha-helices and beta-sheets than the wild type. After the addition of ligands, both the alpha-helix and beta-sheet content of the wild type would decrease drastically, while the beta-sheet content of the mutant would be more strongly affected.

Question 6

Q6: Ramachandran plot

Solution:

The Ramachandran plot is essential because it evaluates the stereochemical quality of a protein structure by plotting the phi (φ) and psi (ψ) backbone dihedral angles. It helps identify whether the protein’s backbone conforms to allowed regions based on known protein geometries, which supports the model’s accuracy.
It turns out that the enzyme in Plot B has a well-defined conformation and the crystallography data is of higher quality.
- Plot A: The points are scattered and many are outside the favored red region, indicating potential issues with the protein model’s accuracy. This dispersion could be due to structural errors, poor data quality, or the presence of multiple conformations.
- Plot B: The points are predominantly within the favored red region, suggesting a high-quality protein model with correct backbone geometry and fewer structural anomalies.
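
Each point on a Ramachandran plot is a (φ, ψ) pair computed from backbone atom coordinates: φ is the dihedral C(i-1)-N-CA-C and ψ is the dihedral N-CA-C-N(i+1). This is a minimal sketch of that dihedral calculation in plain Python. The function names and the toy coordinates are illustrative and are not taken from the exam structures.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees for four points, measured about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Keep only the components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


# Toy coordinates (not real atoms): the fourth point is rotated by 90 degrees.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

For a real structure, φ would be `dihedral(C_prev, N, CA, C)` and ψ would be `dihedral(N, CA, C, N_next)`, evaluated for every residue.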

Question 7

Q7: Surface Plasmon Resonance (SPR)

Solution:

\[K_i=\frac{k_{\text{off}}}{k_{\text{on}}}\]
- Association Rate (\(k_{\text{on}}\)): The initial slope of the binding phase is proportional to the association rate constant.
- Dissociation Rate (\(k_{\text{off}}\)): The slope of the dissociation phase is related to the dissociation rate constant.
- Equilibrium Binding Affinity (\(K_D\) or \(K_i\)): The affinity can be calculated using the ratio \(k_{\text{off}}/k_{\text{on}}\).
Inhibitor C: \(k_{\text{off}}\) is much smaller than \(k_{\text{on}}\), so its \(K_i\) can be 0.5 nM.
Inhibitor B: \(k_{\text{off}}\) is relatively high, so according to \(K_i=\frac{k_{\text{off}}}{k_{\text{on}}}\) the \(K_i\) of B is higher than that of the others.
For chronic diseases, an inhibitor with a high affinity (low \(K_i\) value) is desirable to maintain therapeutic levels with less frequent dosing. Inhibitor C, with the lowest \(K_i\) value (0.5 nM), would be the most suitable. It would likely provide a longer duration of action and potentially fewer side effects due to its high binding affinity, making it ideal for consistent management of conditions like epilepsy.
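
The ratio of the two rate constants is easy to check numerically. This is a minimal sketch; the rate constants are hypothetical values chosen only so that the result is 0.5 nM, and they are not read from the sensorgrams in the exam. The function name `dissociation_constant` is illustrative.

```python
def dissociation_constant(k_off, k_on):
    """K = k_off / k_on, in M when k_off is in 1/s and k_on is in 1/(M*s)."""
    return k_off / k_on


# Hypothetical rate constants, for illustration only.
k_on = 1.0e6    # association rate constant, 1/(M*s)
k_off = 5.0e-4  # dissociation rate constant, 1/s

k = dissociation_constant(k_off, k_on)
print(f"K = {k * 1e9:.2f} nM")
print(f"mean residence time = {1 / k_off:.0f} s")
```

The reciprocal of the dissociation rate constant gives the mean lifetime of the complex. This is one way to relate a slow off-rate to a longer duration of action.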

BIO312 final exam (2021-2022)

Question 1

Q1: MS1 spectrum & tandem mass spectrum

Solution:

Mass of the modified peptide: 690.87 Da
Charge of this peptide: +2
De Novo sequencing: D V A I/L A A A I/L A V

Question 2

Q2:

Solution:

To address Researcher A’s question efficiently, we can follow these steps:
- Size Exclusion Chromatography (SEC): Since Researcher A has already performed SEC and found the enzyme activity concentrated in the 30-50 kDa fraction, this step has effectively narrowed down the protein candidates.
- One-Dimensional Gel Electrophoresis (1D-PAGE): Next, apply 1D-PAGE to the 30-50 kDa fraction to further separate and purify the proteins. This will allow visualization of distinct protein bands, from which the band(s) of interest can be excised.
- Matrix-Assisted Laser Desorption/Ionization Time-of-Flight Mass Spectrometry (MALDI-TOF MS): Use the MALDI-TOF MS in reflectron mode to determine the exact mass of the proteins in the excised band(s). This method is highly accurate for protein mass determination.
- Tandem Quadrupole Time-of-Flight Mass Spectrometry (tandem Q-TOF MS): If additional information on protein identity and sequence is required, tandem Q-TOF MS can be employed. This instrument can provide fragmentation data that, when combined with a protein database, can lead to the identification of the protein.
To assist Researcher B in identifying the synthetic peptide, we can use the following approach:
- Reverse-Phase High-Performance Liquid Chromatography (RP-HPLC): Researcher B has already utilized RP-HPLC to separate the peptide mixture. This step is based on hydrophobicity and has resulted in 56 fractions.
- Nanoflow HPLC System: To further purify and analyze the fractions, use the nanoflow HPLC system. This will increase the resolution and purity of the peptide, which is especially important given the complexity of the mixture.
- Tandem Quadrupole Time-of-Flight Mass Spectrometry (tandem Q-TOF MS): Couple the nanoflow HPLC directly to the tandem Q-TOF MS for online analysis. This setup allows for the real-time detection and identification of peptides as they elute from the HPLC column. The tandem MS capability will provide sequence information, which is crucial for identifying the synthetic peptide.
- MALDI-TOF MS: For the fractions that are suspected to contain the synthetic peptide based on the HPLC-MS/MS data, use MALDI-TOF MS to rapidly confirm the presence of the peptide by matching its mass to the known monoisotopic mass of 2689.2 Da.
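
The mass match in the last step can be expressed as a ppm error between the observed and expected m/z. This is a minimal sketch, assuming the peptide is detected as a singly protonated ion in positive-mode MALDI. The observed value is hypothetical, and the function names are illustrative.

```python
PROTON_MASS = 1.007276  # Da


def singly_protonated_mz(monoisotopic_mass):
    """Expected m/z of the [M+H]+ ion for a given neutral monoisotopic mass."""
    return monoisotopic_mass + PROTON_MASS


def ppm_error(observed_mz, expected_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - expected_mz) / expected_mz * 1e6


expected = singly_protonated_mz(2689.2)
observed = 2690.22  # hypothetical peak, for illustration only
print(f"expected [M+H]+ = {expected:.3f}")
print(f"error = {ppm_error(observed, expected):.1f} ppm")
```

A fraction would then be flagged as a candidate only if the error falls inside the mass accuracy expected for the instrument and acquisition mode.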

Question 3

Q3:

Solution:

Serine (Ser, S), Threonine (Thr, T), and Tyrosine (Tyr, Y) are the primary amino acids that can be phosphorylated.
To enrich phosphopeptides:
- Use immobilized metal affinity chromatography (IMAC) with metal ions like Fe³⁺ or Ga³⁺ to selectively bind phosphorylated peptides.
- Employ cation exchange chromatography to separate phosphorylated peptides based on their increased negative charge.
- Apply solid-phase extraction (SPE) to purify and concentrate phosphopeptides from the digest.
In CID spectra, phosphopeptides (especially those with phosphoserine or phosphothreonine) often show a dominant neutral loss of the labile phosphate group (H3PO4, about 98 Da), which suppresses fragmentation along the peptide backbone. This results in fewer and less intense sequence ions, making it challenging to pinpoint the exact site of phosphorylation within a peptide. The reduced backbone fragmentation can lead to ambiguity in assigning the modification to a specific amino acid residue.

Question 4a

Q4a:

Solution:

The observation likely indicates post-translational modifications (PTMs) of a protein in cancerous tissue, altering its molecular weight (MW) and isoelectric point (pI). Verification can be done through:
- Mass Spectrometry: To identify the protein and detect PTMs.
- Western Blot: Using specific antibodies to confirm protein identity.
Biomarker Discovery Workflow:
- Sample Collection: Obtain normal and cancerous kidney tissues.
- Protein Extraction: Prepare protein samples for analysis.
- 2D DIGE: Separate proteins based on pI and MW.
- Image Analysis: Identify differentially expressed proteins.
- Protein Identification: Excise and digest protein spots for mass spectrometry.
- Bioinformatics: Analyze data to predict protein functions.
- Validation: Use Western blot or ELISA to confirm expression.
- Functional Analysis: Investigate biological roles of proteins.
- Clinical Correlation: Relate protein expression to patient data.
- Validation Studies: Validate potential biomarkers in larger cohorts.

Question 4b

Q4b:

Solution:

Fractionation Method:
- High-Resolution Gel Electrophoresis [Protein]: Such as 2D gel electrophoresis or Blue Native PAGE, to separate proteins based on their isoelectric point and molecular weight.
- Reversed-Phase Liquid Chromatography (RPLC) [Peptide]: To separate peptides based on their hydrophobicity.
- Hydrophilic Interaction Liquid Chromatography (HILIC) [Peptide]: To separate peptides based on their hydrophilicity.
Workflow/Strategy for Absolute Quantification of AMPD Enzyme:
- Standard Preparation: Synthesize or obtain a set of isotopically labeled (heavy) AMPD peptides to serve as internal standards.
- Sample Preparation: Homogenize liver tissue from both control and high fructose diet-fed rats, followed by protein extraction.
- Protein Digestion: Digest the extracted proteins using a protease like trypsin to generate peptides.
- Peptide Fractionation:
  - Use RPLC to fractionate peptides based on hydrophobicity.
  - Employ HILIC to further separate hydrophilic peptides.
- LC-MS/MS Analysis: Analyze the fractions by liquid chromatography-tandem mass spectrometry (LC-MS/MS) to identify and quantify AMPD peptides.
- Internal Standard Addition: Spike the digested samples with the isotopically labeled AMPD peptides at known concentrations.
- Data Acquisition: Collect MS/MS data to identify and quantify both endogenous AMPD peptides and their labeled counterparts.
- Quantification: Use the peak area ratios of heavy to light peptides to calculate the absolute quantity of AMPD peptides in the samples.
- Bioinformatics Analysis: Employ software to process raw data, identify peptides, and perform quantitative analysis.
- Statistical Validation: Perform statistical analysis to validate the upregulation of AMPD in the high fructose diet group compared to the control.
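
The quantification step reduces to a single ratio: the endogenous (light) amount is the light-to-heavy peak area ratio multiplied by the known amount of heavy standard spiked in. This is a minimal sketch with hypothetical peak areas and spike amount. The function name `endogenous_amount` is illustrative, and it assumes the light and heavy peptides respond identically in the instrument.

```python
def endogenous_amount(light_area, heavy_area, heavy_spiked_fmol):
    """Amount of endogenous peptide (fmol) from light/heavy peak areas."""
    if heavy_area <= 0:
        raise ValueError("heavy_area must be positive")
    return light_area / heavy_area * heavy_spiked_fmol


# Hypothetical peak areas and spike amount, for illustration only.
control = endogenous_amount(light_area=1.2e6, heavy_area=1.0e6, heavy_spiked_fmol=50.0)
fructose = endogenous_amount(light_area=3.0e6, heavy_area=1.0e6, heavy_spiked_fmol=50.0)
print(f"control: {control:.1f} fmol, high fructose: {fructose:.1f} fmol")
```

Dividing by the heavy area normalises each run to its own internal standard, so the comparison between groups does not depend on run-to-run signal drift.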

Question 5

Q5:

Solution:

Genome Analysis: Analyze the genome sequence of bacterium A to identify genes encoding for the secreted toxins responsible for B disease.
Protein Identification: Use bioinformatics tools to predict the open reading frames (ORFs) and identify potential toxin proteins.
Gene Cloning: Clone the identified toxin genes into an expression vector suitable for protein production in a host organism (e.g., E. coli or yeast).
Protein Expression: Express the recombinant toxin proteins in the host organism, ensuring proper folding and post-translational modifications if necessary.
Protein Purification: Purify the expressed proteins using techniques such as affinity chromatography, ion exchange chromatography, or gel filtration.
Immunogenicity Assessment: Test the purified proteins for their ability to induce an immune response using in vitro assays or animal models.
Adjuvant Selection: Select an appropriate adjuvant to enhance the immune response to the vaccine antigens.
Formulation Development: Formulate the vaccine by combining the purified proteins with the selected adjuvant to create a stable and effective vaccine formulation.
Preclinical Testing: Conduct extensive preclinical testing in animal models to evaluate the safety, efficacy, and immunogenicity of the vaccine.
Clinical Trials: Proceed with clinical trials in humans, starting with Phase I to assess safety, followed by Phase II to evaluate immunogenicity and optimal dosing, and Phase III for efficacy.

Question 6

Q6:

Solution:

Protein X is rich in alpha helix, and protein Y appears as a random coil.
Reasons:
- 190 nm Region: The higher ellipticity in the combined spectrum (X+Y) compared to when both proteins are together (X:Y) at around 190 nm could indicate that the individual proteins have a higher beta-sheet content or are more aggregated when separate. This peak is often associated with the presence of beta-sheets or aggregated structures.
- 220 nm Region: The higher ellipticity in the X:Y spectrum compared to the combined individual spectra (X+Y) at around 220 nm suggests that when proteins X and Y are mixed together, there is either an increase in alpha-helical content or a stabilization of the alpha-helical structure. This could be due to a structural change upon interaction, or the formation of a complex that stabilizes the helices.
The alpha-helical content of protein X may be reduced due to interaction with protein Y. Non-specific interactions such as electrostatic or hydrophobic forces might cause structural loosening in protein X.
Several techniques can be used to study protein-protein or protein-ligand binding and determine their stoichiometry:
- Surface Plasmon Resonance (SPR): Measures changes in refractive index upon binding, providing real-time kinetics and binding stoichiometry.
- Isothermal Titration Calorimetry (ITC): Measures heat changes during binding, allowing for the determination of binding affinity and stoichiometry.
- Fluorescence Resonance Energy Transfer (FRET): Uses energy transfer between a donor and acceptor to study binding and proximity, which can indirectly provide stoichiometry.
- Analytical Ultracentrifugation: Monitors sedimentation behavior to infer binding and complex formation.

Question 7

Q7:

Solution:

The domain type of triose phosphate isomerase is the TIM barrel (also known as α/β-barrel). This structural motif consists of eight parallel beta-strands forming a barrel-like structure with alpha-helices running along the outside.
Technique to Characterize Secondary Structure:
- Circular Dichroism (CD) Spectroscopy: To determine the protein’s secondary structure content based on the differences in the way it interacts with circularly polarized light.
- Infrared (IR) Spectroscopy: Particularly Attenuated Total Reflectance (ATR)-FTIR, which can provide information about the secondary structure by measuring the amide bonds’ vibrations.
- Nuclear Magnetic Resonance (NMR) Spectroscopy: If the protein is small enough or suitable isotopes are used, NMR can provide detailed structural information.
Molecular replacement is used to solve the structure of triose phosphate isomerase because it provides initial phases for the diffraction data when the structure of a similar protein is already known. It involves orienting and positioning the known structure (the search model) in the unit cell of the unknown crystal, and then refining its position.
An alternative method to solve a protein crystal structure is Ab initio phasing. When no similar structure is available, this method involves using the diffraction data to build an electron density map from scratch, often starting with small molecular fragments and building up the entire structure.

Question 8a

Q8a:

Solution:

Crystal mosaic spread refers to the variation in orientation of small crystallites within a larger crystal. This results in a diffraction pattern where individual spots are broadened and their intensities are reduced due to the averaging effect of the slightly misaligned crystallites. Consequently, mosaicity can cause inaccurate X-ray intensity measurements because the overlapping and weakened spots make it challenging to precisely determine the position and intensity of each reflection, leading to errors in data analysis.

Question 8b

Q8b:

Solution:

NOESY (Nuclear Overhauser Effect Spectroscopy) is an NMR technique that measures the nuclear Overhauser effect between nuclear spins to determine proximity relationships between atoms in a molecule. It provides distance constraints used to infer the 3D structure of proteins.
(During a NOESY experiment, the magnetization of a nucleus is transferred to a neighboring nucleus via the nuclear Overhauser effect. By measuring the intensity of these transferred signals, researchers can map out the network of spatially close atoms within the protein. This information is then used in conjunction with other NMR data to calculate the three-dimensional structure of the protein through computational methods such as molecular dynamics simulations or distance geometry calculations.)
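
The link between signal intensity and distance can be sketched with the usual r^-6 dependence of the NOE, calibrated against a proton pair of known separation. This is a minimal sketch under the isolated spin-pair approximation. The intensities and the reference distance are hypothetical, and the function name `noe_distance` is illustrative.

```python
def noe_distance(intensity, ref_intensity, ref_distance):
    """Estimate a distance from a NOE cross-peak intensity, assuming I is proportional to r**-6."""
    if intensity <= 0 or ref_intensity <= 0:
        raise ValueError("intensities must be positive")
    return ref_distance * (ref_intensity / intensity) ** (1.0 / 6.0)


# Hypothetical values, for illustration only: a reference pair 2.5 angstroms apart.
ref_distance = 2.5
ref_intensity = 100.0
for intensity in (100.0, 20.0, 2.0):
    r = noe_distance(intensity, ref_intensity, ref_distance)
    print(f"I = {intensity:6.1f} -> r ~ {r:.2f} angstroms")
```

Because of the sixth-root relationship, large changes in intensity translate into small changes in distance. This is one reason NOE data are typically used as distance ranges, not as exact values.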

SessionInfo

R version 4.4.0 (2024-04-24 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 10 x64 (build 19045)

Matrix products: default


locale:
[1] LC_COLLATE=Chinese (Simplified)_China.utf8 
[2] LC_CTYPE=Chinese (Simplified)_China.utf8   
[3] LC_MONETARY=Chinese (Simplified)_China.utf8
[4] LC_NUMERIC=C                               
[5] LC_TIME=en_GB.UTF-8                        

time zone: Etc/GMT-8
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
 [1] digest_0.6.35     fastmap_1.1.1     xfun_0.43         Matrix_1.7-0     
 [5] lattice_0.22-6    reticulate_1.37.0 rappdirs_0.3.3    knitr_1.46       
 [9] htmltools_0.5.8.1 png_0.1-8         rmarkdown_2.26    cli_3.6.2        
[13] grid_4.4.0        withr_3.0.0       compiler_4.4.0    rprojroot_2.0.4  
[17] here_1.0.1        rstudioapi_0.16.0 tools_4.4.0       evaluate_0.23    
[21] Rcpp_1.0.12       yaml_2.3.8        rlang_1.1.3       jsonlite_1.8.8   
[25] htmlwidgets_1.6.4