It is a regulatory requirement to confirm the sequence of your protein and examine the termini for any variation that may exist (see the ICH Q6B guidelines section 6.1.1 c).
BioPharmaSpec’s protein sequencing service includes N-terminal and C-terminal sequencing of proteins, which allows you to determine the amino acids at the respective termini of your protein.
Terminal Structure of Proteins
Proteins are composed of a linear chain of amino acids linked to one another through amide (peptide) bonds. Amino acids have an amine functional group at one end and a carboxylic acid functional group at the other. It is the linking of the carboxylic acid group of one amino acid with the amine group of another that produces the amide bond. This process is carried out by the protein translational machinery of the cell, which translates messenger RNA (mRNA) into protein.
This sequential linking of amine and carboxylic acid groups along the length of the protein chain means that, once the chain is complete, one end of the protein has a free amine group and the other end has a free carboxylic acid group. The free amine end of the chain is called the “N-terminus” or “amino terminus” and the free carboxylic acid end is called the “C-terminus” or “carboxyl terminus”. Because the two termini are chemically different from one another, they have different chemical properties. This difference can be exploited: it allows specific chemical procedures to be used for sequencing proteins from these termini.
Sequencing Light and Heavy Chain Termini
Not all proteins are composed of a single chain. Monoclonal antibodies are composed of four protein chains: two identical heavy chains and two identical light chains. They therefore have two different N-termini and two different C-termini. For successful sequencing it is important to separate the chains of multichain proteins to prevent ambiguity in the data generated.
If the molecule exhibits any post-translational processing that results in ragged termini, more than one terminal amino acid sequence will be detected. In this case the relative amounts of the different termini will be determined. The data can therefore also be used as an indication of the intactness of the molecule.
N-Terminal Sequencing
BioPharmaSpec provides an N-terminal sequencing service (also known as gas phase sequencing, Edman sequencing or Edman degradation) using Shimadzu instrumentation for automated N-terminal Edman degradation.
In order to define the protein sequence of the N-terminus, BioPharmaSpec scientists use the following N-terminal sequencing protocol:
- For multichain proteins or protein mixtures, separate proteins using SDS-PAGE, blot onto PVDF membranes and stain with Ponceau red
- Pure or single chain proteins can be transferred directly to a PVDF membrane
- Blotted samples are then treated with phenylisothiocyanate
- The derivatized N-terminal amino acid is then cleaved from the protein backbone
- The derivatized amino acid is analyzed by liquid chromatography and identified by elution position against derivatized amino acid standards
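The cyclic nature of the protocol above can be illustrated with a short Python sketch. It is an idealized model, not BioPharmaSpec's software: it assumes every cycle removes exactly one N-terminal residue with 100% yield, and the function name `edman_cycles` and the example sequences are hypothetical. It also shows how a mixture of chains with ragged N-termini would give more than one residue per cycle, in proportion to the relative amounts of each chain.

```python
from collections import Counter


def edman_cycles(chains, n_cycles):
    """Simulate idealized Edman degradation on a mixture of chains.

    chains: dict mapping a one-letter sequence to its relative molar amount.
    Returns one dict per cycle mapping released residue -> fraction of signal.
    """
    results = []
    remaining = dict(chains)
    for _ in range(n_cycles):
        released = Counter()
        shortened = {}
        for seq, amount in remaining.items():
            if not seq:
                continue  # this chain has been fully consumed
            # The N-terminal residue is derivatized, cleaved and detected.
            released[seq[0]] += amount
            rest = seq[1:]
            shortened[rest] = shortened.get(rest, 0) + amount
        total = sum(released.values())
        results.append(
            {aa: amt / total for aa, amt in released.items()} if total else {}
        )
        remaining = shortened  # the next cycle acts on the shortened chains
    return results


# Hypothetical sample: 75% full-length chain, 25% chain lacking the first residue.
for cycle, residues in enumerate(
    edman_cycles({"EVQLVE": 0.75, "VQLVE": 0.25}, 3), start=1
):
    print(cycle, residues)
```

In a real analysis the residue released in each cycle is identified from its chromatographic elution position rather than read from a known sequence, and the yield falls slightly with every cycle, so this sketch only conveys the logic of the readout.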
N-Terminal Sequencing Applications
N-terminal sequence analysis has a number of different applications in drug development such as:
- Showing that the N-terminus of your protein is intact and as expected.
- Demonstrating batch-to-batch consistency.
- Unambiguously defining the isoleucine and leucine residues within the protein sequence. Mass measurement alone cannot be used in this instance because there is no mass difference between the two amino acids (they are isomers of one another).
This last use of N-terminal sequencing is very important. As mentioned above, it is a regulatory requirement for the protein sequence to be determined (see ICH Q6B section 6.1.1 a)), and the techniques used must be able to define the sequence unambiguously. Mass spectrometry can provide much useful structural information as part of an investigation of this nature by generating amino acid sequence information from chromatographically separated peptides. However, since leucine and isoleucine are isomers (they have the same mass), they cannot be distinguished by mass alone. In these cases, peptides containing leucine or isoleucine can be purified and subjected to N-terminal sequencing using Edman chemistry as described above. The leucine and isoleucine derivatives have unique chromatographic elution positions and can thus be categorically identified.
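The arithmetic behind the leucine/isoleucine problem can be checked with a few lines of Python. This is a minimal sketch: the dictionary and function names are illustrative, the atomic masses are standard monoisotopic values, and valine is included only as a contrasting residue that does differ in mass.

```python
# Monoisotopic atomic masses in daltons.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# Elemental composition of each residue as it occurs within a peptide chain
# (the free amino acid minus one water).
RESIDUE_FORMULA = {
    "Leu": {"C": 6, "H": 11, "N": 1, "O": 1},
    "Ile": {"C": 6, "H": 11, "N": 1, "O": 1},
    "Val": {"C": 5, "H": 9, "N": 1, "O": 1},
}


def residue_mass(name):
    """Return the monoisotopic residue mass computed from its formula."""
    return sum(MONOISOTOPIC[el] * n for el, n in RESIDUE_FORMULA[name].items())


for name in RESIDUE_FORMULA:
    print(f"{name}: {residue_mass(name):.5f} Da")

# Identical elemental composition means an identical mass.
print("Leu - Ile:", residue_mass("Leu") - residue_mass("Ile"))
```

Because both residues are C6H11NO, the difference is exactly zero, whereas valine is lighter by one CH2 unit. Swapping leucine for isoleucine therefore leaves the mass of a peptide unchanged, which is why the elution position of the Edman derivative is used to tell them apart.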
C-Terminal Sequencing
There is no C-terminal method fully analogous to Edman chemistry for determining the C-terminal sequence of a biopharmaceutical. To assess the C-termini, BioPharmaSpec scientists use carboxypeptidase digestion and/or mass-spectrometric mapping strategies.
In the latter approach, the intactness of a protein C-terminus or the presence of ragged ends can be assessed using data obtained from a peptide map. This includes the generation of peptide fragment ions to confirm the nature of the C-terminal peptide or peptides.
Intact molecular weight analysis of the product can also be used as an orthogonal procedure to support conclusions regarding the structures of the N- and C-termini. Monoclonal antibodies are an example of proteins where C-terminal variation is observed. In this case, the C-terminal Lysine residues of the heavy chains can be removed, resulting in variability of the C-terminus of the heavy chains.
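As a rough illustration of how C-terminal lysine variability would appear in intact molecular weight data, the Python sketch below computes the expected masses of an antibody carrying zero, one or two heavy-chain C-terminal lysines. The starting mass of 148000.0 Da is a hypothetical placeholder, the labels K0, K1 and K2 and the function name are illustrative, and average atomic masses are assumed because intact protein masses are usually reported as average masses.

```python
# Standard average atomic masses in daltons.
AVERAGE_ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

# A lysine residue within a peptide chain is C6H12N2O.
LYS_RESIDUE = {"C": 6, "H": 12, "N": 2, "O": 1}
LYS_MASS = sum(AVERAGE_ATOMIC_MASS[el] * n for el, n in LYS_RESIDUE.items())


def lysine_variants(k0_mass, heavy_chains=2):
    """Return expected masses for 0..heavy_chains retained C-terminal lysines.

    k0_mass is the mass of the molecule with every C-terminal lysine removed.
    """
    return {f"K{n}": k0_mass + n * LYS_MASS for n in range(heavy_chains + 1)}


# Hypothetical antibody mass with both C-terminal lysines removed.
for label, mass in lysine_variants(148000.0).items():
    print(f"{label}: {mass:.1f} Da")
```

Each retained lysine adds about 128.2 Da, so a mixture of these forms would be expected to appear as a series of intact masses spaced by that increment, which can support the conclusions drawn from the peptide map.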
Contact BioPharmaSpec to find out more about our terminal sequencing services.
Sequence the N- and/or C-Terminus of Your Product
Please contact our scientists if you would like to discuss the best methods to determine the N- and/or C-terminus of your biopharmaceutical.