Antibody drug conjugates (ADCs) are a rapidly expanding area of pharma company pipelines. They combine the targeting of an antibody with the potency of a small molecule. Such a simple and elegant approach has far-reaching consequences for the IT infrastructures that were established and implemented for antibody and small molecule drug discovery. The ability to track data associated with ADCs is critical for projects to conduct structure-activity relationship (SAR) studies and ultimately be successful. Herein we describe a simple approach to assigning a unique ID number to ADCs that involves only minimal modification to the established registration processes for the separate antibody and small molecule components.
Antibody drug conjugates represent an increasingly important area for drug discovery. They combine the best components of both antibodies and drugs. The antibody provides the selective targeting of the therapeutic while the highly potent drug drives a highly efficacious response.
In addition to the many challenging discovery and development complexities presented by these hybrid biologic-small molecule entities, data management also needs to be addressed. While drug discovery has implemented effective software solutions for registration of the individual components of an ADC (i.e. antibody, drug), the ability to describe and register the combined (ADC) product presents interesting challenges for current IT infrastructures, particularly in instances where the existing component registration workflows do not accommodate each other and may additionally have evolved in completely distinct software environments.
The ability to track data associated with ADCs is critical for projects to conduct structure-activity relationship (SAR) studies and ultimately be successful. While a covalent bond elegantly joins the worlds of antibody and small molecule, the marriage of these two domains in the cheminformatics arena represents a significant undertaking.
2.0 Registration of small molecules
Registration is the process of assigning corporate identifiers to unique entities for the purpose of tracking them through discovery pipelines. For small molecules, registration is routine. Card systems were originally used, but the process has since been computerized. Small molecules are registered after their chemical structures have been determined; in effect, this requirement means that a corporate ID can be assigned to a structure only once the corresponding compound has been made.
Each structurally unique molecular entity is assigned its own corporate ID. Additionally each batch, or lot, of compound material is assigned a unique lot number.
The relationship between the physical material and a lot number is always immutable. Almost always, the registration system enforces a rule that the relationship between a structure and its corporate ID, once assigned, is also immutable. Since the structure must be determined prior to registration, the need for changes is rare. When changes do occur, they result in the lot(s) being assigned a different corporate ID.
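The rules described above can be illustrated with a minimal in-memory sketch. It is not a description of any production registration system: the class name, the `CMPD-` identifier format and the use of a plain string as the structure key are all assumptions made for illustration.

```python
from itertools import count


class SmallMoleculeRegistry:
    """Toy registry: one corporate ID per unique structure, one number per lot."""

    def __init__(self):
        self._ids = count(1)
        self._lots = count(1)
        self.id_by_structure = {}  # structure key -> corporate ID (never rewritten)
        self.id_by_lot = {}        # lot number -> corporate ID

    def _corporate_id(self, structure_key):
        # Reuse the existing ID for a known structure, otherwise issue a new one.
        if structure_key not in self.id_by_structure:
            self.id_by_structure[structure_key] = f"CMPD-{next(self._ids):06d}"
        return self.id_by_structure[structure_key]

    def register_lot(self, structure_key):
        """Register a batch of material and return (corporate ID, lot number)."""
        lot = next(self._lots)
        self.id_by_lot[lot] = self._corporate_id(structure_key)
        return self.id_by_lot[lot], lot

    def correct_structure(self, lot, new_structure_key):
        """A revised structure moves the lot to a different corporate ID.

        The lot number and the existing structure-to-ID assignments are
        left untouched.
        """
        self.id_by_lot[lot] = self._corporate_id(new_structure_key)
        return self.id_by_lot[lot]
```

In this sketch, registering the same structure twice returns the same corporate ID with two different lot numbers, while correcting the structure of one lot changes only that lot's corporate ID.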
The registration system will normally allow for the registration of materials of unknown structure, usually by requiring that such materials be assigned a unique name, but also by allowing a special character (e.g. ‘X’) to represent an unknown component of an otherwise determined structure. The virtual registration of compounds without physical lots can also be permitted, but in these cases a different class of identifiers may be assigned.
Culturally, registration is ingrained into the thinking of chemists. In the past, productivity was sometimes assessed by the number of compounds registered. Since pharmacologically active compounds in discovery rarely have trivial names, the corporate ID serves as a substitute, being used in publications, patent applications, internal documents and presentations.
3.0 Registration of biologics
For biologics, the process of registration has been defined much more recently. For developers, the first instinct was to mirror the behavior of small molecule registration systems. This was challenging for a number of reasons.
Biological macromolecules are large and an absolute representation of their chemical structure is intractable. For proteins, the amino-acid sequence can be used as a surrogate for structure. However biological proteins, especially those that are secreted from the cell, are not simply polypeptides. Many proteins are post-translationally modified (e.g. by glycosylation). In most cases, the absolute structure of the glycans and their points of attachment will not be known, and a batch of protein may well be heterogeneous in respect of its post-translational modifications.
Usual practice in biologics registration is to use the amino acid sequence as the uniqueness-determining representation of the chemical structure. Variations in glycosylation may very well occur between lots of material. Exceptions can be made if scientists intend to make a purified form of post-translationally modified protein that differs substantially from the bulk form; such proteins can be assigned unique corporate IDs.
The registration of biologics is procedurally different in that the structure of the registered material is not always independently determined (the sequence of the protein is derived from the encoding plasmid and rarely verified by mass spectrometry prior to registration). In many cases, the sequence of the protein will not be determined at all before a corporate ID is needed to track assay results (e.g. an antibody derived from a hybridoma). In these cases, the unique identifying information is essentially the process by which the biologic was produced (e.g. isolated from that hybridoma cell line) rather than an explicitly determined state. A consequence of this is that changes in the identifying information for a protein are much more common than for small molecules.
There are two approaches to addressing this challenge. One is to maintain the rule for small molecules that the relationship between identifying information and corporate ID is immutable once assigned. Such a system must endure the inconvenience of tube re-labelling and record modification should lots of material require a change of corporate ID.
The alternative approach is to conserve the relationship between a batch of material and its corporate ID whenever possible. In this approach, the identifying information for a corporate ID can be changed provided no lots exist for which the old information remains correct. A consequence of this approach is that two corporate IDs can become synonymous if one is modified to have the same identifying information as another, and in this case the entities merge and retain a single preferred corporate ID.
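The merge behavior of this alternative approach can be sketched as follows. This is a simplified illustration under stated assumptions: the class and method names are hypothetical, the identifying information is reduced to a string, the check that no lots still match the old information is left to the caller, and the choice of which ID survives a merge (here, the one that already carried the information) is an assumption rather than a documented rule.

```python
class BiologicsRegistry:
    """Toy registry in which identifying information can be edited in place."""

    def __init__(self):
        self.info_by_id = {}   # corporate ID -> identifying information
        self.merged_into = {}  # retired corporate ID -> preferred corporate ID

    def register(self, corp_id, info):
        self.info_by_id[corp_id] = info

    def resolve(self, corp_id):
        """Follow merge links to the preferred corporate ID."""
        while corp_id in self.merged_into:
            corp_id = self.merged_into[corp_id]
        return corp_id

    def update_info(self, corp_id, new_info):
        """Change the identifying information, merging IDs if it already exists.

        Assumes the caller has confirmed that no lot still matches the old
        information.
        """
        corp_id = self.resolve(corp_id)
        existing = next(
            (other for other, info in self.info_by_id.items()
             if other != corp_id and info == new_info),
            None,
        )
        if existing is None:
            self.info_by_id[corp_id] = new_info
            return corp_id
        # Another ID already carries this information: the two become synonyms.
        del self.info_by_id[corp_id]
        self.merged_into[corp_id] = existing
        return existing
```

For example, if a hybridoma-derived antibody is later sequenced and found to match an already registered sequence, `update_info` returns the existing corporate ID and `resolve` maps the old ID onto it.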
Although the second approach may seem more reasonable for biologics, situations where a corporate ID can be assigned to a material by both state and process are very complex, and for this reason we at AbbVie have moved from the second approach to the first.
Registration is a more recent practice for biologics and the metadata that needs to be collected for each registered material is more complex than for small molecules. Consequently, processes must be designed to keep data entry as simple as possible and to ensure that it is carried out by the person most likely to know the required information. Biologists are typically less comfortable using numeric identifiers as substitutes for trivial names and often prefer information-rich names (e.g. Mouse anti-Human KDR [IgG1/kappa]). We enforce uniqueness of these names, so that each corporate ID maps to a single name, but also allow a free-text lot name in which variations between lots of the same material can be captured. However, lot consistency is important in any discovery endeavor and such variation should be an exception.
4.0 Registration of ADCs
Since ADCs comprise a small molecule component and a biologics component, information about them already resides in both the small molecule and biologics registration systems. The small molecule component itself comprises a payload (the active small molecule drug) and a linker (used to connect the drug to the protein). The payload, the linker and a reagent in which the payload and linker are joined all exist as chemical reagents and can therefore be registered. In practice, the linker, as a commercial off-the-shelf reagent that is not independently tested, is rarely registered. Uncertainties about the molecular nature of each of the components reside in their own systems.
For example, if we do not know the sequence of an antibody that is to be conjugated, then its corporate ID in the biologics registration system will be definite, but assigned by process. Similarly, if we do not know the structure of the combined payload/linker, perhaps because it is proprietary to a collaborator, then the small molecule corporate ID will be definite, but assigned on the basis of a unique name.
At AbbVie, two Accelrys products are used for registration. The Global Biologic Registration System (GBRS) is used to register antibodies. It uses the amino acid sequence to determine whether or not an antibody is unique and assigns both a PR# (for PRotein) as its corporate ID, for example PR-123456, and an individual lot#.
For small molecule registration, the software A-coder is used. It determines uniqueness based on chemical structure and assigns both an A# as the corporate ID (e.g. A-1307119.0, where the .0 signifies that it is the free base) and an individual lot#.
The same number sequence is used by both software packages, removing the possibility of identical PR- and A-numbers.
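A shared number sequence of this kind can be sketched in a few lines. The class name and the starting value are hypothetical; the point is only that both prefixes draw from one counter, so the numeric part of a PR# can never coincide with that of an A#.

```python
from itertools import count


class SharedNumberSequence:
    """Single counter that issues the numeric part of every corporate ID."""

    def __init__(self, start=1):
        self._numbers = count(start)

    def issue(self, prefix):
        # Every call consumes the next number, whatever the prefix.
        return f"{prefix}-{next(self._numbers)}"


sequence = SharedNumberSequence(start=100000)  # arbitrary starting value
protein_id = sequence.issue("PR")
compound_id = sequence.issue("A")
```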
When research into ADCs was initiated at AbbVie, it was recognized immediately that to ensure data integrity a registration process would need to be implemented. Unfortunately, neither GBRS nor A-coder alone had the functionality required to register ADCs. GBRS was not chemically intelligent and was thus unable to determine the uniqueness of an ADC. A-coder was designed only for small molecules and was not able to handle the large amino acid sequences of the antibody.
To minimize the impact on already established workflows for both antibodies and small molecules, a solution that leveraged both GBRS and A-coder was desired.
The first decision was that ADCs would be assigned a DC# (for Drug-Conjugate) as their corporate ID. This decision was taken so that as soon as a scientist saw data associated with the moniker A- (small molecule), PR- (protein) or DC- (ADC) the type of molecule would be immediately apparent.
Next, the decision of whether GBRS or A-coder would be used to register ADCs was addressed. Recognizing that the inventory management of ADCs was more similar to inventory management for biologics than to that for small molecules, GBRS was selected. GBRS was also selected as it enabled more sophisticated metadata capture for biologic entities and was the newer of the two registration platforms at AbbVie.
As GBRS did not possess the chemical intelligence to determine the uniqueness of an ADC, a mechanism that enabled this was required. The solution was to use the combination of the PR# from the antibody and the A# from the linker-drug to define a unique ADC in the name field of GBRS.
For the example in Figure 2.0, “ADC-123456-1307119” would be entered in the name field of GBRS. As both the antibody and linker-drug identifiers would be generated by their respective registration systems, each designed to handle the appropriate entity, all of AbbVie’s registration rules would be applied appropriately.
While in principle this would provide a way to determine uniqueness of an ADC, there was a catch. Unfortunately, during conjugation the linker-drug structure is chemically modified which leaves the possibility for two unique linker-drugs to give rise to equivalent ADCs. For example, as shown in Figure 2, Linker-drug A contains a bromine, while Linker-drug B has an iodine resulting in a unique A# for each compound. During conjugation, the halogen is displaced by the antibody with both linker-drugs affording the same ADC. However, by this method of annotation GBRS would perceive that the two reactions produced different ADCs, as the two combinations of PR# from the antibody and A# from the linker-drug are unique.
This complication was resolved by introduction of a virtual compound called the “X-combo”. This virtual compound has an X representing the antibody and the chemical structure of the linker-drug after conjugation to the antibody (Figure 2.0). During registration, this enables A-coder to determine whether the X-combo is unique and to generate a corresponding A#. In GBRS, the combination of antibody PR# and X-combo A# in the name field can then be used to determine if this is a unique ADC or one that has already been registered and assign the correct DC#.
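The effect of the X-combo can be illustrated with a toy example. This is a sketch only: the structure strings are simplified placeholders rather than real linker-drugs, `[Payload]` stands in for the rest of the molecule, and the automatic replacement of the halogen by X is an assumption made for brevity (in the process described here the X-combo structure is drawn by the scientist). The function names and ID numbers are hypothetical.

```python
LEAVING_GROUPS = ("Br", "I")  # halogens displaced by the antibody in this toy case


def conjugated_form(linker_drug):
    """Return the X-combo string: the leaving group replaced by X (the antibody)."""
    for halogen in LEAVING_GROUPS:
        if linker_drug.startswith(halogen):
            return "X" + linker_drug[len(halogen):]
    raise ValueError(f"no known leaving group in {linker_drug!r}")


def assign_ids(structures, prefix="A", start=1):
    """Give each distinct structure string its own ID, as a registry would."""
    ids = {}
    for structure in structures:
        if structure not in ids:
            ids[structure] = f"{prefix}-{start + len(ids)}"
    return ids


linker_drug_a = "BrCC(=O)N[Payload]"  # placeholder structure, bromide leaving group
linker_drug_b = "ICC(=O)N[Payload]"   # placeholder structure, iodide leaving group

# The two reagents are different compounds and receive different IDs.
reagent_ids = assign_ids([linker_drug_a, linker_drug_b])

# After conjugation both give the same X-combo, hence a single ID.
x_combo_ids = assign_ids(
    [conjugated_form(linker_drug_a), conjugated_form(linker_drug_b)], start=101
)
```

Keying the ADC on the antibody ID plus the X-combo ID, rather than the reagent ID, therefore treats the two conjugation reactions as producing one and the same ADC.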
GBRS creates an ADC registration event when the scientist provides both an antibody and an X-combo corporate ID. GBRS assigns a DC corporate ID based upon three pieces of information: 1) the antibody corporate ID (PR#), 2) the small molecule X-combo corporate ID (A#) and 3) the drug-to-antibody ratio (DAR). A DAR2 and a DAR4 molecule of the same antibody and X-combo will be assigned two different DC corporate IDs. If an ADC with the same antibody, X-combo and DAR has already been registered, the new material is recorded as a new batch of that ADC.
In order to facilitate SAR on the ADC and its individual components (antibody, linker, drug), the appropriate A#, PR# and DC# for an ADC had to be associated together. To enable this, the ADC Component Association Tool was developed in collaboration with Discngine. The association of ADC components is achieved in a simple five-step procedure.
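The assignment logic can be sketched as a lookup keyed on the three pieces of information. This is an illustration, not GBRS itself: the class name, the `DC-` number format and the per-ADC lot counter are assumptions.

```python
class AdcRegistry:
    """Toy ADC registry keyed on (antibody ID, X-combo ID, DAR)."""

    def __init__(self):
        self._dc_by_key = {}   # (PR#, A#, DAR) -> DC#
        self._lots_by_dc = {}  # DC# -> number of lots registered so far

    def register(self, antibody_id, x_combo_id, dar):
        """Return (DC#, lot number) for a newly prepared batch of ADC."""
        key = (antibody_id, x_combo_id, round(float(dar), 1))
        dc_id = self._dc_by_key.get(key)
        if dc_id is None:
            # New combination: issue a new DC#.
            dc_id = f"DC-{len(self._dc_by_key) + 1:06d}"
            self._dc_by_key[key] = dc_id
            self._lots_by_dc[dc_id] = 0
        # Known combination: only the lot counter advances.
        self._lots_by_dc[dc_id] += 1
        return dc_id, self._lots_by_dc[dc_id]
```

Registering the same antibody and X-combo at DAR 2 and at DAR 4 yields two DC numbers, whereas a second DAR 2 preparation returns the first DC number with lot number 2.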
First, the structure of the linker-drug is retrieved using the A# (Figure 3.0). Next, the drug is identified either by modification of the retrieved linker-drug structure or using the A#.
The mechanism of action of the drug is also selected from a drop-down list at this stage. If the mechanism of action of the drug has not previously been registered, a new mechanism of action term can be entered manually and it is then captured in the drop-down list (Figure 4.0).
As the structures of both the linker-drug and the drug are known, the linker is then automatically identified by the software (Figure 5).
The ADC Component Association Tool identifies the linker by taking the structure identified as the drug in the previous step and removing it from the Combo structure; what remains is the linker structure.
The shorthand name for the linker is selected from a drop-down list, for example, MC-Val-Cit-PABC. If the linker has not previously been registered, a new linker term can be entered manually and it is then captured in the drop-down list. Then the type of linker, for example, dipeptide or non-cleavable, is also captured. For linker-drugs with a non-cleavable linker, the free drug is not likely to exist. As a result, for these linker-drugs, the cysteinylated analogue is registered to represent the active species that is released from the lysosome (Figure 5.0).
The final step is exemplification of the X-combo structure. The software retrieves the structure of the linker-drug, which can then be modified to represent the chemical structure of the linker-drug after conjugation to the antibody, with X representing the antibody (Figure 6.0).
Finally, the ADC Component Association Tool registers the X-combo in A-coder, thereby conforming to AbbVie’s registration process rules on structure. The association between the ADC components, along with the additional criteria on MOA and linker, is stored in a custom Oracle database. The element table in the A-coder registration system was modified to allow the X-combo molecules to contain the element X, which represents the antibody. The ADC Component Association Tool sends all of the metadata required for the X-combo molecule registration and assignment of its corporate ID.
Having created an association between all the components of an ADC, it is now possible to data mine on any aspect of an ADC. For example, one can easily search for all the ADCs with non-cleavable linkers that contain drug A-1581855. To enable substructure searching of ADCs, the structure of the X-combo was associated with the DC# of the ADC on the chemistry cartridge.
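A query of this kind can be sketched against a single association table. The example below uses an in-memory SQLite database purely as a stand-in for the custom database; the table name, column names and all rows are invented for illustration and do not reflect the actual schema or real compounds, apart from the drug ID quoted in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE adc_component (
        dc_id       TEXT PRIMARY KEY,  -- ADC corporate ID
        antibody_id TEXT,              -- PR#
        x_combo_id  TEXT,              -- A# of the X-combo
        drug_id     TEXT,              -- A# of the drug
        linker_name TEXT,              -- shorthand linker name
        linker_type TEXT               -- e.g. dipeptide, non-cleavable
    )
    """
)

# Made-up rows, for illustration only.
rows = [
    ("DC-000001", "PR-123456", "A-2000001", "A-1581855", "MC", "non-cleavable"),
    ("DC-000002", "PR-123456", "A-2000002", "A-1581855", "MC-Val-Cit-PABC", "dipeptide"),
    ("DC-000003", "PR-654321", "A-2000003", "A-1999999", "MC", "non-cleavable"),
]
conn.executemany("INSERT INTO adc_component VALUES (?, ?, ?, ?, ?, ?)", rows)


def find_adcs(connection, drug_id, linker_type):
    """Return the DC numbers of ADCs with the given drug and linker type."""
    cursor = connection.execute(
        "SELECT dc_id FROM adc_component "
        "WHERE drug_id = ? AND linker_type = ? ORDER BY dc_id",
        (drug_id, linker_type),
    )
    return [row[0] for row in cursor]


hits = find_adcs(conn, "A-1581855", "non-cleavable")
```

With the rows above, `hits` contains only the first ADC, since the second uses a dipeptide linker and the third carries a different drug.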
Figure 7.0 shows an example of ADCs with an MOA of auristatin. Due to the complexity and size of the structures of X-combos and linker-drugs, their visualization is not optimal. Metadata fields such as linker, linker type and MOA can therefore be used to identify the structural variations within a set of ADCs being visualized.
Having associated all the components of an ADC facilitates comprehensive evaluation of SAR. All in vitro, in vivo and PK data can be uploaded to the corporate database and associated, at the lot level, with the relevant ADC component. Then, for example, it is possible to correlate the cell efficacy of the ADC with that of the free drug or the naked antibody.
5.0 Maleimide Hydrolysis
A known liability of ADCs using Cys-maleimide conjugation is the loss of the linker-drug through a reverse Michael reaction. Scientists at Genentech published data showing two important facts:
- hydrolysis of the maleimide ring affords a stable attachment;
- the environment surrounding the cysteine influences hydrolysis of the maleimide ring.
They showed that sites with a positively charged environment promoted hydrolysis of the maleimide ring. Seattle Genetics published data on maleimide hydrolysis showing that both a basic moiety proximal to the maleimide and a short alkyl chain between the maleimide and amide can catalyze ring hydrolysis at basic pH. Pfizer have shown that a PEG spacer between the maleimide and amide enables base-catalyzed ring hydrolysis.
Maleimide ring hydrolysis is also achieved for linker-drugs with an ethyl spacer between the maleimide and valine by treatment at pH 9 for 3 days. The ring-hydrolyzed maleimide structure is captured during registration of the X-combo (Figure 8.0).
Hydrolysis of the maleimide ring after conjugation can afford two possible hydrolyzed products. For clarity when visualizing the ADC structure, only a single product, with the X positioned alpha to the amide from the maleimide ring (as depicted in Figure 8.0), is captured in the database.
6.0 DAR Homogeneity
Having initially defined the criteria to determine a unique ADC as the combination of PR-# (antibody) + A-# (X-combo), it was decided that DAR should also be included. To enable data mining of this information, a minor modification to GBRS was made which added separate fields for aggregation, DAR and DAR separation.
Conjugation to inter-chain cysteines produces ADCs with a heterogeneous DAR population. To improve both the quality and consistency of ADCs synthesized at AbbVie, routine separation of the DAR species by hydrophobic interaction chromatography (HIC) was implemented. To enable immediate recognition of whether an ADC was a heterogeneous or a DAR-separated population, a simple terminology was adopted. For a heterogeneous DAR population the DAR was reported to one decimal place, for example, DAR 3.6. For a specific DAR peak following separation by HIC the DAR was reported as a whole number, for example, DAR 4.
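The reporting convention lends itself to a small formatting helper. The sketch below is illustrative; the function names are hypothetical and the rule that a separated species must have a whole-number DAR is enforced here as an assumption.

```python
def format_dar(value, separated):
    """Format a DAR according to the one-decimal / whole-number convention."""
    if separated:
        # A single HIC peak is reported as a whole number, e.g. "DAR 4".
        if float(value) != int(value):
            raise ValueError("a separated DAR species must be a whole number")
        return f"DAR {int(value)}"
    # A heterogeneous population is reported to one decimal place, e.g. "DAR 3.6".
    return f"DAR {float(value):.1f}"


def is_separated(label):
    """Infer from the label alone whether the ADC is a DAR-separated species."""
    return "." not in label
```

Because a heterogeneous population always carries a decimal point, even an average of exactly four is written "DAR 4.0" and remains distinguishable from the separated "DAR 4" species.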
7.0 Site of Conjugation
The final consideration was how to register ADCs when the site of conjugation is known, for example, with cysteine deletion and/or addition mutants. In these cases, the site of conjugation is captured in the antibody structure during the registration process for the antibody. As this is a novel antibody, it receives a different PR# to the native antibody so GBRS will recognize this and determine that the ADC is unique.
To make this mutation more readily apparent, the mutated amino acid along with its location is captured in the name field during registration in GBRS. For example, “ADC-123456-1307119-CYS237” would be entered in the name field to designate conjugation at CYS237. Using this format for entries in the name field not only ensures the correct identification of this ADC by the registration system, it also provides immediate clarity of the amino acid mutation(s).
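The name-field convention can be composed and parsed mechanically. The helper below is a sketch under assumptions: the function names are hypothetical, the pattern is inferred from the two example names given in the text, and only a single optional conjugation site is handled because the format for several mutations is not specified.

```python
import re

# Pattern inferred from "ADC-123456-1307119" and "ADC-123456-1307119-CYS237".
NAME_PATTERN = re.compile(
    r"^ADC-(?P<pr>\d+)-(?P<a>\d+)(?:-(?P<site>[A-Z]{3}\d+))?$"
)


def _number(corporate_id):
    """Strip the prefix and any salt suffix: 'A-1307119.0' -> '1307119'."""
    return corporate_id.split("-", 1)[1].split(".")[0]


def build_name(antibody_id, small_molecule_id, site=None):
    """Compose the name-field entry from the component corporate IDs."""
    name = f"ADC-{_number(antibody_id)}-{_number(small_molecule_id)}"
    return f"{name}-{site}" if site else name


def parse_name(name):
    """Split a name-field entry back into its component identifiers."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a recognized ADC name: {name!r}")
    return {
        "antibody_id": f"PR-{match['pr']}",
        "small_molecule_id": f"A-{match['a']}",
        "site": match["site"],  # None when no conjugation site is given
    }
```

For instance, `build_name("PR-123456", "A-1307119", "CYS237")` gives the entry quoted above, and `parse_name` recovers the antibody ID, the small molecule ID and the site from it.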
A custom and novel ADC registration process has been implemented with minimal modification to AbbVie’s small and large molecule registration software and compound workflows. This new process enables in-depth SAR interrogation based on all components of the ADC, including the ability to perform searches based on the structure of the linker-drug. A simple terminology was implemented to discriminate between heterogeneous and separated DAR populations, and other ADC property metadata are also captured.
ADC, antibody drug conjugate; Cit, citrulline; Cys, cysteine; DAR, drug to antibody ratio; GBRS, global biologics registration system; HIC, hydrophobic interaction chromatography; IT, information technology; MC, maleimide-caproyl; MOA, mechanism of action; MMD, monomethyl dolastatin 10; PABC, para-amino benzylic carbamate; SAR, structure-activity relationship; Val, valine.
August 14, 2017 | Authors: Adrian D. Hobson,* [a] Jeremy C. Packer, [b] Chris C. Butler [b] and Dirk A. Bornemeier [b]
[a] AbbVie Bioresearch Center, 381 Plantation Street, Worcester, MA 01605
[b] AbbVie, Inc., 1 North Waukegan Road, North Chicago, IL 60064
* [email protected]
The manuscript was written through contributions of all authors. All authors have given approval to the final version of the manuscript.
ADH, JCP, CCB and DAB are employees of AbbVie (or Abbott Laboratories prior to separation) and may own AbbVie/Abbott stocks or stock options and participated in the interpretation of data, review, and approval of the publication. The financial support for this work was provided by AbbVie.
We acknowledge Doug Pulsifier, Robert Gregg, Michael Huang, Sreekumar Menon, Randy Metzger, Hetal Patel, Teresa Rosenberg, Jennifer Van Camp and Philip Hajduk for their input with this project.
Original manuscript received: July 24, 2017 | Manuscript accepted for Publication: August 3, 2017 | Published online August 14, 2017 | DOI: 10.14229/jadc.2017.14.08.002
Last Editorial Review: August 11, 2017