ISSN 0006-2979, Biochemistry (Moscow), 2024, Vol. 89, No. 4, pp. 637-652 © Pleiades Publishing, Ltd., 2024.
Modification of the Hi-C Technology
for Molecular Genetic Analysis of Formalin-Fixed
Paraffin-Embedded Sections of Tumor Tissues
Maria M. Gridina^1,2,a*, Yana K. Stepanchuk^1,2, Miroslav A. Nurridinov^1,2,
Timofey A. Lagunov^1,2, Nikita Yu. Torgunakov^1,2, Artem A. Shadsky^1,2,
Anastasia I. Ryabova^3, Nikolay V. Vasiliev^3, Sergey V. Vtorushin^3,5,
Tatyana S. Gerashchenko^3, Evgeny V. Denisov^3, Mikhail A. Travin^4,
Maxim A. Korolev^4, and Veniamin S. Fishman^1,2
^1 Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences,
630090 Novosibirsk, Russia
^2 Novosibirsk State University, 630090 Novosibirsk, Russia
^3 Research Institute of Oncology, Tomsk National Research Medical Center, Russian Academy of Sciences,
634009 Tomsk, Russia
^4 Research Institute of Clinical and Experimental Lymphology, Institute of Cytology and Genetics,
Siberian Branch of the Russian Academy of Sciences, 630117 Novosibirsk, Russia
^5 Siberian State Medical University, Ministry of Health of Russia, 634050 Tomsk, Russia
^a e-mail: gridinam@gmail.com
Received September 5, 2023
Revised October 31, 2023
Accepted October 31, 2023
Abstract. Molecular genetic analysis of tumor tissues is the most important step towards understanding the
mechanisms of cancer development; it is also necessary for the choice of targeted therapy. The Hi-C
(high-throughput chromatin conformation capture) technology can be used to detect various types of genomic
variants, including balanced chromosomal rearrangements, such as inversions and translocations. We propose
a modification of the Hi-C method for the analysis of chromatin contacts in formalin-fixed paraffin-embedded
(FFPE) sections of tumor tissues. The developed protocol makes it possible to generate high-quality Hi-C data
and detect all types of chromosomal rearrangements. We have analyzed various databases to compile a
comprehensive list of translocations that hold clinical importance for the selection of targeted therapy.
The practical value of molecular genetic testing is its ability to influence treatment strategies and to
provide prognostic insights. Detecting specific chromosomal rearrangements can guide the choice of targeted
therapies, which is a critical aspect of personalized medicine in oncology.
DOI: 10.1134/S0006297924040047
Keywords: chromosomal rearrangements, three-dimensional nuclear organization, oncology, FFPE sections
* To whom correspondence should be addressed.
INTRODUCTION
Together with single nucleotide variants (SNVs),
chromosomal rearrangements, including balanced trans-
locations and inversions, play a key role in the patho-
genesis of various cancers. Current genomic diagnostic
approaches enable genome-wide detection of SNVs and
copy number variations, offering significant insights
into oncogenic processes. However, efficient detection
of balanced chromosomal rearrangements remains
elusive. At the same time, these chromosomal rear-
rangements have been found in almost all types of can-
cer. Moreover, for some tumors, detection of balanced
chromosomal rearrangements is critical for the diag-
nosis, clarification of prognosis, and choice of therapy.
In many tumors, chromosomal rearrangements
not only accompany tumor development
but also act as the main cause (driver) of the oncogenic
transformation of cells. One example is the reciprocal
translocation observed in Burkitt’s lymphoma, in
which the MYC gene is moved from chromosome 8
to chromosome 14 and placed under the influence of the
immunoglobulin heavy chain enhancer, which results in
dysregulation of MYC expression [1]. If breakpoints occur
within the genes, this can lead to the gene fusions re-
sulting in the formation of chimeric proteins. Such fu-
sion proteins often involve transcription factors (ERG,
MYB) or protein kinases (ABL1, ALK, BRAF, EGFR, JAK2,
RET) that play a pivotal role in the oncogenic process.
The TMPRSS2-ERG gene fusion, which is prevalent in
a majority of prostate adenocarcinomas and approxi-
mately 20% of high-grade prostate intraepithelial neo-
plasias, illustrates this mechanism. TMPRSS2 is a serine
protease regulated by the androgen-dependent promot-
er and its fusion with the ERG oncogene results in the
ERG overexpression, a key event in the prostate cancer
pathogenesis [2]. Similar mechanisms involving ERG
fusion with other partners, such as NDRG1, EWS, and
FUS, have been implicated in other cancer types [3-5].
Gene fusion, a hallmark of various cancers, can
dysregulate gene expression and alter the function of
the encoded protein. Thus, if gene fusion results in the
truncation of one of the fusion partners, this can lead
to its overexpression due to the loss of negative reg-
ulatory elements (e.g., binding sites for microRNA) or
domains determining the protein lifespan. A notable
example is the MYB-NFIB gene fusion in adenoid cys-
tic carcinoma, resulting from the t(6;9) translocation
[6]. In this fusion, the chimeric transcript partially or
completely loses a region encoding the C-terminal reg-
ulatory domain of MYB containing the sites for protein
post-translational modification, as well as a non-cod-
ing sequence essential for the binding of microRNAs.
Consequently, the absence of these regulatory ele-
ments in the MYB portion of the fusion protein leads
to the upregulation of MYB expression and prolonged
protein lifespan [7].
Gene fusions can also lead to the production of
chimeric proteins with significantly altered functional
domains. Normally, the FGFR3 receptor tyrosine kinase
is activated through homo- or heterodimerization
in the presence of its ligand, fibroblast growth factor (FGF)
[8]. The translocation between chromosomes
4 and 7 results in the fusion of FGFR3 with BAIAP2L1.
The resulting chimeric protein is capable of
constitutive, ligand-independent homodimerization.
This aberrant dimerization is facilitated by the BAR
domains of BAIAP2L1, resulting in the FGFR3 kinase
activation and potent oncogenic activity [9].
For diagnostic purposes and long-term storage,
tumor samples are commonly preserved as formalin-fixed
paraffin-embedded (FFPE) tissue blocks.
FFPE blocks have many advantages, including stability
at room temperature, extended shelf life, and compat-
ibility with immunohistochemical analysis. However,
such fixation and storage of samples can lead to the
degradation of nucleic acids and appearance of arti-
facts, which requires optimization of molecular anal-
ysis methods [10]. Furthermore, the degradation and
modification of nucleic acids in FFPE samples compli-
cate the use of RNA sequencing for the detection of
biomarkers [11].
Routine methods for identification of chromosom-
al rearrangements in tumor tissues include FISH (flu-
orescence in situ hybridization), immunohistochemical
analysis, and RT-PCR. These approaches have obvious
limitations in the detection of novel or complex chro-
mosome rearrangements. Recent advances in high-
throughput sequencing have revolutionized clinical
genetics. Whole-genome sequencing (WGS) and whole-
exome sequencing (WES) using the short-read technol-
ogy have excelled in identifying SNVs and unbalanced
chromosomal rearrangements, but their accuracy in
repetitive genome regions is limited. Detection of bal-
anced rearrangements using WGS and WES depends
on the presence of chimeric reads encompassing the
rearrangement breakpoints and therefore requires a
high sequencing depth. Long-read sequencing methods
(PacBio and Oxford Nanopore) are effective for detect-
ing balanced chromosomal rearrangements, but their
efficiency diminishes when analyzing FFPE samples
due to the DNA degradation. Balanced chromosomal
rearrangements often trigger carcinogenesis through
two mechanisms: gene fusion and disruption of gene
expression resulting from alterations in the gene reg-
ulatory environment. Consequently, RNA sequencing
has emerged as an important tool for analyzing tumor
samples [12-14]. However, this technique demands a
high RNA quality, which is challenging when RNA is
isolated from FFPE samples [11-16]. Degraded RNA
fragments may lack crucial information on the fusion
sites. Moreover, RNA-seq technology faces sensitivity
issues in the case of low expression of fusion tran-
scripts [13] and fusions with non-coding regions [17]
and requires significant sequencing depth (20-30 mil-
lion paired-end reads) or targeted gene enrichment
[11]. The Hi-C (high-throughput chromatin conforma-
tion capture) method has been increasingly used in
recent years as an alternative approach for detect-
ing various types of chromosomal rearrangements.
Theadvantage of the method is its ability to detect bal-
anced rearrangements at a lower sequencing depth.
This efficiency is partly due to the fact that Hi-C does
not rely solely on the reads containing the breakpoint.
Instead, it analyzes changes in the chromatin contact
frequency within broad genomic regions and, therefore,
requires less sequencing depth for the detection of re-
arrangements [18-27].
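The idea of reading rearrangements from contact frequencies, rather than from reads that span a breakpoint, can be illustrated with a toy sketch. The function name, the input format, and the 1-Mb bin size below are assumptions made purely for illustration and do not correspond to any published Hi-C pipeline; the sketch only bins inter-chromosomal contacts and reports the pair of bins that contact each other most often.

```python
from collections import Counter

def top_trans_bin_pair(contacts, bin_size=1_000_000):
    """Return ((bin_a, bin_b), count) for the most frequent inter-chromosomal bin pair.

    contacts: iterable of (chrom1, pos1, chrom2, pos2) tuples (hypothetical format).
    Returns None if there are no inter-chromosomal contacts.
    """
    counts = Counter()
    for chrom1, pos1, chrom2, pos2 in contacts:
        if chrom1 == chrom2:
            continue  # keep only trans (inter-chromosomal) contacts
        a = (chrom1, pos1 // bin_size)
        b = (chrom2, pos2 // bin_size)
        counts[tuple(sorted((a, b)))] += 1  # order-independent key
    if not counts:
        return None
    return counts.most_common(1)[0]

# Invented toy data: five contacts joining one chr8 bin and one chr14 bin,
# plus one background trans contact and one cis contact.
toy_contacts = (
    [("chr8", 128_700_000, "chr14", 105_900_000)] * 5
    + [("chr8", 10_000_000, "chr14", 50_000_000)]
    + [("chr1", 1_000_000, "chr1", 2_000_000)]
)
print(top_trans_bin_pair(toy_contacts))
```

In real data, such counts would have to be compared with an expected background for the same bin pair, because a raw count by itself does not indicate a rearrangement.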
Here, we propose a new Hi-C protocol for analyz-
ing material from FFPE tumor sections. We introduced
significant modifications to the existing protocols
[28, 29], resulting in a highly reproducible technique
capable of generating high-quality Hi-C data and de-
tecting all types of chromosomal rearrangements. A key
innovation of our approach that distinguishes it from
traditional 3C methods is the use of a sequence-agnostic
nuclease, which not only facilitates detection
of chromosomal rearrangements but also expands the
application of this method to identification of SNVs in
clinically significant loci. We have analyzed various
databases to compile a comprehensive list of translo-
cations that hold clinical importance for selection of a
targeted therapy. The results of modeling conducted in
our study demonstrate that our method has a substan-
tial promise for clinical application.
MATERIALS AND METHODS
Analyzed samples. FFPE sections were obtained
from patients treated at several medical centers of the
Russian Federation. Eight patients were from the On-
cology Research Institute of the Tomsk National Med-
ical Research Center. Three patients (age, 43.6 ± 8.62
years) had morphologically verified grade 4 (G4) brain
tumors (glioblastoma, giant cell glioblastoma, and
diffuse astrocytoma). Five patients (age range, 28-65
years; average age, 50.4 ± 12.9 years) had morpholog-
ically confirmed chondrosarcomas of different lo-
calization (humerus, femur, tibia, pelvic bones, and
sternum); the tumor grades ranged from G2 to G3. Six
patients were treated at the Kemerovo Regional Clin-
ical Hospital; three of them had chronic lymphocytic
leukemia (CLL) and three patients had large cell lym-
phoma (LCL). The diagnoses were established based
on pathomorphological studies of excisional lymph
node biopsies and immunohistochemical verification
using a specialized antibody panel.
Tumor tissue samples were collected during surgi-
cal procedures. The samples were fixed in 10% neutral
buffered formalin for 24 h and embedded in paraffin
using standard techniques. For each tumor specimen,
10-μm-thick sections were prepared.
FFPE Hi-C. The developed protocol was based on
our previously proposed S1 Hi-C method [30] and in-
cluded the following steps:
1. Deparaffinization:
1.1. An FFPE section was placed in a 1.5-ml tube, and
1 ml of lysis buffer Y (150 mM Tris pH 8.0, 140 mM
NaCl, 0.5% Igepal, 1% Triton X-100) was added.
1.2. The FFPE section was incubated at 80°C
for 3 min.
1.3. Centrifugation was performed at 2500g for 5 min.
1.4. The paraffin layer was removed from the solution
surface.
1.5. Steps 1.2-1.4 were repeated (i.e., the total number
of incubations was two).
2. Lysis:
2.1. After the second centrifugation, the supernatant
was removed and the pellet was resuspended
in 1 ml of lysis buffer H (10 mM Tris pH 8.0, 10 mM
NaCl, 1% Triton X-100, 0.1% SDC, 20% EtOH).
2.2. The sample was incubated at 45°C overnight.
2.3. Centrifugation was performed at 2500g for
5 min.
2.4. The precipitate was washed once with 1 ml of
lysis buffer Y.
2.5. The sample was incubated in 1 ml of lysis buffer Y
for 1 h at room temperature on an orbital
shaker.
2.6. Centrifugation was performed at 2500g for
5 min.
2.7. The supernatant was removed, and the pellet
was resuspended in 500 μl of lysis buffer D
(50 mM Tris pH 7.5, 0.5 mM CaCl2, 0.3% SDS).
2.8. The sample was incubated at 37°C for 1 h.
2.9. SDS was quenched by adding 91 μl of 10% Triton
X-100 for 10 min at room temperature.
2.10. Centrifugation was performed at 2500g for
5 min.
2.11. The pellet was washed once with 500 μl of
1× S1 nuclease buffer (Thermo Scientific) containing
1% Triton X-100.
3. Chromatin fragmentation:
3.1. The pellet was resuspended in 80 μl of 1× S1
nuclease buffer.
3.2. 200 U of S1 nuclease (Thermo Scientific) was
added, and the mixture was incubated at 37°C
for 1 h.
3.3. The reaction was stopped by adding 5 μl
of 500 mM EDTA, and the chromatin was purified
with 1 volume of AMPure magnetic beads according
to the manufacturer's recommendations.
3.4. Chromatin associated with magnetic beads
was resuspended in 100 μl of H2O (chromatin
remained bound to the beads until the DNA isolation
stage).
4. Further steps, including biotin labeling, ligation,
DNA isolation, ligation fragment enrichment, and
preparation of NGS libraries, were performed according
to the protocol described by Gridina et al.
[31]. DNA quantity was determined with a Qubit
dsDNA HS Assay Kit.
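As a quick check on the reagent volumes in steps 2.9 and 3.3, the final concentrations after mixing can be estimated with simple dilution arithmetic. The helper name below is hypothetical, and the calculation assumes that the volumes of the pellet and of the added enzyme are negligible, which the protocol does not state.

```python
def diluted_concentration(stock_conc, added_vol, initial_vol):
    """Concentration of an added reagent after mixing, in the units of stock_conc.

    Assumes ideal mixing and ignores the pellet volume (an assumption).
    """
    return stock_conc * added_vol / (initial_vol + added_vol)

# Step 2.9: 91 ul of 10% Triton X-100 added to 500 ul of lysis buffer D
triton_final = diluted_concentration(10.0, 91.0, 500.0)   # about 1.54 %

# The 0.3% SDS already present in buffer D is diluted by the same addition
sds_final = 0.3 * 500.0 / (500.0 + 91.0)                  # about 0.25 %

# Step 3.3: 5 ul of 500 mM EDTA added to 80 ul of 1x S1 nuclease buffer
edta_final = diluted_concentration(500.0, 5.0, 80.0)      # about 29.4 mM

print(f"Triton X-100: {triton_final:.2f} %")
print(f"SDS: {sds_final:.2f} %")
print(f"EDTA: {edta_final:.1f} mM")
```

Under these assumptions, the quenching step leaves Triton X-100 in roughly sixfold excess over SDS.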
The prepared libraries were sequenced using the
BGI sequencing platform with 150-bp paired-end reads.
The sequencing depth was 10-100 thousand reads per
sample for shallow sequencing and ~80 million paired-
end reads for deep sequencing.
Modeling of chromosomal rearrangements was
performed with the Charm software (https://github.com/genomech/Charm/)
using the published results of genome-wide
Hi-C studies in IMR90 cells [32] (identifiers
SRR1658675, SRR1658676, SRR1658679) to create a da-
tabase of reference contacts. Each chromosomal rear-
rangement was modeled as heterozygous, with a total
number of ~30 million Hi-C contacts. The coordinates
for the boundaries of modeled chromosomal rear-
rangements (Online Resource 1) were rounded to the
nearest 5kb. Rearrangements smaller than 25kb were
scaled up to this threshold.
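The coordinate handling described above (rounding to the nearest 5 kb and enforcing a 25-kb minimum size) can be sketched as follows. This is not the Charm implementation: the function names are hypothetical, and the choice to extend short rearrangements symmetrically around their midpoint is an assumption, since the text does not specify how the scaling was done.

```python
def round_to_bin(position, bin_size=5_000):
    """Round a non-negative coordinate to the nearest multiple of bin_size (halves round up)."""
    return ((position + bin_size // 2) // bin_size) * bin_size

def normalize_rearrangement(start, end, bin_size=5_000, min_size=25_000):
    """Round both boundaries and enlarge the interval if it is shorter than min_size."""
    start = round_to_bin(start, bin_size)
    end = round_to_bin(end, bin_size)
    if end - start < min_size:
        # Assumption: extend symmetrically around the midpoint,
        # keeping the start on the bin grid and non-negative.
        mid = (start + end) // 2
        start = max(0, round_to_bin(mid - min_size // 2, bin_size))
        end = start + min_size
    return start, end

print(normalize_rearrangement(1_002_400, 1_011_900))  # short interval, enlarged to 25 kb
print(normalize_rearrangement(100_000, 300_000))      # already large enough, only rounded
```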
Construction of contact maps and analysis of
quality metrics of Hi-C libraries. Analysis of Hi-C
data and construction of Hi-C heatmaps were conducted
as outlined in [31]. A modified version of the Juicer
software, version 1.6 (available on GitHub:
https://github.com/genomech/juicer1.6_compact), was used to
calculate the quality scores.
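The quality scores reported for Hi-C libraries in this work are proportions: unmapped reads, PCR duplicates, dangling ends, and cis contacts among all Hi-C contacts. A minimal sketch of such a summary is given below. It does not reproduce the Juicer calculation; the function name, the input counts, and the choice of the total number of read pairs as the denominator for the first three metrics are assumptions made for illustration.

```python
def hic_quality_metrics(total_pairs, unmapped, duplicates, dangling_ends, cis, trans):
    """Summarize a Hi-C library as fractions.

    Denominators are assumptions: the first three metrics are divided by the
    total number of sequenced read pairs; cis contacts are divided by all
    Hi-C contacts (cis + trans).
    """
    if total_pairs <= 0:
        raise ValueError("total_pairs must be positive")
    contacts = cis + trans
    return {
        "unmapped": unmapped / total_pairs,
        "pcr_duplicates": duplicates / total_pairs,
        "dangling_ends": dangling_ends / total_pairs,
        "cis_among_contacts": cis / contacts if contacts else float("nan"),
    }

# Invented counts for a shallow-sequenced library
metrics = hic_quality_metrics(
    total_pairs=100_000, unmapped=5_000, duplicates=2_000,
    dangling_ends=40_000, cis=30_000, trans=10_000,
)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Because these are simple ratios, they can be estimated from shallow sequencing before committing a library to deep sequencing.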
RESULTS
Development of Hi-C protocol for FFPE samples
using S1 nuclease. The majority of published Hi-C
protocols have been designed for living cells or fresh
tissues [19, 31-33]; few of them were adapted for frozen
samples [34]. While these protocols have been well-es-
tablished for the respective sample types, with known
details and critical points [35-37], their applicability
to FFPE tumor sections is limited. Currently, there are
only two Hi-C protocols for the analysis of FFPE sections
[28, 29]. We compared these two existing protocols
(Table 1) and identified significant methodological
differences, particularly at the deparaffinization and
sample lysis stages. For deparaffinization, Troll et al. [28]
recommended xylene treatment followed by alcohol
washing, while Allahyar et al. [29] suggested a 3-min
incubation at 80°C, centrifugation, and removal of
the paraffin layer. After deparaffinization, in order
to make chromatin accessible to restriction
enzymes, Troll et al. treated the samples with proteinase K
(0.5 mg/ml for 1 h at 37°C), whereas Allahyar et al.
used sonication and subsequent incubation for 2 h
at 80°C. Both proteinase treatment and prolonged in-
cubation at 80°C can lead to the destruction of cross-
links formed by formaldehyde [38] and DNA release
from the chromatin. Finally, both studies suggested us-
ing restriction endonucleases with four-nucleotide rec-
ognition sites for chromatin fragmentation. However,
this approach can result in a low coverage of genom-
ic regions distant from the enzyme recognition sites,
which might limit identification of certain genomic
variants, such as SNVs in the oncogene exons located
far from the restriction sites.
We have developed a modified Hi-C protocol spe-
cifically designed for preparing libraries from FFPE
sections (Table 1). The lysis conditions were adjusted
to be less harsh, and we used S1 nuclease instead of
restriction endonucleases (see Materials and methods)
to provide uniform genome coverage [30].
During the preparation of the Hi-C libraries from
living cells or tissues, the quality of DNA fragments is
tested after the following key steps:
1. Post-lysis, pre-fragmentation;
2. Post-fragmentation, pre-ligation;
3. Post-ligation.
These control checkpoints are crucial for eval-
uating the quality of prepared libraries (Fig. 1, a, b).
Before fragmentation, a band corresponding to the
high-molecular-weight DNA should be detected, which
disappears after fragmentation with the formation
of many low-molecular fragments of various lengths.
After ligation, the distribution of fragment lengths
shifts to a higher molecular weight region.
Table 1. Comparison of FFPE Hi-C protocols

Step | Troll et al. [28] | Allahyar et al. [29] | Our protocol
--- | --- | --- | ---
Deparaffinization | xylene | 3 min at 80°C, centrifugation | 3 min at 80°C, centrifugation
Lysis | proteinase K, 1 h at 37°C | 0.6% SDS, sonication, incubation at 80°C for 2 h | lysis in the presence of ionic and nonionic detergents
Chromatin fragmentation | MboI, 1 h at 37°C | NlaIII, 1 h at 37°C | S1 nuclease, 1 h at 37°C
Biotin labeling | + | | +
Ligation | 2 h, room temperature | 1 h, 16°C | overnight, 16°C
Ligation product enrichment | + | | +

Fig. 1. Results of chromatin digestion and ligation in Hi-C experiments in living cells and tissues using DpnII (a) and S1 nuclease (b)
and in FFPE sections using our protocol (c). Lanes: 1, pre-fragmentation; 2, post-fragmentation; 3, post-ligation; M, 100-bp
ladder (SibEnzyme). d) DNA quantification (ng) in Hi-C libraries obtained from cells and FFPE sections of different tumor types.
The number of analyzed libraries from peripheral blood mononuclear cells (PBMCs) was 16; from FFPE sections: LCL, 13;
CLL, 9; gliomas, 4; sarcomas, 5.

We found that the standard quality controls typically
employed in the Hi-C library preparation from
living tissues were not applicable or representative
in the case of FFPE samples. Long-term fixation of tumor
tissues in formaldehyde and subsequent embedding
in a paraffin block lead to significant DNA degradation
[39, 40]. Consequently, DNA extracted from
FFPE blocks is already in a highly fragmented state.
According to our data, further nuclease treatment and
ligation do not result in contrasting changes in the
fragment length (Fig.1c). However, our analysis of the
sequencing data quality and visual examination of the
resulting FFPE Hi-C maps indicated successful com-
pletion of the key Hi-C protocol stages. For instance,
the Hi-C libraries represented in Fig. 1, as well as additional
libraries detailed in Online Resource 2 (samples
s11-s15), demonstrated acceptable quality metrics.
Hence, we believe that in the case of FFPE Hi-C, the de-
scribed controls are not necessary. Instead, we recom-
mend assessing the library quality based on the results
of shallow sequencing.
In Hi-C analysis of live cells using restriction
endonuclease DpnII or S1 nuclease, the amount of the starting
material can be easily varied. In the case of FFPE sections,
by contrast, accurately estimating the number of cells
in each section can be a challenge. We observed significant
variability in the amount of DNA isolated from FFPE samples
of different tumors, as well as from samples of the same
tumor type (Fig. 1d). For example, FFPE sections from sarcomas
consistently yielded the lowest amount of DNA, which
forced us to use three FFPE sections from a single
block for analysis. Therefore, we recommend determining
the required number of sections for each tumor
type to obtain a sufficient yield of DNA libraries.
Fig. 2. Quality metrics of FFPE Hi-C datasets: proportion of unmapped reads (a), proportion of PCR duplicates (b),
proportion of DEs (c), and proportion of cis contacts among all Hi-C contacts (d). Each dot represents an independent Hi-C library
preparation. The number of analyzed libraries from PBMCs was 16; from FFPE sections: LCL, 13; CLL, 9; gliomas, 4; sarcomas, 2.
Using the newly developed protocol, we prepared
FFPE Hi-C libraries from CLL, LCL, gliomas, and sar-
coma samples. The CLL sections were also used to
construct the libraries according to the protocols suggested
by Troll et al. [28] and Allahyar et al. [29]. After
sequencing, the libraries were assessed for their quality.
We observed a low number of unmapped reads and
PCR duplicates across all the libraries (Fig. 2, a and b,
respectively). Another key quality metric for Hi-C libraries
is the proportion of dangling ends (DEs), uninformative
fragments that are not ligation products.
The DE content in the Hi-C libraries prepared from
peripheral blood mononuclear cells (PBMCs) using S1
nuclease averaged 40%. However, in the FFPE Hi-C li-
braries prepared according to the developed protocol,
this proportion was higher and varied depending on
the tumor type. The highest DE content was observed
in the libraries prepared following the protocols of