An integrated approach to the study of Hypericum occurring in Sicily

An integrated approach to the study of taxa of the genus Hypericum occurring in Sicily is proposed. The results of morphological, biochemical, and molecular analyses are combined to better assess the relationships between the species investigated and to test the suitability of the DNA barcoding technique for discriminating these taxa. For the name Hypericum aegypticum subsp. webbii (Spach) N. Robson a lectotype is designated. For Hypericum triquetrifolium Turra a lectotype and a supporting epitype are designated. The presence of Hypericum perforatum L. subsp. perforatum in Sicily is excluded, and the previous reports are to be referred to H. perforatum subsp. veronense (Schrank) Ces. Hypericum perfoliatum L. and H. pubescens Boiss. are close morphologically and chemically, as well as in the results from the rbcL marker, although they belong to different sections. Biochemical analyses confirmed the relevant amounts of bioactive metabolites in the studied taxa. Hypericum perfoliatum L. is proposed as a valid alternative to H. perforatum L. for cultivation for phytotherapeutic purposes.

with phytochemical and genetic discrimination. In particular, the suitability of the DNA barcoding technique for discriminating the Hypericum taxa was investigated. As already successfully assessed in other groups such as Allium spp. (İpek et al., 2014), this technique can contribute to developing an easy authentication assay, helpful in resolving taxonomic doubts or in the commercial traceability of whole plants, plant parts, or derived products. This study also aimed to clarify which subspecies of H. perforatum occur in Sicily. In fact, Robson (2002) and Ciccarelli and Garbari (2004) reported H. perforatum subsp. perforatum as occurring in Italy only in the northern part of the Peninsula and attributed the Sicilian populations to H. perforatum subsp. veronense (Schrank) Ces. Conversely, Bartolucci et al. (2018) reported both subspecies for the whole peninsula, Sardinia, and Sicily.

Plant material
The 10 taxa of Hypericum, 9 species and 1 subspecies, were collected from natural populations in Sicily during the flowering period (from May to June) in 2013 and 2014 and were studied from the morphological, biochemical and genetic points of view. Voucher specimens were deposited in the Herbarium SAF (Table 1). The sampled populations were selected after extensive surveys of the whole regional territory.
The plant specimens were collected in bioclimatic belts ranging from the Lower Mesomediterranean to the Lower Oromediterranean, in the following subunits: Lampedusa Is., Northern Sicilian coast, Western Sicilian plain, Upper Madonie Mts, Lower and Upper Nebrodi Mts, Peloritani Mts, Lower Etna Mt., Iblei Mts. Field identification was based on morphological characters at the mature stage, compared with the original descriptions, the relevant literature (Robson and Adams, 1968; Robson, 1985, 1993, 2010), and the original materials.

Chemical analyses
For chemical determinations, flowering tops (15-20 cm) of at least 10 individuals per population were collected at full flowering during the central hours of the day. The collected material was carried in paper bags and dried at 20-25 °C in the dark. The analyses were performed according to Napoli et al. (2018); briefly, 5 g of dry material were chopped, homogenized and extracted in 50 mL of ethanol at room temperature for 72 h, in the dark and under constant agitation. The extract was filtered through filter paper and the filter was washed 3 times with 10 mL of ethanol. The resulting mixture was brought to dryness with a rotary evaporator. The chemical determinations were conducted by high-performance liquid chromatography with a diode array detector (HPLC-DAD), injecting 20 µL of a 10 mg/mL solution of each extract in HPLC-grade methanol (VWR). Each analysis was carried out in triplicate. Since the amount of H. triquetrifolium was too small for chemical determination, comparison data were obtained from the literature (Hosni et al., 2011). The mean values of the 20 chemical determinations used for the multivariate analyses, together with their totals, are reported in Supplementary Information 2, whereas the box plots of these values, averaged by species, are reported in Figure 2.
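The quantities stated in the protocol above imply some simple arithmetic, sketched here in Python. The function and variable names are illustrative assumptions and not part of the original method, and the triplicate readings are placeholder numbers, not measured data.

```python
from statistics import mean


def injected_mass_ug(volume_ul: float, conc_mg_per_ml: float) -> float:
    """Mass of dry extract loaded per injection, in micrograms."""
    # 1 mg/mL equals 1 ug/uL, so volume (uL) x concentration (mg/mL) gives ug.
    return volume_ul * conc_mg_per_ml


def total_ethanol_ml(extraction_ml: float, washes: int, wash_ml: float) -> float:
    """Ethanol used for the extraction plus the filter washes, in mL."""
    return extraction_ml + washes * wash_ml


# Values stated in the protocol: 20 uL of a 10 mg/mL solution,
# 50 mL of extraction solvent followed by 3 washes of 10 mL each.
mass = injected_mass_ug(20, 10)
ethanol = total_ethanol_ml(50, 3, 10)

# Placeholder triplicate readings (hypothetical, arbitrary units).
triplicate = [1.0, 2.0, 3.0]
triplicate_mean = mean(triplicate)
```

Under these figures each injection carries 200 µg of dry extract, and 80 mL of ethanol is used per sample before evaporation.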

DNA barcoding
The barcoding approach was adopted in support of the morphological and phytochemical investigation. Multiple individuals of each taxon were used for molecular analysis. Plant material for DNA extraction consisted of young lyophilized leaves. Genomic DNA extraction was based on the CTAB protocol for plant tissue (Doyle and Doyle, 1987).
In choosing the markers, we considered the compromise between the discrimination level supported by a marker and its amplification and sequencing success (Chase et al., 2005; Hollingsworth et al., 2009). The choice of trnH-psbA as an additional marker appeared logical for discriminating morphologically close samples.
Polymerase chain reaction amplifications were performed with the GeneAmp PCR System 9700 (Applied Biosystems, Foster City, CA, USA). Products were purified and bidirectionally sequenced (Amersham Biosciences DYEnamic ET Terminator Cycle Sequencing Kits), according to the Sanger protocol for the AB3730XL DNA Analyzer (Applied Biosystems). The resulting electropherograms were screened for errors and assembled into contigs using Sequencher software 4.10 (Gene Codes Corporation, Ann Arbor, MI, USA). Sequence alignments were carried out with MUSCLE, and a Neighbour-Joining phylogenetic tree was generated for molecular identification, based on the Kimura 2-parameter model, using MEGA 6 software (Kimura, 1980; Saitou and Nei, 1987; Tamura et al., 2013; Giovino et al., 2016). The comparison included all newly generated sequences, a subset of the most closely related sequences, and the significant BLAST results, downloaded from the GenBank database (Table 3).
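As an illustration of the distance model named above, the following minimal Python sketch computes the Kimura 2-parameter distance (Kimura, 1980) between two aligned sequences. It is not the MEGA implementation: the function name is hypothetical, and skipping every site that contains a gap or an ambiguous base (pairwise deletion) is an assumption made here for simplicity.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Assumption: sites with gaps or ambiguity codes are ignored.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    # d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q)
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)
```

A matrix of such pairwise distances is the input that a Neighbour-Joining algorithm (Saitou and Nei, 1987) would then turn into a tree.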

Statistical treatment of data
Following Giovino et al. (2015), Domina et al. (2017), and Domina (2018), each morphological character was subjected to a preliminary univariate analysis of variance (data not shown) according to the specific data structure, setting each morphological character as independent variable (X) and the taxon as dependent variable (Y), using PAST version 3.26b (Hammer et al., 2001; Hammer, 2019). Pearson correlation coefficients (r) among the 11 measured characters were calculated and are presented in Supplementary Information 3. Multivariate analyses, including discriminant analysis (DA; Figure 3) and principal component analysis (PCA; Figure 4), were performed. A cluster analysis with the paired group (UPGMA) algorithm and Euclidean similarity index was carried out for the morphological observations (Figure 5), as well as for the chemical components (Figure 6) and the molecular markers (rbcL, Figure 7; matK, Figure 8; trnH-psbA, Figure 9).
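The paired group (UPGMA) clustering on Euclidean distances mentioned above can be sketched in a few lines of Python. This is a didactic version of the algorithm, not the PAST implementation; the function names and the four sample points are hypothetical.

```python
import math
from itertools import combinations


def euclidean(u, v):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def upgma(points):
    """Return the merges (cluster_a, cluster_b, distance) in joining order."""
    sizes = {name: 1 for name in points}  # number of leaves per cluster
    dist = {
        frozenset((a, b)): euclidean(points[a], points[b])
        for a, b in combinations(points, 2)
    }
    merges = []
    while len(sizes) > 1:
        pair = min(dist, key=dist.get)    # closest pair of clusters
        a, b = sorted(pair)
        d = dist.pop(pair)
        new = f"({a},{b})"
        merges.append((a, b, d))
        for other in list(sizes):
            if other in (a, b):
                continue
            d_a = dist.pop(frozenset((a, other)))
            d_b = dist.pop(frozenset((b, other)))
            # UPGMA update: average weighted by the sizes of the merged clusters.
            dist[frozenset((new, other))] = (
                sizes[a] * d_a + sizes[b] * d_b
            ) / (sizes[a] + sizes[b])
        sizes[new] = sizes.pop(a) + sizes.pop(b)
    return merges


# Hypothetical two-character measurements for four specimens.
samples = {"A": [0.0, 0.0], "B": [0.0, 1.0], "C": [5.0, 0.0], "D": [5.0, 1.5]}
merges = upgma(samples)
```

With these toy points, A and B join first, C and D second, and the two pairs merge last at the mean of the four between-pair distances.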
Sampling, morphological and molecular data generated in this investigation were submitted to the BOLD database under the dedicated project code FMED (Ratnasingham and Hebert, 2007).

Morphology
Pearson correlation coefficients (Supplementary Information 3) showed a high level of association between the 2 measurements of leaf dimension (Leaf L and Leaf W, r = 0.965) and of capsule dimension (Caps L and Caps W, r = 0.876). Other highly correlated measurements were Petal W and Sepal L (r = 0.880) and Stylus L and Stamen L (r = 0.864). By contrast, non-significant negative correlations emerged between Petal L and Plant height (r = -0.08), and between Petal L and Leaf W (r = -0.05).
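For reference, the Pearson coefficient used for these pairwise comparisons can be computed as in the following Python sketch. The function name and the sample vectors are hypothetical and do not reproduce the study's measurements.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length (at least 2 values)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (dev_x * dev_y)


# Hypothetical leaf lengths and widths that grow together.
leaf_length = [10.0, 12.0, 15.0, 20.0]
leaf_width = [5.0, 6.0, 7.5, 10.0]
r = pearson_r(leaf_length, leaf_width)
```

Values of r close to 1 indicate that two characters carry largely redundant information, which is relevant when selecting variables for the multivariate analyses.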
The examined taxa were well discriminated by PCA (Figure 4), with the sole exception of the 4 populations identified as H. perforatum subsp. perforatum and H. perforatum subsp. veronense, which showed great variability and overlap. According to the DA (Figure 3, Table 2), the characters with the greatest discriminating ability were Stylus length, Sepal length, and Stamen length. In addition, in the box-plot analysis (Figure 1) these characters had extreme values that allowed most taxa to be discriminated, while partially overlapping in the populations of the H. perforatum group. The reduced morphological variability of the population of H. calycinum observable in the box-plot analysis (Figure 1) could be explained by the rather recent introduction of this taxon, which reasonably originated from a small number of cultivated individuals.
Overall, 94.17% of the cases were correctly classified by DA according to the a priori group assignment, and the only misclassification involved the subspecies of H. perforatum. The cluster analysis (Figure 5) showed a branch with H. calycinum and H. hircinum subsp. majus, separated from the other taxa. It also highlighted the admixture of the specimens belonging to H. perforatum subsp. perforatum and H. perforatum subsp. veronense.

Chemical analysis
Within the complex bioactive secondary metabolism of Hypericum, this study focused on the polyphenol, naphthodianthrone and phloroglucinol contents of the ethanolic extract. Results are reported in Table S2.
High intraspecific variability emerged, with different amounts of hypericins (hypericin + pseudohypericin) and hyperforin according to the genotype. From the biochemical point of view, the taxa were well distinguished, above all on the basis of their content of hyperforin and, to a lesser extent, of quercetin-3-O-rutinoside (rutin) and quercetin-3-O-galactoside (hyperoside).
The high discriminatory power found for the hyperforin content confirmed previous findings (Napoli et al., 2018). Hyperforin was detected in quite high amounts in H. perforatum (37-43 g kg⁻¹), followed by H. perfoliatum (24 g kg⁻¹) and H. pubescens (15 g kg⁻¹). Relevant quantities of this metabolite were also recorded in H. androsaemum (9 g kg⁻¹). Notably, hypericins (given as the sum of hypericin and its 3 biosynthetic precursors, namely protohypericin, pseudohypericin, and protopseudohypericin) were found in larger amounts in H. perforatum and in rather similar quantities in H. perfoliatum and H. tetrapterum, whereas they were almost absent in H. androsaemum, H. calycinum, H. hircinum, and H. triquetrifolium. The latter taxon stood out, instead, for the high quantities of 3-O-caffeoylquinic acid and flavonols detected, compared to all the other taxa. Given the increasing interest in the biological activities ascribed to biflavones (biapigenin and amentoflavone), it is also worth noting their high content in H. perfoliatum, H. perforatum, H. pubescens,

DNA barcoding
The rbcL locus showed the best performance in terms of amplification and sequencing success, whereas the trnH-psbA and matK markers showed higher potential for species-level resolution (Table 2). In particular, trnH-psbA discriminated 100% of the taxa successfully sequenced. Therefore, a multilocus approach (rbcL + trnH-psbA), able to resolve 80% of the taxa analysed (8/10), appeared to be the best compromise between sequencing success and discrimination power (Table 2).
In agreement with the results in Table 2, the phylogenetic trees for each barcoding marker showed the genetic relationships between the taxa included in this study (Figures 7, 8, and 9). Only the 2 subspecies H. perforatum subsp. perforatum and H. perforatum subsp. veronense were not discriminated.
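The resolution figure quoted above (8 of 10 taxa, 80%) can be illustrated with a deliberately simplified Python sketch in which a taxon counts as discriminated when its concatenated barcode is unique. This uniqueness criterion, the function name, the taxon labels and the toy sequences are all assumptions for illustration; they are neither the study's data nor its tree-based procedure.

```python
from collections import Counter


def discrimination_rate(barcodes):
    """Return (resolved taxa, fraction resolved) for a taxon -> sequence map."""
    counts = Counter(barcodes.values())
    # A taxon is resolved when no other taxon shares its sequence.
    resolved = [taxon for taxon, seq in barcodes.items() if counts[seq] == 1]
    return resolved, len(resolved) / len(barcodes)


# Toy concatenated barcodes (hypothetical strings, not real rbcL/trnH-psbA data).
toy_barcodes = {
    "taxon_1": "ACGTACGTCA",
    "taxon_2": "ACGTACGTGA",
    "taxon_3": "ACGTACGTTA",
    "taxon_4": "ACGTACGTAC",
    "taxon_5": "ACGTACGTCC",
    "taxon_6": "ACGTACGTGC",
    "taxon_7": "ACGTACGTTC",
    "taxon_8": "ACGTACGTAG",
    "taxon_9": "ACGTACGTTT",   # identical to taxon_10, so neither is resolved
    "taxon_10": "ACGTACGTTT",
}
resolved, rate = discrimination_rate(toy_barcodes)
```

In this toy set the two taxa sharing a sequence play the role of the two undiscriminated subspecies, giving a rate of 0.8.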

Discussion
The integrated approach applied in this work achieved a full characterization of several Hypericum species from Sicily. Morphological, chemical and genetic observations each offered a distinct point of view on the diversity of Hypericum; however, the multidisciplinary procedure allowed us to point out similarities and differences among the Hypericum taxa from Sicily that would not have been detected otherwise.
A combined comparison of the results from the 3 approaches showed that H. perfoliatum and H. pubescens are close morphologically and chemically, as well as in the results from the rbcL marker, although they belong to different sections (Robson et al., 2013-onwards): Hypericum Sect. Drosocarpium Spach, the former, and Sect. Adenosepalum Spach, the latter. Similarly, H. calycinum and H. hircinum subsp. majus are morphologically and chemically close, although they belong to different sections (Robson et al., 2013-onwards): Hypericum Sect. Ascyreia Choisy, the former, and Sect. Androsaemum (Duhamel) Godron, the latter.
Biochemical analyses confirmed the relevant amounts of bioactive metabolites in the studied taxa, attesting to the high quality of the investigated materials. Furthermore, H. perfoliatum showed values very close to those of H. perforatum, so that it can be suggested as a potential alternative to the latter. Wild populations from Sicily confirmed their suitability for straightforward cultivation aimed at obtaining high-quality plant material. It appears necessary, however, to perform thorough biochemical screenings extended to a larger number of populations. The results indicate the effectiveness of DNA barcoding in discriminating the taxa of Hypericum, suggesting the possibility of building a fast and accurate molecular identification method. This finding may be of great help in identifying taxa, even in herbal formulations.
According to the principal component analysis, there was no statistically significant morphological variation among the populations originally recognized as H. perforatum subsp. perforatum and H. perforatum subsp. veronense, collected in 4 different localities. Furthermore, none of the applied techniques was able to distinguish the populations of H. perforatum subsp. veronense from those that, based on their morphological traits, had formerly been attributed to H. perforatum subsp. perforatum. Hence, it is possible to attribute all the studied populations of H. perforatum to H. perforatum subsp. veronense, and to exclude the presence in Sicily of H. perforatum subsp. perforatum.