Introduction

Perturbations in folate-mediated one-carbon metabolism (FOCM) are associated with numerous pathologies including neural tube defects (NTDs)1, stroke2, colorectal and other types of cancer3,4,5. Furthermore, enzymes in FOCM have been successful targets for the development of antineoplastic pharmaceutical agents including0020methotrexate and 5-flurouracil6. FOCM in the cytoplasm is composed of three interconnected biosynthetic pathways, which include de novo thymidylate (dTMP) synthesis, de novo purine synthesis and homocysteine remethylation to methionine (Fig. 1). The FOCM network is sensitive to nutritional status for several vitamins that serve as enzyme cofactors (folate, riboflavin, vitamin B6 and vitamin B12) and genetic factors (coding and expression variants in folate-dependent enzymes) that can alter network outputs, including DNA synthesis, DNA repair and chromatin methylation7,8,9. Understanding the molecular basis of disease etiology has been limited by the ability to ascribe specific FOCM pathways and their biomarkers to clinical outcomes, because the pathways of FOCM are tightly interconnected8. FOCM complexity is manifest by: (a) competition among the pathways for a limiting pool of folate cofactors10, (b) long-range and indirect regulatory processes, (c) formation of multi-enzyme complexes, (d) cellular compartmentation, (e) interactions with other metabolic pathways, (f) nutritional status (g) penetrant genetic variants9, 11. Mathematical models have been developed to assess this complexity and gain an understanding of the cause-and-effect relationships that regulate FOCM functioning in health and disease. The overall goal is to provide an understanding of function of the entire system in silico that can be used to accelerate discovery and guide the design of biological experimentation.

Figure 1
figure 1

The reaction-based specification of the model according to the notation introduced in Gostner et al.61. Rectangles identify model variables, non-boxed substrates are model constants, green circles identify enzymes, dark blue arcs identify matter transformation, and light blue arcs identify regulatory events (dotted lines indicate activations and solid lines indicate inhibitions). The purple boxes indicate reactions and variables associated with the folate cycle and the homocysteine remethylation cycle, respectively.

Early models of FOCM quantified steady-state concentrations of folates, and predicted how methotrexate and 5-fluorouracil affect the rates of de novo purine and thymidylate biosynthesis in L1210 cells12,13,14,15. These models considered FOCM reactions as occurring in a common cellular compartment, described the reactions in terms of Michaelis-Menten kinetics, and investigated biological consequences of inhibiting de novo purine and dTMP biosynthesis. Other models have been constructed to simulate the homocysteine remethylation in liver16,17,18. Using data from experimental animals, non-trivial mathematical functions were adopted to address long-range inhibition and activation processes involving folates and other metabolites that cannot be translated into simple Michaelis-Menten equations17. These models can reproduce changes in metabolite concentrations in response to nutritional deficiencies and the effects of gene variants, thus corroborating human clinical and epidemiological data. These models have also integrated other pathways such as glutathione metabolism and polyamine synthesis, studying the influence of indirect effects beyond FOCM19, 20.

These existing FOCM models describe the continuous flux of metabolite concentrations in terms of ordinary differential equations (ODEs) with time as an independent variable13, 20. Sensitivity analysis is used to summarize the effects on metabolite concentrations and reaction velocities in response to changes in inputs or enzyme activities21. Here we present a hybrid stochastic model for simulating the FOCM dynamics where state-of-the art deterministic simulation (based on ODEs) has been coupled with exact stochastic simulation to assess metabolite variabilities in the FOCM network at steady state. A detailed description of this combined approach can be found in Methods. The deterministic approach used in isolation can only provide a rough estimate of FOCM model dynamics, because the deterministic approach is limited when enzyme substrates, such as folate cofactors, are present in the cells at low micromolar concentrations, and because reactions within the network occur randomly at discrete time points. FOCM is expected to exhibit variability (i.e. stochasticity) in its behavior22. Capturing system stochasticity is essential when substrate concentrations are low and limiting, but requires consideration of molecules as discrete entities, rather than describing concentrations as continuous variables through ODEs23. Simulation strategies that combine both deterministic and stochastic approaches can give a more accurate and more detailed understanding of FOCM network functioning and stability. In contrast to approaches based solely on deterministic simulation, these studies can be used to assess the contributions of factors such as genetic variation and nutritional status on the stochastic behavior of individual pathways within the network, thereby aiding in establishing which system inputs (i.e. nutrition) and outputs (i.e. biomarkers) are most closely associated with human health outcomes.

Dysregulation of the partitioning of one-carbon units in the form of 5,10-methylenetetrahydrofolate (CH2F) cofactors between the de novo dTMP biosynthesis and homocysteine remethylation pathways is believed to underlie FOCM-associated pathologies including NTDs (Fig. 1). A common variant of MTHFR, the C677T polymorphism, has been associated with numerous pathologies including birth defects, cancer, cardiovascular events, and other pathologies24. MTHFR catalyzes the FADH-dependent, irreversible conversion of CH2F to 5mTHF, which commits folate cofactors away from dTMP synthesis and towards homocysteine remethylation in the cytosol (Fig. 1). The variant results from an alanine to valine substitution in the protein that decreases MTHFR activity by decreasing its affinity for the FADH cofactor. Such substitution affects enzyme stability and hence the partitioning of folates between dTMP synthesis and homocysteine remethylation25, 26. Decreased MTHFR activity resulting from the polymorphism decreases 5mTHF synthesis, leading to impaired homocysteine remethylation and elevated serum homocysteine24. The 677 T variant is also associated with a redistribution of cellular folate cofactors; 5mTHF is the predominate form of folate in red blood cells in MTHFR 677CC carriers, whereas 10fTHF is the predominate form of folate in MTHFR 677TT carriers27,28,29. 10fTHF is less chemically stable than 5mTHF, and the MTHFR 677TT variant is associated with lower folate status30 and higher folate requirement31. Recent studies suggest that the contribution of the MTHFR variant to NTD risk is due to its impact on cellular folate status, rather than impaired homocysteine remethylation32. Likewise, mouse models of NTDs indicate that impaired dTMP synthesis, and not homocysteine remethylation, cause folate-responsive NTDs33,34,35. Reed et al. investigated the consequences of the MTHFR C677T polymorphism, assuming 70% enzyme activity for heterozygote and 30% enzyme activity for homozygote, in comparison to CC homozygotes (which was set to 100% activity) using parameters for folate monoglutamates, which are not the physiological form of folate cofactors in cells. Under these conditions, the variant allele decreased concentrations of 5mTHF and SAM and increased the concentrations of homocysteine, SAH, and rates of dTMP and purine biosynthesis21, which is inconsistent with current understanding of biochemical changes associated with NTD risk. The effect on the redistribution of folate cofactors towards 10fTHF that is associated with the 677 T variant, or its impact on other pathways within the network, was not reported21.

Here, we studied the partitioning of CH2F, a cofactor for both homocysteine remethylation and de novo dTMP biosynthesis36,37,38, and the effects of known genetic and nutritional variables that impact movement of CH2F through the network. Existing models are limited by adopting kinetic parameters determined from the use of folate monoglutamate substrates19, 20. Folate polyglutamates are the functional form of folate cofactors in cells and have much higher affinity for their respective FOCM enzymes than the corresponding monoglutamate forms of folate39, 40. Therefore, we updated the parameters in the deterministic model to include the physiologically-relevant polyglutamate forms of folate cofactors and demonstrate that it faithfully recapitulates existing data in the literature (see Methods, Model Validation and Fig. 2). The decisions for selecting individual enzyme kinetic parameters for this model were driven by: 1) available data for physiologically relevant polyglutamate forms of the folate cofactors derived from characterization of mammalian enzymes, and 2) data from human models, specifically L1210 cells, because of the richness and quality of the data used to derive kinetic parameters. Given the high conservation of folate enzymes among mammals, our model could be applied to mammalian systems in general, even though we are not proposing a completely homogeneous model with respect to species.

Figure 2
figure 2

Steady states for five different values of glycine (folate cycle in % and the homocysteine remethylation cycle in μM). Where possible, a trend arrow is provided on the right to show the experimental outcome observed in Herbig et al.28. All the trends are consistent with literature (green arrows) except for the total % of 5 mTHF in the glycine range 5–10 mM (red arrow).

We were able to identify key nodes in the network of Fig. 1 that exhibit high degrees of stochastic behavior, including the influence of nutrient status and genetic variation on stochasticity through simulations. We explored the impact of the MTHFR C677T polymorphism and its interaction with folate status on partitioning of CH2F within the network, including its impact on de novo dTMP biosynthesis to understand the etiology of NTDs. The results of the computational model provide evidence that the rates of de novo dTMP synthesis as currently modeled in the cytosol are insufficient to support DNA synthesis in S-phase in mammals, accounting for uracil misincorporation into DNA that occurs in folate deficiency and in mouse models of NTDs.

Results

The effect of folate status and the MTHFR polymorphism on pathways affecting NTD risk

In the current model, MTHFR activity was decreased to model the effect of the MTHFR C677T polymorphism. In addition, a two-fold increase in MTHFR activity was modeled to examine whether there was a dose-response relationship between MTHFR activity and various readouts of the network. This model shows that 5mTHF levels decrease as MTHFR activity decreases, reflecting the effects of the MTHFR C677T polymorphism (Table 1). The current model also recapitulates the biological observation that decreased MTHFR activity results in accumulation of 10fTHF and THF (Table 1). Inclusion of 10fTHF in the folate distribution is a strength of the current model in that it allows for estimation of the effect that perturbations to the system have on accumulation of this unstable form of folate, which likely accounts for the decreased folate status linked to NTDs in carriers of the polymorphism30, 32.

Table 1 Distribution of folate (in percentage of total folate) for different levels of MTHFR activity (ranging from 2x to 0.3x of wild type). CC, CT and TT refer to the C677T polymorphism.

SAM and SAH levels vary markedly with changes in MTHFR activity (SAM levels decrease and SAH levels increase by more than 50% when comparing the “CC” model to the “TT” model), with a SAM/SAH ratio around one being achieved in CC homozygotes compared to 0.24 in TT homozygotes (Table 2). Although methylation potential, otherwise known as the SAM/SAH ratio, changes markedly with varying MTHFR activity, homocysteine concentrations appear relatively insensitive to changes in MTHFR in this model (Table 2), inconsistent with known effects of the MTHFR TT genotype in elevating serum homocysteine levels41. However, in this model the lack of elevation in homocysteine due to the MTHFR C677T polymorphism reflects that the model represents a closed system leading to intracellular conversion of cellular homocysteine to SAH as opposed to export of homocysteine into the circulation (Table 2).

Table 2 Concentrations of model variables (in µM) for different levels of MTHFR activity (ranging from 2x to 0.3x of wild type). CC, CT and TT refer to the C667T polymorphism.

The activity of each enzyme in the FOCM network as predicted by the current model indicates that accumulation of 10fTHF resulting from decreased MTHFR activity is due to two factors: (1) an increased flux through both the 10fTHF synthetase activity leading to increased synthesis (FTS), and (2) an increased flux through the cyclohydrolase/dehydrogenase activity of MTHFD1 which converts CH2F to CHF to 10fTHF (MTCH, MTD activities, respectively, Table 3). Interestingly, the accumulation of 10fTHF does not affect flux through the enzymes that use 10fTHF as a co-factor for de novo purine synthesis (PGT, AICART; Table 4). This is consistent with empirical experimental findings that increasing cellular levels of 10-formytetrahydrofolate dehydrogenase, which consumes 10fTHF, do not affect de novo purine synthesis42. Flux through the de novo dTMP synthesis pathway increases with decreasing MTHFR activity, consistent with empirical studies indicating these two pathways compete for CH2F (Table 4, columns DHFR, TYMS and SHMT)43. These findings are in agreement with empirical data showing that the TT polymorphism results in an increase in CH2F available for dTMP synthesis as indicated by isotope tracer studies in humans26.

Table 3 Fluxes of the reactions catalyzed by the enzymes FTS, MTCH, MTD, MTHFR and MTR and the binding of 5mTHF and SHMT for different levels of MTHFR activity (ranging from 2x to 0.3x of wild type). CC, CT and TT refer to the C667T polymorphism.
Table 4 Fluxes of the reactions catalyzed by the enzymes PGT, AICARFT, DHFR, TYMS, SHMT and the unbinding of 5mTHF and SHMT for different levels of MTHFR activity (ranging from 2x to 0.3x of wild type). CC, CT and TT refer to the C667T polymorphism.

5mTHF binds to and is a potent inhibitor of SHMT and glycine N-methyltransferase (GNMT). Decreased MTHFR activity, which lowers cellular 5mTHF levels (Table 2), increases the flux through the SHMT1-catalyzed reaction in the direction of serine catabolism to glycine (Table 4). This reflects the depletion of intracellular 5mTHF as well as glycine, resulting from increased GNMT activity that results from 5mTHF depletion (Table 5). Methionine synthase (MTR) flux is highly sensitive to MTHFR genotype reflecting its dependence on availability of its substrate 5mTHF that is generated by MTHFR (Table 3).

Table 5 Fluxes of the reactions catalyzed by the enzymes MTR, BHMT, MAT-I, MAT-III, GNMT, DNMT and SAHH for different levels of MTHFR activity (ranging from 2x to 0.3x of wild type). CC, CT and TT refer to the C667T polymorphism.

The MTHFR C677T variant affects both CH2F partitioning (between homocysteine remethylation and dTMP biosynthesis) and intracellular folate concentrations; the 677 T variant lowers intracellular folate levels. Therefore, the impact of this variant on FOCM was modeled at two different levels of folate (Tables 610) (folate replete conditions, 18 μM, and low-folate conditions, 9 μM) to understand how the variants function within the FOCM network as a function of folate cofactor availability. The results demonstrate that the changes in the percentage of 5mTHF (as well as other major one-carbon forms of folate) are more pronounced in the TT genotype than in the CC genotype when cellular folate levels are decreased (Table 6). The percentage of 5 mTHF in CC homozygous does not change in the folate replete and deficiency states, whereas the accumulation of 5 mTHF in TT homozygotes differs between the deficient and replete states (Table 6).

Table 6 Distribution of folate (in percentage of total folate) for replete (18 µM) and low (9 µM) levels of total folate and for the CC and TT case of the C667T MTHFR polymorphism.
Table 7 Concentrations of model variables (in µM) for replete (18 µM) and low (9 µM) levels of total folate and for the CC and TT case of the C667T MTHFR polymorphism.
Table 8 Fluxes of the reactions catalyzed by the enzymes FTS, MTCH, MTD, MTHFR and MTR for replete (18 µM) and low (9 µM) levels of total folate and for the CC and TT case of the C677T MTHFR polymorphism.
Table 9 Fluxes of the reactions catalyzed by the enzymes PGT, AICARFT, DHFR, TYMS and SHMT and the binding/unbinding of 5 mTHF and SHMT for replete (18 µM) and low (9 µM) levels of total folate and for the CC and TT case of the C677T MTHFR polymorphism.
Table 10 Fluxes of the reactions catalyzed by the enzymes BHMT, MAT-I, MAT-III, GNMT, DNMT and SAHH for replete (18 µM) and low (9 µM) levels of total folate and for the CC and TT case of the C677T MTHFR polymorphism.

Fluxes through FOCM pathways are affected by both the MTHFR C677T polymorphism and folate levels. The most sensitive pathways to folate deficiency are the MTCH and MTD activities of MTHFD1, MTHFR, MTR, DNMT, DHFR, and TYMS (Tables 810, row showing absolute flux differences between folate levels). Flux through the dTMP synthesis pathway (DHFR and TYMS) is highly sensitive to folate status for both the MTHFR CC and TT genotypes (Table 9), with the TT homozygotes being the most sensitive. Flux through GNMT was also highly sensitive to folate status in the CC homozygotes, whereas GNMT flux in TT homozygotes was insensitive to folate status (Table 10). Similar but less pronounced effects were seen for flux through SHMT (Table 9).

To understand if MTHFR genotype affects the stability of the FOCM network at steady state, the deterministic simulation was coupled with stochastic simulation using a hybrid simulation strategy (see Methods and Supplementary Material). Model steady states were obtained under four different conditions that differed by MTHFR 677 genotype (CC and TT case) and intracellular folate levels (replete and low). Interestingly, the most stable steady state (the one with lowest total sum of reaction propensities a 0(x), see Methods for details), was the CC case with folate replete concentrations, consistent with numerous epidemiological studies associating the MTHFR C677T genotype with folate-related pathologies (Table 11)43. The enzyme that exhibited the greatest level of stochasticity in response to folate levels and/or the MTHFR C677T polymorphism was SHMT1 (Table S6).

Table 11 Total propensities obtained in four steady state conditions according to folate polymorphism (CC and TT case) and total concentration of available folate (replete, 18 µM; low, 9 µM).

The molecular basis of uracil misincorporation into DNA

Mouse models implicate SHMT1 and impaired de novo dTMP synthesis in NTD risk. Impaired de novo dTMP synthesis causes an increase in dUMP, which when converted to dUTP causes uracil misincorporation into DNA because DNA polymerases do not distinguish between dTTP and dUTP44. The dTMP biosynthesis pathway enzymes (MTHFD1, SHMT, TYMS, and DHFR) are present in both the cytosol and recently have been found to function in the nucleus. In the nucleus, they comprise a multi-enzyme complex at sites of DNA synthesis that may be critical to limit rates of uracil misincorporation into DNA, but regulatory mechanisms remain unknown45. These enzymes are modified by the Small Ubiquitin-like MOdifier (SUMO) protein at the G1/S boundary, which permits their nuclear translocation during S-phase of the cell cycle46. One study showed that when nuclear translocation of this complex is impaired in a mouse model over-expressing SHMT1, rates of uracil misincorporation into DNA increased several fold45. In this model, SHMT1 protein levels were elevated several fold in the liver, yet its localization was restricted to the cytoplasm and nuclear SHMT1 levels were depleted compared to wild-type mice45. Furthermore, nuclei isolated from SHMT1 overexpressing mice exhibited lower rates of de novo dTMP synthesis compared to nuclei isolated from wild-type mice45. This suggests that de novo dTMP synthesis occurs when the enzymes are present in the multi-enzyme complex within the nucleus in mammals. However, no definitive experiment has been performed that identifies the relative contribution of nuclear and cytosolic dTMP synthesis to overall dTMP synthesis. Interestingly, S. cerevesiae do not import the dTMP synthesis pathway into the nucleus47.

To determine if nuclear import of the de novo dTMP pathway was required to meet cellular demands for dTTP during DNA replication, rates of dTMP synthesis were modeled for mammalian cells using standard Michaelis-Menten kinetics (Table 12; Table 4). Based on the number of A-T base pairs in the human genome and an 8-hour S-phase in embryonic stem cells (S-phase in L1210 cells is also 6–10 h48, 49), the rate of dTMP synthesis required for faithful cell replication is 7.8 µM/min (calculations are in Supplementary Material, see also Table 12)50, 51. In the current model, which does not account for SHMT1/TYMS/DHFR/MTHFD1 nuclear localization nor complex formation, cytosolic dTMP synthesis rates are 4.4 µM/min (Table 4, DHFR and TYMS flux, 263.4 µM/h, assuming MTHFR 677 CC genotype). This computational deficit between dTMP requirements and dTMP synthesis rates suggests that dTMP synthesis as currently modeled in the cytosol where the enzymes are not present in a complex cannot meet cellular needs. Nuclear localization and complex formation of the de novo dTMP synthesis complex seem to be unique to mammalian cells. In S. cerevesiae, TYMS is not SUMOylated and localizes to the nuclear periphery47. The measured rate of dTMP synthesis in S. cerevesiae is 1.8 µM/min52 (Table 12). The rate of dTMP synthesis required to replicate the S. Cerevesiae genome over the course of an S-phase (less than one hour52, 53) is 0.5 µM/min, indicating that yeast synthesize dTMP at a rate that is more than 3-fold greater than necessary for adequate dTMP synthesis (calculations are in Supplementary Material). Furthermore, in response to DNA damage, yeast increase dNTP concentrations 6–8 fold54 and E. coli increase dNTP concentrations 1.8–3.7 fold55, but dNTP concentrations do not increase after DNA damage in mammals56, 57.

Table 12 Cellular capacity for de novo dTMP synthesis in mammals and yeast at S-phase.

Discussion

Understanding the dynamics of FOCM and its responsiveness to both genetic and environmental perturbations is the key to understanding the etiology of folate-related pathologies. Computational models and related simulations permit an identification of the most sensitive reactions within the network that exhibit the greatest degree of stochastic behavior leading to variability in network outputs. Furthermore, computational models allow an understanding of how both genetics and environmental factors can enhance or repress stochastic behavior at defined locations within the network, accelerating the development of diagnostics to identify those at risk for folate-related pathologies as well as lead to the development of targeted nutritional interventions for disease prevention.

The Shmt1 knockout mouse model (Shmt1 +/−, Shmt1 −/− embryos) exhibits impaired de novo dTMP synthesis in the absence of perturbations of homocysteine remethylation. It also recapitulates risk for NTDs in humans. Specifically, the mouse model exhibits folate-responsive NTDs that occur with minimal perturbation in FOCM, and exhibit low and variable penetrance33, 34. In fact, most if not all, folate-related pathologies whose etiology involves interactions among genetic risk variants and nutrient exposures also exhibit low and/or variable clinical presentation. Understanding the stochastic behavior of the various reactions within FOCM that results in increased variability in FOCM network outputs is essential to understand which enzymes in the network contribute to folate-related pathologies.

Existing FOCM models rely on the limited quantity of kinetic data present in the literature, and the performance of the model will be dependent upon the kinetic parameters chosen to include in the model. Much of the available kinetic data for FOCM enzymes present in the literature was collected using the commercially available monoglutamate folate substrates, with few studies using the physiologically relevant polyglutamate forms of the cofactor. In the cell, newly transported monoglutamate folates are converted to folate polyglutamates, containing 3 to 7 polyglutamate moieties, though the action of folylpolyglutamate synthetase58. The polyglutamate chain (N = 3 glutamate and higher) increases the affinity of folate cofactors for many folate-dependent enzymes by one to two orders of magnitude39, 40. Models that include kinetic parameters derived from the use of folate monoglutamates can limit model reliability. Here we established a hierarchy of criteria to select a more homogeneous set of kinetic parameters (i.e. K m and V max ) by referring, when possible, to L1210 cells because of the richness and quality of the data used to derive kinetic parameters. Furthermore, our preference was to select kinetic coefficients generated using folate polyglutamate cofactors and purified proteins from animal models closest to humans, as the variability in kinetic parameters among mammals is much less than the differences observed between folate monoglutamate and polyglutamate cofactor substrates.

The current model was validated by demonstrating that it recapitulates empirical observations regarding the impact of intracellular glycine on behavior of the FOCM network (Fig. 2). The validated model was then used to understand how the MTHFR C677T polymorphism, a known genetic risk factors for NTDs in humans, affects FOCM. This model shows that the lower levels of 5mTHF associated with the MTHFR 677 T variant are accompanied by elevated levels of 10fTHF, which has been observed in animal models and in humans27,28,29 (Table 1). The model also indicates that 10fTHF accumulates in TT homozygotes as a result of increased flux through both the synthetase activity of MTHFD1 (FTS activity, Table 3), but also due to increased flux through MTHFD1 activity in the direction converting CH2F to 10fTHF (Table 3). Therefore, the model accurately predicts perturbations in FOCM that have been observed in human clinical and epidemiological studies. A recent study suggested that the risk of the MTHFR C677T polymorphism for NTDs was due to its known effect on lowering intracellular folate concentrations, rather than its role in providing 5 mTHF for homocysteine remethylation32. This model demonstrates that the MTHFR C677T polymorphism elevates levels of 10fTHF, which is known to be a chemically unstable form of folate that is susceptible to oxidative degradation, providing a mechanism by which the MTHFR C677T polymorphism depletes intracellular folate levels. Importantly, the model reported here demonstrates that the de novo dTMP biosynthesis enzymes are the most sensitive to low intracellular folate concentrations, with both DHFR and TYMS activities being repressed by 63% (Table 9). This finding is consistent with the finding that mouse models with impaired de novo dTMP biosynthesis are susceptible to NTDs in folate deficiency34. The hybrid stochastic simulation also reveals that both folate deficiency and the MTHFR C677T polymorphism create overall instability in the network (Table 11), consistent with a vast body of literature demonstrating an association of both folate deficiency and the MTHFR C677T polymorphism with various pathologies24, 30. Interestingly, SHMT1 exhibits the greatest increase in stochastic behavior as a result of the MTHFR C677T polymorphism (Table S6); the SHMT1 enzyme is the only FOCM enzyme that when disrupted results in folate-responsive NTDs34.

The primary findings of this study are that the FOCM network is destabilized as a result of folate deficiency and the MTHFR677T polymorphism, and that SHMT is the most sensitive enzyme within the network to this network instability. This finding nicely connects the MTHFR genetic variant, a known risk factor for human NTDs, and SHMT1, the only folate enzyme whose disruption results in folate-responsive NTDs in mice. Furthermore, this model predicts that de novo dTMP synthesis rates in mammals are about half of what is required to meet DNA replication demands for dTMP (Table 12). Although mammals contain two pathways for dTMP synthesis, the folate-dependent de novo dTMP synthesis pathway described here and a salvage pathway catalyzed by thymidine kinase 1, the salvage pathway activity is insufficient to meet cellular needs based on observations that folate deficiency results in elevated uracil accumulation in DNA. In mammalian cells, the de novo dTMP synthesis enzymes form a multi-enzyme complex that interacts with DNA replication enzymes46. The discrepancy between de novo dTMP synthesis rates required to replicate the genome and the rate of dTMP synthesis currently predicted by the model indicates that the model should be extended to include multi-enzyme complex formation and substrate channeling in the nucleus to model more accurately determinants of FOCM and dTMP synthesis. The inclusion of the dTMP multi-enzyme metabolic complex in the model is expected to limit substrate diffusion and increase the rate of dTMP synthesis.

Methods

Description of the Model and of the Simulation Techniques

The model was constructed as a closed system using the subset of reactions that describe the FOCM pathways and homocysteine remethylation in cytoplasm19 (Fig. 1). For the simulation, we employed a hybrid stochastic approach using deterministic simulation to compute the initial phase of the dynamics until a model steady state was reached, and then we assessed the stability of the achieved steady state by relying on exact stochastic simulation. We adopted a hybrid approach, rather than one entirely based on exact stochastic simulation59, because of the intensive computational effort introduced by the stiffness of the system during the simulation.

The deterministic simulation was based on ODEs, where reactions were described in terms of Michaelis-Menten equations consistent with the original model of Reed19 and computed using the MATLAB integrator ode15s, whereas parameter estimates were derived from literature or calculated by nonlinear least squares optimization. The kinetic constants were obtained from folate polyglutamate cofactors and their interaction with enzymes purified from L1210 cells where possible and otherwise from other mammalian tissue (see Supplementary Material for a detailed discussion of the model and parameter estimates).

The set of ODEs was further translated into a stochastic reaction-based model and a hybrid simulation approach60 was employed to quantify the level of stochasticity in the considered FOCM steady states. We refer to the Supplementary Material for a detailed description of how the ODE model has been translated to a reaction based stochastic one. According to the seminal work of Gillespie59, exact stochastic simulation allows simulation of each reaction event asynchronously when such reaction event is most probable to occur. To allow this, the simulation algorithm computes a propensity function a j (x) at each simulation step for each modeled reaction R j , where x is the current state of the system providing the abundances of all modeled species at the considered time. The propensity value of a reaction has a direct link with the probability of its execution, that is, reactions with higher propensity are more likely to be fired in the near future. To evaluate when the next reaction event will occur, also the total sum of propensities \(\,{a}_{0}(x)=\,\sum _{{R}_{j}}{a}_{j}(x)\) is computed, because this quantity is linked to the number of reaction events occurring in the next time unit, that is, with increasing total propensity the number of reaction events per unit of time also increases. In Table 11, we provide the total sum of propensities for the considered steady states. On average, we would need to generate up to 1015–1016 reaction events per unit of time during the stochastic simulation, due to stiffness of the system. To circumvent the problem, we relied on the concept of total propensity to evaluate the stability of the steady state, by assuming that a steady state is more stable when it has a lower value of a 0(x), that is, a lower averaged number of reaction events that can perturb the equilibrium of the steady state.

Model validation

To validate the FOCM model, in silico experiments were performed to determine if the model could recapitulate empirical data generated in MCF-7 cells by Herbig et al.10 focusing on the effect of glycine on FOCM (Fig. 2). Glycine is important because, as a second substrate, it has a direct influence on the reaction catalyzed by the enzymes GNMT as well as on the reversible reaction transforming CH2F to THF catalyzed by the enzyme SHMT. The purpose is to understand how the steady state of FOCM is affected by altering intracellular glycine concentrations. This was achieved by running several model simulations starting from different glycine concentrations and comparing the corresponding steady states with empirical data from Herbig et al.10. This study examined the effect of exogenous glycine at concentrations from 0 to 10 mM on the relative distribution of folate one-carbon forms as well as S-adenosylmethionine (SAM) and S-adenosylhomocysteine (SAH) levels. In summary, the empirical data revealed that as glycine concentrations increase intracellular 10fTHF levels increase at the expense of 5mTHF levels that decrease. Furthermore, as glycine concentrations increase SAM levels are depleted and SAH levels rise. These changes were interpreted by the effects of glycine concentration driving the reversible SHMT reaction in the direction of serine synthesis10.

We simulated the effect of glycine on folate distribution, SAM, and SAH concentrations using the computational model for values of glycine ranging from 0 to 10 mM (Fig. 2). The trends obtained by the model simulations were in agreement with the literature (green arrows in Fig. 2), confirming the coherence between model outcomes and empirical data. We observed only one exception related to the total % of 5 mTHF at 10 mM glycine (red arrow in Fig. 2). This discrepancy could be mainly due to two reasons: (1) 10 mM glycine is an extreme and non-physiological intracellular glycine concentration that could cause pharmacological effects, (2) the large magnitude of the experimental error in the Herbig et al. study at this glycine concentration.