\theorembodyfont\theoremheaderfont\theorempostheader

: \theoremsep
\jmlrvolume \firstpageno1 \jmlryear2024 \jmlrworkshopSymmetry and Geometry in Neural Representations

Hamiltonian Matching for Symplectic Neural Integrators

\NamePriscilla Canizares \Email[email protected]
\NameDavide Murari \Email[email protected]
\NameCarola-Bibiane Schönlieb \Email[email protected]
\NameFerdia Sherry \Email[email protected]
\NameZakhar Shumaylov \Email[email protected]
\addrDepartment of Applied Mathematics and Theoretical Physics
   University of Cambridge    Wilberforce Road    Cambridge CB3 0WA    UK
Abstract

Hamilton’s equations of motion form a fundamental framework in various branches of physics, including astronomy, quantum mechanics, particle physics, and climate science. Classical numerical solvers are typically employed to compute the time evolution of these systems. However, when the system spans multiple spatial and temporal scales numerical errors can accumulate, leading to reduced accuracy. To address the challenges of evolving such systems over long timescales, we propose SympFlow, a novel neural network-based symplectic integrator, which is the composition of a sequence of exact flow maps of parametrised time-dependent Hamiltonian functions. This architecture allows for a backward error analysis: we can identify an underlying Hamiltonian function of the architecture and use it to define a Hamiltonian matching objective function, which we use for training. In numerical experiments, we show that SympFlow exhibits promising results, with qualitative energy conservation behaviour similar to that of time-stepping symplectic integrators.

keywords:
Hamiltonian systems, backward error analysis, physics-informed machine learning, scientific machine learning.

1 Introduction

In this work, we will study the use of neural networks to integrate Hamiltonian systems, which were first defined in the context of classical mechanics and which have since found many applications in physics (Arnold, 1978). More specifically, when we speak of Hamiltonian systems, we are considering ordinary differential equations (ODEs) of the following form for a state variable x𝐑2d𝑥superscript𝐑2𝑑x\in\mathbf{R}^{2d}italic_x ∈ bold_R start_POSTSUPERSCRIPT 2 italic_d end_POSTSUPERSCRIPT and Hamiltonian function H:𝐑2d𝐑:𝐻superscript𝐑2𝑑𝐑H:\mathbf{R}^{2d}\to\mathbf{R}italic_H : bold_R start_POSTSUPERSCRIPT 2 italic_d end_POSTSUPERSCRIPT → bold_R:

dxdt=𝐉H(x),with𝐉=(0iddidd0).formulae-sequenced𝑥d𝑡𝐉𝐻𝑥with𝐉matrix0subscriptid𝑑subscriptid𝑑0\frac{\mathrm{d}x}{\mathrm{d}t}=\mathbf{J}\nabla H(x),\quad\text{with}\quad% \mathbf{J}=\begin{pmatrix}0&\operatorname{id}_{d}\\ -\operatorname{id}_{d}&0\end{pmatrix}.divide start_ARG roman_d italic_x end_ARG start_ARG roman_d italic_t end_ARG = bold_J ∇ italic_H ( italic_x ) , with bold_J = ( start_ARG start_ROW start_CELL 0 end_CELL start_CELL roman_id start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT end_CELL end_ROW start_ROW start_CELL - roman_id start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT end_CELL start_CELL 0 end_CELL end_ROW end_ARG ) . (1)

Typically, the state variable is partitioned into a position q𝐑d𝑞superscript𝐑𝑑q\in\mathbf{R}^{d}italic_q ∈ bold_R start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT and momentum p𝐑d𝑝superscript𝐑𝑑p\in\mathbf{R}^{d}italic_p ∈ bold_R start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT. Under standard (non-restrictive) assumptions on H𝐻Hitalic_H (Arnold, 1991), the corresponding initial value problem has a unique solution for any initial condition and initial time, which can be used to define the corresponding flow map ϕH:𝐑×𝐑2d𝐑2d:subscriptitalic-ϕ𝐻𝐑superscript𝐑2𝑑superscript𝐑2𝑑\phi_{H}:\mathbf{R}\times\mathbf{R}^{2d}\to\mathbf{R}^{2d}italic_ϕ start_POSTSUBSCRIPT italic_H end_POSTSUBSCRIPT : bold_R × bold_R start_POSTSUPERSCRIPT 2 italic_d end_POSTSUPERSCRIPT → bold_R start_POSTSUPERSCRIPT 2 italic_d end_POSTSUPERSCRIPT by

ddtϕH,t(x)=𝐉H(ϕH,t(x))andϕH,0(x)=x.formulae-sequencedd𝑡subscriptitalic-ϕ𝐻𝑡𝑥𝐉𝐻subscriptitalic-ϕ𝐻𝑡𝑥andsubscriptitalic-ϕ𝐻0𝑥𝑥\frac{\mathrm{d}}{\mathrm{d}t}\phi_{H,t}(x)=\mathbf{J}\nabla H(\phi_{H,t}(x))% \quad\text{and}\quad\phi_{H,0}(x)=x.divide start_ARG roman_d end_ARG start_ARG roman_d italic_t end_ARG italic_ϕ start_POSTSUBSCRIPT italic_H , italic_t end_POSTSUBSCRIPT ( italic_x ) = bold_J ∇ italic_H ( italic_ϕ start_POSTSUBSCRIPT italic_H , italic_t end_POSTSUBSCRIPT ( italic_x ) ) and italic_ϕ start_POSTSUBSCRIPT italic_H , 0 end_POSTSUBSCRIPT ( italic_x ) = italic_x . (2)

Since the exact flow map in eq. 2 is generally not accessible, it is necessary to make a “satisfactory” approximation to such an exact flow map. Depending on the context, this can have different meanings, but for Hamiltonian systems, there are important structural properties that can provide guidance: the flow map is symplectic, meaning that the Jacobian matrix DϕH,t(x)𝐷subscriptitalic-ϕ𝐻𝑡𝑥D\phi_{H,t}(x)italic_D italic_ϕ start_POSTSUBSCRIPT italic_H , italic_t end_POSTSUBSCRIPT ( italic_x ) satisfies the identity [DϕH,t(x)]𝐉[DϕH,t(x)]=𝐉,superscriptdelimited-[]𝐷subscriptitalic-ϕ𝐻𝑡𝑥top𝐉delimited-[]𝐷subscriptitalic-ϕ𝐻𝑡𝑥𝐉[D\phi_{H,t}(x)]^{\top}\mathbf{J}[D\phi_{H,t}(x)]=\mathbf{J},[ italic_D italic_ϕ start_POSTSUBSCRIPT italic_H , italic_t end_POSTSUBSCRIPT ( italic_x ) ] start_POSTSUPERSCRIPT ⊤ end_POSTSUPERSCRIPT bold_J [ italic_D italic_ϕ start_POSTSUBSCRIPT italic_H , italic_t end_POSTSUBSCRIPT ( italic_x ) ] = bold_J , and the Hamiltonian H𝐻Hitalic_H is conserved, i.e. H(ϕH,t(x))=H(x)𝐻subscriptitalic-ϕ𝐻𝑡𝑥𝐻𝑥H(\phi_{H,t}(x))=H(x)italic_H ( italic_ϕ start_POSTSUBSCRIPT italic_H , italic_t end_POSTSUBSCRIPT ( italic_x ) ) = italic_H ( italic_x ). Symplecticity also implies that volumes in phase space are preserved. When numerically approximating the flow map, it is desirable to take these structural properties into account, and doing so has led to the celebrated field of geometric numerical integration (Hairer et al., 2006).

The neural network architecture that we propose takes inspiration from the methods of geometric numerical integration to define a time-dependent symplectic neural network architecture, SympFlow, that can be used to approximate Hamiltonian flow maps. There are similarities between SympFlow and previous works on symplectic neural networks (Jin et al., 2020; Burby et al., 2021), but SympFlow differs from these approaches by incorporating a time dependence which allows an underlying Hamiltonian to be identified.

In this work, we study SympFlow from the perspective of physics-informed machine learning (Karniadakis et al., 2021): we design an unsupervised learning approach for approximating Hamiltonian flow maps. It is in principle also possible to go in the other direction, fitting the approximate flow map to trajectories, and extracting the underlying Hamiltonian to discover a physical model, a task which has previously been studied in the scientific machine learning literature (Bertalan et al., 2019; Greydanus et al., 2019).

2 Methodology

2.1 The SympFlow architecture

Fundamentally, SympFlow is defined by iterated composition of exact flow maps of time-dependent Hamiltonians, each of which depends either on position or momentum, but not both. Given an arbitrary C2superscript𝐶2C^{2}italic_C start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT function Vp:𝐑×𝐑d𝐑:subscript𝑉p𝐑superscript𝐑𝑑𝐑V_{\mathrm{p}}:\mathbf{R}\times\mathbf{R}^{d}\to\mathbf{R}italic_V start_POSTSUBSCRIPT roman_p end_POSTSUBSCRIPT : bold_R × bold_R start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT → bold_R, we can consider the map

ϕp,t((q,p))=(qp(qVp(t,q)qVp(0,q))),subscriptitalic-ϕp𝑡𝑞𝑝matrix𝑞𝑝subscript𝑞subscript𝑉p𝑡𝑞subscript𝑞subscript𝑉p0𝑞\phi_{\mathrm{p},t}((q,p))=\begin{pmatrix}q\\ p-(\nabla_{q}V_{\mathrm{p}}(t,q)-\nabla_{q}V_{\mathrm{p}}(0,q))\end{pmatrix},italic_ϕ start_POSTSUBSCRIPT roman_p , italic_t end_POSTSUBSCRIPT ( ( italic_q , italic_p ) ) = ( start_ARG start_ROW start_CELL italic_q end_CELL end_ROW start_ROW start_CELL italic_p - ( ∇ start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT italic_V start_POSTSUBSCRIPT roman_p end_POSTSUBSCRIPT ( italic_t , italic_q ) - ∇ start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT italic_V start_POSTSUBSCRIPT roman_p end_POSTSUBSCRIPT ( 0 , italic_q ) ) end_CELL end_ROW end_ARG ) , (3)

which is the flow map (starting from time 00) corresponding to the Hamiltonian Hp,t((q,p))=V˙p(t,q),subscript𝐻p𝑡𝑞𝑝subscript˙𝑉p𝑡𝑞H_{\mathrm{p},t}((q,p))=\dot{V}_{\mathrm{p}}(t,q),italic_H start_POSTSUBSCRIPT roman_p , italic_t end_POSTSUBSCRIPT ( ( italic_q , italic_p ) ) = over˙ start_ARG italic_V end_ARG start_POSTSUBSCRIPT roman_p end_POSTSUBSCRIPT ( italic_t , italic_q ) , where V˙psubscript˙𝑉p\dot{V}_{\mathrm{p}}over˙ start_ARG italic_V end_ARG start_POSTSUBSCRIPT roman_p end_POSTSUBSCRIPT stands for ddtVpdd𝑡subscript𝑉p\frac{\mathrm{d}}{\mathrm{d}t}V_{\mathrm{p}}divide start_ARG roman_d end_ARG start_ARG roman_d italic_t end_ARG italic_V start_POSTSUBSCRIPT roman_p end_POSTSUBSCRIPT and the subscript p indicates that the Hamiltonian depends on position, but not momentum. Similarly, for a C2superscript𝐶2C^{2}italic_C start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT function Vm:𝐑×𝐑d𝐑:subscript𝑉m𝐑superscript𝐑𝑑𝐑V_{\mathrm{m}}:\mathbf{R}\times\mathbf{R}^{d}\to\mathbf{R}italic_V start_POSTSUBSCRIPT roman_m end_POSTSUBSCRIPT : bold_R × bold_R start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT → bold_R, we can consider the map

ϕm,t((q,p))=(q+(pVm(t,p)pVm(0,p))p),subscriptitalic-ϕm𝑡𝑞𝑝matrix𝑞subscript𝑝subscript𝑉m𝑡𝑝subscript𝑝subscript𝑉m0𝑝𝑝\phi_{\mathrm{m},t}((q,p))=\begin{pmatrix}q+(\nabla_{p}V_{\mathrm{m}}(t,p)-% \nabla_{p}V_{\mathrm{m}}(0,p))\\ p\end{pmatrix},italic_ϕ start_POSTSUBSCRIPT roman_m , italic_t end_POSTSUBSCRIPT ( ( italic_q , italic_p ) ) = ( start_ARG start_ROW start_CELL italic_q + ( ∇ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT italic_V start_POSTSUBSCRIPT roman_m end_POSTSUBSCRIPT ( italic_t , italic_p ) - ∇ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT italic_V start_POSTSUBSCRIPT roman_m end_POSTSUBSCRIPT ( 0 , italic_p ) ) end_CELL end_ROW start_ROW start_CELL italic_p end_CELL end_ROW end_ARG ) , (4)

which is the flow map (starting from time 00) corresponding to the Hamiltonian Hm,t((q,p))=V˙m(t,p)subscript𝐻m𝑡𝑞𝑝subscript˙𝑉m𝑡𝑝H_{\mathrm{m},t}((q,p))=\dot{V}_{\mathrm{m}}(t,p)italic_H start_POSTSUBSCRIPT roman_m , italic_t end_POSTSUBSCRIPT ( ( italic_q , italic_p ) ) = over˙ start_ARG italic_V end_ARG start_POSTSUBSCRIPT roman_m end_POSTSUBSCRIPT ( italic_t , italic_p ). As above, the subscript m indicates that the Hamiltonian depends on momentum but not position. Although the Hamiltonians above take a very particular form, they naturally arise when applying splitting integration methods to separable Hamiltonians. By parametrising Vpsubscript𝑉pV_{\mathrm{p}}italic_V start_POSTSUBSCRIPT roman_p end_POSTSUBSCRIPT and Vmsubscript𝑉mV_{\mathrm{m}}italic_V start_POSTSUBSCRIPT roman_m end_POSTSUBSCRIPT as multi-layer perceptrons (MLPs), say, and composing such steps, we get a time-dependent symplectic map, the parameters of which can be optimised to fit data or, more generally, minimise an objective function.

2.2 The Hamiltonian of the SympFlow architecture

As we will see in Proposition 2.1, we can find a time-dependent Hamiltonian function corresponding to the SympFlow architecture. This allows us to essentially perform a backward error analysis: while SympFlow does not generally solve the true ODE under consideration, it solves a time-dependent Hamiltonian ODE, the Hamiltonian of which we can give an expression for. We can apply the following result for this purpose:

Proposition 2.1 (Proposition 1.4.D from Polterovich (2001)).

Let H1,H2:𝐑×𝐑2d𝐑:superscript𝐻1superscript𝐻2𝐑superscript𝐑2𝑑𝐑H^{1},H^{2}:\mathbf{R}\times\mathbf{R}^{2d}\to\mathbf{R}italic_H start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT , italic_H start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT : bold_R × bold_R start_POSTSUPERSCRIPT 2 italic_d end_POSTSUPERSCRIPT → bold_R be continuously differentiable functions, and ϕHt1,t,ϕHt2,t:𝐑2d𝐑2d:subscriptitalic-ϕsuperscriptsubscript𝐻𝑡1𝑡subscriptitalic-ϕsuperscriptsubscript𝐻𝑡2𝑡superscript𝐑2𝑑superscript𝐑2𝑑\phi_{H_{t}^{1},t},\phi_{H_{t}^{2},t}:\mathbf{R}^{2d}\to\mathbf{R}^{2d}italic_ϕ start_POSTSUBSCRIPT italic_H start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT , italic_t end_POSTSUBSCRIPT , italic_ϕ start_POSTSUBSCRIPT italic_H start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT , italic_t end_POSTSUBSCRIPT : bold_R start_POSTSUPERSCRIPT 2 italic_d end_POSTSUPERSCRIPT → bold_R start_POSTSUPERSCRIPT 2 italic_d end_POSTSUPERSCRIPT the exact flows (starting from time 00) of the Hamiltonian systems they define. Then, the map ψt=ϕHt2,tϕHt1,t:𝐑2d𝐑2d:subscript𝜓𝑡subscriptitalic-ϕsuperscriptsubscript𝐻𝑡2𝑡subscriptitalic-ϕsuperscriptsubscript𝐻𝑡1𝑡superscript𝐑2𝑑superscript𝐑2𝑑\psi_{t}=\phi_{H_{t}^{2},t}\circ\phi_{H_{t}^{1},t}:\mathbf{R}^{2d}\to\mathbf{R% }^{2d}italic_ψ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = italic_ϕ start_POSTSUBSCRIPT italic_H start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT , italic_t end_POSTSUBSCRIPT ∘ italic_ϕ start_POSTSUBSCRIPT italic_H start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT , italic_t end_POSTSUBSCRIPT : bold_R start_POSTSUPERSCRIPT 2 italic_d end_POSTSUPERSCRIPT → bold_R start_POSTSUPERSCRIPT 2 italic_d end_POSTSUPERSCRIPT is the exact flow of the time-dependent Hamiltonian system defined by the Hamiltonian function

Ht3(x)=Ht2(x)+Ht1(ϕHt2,t1(x)).subscriptsuperscript𝐻3𝑡𝑥subscriptsuperscript𝐻2𝑡𝑥subscriptsuperscript𝐻1𝑡superscriptsubscriptitalic-ϕsuperscriptsubscript𝐻𝑡2𝑡1𝑥H^{3}_{t}(x)=H^{2}_{t}(x)+H^{1}_{t}\Big{(}\phi_{H_{t}^{2},t}^{-1}(x)\Big{)}.italic_H start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ( italic_x ) = italic_H start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ( italic_x ) + italic_H start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ( italic_ϕ start_POSTSUBSCRIPT italic_H start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT , italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT ( italic_x ) ) .

As a result, given an overall SympFlow of the following form (with ϕm,tisuperscriptsubscriptitalic-ϕm𝑡𝑖\phi_{\mathrm{m},t}^{i}italic_ϕ start_POSTSUBSCRIPT roman_m , italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT taking the form of eq. 4 for some Vmisuperscriptsubscript𝑉m𝑖V_{\mathrm{m}}^{i}italic_V start_POSTSUBSCRIPT roman_m end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT and ϕp,tisuperscriptsubscriptitalic-ϕp𝑡𝑖\phi_{\mathrm{p},t}^{i}italic_ϕ start_POSTSUBSCRIPT roman_p , italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT taking the form of eq. 3 for some Vpisuperscriptsubscript𝑉p𝑖V_{\mathrm{p}}^{i}italic_V start_POSTSUBSCRIPT roman_p end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT)

ψ¯t=ϕm,tLϕp,tLϕm,t1ϕp,t1,subscript¯𝜓𝑡superscriptsubscriptitalic-ϕm𝑡𝐿superscriptsubscriptitalic-ϕp𝑡𝐿superscriptsubscriptitalic-ϕm𝑡1superscriptsubscriptitalic-ϕp𝑡1\bar{\psi}_{t}=\phi_{\mathrm{m},t}^{L}\circ\phi_{\mathrm{p},t}^{L}\circ\ldots% \circ\phi_{\mathrm{m},t}^{1}\circ\phi_{\mathrm{p},t}^{1},over¯ start_ARG italic_ψ end_ARG start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = italic_ϕ start_POSTSUBSCRIPT roman_m , italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_L end_POSTSUPERSCRIPT ∘ italic_ϕ start_POSTSUBSCRIPT roman_p , italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_L end_POSTSUPERSCRIPT ∘ … ∘ italic_ϕ start_POSTSUBSCRIPT roman_m , italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ∘ italic_ϕ start_POSTSUBSCRIPT roman_p , italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT , (5)

we can associate the SympFlow with a time-dependent Hamiltonian. To denote this Hamiltonian, we introduce the operator \mathcal{H}caligraphic_H sending a SympFlow into one of its generating Hamiltonian functions (all of which differ just by a constant), so one has (ψ¯):𝐑×𝐑2d𝐑:¯𝜓𝐑superscript𝐑2𝑑𝐑\mathcal{H}(\bar{\psi}):\mathbf{R}\times\mathbf{R}^{2d}\to\mathbf{R}caligraphic_H ( over¯ start_ARG italic_ψ end_ARG ) : bold_R × bold_R start_POSTSUPERSCRIPT 2 italic_d end_POSTSUPERSCRIPT → bold_R.

To assemble such a function, we can group the pairs of alternated momentum and position flows, finding the Hamiltonian associated with ϕm,tiϕp,tisuperscriptsubscriptitalic-ϕm𝑡𝑖superscriptsubscriptitalic-ϕp𝑡𝑖\phi_{\mathrm{m},t}^{i}\circ\phi_{\mathrm{p},t}^{i}italic_ϕ start_POSTSUBSCRIPT roman_m , italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ∘ italic_ϕ start_POSTSUBSCRIPT roman_p , italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT, which is Hti((q,p))=V˙mi(t,p)+V˙pi(t,q(pVmi(t,p)pVmi(0,p))).subscriptsuperscript𝐻𝑖𝑡𝑞𝑝superscriptsubscript˙𝑉m𝑖𝑡𝑝superscriptsubscript˙𝑉p𝑖𝑡𝑞subscript𝑝superscriptsubscript𝑉m𝑖𝑡𝑝subscript𝑝superscriptsubscript𝑉m𝑖0𝑝H^{i}_{t}((q,p))=\dot{V}_{\mathrm{m}}^{i}(t,p)+\dot{V}_{\mathrm{p}}^{i}(t,q-(% \nabla_{p}V_{\mathrm{m}}^{i}(t,p)-\nabla_{p}V_{\mathrm{m}}^{i}(0,p))).italic_H start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ( ( italic_q , italic_p ) ) = over˙ start_ARG italic_V end_ARG start_POSTSUBSCRIPT roman_m end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ( italic_t , italic_p ) + over˙ start_ARG italic_V end_ARG start_POSTSUBSCRIPT roman_p end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ( italic_t , italic_q - ( ∇ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT italic_V start_POSTSUBSCRIPT roman_m end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ( italic_t , italic_p ) - ∇ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT italic_V start_POSTSUBSCRIPT roman_m end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ( 0 , italic_p ) ) ) . The Hamiltonian (ψ¯)¯𝜓\mathcal{H}(\bar{\psi})caligraphic_H ( over¯ start_ARG italic_ψ end_ARG ) can then be expressed iteratively, aggregating from last layer to first as

HtL:i(x)=HtL:(i+1)(x)+Hti(ϕHtL:(i+1),t1(x)),i=1,,L1,formulae-sequencesuperscriptsubscript𝐻𝑡:𝐿𝑖𝑥superscriptsubscript𝐻𝑡:𝐿𝑖1𝑥superscriptsubscript𝐻𝑡𝑖superscriptsubscriptitalic-ϕsuperscriptsubscript𝐻𝑡:𝐿𝑖1𝑡1𝑥𝑖1𝐿1H_{t}^{L:i}(x)=H_{t}^{L:(i+1)}(x)+H_{t}^{i}\left(\phi_{H_{t}^{L:(i+1)},t}^{-1}% (x)\right),\,\,i=1,\ldots,L-1,italic_H start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_L : italic_i end_POSTSUPERSCRIPT ( italic_x ) = italic_H start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_L : ( italic_i + 1 ) end_POSTSUPERSCRIPT ( italic_x ) + italic_H start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ( italic_ϕ start_POSTSUBSCRIPT italic_H start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_L : ( italic_i + 1 ) end_POSTSUPERSCRIPT , italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT ( italic_x ) ) , italic_i = 1 , … , italic_L - 1 ,

where HtL:L=HtLsuperscriptsubscript𝐻𝑡:𝐿𝐿superscriptsubscript𝐻𝑡𝐿H_{t}^{L:L}=H_{t}^{L}italic_H start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_L : italic_L end_POSTSUPERSCRIPT = italic_H start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_L end_POSTSUPERSCRIPT, and

ϕHtL:i,t1=(ϕHtL:(i+1),tϕHti,t)1=ϕHti,t1ϕHtL:(i+1),t1.superscriptsubscriptitalic-ϕsuperscriptsubscript𝐻𝑡:𝐿𝑖𝑡1superscriptsubscriptitalic-ϕsuperscriptsubscript𝐻𝑡:𝐿𝑖1𝑡subscriptitalic-ϕsuperscriptsubscript𝐻𝑡𝑖𝑡1superscriptsubscriptitalic-ϕsuperscriptsubscript𝐻𝑡𝑖𝑡1superscriptsubscriptitalic-ϕsuperscriptsubscript𝐻𝑡:𝐿𝑖1𝑡1\phi_{H_{t}^{L:i},t}^{-1}=\left(\phi_{H_{t}^{L:(i+1)},t}\circ\phi_{H_{t}^{i},t% }\right)^{-1}=\phi_{H_{t}^{i},t}^{-1}\circ\phi_{H_{t}^{L:(i+1)},t}^{-1}.italic_ϕ start_POSTSUBSCRIPT italic_H start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_L : italic_i end_POSTSUPERSCRIPT , italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT = ( italic_ϕ start_POSTSUBSCRIPT italic_H start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_L : ( italic_i + 1 ) end_POSTSUPERSCRIPT , italic_t end_POSTSUBSCRIPT ∘ italic_ϕ start_POSTSUBSCRIPT italic_H start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT , italic_t end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT = italic_ϕ start_POSTSUBSCRIPT italic_H start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT , italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT ∘ italic_ϕ start_POSTSUBSCRIPT italic_H start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_L : ( italic_i + 1 ) end_POSTSUPERSCRIPT , italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT .

The Hamiltonian of the network (ψ¯)¯𝜓\mathcal{H}(\bar{\psi})caligraphic_H ( over¯ start_ARG italic_ψ end_ARG ) then corresponds to HtL:1superscriptsubscript𝐻𝑡:𝐿1H_{t}^{L:1}italic_H start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_L : 1 end_POSTSUPERSCRIPT.

2.3 Training SympFlow: Flow learning with Hamiltonian matching

The natural task to use the SympFlow architecture for is to approximate the flow map of a Hamiltonian system as in eq. 1. We will assume the existence of a compact subset Ω𝐑2dΩsuperscript𝐑2𝑑\Omega\subset\mathbf{R}^{2d}roman_Ω ⊂ bold_R start_POSTSUPERSCRIPT 2 italic_d end_POSTSUPERSCRIPT which is forward invariant, meaning that ϕH,t(Ω)Ωsubscriptitalic-ϕ𝐻𝑡ΩΩ\phi_{H,t}(\Omega)\subseteq\Omegaitalic_ϕ start_POSTSUBSCRIPT italic_H , italic_t end_POSTSUBSCRIPT ( roman_Ω ) ⊆ roman_Ω for every t0𝑡0t\geq 0italic_t ≥ 0.

To train SympFlow, we consider a loss function composed of two terms. The first is the usual physics-informed (PI) loss function (Raissi et al., 2019), based on the residual of eq. 2,

1(ψ¯)=1Ni=1Nddtψ¯ti(x0i)𝐉H(ψ¯ti(x0i))22,subscript1¯𝜓1𝑁superscriptsubscript𝑖1𝑁superscriptsubscriptnormdd𝑡subscript¯𝜓subscript𝑡𝑖superscriptsubscript𝑥0𝑖𝐉𝐻subscript¯𝜓subscript𝑡𝑖superscriptsubscript𝑥0𝑖22\mathcal{L}_{1}(\bar{\psi})=\frac{1}{N}\sum_{i=1}^{N}\Big{\|}\frac{\mathrm{d}}% {\mathrm{d}t}\bar{\psi}_{t_{i}}(x_{0}^{i})-\mathbf{J}\nabla H(\bar{\psi}_{t_{i% }}(x_{0}^{i}))\Big{\|}_{2}^{2},caligraphic_L start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ( over¯ start_ARG italic_ψ end_ARG ) = divide start_ARG 1 end_ARG start_ARG italic_N end_ARG ∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_N end_POSTSUPERSCRIPT ∥ divide start_ARG roman_d end_ARG start_ARG roman_d italic_t end_ARG over¯ start_ARG italic_ψ end_ARG start_POSTSUBSCRIPT italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT ( italic_x start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ) - bold_J ∇ italic_H ( over¯ start_ARG italic_ψ end_ARG start_POSTSUBSCRIPT italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT ( italic_x start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ) ) ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ,

where x0iΩsuperscriptsubscript𝑥0𝑖Ωx_{0}^{i}\in\Omegaitalic_x start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ∈ roman_Ω and ti[0,Δt]subscript𝑡𝑖0Δ𝑡t_{i}\in[0,\Delta t]italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ [ 0 , roman_Δ italic_t ] for every i=1,,N𝑖1𝑁i=1,\ldots,Nitalic_i = 1 , … , italic_N. Unlike classical numerical integrators, ΔtΔ𝑡\Delta troman_Δ italic_t can be chosen to be large. Using the analysis in the previous section, it is also possible to extract the underlying Hamiltonian of a given instance of SympFlow. This gives rise to a natural training objective, which we call the Hamiltonian matching loss and which constitutes the second term in our loss, defined as

2(ψ¯)=1Mi=1M((ψ¯)(ti,xi)H(xi))2,subscript2¯𝜓1𝑀superscriptsubscript𝑖1𝑀superscript¯𝜓subscript𝑡𝑖superscript𝑥𝑖𝐻superscript𝑥𝑖2\mathcal{L}_{2}(\bar{\psi})=\frac{1}{M}\sum_{i=1}^{M}\left(\mathcal{H}(\bar{% \psi})(t_{i},x^{i})-H(x^{i})\right)^{2},caligraphic_L start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ( over¯ start_ARG italic_ψ end_ARG ) = divide start_ARG 1 end_ARG start_ARG italic_M end_ARG ∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_M end_POSTSUPERSCRIPT ( caligraphic_H ( over¯ start_ARG italic_ψ end_ARG ) ( italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , italic_x start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ) - italic_H ( italic_x start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ) ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ,

where xiΩsuperscript𝑥𝑖Ωx^{i}\in\Omegaitalic_x start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ∈ roman_Ω and ti[0,Δt]subscript𝑡𝑖0Δ𝑡t_{i}\in[0,\Delta t]italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ [ 0 , roman_Δ italic_t ] for every i=1,,N𝑖1𝑁i=1,\ldots,Nitalic_i = 1 , … , italic_N. The loss function we optimise over the space of SympFlows, is then =1+2subscript1subscript2\mathcal{L}=\mathcal{L}_{1}+\mathcal{L}_{2}caligraphic_L = caligraphic_L start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT + caligraphic_L start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT, and we use PyTorch (Paszke et al., 2019) to get its gradient for use in the Adam optimiser (Kingma and Ba, 2017).

Given a trained SympFlow network, representing a function ψ¯:[0,Δt]×𝐑2d𝐑2d:¯𝜓0Δ𝑡superscript𝐑2𝑑superscript𝐑2𝑑\bar{\psi}:[0,\Delta t]\times\mathbf{R}^{2d}\to\mathbf{R}^{2d}over¯ start_ARG italic_ψ end_ARG : [ 0 , roman_Δ italic_t ] × bold_R start_POSTSUPERSCRIPT 2 italic_d end_POSTSUPERSCRIPT → bold_R start_POSTSUPERSCRIPT 2 italic_d end_POSTSUPERSCRIPT, we can apply it to longer time horizons as follows: extend ψ¯¯𝜓\bar{\psi}over¯ start_ARG italic_ψ end_ARG to [0,+)0[0,+\infty)[ 0 , + ∞ ) via ψ:[0,+)×𝐑2d𝐑2d:𝜓0superscript𝐑2𝑑superscript𝐑2𝑑\psi:[0,+\infty)\times\mathbf{R}^{2d}\to\mathbf{R}^{2d}italic_ψ : [ 0 , + ∞ ) × bold_R start_POSTSUPERSCRIPT 2 italic_d end_POSTSUPERSCRIPT → bold_R start_POSTSUPERSCRIPT 2 italic_d end_POSTSUPERSCRIPT defined as

ψt(x0):=ψ¯tΔtt/Δt(ψ¯Δt)t/Δt(x0),assignsubscript𝜓𝑡subscript𝑥0subscript¯𝜓𝑡Δ𝑡𝑡Δ𝑡superscriptsubscript¯𝜓Δ𝑡𝑡Δ𝑡subscript𝑥0\psi_{t}(x_{0}):=\bar{\psi}_{t-\Delta t\lfloor t/\Delta t\rfloor}\circ\left(% \bar{\psi}_{\Delta t}\right)^{\lfloor t/\Delta t\rfloor}(x_{0}),italic_ψ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ( italic_x start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) := over¯ start_ARG italic_ψ end_ARG start_POSTSUBSCRIPT italic_t - roman_Δ italic_t ⌊ italic_t / roman_Δ italic_t ⌋ end_POSTSUBSCRIPT ∘ ( over¯ start_ARG italic_ψ end_ARG start_POSTSUBSCRIPT roman_Δ italic_t end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT ⌊ italic_t / roman_Δ italic_t ⌋ end_POSTSUPERSCRIPT ( italic_x start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) , (6)

which can be considered an approximation of ϕH,t(x0)subscriptitalic-ϕ𝐻𝑡subscript𝑥0\phi_{H,t}(x_{0})italic_ϕ start_POSTSUBSCRIPT italic_H , italic_t end_POSTSUBSCRIPT ( italic_x start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) for every t0𝑡0t\geq 0italic_t ≥ 0.

3 Experiments

We consider two Hamiltonian systems: the harmonic oscillator, with H(q,p)=(q2+p2)/2𝐻𝑞𝑝superscript𝑞2superscript𝑝22H(q,p)=(q^{2}+p^{2})/2italic_H ( italic_q , italic_p ) = ( italic_q start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + italic_p start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) / 2, and the Hénon-Heiles system, with H(q1,q2,p1,p2)=(p12+p22)/2+(q12+q22)/2+q12q2q23/3𝐻subscript𝑞1subscript𝑞2subscript𝑝1subscript𝑝2superscriptsubscript𝑝12superscriptsubscript𝑝222superscriptsubscript𝑞12superscriptsubscript𝑞222superscriptsubscript𝑞12subscript𝑞2superscriptsubscript𝑞233H(q_{1},q_{2},p_{1},p_{2})=(p_{1}^{2}+p_{2}^{2})/2+(q_{1}^{2}+q_{2}^{2})/2+q_{% 1}^{2}q_{2}-q_{2}^{3}/3italic_H ( italic_q start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_q start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , italic_p start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_p start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) = ( italic_p start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + italic_p start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) / 2 + ( italic_q start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + italic_q start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) / 2 + italic_q start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_q start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT - italic_q start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT / 3. The Hénon-Heiles system is notable for exhibiting chaos. We train SympFlow and a comparable MLP (using only physics-informed loss), on an interval with Δt=1Δ𝑡1\Delta t=1roman_Δ italic_t = 1. We also compare to an adaptive integrator, ODE45, as implemented in SciPy (with the default tolerances) (Virtanen et al., 2020). More results are shown in appendix A, but note in particular the good long-time energy behaviour of SympFlow, compared to the other methods:

Refer to caption
Refer to caption
Figure 1: A comparison of the long-time energy conservation of SympFlow and an unconstrained neural flow map approximator on the Hénon-Heiles system.

4 Conclusions and discussion

We have given a demonstration of the SympFlow architecture and its application to integrating Hamiltonian systems. As shown in our experiments in section 3, SympFlow exhibits good long-time conservation of energy, as is common for classical time-stepping symplectic numerical integrators as well. There is a large body of theoretical research on such properties for time-stepping integrators, which may be leveraged in future research to theoretically support the behaviour that we have observed for SympFlow. The setting of Hamiltonian systems is not as restrictive as it may appear at a first glance: we have only considered ODEs here, but extensions are possible to PDEs (Bridges and Reich, 2006), and even to non-conservative systems Galley (2013); Galley et al. (2014); Tsang et al. (2015). Studying such extensions in more detail is a promising avenue for future research.

References

  • Arnold (1978) V. I. Arnold. Mathematical Methods of Classical Mechanics, volume 60 of Graduate Texts in Mathematics. Springer New York, New York, NY, 1978. ISBN 978-1-4757-1695-5 978-1-4757-1693-1. 10.1007/978-1-4757-1693-1.
  • Arnold (1991) V. I. Arnold. Ordinary Differential Equations. MIT Press, Cambridge, Mass., 8. print edition, 1991. ISBN 978-0-262-51018-9 978-0-262-01037-5.
  • Bertalan et al. (2019) Tom Bertalan, Felix Dietrich, Igor Mezić, and Ioannis G. Kevrekidis. On learning Hamiltonian systems from data. Chaos: An Interdisciplinary Journal of Nonlinear Science, 29(12):121107, December 2019. ISSN 1054-1500, 1089-7682. 10.1063/1.5128231.
  • Bridges and Reich (2006) Thomas J Bridges and Sebastian Reich. Numerical methods for Hamiltonian PDEs. Journal of Physics A: Mathematical and General, 39(19):5287–5320, May 2006. ISSN 0305-4470, 1361-6447. 10.1088/0305-4470/39/19/S02.
  • Burby et al. (2021) J. W. Burby, Q. Tang, and R. Maulik. Fast neural Poincaré maps for toroidal magnetic fields. Plasma Physics and Controlled Fusion, 63(2):024001, 2021. ISSN 0741-3335, 1361-6587. 10.1088/1361-6587/abcbaa.
  • Galley (2013) Chad R. Galley. Classical mechanics of nonconservative systems. Physical Review Letters, 110(17), 2013. 10.1103/physrevlett.110.174301. URL https://doi.org/10.1103%2Fphysrevlett.110.174301.
  • Galley et al. (2014) Chad R. Galley, David Tsang, and Leo C. Stein. The principle of stationary nonconservative action for classical mechanics and field theories. 2014. 10.48550/ARXIV.1412.3082. URL https://arxiv.org/abs/1412.3082.
  • Greydanus et al. (2019) Samuel Greydanus, Misko Dzamba, and Jason Yosinski. Hamiltonian neural networks. In H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett, editors, Advances in Neural Information Processing Systems, volume 32. Curran Associates, Inc., 2019. URL https://proceedings.neurips.cc/paper_files/paper/2019/file/26cd8ecadce0d4efd6cc8a8725cbd1f8-Paper.pdf.
  • Hairer et al. (2006) Ernst Hairer, Christian Lubich, and Gerhard Wanner. Geometric Numerical Integration: Structure-Preserving Algorithms for Ordinary Differential Equations. Number 31 in Springer Series in Computational Mathematics. Springer, Berlin ; New York, 2nd ed edition, 2006. ISBN 978-3-540-30663-4.
  • Jin et al. (2020) Pengzhan Jin, Zhen Zhang, Aiqing Zhu, Yifa Tang, and George Em Karniadakis. SympNets: Intrinsic structure-preserving symplectic networks for identifying Hamiltonian systems. Neural Networks, 132:166–179, 2020. ISSN 0893-6080. 10.1016/j.neunet.2020.08.017.
  • Karniadakis et al. (2021) George Em Karniadakis, Ioannis G. Kevrekidis, Lu Lu, Paris Perdikaris, Sifan Wang, and Liu Yang. Physics-informed machine learning. Nature Reviews Physics, 3(6):422–440, June 2021. ISSN 2522-5820. 10.1038/s42254-021-00314-5.
  • Kingma and Ba (2017) Diederik P. Kingma and Jimmy Ba. Adam: A Method for Stochastic Optimization. arXiv:1412.6980, 2017.
  • Paszke et al. (2019) Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In Advances in Neural Information Processing Systems, volume 32, pages 8026–8037, 2019.
  • Polterovich (2001) Leonid Polterovich. The Geometry of the Group of Symplectic Diffeomorphisms. Lectures in Mathematics ETH Zürich. Springer Basel AG, Basel, 2001. ISBN 978-3-7643-6432-8.
  • Raissi et al. (2019) M Raissi, P. Perdikaris, and G. E. Karniadakis. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378:686–707, 2019. ISSN 0021-9991. 10.1016/j.jcp.2018.10.045.
  • Tsang et al. (2015) David Tsang, Chad R. Galley, Leo C. Stein, and Alec Turner. “Slimplectic” Integrators: Variational Integrators for General Nonconservative Systems. The Astrophysical Journal, 809(1):L9, 2015. ISSN 2041-8213. 10.1088/2041-8205/809/1/l9.
  • Virtanen et al. (2020) Pauli Virtanen, Ralf Gommers, Travis E. Oliphant, Matt Haberland, Tyler Reddy, David Cournapeau, Evgeni Burovski, Pearu Peterson, Warren Weckesser, Jonathan Bright, Stéfan J. van der Walt, Matthew Brett, Joshua Wilson, K. Jarrod Millman, Nikolay Mayorov, Andrew R. J. Nelson, Eric Jones, Robert Kern, Eric Larson, C J Carey, İlhan Polat, Yu Feng, Eric W. Moore, Jake VanderPlas, Denis Laxalde, Josef Perktold, Robert Cimrman, Ian Henriksen, E. A. Quintero, Charles R. Harris, Anne M. Archibald, Antônio H. Ribeiro, Fabian Pedregosa, Paul van Mulbregt, and SciPy 1.0 Contributors. SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python. Nature Methods, 17:261–272, 2020. 10.1038/s41592-019-0686-2.

Appendix A Additional experimental results

A.1 Harmonic oscillator

Refer to caption
Figure 2: A trajectory of the harmonic oscillator predicted by SympFlow and ODE45.
Refer to caption
Refer to caption
Figure 3: A comparison of the long-time energy conservation of SympFlow and an unconstrained neural flow map approximator on the harmonic oscillator.

A.2 Hénon-Heiles system

Refer to caption
Figure 4: A trajectory of the Hénon-Heiles system predicted by SympFlow and ODE45.