An integrative data-driven model simulating C. elegans brain, body and environment interactions

Details of the neural network modelNeuron models

Neurons were modeled by morphologically derived multicompartmental models with somatic Hodgkin–Huxley dynamics58 and passive neurites. These neuron models captured both the biophysical properties and the spatial complexity of neurons. The models integrated three kinds of neuron data: ion channel, morphology and electrophysiology.

Ion channel models were determined according to published studies33. There were 14 ion channels divided into 4 categories: voltage-gated potassium channels (SHK-1, EGL-36, SHL-1, KVS-1, KQT-3, EGL-2 and IRK1/3), calcium-regulated potassium channels (SLO1, SLO2 and KCNL), voltage-gated calcium channels (EGL19, UNC2 and CCA1) and passive sodium channel (NCA). Model equations and parameters for ion channels are presented in Supplementary Tables 1 and 2.

For morphologies of neuron models, the spatial structures were adopted from the C. elegans neuronal anatomy models created by the Virtual Worm Project32 and the OpenWorm Project9. Each neuron model had a soma section and several neurite sections. Neurite sections were further divided into several compartments, each of which was shorter than 2 μm.

Only soma compartments had active ion currents. Soma compartments and neurite compartments shared the same passive properties. To identify the passive and active parameters that reproduce experimental data of voltage–current clamps, we initially used a deep learning tool59 that uses model-generated synthetic data to accurately and efficiently infer neural parameters. Then we manually fine-tuned these parameters. Parameters with extremely low values (−2) have negligible impact on the target dynamics. Therefore, they were manually set to 0 in simulation to simplify the model without influencing the model’s accuracy. The value of the parameters are presented in Supplementary Table 3.

We compared two features of the patch-clamp traces between experimental data and our single-neuron models, the voltage or current at steady state and the initial peak point (Supplementary Fig. 1). The steady state is where the membrane potential becomes steady. The initial peak point is identified as the first time its gradient decreased to a certain threshold (150 mV s−1) after the peak of the voltage or current gradient. Before identifying the initial peak point, the voltage or current gradient is smoothed using a Hamming filter to reduce the impact of experimental noise.

Less than 20% of the C. elegans neurons have been recorded for the desired electrophysiology data. To estimate model parameters for those neurons that were not recorded, a method was implemented to generalize parameter values. First, considering the available data and hierarchy of information flow in C. elegans nervous system15,27, the neurons were divided into five functional groups (sensory neuron, interneuron, command interneuron, head motor neuron and body motor neuron; Supplementary Table 4). Second, five neurons of each functional group (AWC35, AIY36, AVA37, RIM36 and VD5 (ref. 38)) were selected and their model parameters were optimized as described above. These parameter values of optimized neuron models were then generalized to other neuron models according to their functional categories. In other words, neuron models in the same functional group share identical electrical parameters. Each neuron model has its own morphologies, yet it has electrical properties similar to those of neurons in the same functional group.

The electrophysiology data in Supplementary Fig. 1 (gray traces) were collected from published papers. For the AIY (Liu et al.36, fig. 1a) and RIM (Liu et al.36, fig. 1a), the raw data were included as supplementary files in the original publications. However, for the AWC (Ramot et al.35, fig. 8b), AVA (Mellem et al.37, fig. 1d) and VD5 (Liu et al.38, fig. 8 (wild type)), the numerical data were not provided. Therefore, we extracted the numerical data from the figures using a digital image processing software (GetData Graph Digitizer). The extracted numerical traces are saved as data files in our GitHub respository (https://github.com/Jessie940611/BAAIWorm/tree/main/eworm/model_figure/single_neuron/electrophysiology_data).

Graded synapse and gap junction models

The neuron models were connected by graded synapses36,60 and gap junctions61 in the C. elegans neural network. Graded synapses, including excitatory and inhibitory synapses, were modeled according to published C. elegans models9,40,41. The synaptic conductance is continuously changed with the presynaptic membrane potential. The postsynaptic current is given by

$$\left\{\begin{array}{c}I=-ws(E-V)\\ \displaystyle\frac{{\mathrm{d}}s}{{\mathrm{d}}t}=\displaystyle\frac{{s}_{\infty }-s}{\tau }\\ {s}_{\infty }=\displaystyle\frac{1}{1+{{\mathrm{e}}}^{\displaystyle\frac{{V}_{{\mathrm{th}}}-{V}_{{\mathrm{pre}}}}{\delta }}},\tau =\displaystyle\frac{1-{s}_{\infty }}{k}\end{array}\right.,$$

where w is the maximal postsynaptic membrane conductance for the synapse, s is a voltage-sensitive parameter controlling the conductance, E is the reversal potential, τ is the time constant of s, Vth is the presynaptic equilibrium potential at the middle of the voltage-sensitive range and Vpre is the presynaptic membrane potential.

Gap junctions were modeled as simple resistances, where current flowing from presynaptic cell to postsynaptic cell is given by

$$I=-w\left({V}_{{{\mathrm{pre}}}}-V\right),$$

where w is the conductance of gap junction.

The parameter values of these synapses were set according to published models61: the conductance of the synapses and gap junctions had the same order of magnitude (1 × 10−4 μS). The specific weight of each synaptic connection in a specific neural network can be optimized according to a target, as described in ‘Connection parameter optimization’ section.

Connection locations

Detailed neuron connections in the neural network model were located on the neuronal neurite (axons or dendrites) of the neurons, which were based on the neuron connectivity matrices and a neurite connectivity algorithm.

The neuron connectivity matrices were acquired from published connectome data15. The original data in the matrices were the total number of electron microscopy (EM) series of all synapses (or gap junctions) between two neurons. In our work, the number of EM series was transformed into the number of synapses (or gap junctions) by

$$N=a\times\tanh \left(b\times C\right),$$

where C is the number of EM series and a and b are constants. For synapses, $a=23.91$ and $b=0.02285$, while for gap junctions, $a=20.49$ and $b=0.02184$. These parameters were inferred from partial-animal connectome data15.

After the number of synapses (or gap junctions) was determined, the locations of synapses (or gap junction) on neural neurites were identified by a connection algorithm. The algorithm was based on the rule that distances between the centroid coordinates of possible presynaptic and postsynaptic neurites follow a certain statistical distribution17. We fit the synapse and gap junction distribution by two inverse Gaussian distributions

$$f\left(x,\mu \right)=\frac{1}{k\times \sqrt{2\uppi {x}^{3}}}\exp \left(-\frac{{(x-\mu )}^{2}}{2x{\mu }^{2}}\right),$$

where μ = 0.44 and k = 0.63 for synapse distribution, and μ = 0.70 and k = 0.40 for gap junction distribution. The algorithm of generating connection locations is in Supplementary Algorithm.

C. elegans neural network model

Our neural network model consists of 136 neurons, including 15 sensory neurons for inputs and 80 motor neurons for outputs (Supplementary Table 6). We chose these neurons for the following reasons. First, 65 head ganglia neurons, whose neural dynamics were recorded in the published whole-brain Ca2+ imaging experiment24, should be selected to enable the comparison between experiment and simulation. Second, other 71 neurons, most of which were motor neurons in the locomotion circuit of C. elegans27,28, were added to the network so that the model could be complete in a sensory–motor functional structure. In summary, this model contained neurons for chemosensory, decision and locomotion.

In our multicompartment neural network, our single-neuron models are not isopotential. For example, in the AVAL neuron, depolarization from the soma attenuates as it travels along the axon (Supplementary Fig. 9). However, distal synapses receive inputs from other neurons, which helps to achieve sufficient depolarization. This design ensures that the model works well in the network.

Connection parameter optimization

The weights of synapses and gap junctions in the neural network were initialized randomly. The polarities of synapses were initialized randomly according to a given excitatory/inhibitory ratio (2.33). These parameters were optimized using an unpublished optimization algorithm for multicompartment neural network. The algorithm was adapted from our previous work62, which combined cable theory and gradient-descent methods. It used a mathematical approximation approach to calculate gradients for backpropagation through time to update parameters. Except for the weights (of synapses and gap junctions) and the polarities (of synapses), the inputs (of some sensory neurons) can be optimized. The self-contained and executable code of this optimization is available in our GitHub repository (https://github.com/Jessie940611/BAAIWorm/tree/main/eworm_learn).

PCA of neural activities

PCA was applied to the time series data of soma membrane potential of the 65 head neurons. Before conducting the analysis, the time series data were preprocessed by subtracting the mean value of the trace.

Correlation analysis

The correlation between two neurons was quantified by the Pearson correlation coefficient of their neural activities, which ranged from −1 to 1. A coefficient of −1 or 1 indicates a strong negative or positive correlation of neural activity, respectively. A coefficient of 0 indicates that the neural activities of the two neurons are unrelated. In this analysis, the neural activity was represented by the membrane potential of soma.

Simulation environment

The C. elegans neural network simulation and optimization were conducted with the code written in Python 3.8, and NEURON 8.0.0 was used as the simulation engine63. Experiments were performed primarily on a system configured with Ubuntu 18.04, an NVIDIA GeForce RTX 3060 graphics processing unit, 16 GB random-access memory and an Intel Core i7-10700K central processing unit.

Details of the body–environment modelThe body model

Tetrahedron is the simplest ordinary convex polyhedra in 3D space. It has four vertices and six edges and is bounded by four triangular faces. We used the tetrahedron as the basic element to construct the body model. The surface shape of the body was extracted as a triangle list from Sibernetic membrane data64. Then, we triangulated the body with a robust and high-quality algorithm65. The triangulated tetrahedron mesh was refined for the mechanics solver. The final body mesh was in a straight, natural resting pose that we defined as the standard body (Fig. 3a). The triangulated tetrahedron body was not strictly symmetric in the dorsal–ventral direction and the left–right direction. The triangulation algorithm had additional constraints on the quality of tetrahedrons. This nonstrict symmetry is probably more consistent with real C. elegans. The above-mentioned Sibernetic worm body data can be obtained from OpenWorm8 version ow-0.9.5. The triangulation algorithm of tetrahedron mesh can be obtained from TetWild65.

Muscles

Ninety-sixmuscles were arranged from head to tail according to the C. elegans anatomy. Each of the 24 muscles was grouped into a longitudinal string, marked as VR, VL, DR and DL in four quadrants (Fig. 3b)64. We imported the body membrane mesh into Blender and modeled VR, VL, DR and DL by four Bezier curves. Then, the muscle geometry was exported as a wavefront obj file. We used FEM to simulate the soft body and used finite element constraints to simulate elastic muscle cells of the body. Each muscle cell (Fig. 3a) corresponds to one or more FEM constraints according to the discretized tetrahedron body (Fig. 3c). Only nearby tetrahedrons can be driven by corresponding muscle cells.

Soft-body solver

We used projective dynamics66 as the solver of the soft body. The purpose of the mechanics solver is to calculate the positions and velocities of all 3D vertices of the deformed body over time. The key idea of projective dynamics is to compute a local step and a global step iteratively. At the local step, vertex positions of the body are fixed and projection variables are updated. At the global step, projections are fixed and vertex positions of the body are updated. The local step is well suited for parallelism and, thus, can be optimized for performance. The global step is a linear system and can be precomputed if the body geometry and constraints are not changed. After a few local–global iterations, the vertex positions of deformed body are converged. Then velocity of deformed body can be computed by Euler integration.

Interaction with environment

A simplified model of hydrodynamics was used to simulate the interaction between the body and the fluid, similarly to ref. 43. For simplicity, we supposed that external force is formed only at the surface of the body. Two kinds of external force were considered here: thrust force by body movement and drag force by fluid. With this simplification, computation in the whole 3D fluid environment is avoided. Thus, more agents, larger environments and more complex tasks are achievable through this simplification. SoftCon43 was used as the basis of the body–environment model.

Numerical stable coordinate system for body movement

C. elegans is a soft-body animal that has special patterns during movement. Every position of the body is oscillating during movement. Moreover, as there are no apparent important marks on the body during locomotion, it is hard to find a stable coordinate system for C. elegans movement. Simply using the head or body center as a coordinate system is not sufficient to analyze the body’s movement quantitatively and stably. Here, we proposed an effective and simple method to measure C. elegans locomotion. The core of our method is to find a numerically stable coordinate system relative to the body itself. The main steps of our method are as follows:

(1)

Define the target body, standard body and SBCS.

(2)

Obtain transformation between the target body and the standard body.

(3)

Use the transformation in step 2 to perform 3D transformation on the SBCS to obtain the TBRCS.

Details of the three steps are described below:

The target body is the research subject. It can change its shape over time and exhibits various behaviors. The standard body is a special case of the target body. Here, we define the standard body in a natural state where the head and tail are in a straight line, without stretching and bending. There is no internal force in the standard body soft body, and no external force acting on the standard body surface. The SBCS is a 3D orthogonal coordinate system used to reflect the overall structure and function of the standard body. In this case, we define the SBCS with origin at the standard body center. The X axis of the SBCS is toward the head of the standard body. The Z axis of the SBCS is toward the ventral–dorsal direction of the standard body. Three axes of the SBCS form a right-handed coordinate system. A schematic diagram of the target body, the standard body and the SBCS is shown in Fig. 4a.

In this step, we define the corresponding matching points on the target body and the standard body by using all the vertices of the body. The transformation between the matching points on the target body and the standard body is calculated as the transformation between the target body and the standard body. The calculation formulas of the transformation are as follows:

$$\overline{{p}^{0}}=\frac{1}{m}\mathop{\sum }\limits_{i=1}^{m}{p}_{i}^{0}$$

$$\bar{p}=\frac{1}{m}\mathop{\sum }\limits_{i=1}^{m}{p}_{i}$$

$${q}_{i}^{0}={p}_{i}^{0}-\overline{{p}^{0}}$$

$${q}_{i}={p}_{i}-\bar{p}$$

$$H={q}_{i}^{0}{{q}_{i}}^{{\mathrm{T}}}$$

$$R=V{U}^{{\mathrm{T}}}$$

$$T=\bar{p}-R\overline{{p}^{0}}$$

$$M=\left[\begin{array}{cc}R & T\\ 0 & 1\end{array}\right].$$

The matching points are represented by column vectors. ${p}_{i}^{0}$ is matching points on the standard body, and pi is matching points on the target body. m is the number of matching points. $\overline{{p}^{0}}$ is the center point of matching points on the standard body, and $\overline{p}$ is the center point of matching points on the standard body. ${q}_{i}^{0}$ refers to the standard body matching points with the center offset subtracted, and qi denotes the standard body matching points with the center offset subtracted. H is a matrix computed by ${q}_{i}^{0}{{q}_{i}}^{{\mathrm{T}}}$. ${{q}_{i}}^{{\mathrm{T}}}$ is the transpose of qi, ${q}_{i}=\left[{q}_{1},{q}_{2}\ldots {q}_{m}\right]$, ${q}_{i}^{0}=\left[{q}_{1}^{0},{q}_{2}^{0}\ldots {q}_{m}^{0}\right]$. U and V are obtained by the singular value decomposition $H=U\Lambda V$ of the matrix H. UT is the transpose of U. M is the transformation matrix between the target body and the standard body, which will be used in the next step. R is the rotation part of M, and T is the translation part of M. A schematic diagram of matching points and transformation M is shown in Fig. 4a.

In this step, we use matrix M to perform 3D transformation on the SBCS to obtain the TBRCS. The matrix of the TBRCS can be formulated as $\widehat{M}=M{M}^{0}=\left[\begin{array}{cc}R{R}^{0} & R{T}^{\,0}+T\\ 0 & 1\end{array}\right]$. $M=\left[\begin{array}{cc}R & T\\ 0 & 1\end{array}\right]$. ${M}^{0}=\left[\begin{array}{cc}{R}^{0} & {T}^{\,0}\\ 0 & 1\end{array}\right]$ is a matrix used to represent the SBCS. T0 is the origin of the SBCS, and R0 indicates orientation of the SBCS. The column vector of R0 represents the orthogonal coordinate axis. Finally, weobtain the TBRCS through $\widehat{M}$: $R{T}^{\,0}+T$ as the origin and ${{RR}}^{0}$ as the orthogonal coordinate axis.

Numerically stable measurements of body movement
Measurement of body trajectory

During behavior analysis, C. elegans can be seen as a deformable biology composed of multiple points. When studying the locomotion of a deformable body, generally one or more points of the biology body is defined as a tracking object, and locomotion of the deformable body is studied through tracking these defined points. For deformable biology bodies such as C. elegans, the position and the velocity of local tracking points do not reflect well the overall state of body locomotion (Fig. 4c). As a result, values such as body trajectory or steering angle obtained through these tracking points are usually unstable over time. Using the TBRCS, we can better describe the overall trend of body movement. A comparison of trajectories is shown in Fig. 4a. If the center point of the body is used as the trajectory, the trajectory has a zigzag shape. However, if the TBRCS is used as the trajectory of body movement, the trajectory is a smooth curve, which means that it has stable values. These results indicate that the TBRCS is suitable as a measurement of overall path and direction of body locomotion.

Tracking body movement

Here, we demonstrate how to track body movement with the TBRCS. If the standard body is drawn in the TBRCS, then it can be used as a reference for the target body. The standard body and the target body are centrally aligned, translated and rotated together. We define ${s}_{k}^{0}$ as 3D tracking points of the standard body and sk as 3D tracking points of the target body (k = 1, 2, &hellips;, n). n is the number of tracking points. The relative movement of the body can be defined as ${s}_{k}-{s}_{k}^{0}$. It is a measurement of body locomotion relative to itself (the standard body), which is more numerically stable than relative to the environment. ${s}_{k}-{s}_{k}^{0}$ is also a better calculation than that relative to the body center, since both the translation part and the rotation part of the whole deformation body are considered in the TBRCS. Moreover, the sampling bias of tracking points is eliminated through subtraction, since bias minus bias equals zero.

In Fig. 4b, we sampled 17 tracking points on the body uniformly. The offset of the tracking points is indicated by the pink lines between the tracking points. The relative velocity of 3D tracking points is indicated with blue arrows. As a common method, trajectories of sampled 17 tracking points that relate to world are also shown in Fig. 4c. We can see that, compared with all 17 tracking points trajectories, the trajectory of the TBRCS is the smoothest. The relative position and the relative velocity of 3D tracking points over time are plotted in Fig. 4e,f. In Fig. 4e, the positions of the 3D tracking points are tiled from bottom to top for clear seeing. In this case, the swing direction (Z direction of the TBRCS) of the swimming body can be clearly shown with the TBRCS. We also find that there is a spring movement in the X direction of the TBRCS (tail to head) during forward swimming. The locomotion in the Y direction of the TBRCS is nearly flat because muscle activations are approximately symmetrical in this direction. From Fig. 4f we can further deduce that the biggest propulsion of forward swimming comes from the tail (pink line), as faster swing generates more thrust force.

Steering angle of body movement

If ${O}_{{t}_{i}}$ is the trajectory of TBRCS origin at time ti, ${t}_{i}={t}_{1},{t}_{2}\ldots {t}_{m}$, then we can calculate the velocity of TBRCS by ${v}_{i}=\frac{{O}_{{t}_{i}}-{O}_{{t}_{i-1}}}{{t}_{i}-{t}_{i-1}}$. In this way, velocity magnitude and velocity direction over time are numerically stable. We used angle θ, which is between the TBRCS X axis and vi as a measurement of steering, as shown in Fig. 4d. Both vi and the TBRCS are numerically stable over time; thus, θ is also numerically stable.

Locomotion behavior simulation of C. elegans
Simulation configuration

The neural network simulation was coupled with the physical body–environment simulation to achieve closed-loop interaction. Food was at the center of the simulated physical environment, acting as an attractor to C. elegans. The food concentration was linearly distributed in the environment. The neural system received currents regarded as sensory signals. The amplitude of the currents was proportional to the gradient of food concentration. We assumed that all sensors were located at the head of C. elegans and that the injected currents of all sensory neurons were the same (the gradient at the tracking point at the head of the C. elegans body). The outputs of the neural network controlled the movements of the body in the environment through a reservoir readout weight matrix. As the body moved in the physical environment, the distance between the body and the food changed, causing the food concentration gradient changed, which was transformed into currents injected into the soma of sensory neurons.

During the interaction of the two submodels, the simulation time step of the neural network was 5/3 ms and that of the physical environment simulation was 100 ms. The two systems synchronized every 100 ms to update the sensory signals and muscle forces.

Reservoir configuration

To establish the neuron–muscle connection between the neural system and physical body muscles, a reservoir computing framework was used. The neural network model acted as the nonlinear reservoir. The membrane potentials of 80 motor neurons were linearly combined and mapped to 96 muscle forces via an 80-by-96 matrix.

The reservoir was trained using data extracted from a 10 s movement generated in the simulated physical environment. The pairs of food concentration and muscle force during the movements were used as inputs to the neural network and target outputs of the readout layer, respectively. Only the reservoir readout matrix was optimized, whereas the parameters in the neural network model were fixed.

User-friendly interfaces

For simulation configuration, the neural network model consists of ion channel, neuron and network connection modules. The parameter and algorithm settings of each module can be modified independently. Parameters were saved in JavaScript Object Notation (JSON) files that are easy to read and modify.

For quantification, membrane potential of neurons, worm body data outputs, such as positions, velocities and trajectories of tracking points (Fig. 4), and muscle signals (Supplementary Fig. 5) can be exported seamlessly with our latest code (https://github.com/Jessie940611/BAAIWorm/blob/main/eworm/ghost_in_mesh_sim/worm_in_env.py). Tracking points, including the head, center and tail of the worm body, are configurable within the model. Future updates will include 3D tetrahedron analysis algorithms and an extended Worm tracker Commons Object Notation (WCON)67 file format for the worm’s 3D structure.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

An integrative data-driven model simulating C. elegans brain, body and environment interactions

Tags: