Self-Sensing Control for Soft-Material Actuators Based on Dielectric Elastomers

Due to their energy density and softness, which are comparable to those of human muscles, dielectric elastomer (DE) transducers are well suited for soft-robotic applications. This kind of transducer combines actuator and sensor functionality within one device, so that no external sensors are required to measure the deformation or to detect collisions. In this contribution we present a novel self-sensing control for a DE stack-transducer that makes it possible to control several different quantities of the DE transducer with the same controller. This flexibility is advantageous, e.g., for the development of human-machine interfaces with soft-bodied robots. After introducing the DE stack-transducer, which is driven by a bidirectional flyback converter, the development of the self-sensing state and disturbance estimator based on an extended Kalman-filter is explained. In contrast to known estimators designed for DE transducers supplied by bulky high-voltage amplifiers, this one does not require any superimposed excitation to enable the sensor capability, so that it can also be used with economical and competitive power electronics like the flyback converter. Owing to the behavior of this converter, a sliding mode energy controller is designed afterwards. By introducing different feed-forward controls, the voltage, force or deformation can be controlled. The validation proves that both the developed self-sensing estimator and the self-sensing control yield results comparable to those of previously published sensor-based approaches.


INTRODUCTION
Entirely soft-bodied robots exploit the full potential of robotic systems in terms of safe human-machine interaction and are therefore a subject of current research. However, novel mechanical designs in conjunction with smart and soft materials, as well as innovative approaches for modeling and for the development of control strategies, are necessary to handle such a highly sophisticated robot species (Navarro et al., 2013; Robla-Gomez et al., 2017). Due to their behavior, which resembles that of human muscles, dielectric elastomers (DEs) are a promising approach that could pave the way for soft-bodied robots. As a DE transducer consists of a very thin elastomeric dielectric film covered with compliant electrodes, its behavior can be described by a shape-varying capacitor.
By applying a voltage $v_p$ to the electrodes of the DE transducer with permittivity $\varepsilon_0 \cdot \varepsilon_r$ and thickness $d$, the resulting electrostatic pressure compresses the elastomer:
$$\sigma_{el} = \varepsilon_0 \cdot \varepsilon_r \cdot \left( \frac{v_p}{d} \right)^2 \qquad (1)$$
This pressure is used to operate a DE transducer in actuator mode. However, if the change of the transducer's capacitance caused by its deformation is detected, simultaneous operation as a sensor is enabled. If the deformation dependency of the capacitance is known, the mechanical transducer state can be determined. By exploiting this self-sensing capability, soft and smart transducers can be realized that do not require additional external sensors and can thus be integrated comparatively easily into various applications not limited to soft robotics. For various types of DE transducers, different approaches to control their displacement or force (Maas et al., 2011; Sarban and Jones, 2012; Rizzello et al., 2015; Wilson et al., 2016; Hoffstadt and Maas, 2017, 2018b) or to use them for active vibration attenuation (Dubois et al., 2008; Kaal and Herold, 2011; Sarban, 2011) have been presented previously. Within these approaches the control variables are measured directly with external sensors, so that the DE transducer is only operated as an actuator. Due to the additional sensor, these controls are referred to as sensor-based control schemes.
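As a rough numerical illustration of the actuator principle described above, the following sketch evaluates the standard electrostatic pressure relation $\sigma_{el} = \varepsilon_0 \varepsilon_r (v_p/d)^2$ and the parallel-plate capacitance of a single layer. Only the layer thickness of 50 µm is taken from the text; the relative permittivity, the electrode area and the function names are assumptions chosen for illustration.

```python
EPS_0 = 8.8541878128e-12  # vacuum permittivity in F/m


def electrostatic_pressure(v_p, d, eps_r):
    """Electrostatic pressure in Pa for a voltage v_p (V) across a film of thickness d (m)."""
    return EPS_0 * eps_r * (v_p / d) ** 2


def layer_capacitance(area, d, eps_r):
    """Parallel-plate capacitance in F of one layer with electrode area (m^2) and thickness d (m)."""
    return EPS_0 * eps_r * area / d


if __name__ == "__main__":
    d_0 = 50e-6    # layer thickness from the text
    eps_r = 2.8    # assumed relative permittivity (illustrative)
    area = 1e-4    # assumed electrode area of 1 cm^2 (illustrative)
    for v_p in (1000.0, 2000.0):
        print(v_p, "V ->", electrostatic_pressure(v_p, d_0, eps_r), "Pa")
    print("C per layer:", layer_capacitance(area, d_0, eps_r), "F")
```

The quadratic dependence on the voltage is the reason why doubling the voltage quadruples the pressure, and the dependence of the capacitance on area and thickness is what the sensor operation exploits.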
In this paper, the focus is on the development of a model-based self-sensing control for DE transducers that makes it possible to control the voltage, force and deformation of the transducer without measuring any mechanical quantities. Figure 1 gives an overview of the overall control loop.
As shown in the center and on the right-hand side of Figure 1, the terminal voltage $v_{DE}$ and current $i_{DE}$ have to be measured to enable the combined actuator-sensor operation. In order to determine the mechanical state based on these measured quantities, adequate self-sensing algorithms are required. Anderson et al. (2012) summarize different approaches for this purpose. The goal of most self-sensing algorithms is to identify the capacitance of the DE transducer in a first step and afterwards to estimate the deformation and force based on a model or on experimentally obtained information about the deformation dependency of the capacitance. For almost all approaches, the driving voltage $v_{DE}$ is superimposed with a harmonic excitation that is used for the sensor functionality. Chuc et al. (2008) and Jung et al. (2008) published the first frequency-domain approaches by experimentally identifying changes of the electrical impedance of a DE transducer under deformation when it is excited by a harmonic voltage $v_{DE}$. Besides the capacitance $C_p$, they also considered losses in the electrode and in the polymer by adding the resistances $R_s$ and $R_p$, respectively, see Figure 1.
In Hoffstadt et al. (2014) another model-based identification algorithm in the frequency domain is presented that estimates the electrical parameters of a DE transducer by evaluating the amplitudes of the superimposed terminal voltage and current and the phase shift between them. Furthermore, it was shown that the behavior of a DE transducer can be modeled with sufficient accuracy when the parallel resistance $R_p$, which represents losses in the dielectric, is neglected, provided that the DE transducer is excited with a comparatively high frequency.
The extended Kalman-filter introduced in Hoffstadt and Maas (2018a) estimates the strain of a DE transducer without any superimposed excitation, so that it can be used independently of the utilized power electronics. Other approaches in the time domain estimate the charge $q_p$ of the capacitance $C_p$. Under further consideration of the measured voltage $v_{DE}$, the capacitance $C_p \approx q_p / v_{DE}$ can be determined (Matysek et al., 2011; Gisby et al., 2013). Rizzello et al. (2017) developed a self-sensing algorithm based on the recursive least squares (RLS) method. For this purpose, the equivalent circuit diagram with three parameters is taken into account (see Figure 1). In a first step, the parameters of a discrete transfer function describing the behavior of the considered circuit are estimated. As these parameters depend on the electrical parameters, the latter can be calculated afterwards. For the identification, a harmonic excitation signal is superimposed.
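The time-domain idea mentioned above, $C_p \approx q_p / v_{DE}$, can be sketched as follows. This is a minimal illustration and not the implementation of the cited works: it integrates the sampled current with the trapezoidal rule and neglects the leakage through $R_p$ and the voltage drop across $R_s$. The function name and all numerical values are assumptions.

```python
def estimate_capacitance(current, voltage, dt, q0=0.0):
    """Return capacitance estimates q/v for each sample after the first.

    current, voltage: sequences of sampled terminal current (A) and voltage (V)
    dt: sample time in s, q0: initial charge in C
    Leakage and series resistance are neglected in this sketch.
    """
    q = q0
    estimates = []
    for k in range(1, len(current)):
        # Trapezoidal integration of the current gives the accumulated charge
        q += 0.5 * (current[k - 1] + current[k]) * dt
        if abs(voltage[k]) > 1e-9:
            estimates.append(q / voltage[k])
        else:
            # Without voltage the ratio is undefined (uncharged transducer)
            estimates.append(float("nan"))
    return estimates


if __name__ == "__main__":
    c_true = 200e-9          # assumed capacitance used to generate test data
    dt = 1e-5
    current = [1e-3] * 101   # constant charging current
    voltage = [1e-3 * k * dt / c_true for k in range(101)]
    print(estimate_capacitance(current, voltage, dt)[-1])
```

The guard for vanishing voltage mirrors a general limitation of charge-based self-sensing: an uncharged transducer provides no information about its capacitance.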
Although several self-sensing approaches have been developed, only a few closed-loop self-sensing controller designs have been published so far. Gisby et al. (2011) control the deformation of a single-layer circular DE transducer by using the already mentioned self-sensing approach. Here, the terminal voltage is generated by PWM. While the deformation of the DE transducer mainly depends on the mean of this voltage, the included higher harmonics are used to enable the sensor functionality. The manually adjusted proportional controller yields comparatively low dynamics and accuracy. Therefore, Rosset et al. (2013) extend this controller to a PI controller, using the same self-sensing approach. Here, the parameters of the controller are optimized for one particular operating point of the nonlinear control plant. The derived controller is used to control an optical grid. Rizzello et al. (2016) systematically combine their RLS-based self-sensing approach (Rizzello et al., 2017) with their robust position controller (Rizzello et al., 2015) to control the deformation of a DE membrane actuator. For the combined actuator-sensor operation, the required driving voltage determined by the controller is superimposed with a harmonic excitation with a high frequency of 1 kHz and an amplitude of 75 V. Compared to the sensor-based control (Rizzello et al., 2015), almost no drawbacks in terms of accuracy are observed, while the bandwidth of the closed-loop self-sensing control is reduced due to the dynamics of the parameter identification.
Within the referenced publications, costly and bulky high-voltage amplifiers were used to feed the DE transducer. However, due to the capacitive behavior of DE transducers, voltage-fed current sources are well suited instead of high-voltage amplifiers (Eitzen et al., 2011a). Here, compact and efficient driving electronics can be realized by using switched-mode topologies like the bidirectional flyback converter. This converter makes it possible not only to supply the DE transducer with a certain voltage but also to recover the energy stored in the DE transducer when discharging it.
Under consideration of the properties of the bidirectional flyback converter and the DE transducer, we previously published sensor-based position and force controls in Hoffstadt and Maas (2017, 2018b) that use the directly measured deformation as feedback signal, cf. Figure 1. In this publication we extend them to a self-sensing controller that is able to universally control the voltage, force or deformation of the DE transducer by just measuring the terminal voltage $v_{DE}$ and current $i_{DE}$. For this purpose, section 2 introduces and models the considered control plant comprising a DE stack-transducer fed by a bidirectional flyback converter (Eitzen et al., 2011b). The design of the novel self-sensing state and disturbance estimator is presented in section 3. Due to the non-linear behavior of the control plant, an extended Kalman-filter (EKF) is used for this purpose (Welch and Bishop, 2001). The developed estimator does not require any superimposed excitation. The subsequently presented controller design (Hoffstadt and Maas, 2017, 2018b) is based on the sliding mode control approach (DeCarlo et al., 1988), as this is well suited for the considered control plant and its characteristic behavior. The self-sensing estimator and control are experimentally validated in section 5. Finally, section 6 summarizes the developed approaches and the results.

MODEL OF THE DE TRANSDUCER SYSTEM
Figure 2A shows a schematic representation of the considered DE stack-transducer with $N$ layers. This multilayer design is used to scale the deformation $z$ in z-direction, as one single layer has an initial thickness of only $d_0 = 50$ µm. Details about the design and the manufacturing were published by Maas et al. (2015). The static strain-force behavior is shown in Figure 2B. The transducer generates higher tensile forces $F_{act}$ at smaller strains $\varepsilon_z = z/z_0$, with $z_0 = N \cdot d_0$.
By increasing the initial electric field strength $E_0 = v_{DE}/d_0$, the electrostatic pressure according to Equation (1) increases, so that higher forces and strains are obtained. The blocking force $F_{act}(\varepsilon_z = 0)$ and the no-load strain $\varepsilon_z(F_{act} = 0)$ represent two characteristic points of the strain-force behavior.
An analytical model for this transducer is published in Hoffstadt and Maas (2015). In Figure 2 the modeled results of the static strain-force behavior are compared with measurement results and a finite element analysis (FEA) published by Kuhring et al. (2015). The analytical model is based on the structure shown on the right of Figure 1. The actuator tension $\sigma_{act}$ is given by the force equilibrium in Equation (2). Here, $\sigma_{elast}$ is the elastic material tension, which is calculated in Equation (3) using the Neo-Hookean approach with the Young's modulus $Y$ to consider the hyperelastic, non-linear material behavior. Besides this reversible elastic behavior, viscoelastic properties are taken into account with the viscosity $\eta_E$ and the Maxwell element with stiffness $E_1$ and viscosity $\eta_1$. Furthermore, the area ratio $\beta$ accounts for the fact that the electrostatic pressure $\sigma_{el}$ acts only on the area $A_e$ covered with electrode, while all other tensions are assumed to act homogeneously on the whole transducer area $A$ in z-direction. Instead of applying Equation (1) for the electrostatic pressure $\sigma_{el}$, here it is determined in Equation (4) depending on the energy $U_{c,diel}$ in the electric field of the capacitance $C_p$.
The previously proposed control of the bidirectional flyback converter enables three discrete input states in terms of the feeding power $p$, see Equation (5). Besides an off-state, the DE transducer can be charged and discharged with almost constant power depending on the characteristic energy increment $U_{max}$ transferred during every switching period $T_S$ of the converter. The energy increment $U_{max}$ depends on the magnetizing inductance $L_m$ of the converter and the magnetizing current $I_{m,max}$ adjusted by its inner control. Under further consideration of the losses $p_{Re}$ dissipated in the electrode material, the power $p' = p - p_{Re}$ feeds the capacitance of the DE transducer.
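To make the three discrete input states of the converter more tangible, the following sketch assumes that the energy increment equals the energy stored in the magnetizing inductance, $\frac{1}{2} L_m I_{m,max}^2$, and that it is transferred completely once per switching period. Both are assumptions for illustration, as are the function names and the values of $L_m$ and $T_S$; only the current of 8 A appears in the text.

```python
def energy_increment(l_m, i_m_max):
    """Energy stored in the magnetizing inductance at peak current (assumed model), in J."""
    return 0.5 * l_m * i_m_max ** 2


def feeding_power(mode, l_m, i_m_max, t_s):
    """Mean feeding power in W for mode +1 (charge), 0 (off) or -1 (discharge)."""
    if mode not in (-1, 0, 1):
        raise ValueError("mode must be -1, 0 or +1")
    return mode * energy_increment(l_m, i_m_max) / t_s


if __name__ == "__main__":
    l_m = 10e-6   # assumed magnetizing inductance in H
    t_s = 20e-6   # assumed switching period in s
    for mode in (1, 0, -1):
        print(mode, feeding_power(mode, l_m, 8.0, t_s), "W")
```

Because the power scales with the square of the magnetizing current, halving $I_{m,max}$ reduces the feeding power to a quarter, which is one way such a converter can trade dynamics against resolution.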
With this, the electromechanically coupled behavior of a DE transducer can be modeled based on a power balance, yielding the state space representation in Equation (6). Besides the strain $\varepsilon_z$ and the energy $U_{c,diel}$, the state vector $x$ includes the velocity $\dot{\varepsilon}_z$ as well as the strain $\varepsilon_{E_1}$ of the stiffness $E_1$ of the Maxwell element. Depending on the supplied input power $p'$ and an external load $\sigma_{load}$, the inner states of the DE transducer with volume $V$ and accelerated mass $m_B$ can be calculated with Equation (6).
For the subsequently developed self-sensing state estimator, models describing the strain dependency of the electrical transducer parameters are required, too. The series resistance $R_s$ mainly comprises losses in the contacting of the DE transducer and in the electrodes that are applied on the initial area $A_{e,0}$ of every layer. It was shown that this resistance is almost constant in the relevant range of deformation. In contrast, the capacitance $C_p$ of the $N$ layers connected in parallel depends on the strain according to Equation (7). The change of the initial capacitance $C_{p,0}$ also depends on the factor $\kappa$. In case of an absolutely homogeneous deformation without constraints, $\kappa = 2$ would apply. However, due to a passive area around $A_e$ that is required for insulation purposes, as well as due to stiff mechanical interfaces applied on the top and/or bottom of the transducer, here the factor is slightly decreased to $\kappa = 1.85$. In analogy, the strain dependency of the parallel resistance $R_p$ is given by Equation (8). This resistance represents losses in the dielectric with the specific resistance $\varrho_p$. Although $C_p$ and $R_p$ vary with the strain $\varepsilon_z$, the resulting time constant $\tau_p = R_p \cdot C_p$ in Equation (9) is independent of the strain.
Frontiers in Robotics and AI | www.frontiersin.org
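The strain invariance of the time constant can be checked numerically. The sketch below assumes power-law dependencies $C_p = C_{p,0} (1 - \varepsilon_z)^{-\kappa}$ and $R_p = R_{p,0} (1 - \varepsilon_z)^{\kappa}$, which reproduce the ideal case $\kappa = 2$ for an incompressible, homogeneously deformed layer; the exact expressions of the model are not reproduced in the text, so these forms, the function names and the values of $C_{p,0}$ and $R_{p,0}$ are assumptions. Only $\kappa = 1.85$ is taken from the text.

```python
KAPPA = 1.85  # deformation factor stated in the text


def capacitance(strain, c_p0, kappa=KAPPA):
    """Assumed strain dependency of the capacitance (compressive strain positive)."""
    return c_p0 * (1.0 - strain) ** (-kappa)


def parallel_resistance(strain, r_p0, kappa=KAPPA):
    """Assumed strain dependency of the parallel (leakage) resistance."""
    return r_p0 * (1.0 - strain) ** kappa


def time_constant(strain, c_p0, r_p0, kappa=KAPPA):
    """Time constant tau_p = R_p * C_p of the parallel RC branch."""
    return parallel_resistance(strain, r_p0, kappa) * capacitance(strain, c_p0, kappa)


if __name__ == "__main__":
    c_p0, r_p0 = 100e-9, 1e9   # hypothetical initial values
    for strain in (0.0, 0.01, 0.02, 0.05):
        print(strain, capacitance(strain, c_p0), time_constant(strain, c_p0, r_p0))
```

Under these assumptions the capacitance grows with compressive strain while the product $R_p C_p$ stays constant, which is what allows the leakage term to be handled with a fixed time constant.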

EKF-BASED SELF-SENSING ALGORITHM
In Hoffstadt and Maas (2018a) we already published a self-sensing estimator based on a discrete extended Kalman-filter that estimates the strain of the DE transducer without superimposed excitation. However, for closed-loop operation, the disturbance $\sigma_{load}$ has to be estimated in addition to the inner states of the transducer. For example, this load tension might result from a collision of an external device or a human with a soft-bodied robot equipped with DE transducers. Therefore, a new and extended approach based on Equation (6) is applied here. As mentioned above, the goal is to determine the electromechanical state of the DE transducer based on the measured terminal voltage $v_{DE}$ and the current $i_{DE}$. However, in Equation (6) the energy $U_{c,diel}$ represents the electrical state. Therefore, a modification of the model is required to design the self-sensing estimator. For this purpose, the change of the charge $q_p$ on the capacitance $C_p$ is taken into account. It can be calculated from the current $i_{DE}$ and the leakage current $v_p / R_p = q_p / \tau_p$, see Figure 1:
$$\dot{q}_p = i_{DE} - \frac{q_p}{\tau_p} \qquad (10)$$
As the charge depends only on the measured current and the invariant time constant $\tau_p$ from Equation (9), it is used as input variable $u_{qv}$ in the following. Furthermore, if the charge $q_p$ is considered instead of the energy $U_{c,diel}$, the electrostatic pressure can be expressed by Equation (11). Additionally, the voltage $v_p$ across $C_p$ is given by the terminal voltage $v_{DE}$ reduced by the voltage drop $R_s \cdot i_{DE}$ across the series resistance $R_s$, which is assumed to be constant here:
$$v_p = v_{DE} - R_s \cdot i_{DE} \qquad (12)$$
This voltage will be used as output variable $y_{qv}$ afterwards. Besides the mechanical states included in $x$ from Equation (6), the external load $\sigma_{load}$ has to be estimated as disturbance, too. As it represents an unknown disturbance, it is assumed (according to Isermann and Munchhof, 2011) that it is constant during one sample time $T$ of the discrete EKF implemented on a DSP.
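The external charge estimation described above amounts to a first-order filter of the measured current, $\dot{q}_p = i_{DE} - q_p/\tau_p$, together with the static relation $v_p = v_{DE} - R_s \cdot i_{DE}$. A minimal discrete sketch is given below. It assumes that the current is constant over one sample period (zero-order hold), which is a choice made here and not taken from the text; class and parameter names as well as all numerical values except the 20 kHz sample rate are hypothetical.

```python
import math


class ChargeFilter:
    """Discrete first-order filter for dq/dt = i_DE - q/tau_p (zero-order hold on i_DE)."""

    def __init__(self, tau_p, sample_time, q0=0.0):
        self.a = math.exp(-sample_time / tau_p)   # decay of the stored charge per step
        self.b = tau_p * (1.0 - self.a)           # charge added per ampere and step
        self.q = q0

    def update(self, i_de):
        """Advance the charge estimate by one sample with the measured current i_de (A)."""
        self.q = self.a * self.q + self.b * i_de
        return self.q


def capacitor_voltage(v_de, i_de, r_s):
    """Voltage across C_p: terminal voltage minus the drop across the series resistance."""
    return v_de - r_s * i_de


if __name__ == "__main__":
    flt = ChargeFilter(tau_p=0.5, sample_time=1.0 / 20e3)  # tau_p is hypothetical
    for _ in range(5):
        print(flt.update(1e-3))
    print(capacitor_voltage(1500.0, 1e-3, 1e3))  # R_s value is hypothetical
```

Keeping this filter outside the Kalman-filter means that the estimator state only has to contain the mechanical quantities and the load, with the charge entering as a known input.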
By applying $\dot{\sigma}_{load} = 0$ in combination with Equations (10)-(12), the fourth-order system in Equation (13) can be established for the estimation. According to Adamy (2014), the observability of the nonlinear system (13) is given if the determinant of the observability matrix $Q_{B,qv}$ is not zero. This matrix can be calculated from the Lie derivatives $L^i_{f_{qv}} g_{qv}$, with $i = 0, ..., 3$, see Equation (14). Besides material parameters that are different from zero, the determinant of $Q_{B,qv}$ depends on the charge $q_p$ and the strain $\varepsilon_z$. The strain $\varepsilon_z$ is always smaller than one and thus does not influence the observability. However, the uncharged state with $q_p = 0$ is not observable. In contrast to piezoelectric materials, for example, this is due to the fact that DE materials do not contain inherent dipoles that cause a charge separation under deformation. Instead, a DE transducer has to be electrically pre-charged so that a current flow or a change of voltage can be detected when it is deformed. Furthermore, the restricted observability for $q_p = 0$ is not a drawback of the proposed approach alone. All referenced self-sensing methods have the same issue, but the usually superimposed voltage excitations ensure that this operating point does not occur. As this superimposed excitation is not required for the EKF-based estimator, a certain minimum amount of charge $q_{p,min}$ is always applied here. This results in the structure of the EKF-based self-sensing state and disturbance estimator shown in Figure 3. As the EKF will be implemented on a DSP, its discrete implementation according to Welch and Bishop (2001) is applied. Estimating the charge externally by filtering the measured current $i_{DE,m}$ has the advantage that the state vector $x_{qv}$ only includes mechanical states that have to be estimated with the EKF. Furthermore, the parameterization effort increases significantly with increasing system order, so that it is reasonable to use a system of order $n = 4$ instead of $n = 5$.
FIGURE 3 | Structure of the proposed self-sensing state and disturbance estimator based on an extended Kalman-filter.
For the implementation of the Kalman-filter algorithm according to Figure 3, the system (13) has to be linearized in the predicted state $\hat{x}_{qv,p,k}$, with the entry
$$a_{qv,21} = \frac{\gamma_1}{1 - \hat{\varepsilon}_{z,p}} \cdot \left( \frac{\sigma_{act,p}(u_{qv,k}) - \hat{\sigma}_{load,p}}{1 - \hat{\varepsilon}_{z,p}} - \frac{d\sigma_{elast,p}}{d\varepsilon_{z,p}} - \frac{d\sigma_{el,p}}{d\varepsilon_{z,p}} \right)$$
and the derivative $d\sigma_{el,p} / d\varepsilon_{z,p}$ of the electrostatic pressure. Based on this, the discrete transition matrix $\Phi_{qv}$ can be approximated (Ifeachor and Jervis, 2002), where $I$ represents the identity matrix of order $n = 4$. The output vector $c^T_{qv,k}$ is calculated as the Jacobian of the output function $g_{qv}$ in Equation (13) with respect to the state vector $x_{qv}$. With this information, the predicted state $\hat{x}_{qv,p,k}$ and the related covariance matrix $P_{p,k}$ can be determined in the prediction step (denoted by the index p) of the algorithm shown in Figure 3.
In the subsequent correction step, the Kalman matrix $K$ and the covariance matrix $P_k$ are calculated to update the estimated state vector $\hat{x}_{qv,k}$. The covariance matrices of the measurement noise and the system noise, $R_{vv}$ and $Q_{ww}$, respectively, will be parameterized in the validation section 5. With the information included in the state vector $\hat{x}_{qv,k}$, and with Equation (4) to calculate the energy $U_{c,diel}$ based on the charge $q_p$, all state variables in $x$ from Equation (6) as well as the load $\sigma_{load}$ can be determined.
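The prediction and correction steps described above follow the usual discrete extended Kalman-filter recursion. The sketch below is a generic version and not the estimator of this paper: the transition matrix is approximated to first order as $\Phi \approx I + A \cdot T$ (an assumption, since the approximation actually used is not reproduced in the text), the prediction uses an explicit Euler step, and the toy model only mimics the idea of appending a constant disturbance as an additional state. All function names and numbers are hypothetical.

```python
import numpy as np


def ekf_step(x_est, P, u, y_meas, f, h, jac_f, jac_h, Q, R, T):
    """One predict/correct cycle of a discrete extended Kalman filter."""
    n = x_est.size
    # Prediction: explicit Euler step of the continuous model dx/dt = f(x, u)
    x_pred = x_est + T * f(x_est, u)
    # Linearization in the predicted state, first-order transition matrix
    A = jac_f(x_pred, u)
    Phi = np.eye(n) + A * T
    P_pred = Phi @ P @ Phi.T + Q
    # Correction with the measured output
    C = np.atleast_2d(jac_h(x_pred))
    S = C @ P_pred @ C.T + R
    K = P_pred @ C.T @ np.linalg.inv(S)
    innovation = np.atleast_1d(y_meas - h(x_pred))
    x_new = x_pred + K @ innovation
    P_new = (np.eye(n) - K @ C) @ P_pred
    return x_new, P_new


# Toy plant (hypothetical): dx1/dt = -x1 + u + d with an unknown constant
# disturbance d appended as second state (dd/dt = 0); only x1 is measured.
def f_toy(x, u):
    return np.array([-x[0] + u + x[1], 0.0])


def jac_f_toy(x, u):
    return np.array([[-1.0, 1.0], [0.0, 0.0]])


def h_toy(x):
    return x[0]


def jac_h_toy(x):
    return np.array([[1.0, 0.0]])


if __name__ == "__main__":
    T = 1e-3
    x_true, x_est, P = np.array([0.0, 0.5]), np.zeros(2), np.eye(2)
    Q, R = np.diag([1e-6, 1e-4]), np.array([[1e-4]])
    for _ in range(5000):
        x_true = x_true + T * f_toy(x_true, 1.0)
        x_est, P = ekf_step(x_est, P, 1.0, h_toy(x_true),
                            f_toy, h_toy, jac_f_toy, jac_h_toy, Q, R, T)
    print("estimated disturbance:", x_est[1])
```

The entry of `Q` that belongs to the disturbance state controls how quickly the estimate follows a changing load, which is the same trade-off between dynamics and noise suppression that the covariance parameterization has to resolve.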

SELF-SENSING SLIDING MODE CONTROL
The considered control plant modeled with Equation (6) has a strongly non-linear behavior. Furthermore, the bidirectional flyback converter supplies discrete feeding powers $p$, so that it can be described by the three-point switch in Equation (5). Due to these properties, a variable structure control is well suited. In Hoffstadt and Maas (2017) a position controller based on the model (6) was introduced that uses sliding mode control (SMC) for this purpose. Additionally, an SMC force controller was published in Hoffstadt and Maas (2018b). In the following it will be shown that this controller can be used to control not only the force $F_{act}$ of the DE transducer but also the strain $\varepsilon_z$ and the voltage $v_p$ by applying different feed-forward structures to one and the same controller. This flexibility makes the approach advantageous for sophisticated applications such as soft robotics. The detailed design of the controller shown in Figure 4 can be found in Hoffstadt and Maas (2018b) and is summarized in the following. For the SMC, a static setpoint state vector $x^*$ has to be defined that includes setpoints for every state variable. Under consideration of the static force equilibrium
$$\lim_{t \to \infty} \sigma_{act}(t) = \beta \cdot \sigma_{el} - \sigma_{elast} \qquad (19)$$
resulting from Equation (2), setpoints for the energy $U^*_{c,diel}$ can be derived. On the one hand, the energy can be calculated depending on a setpoint strain $\varepsilon^*_z$, see Equation (20). To achieve this strain, the electrostatic pressure caused by the energy according to Equation (4) has to compensate the elastic material tension $\sigma_{elast}(\varepsilon^*_z)$ given by Equation (3) as well as the influence of the disturbance $\sigma_{load}$.
FIGURE 4 | Detailed structure of the controller (blue box in Figure 1) including a feed-forward structure to either control the voltage, strain or force and a three-point controller with hysteresis and adaption of the inner flyback converter control.
On the other hand, if the DE transducer should generate a certain force $F^*_{act} = A \cdot \sigma_{act}$, the corresponding energy $U^*_{c,diel}$ is given by Equation (21). In this case, the influence of the elastic deformation has to be compensated, i.e., the energy has to be increased with increasing strain $\varepsilon_z$ (see Figure 2B). Besides these approaches, the energy $U^*_{c,diel}$ can also be determined depending on a setpoint voltage $v^*_p$ across the capacitance $C_p(\varepsilon_z)$ in Equation (7):
$$U^*_{c,diel} = \frac{1}{2} \cdot C_p(\varepsilon_z) \cdot {v^*_p}^2 \qquad (22)$$
With Equations (20)-(22), three approaches exist to define a setpoint value for the energy $U^*_{c,diel}$. For the system (6), a setpoint for the strain $\varepsilon^*_z$ is also required, while the other two state variables are zero during steady state, $\dot{\varepsilon}_z = \varepsilon_{E_1} = 0$. However, especially if the force or voltage is to be controlled by applying Equation (21) or (22), the control should be independent of the strain, i.e., no setpoint $\varepsilon^*_z$ can be defined in this case. To overcome this issue, the control design is based on the reduced system (23) with $\dot{\varepsilon}_z$, $\varepsilon_{E_1}$ and $U_{c,diel}$ as state variables, while the strain $\varepsilon_z$ together with the load tension $\sigma_{load}$ is considered to be a disturbance here. In this case, the setpoint state vector reads as:
$$x^*_U = \begin{pmatrix} 0 & 0 & U^*_{c,diel} \end{pmatrix}^T \qquad (24)$$

Design of the Sliding Mode
The control operation with an SMC is characterized by two phases. During the sliding mode, the system is led toward its setpoint $x^*$ on the switching function $S(\tilde{x}) = S(x - x^*) = 0$. The reaching phase first has to ensure that this switching function is reached from any arbitrary initial state. According to DeCarlo et al. (1988), a comparatively simple approach for the design of the switching function is obtained if the system is in standard canonical form (denoted by the index R). To determine a corresponding transformation matrix $T$, the system (23) has to be linearized, yielding the system matrix $A_U$ for the estimated state $\hat{x}_U$ in Equation (25). As the system behaves linearly with respect to the input $u$, the constant input vector $b_U$ is already given in Equation (23). With this information, the transformation matrix $T$ in Equation (26) can be derived as proposed by Kalman (1960). For the considered single input single output (SISO) system, a linear switching function is defined in Equation (27). During the sliding mode, $S(\tilde{x}_R) = 0$ as well as $\dot{S}(\tilde{x}_R) = 0$ applies. This behavior is obtained by the equivalent input $u$ given in Equation (28) (DeCarlo et al., 1988). With this input, the dynamics during the sliding mode only depend on the coefficients $c_i$, with $i = 1, 2, 3$, of the switching function in Equation (27), see Equation (29). Another characteristic property of the SMC approach is that during the sliding mode the system order $n$ is reduced by the number of inputs $p$ (here $p = 1$). Thus, the dynamics during the sliding mode can be defined by a pole placement under consideration of $\tilde{A}_1$. For a second-order element with damping coefficient $D$ and cut-off frequency $\omega_g$, this results in Equation (30).
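The pole placement for the reduced second-order dynamics can be illustrated with the standard characteristic polynomial $s^2 + 2 D \omega_g s + \omega_g^2$. The sketch below computes its coefficients and poles for a given damping coefficient and cut-off frequency. How these coefficients map onto the switching-function coefficients $c_i$ depends on the canonical form used and is not reproduced in the text, so this is only an illustration under that assumption, with hypothetical function names and example values.

```python
import cmath


def characteristic_coefficients(damping, omega_g):
    """Coefficients (a2, a1, a0) of s^2 + 2*D*omega_g*s + omega_g^2."""
    return (1.0, 2.0 * damping * omega_g, omega_g ** 2)


def second_order_poles(damping, omega_g):
    """Poles of the standard second-order element as complex numbers."""
    root = cmath.sqrt(damping ** 2 - 1.0)
    return (omega_g * (-damping + root), omega_g * (-damping - root))


if __name__ == "__main__":
    # Example values only: an overdamped (D > 1) and an underdamped (D < 1) design
    for damping in (3.0, 0.5):
        print(damping, characteristic_coefficients(damping, 100.0),
              second_order_poles(damping, 100.0))
```

For $D > 1$ both poles are real and negative, so the sliding dynamics are aperiodic, whereas $D < 1$ yields a complex-conjugate pair and an oscillatory approach to the setpoint.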

Reachability
To reach this sliding mode, a proper controller function $u(\tilde{x}_{R,U})$ has to be determined and parameterized under consideration of the properties of the feeding power electronics. One approach to prove the reachability is based on an investigation of the Lyapunov function $V(\tilde{x}_{R,U}) = \frac{1}{2} \cdot S^2(\tilde{x}_{R,U})$. To ensure stable steady-state behavior, the time derivative of the Lyapunov function has to be negative:
$$\dot{V}(\tilde{x}_{R,U}) = S(\tilde{x}_{R,U}) \cdot \dot{S}(\tilde{x}_{R,U}) < 0 \qquad (31)$$
The derivative of the switching function is given by Equation (32). Its coefficients $\zeta_1$, $\zeta_2$ and $\zeta_3$ depend on material parameters as well as on the damping ratio $D$ and the cut-off frequency $\omega_g$. These two controller parameters are chosen in such a way that the influence of the state variables $\varepsilon_z$ and $\varepsilon_{E_1}$ on Equation (32) vanishes. Solving $\zeta_1 = 0$ and $\zeta_2 = 0$ yields the parameters in Equation (34). According to Equation (5), the input power $p$ supplied by the bidirectional flyback converter can be described by a three-point controller. However, for the design of the SMC the off-state with $p = 0$ can be neglected in a first step. With $p = \pm p_{max}$ a two-point controller is defined in Equation (35). The parameter $\varrho = \pm p_{max}$ will be chosen so that the reachability is ensured. By inserting Equations (34), (34b), and (35) into Equation (32), the time derivative of the switching function simplifies. The control parameter $\varrho$ is determined by applying a case-by-case analysis to satisfy Equation (31). As the energy $U^*_{c,diel}$ will always be equal to or larger than zero, both resulting inequalities are solved by an appropriate choice of $\varrho$. Especially during steady state, the introduced two-point controller will permanently switch between the positive and negative input power $\pm p_{max}$.
To avoid this chattering, the controller is extended to a three-point controller with hysteresis according to Equation (39), as already shown in Figure 4. On the one hand, the off-state of the flyback converter is now taken into account, while on the other hand the hysteresis with threshold $\delta_S$ significantly reduces the switching frequency in closed-loop operation. Figure 4 also depicts an output limitation that switches off the control when the energy $\hat{U}_{c,diel}$ exceeds a maximum value $U_{c,diel,max}$. Furthermore, to improve the steady-state behavior, the inner control of the flyback converter is adapted. Depending on the absolute value of the switching function $|S(\tilde{x}_{R,U})|$, the maximum magnetizing current $I^*_{m,max}$ and thus the feeding power $p$ according to Equation (5) is varied. This ensures that for large control deviations, corresponding to large values of $|S(\tilde{x}_{R,U})|$, the maximum feeding power is supplied to achieve the maximum dynamics. In contrast, for small control deviations the power is reduced for a higher accuracy, and the hysteresis threshold $\delta_S$ is adapted accordingly.
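The effect of the off-state and the hysteresis can be sketched with a small state machine. The exact switching thresholds and the sign convention of the controller are not reproduced in the text, so the following is only one plausible variant: the converter is switched on when $|S|$ exceeds $\delta_S$ and switched off again when $S$ crosses zero, and a positive $S$ is assumed to call for discharging. Class and parameter names are hypothetical.

```python
class ThreePointController:
    """Relay with off-state and hysteresis; output is +p_max, 0 or -p_max."""

    def __init__(self, p_max, delta_s):
        self.p_max = p_max
        self.delta_s = delta_s
        self.output = 0.0

    def update(self, s):
        """Return the feeding power for the current value s of the switching function."""
        # Switch off once the switching function has crossed zero
        if self.output > 0.0 and s >= 0.0:
            self.output = 0.0
        elif self.output < 0.0 and s <= 0.0:
            self.output = 0.0
        # Switch on only if the deviation exceeds the hysteresis threshold
        if self.output == 0.0:
            if s > self.delta_s:
                self.output = -self.p_max   # assumed convention: S > 0 -> discharge
            elif s < -self.delta_s:
                self.output = self.p_max    # assumed convention: S < 0 -> charge
        return self.output


if __name__ == "__main__":
    ctrl = ThreePointController(p_max=10.0, delta_s=1.0)
    for s in (0.5, 1.5, 0.5, -0.1, -1.5, -0.5, 0.1):
        print(s, ctrl.update(s))
```

As long as $|S|$ stays inside the band defined by $\delta_S$, the output remains zero, which is what removes the permanent switching of a pure two-point controller in steady state.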
Further details can be found in Hoffstadt and Maas (2017, 2018b).
Figure 5 schematically depicts the test setup used for the experimental validation of the self-sensing estimator and the self-sensing control. It consists of a bidirectional flyback converter that supplies the DE transducer with voltages up to 2.5 kV. The voltage $v_{DE,m}$ is measured with the voltage probe TT-SI 9010 from Testec, while the current $i_{DE,m}$ is determined from the voltage drop across the shunt resistance $R_{is} = 1$ kΩ. Details about the utilized DE transducers can be found in Maas et al. (2015). If no-load scenarios are investigated in the following, the displacement of the DE transducer is directly [...]
The proposed self-sensing algorithm and the energy control are implemented on the DSP of a real-time system from dSPACE operating with a sample rate of $f_{DSP} = 20$ kHz. The system also contains a fast FPGA board. On this board, the control of the flyback converter and the signal conditioning for the measured voltage $v_{DE,m}$ and current $i_{DE,m}$ are performed.

Validation of the EKF-Based Self-Sensing Algorithm
Before the closed-loop self-sensing operation is investigated, the estimation results obtained with the suggested self-sensing approach are compared to results estimated with the sensor-based observer introduced in Hoffstadt and Maas (2017, 2018b). The parameters of the silicone-based DE stack-transducer with $N = 192$ layers are listed in Table 1. This table also includes parameters for the controller used in the following section. Figure 6 compares the estimation results of the proposed self-sensing approach with the sensor-based estimator. The voltage-controlled bidirectional flyback converter supplies the DE stack-transducer stepwise with voltages of $v_{DE} = 1.5$, 2.5, and 2 kV, respectively. The charge $q_p$ determined by filtering the measured current $i_{DE,m}$ according to Equation (10) is used as input for the self-sensing filter, while the sensor-based estimator uses the energy $U_{c,diel}$ as input. The measurement noise $R_{vv} = 4\,\mathrm{V}^2$ required for the implementation of the EKF can be determined experimentally. For this, the output function $g_{qv}$ in Equation (13) and the properties of the voltage probe and of the current measurement via the shunt have to be taken into account. One of the main issues when designing an EKF is to find an appropriate choice of $Q_{ww}$. Here, the numerical optimization approach presented by Powell (2002) is used to minimize the error between simulated and estimated state variables by varying the entries of the symmetric matrix $Q_{ww}$. For the system introduced in section 3 this optimization yields:
$$Q_{ww} = \begin{pmatrix} 4.8 \cdot 10^{-8} & -1.7 \cdot 10^{-9} & -2.2 \cdot 10^{-8} & -9.3 \cdot 10^{-4} \\ -1.7 \cdot 10^{-9} & 4.9 \cdot 10^{-5} & 1.5 \cdot 10^{-4} & 4.4 \cdot 10^{-1} \\ -2.2 \cdot 10^{-8} & 1.5 \cdot 10^{-4} & 1.6 \cdot 10^{-9} & -1.7 \cdot 10^{-5} \\ -9.3 \cdot 10^{-4} & 4.4 \cdot 10^{-1} & -1.7 \cdot 10^{-5} & 7.6 \cdot 10^{4} \end{pmatrix}$$
The entries of this matrix reflect the uncertainty of the model (13) in describing the dynamics of the state variables. While most entries of the matrix are comparatively small, the one in the fourth row and column is very large. This is due to the unknown dynamics of the load tension, which is modeled with $\dot{\sigma}_{load} = 0$ in Equation (13). As the dynamics of the state estimation can be adjusted by the absolute values of the entries in $Q_{ww}$, the scaling factor $\zeta_{qv}$ is introduced. It makes it possible to adjust a compromise between sufficient dynamics, reliable state estimation and noise suppression.
In Figure 6 the no-load scenario with $F_{load} = A \cdot \sigma_{load} = 0$ is considered. As can be seen in the comparison of the measured and estimated strains $\varepsilon_z$ in the top right plot, almost no deviations between the approaches occur in terms of dynamics and accuracy. Due to parameter deviations, the sensor-based filter estimates small load forces, especially during transient operation. For the self-sensing filter, a comparatively small factor of $\zeta_{qv} = 10^{-3}$ is applied here. With this, only negligible deviations in the estimated load force occur, without affecting the estimation results of the state variables shown on the right.
Figure 7 compares the estimation results obtained when a load force of $F_{load} = 2$ N is applied stepwise to the DE stack-transducer with the force-controlled voice coil actuator. When the tensile load is applied, the strain of the DE transducer reduces from $\varepsilon_z \approx 1.9\%$ to $\varepsilon_z \approx 1.1\%$. In voltage-controlled operation this causes a reduction of charge and energy, as can be seen in the top left plot. The sawtooth profile in the charge $q_p$ and energy $U_{c,diel}$ is caused by the voltage control of the flyback converter, which is based on a hysteresis controller. The sensor-based filter estimates the strain and the load force with errors of $|err_\varepsilon| \le 1\%$ and $|err_F| \le 4\%$, respectively. For the self-sensing filter, two parameterizations with $\zeta_{qv} = 10^{-3}$ and $\zeta_{qv} = 1$ are investigated. While with $\zeta_{qv} = 10^{-3}$ the strain and force are estimated with errors of $|err_\varepsilon| \le 1\%$ before the load is applied and after it is released, the dynamics of the estimation are not sufficient to capture the influence of the load correctly.
FIGURE 8 | Comparison of the sensor-based and self-sensing sliding mode energy control. In both cases two-point controllers (2PC) with $I^*_{m,max} = 8$ A and $I^*_{m,max} = 4$ A as well as the three-point controller (3PC) with hysteresis and adaption of the inner flyback converter control are considered.
In contrast, with ζ qv = 1 the influence of the load is accurately estimated. However, with this setting the noise suppression especially for charge states below q p ≤ 5 µAs is not sufficient. Therefore, for the following investigations of the selfsensing control the scaling factor is switched from ζ qv = 10 −3 to ζ qv = 1 if the charge exceed q p ≤ 5 µAs. This ensures an accurate estimation of the inner transducer states at low charge states as well as an accurate detection of a load force and its influence on the states.
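The charge-dependent switching of the scaling factor ζ qv can be illustrated with a minimal Python sketch. It assumes that ζ qv multiplies the whole process noise covariance Q ww as a scalar. The function names and the numerical entries of the base covariance are hypothetical and serve only to mirror the structure described here (small entries, one large entry for the load tension).

```python
import numpy as np

CHARGE_THRESHOLD = 5e-6  # 5 µAs, switching charge quoted in the text
ZETA_LOW = 1e-3          # stronger noise suppression at low charge
ZETA_HIGH = 1.0          # faster load estimation at higher charge


def select_zeta(q_p: float) -> float:
    """Return the scaling factor for the given charge q_p (in As)."""
    return ZETA_HIGH if q_p > CHARGE_THRESHOLD else ZETA_LOW


def scaled_process_noise(q_base: np.ndarray, q_p: float) -> np.ndarray:
    """Scale the base covariance by the charge-dependent factor.

    Assumption: the factor is applied to the full matrix as a scalar.
    """
    return select_zeta(q_p) * q_base


# Hypothetical base covariance: three small entries and a large
# fourth entry representing the unknown load tension dynamics.
Q_BASE = np.diag([1e-6, 1e-6, 1e-6, 1e2])

Q_LOW_CHARGE = scaled_process_noise(Q_BASE, 2e-6)
Q_HIGH_CHARGE = scaled_process_noise(Q_BASE, 6e-6)
```

In an EKF loop, such a function would be evaluated once per sampling step before the covariance prediction, using the most recent charge estimate.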

Validation of the Self-Sensing Control
The parameters of the sliding mode energy controller designed in section 4 are listed in Table 1. The damping coefficient D = 3 and cut-off frequency ω g = 2.430 rad/s were determined with Equation (34). The hysteresis threshold δ S = 4 · U max for the three-point controller in Equation (39) is set to a multiple of the energy increment U max transferred during one switching period T S of the flyback converter. Figure 8 compares the closed-loop operation of the sensor-based controller published in Hoffstadt and Maas (2018b) with that of the proposed self-sensing controller. Initially, none of the feed-forward control approaches suggested in Equations (20)-(22) is considered. Instead, three setpoint steps for the energy U * c,diel are applied that correspond to voltages of v DE = 1.5 kV, 2.5 kV and 2 kV for the silicone-based DE stack-transducer, respectively. For both the sensor-based and the self-sensing control, two-point controllers (2PC) according to Equation (35) with I * m,max = 8 A and I * m,max = 4 A are investigated, as well as the three-point controller (3PC) with hysteresis and adaption of the inner flyback converter control from Equation (39) and Figure 4. To avoid disturbances, the DE stack-transducer is attached between the force measurement and the blocked voice coil so that it cannot deform (ε z = 0).
FIGURE 9 | Bandwidth of the sensor-based and self-sensing sliding mode energy control for the three investigated controller settings.
The feeding power of the flyback converter is adjusted via the setpoint I * m,max of its current control according to Equation (5). Due to the reduced power, the two-point controller with I * m,max = 4 A takes longer to reach the setpoint energies than the one with I * m,max = 8 A. In return, the reduced feeding power results in a higher accuracy during steady state. The standard deviation for the time interval between 50 and 60 ms increases from 0.03 mJ (2PC, I * m,max = 4 A) to 0.05 mJ (2PC, I * m,max = 8 A) for the sensor-based control and from 0.1 mJ to 0.3 mJ for the self-sensing control, respectively. The adaptive three-point controller with hysteresis combines the advantages of the two mentioned two-point controllers by automatically choosing the maximum current I * m,max = 8 A right after setpoint steps and reducing this current to I * m,max = 4 A at steady state. This fundamental behavior applies to both the sensor-based and the self-sensing control. Although the dynamics of both approaches are comparable, a small oscillation around the setpoint can be observed in the case of the self-sensing control, which results in the higher standard deviation.
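The difference between the two-point controllers and the adaptive three-point controller with hysteresis can be illustrated with the following Python sketch. It is a generic switching logic written under stated assumptions and does not reproduce Equations (35) and (39): the energy error is taken as setpoint minus actual value, the converter leaves the idle state when the error magnitude exceeds the hysteresis threshold and returns to idle once the error changes sign, and the current setpoint is chosen from a hypothetical error band `fast_band`. All class and parameter names are illustrative.

```python
from dataclasses import dataclass

I_MAX_FAST = 8.0  # A, current setpoint used right after setpoint steps
I_MAX_SLOW = 4.0  # A, current setpoint used close to steady state


def two_point(energy_error: float, i_max: float):
    """Two-point control: always charge (+1) or discharge (-1)."""
    mode = 1 if energy_error >= 0.0 else -1
    return mode, i_max


@dataclass
class ThreePointController:
    delta_s: float    # hysteresis threshold in J (hypothetical value)
    fast_band: float  # error magnitude above which 8 A is used (assumed)
    mode: int = 0     # +1 charge, -1 discharge, 0 idle

    def step(self, energy_error: float):
        """Return (mode, current setpoint) for the present energy error."""
        if self.mode == 0:
            # Leave the idle state only outside the hysteresis band.
            if abs(energy_error) > self.delta_s:
                self.mode = 1 if energy_error > 0.0 else -1
        elif self.mode * energy_error <= 0.0:
            # Setpoint reached or crossed: return to the idle state.
            self.mode = 0
        large_error = abs(energy_error) > self.fast_band
        i_max = I_MAX_FAST if large_error else I_MAX_SLOW
        return self.mode, i_max
```

Because the three-point logic idles inside the hysteresis band, it does not toggle between charging and discharging at steady state, which is the mechanism behind the reduced switching frequency.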
Furthermore, it can be seen that the two-point controllers permanently switch between the maximum charging and discharging power p = ±p max during steady state. By extending the controller to a three-point controller with hysteresis, the switching frequency is reduced by more than 80% in the case of the sensor-based control and by 30% in the case of the self-sensing control. Figure 9 compares the bandwidth of the introduced controller settings. For this purpose, the small-signal behavior is considered. A harmonic setpoint U * c,diel with increasing frequency, an offset of Ū c,diel = 12 mJ and an amplitude of U c,diel,amp = 2 mJ is applied. The sensor-based two-point controller with I * m,max = 8 A and the three-point controller have a high −3 dB cut-off frequency of about 400 Hz. This is also obtained with the self-sensing control. However, disruptive amplitude peaks of about 5 dB result in the already observed oscillation. By reducing the feeding power p with I * m,max = 4 A, the cut-off frequency is reduced to 200 Hz, while the amplitude peaks are suppressed.
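How a −3 dB cut-off frequency can be read from such a frequency sweep is sketched below in Python. The sketch assumes that the response amplitude of the energy has already been extracted for each excitation frequency and that the gain is expressed as 20·log10 of the amplitude ratio. The data are synthetic (an ideal first-order low-pass with a 400 Hz corner frequency) and are not the measurements of Figure 9. All function names are illustrative.

```python
import numpy as np


def gain_db(setpoint_amp, response_amp):
    """Gain in dB from setpoint and response amplitudes."""
    return 20.0 * np.log10(np.asarray(response_amp, dtype=float) / setpoint_amp)


def cutoff_frequency(freqs, gains_db, level_db=-3.0):
    """First frequency at which the gain falls below level_db.

    Interpolates linearly over log-frequency between the two
    neighbouring sweep points; returns None if never reached.
    """
    freqs = np.asarray(freqs, dtype=float)
    gains_db = np.asarray(gains_db, dtype=float)
    below = np.nonzero(gains_db < level_db)[0]
    if below.size == 0:
        return None
    k = int(below[0])
    if k == 0:
        return float(freqs[0])
    x0, x1 = np.log10(freqs[k - 1]), np.log10(freqs[k])
    y0, y1 = gains_db[k - 1], gains_db[k]
    x = x0 + (level_db - y0) * (x1 - x0) / (y1 - y0)
    return float(10.0 ** x)


# Synthetic stand-in for a measured sweep with 2 mJ setpoint amplitude.
SETPOINT_AMP = 2e-3
freqs = np.logspace(1.0, 3.5, 200)
response_amp = SETPOINT_AMP / np.sqrt(1.0 + (freqs / 400.0) ** 2)
f_c = cutoff_frequency(freqs, gain_db(SETPOINT_AMP, response_amp))
```

For a response with resonance peaks, as observed for the self-sensing control, the same function still returns the first crossing of the −3 dB level, while the peak height can be read from the maximum of the gain array.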

Energy Control With Voltage Feed-Forward Control
By applying Equation (22) for the feed-forward control depicted in Figure 4, the voltage v p across the capacitance C p can be controlled. In Figure 10 the results of the sensor-based and self-sensing three-point energy controller are compared to the behavior obtained with the hysteresis voltage control for the bidirectional flyback converter suggested in . For the hysteresis voltage control a threshold of v DE = 30 V was chosen. If the control deviation |v * DE − v DE,m | exceeds this threshold, the control activates the flyback converter to charge or discharge the DE transducer. Afterwards the converter returns to the idle state. The three-point controller suggested here behaves in essentially the same way. The only difference is that the controller settings from Table 1 result in a threshold of v p ≈ 16 V. On the one hand, this smaller threshold increases the steady-state accuracy. On the other hand, the switching frequency is slightly higher than with the simple hysteresis voltage control. Regarding the sensor-based and self-sensing energy control, a behavior comparable to that shown and explained for Figure 8 can be observed here as well.
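The link between a threshold on the energy and the resulting threshold on the voltage can be made tangible with a short Python sketch. It assumes the energy of a linear capacitor, U = ½ · C p · v p ², as the mapping between voltage and energy. Whether this matches the exact form of Equation (22) is an assumption, and the capacitance and energy band used below are hypothetical values, not those of Table 1.

```python
import math


def energy_from_voltage(v_p: float, c_p: float) -> float:
    """Energy stored in a linear capacitance c_p at voltage v_p."""
    return 0.5 * c_p * v_p ** 2


def voltage_from_energy(u: float, c_p: float) -> float:
    """Inverse mapping: voltage that corresponds to the energy u."""
    return math.sqrt(2.0 * u / c_p)


def voltage_band(delta_u: float, v_set: float, c_p: float) -> float:
    """Voltage deviation equivalent to an energy deviation delta_u
    above the energy that belongs to the voltage setpoint v_set."""
    u_set = energy_from_voltage(v_set, c_p)
    return voltage_from_energy(u_set + delta_u, c_p) - v_set


C_P = 5e-9      # F, hypothetical transducer capacitance
DELTA_U = 1e-4  # J, hypothetical energy hysteresis threshold

band_at_1kv = voltage_band(DELTA_U, 1000.0, C_P)
band_at_2kv = voltage_band(DELTA_U, 2000.0, C_P)
```

Under this assumption a fixed energy threshold corresponds to a voltage band of roughly ΔU / (C p · v p), so the equivalent voltage threshold shrinks as the operating voltage rises.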

Energy Control With Position Feed-Forward Control
If the setpoint energy U * c,diel of the control structure in Figure 4 is determined with Equation (20), the proposed energy control can be used to adjust a certain strain ε * z , although the strain is not part of the state vector x U . To compensate the influence of a disturbance, the estimated load σ load is considered in Equation (20). In contrast, an explicit position control based on the model (6) was derived in Hoffstadt and Maas (2017). Figure 11 compares the explicit (Position-3PC) and the energy-based position control (Energy-3PC) for the no-load case of the DE stack-transducer. Both approaches are realized as sensor-based and self-sensing control with the adaptive three-point controller from Equation (39). The explicit and energy-based position control show comparable dynamics and accuracy for both the sensor-based and the self-sensing control. The different setpoints are reached within a few milliseconds. When the strain setpoint ε * z is increased, the energy U * c,diel also increases according to Equation (20). Instead of the energy, the voltage v DE is depicted in Figure 11, as it is measured directly and can be interpreted more intuitively. Equation (22) provides the relationship between the energy and the voltage v p ≈ v DE . As the no-load case is considered here, a constant setpoint of the strain ε * z results in a constant setpoint for the energy U * c,diel or the voltage, respectively.
FIGURE 13 | Comparison of the sensor-based and self-sensing sliding mode energy control with force feed-forward control. In both cases two-point controllers (2PC) with I * m,max = 8 A and I * m,max = 4 A as well as the three-point controller (3PC) with hysteresis and adaption of the inner flyback converter control are considered.
In addition, Figure 12 depicts the disturbance response of the different position control approaches. For this purpose, a tensile load force of F * load = 0.5 N is applied by the linear drive of the test rig in Figure 5, while the setpoint strain is held constant at ε * z = 1%. Right after the load is applied, the strain deviates from its setpoint due to the influence of the disturbance. However, the load is estimated by the sensor-based as well as the self-sensing EKF. According to Equation (20) the setpoint energy U * c,diel , or voltage v * DE , respectively, is increased to compensate the influence of the disturbance σ load . In Figure 12 this behavior can be seen in the response of the corresponding voltage v DE in the third subplot. The influence of the disturbance is thereby compensated within approx. 15 ms. In the case of the energy-based position control a slightly higher control deviation can be observed after the load steps. This is mainly because the energy control only reacts to control deviations of the energy U c,diel , while the explicit position control considers the control deviation of the strain ε z directly.
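Why a tensile load raises the energy setpoint can be illustrated with a strongly simplified static model in Python. This is not Equation (20). The sketch assumes ideal parallel-plate layers of an incompressible elastomer, for which the electrostatic pressure equals twice the stored energy per active volume, a linear elastic tension Y · ε z , a strain ε z counted positive for contraction, and a tensile load tension that opposes the contraction. The material and geometry values are hypothetical.

```python
def energy_setpoint(strain_set: float, sigma_load_est: float,
                    youngs_modulus: float, volume: float) -> float:
    """Energy setpoint from a simplified static force equilibrium.

    Assumed balance: 2 * U / V = Y * strain + sigma_load, hence
    U* = 0.5 * V * (Y * strain* + sigma_load).
    """
    sigma_required = youngs_modulus * strain_set + sigma_load_est
    # A negative result would mean the load alone produces the strain;
    # the stored energy cannot be negative, so clip at zero.
    return max(0.0, 0.5 * volume * sigma_required)


Y_MODULUS = 1e6  # Pa, hypothetical Young's modulus
VOLUME = 1e-6    # m^3, hypothetical active dielectric volume

u_no_load = energy_setpoint(0.01, 0.0, Y_MODULUS, VOLUME)
u_with_load = energy_setpoint(0.01, 5e3, Y_MODULUS, VOLUME)
```

With the estimated load tension fed into the setpoint in this way, the energy controller itself remains unchanged and only its reference is adapted, which matches the feed-forward structure described here.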

Energy Control With Force Feed-Forward Control
Besides the two validated approaches, Equation (21) offers the opportunity to realize a force feed-forward control under consideration of the current elastic material tension σ elast (ε z ) based on the proposed energy control, as already depicted in Figure 4. As for the previous two approaches, the controller settings are the same as listed in Table 1. However, in Figure 13 the two-point controllers with I * m,max = 8 A and I * m,max = 4 A are considered again as well. The deformation of the DE stack-transducer is blocked in this case to investigate the control behavior caused by setpoint steps without any disturbance. In general, a comparable behavior to the pure energy control in Figure 8 can be observed.

CONCLUSION
DE transducers combine high energy densities and multifunctional operation modes. Multilayer topologies like the DE stack-actuator considered here also have high force densities with considerable absolute deformations, so that they are well-suited to be used as active skins or as end effectors in soft-robotic applications. However, besides the transducer design, appropriate control and sensing algorithms are also required to enable the combined actuator-sensor operation in closed loop without external sensors to measure mechanical states. The design of such a self-sensing state and disturbance estimator, as well as of a universal energy control that uses the information from this novel estimator, was addressed within this contribution.
For this purpose, section 2 summarized the control plant, comprising a DE stack-transducer fed by a bidirectional flyback converter, and its model describing the electromechanically coupled behavior. To characterize the electrical behavior the model includes the energy U c,diel as one state variable. Based on this model a self-sensing state and disturbance estimator was subsequently developed that estimates the mechanical state of the transducer as well as an external load force by just measuring the terminal voltage and current. Due to the nonlinear system behavior an EKF was used for this purpose. It makes it possible to estimate the transducer state without any superimposed voltage excitation as used for other self-sensing approaches. The validation results have shown that there are almost no limitations in terms of dynamics and accuracy compared to the sensor-based estimator, which requires a measurement of the terminal voltage and the displacement.
The developed energy control uses the information provided by the self-sensing EKF for closed-loop operation. Because the bidirectional flyback converter either charges or discharges a DE transducer with almost constant power when enabled, the sliding mode control approach was applied. By controlling the energy in the capacitance of the DE transducer it is possible to control the voltage, force or displacement of the transducer by using different feed-forward control structures. The setpoint energy required to achieve a certain actuator force or displacement was obtained under consideration of the static force equilibrium included in the derived model. The validation showed that a precise control of the voltage, force and displacement with high dynamics and a bandwidth of up to 400 Hz is achieved with this approach. The step response as well as the disturbance response yield comparable dynamics and accuracy for both the sensor-based and self-sensing control.
Although a DE stack-transducer was considered here, the developed self-sensing EKF and control approach can also be applied to other topologies well-suited for soft-robotic applications, like DE-based minimum energy structures or membrane actuators. The utilized bidirectional flyback converter represents an efficient and competitive converter topology and can also be used to supply any kind of DE transducer. In the case of soft-bodied robots equipped with DE transducers and the mentioned converter, the suggested self-sensing control approach can be used to control the impedance of the robot by applying the proposed force and displacement feed-forward controls in combination with a human-machine-interface model. If, for the utilized test setup, a charge of at least q p = 5 µAs is applied, the proposed self-sensing filter can also detect collisions or interactions. This could be used e.g., in human-machine interfaces or active skins, so that the control can react to these events. While the force and displacement control are most important for these applications, the voltage control could be used to avoid exceeding limits that would damage the transducer.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this manuscript will be made available upon request to JM (juergen.maas@tu-berlin.de).