I want to create simulated historical data for a 3x leveraged ETF, specifically UPRO (3x the S&P 500). I've made some attempts myself using historical SPY data, multiplying the daily change by 3 and applying the daily expense ratio, but I'm not coming close: when I compare my results to the existing UPRO data, they don't match well. There is a data set on Bogleheads that has the daily % change calculated, but I need open/close data. Does anyone have any idea how to create this data? I would be willing to pay someone to create it for me if that's what it takes. Thanks.

I think you should clarify your question with additional information or an example. Normally you can't treat simulated data as market data for real stocks, indices, etc. Simulated stock prices are usually generated with geometric Brownian motion (GBM): https://en.wikipedia.org/wiki/Geometric_Brownian_motion PM me if you need help or assistance, including maths and programming.

This may help: https://teddykoker.com/2019/04/simulating-historical-performance-of-leveraged-etfs-in-python/ Even if you don't know Python, you can look at the equations to get the gist of what he's doing. To summarize how he calculates the leveraged return:
1) The daily % change (P) of the S&P 500 proxy (VFINX in this case) is calculated.
2) The estimated daily % change for the simulated UPRO data is calculated as: Pu = (P - ExpenseRatio / 252) * Leverage. ExpenseRatio is the 0.92% the ETF charges as a fee, 252 is the number of trading days in a year, and Leverage is the leverage factor (3 in this case).
3) The simulated values are then calculated by compounding all of the Pu values from the beginning of the dataset.
The simulation starts from the inception of UPRO (2009), and the UPROsim results he gets are almost identical to the actual UPRO chart.
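
The three steps above can be sketched in a few lines of plain Python. This is only a minimal illustration of the formula as summarized in this post, not the article's own code; the function name `simulate_leveraged`, its parameters, and the sample prices are made up for the example.

```python
def simulate_leveraged(closes, leverage=3.0, expense_ratio=0.0092,
                       trading_days=252, initial_value=1.0):
    """Compound leveraged daily returns of an index proxy into a simulated close series."""
    sim = [initial_value]
    for prev, curr in zip(closes, closes[1:]):
        p = curr / prev - 1.0                                # step 1: daily % change of the proxy
        pu = (p - expense_ratio / trading_days) * leverage   # step 2: formula as summarized above
        sim.append(sim[-1] * (1.0 + pu))                     # step 3: compound from the start
    return sim


# Hypothetical proxy closes, for illustration only (not real market data).
proxy_closes = [100.0, 101.0, 99.5, 102.0]
print(simulate_leveraged(proxy_closes))
```

To go further back than 2009, you would feed it the proxy's closes from before UPRO's inception and scale `initial_value` so the series lines up with the real UPRO price on an overlapping date.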

I guess I don't want completely simulated data. I want to generate historical, pre-inception data for UPRO using its underlying index, the S&P 500, and applying leverage and the expense ratio. I would like to test on UPRO further back than 2009.

OK, but then it's not simulated data; it looks more like data extrapolated from other (underlying) data.

You could use a correlation method as described in this article: https://www.quantstart.com/articles/Generating-Correlated-Asset-Paths-in-C-via-Monte-Carlo/ Some years ago I did a project on this in C++; it was about generating such correlated GBM data series to simulate the correlated movement of many stocks affected by the same event. In your case one data series is fixed (i.e. the S&P 500 data) and the UPRO series has to be approximately recreated from it using some correlation maths. Yes, it's doable, but it requires some time and work... See also https://en.wikipedia.org/wiki/Cholesky_decomposition though there are other, similar methods for decomposing the correlation matrix.
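
For two series, the Cholesky step reduces to a simple closed form, so the idea can be shown without a matrix library. The sketch below is an illustration under that two-asset assumption, not the article's C++ code; `correlated_normals` and its parameters are hypothetical names.

```python
import math
import random


def correlated_normals(rho, n, seed=0):
    """Return n pairs of standard normal draws with correlation rho.

    For the 2x2 correlation matrix [[1, rho], [rho, 1]] the Cholesky
    factor is [[1, 0], [rho, sqrt(1 - rho^2)]], which gives the mix below.
    """
    rng = random.Random(seed)
    root = math.sqrt(1.0 - rho * rho)
    pairs = []
    for _ in range(n):
        x1 = rng.gauss(0.0, 1.0)  # independent standard normal draws
        x2 = rng.gauss(0.0, 1.0)
        pairs.append((x1, rho * x1 + root * x2))
    return pairs


# Illustrative use: shocks for two series with an assumed correlation of 0.8.
shocks = correlated_normals(rho=0.8, n=5)
print(shocks)
```

In the setup described above, the first component would be replaced by the standardized S&P 500 returns (which are fixed), and only the second component would be generated.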

The close values can be calculated by applying the daily change to the previous close. The open is random, but it can be simulated statistically from the mean and standard deviation of the open relative to the previous close for the reference (UPRO). You can do this easily in Excel by drawing from a normal distribution with NORM.INV(RAND(), mean, standard_dev); note that NORM.DIST() only returns the density or cumulative probability, not a random draw.
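
The same idea can be sketched in Python instead of Excel. This is a rough illustration that assumes the overnight gap (open vs. previous close) is normally distributed and independent of everything else; the function names and the sample numbers are made up.

```python
import random
import statistics


def gap_stats(opens, closes):
    """Mean and standard deviation of the overnight gap: open / previous close - 1."""
    gaps = [o / prev_close - 1.0 for o, prev_close in zip(opens[1:], closes)]
    return statistics.mean(gaps), statistics.stdev(gaps)


def simulate_opens(sim_closes, gap_mean, gap_std, seed=0):
    """Draw an open for each day from the previous simulated close plus a random gap."""
    rng = random.Random(seed)
    opens = [None]  # the first day has no previous close to gap from
    for prev_close in sim_closes[:-1]:
        opens.append(prev_close * (1.0 + rng.gauss(gap_mean, gap_std)))
    return opens


# Hypothetical reference data standing in for real UPRO opens and closes.
ref_opens = [10.0, 10.3, 10.1, 10.6]
ref_closes = [10.2, 10.0, 10.5, 10.4]
mean, std = gap_stats(ref_opens, ref_closes)
print(simulate_opens([1.0, 1.03, 0.99, 1.05], mean, std))
```

Because each open is drawn independently of that day's close, nothing ties the open to the day's eventual move, so treat the result as a rough stand-in rather than a reconstruction.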

Isn't it easier to just get this data from a data provider? For a small fee, you can get OHLC tick data from algoseek.