Perform Simulated Experiments

In this section, we describe how to run a simulated experiment with the parameters you have defined in /app/bace/user_config.py. The relevant files are /tools/simulation/simulation.py and /tools/simulation/simulation_output/figures.R.

Important: the simulation relies on the specifications set in the /app/bace/user_config.py file, the same file your deployed application will use in the cloud.

We strongly recommend running a simulation on your local computer prior to running a full-scale experiment. This process can help identify bugs in your code and determine whether something is misspecified in your configuration.

Perform Simulation

To perform a simulation with your specified parameters, run the following from the root directory:

cd tools/simulation
python simulation.py

This file runs a simulated experiment. You can specify the characteristics of your simulation within the /tools/simulation/simulation.py script.

###############################
# Specify Simulation Parameters

N_sims = 200 # Number of simulated individuals per optimization type
N_designs_per_sim = 25 # Number of questions per simulation

###############################

This configuration specifies that, for each optimization type (BACE, random), 200 different "respondents" will each answer 25 questions.

For each "respondent", a "true" parameter is drawn from the prior distribution specified in theta_params, but this can be changed. The algorithm simulates asking 25 rounds of questions to each individual and computes the posterior mean for each parameter after each round. We compare performance for questions that are chosen randomly vs. questions that are chosen using Bayesian Optimization. The simulation is useful for estimating how computation time scales locally and precision for different levels of size_thetas and other tuning parameters. The simulation produces a .csv file, /app/simulation_output/simulation.csv.

Note that optimization time locally depends on your available resources and may differ substantially from the times you will see once your Lambda function is deployed on AWS.

Produce Figures

After running a simulation, you can produce graphs that show how your model performs relative to randomly selected designs under the same posterior updating scheme. To produce basic figures, change into the simulation_output folder and run figures.R:

cd simulation_output
Rscript figures.R

This file will produce:

  • Scatterplots comparing the true parameter and estimated parameter across rounds.
  • Figures displaying the average mean squared error by method across rounds.
  • A histogram comparing the average time for each round by method.

These figures are useful for understanding the performance and timing you can expect given your experiment configuration and can be found in the simulation_output folder.
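
If you prefer to inspect the raw output directly rather than rely on the R figures, a short pandas script can reproduce the MSE-by-round comparison. The column names below (method, round, true_theta, posterior_mean) are assumptions for illustration; check the header of simulation.csv for the actual names.

import pandas as pd

# Load the simulation results (run from the simulation_output folder).
df = pd.read_csv("simulation.csv")

# Squared error of the posterior-mean estimate for each respondent/round.
df["sq_error"] = (df["posterior_mean"] - df["true_theta"]) ** 2

# Average MSE by method across rounds, mirroring one of the figures
# produced by figures.R.
mse = df.groupby(["method", "round"])["sq_error"].mean().unstack("method")
print(mse)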

Note: After running Rscript figures.R, your working directory is the simulation_output folder. If you want to rerun simulation.py, change back to the tools/simulation directory first by running cd ..

Conclusion

Running a simulation is useful for a few reasons. First, you can see how optimally chosen questions perform compared to randomly chosen questions. Second, running a simulation should help you identify major errors in the configuration file that you defined. Finally, it gives you a rough idea of how long each question will take under your current configuration. You can tune parameters such as size_thetas and those in conf_dict in your /app/bace/user_config.py file to compare performance and timing across different specifications.
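
As a rough illustration of why size_thetas matters for timing, the toy benchmark below times repeated likelihood-and-reweighting passes over parameter samples of different sizes. It is not the BACE updating step itself, only a sketch of how per-question cost can grow with the size of the parameter sample.

import time
import numpy as np

rng = np.random.default_rng(0)

for size_thetas in (1_000, 5_000, 10_000):
    thetas = rng.normal(size=size_thetas)            # stand-in parameter sample
    start = time.perf_counter()
    for _ in range(100):                             # 100 mock posterior updates
        like = 1.0 / (1.0 + np.exp(-thetas * 0.5))   # toy likelihood
        weights = like / like.sum()                  # normalized importance weights
    elapsed = time.perf_counter() - start
    print(f"size_thetas={size_thetas:>6}: {elapsed:.3f}s for 100 updates")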