Deploy at Scale
This section provides helpful information for deploying your survey at scale. Up to this point, your application has only experienced a few requests at a time. When you deploy your survey, the application will need to handle multiple requests at once. Depending on the complexity of your configuration and the expected number of users, you may need to adjust computational resources to handle increased traffic at scale. In this section, we walk through the process for load testing your application to see how your application scales.
The following parameters can be tuned to affect the speed of your application.
AWS Lambda Global Parameters
Specify global parameters for your Lambda function in template.yaml:

```yaml
Globals:
  Function:
    Timeout: 600     # Time (in seconds) before the function times out.
    MemorySize: 512  # Memory (in MB) allocated to the function.
```
See the AWS documentation on setting memory and computing power for more details on balancing cost and speed. At the time of writing, you can allocate between 128 MB and 10,240 MB of memory to your functions.
Check Lambda Concurrency Quota
New AWS accounts may have reduced concurrency and memory quotas. If your function requires more than 3,008 MB of memory, you may need to request a quota increase. Standard applications of BACE should operate smoothly with less than 3,008 MB of memory.
You can also increase the number of machines that run concurrently to scale your application. The default limit on the number of Lambda containers that can run concurrently is 1,000. However, in some regions this limit is throttled to a lower value for new AWS accounts.
You can check your quota for concurrent AWS Lambda executions by opening the Service Quotas console at https://console.aws.amazon.com/servicequotas/home. Select AWS Lambda and check the applied quota value for Concurrent executions. This value is the maximum number of Lambda containers that can run simultaneously.
If you anticipate having more people take the survey simultaneously than the number that is listed, you should request a concurrency limit increase.
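If you prefer the command line, you can read (and request an increase to) the same quota with the AWS CLI. This is a sketch that assumes the CLI is installed and configured; L-B99A9384 is the quota code for Lambda's Concurrent executions quota.

```shell
# Read the applied concurrency quota (quota code L-B99A9384 = "Concurrent executions").
aws service-quotas get-service-quota \
    --service-code lambda \
    --quota-code L-B99A9384 \
    --query 'Quota.Value'

# Request an increase, e.g. to 1,000 concurrent executions.
aws service-quotas request-service-quota-increase \
    --service-code lambda \
    --quota-code L-B99A9384 \
    --desired-value 1000
```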
Application Tuning Parameters
You can also tune the speed and precision of your application by specifying parameters in /app/bace/user_config.py.
```python
size_thetas = 2500  # Size of sample drawn from the prior distribution over preference parameters.
max_opt_time = 5    # Stop the Bayesian Optimization process after max_opt_time and return the best design.

# Configuration dictionary for Bayesian Optimization.
# See https://github.com/ARM-software/mango#6-optional-configurations for details.
# Constraints and early stopping rules can also be added here.
conf_dict = dict(
    domain_size=1000,
    initial_random=1,
    num_iteration=20
)
```
- `max_opt_time`: Sets the maximum time to spend on the Bayesian Optimization step. If the optimization step takes longer than `max_opt_time`, the process is stopped and the design with the highest mutual information found up to that point is returned.
- `size_thetas`: Governs the number of points that are sampled from `theta_parameters` to form your prior distribution. As you increase `size_thetas`, the number of points used to estimate the posterior increases, and estimates become more precise. However, increasing `size_thetas` also increases the computation time required to update beliefs and choose new designs, so users must trade off precision and speed.
- Bayesian Optimization parameters can be set by specifying a `conf_dict` (dict) object in your `user_config.py` file. See the Mango documentation for details on how to specify this object. To choose the optimal next design, the Bayesian Optimization algorithm selects `initial_random` points at random and computes the mutual information at these initial points. For `num_iteration` subsequent iterations, a Gaussian process is fit to the data; the design that offers the highest expected improvement is chosen, and the mutual information is computed at that design. This observation is added to the model's data, and the fitting process is repeated. Increasing `num_iteration` weakly improves optimization performance at the expense of computation time.
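To make the `size_thetas` trade-off concrete, here is a self-contained toy sketch (not BACE's actual implementation): it draws `size_thetas` points from an assumed normal prior, weights each draw by the likelihood of simulated logit choice data, and returns the weighted posterior mean. Larger samples give a more stable posterior estimate at the cost of proportionally more likelihood evaluations.

```python
import math
import random

random.seed(0)

def posterior_mean(size_thetas, true_theta=1.0, n_obs=50):
    """Toy importance-sampling posterior update over a preference parameter."""
    # Draw size_thetas points from an assumed N(0, 2) prior.
    thetas = [random.gauss(0.0, 2.0) for _ in range(size_thetas)]
    # Simulate binary choices from a logit model with the true parameter.
    p_true = 1.0 / (1.0 + math.exp(-true_theta))
    data = [1 if random.random() < p_true else 0 for _ in range(n_obs)]
    # Weight each prior draw by its likelihood of the simulated data.
    weights = []
    for t in thetas:
        p = 1.0 / (1.0 + math.exp(-t))
        log_lik = sum(math.log(p) if y else math.log(1.0 - p) for y in data)
        weights.append(math.exp(log_lik))
    total = sum(weights)
    return sum(w * t for w, t in zip(weights, thetas)) / total

# Doubling size_thetas roughly doubles the work in the loop above,
# but stabilizes the posterior estimate.
print(posterior_mean(100))
print(posterior_mean(2500))
```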
Load Testing
We provide a script to help you test how your application will run at scale. We use Locust, an open-source load testing tool, to see how your program will handle multiple users at once. Follow the instructions to install Locust on your local computer prior to working through this section.
The key file is `run_load_test.py`, which is located in the root of your directory. This file defines the `appUser` class, which simulates an individual taking the survey. The basic script can be updated to issue additional calls or requests depending on how you have coded your application.
Make sure to set up Locust for your environment by following the link above. Start a new load testing session by running:

```shell
locust -f run_load_test.py
```

Example output:

```
[2022-06-07 12:59:35,463] DESKTOP-VBEES43/INFO/locust.main: Starting web interface at http://0.0.0.0:8089 (accepting connections from all network interfaces)
[2022-06-07 12:59:35,499] DESKTOP-VBEES43/INFO/locust.main: Starting Locust 2.9.0
```
The command will tell you how to access the locally hosted web interface, which is typically available at http://0.0.0.0:8089 or localhost:8089 in your browser.
Specify the number of users, the spawn rate, and the host website for your application (`<your-URL>`). Do not include a trailing forward slash (/) when typing the web address. Click Start swarming, and a new test will begin. The web page records the response times for each type of request. You can scale to multiple users by clicking Edit and changing the Number of users and Spawn rate.
To exit the Locust test from the command line, press Ctrl + C, or click Stop in your web browser. Aggregate timing statistics will be printed.
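You can also run the same test without the web interface by passing Locust's standard headless flags. The user count, spawn rate, run time, and placeholder host below are illustrative values.

```shell
# Headless run: 50 users, spawning 5 per second, for 2 minutes.
# Replace <your-URL> with your application's host (no trailing slash).
locust -f run_load_test.py --headless -u 50 -r 5 -t 2m --host https://<your-URL>
```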
Note that this process simulates real users, so you will be charged if the number of test calls exceeds the free tier limits for AWS.
Pricing
For typical applications, most costs will be covered by the AWS Free Tier. For example, DynamoDB offers up to 25 GB of free storage, and Lambda allows 1 million free requests per month. If you are worried about exceeding these limits, you can use the AWS Pricing Calculator to estimate costs.
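To get a rough sense of Lambda costs before opening the calculator, you can sketch the arithmetic yourself. The rates and free-tier figures below ($0.20 per million requests, roughly $0.0000167 per GB-second, with 1 million free requests and 400,000 free GB-seconds per month) reflect published x86 pricing in us-east-1 at the time of writing; verify them against current AWS pricing before relying on this estimate.

```python
def lambda_monthly_cost(requests, avg_duration_s, memory_mb,
                        price_per_million_requests=0.20,
                        price_per_gb_second=0.0000166667,
                        free_requests=1_000_000,
                        free_gb_seconds=400_000):
    """Estimate monthly Lambda cost in USD (rates are assumptions; verify)."""
    gb_seconds = requests * avg_duration_s * (memory_mb / 1024)
    request_cost = max(requests - free_requests, 0) / 1e6 * price_per_million_requests
    compute_cost = max(gb_seconds - free_gb_seconds, 0) * price_per_gb_second
    return request_cost + compute_cost

# 1,000 respondents x 20 requests each, 3 s per request at 512 MB
# stays entirely inside the free tier.
print(lambda_monthly_cost(20_000, 3.0, 512))  # -> 0.0
```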