Files

Timour Katchaounov 3ade0b9fe3 SERVER-72236 Generate random integer data for CE

Generate random data with integers. The approach is as follows:
- There is one collection for each distinct cardinality. All collections contain the same fields.
- Each field contains data generated from a particular distribution. The values can be of a single type or of mixed types, and can follow a single mathematical distribution (e.g. normal) or a mixture of distributions.
- The committed configuration file and the corresponding data file are reduced to only two small collections, so that Evergreen tests run fast and the git repository stays small. For actual experiments, add more data sizes to the configuration and re-generate the data locally.
- All data is saved in a single JavaScript file: jstests/query_golden/libs/data/ce_accuracy_test.data, with a corresponding schema file jstests/query_golden/libs/data/ce_accuracy_test.schema.
- The data file is a JavaScript file that can be loaded directly inside a JS test. Loading it creates a global variable dataSet. The data is packaged this way because it is the only way to load an external JSON file that does not require installing external tools in Evergreen.
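
The approach above can be illustrated with a minimal sketch. It is not the code in ce_generate_data.py: the function names, the collection naming scheme, the document layout (`collName`, `collData`), and the distribution parameters are all assumptions made for the example. Only the overall shape follows the description: one collection per cardinality, the same fields in every collection, each field drawn from its own distribution, and output rendered as JavaScript source that defines a global `dataSet`.

```python
import json
import random


def generate_field(distribution, size, rng):
    """Return `size` random integers drawn from the named distribution.

    The distribution names and their parameters are hypothetical.
    """
    if distribution == "uniform":
        return [rng.randint(0, 1000) for _ in range(size)]
    if distribution == "normal":
        # Round normally distributed floats to the nearest integer.
        return [round(rng.gauss(500, 100)) for _ in range(size)]
    if distribution == "mixed":
        # Draw each value from one of the two distributions above.
        return [
            rng.randint(0, 1000) if rng.random() < 0.5 else round(rng.gauss(500, 100))
            for _ in range(size)
        ]
    raise ValueError(f"unknown distribution: {distribution}")


def generate_data_set(cardinalities, fields, seed=0):
    """Build one collection per cardinality; all collections share the same fields."""
    rng = random.Random(seed)  # fixed seed so that re-generation is reproducible
    data_set = []
    for cardinality in cardinalities:
        columns = {name: generate_field(dist, cardinality, rng) for name, dist in fields.items()}
        docs = [{name: columns[name][i] for name in fields} for i in range(cardinality)]
        # "collName" and "collData" are assumed key names, not the real file format.
        data_set.append({"collName": f"ce_data_{cardinality}", "collData": docs})
    return data_set


def to_js_source(data_set):
    """Render the data as JavaScript source that assigns a global `dataSet`."""
    # An undeclared assignment creates a global in non-strict JavaScript; the
    # real data file may declare the variable differently.
    return "dataSet = " + json.dumps(data_set) + ";\n"
```

Calling `to_js_source(generate_data_set([100, 1000], {"a": "uniform", "b": "normal"}))` returns a string that could be written to a file and loaded from a JS test. Because the payload is plain JSON, the test needs no extra tooling to parse it.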

2023-01-10 12:51:54 +00:00

.gitignore

SERVER-69030 Add python requirements.txt for Cost Model

2022-11-10 11:18:11 +00:00

abt_calibrator.py

SERVER-71220 Apply additional filters when calibrating Cost Model

2022-11-11 12:20:20 +00:00

benchmark.py

SERVER-67161: Add physical NestedLoopJoin and make BinaryJoin logical

2022-12-02 18:22:49 +00:00

calibration_settings.py

SERVER-72036 Implement data generation and loading into JS CE accuracy tests

2022-12-17 10:06:36 +00:00

ce_data_settings.py

SERVER-72236 Generate random integer data for CE

2023-01-10 12:51:54 +00:00

ce_generate_data_settings.py

SERVER-72036 Implement data generation and loading into JS CE accuracy tests

2022-12-17 10:06:36 +00:00

ce_generate_data.py

SERVER-72236 Generate random integer data for CE

2023-01-10 12:51:54 +00:00

common.py

…

config.py

SERVER-72036 Implement data generation and loading into JS CE accuracy tests

2022-12-17 10:06:36 +00:00

cost_estimator.py

SERVER-71220 Apply additional filters when calibrating Cost Model

2022-11-11 12:20:20 +00:00

data_generator.py

SERVER-72036 Implement data generation and loading into JS CE accuracy tests

2022-12-17 10:06:36 +00:00

database_instance.py

SERVER-70500 Calibrate ABT nodes on smaller queries

2022-11-08 21:09:45 +00:00

end_to_end.py

SERVER-72036 Implement data generation and loading into JS CE accuracy tests

2022-12-17 10:06:36 +00:00

execution_tree.py

SERVER-71576 Implement Cost Model End2End testing

2022-11-24 11:18:42 +00:00

experiment.py

SERVER-71576 Implement Cost Model End2End testing

2022-11-24 11:18:42 +00:00

mongod-inmemory.yaml

SERVER-71363 Use Nanoseconds rather than Microseconds in QueryExecTime and support Nanoseconds in ScopedTimer

2022-11-16 14:11:27 +00:00

mongod.yaml

SERVER-71363 Use Nanoseconds rather than Microseconds in QueryExecTime and support Nanoseconds in ScopedTimer

2022-11-16 14:11:27 +00:00

parameters_extractor.py

SERVER-70537 Implement end to end cost model benchmark.

2022-11-03 17:04:23 +00:00

physical_tree.py

SERVER-70537 Implement end to end cost model benchmark.

2022-11-03 17:04:23 +00:00

random_generator.py

SERVER-70500 Calibrate ABT nodes on smaller queries

2022-11-08 21:09:45 +00:00

README.md

SERVER-71220 Apply additional filters when calibrating Cost Model

2022-11-11 12:20:20 +00:00

requirements.txt

SERVER-69030 Add python requirements.txt for Cost Model

2022-11-10 11:18:11 +00:00

start.py

SERVER-71576 Cost Model End2End benchmark: calculate r2

2022-12-06 12:47:40 +00:00

workload_execution.py

SERVER-71220 Apply additional filters when calibrating Cost Model

2022-11-11 12:20:20 +00:00

README.md

Cost Model Calibrator

Python virtual environment

The following assumes you are using the Python interpreter from the MongoDB toolchain:

/opt/mongodbtoolchain/v4/bin/python3

Getting started

(mongo-python3) deactivate  # only if you have another python env activated
sh> /opt/mongodbtoolchain/v4/bin/python3 -m venv cm  # create new env
sh> source cm/bin/activate  # activate new env
(cm) python -m pip install -r requirements.txt  # install required packages
(cm) python start.py  # run the calibrator
(cm) deactivate  # back to bash
sh>
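
Before running the calibrator it can be useful to confirm that the interpreter in use really is the one from the new environment. The following is a small sketch using only the standard library; the helper names are made up for this example and are not part of the calibrator.

```python
import sys


def in_virtual_env():
    """Return True when the running interpreter belongs to a venv.

    Inside an environment created with `python -m venv`, sys.prefix points at
    the environment while sys.base_prefix points at the base installation.
    """
    return sys.prefix != sys.base_prefix


def describe_interpreter():
    """Return a one-line summary of which interpreter is running."""
    kind = "virtual environment" if in_virtual_env() else "system interpreter"
    return f"{kind}: {sys.executable}"
```

Running `print(describe_interpreter())` with the `cm` environment active should report a virtual environment whose path is inside the `cm` directory.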

Install new packages

(cm) python -m pip install <package_name>     # install <package_name>
(cm) python -m pip freeze > requirements.txt  # do not forget to update requirements.txt