Commit Graph

24 Commits

Author SHA1 Message Date
Timour Katchaounov
3ade0b9fe3 SERVER-72236 Generate random integer data for CE
Generate random integer data. The approach is as follows:
- There is one collection per cardinality, and all collections contain the same fields.
- Each field holds data drawn from a particular distribution. The data can be of a single type or of mixed types, and can follow a single mathematical distribution (e.g. normal) or a mixture of distributions.
- The committed configuration file and the corresponding data file are reduced to two small collections, so that Evergreen tests run fast and the git repository stays small. For actual experiments, add more data sizes and regenerate the data locally.
- All data is saved in a single JavaScript file, jstests/query_golden/libs/data/ce_accuracy_test.data, with a corresponding schema file, jstests/query_golden/libs/data/ce_accuracy_test.schema.
- The data file can be loaded directly inside a JS test. Loading it creates a global variable dataSet. This is the only way to load external JSON data that does not require installing external tools in Evergreen.
2023-01-10 12:51:54 +00:00
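The approach described in the SERVER-72236 message above can be illustrated with a minimal Python sketch. It is not the code from ce_generate_data.py. The collection names, field names, distribution parameters, and the exact form of the dataSet assignment are all assumptions made for illustration.

```python
import json
import random

# Hypothetical field generators: each field draws integers from one distribution.
# The names and parameters are illustrative assumptions, not the committed settings.
FIELD_GENERATORS = {
    "uniform_int": lambda rng: rng.randint(0, 999),
    "normal_int": lambda rng: int(round(rng.gauss(500, 100))),
    # A mixed distribution: half the values are uniform, half are normal.
    "mixed_int": lambda rng: (
        rng.randint(0, 999) if rng.random() < 0.5 else int(round(rng.gauss(500, 100)))
    ),
}


def generate_collection(cardinality, field_generators, rng):
    """Build `cardinality` documents that all share the same set of fields."""
    return [
        {name: gen(rng) for name, gen in field_generators.items()}
        for _ in range(cardinality)
    ]


def generate_data_set(cardinalities, field_generators, seed=0):
    """Build one collection per cardinality, keyed by a hypothetical name."""
    rng = random.Random(seed)  # fixed seed so the output is reproducible
    return {
        f"ce_data_{n}": generate_collection(n, field_generators, rng)
        for n in cardinalities
    }


def to_js_source(data_set):
    """Render the data as JavaScript source that assigns a global `dataSet`.

    Assumption: a bare assignment is used here; the committed file may
    declare the variable differently.
    """
    return "dataSet = " + json.dumps(data_set) + ";\n"


if __name__ == "__main__":
    # Two small collections, mirroring the reduced committed configuration.
    small = generate_data_set([10, 100], FIELD_GENERATORS, seed=42)
    print(to_js_source(small)[:80])
```

For larger experiments, the list of cardinalities passed to generate_data_set would be extended and the output regenerated locally, as the commit message suggests.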
Timour Katchaounov
38c5aab661 SERVER-72036 Implement data generation and loading into JS CE accuracy tests
* Extend the data generation Python framework for cost calibration to support data generation for CE testing as follows:
  - the entry point is ce_generate_data.py,
  - the configuration of the generated data is in ce_generate_data_settings.py,
  - all collection data is exported into a single JSON file stored in 'jstests/query_golden/libs/data', and a schema file stored in the same directory.
* Implement a JS data loader function that also creates all indexes specified in the schema file.
* Add a small JS test that shows how to load the generated JSON files into collections.
2022-12-17 10:06:36 +00:00
Alexander Ignatyev
41f666bd81 SERVER-71576 Cost Model End2End benchmark: calculate r2 2022-12-06 12:47:40 +00:00
Hana Pearlman
39640e543c SERVER-67161: Add physical NestedLoopJoin and make BinaryJoin logical 2022-12-02 18:22:49 +00:00
Alexander Ignatyev
7db13d1475 SERVER-71576 Implement Cost Model End2End testing 2022-11-24 11:18:42 +00:00
Ruoxin Xu
bfe342d122 SERVER-71363 Use Nanoseconds rather than Microseconds in QueryExecTime and support Nanoseconds in ScopedTimer 2022-11-16 14:11:27 +00:00
Alexander Ignatyev
ea088bcf55 SERVER-71220 Apply additional filters when calibrating Cost Model 2022-11-11 12:20:20 +00:00
Alexander Ignatyev
0b52d4cbc8 SERVER-69030 Add python requirements.txt for Cost Model 2022-11-10 11:18:11 +00:00
Ruoxin Xu
32545f21f5 SERVER-70500 Calibrate ABT nodes on smaller queries 2022-11-08 21:09:45 +00:00
Alexander Ignatyev
4acc5a3704 SERVER-70537 Implement end to end cost model benchmark. 2022-11-03 17:04:23 +00:00
Alexander Ignatyev
eb3badb0ae SERVER-70491 Calibrate physical ABT nodes in the framework 2022-10-13 17:36:29 +00:00
Alexander Ignatyev
3b8505623b SERVER-68984 Implement experiments in Cost Model Framework 2022-09-01 15:50:46 +00:00
Alexander Ignatyev
5f11658224 SERVER-69031 Move JSON configuration to python file 2022-09-01 09:50:08 +00:00
Ruoxin Xu
ce2dbe9a4a SERVER-67223 Optimize writing to MongoDB data generator 2022-08-30 14:26:01 +00:00
Alexander Ignatyev
8c0bbf5b25 SERVER-69089: Add ability to define indexes in config 2022-08-30 09:32:51 +00:00
Alexander Ignatyev
2b69d029a2 SERVER-68983 Do not run ABT calibration workflows with hidden indexes 2022-08-23 16:45:21 +00:00
Ruoxin Xu
0c53d0c90b SERVER-68386 Implement random generator of documents 2022-08-21 15:51:59 +00:00
Ruoxin Xu
7e08691fc9 SERVER-68385 Implement random generator of arrays 2022-08-15 16:52:35 +00:00
Nicholas Zolnierz
cdd2775640 SERVER-62042 Consolidate query optimization and execution control into a single knob 2022-08-04 15:39:07 +00:00
auto-revert-processor
757f6f2293 Revert "SERVER-62042 Consolidate query optimization and execution control into a single knob"
This reverts commit c9bbd1cfae.
2022-08-04 06:09:52 +00:00
Nicholas Zolnierz
c9bbd1cfae SERVER-62042 Consolidate query optimization and execution control into a single knob 2022-08-03 17:22:30 +00:00
Alexander Ignatyev
c545bd81b0 SERVER-68349 Support various data distributions in random data generator 2022-07-28 15:26:31 +00:00
Alexander Ignatyev
30420ef9af SERVER-67851 Support additional input parameters in Cost Model 2022-07-26 19:46:18 +00:00
Alexander Ignatyev
304a34e0c9 SERVER-67121 Cost Model and Calibration framework 2022-07-05 09:31:30 +00:00