Compare commits

...

18 Commits

Author SHA1 Message Date
sueloverso
8b7110bfac WT-2706 Fix lost log writes when switching files (#2803)
Fix a path that incorrectly returned success from log_write without writing a log record.

(cherry picked from commit 36d657ccc6)
2016-07-07 08:49:56 +10:00
Michael Cahill
f4954f6592 WT-2729 Focus eviction on the largest trees in cache (mongodb-3.2). (#2833)
Randomize visits to trees that use a tiny fraction of the cache.

Eviction optimizations.

Now that we are queuing more entries (potentially), make sure enough of
them become candidates.  Previously, a skewed distribution of read
generations could mean that only 10% of queue entries were considered.

Improve the efficiency of sorting the queue by calculating the score
once when pages are added to the queue.

Take care to bound the maximum eviction slot.
2016-06-28 14:58:58 +10:00
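The sorting change in this commit can be sketched as follows. This is a minimal illustration, not the WiredTiger code: `struct entry`, `compute_score`, `entry_init` and `sort_queue` are hypothetical names, and the scoring rule is a placeholder. The point is only that the score is calculated once when an entry is queued, so the `qsort` comparator reads a cached field instead of recomputing it on every comparison.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical queue entry; a NULL page marks an empty slot. */
struct entry {
    const void *page;
    uint64_t score;     /* Cached when the entry is queued. */
};

/* Placeholder scoring rule (an assumption): internal pages score higher. */
static uint64_t
compute_score(uint64_t read_gen, int is_internal)
{
    return (is_internal ? read_gen + 1000 : read_gen);
}

/* Queue a page, calculating its score exactly once. */
static void
entry_init(struct entry *e, const void *page, uint64_t read_gen, int is_internal)
{
    assert(e != NULL && page != NULL);
    e->page = page;
    e->score = compute_score(read_gen, is_internal);
}

/* qsort comparator: lowest score first, empty slots last. */
static int
entry_cmp(const void *a_arg, const void *b_arg)
{
    const struct entry *a = a_arg, *b = b_arg;
    uint64_t a_score, b_score;

    a_score = (a->page == NULL ? UINT64_MAX : a->score);
    b_score = (b->page == NULL ? UINT64_MAX : b->score);
    return ((a_score < b_score) ? -1 : (a_score == b_score) ? 0 : 1);
}

static void
sort_queue(struct entry *queue, size_t n)
{
    qsort(queue, n, sizeof(queue[0]), entry_cmp);
}
```

Treating empty slots as `UINT64_MAX` in the comparator keeps them at the tail of the queue without giving them a stored score.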
Keith Bostic
dd2a33849a WT-2708 split child-update race with reconciliation/eviction (#2835)
(cherry picked from commit 521270d54c)

When splitting the root page and updating the child's WT_REF.addr, reconciliation/eviction can race with us, updating WT_REF.addr after our read and before our update. The update is necessary because the child's
address points into the page being split: if the address changes, then it can no longer point into the page being split and the update is no longer necessary.
2016-06-28 14:17:30 +10:00
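The read, verify, then conditionally update pattern described in this commit can be sketched with C11 atomics. This is an illustration under stated assumptions: `struct ref` and `install_if_unchanged` are hypothetical, and `<stdatomic.h>` stands in for WiredTiger's own `__wt_atomic_cas_ptr` helper.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a WT_REF: only the address pointer matters. */
struct ref {
    _Atomic(void *) addr;
};

/*
 * Swap in `replacement` only if the address is still `seen`, the value read
 * before the replacement was built. On a lost race, discard the replacement
 * and leave the other thread's update in place.
 */
static bool
install_if_unchanged(struct ref *ref, void *seen, void *replacement)
{
    assert(ref != NULL);
    if (atomic_compare_exchange_strong(&ref->addr, &seen, replacement))
        return (true);
    free(replacement);
    return (false);
}
```

Losing the race is not an error here: if another thread changed the address, it no longer points into the page being split, so there is nothing left to fix.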
Michael Cahill
a63e21b838 SERVER-24580 Add more eviction stats to track efficiency. (#2830)
(cherry picked from commit 1f4aaa4490)
2016-06-23 17:36:28 +10:00
sueloverso
30e49acc90 WT-2696 Wait if we find an unbuffered flag without the size set yet. (#2794)
* Modify recovery test to use multiple threads to reproduce this issue.

(cherry picked from commit 0d4c83daf7)
2016-06-23 17:30:02 +10:00
Keith Bostic
063dbdfe45 WT-2672 handle system calls that don't set errno (#2765)
Define system call success as a 0 return, and split error handling into two parts: if the call returns -1, use errno, otherwise expect the failing return to be an error value.

Replace calls to remove with unlink, so we know errno will be set.  Do the best we can with rename, there's no easy workaround.

POSIX requires posix_madvise return an errno value, but some OS versions return a -1/errno pair instead (at least FreeBSD and OS X). I don't care about retrying posix_madvise calls on failure, but since WT_SYSCALL_RETRY includes the necessary error handling magic, wrap the posix_madvise calls in WT_SYSCALL_RETRY.

(cherry picked from commit ced588aecd)
2016-06-23 17:16:06 +10:00
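The two-part error handling described in this commit can be sketched as a small helper. The names `syscall_result` and `SKETCH_GENERIC_ERROR` are hypothetical, and the generic error value is an arbitrary placeholder rather than WiredTiger's `WT_ERROR`.

```c
#include <assert.h>
#include <errno.h>

/* Placeholder for a generic failure code (an assumption for this sketch). */
#define SKETCH_GENERIC_ERROR (-1000)

/*
 * Normalize a system call's return value:
 *   0      success
 *   -1     failure reported through errno; if errno was not set, fall
 *          back to a generic error so failure is never reported as 0
 *   other  the return value is itself the error (the posix_madvise style)
 */
static int
syscall_result(int call_ret)
{
    if (call_ret == 0)
        return (0);
    if (call_ret == -1)
        return (errno != 0 ? errno : SKETCH_GENERIC_ERROR);
    return (call_ret);
}
```

This is why a call such as `posix_madvise` can share one wrapper with calls like `unlink`, whichever of the two conventions the platform follows.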
Michael Cahill
9ee39b8aea SERVER-24580 Fix backport.
2016-06-23 16:29:47 +10:00
Michael Cahill
d68800d0e9 SERVER-24580 Update oldest txn ID with "strict, nowait" flags. (#2829)
Add more options for callers when updating the oldest ID to control how much they care about the ID being updated.
(cherry picked from commit 116e41e5e1)
2016-06-23 16:03:26 +10:00
Michael Cahill
ded2149b2c SERVER-24580 Enhance eviction when application threads are contributing (#2806)
When the cache hits eviction triggers, all application threads can
hammer the eviction queue lock, starving each other and server threads.

Also, we noticed with the same workload that the eviction server doesn't need
to force updates to the oldest ID (which can starve the eviction server
thread if there are hundreds of application threads getting snapshots).
It is sufficient to update it lazily.

* Clear the eviction walk if we don't find any candidates.

Otherwise, we are keeping a page pinned in what might be an idle file,
and tying up a hazard pointer that could prevent eviction from an active
file (since the eviction server tracks how many hazard pointers it is
using to avoid going over the limit).

(cherry picked from commit 7f9d7aecea)
2016-06-23 16:03:04 +10:00
Michael Cahill
0f36f40c37 WT-2702 Block operations when the cache is 100% full. (#2798)
(cherry picked from commit ac14731a59)
2016-06-23 16:02:46 +10:00
David Hows
a6a64e986b WT-2646 Add checkpoint_wait configuration option to drop (#2768)
* Default checkpoint_wait is true. This change is useful because it means concurrent create/drop calls don't generate EBUSY returns.
* Mark lock_wait and checkpoint_wait as undoc

(cherry picked from commit 4b48ad6fb7)
2016-06-02 16:12:21 +10:00
David Hows
234b68b116 WT-2613 Add WT_UNUSED to a variable to fix Windows compilation. (#2717)
(cherry picked from commit 7deb9c213b)
2016-06-01 12:47:25 +10:00
Michael Cahill
5d215904c3 SERVER-24306 Fix stall in log_flush switching to new files. (#2761)
* Pass boolean false rather than 0.

(cherry picked from commit b89aaece7b)
2016-06-01 12:44:46 +10:00
Michael Cahill
18879587af WT-2629 Make the stack non-executable with GCC only. (#2742)
(cherry picked from commit f6f86961a4)
2016-06-01 12:41:58 +10:00
Michael Cahill
6bfcb1ca5b WT-2629 Don't make stacks executable in assembly source. (#2739)
(cherry picked from commit 0f7ae730d9)
2016-06-01 12:41:48 +10:00
Alex Gorrod
71c0588a77 Merge pull request #2677 from wiredtiger/wt-2560-spin
WT-2560 Spin on transaction locks.
(cherry picked from commit f498d8c1c1)
2016-06-01 10:48:12 +10:00
Michael Cahill
58765850aa Merge pull request #2660 from wiredtiger/wt-2560
WT-2560 Use a rwlock to protect transaction state, don't spin.
Conflicted on a whitespace cleanup.

(cherry picked from commit 76e286c7ba)
2016-06-01 10:46:47 +10:00
Keith Bostic
30d327f810 Merge pull request #2664 from wiredtiger/wt-2559
WT-2559 Open a local log file handle for sync.
(cherry picked from commit 6b3553003f)
2016-06-01 10:40:38 +10:00
45 changed files with 884 additions and 531 deletions

dist/api_data.py

@@ -787,15 +787,19 @@ methods = {
]),
'WT_SESSION.drop' : Method([
+ Config('checkpoint_wait', 'true', r'''
+ wait for the checkpoint lock, if \c checkpoint_wait=false, fail if
+ this lock is not available immediately''',
+ type='boolean', undoc=True),
Config('force', 'false', r'''
return success if the object does not exist''',
type='boolean'),
- Config('remove_files', 'true', r'''
- should the underlying files be removed?''',
- type='boolean'),
Config('lock_wait', 'true', r'''
wait for locks, if \c lock_wait=false, fail if any required locks are
not available immediately''',
type='boolean', undoc=True),
+ Config('remove_files', 'true', r'''
+ should the underlying files be removed?''',
+ type='boolean'),
]),

dist/flags.py

@@ -57,6 +57,10 @@ flags = {
'TXN_LOG_CKPT_STOP',
'TXN_LOG_CKPT_SYNC',
],
+ 'txn_update_oldest' : [
+ 'TXN_OLDEST_STRICT',
+ 'TXN_OLDEST_WAIT',
+ ],
'verbose' : [
'VERB_API',
'VERB_BLOCK',

dist/s_string.ok

@@ -439,6 +439,7 @@ bzDecompressInit
bzalloc
bzfree
bzip
+ call's
calloc
cas
catfmt
@@ -1067,6 +1068,7 @@ unescaped
unicode
uninstantiated
unistd
+ unlink
unlinked
unmap
unmarshall

dist/stat_data.py

@@ -162,6 +162,7 @@ connection_stats = [
CacheStat('cache_bytes_write', 'bytes written from cache', 'size'),
CacheStat('cache_eviction_aggressive_set', 'eviction currently operating in aggressive mode', 'no_clear,no_scale'),
CacheStat('cache_eviction_app', 'pages evicted by application threads'),
+ CacheStat('cache_eviction_app_dirty', 'modified pages evicted by application threads'),
CacheStat('cache_eviction_checkpoint', 'checkpoint blocked page eviction'),
CacheStat('cache_eviction_clean', 'unmodified pages evicted'),
CacheStat('cache_eviction_deepen', 'page split during eviction deepened the tree'),
@@ -173,6 +174,9 @@ connection_stats = [
CacheStat('cache_eviction_hazard', 'hazard pointer blocked page eviction'),
CacheStat('cache_eviction_internal', 'internal pages evicted'),
CacheStat('cache_eviction_maximum_page_size', 'maximum page size at eviction', 'no_clear,no_scale,size'),
+ CacheStat('cache_eviction_pages_queued', 'pages queued for eviction'),
+ CacheStat('cache_eviction_pages_queued_oldest', 'pages queued for urgent eviction'),
+ CacheStat('cache_eviction_pages_seen', 'pages seen by eviction walk'),
CacheStat('cache_eviction_queue_empty', 'eviction server candidate queue empty when topping up'),
CacheStat('cache_eviction_queue_not_empty', 'eviction server candidate queue not empty when topping up'),
CacheStat('cache_eviction_server_evicting', 'eviction server evicting pages'),
@@ -181,6 +185,8 @@ connection_stats = [
CacheStat('cache_eviction_split_internal', 'internal pages split during eviction'),
CacheStat('cache_eviction_split_leaf', 'leaf pages split during eviction'),
CacheStat('cache_eviction_walk', 'pages walked for eviction'),
+ CacheStat('cache_eviction_walks_active', 'files with active eviction walks', 'no_clear,no_scale,size'),
+ CacheStat('cache_eviction_walks_started', 'files with new eviction walks started'),
CacheStat('cache_eviction_worker_evicting', 'eviction worker thread evicting pages'),
CacheStat('cache_inmem_split', 'in-memory page splits'),
CacheStat('cache_inmem_splittable', 'in-memory page passed criteria to be split'),
@@ -408,6 +414,7 @@ dsrc_stats = [
##########################################
# Cache and eviction statistics
##########################################
+ CacheStat('cache_bytes_inuse', 'bytes currently in the cache', 'no_clear,no_scale,size'),
CacheStat('cache_bytes_read', 'bytes read into cache', 'size'),
CacheStat('cache_bytes_write', 'bytes written from cache', 'size'),
CacheStat('cache_eviction_checkpoint', 'checkpoint blocked page eviction'),

View File

@@ -325,7 +325,7 @@ __wt_btcur_search(WT_CURSOR_BTREE *cbt)
valid = false;
if (F_ISSET(cbt, WT_CBT_ACTIVE) &&
cbt->ref->page->read_gen != WT_READGEN_OLDEST) {
- __wt_txn_cursor_op(session);
+ WT_ERR(__wt_txn_cursor_op(session));
WT_ERR(btree->type == BTREE_ROW ?
__cursor_row_search(session, cbt, cbt->ref, false) :
@@ -405,7 +405,7 @@ __wt_btcur_search_near(WT_CURSOR_BTREE *cbt, int *exactp)
if (btree->type == BTREE_ROW &&
F_ISSET(cbt, WT_CBT_ACTIVE) &&
cbt->ref->page->read_gen != WT_READGEN_OLDEST) {
- __wt_txn_cursor_op(session);
+ WT_ERR(__wt_txn_cursor_op(session));
WT_ERR(__cursor_row_search(session, cbt, cbt->ref, true));

View File

@@ -326,7 +326,7 @@ __evict_force_check(WT_SESSION_IMPL *session, WT_REF *ref)
__wt_page_evict_soon(page);
/* Bump the oldest ID, we're about to do some visibility checks. */
- __wt_txn_update_oldest(session, false);
+ WT_RET(__wt_txn_update_oldest(session, 0));
/* If eviction cannot succeed, don't try. */
return (__wt_page_can_evict(session, ref, NULL));

View File

@@ -298,7 +298,7 @@ static int
__split_ref_move(WT_SESSION_IMPL *session, WT_PAGE *from_home,
WT_REF **from_refp, size_t *decrp, WT_REF **to_refp, size_t *incrp)
{
- WT_ADDR *addr;
+ WT_ADDR *addr, *ref_addr;
WT_CELL_UNPACK unpack;
WT_DECL_RET;
WT_IKEY *ikey;
@@ -345,13 +345,18 @@ __split_ref_move(WT_SESSION_IMPL *session, WT_PAGE *from_home,
}
/*
- * If there's no address (the page has never been written), or the
- * address has been instantiated, there's no work to do. Otherwise,
- * instantiate the address in-memory, from the on-page cell.
+ * If there's no address at all (the page has never been written), or
+ * the address has already been instantiated, there's no work to do.
+ * Otherwise, the address still references a split page on-page cell,
+ * instantiate it. We can race with reconciliation and/or eviction of
+ * the child pages, be cautious: read the address and verify it, and
+ * only update it if the value is unchanged from the original. In the
+ * case of a race, the address must no longer reference the split page,
+ * we're done.
*/
- addr = ref->addr;
- if (addr != NULL && !__wt_off_page(from_home, addr)) {
- __wt_cell_unpack((WT_CELL *)ref->addr, &unpack);
+ WT_ORDERED_READ(ref_addr, ref->addr);
+ if (ref_addr != NULL && !__wt_off_page(from_home, ref_addr)) {
+ __wt_cell_unpack((WT_CELL *)ref_addr, &unpack);
WT_RET(__wt_calloc_one(session, &addr));
if ((ret = __wt_strndup(
session, unpack.data, unpack.size, &addr->addr)) != 0) {
@@ -371,7 +376,10 @@ __split_ref_move(WT_SESSION_IMPL *session, WT_PAGE *from_home,
break;
WT_ILLEGAL_VALUE(session);
}
- ref->addr = addr;
+ if (!__wt_atomic_cas_ptr(&ref->addr, ref_addr, addr)) {
+ __wt_free(session, addr->addr);
+ __wt_free(session, addr);
+ }
}
/* And finally, copy the WT_REF pointer itself. */

View File

@@ -41,6 +41,9 @@ __wt_btree_stat_init(WT_SESSION_IMPL *session, WT_CURSOR_STAT *cst)
WT_STAT_SET(session, stats, btree_maxleafpage, btree->maxleafpage);
WT_STAT_SET(session, stats, btree_maxleafvalue, btree->maxleafvalue);
+ WT_STAT_SET(session, stats, cache_bytes_inuse,
+ __wt_btree_bytes_inuse(session));
/* Everything else is really, really expensive. */
if (!F_ISSET(cst, WT_CONN_STAT_ALL))
return (0);

View File

@@ -26,12 +26,14 @@ __sync_file(WT_SESSION_IMPL *session, WT_CACHE_OP syncop)
uint64_t internal_bytes, internal_pages, leaf_bytes, leaf_pages;
uint64_t oldest_id, saved_snap_min;
uint32_t flags;
+ u_int saved_evict_walk_period;
conn = S2C(session);
btree = S2BT(session);
walk = NULL;
txn = &session->txn;
saved_snap_min = WT_SESSION_TXN_STATE(session)->snap_min;
+ saved_evict_walk_period = btree->evict_walk_period;
flags = WT_READ_CACHE | WT_READ_NO_GEN;
internal_bytes = leaf_bytes = 0;
@@ -81,7 +83,7 @@ __sync_file(WT_SESSION_IMPL *session, WT_CACHE_OP syncop)
if (__wt_page_is_modified(page) &&
WT_TXNID_LT(page->modify->update_txn, oldest_id)) {
if (txn->isolation == WT_ISO_READ_COMMITTED)
- __wt_txn_get_snapshot(session);
+ WT_ERR(__wt_txn_get_snapshot(session));
leaf_bytes += page->memory_footprint;
++leaf_pages;
WT_ERR(__wt_reconcile(session, walk, NULL, 0));
@@ -100,7 +102,7 @@ __sync_file(WT_SESSION_IMPL *session, WT_CACHE_OP syncop)
* the metadata shouldn't be that big, and (b) if we do ever
*/
if (txn->isolation == WT_ISO_READ_COMMITTED)
- __wt_txn_get_snapshot(session);
+ WT_ERR(__wt_txn_get_snapshot(session));
/*
* We cannot check the tree modified flag in the case of a
@@ -236,10 +238,10 @@ err: /* On error, clear any left-over tree walk. */
WT_FULL_BARRIER();
/*
- * If this tree was being skipped by the eviction server during
- * the checkpoint, clear the wait.
+ * In case this tree was being skipped by the eviction server
+ * during the checkpoint, restore the previous state.
*/
- btree->evict_walk_period = 0;
+ btree->evict_walk_period = saved_evict_walk_period;
/*
* Wake the eviction server, in case application threads have

View File

@@ -291,6 +291,7 @@ static const WT_CONFIG_CHECK confchk_WT_SESSION_create[] = {
};
static const WT_CONFIG_CHECK confchk_WT_SESSION_drop[] = {
+ { "checkpoint_wait", "boolean", NULL, NULL, NULL, 0 },
{ "force", "boolean", NULL, NULL, NULL, 0 },
{ "lock_wait", "boolean", NULL, NULL, NULL, 0 },
{ "remove_files", "boolean", NULL, NULL, NULL, 0 },
@@ -1026,8 +1027,8 @@ static const WT_CONFIG_ENTRY config_entries[] = {
confchk_WT_SESSION_create, 40
},
{ "WT_SESSION.drop",
- "force=0,lock_wait=,remove_files=",
- confchk_WT_SESSION_drop, 3
+ "checkpoint_wait=,force=0,lock_wait=,remove_files=",
+ confchk_WT_SESSION_drop, 4
},
{ "WT_SESSION.join",
"bloom_bit_count=16,bloom_hash_count=8,compare=\"eq\",count=,"

View File

@@ -217,6 +217,14 @@ __wt_cache_stats_update(WT_SESSION_IMPL *session)
WT_STAT_SET(
session, stats, cache_bytes_overflow, cache->bytes_overflow);
WT_STAT_SET(session, stats, cache_bytes_leaf, leaf);
+ /*
+ * The number of files with active walks ~= number of hazard pointers
+ * in the walk session. Note: reading without locking.
+ */
+ if (conn->evict_session != NULL)
+ WT_STAT_SET(session, stats, cache_eviction_walks_active,
+ conn->evict_session->nhazard);
}
/*

View File

@@ -545,8 +545,6 @@ restart:
while (i < WT_SLOT_POOL) {
save_i = i;
slot = &log->slot_pool[i++];
- WT_ASSERT(session, slot->slot_state != 0 ||
- slot->slot_release_lsn.l.file >= log->write_lsn.l.file);
if (slot->slot_state != WT_LOG_SLOT_WRITTEN)
continue;
written[written_i].slot_index = save_i;

View File

@@ -93,7 +93,8 @@ __wt_connection_close(WT_CONNECTION_IMPL *conn)
* transaction ID will catch up with the current ID.
*/
for (;;) {
- __wt_txn_update_oldest(session, true);
+ WT_TRET(__wt_txn_update_oldest(session,
+ WT_TXN_OLDEST_STRICT | WT_TXN_OLDEST_WAIT));
if (txn_global->oldest_id == txn_global->current)
break;
__wt_yield();

View File

@@ -16,7 +16,7 @@ static int
__curds_txn_enter(WT_SESSION_IMPL *session)
{
session->ncursors++; /* XXX */
- __wt_txn_cursor_op(session);
+ WT_RET(__wt_txn_cursor_op(session));
return (0);
}

View File

@@ -26,7 +26,8 @@ __wt_evict_file(WT_SESSION_IMPL *session, WT_CACHE_OP syncop)
WT_RET(__wt_evict_file_exclusive_on(session));
/* Make sure the oldest transaction ID is up-to-date. */
- __wt_txn_update_oldest(session, true);
+ WT_RET(__wt_txn_update_oldest(
+ session, WT_TXN_OLDEST_STRICT | WT_TXN_OLDEST_WAIT));
/* Walk the tree, discarding pages. */
next_ref = NULL;

View File

@@ -16,7 +16,7 @@ static int __evict_lru_walk(WT_SESSION_IMPL *);
static int __evict_page(WT_SESSION_IMPL *, bool);
static int __evict_pass(WT_SESSION_IMPL *);
static int __evict_walk(WT_SESSION_IMPL *);
- static int __evict_walk_file(WT_SESSION_IMPL *, u_int *);
+ static int __evict_walk_file(WT_SESSION_IMPL *, u_int, u_int *);
static WT_THREAD_RET __evict_worker(void *);
static int __evict_server_work(WT_SESSION_IMPL *);
@@ -32,11 +32,6 @@ __evict_read_gen(const WT_EVICT_ENTRY *entry)
uint64_t read_gen;
btree = entry->btree;
- /* Never prioritize empty slots. */
- if (entry->ref == NULL)
- return (UINT64_MAX);
page = entry->ref->page;
/* Any page set to the oldest generation should be discarded. */
@@ -71,14 +66,15 @@ __evict_read_gen(const WT_EVICT_ENTRY *entry)
* Qsort function: sort the eviction array.
*/
static int WT_CDECL
- __evict_lru_cmp(const void *a, const void *b)
+ __evict_lru_cmp(const void *a_arg, const void *b_arg)
{
- uint64_t a_lru, b_lru;
+ const WT_EVICT_ENTRY *a = a_arg, *b = b_arg;
+ uint64_t a_score, b_score;
- a_lru = __evict_read_gen(a);
- b_lru = __evict_read_gen(b);
+ a_score = (a->ref == NULL ? UINT64_MAX : a->score);
+ b_score = (b->ref == NULL ? UINT64_MAX : b->score);
- return ((a_lru < b_lru) ? -1 : (a_lru == b_lru) ? 0 : 1);
+ return ((a_score < b_score) ? -1 : (a_score == b_score) ? 0 : 1);
}
/*
@@ -592,9 +588,10 @@ __evict_pass(WT_SESSION_IMPL *session)
*
* Do this every time the eviction server wakes up, regardless
* of whether the cache is full, to prevent the oldest ID
- * falling too far behind.
+ * falling too far behind. Don't wait to lock the table: with
+ * highly threaded workloads, that creates a bottleneck.
*/
- __wt_txn_update_oldest(session, true);
+ WT_RET(__wt_txn_update_oldest(session, WT_TXN_OLDEST_STRICT));
if (!__evict_update_work(session))
break;
@@ -900,7 +897,7 @@ __evict_lru_walk(WT_SESSION_IMPL *session)
{
WT_CACHE *cache;
WT_DECL_RET;
- uint64_t cutoff, read_gen_oldest;
+ uint64_t read_gen_oldest;
uint32_t candidates, entries;
cache = S2C(session)->cache;
@@ -958,7 +955,7 @@ __evict_lru_walk(WT_SESSION_IMPL *session)
read_gen_oldest = WT_READGEN_OLDEST;
for (candidates = 0; candidates < entries; ++candidates) {
read_gen_oldest =
- __evict_read_gen(&cache->evict_queue[candidates]);
+ cache->evict_queue[candidates].score;
if (read_gen_oldest != WT_READGEN_OLDEST)
break;
}
@@ -967,35 +964,29 @@ __evict_lru_walk(WT_SESSION_IMPL *session)
* Take all candidates if we only gathered pages with an oldest
* read generation set.
*
- * We normally never take more than 50% of the entries; if 50%
- * of the entries were at the oldest read generation, take them.
+ * We normally never take more than 50% of the entries but if
+ * 50% of the entries were at the oldest read generation, take
+ * all of them.
*/
if (read_gen_oldest == WT_READGEN_OLDEST)
cache->evict_candidates = entries;
else if (candidates >= entries / 2)
cache->evict_candidates = candidates;
else {
- /* Save the calculated oldest generation. */
- cache->read_gen_oldest = read_gen_oldest;
- /* Find the bottom 25% of read generations. */
- cutoff =
- (3 * read_gen_oldest + __evict_read_gen(
- &cache->evict_queue[entries - 1])) / 4;
/*
- * Don't take less than 10% or more than 50% of entries,
- * regardless. That said, if there is only one entry,
- * which is normal when populating an empty file, don't
- * exclude it.
+ * Take all of the urgent pages plus a third of
+ * ordinary candidates (which could be expressed as
+ * WT_EVICT_WALK_INCR / WT_EVICT_WALK_BASE). In the
+ * steady state, we want to get as many candidates as
+ * the eviction walk adds to the queue.
+ *
+ * That said, if there is only one entry, which is
+ * normal when populating an empty file, don't exclude
+ * it.
*/
- for (candidates = 1 + entries / 10;
- candidates < entries / 2;
- candidates++)
- if (__evict_read_gen(
- &cache->evict_queue[candidates]) > cutoff)
- break;
- cache->evict_candidates = candidates;
+ cache->evict_candidates =
+ 1 + candidates + ((entries - candidates) - 1) / 3;
+ cache->read_gen_oldest = read_gen_oldest;
}
}
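To make the new candidate arithmetic concrete, here is a small standalone sketch that mirrors the three branches in the hunk above. The function name `candidate_count` and its parameters are hypothetical; `urgent` stands for the number of leading queue entries found at the oldest read generation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Mirror of the branch structure shown in the hunk: take everything if
 * every entry is urgent, take just the urgent entries if they make up at
 * least half the queue, otherwise take the urgent entries plus roughly a
 * third of the remainder.
 */
static uint32_t
candidate_count(uint32_t entries, uint32_t urgent)
{
    assert(urgent <= entries);
    if (urgent == entries)
        return (entries);
    if (urgent >= entries / 2)
        return (urgent);
    return (1 + urgent + ((entries - urgent) - 1) / 3);
}
```

With a full queue of 300 entries and no urgent pages this yields 1 + 299 / 3 = 100 candidates, which matches the comment's ratio of WT_EVICT_WALK_INCR (100) to WT_EVICT_WALK_BASE (300).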
@@ -1071,7 +1062,7 @@ __evict_walk(WT_SESSION_IMPL *session)
* per walk.
*/
start_slot = slot = cache->evict_entries;
- max_entries = slot + WT_EVICT_WALK_INCR;
+ max_entries = WT_MIN(slot + WT_EVICT_WALK_INCR, cache->evict_slots);
retry: while (slot < max_entries && ret == 0) {
/*
@@ -1154,7 +1145,6 @@ retry: while (slot < max_entries && ret == 0) {
* useful in the past.
*/
if (btree->evict_walk_period != 0 &&
- cache->evict_entries >= WT_EVICT_WALK_INCR &&
btree->evict_walk_skips++ < btree->evict_walk_period)
continue;
btree->evict_walk_skips = 0;
@@ -1180,7 +1170,8 @@ retry: while (slot < max_entries && ret == 0) {
if (!F_ISSET(btree, WT_BTREE_NO_EVICTION)) {
cache->evict_file_next = dhandle;
WT_WITH_DHANDLE(session, dhandle,
- ret = __evict_walk_file(session, &slot));
+ ret = __evict_walk_file(
+ session, max_entries, &slot));
WT_ASSERT(session, session->split_gen == 0);
}
__wt_spin_unlock(session, &cache->evict_walk_lock);
@@ -1247,8 +1238,9 @@ __evict_init_candidate(
if (evict->ref != NULL)
__evict_list_clear(session, evict);
- evict->ref = ref;
evict->btree = S2BT(session);
+ evict->ref = ref;
+ evict->score = __evict_read_gen(evict);
/* Mark the page on the list; set last to flush the other updates. */
F_SET_ATOMIC(ref->page, WT_PAGE_EVICT_LRU);
@@ -1259,7 +1251,7 @@ __evict_init_candidate(
* Get a few page eviction candidates from a single underlying file.
*/
static int
- __evict_walk_file(WT_SESSION_IMPL *session, u_int *slotp)
+ __evict_walk_file(WT_SESSION_IMPL *session, u_int max_entries, u_int *slotp)
{
WT_BTREE *btree;
WT_CACHE *cache;
@@ -1269,8 +1261,9 @@ __evict_walk_file(WT_SESSION_IMPL *session, u_int *slotp)
WT_PAGE *page;
WT_PAGE_MODIFY *mod;
WT_REF *ref;
- uint64_t pages_walked;
- uint32_t walk_flags;
+ uint64_t btree_inuse, bytes_per_slot, cache_inuse;
+ uint64_t pages_seen, refs_walked;
+ uint32_t remaining_slots, target_pages, total_slots, walk_flags;
int internal_pages, restarts;
bool enough, modified;
@@ -1280,11 +1273,43 @@ __evict_walk_file(WT_SESSION_IMPL *session, u_int *slotp)
internal_pages = restarts = 0;
enough = false;
+ /*
+ * Figure out how many slots to fill from this tree.
+ * Note that some care is taken in the calculation to avoid overflow.
+ */
start = cache->evict_queue + *slotp;
- end = start + WT_EVICT_WALK_PER_FILE;
+ btree_inuse = __wt_btree_bytes_inuse(session);
+ cache_inuse = __wt_cache_bytes_inuse(cache);
+ remaining_slots = max_entries - *slotp;
+ total_slots = max_entries - cache->evict_entries;
+ target_pages = (uint32_t)(btree_inuse /
+ (cache_inuse / total_slots));
+ /*
+ * The target number of pages for this tree is proportional to the
+ * space it is taking up in cache. Round to the nearest number of
+ * slots so we assign all of the slots to a tree filling 99+% of the
+ * cache (and only have to walk it once).
+ */
+ bytes_per_slot = cache_inuse / total_slots;
+ target_pages = (uint32_t)(
+ (btree_inuse + bytes_per_slot / 2) / bytes_per_slot);
+ if (target_pages == 0) {
+ /*
+ * Randomly walk trees with a tiny fraction of the cache in
+ * case there are so many trees that none of them use enough of
+ * the cache to be allocated slots.
+ */
+ if (__wt_random(&session->rnd) / (double)UINT32_MAX >
+ btree_inuse / (double)cache_inuse)
+ return (0);
+ target_pages = 10;
+ }
if (F_ISSET(session->dhandle, WT_DHANDLE_DEAD) ||
- end > cache->evict_queue + cache->evict_slots)
- end = cache->evict_queue + cache->evict_slots;
+ target_pages > remaining_slots)
+ target_pages = remaining_slots;
+ end = start + target_pages;
walk_flags =
WT_READ_CACHE | WT_READ_NO_EVICT | WT_READ_NO_GEN | WT_READ_NO_WAIT;
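The slot allocation in the hunk above can be tried out in isolation. The sketch below is a simplified reading of it, not the WiredTiger function: `target_slots` and `walk_tiny_tree` are hypothetical names, the random draw is passed in as a value `r` in [0, 1] so the behavior is deterministic, and the division-by-zero guard is an addition for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Slots for one tree, proportional to its share of the cache and rounded
 * to the nearest whole slot.
 */
static uint32_t
target_slots(uint64_t btree_inuse, uint64_t cache_inuse, uint32_t total_slots)
{
    uint64_t bytes_per_slot;

    /* Guard against division by zero (an addition for this sketch). */
    if (total_slots == 0 || cache_inuse < total_slots)
        return (0);
    bytes_per_slot = cache_inuse / total_slots;
    return ((uint32_t)((btree_inuse + bytes_per_slot / 2) / bytes_per_slot));
}

/*
 * A tree that rounds down to zero slots is still walked occasionally,
 * with probability equal to its share of the cache.
 */
static bool
walk_tiny_tree(double r, uint64_t btree_inuse, uint64_t cache_inuse)
{
    assert(cache_inuse > 0);
    return (r <= (double)btree_inuse / (double)cache_inuse);
}
```

Adding half of `bytes_per_slot` before dividing is what rounds to nearest: with 100 slots over a 1000-byte cache, a tree holding 995 bytes gets all 100 slots instead of 99.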
@@ -1303,17 +1328,21 @@ __evict_walk_file(WT_SESSION_IMPL *session, u_int *slotp)
* Once we hit the page limit, do one more step through the walk in
* case we are appending and only the last page in the file is live.
*/
- for (evict = start, pages_walked = 0;
+ for (evict = start, pages_seen = refs_walked = 0;
evict < end && !enough && (ret == 0 || ret == WT_NOTFOUND);
ret = __wt_tree_walk_count(
- session, &btree->evict_ref, &pages_walked, walk_flags)) {
- enough = pages_walked > cache->evict_max_refs_per_file;
+ session, &btree->evict_ref, &refs_walked, walk_flags)) {
+ enough = refs_walked > cache->evict_max_refs_per_file;
if ((ref = btree->evict_ref) == NULL) {
if (++restarts == 2 || enough)
break;
+ WT_STAT_FAST_CONN_INCR(
+ session, cache_eviction_walks_started);
continue;
}
+ ++pages_seen;
/* Ignore root pages entirely. */
if (__wt_ref_is_root(ref))
continue;
@@ -1341,9 +1370,13 @@ __evict_walk_file(WT_SESSION_IMPL *session, u_int *slotp)
}
/* Pages we no longer need (clean or dirty), are found money. */
+ if (page->read_gen == WT_READGEN_OLDEST) {
+ WT_STAT_FAST_CONN_INCR(
+ session, cache_eviction_pages_queued_oldest);
+ goto fast;
+ }
if (__wt_page_is_empty(page) ||
- F_ISSET(session->dhandle, WT_DHANDLE_DEAD) ||
- page->read_gen == WT_READGEN_OLDEST)
+ F_ISSET(session->dhandle, WT_DHANDLE_DEAD))
goto fast;
/* Skip clean pages if appropriate. */
@@ -1409,24 +1442,31 @@ fast: /* If the page can't be evicted, give up. */
WT_RET_NOTFOUND_OK(ret);
*slotp += (u_int)(evict - start);
+ WT_STAT_FAST_CONN_INCRV(
+ session, cache_eviction_pages_queued, (u_int)(evict - start));
/*
* If we happen to end up on the root page, clear it. We have to track
* hazard pointers, and the root page complicates that calculation.
*
+ * Likewise if we found no new candidates during the walk: there is no
+ * point keeping a page pinned, since it may be the only candidate in an
+ * idle tree.
+ *
* If we land on a page requiring forced eviction, move on to the next
* page: we want this page evicted as quickly as possible.
*/
if ((ref = btree->evict_ref) != NULL) {
- if (__wt_ref_is_root(ref))
+ if (__wt_ref_is_root(ref) || evict == start)
WT_RET(__evict_clear_walk(session));
else if (ref->page->read_gen == WT_READGEN_OLDEST)
WT_RET_NOTFOUND_OK(__wt_tree_walk_count(
session, &btree->evict_ref,
- &pages_walked, walk_flags));
+ &refs_walked, walk_flags));
}
- WT_STAT_FAST_CONN_INCRV(session, cache_eviction_walk, pages_walked);
+ WT_STAT_FAST_CONN_INCRV(session, cache_eviction_walk, refs_walked);
+ WT_STAT_FAST_CONN_INCRV(session, cache_eviction_pages_seen, pages_seen);
return (0);
}
@@ -1459,6 +1499,8 @@ __evict_get_ref(
return (WT_NOTFOUND);
if (__wt_spin_trylock(session, &cache->evict_lock) == 0)
break;
+ if (!F_ISSET(session, WT_SESSION_INTERNAL))
+ return (WT_NOTFOUND);
__wt_yield();
}
@@ -1472,13 +1514,14 @@ __evict_get_ref(
candidates /= 2;
/* Get the next page queued for eviction. */
- while ((evict = cache->evict_current) != NULL &&
- evict < cache->evict_queue + candidates && evict->ref != NULL) {
+ for (evict = cache->evict_current;
+ evict >= cache->evict_queue &&
+ evict < cache->evict_queue + candidates;
+ ++evict) {
+ if (evict->ref == NULL)
+ continue;
WT_ASSERT(session, evict->btree != NULL);
- /* Move to the next item. */
- ++cache->evict_current;
/*
* Lock the page while holding the eviction mutex to prevent
* multiple attempts to evict it. For pages that are already
@@ -1508,8 +1551,11 @@ __evict_get_ref(
}
/* Clear the current pointer if there are no more candidates. */
- if (evict >= cache->evict_queue + cache->evict_candidates)
+ if (evict == NULL || evict + 1 >=
+ cache->evict_queue + cache->evict_candidates)
cache->evict_current = NULL;
+ else
+ cache->evict_current = evict + 1;
__wt_spin_unlock(session, &cache->evict_lock);
return ((*refp == NULL) ? WT_NOTFOUND : 0);
@@ -1533,15 +1579,18 @@ __evict_page(WT_SESSION_IMPL *session, bool is_server)
* An internal session flags either the server itself or an eviction
* worker thread.
*/
- if (F_ISSET(session, WT_SESSION_INTERNAL)) {
- if (is_server)
+ if (is_server)
+ WT_STAT_FAST_CONN_INCR(
+ session, cache_eviction_server_evicting);
+ else if (F_ISSET(session, WT_SESSION_INTERNAL))
+ WT_STAT_FAST_CONN_INCR(
+ session, cache_eviction_worker_evicting);
+ else {
+ if (__wt_page_is_modified(ref->page))
WT_STAT_FAST_CONN_INCR(
- session, cache_eviction_server_evicting);
- else
- WT_STAT_FAST_CONN_INCR(
- session, cache_eviction_worker_evicting);
- } else
+ session, cache_eviction_app_dirty);
WT_STAT_FAST_CONN_INCR(session, cache_eviction_app);
+ }
/*
* In case something goes wrong, don't pick the same set of pages every
@@ -1628,8 +1677,9 @@ __wt_cache_eviction_worker(WT_SESSION_IMPL *session, bool busy, u_int pct_full)
}
/* See if eviction is still needed. */
- if (!__wt_eviction_needed(session, NULL) ||
- cache->pages_evict > init_evict_count + max_pages_evicted)
+ if (!__wt_eviction_needed(session, &pct_full) ||
+ (pct_full < 100 &&
+ cache->pages_evict > init_evict_count + max_pages_evicted))
return (0);
/* Evict a page. */

View File

@@ -420,7 +420,8 @@ __evict_review(
* fallen behind current.
*/
if (modified)
- __wt_txn_update_oldest(session, true);
+ WT_RET(__wt_txn_update_oldest(
+ session, WT_TXN_OLDEST_STRICT));
if (!__wt_page_can_evict(session, ref, inmem_splitp))
return (EBUSY);

View File

@@ -129,6 +129,8 @@ struct __wt_btree {
uint64_t rec_max_txn; /* Maximum txn seen (clean trees) */
uint64_t write_gen; /* Write generation */
+ uint64_t bytes_inmem; /* Cache bytes in memory. */
WT_REF *evict_ref; /* Eviction thread's location */
uint64_t evict_priority; /* Relative priority of cached pages */
u_int evict_walk_period; /* Skip this many LRU walks */

View File

@@ -54,6 +54,27 @@ __wt_btree_block_free(
return (bm->free(bm, session, addr, addr_size));
}
+ /*
+ * __wt_btree_bytes_inuse --
+ * Return the number of bytes in use.
+ */
+ static inline uint64_t
+ __wt_btree_bytes_inuse(WT_SESSION_IMPL *session)
+ {
+ WT_CACHE *cache;
+ uint64_t bytes_inuse;
+ cache = S2C(session)->cache;
+ /* Adjust the cache size to take allocation overhead into account. */
+ bytes_inuse = S2BT(session)->bytes_inmem;
+ if (cache->overhead_pct != 0)
+ bytes_inuse +=
+ (bytes_inuse * (uint64_t)cache->overhead_pct) / 100;
+ return (bytes_inuse);
+ }
/*
* __wt_cache_page_inmem_incr --
* Increment a page's memory footprint in the cache.
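The overhead adjustment in `__wt_btree_bytes_inuse` above is plain integer arithmetic and can be checked in isolation. A minimal sketch follows; the standalone function `bytes_with_overhead` and its percentage bound are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Inflate a byte count by a whole-number percentage of allocator overhead. */
static uint64_t
bytes_with_overhead(uint64_t bytes_inmem, unsigned overhead_pct)
{
    /* Bound chosen for the sketch only. */
    assert(overhead_pct <= 100);
    if (overhead_pct == 0)
        return (bytes_inmem);
    return (bytes_inmem + (bytes_inmem * (uint64_t)overhead_pct) / 100);
}
```

For example, 1000 bytes in memory with an 8% overhead setting is reported as 1080 bytes in use.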
@@ -66,6 +87,7 @@ __wt_cache_page_inmem_incr(WT_SESSION_IMPL *session, WT_PAGE *page, size_t size)
WT_ASSERT(session, size < WT_EXABYTE);
cache = S2C(session)->cache;
+ (void)__wt_atomic_add64(&S2BT(session)->bytes_inmem, size);
(void)__wt_atomic_add64(&cache->bytes_inmem, size);
(void)__wt_atomic_addsize(&page->memory_footprint, size);
if (__wt_page_is_modified(page)) {
@@ -195,6 +217,8 @@ __wt_cache_page_inmem_decr(WT_SESSION_IMPL *session, WT_PAGE *page, size_t size)
WT_ASSERT(session, size < WT_EXABYTE);
+ __wt_cache_decr_check_uint64(
+ session, &S2BT(session)->bytes_inmem, size, "WT_BTREE.bytes_inmem");
__wt_cache_decr_check_uint64(
session, &cache->bytes_inmem, size, "WT_CACHE.bytes_inmem");
__wt_cache_decr_check_size(
@@ -274,8 +298,9 @@ __wt_cache_page_evict(WT_SESSION_IMPL *session, WT_PAGE *page)
modify = page->modify;
/* Update the bytes in-memory to reflect the eviction. */
- __wt_cache_decr_check_uint64(session,
- &cache->bytes_inmem,
+ __wt_cache_decr_check_uint64(session, &S2BT(session)->bytes_inmem,
+ page->memory_footprint, "WT_BTREE.bytes_inmem");
+ __wt_cache_decr_check_uint64(session, &cache->bytes_inmem,
page->memory_footprint, "WT_CACHE.bytes_inmem");
/* Update the bytes_internal value to reflect the eviction */

View File

@@ -13,7 +13,6 @@
#define WT_EVICT_INT_SKEW (1<<20) /* Prefer leaf pages over internal
pages by this many increments of the
read generation. */
- #define WT_EVICT_WALK_PER_FILE 10 /* Pages to queue per file */
#define WT_EVICT_WALK_BASE 300 /* Pages tracked across file visits */
#define WT_EVICT_WALK_INCR 100 /* Pages added each walk */
@@ -24,6 +23,7 @@
 struct __wt_evict_entry {
 WT_BTREE *btree; /* Enclosing btree object */
 WT_REF *ref; /* Page to flush/evict */
+uint64_t score; /* Relative eviction priority */
 };
 /*
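The `score` field lets the eviction queue be sorted on a value computed once per entry, rather than recomputed on every comparison. A minimal sketch of that idea follows; the struct and function names are hypothetical, and the ascending order (lowest score first) is an assumption rather than something this diff shows.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a queue entry; only the score matters here. */
struct entry_sketch {
	const char *name;	/* Illustrative label, not a WiredTiger field */
	uint64_t score;		/* Priority computed once, when queued */
};

/* Compare two entries by their precomputed scores, ascending (assumed). */
static int
entry_cmp(const void *a, const void *b)
{
	uint64_t sa = ((const struct entry_sketch *)a)->score;
	uint64_t sb = ((const struct entry_sketch *)b)->score;

	/* Avoid subtraction: a uint64_t difference does not fit in an int. */
	return (sa < sb ? -1 : (sa > sb ? 1 : 0));
}

/* Sort the queue in place; each comparison only reads two integers. */
static void
sort_queue(struct entry_sketch *queue, size_t n)
{
	qsort(queue, n, sizeof(queue[0]), entry_cmp);
}
```

Because the comparison function is called many times per sort, reading a stored integer is cheaper than deriving the priority from page state on each call.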


@@ -270,7 +270,7 @@ __cursor_func_init(WT_CURSOR_BTREE *cbt, bool reenter)
 * to read.
 */
 if (!F_ISSET(cbt, WT_CBT_NO_TXN))
-__wt_txn_cursor_op(session);
+WT_RET(__wt_txn_cursor_op(session));
 return (0);
 }


@@ -676,8 +676,8 @@ extern void __wt_stat_join_clear_single(WT_JOIN_STATS *stats);
 extern void __wt_stat_join_clear_all(WT_JOIN_STATS **stats);
 extern void __wt_stat_join_aggregate( WT_JOIN_STATS **from, WT_JOIN_STATS *to);
 extern void __wt_txn_release_snapshot(WT_SESSION_IMPL *session);
-extern void __wt_txn_get_snapshot(WT_SESSION_IMPL *session);
-extern void __wt_txn_update_oldest(WT_SESSION_IMPL *session, bool force);
+extern int __wt_txn_get_snapshot(WT_SESSION_IMPL *session);
+extern int __wt_txn_update_oldest(WT_SESSION_IMPL *session, uint32_t flags);
 extern int __wt_txn_config(WT_SESSION_IMPL *session, const char *cfg[]);
 extern void __wt_txn_release(WT_SESSION_IMPL *session);
 extern int __wt_txn_commit(WT_SESSION_IMPL *session, const char *cfg[]);


@@ -76,6 +76,8 @@
 #define WT_TXN_LOG_CKPT_START 0x00000004
 #define WT_TXN_LOG_CKPT_STOP 0x00000008
 #define WT_TXN_LOG_CKPT_SYNC 0x00000010
+#define WT_TXN_OLDEST_STRICT 0x00000001
+#define WT_TXN_OLDEST_WAIT 0x00000002
 #define WT_VERB_API 0x00000001
 #define WT_VERB_BLOCK 0x00000002
 #define WT_VERB_CHECKPOINT 0x00000004


@@ -255,7 +255,8 @@ struct __wt_log {
 uint64_t write_calls; /* Calls to log_write */
 #endif
-uint32_t flags;
+#define WT_LOG_OPENED 0x01 /* Log subsystem successfully open */
+uint32_t flags;
 };
 struct __wt_log_record {


@@ -17,15 +17,26 @@
 #define WT_SYSCALL_RETRY(call, ret) do { \
 int __retry; \
 for (__retry = 0; __retry < 10; ++__retry) { \
-if ((call) == 0) { \
-(ret) = 0; \
-break; \
-} \
-switch ((ret) = __wt_errno()) { \
-case 0: \
-/* The call failed but didn't set errno. */ \
-(ret) = WT_ERROR; \
+/* \
+* A call returning 0 indicates success; any call where \
+* 0 is not the only successful return must provide an \
+* expression evaluating to 0 in all successful cases. \
+*/ \
+if (((ret) = (call)) == 0) \
 break; \
+/* \
+* The call's error was either returned by the call or \
+* is in errno, and there are cases where it depends on \
+* the software release as to which it is (for example, \
+* posix_fadvise on FreeBSD and OS X). Failing calls \
+* must either return a non-zero error value, or -1 if \
+* the error value is in errno. (The WiredTiger errno \
+* function returns WT_ERROR if errno is 0, which isn't \
+* ideal but won't discard the failure.) \
+*/ \
+if ((ret) == -1) \
+(ret) = __wt_errno(); \
+switch (ret) { \
 case EAGAIN: \
 case EBUSY: \
 case EINTR: \
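The return-value convention described in the new macro comment can be sketched as two small functions. This is an illustration under stated assumptions, not WiredTiger's code: `SKETCH_ERROR` is a placeholder for `WT_ERROR` with an arbitrary value, `normalize_syscall_ret`, `retry_call` and `flaky_call` are hypothetical names, and the sketch assumes the three listed error codes are treated as transient and retried immediately (the hunk is cut off before showing what the macro does for them).

```c
#include <errno.h>

/* Placeholder for WT_ERROR; the value here is arbitrary. */
#define	SKETCH_ERROR	(-1000)

/*
 * Map a call's return value to an error code: 0 is success, -1 means the
 * error is in errno, and any other non-zero value is the error itself.
 */
static int
normalize_syscall_ret(int call_ret)
{
	if (call_ret == 0)
		return (0);
	if (call_ret == -1)
		/* Don't discard the failure if errno was never set. */
		return (errno != 0 ? errno : SKETCH_ERROR);
	return (call_ret);
}

/* Retry a call up to 10 times while it fails with a transient error. */
static int
retry_call(int (*call)(void *), void *arg)
{
	int ret, tries;

	ret = 0;
	for (tries = 0; tries < 10; ++tries) {
		ret = normalize_syscall_ret(call(arg));
		switch (ret) {
		case EAGAIN:
		case EBUSY:
		case EINTR:
			continue;	/* Assumed transient: try again */
		default:
			return (ret);	/* Success or a hard error */
		}
	}
	return (ret);			/* Still failing after 10 tries */
}

/* Example callback: fails twice with EINTR in errno, then succeeds. */
static int
flaky_call(void *arg)
{
	int *calls = arg;

	if (++*calls < 3) {
		errno = EINTR;
		return (-1);
	}
	return (0);
}
```

Calling `retry_call(flaky_call, &calls)` with `calls` initialized to 0 returns 0 after three invocations of the callback.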


@@ -306,7 +306,7 @@ __wt_update_serial(WT_SESSION_IMPL *session, WT_PAGE *page,
 if ((txn = page->modify->obsolete_check_txn) != WT_TXN_NONE) {
 if (!__wt_txn_visible_all(session, txn)) {
 /* Try to move the oldest ID forward and re-check. */
-__wt_txn_update_oldest(session, false);
+WT_RET(__wt_txn_update_oldest(session, 0));
 if (!__wt_txn_visible_all(session, txn))
 return (0);


@@ -269,6 +269,8 @@ struct __wt_connection_stats {
 int64_t cache_eviction_slow;
 int64_t cache_eviction_worker_evicting;
 int64_t cache_eviction_force_fail;
+int64_t cache_eviction_walks_active;
+int64_t cache_eviction_walks_started;
 int64_t cache_eviction_hazard;
 int64_t cache_inmem_splittable;
 int64_t cache_inmem_split;
@@ -280,14 +282,18 @@ struct __wt_connection_stats {
 int64_t cache_bytes_max;
 int64_t cache_eviction_maximum_page_size;
 int64_t cache_eviction_dirty;
+int64_t cache_eviction_app_dirty;
 int64_t cache_eviction_deepen;
 int64_t cache_write_lookaside;
 int64_t cache_pages_inuse;
 int64_t cache_eviction_force;
 int64_t cache_eviction_force_delete;
 int64_t cache_eviction_app;
+int64_t cache_eviction_pages_queued;
+int64_t cache_eviction_pages_queued_oldest;
 int64_t cache_read;
 int64_t cache_read_lookaside;
+int64_t cache_eviction_pages_seen;
 int64_t cache_eviction_fail;
 int64_t cache_eviction_walk;
 int64_t cache_write;
@@ -441,6 +447,7 @@ struct __wt_dsrc_stats {
 int64_t btree_compact_rewrite;
 int64_t btree_row_internal;
 int64_t btree_row_leaf;
+int64_t cache_bytes_inuse;
 int64_t cache_bytes_read;
 int64_t cache_bytes_write;
 int64_t cache_eviction_checkpoint;


@@ -74,7 +74,7 @@ struct __wt_txn_global {
 volatile uint64_t current; /* Current transaction ID. */
 /* The oldest running transaction ID (may race). */
-uint64_t last_running;
+volatile uint64_t last_running;
 /*
 * The oldest transaction ID that is not yet visible to some
@@ -82,8 +82,11 @@ struct __wt_txn_global {
 */
 volatile uint64_t oldest_id;
-/* Count of scanning threads, or -1 for exclusive access. */
-volatile int32_t scan_count;
+/*
+* Prevents the oldest ID moving forwards while threads are scanning
+* the global transaction state.
+*/
+WT_RWLOCK *scan_rwlock;
 /*
 * Track information about the running checkpoint. The transaction


@@ -261,14 +261,14 @@ __wt_txn_begin(WT_SESSION_IMPL *session, const char *cfg[])
 * eviction, it's better to do it beforehand.
 */
 WT_RET(__wt_cache_eviction_check(session, false, NULL));
-__wt_txn_get_snapshot(session);
+WT_RET(__wt_txn_get_snapshot(session));
 }
 F_SET(txn, WT_TXN_RUNNING);
 if (F_ISSET(S2C(session), WT_CONN_READONLY))
 F_SET(txn, WT_TXN_READONLY);
-return (false);
+return (0);
 }
 /*
@@ -450,7 +450,7 @@ __wt_txn_read_last(WT_SESSION_IMPL *session)
 * __wt_txn_cursor_op --
 * Called for each cursor operation.
 */
-static inline void
+static inline int
 __wt_txn_cursor_op(WT_SESSION_IMPL *session)
 {
 WT_TXN *txn;
@@ -482,7 +482,9 @@ __wt_txn_cursor_op(WT_SESSION_IMPL *session)
 if (txn_state->snap_min == WT_TXN_NONE)
 txn_state->snap_min = txn_global->last_running;
 } else if (!F_ISSET(txn, WT_TXN_HAS_SNAPSHOT))
-__wt_txn_get_snapshot(session);
+WT_RET(__wt_txn_get_snapshot(session));
+return (0);
 }
 /*


@@ -1221,9 +1221,6 @@ struct __wt_session {
 * @configstart{WT_SESSION.drop, see dist/api_data.py}
 * @config{force, return success if the object does not exist., a
 * boolean flag; default \c false.}
-* @config{lock_wait, wait for locks\, if \c lock_wait=false\, fail if
-* any required locks are not available immediately., a boolean flag;
-* default \c true.}
 * @config{remove_files, should the underlying files be removed?., a
 * boolean flag; default \c true.}
 * @configend
@@ -3790,257 +3787,269 @@ extern int wiredtiger_extension_terminate(WT_CONNECTION *connection);
#define WT_STAT_CONN_CACHE_EVICTION_WORKER_EVICTING 1040
/*! cache: failed eviction of pages that exceeded the in-memory maximum */
#define WT_STAT_CONN_CACHE_EVICTION_FORCE_FAIL 1041
/*! cache: files with active eviction walks */
#define WT_STAT_CONN_CACHE_EVICTION_WALKS_ACTIVE 1042
/*! cache: files with new eviction walks started */
#define WT_STAT_CONN_CACHE_EVICTION_WALKS_STARTED 1043
/*! cache: hazard pointer blocked page eviction */
#define WT_STAT_CONN_CACHE_EVICTION_HAZARD 1042
#define WT_STAT_CONN_CACHE_EVICTION_HAZARD 1044
/*! cache: in-memory page passed criteria to be split */
#define WT_STAT_CONN_CACHE_INMEM_SPLITTABLE 1043
#define WT_STAT_CONN_CACHE_INMEM_SPLITTABLE 1045
/*! cache: in-memory page splits */
#define WT_STAT_CONN_CACHE_INMEM_SPLIT 1044
#define WT_STAT_CONN_CACHE_INMEM_SPLIT 1046
/*! cache: internal pages evicted */
#define WT_STAT_CONN_CACHE_EVICTION_INTERNAL 1045
#define WT_STAT_CONN_CACHE_EVICTION_INTERNAL 1047
/*! cache: internal pages split during eviction */
#define WT_STAT_CONN_CACHE_EVICTION_SPLIT_INTERNAL 1046
#define WT_STAT_CONN_CACHE_EVICTION_SPLIT_INTERNAL 1048
/*! cache: leaf pages split during eviction */
#define WT_STAT_CONN_CACHE_EVICTION_SPLIT_LEAF 1047
#define WT_STAT_CONN_CACHE_EVICTION_SPLIT_LEAF 1049
/*! cache: lookaside table insert calls */
#define WT_STAT_CONN_CACHE_LOOKASIDE_INSERT 1048
#define WT_STAT_CONN_CACHE_LOOKASIDE_INSERT 1050
/*! cache: lookaside table remove calls */
#define WT_STAT_CONN_CACHE_LOOKASIDE_REMOVE 1049
#define WT_STAT_CONN_CACHE_LOOKASIDE_REMOVE 1051
/*! cache: maximum bytes configured */
#define WT_STAT_CONN_CACHE_BYTES_MAX 1050
#define WT_STAT_CONN_CACHE_BYTES_MAX 1052
/*! cache: maximum page size at eviction */
#define WT_STAT_CONN_CACHE_EVICTION_MAXIMUM_PAGE_SIZE 1051
#define WT_STAT_CONN_CACHE_EVICTION_MAXIMUM_PAGE_SIZE 1053
/*! cache: modified pages evicted */
#define WT_STAT_CONN_CACHE_EVICTION_DIRTY 1052
#define WT_STAT_CONN_CACHE_EVICTION_DIRTY 1054
/*! cache: modified pages evicted by application threads */
#define WT_STAT_CONN_CACHE_EVICTION_APP_DIRTY 1055
/*! cache: page split during eviction deepened the tree */
#define WT_STAT_CONN_CACHE_EVICTION_DEEPEN 1053
#define WT_STAT_CONN_CACHE_EVICTION_DEEPEN 1056
/*! cache: page written requiring lookaside records */
#define WT_STAT_CONN_CACHE_WRITE_LOOKASIDE 1054
#define WT_STAT_CONN_CACHE_WRITE_LOOKASIDE 1057
/*! cache: pages currently held in the cache */
#define WT_STAT_CONN_CACHE_PAGES_INUSE 1055
#define WT_STAT_CONN_CACHE_PAGES_INUSE 1058
/*! cache: pages evicted because they exceeded the in-memory maximum */
#define WT_STAT_CONN_CACHE_EVICTION_FORCE 1056
#define WT_STAT_CONN_CACHE_EVICTION_FORCE 1059
/*! cache: pages evicted because they had chains of deleted items */
#define WT_STAT_CONN_CACHE_EVICTION_FORCE_DELETE 1057
#define WT_STAT_CONN_CACHE_EVICTION_FORCE_DELETE 1060
/*! cache: pages evicted by application threads */
#define WT_STAT_CONN_CACHE_EVICTION_APP 1058
#define WT_STAT_CONN_CACHE_EVICTION_APP 1061
/*! cache: pages queued for eviction */
#define WT_STAT_CONN_CACHE_EVICTION_PAGES_QUEUED 1062
/*! cache: pages queued for urgent eviction */
#define WT_STAT_CONN_CACHE_EVICTION_PAGES_QUEUED_OLDEST 1063
/*! cache: pages read into cache */
#define WT_STAT_CONN_CACHE_READ 1059
#define WT_STAT_CONN_CACHE_READ 1064
/*! cache: pages read into cache requiring lookaside entries */
#define WT_STAT_CONN_CACHE_READ_LOOKASIDE 1060
#define WT_STAT_CONN_CACHE_READ_LOOKASIDE 1065
/*! cache: pages seen by eviction walk */
#define WT_STAT_CONN_CACHE_EVICTION_PAGES_SEEN 1066
/*! cache: pages selected for eviction unable to be evicted */
#define WT_STAT_CONN_CACHE_EVICTION_FAIL 1061
#define WT_STAT_CONN_CACHE_EVICTION_FAIL 1067
/*! cache: pages walked for eviction */
#define WT_STAT_CONN_CACHE_EVICTION_WALK 1062
#define WT_STAT_CONN_CACHE_EVICTION_WALK 1068
/*! cache: pages written from cache */
#define WT_STAT_CONN_CACHE_WRITE 1063
#define WT_STAT_CONN_CACHE_WRITE 1069
/*! cache: pages written requiring in-memory restoration */
#define WT_STAT_CONN_CACHE_WRITE_RESTORE 1064
#define WT_STAT_CONN_CACHE_WRITE_RESTORE 1070
/*! cache: percentage overhead */
#define WT_STAT_CONN_CACHE_OVERHEAD 1065
#define WT_STAT_CONN_CACHE_OVERHEAD 1071
/*! cache: tracked bytes belonging to internal pages in the cache */
#define WT_STAT_CONN_CACHE_BYTES_INTERNAL 1066
#define WT_STAT_CONN_CACHE_BYTES_INTERNAL 1072
/*! cache: tracked bytes belonging to leaf pages in the cache */
#define WT_STAT_CONN_CACHE_BYTES_LEAF 1067
#define WT_STAT_CONN_CACHE_BYTES_LEAF 1073
/*! cache: tracked bytes belonging to overflow pages in the cache */
#define WT_STAT_CONN_CACHE_BYTES_OVERFLOW 1068
#define WT_STAT_CONN_CACHE_BYTES_OVERFLOW 1074
/*! cache: tracked dirty bytes in the cache */
#define WT_STAT_CONN_CACHE_BYTES_DIRTY 1069
#define WT_STAT_CONN_CACHE_BYTES_DIRTY 1075
/*! cache: tracked dirty pages in the cache */
#define WT_STAT_CONN_CACHE_PAGES_DIRTY 1070
#define WT_STAT_CONN_CACHE_PAGES_DIRTY 1076
/*! cache: unmodified pages evicted */
#define WT_STAT_CONN_CACHE_EVICTION_CLEAN 1071
#define WT_STAT_CONN_CACHE_EVICTION_CLEAN 1077
/*! connection: auto adjusting condition resets */
#define WT_STAT_CONN_COND_AUTO_WAIT_RESET 1072
#define WT_STAT_CONN_COND_AUTO_WAIT_RESET 1078
/*! connection: auto adjusting condition wait calls */
#define WT_STAT_CONN_COND_AUTO_WAIT 1073
#define WT_STAT_CONN_COND_AUTO_WAIT 1079
/*! connection: files currently open */
#define WT_STAT_CONN_FILE_OPEN 1074
#define WT_STAT_CONN_FILE_OPEN 1080
/*! connection: memory allocations */
#define WT_STAT_CONN_MEMORY_ALLOCATION 1075
#define WT_STAT_CONN_MEMORY_ALLOCATION 1081
/*! connection: memory frees */
#define WT_STAT_CONN_MEMORY_FREE 1076
#define WT_STAT_CONN_MEMORY_FREE 1082
/*! connection: memory re-allocations */
#define WT_STAT_CONN_MEMORY_GROW 1077
#define WT_STAT_CONN_MEMORY_GROW 1083
/*! connection: pthread mutex condition wait calls */
#define WT_STAT_CONN_COND_WAIT 1078
#define WT_STAT_CONN_COND_WAIT 1084
/*! connection: pthread mutex shared lock read-lock calls */
#define WT_STAT_CONN_RWLOCK_READ 1079
#define WT_STAT_CONN_RWLOCK_READ 1085
/*! connection: pthread mutex shared lock write-lock calls */
#define WT_STAT_CONN_RWLOCK_WRITE 1080
#define WT_STAT_CONN_RWLOCK_WRITE 1086
/*! connection: total read I/Os */
#define WT_STAT_CONN_READ_IO 1081
#define WT_STAT_CONN_READ_IO 1087
/*! connection: total write I/Os */
#define WT_STAT_CONN_WRITE_IO 1082
#define WT_STAT_CONN_WRITE_IO 1088
/*! cursor: cursor create calls */
#define WT_STAT_CONN_CURSOR_CREATE 1083
#define WT_STAT_CONN_CURSOR_CREATE 1089
/*! cursor: cursor insert calls */
#define WT_STAT_CONN_CURSOR_INSERT 1084
#define WT_STAT_CONN_CURSOR_INSERT 1090
/*! cursor: cursor next calls */
#define WT_STAT_CONN_CURSOR_NEXT 1085
#define WT_STAT_CONN_CURSOR_NEXT 1091
/*! cursor: cursor prev calls */
#define WT_STAT_CONN_CURSOR_PREV 1086
#define WT_STAT_CONN_CURSOR_PREV 1092
/*! cursor: cursor remove calls */
#define WT_STAT_CONN_CURSOR_REMOVE 1087
#define WT_STAT_CONN_CURSOR_REMOVE 1093
/*! cursor: cursor reset calls */
#define WT_STAT_CONN_CURSOR_RESET 1088
#define WT_STAT_CONN_CURSOR_RESET 1094
/*! cursor: cursor restarted searches */
#define WT_STAT_CONN_CURSOR_RESTART 1089
#define WT_STAT_CONN_CURSOR_RESTART 1095
/*! cursor: cursor search calls */
#define WT_STAT_CONN_CURSOR_SEARCH 1090
#define WT_STAT_CONN_CURSOR_SEARCH 1096
/*! cursor: cursor search near calls */
#define WT_STAT_CONN_CURSOR_SEARCH_NEAR 1091
#define WT_STAT_CONN_CURSOR_SEARCH_NEAR 1097
/*! cursor: cursor update calls */
#define WT_STAT_CONN_CURSOR_UPDATE 1092
#define WT_STAT_CONN_CURSOR_UPDATE 1098
/*! cursor: truncate calls */
#define WT_STAT_CONN_CURSOR_TRUNCATE 1093
#define WT_STAT_CONN_CURSOR_TRUNCATE 1099
/*! data-handle: connection data handles currently active */
#define WT_STAT_CONN_DH_CONN_HANDLE_COUNT 1094
#define WT_STAT_CONN_DH_CONN_HANDLE_COUNT 1100
/*! data-handle: connection sweep candidate became referenced */
#define WT_STAT_CONN_DH_SWEEP_REF 1095
#define WT_STAT_CONN_DH_SWEEP_REF 1101
/*! data-handle: connection sweep dhandles closed */
#define WT_STAT_CONN_DH_SWEEP_CLOSE 1096
#define WT_STAT_CONN_DH_SWEEP_CLOSE 1102
/*! data-handle: connection sweep dhandles removed from hash list */
#define WT_STAT_CONN_DH_SWEEP_REMOVE 1097
#define WT_STAT_CONN_DH_SWEEP_REMOVE 1103
/*! data-handle: connection sweep time-of-death sets */
#define WT_STAT_CONN_DH_SWEEP_TOD 1098
#define WT_STAT_CONN_DH_SWEEP_TOD 1104
/*! data-handle: connection sweeps */
#define WT_STAT_CONN_DH_SWEEPS 1099
#define WT_STAT_CONN_DH_SWEEPS 1105
/*! data-handle: session dhandles swept */
#define WT_STAT_CONN_DH_SESSION_HANDLES 1100
#define WT_STAT_CONN_DH_SESSION_HANDLES 1106
/*! data-handle: session sweep attempts */
#define WT_STAT_CONN_DH_SESSION_SWEEPS 1101
#define WT_STAT_CONN_DH_SESSION_SWEEPS 1107
/*! log: busy returns attempting to switch slots */
#define WT_STAT_CONN_LOG_SLOT_SWITCH_BUSY 1102
#define WT_STAT_CONN_LOG_SLOT_SWITCH_BUSY 1108
/*! log: consolidated slot closures */
#define WT_STAT_CONN_LOG_SLOT_CLOSES 1103
#define WT_STAT_CONN_LOG_SLOT_CLOSES 1109
/*! log: consolidated slot join races */
#define WT_STAT_CONN_LOG_SLOT_RACES 1104
#define WT_STAT_CONN_LOG_SLOT_RACES 1110
/*! log: consolidated slot join transitions */
#define WT_STAT_CONN_LOG_SLOT_TRANSITIONS 1105
#define WT_STAT_CONN_LOG_SLOT_TRANSITIONS 1111
/*! log: consolidated slot joins */
#define WT_STAT_CONN_LOG_SLOT_JOINS 1106
#define WT_STAT_CONN_LOG_SLOT_JOINS 1112
/*! log: consolidated slot unbuffered writes */
#define WT_STAT_CONN_LOG_SLOT_UNBUFFERED 1107
#define WT_STAT_CONN_LOG_SLOT_UNBUFFERED 1113
/*! log: log bytes of payload data */
#define WT_STAT_CONN_LOG_BYTES_PAYLOAD 1108
#define WT_STAT_CONN_LOG_BYTES_PAYLOAD 1114
/*! log: log bytes written */
#define WT_STAT_CONN_LOG_BYTES_WRITTEN 1109
#define WT_STAT_CONN_LOG_BYTES_WRITTEN 1115
/*! log: log files manually zero-filled */
#define WT_STAT_CONN_LOG_ZERO_FILLS 1110
#define WT_STAT_CONN_LOG_ZERO_FILLS 1116
/*! log: log flush operations */
#define WT_STAT_CONN_LOG_FLUSH 1111
#define WT_STAT_CONN_LOG_FLUSH 1117
/*! log: log force write operations */
#define WT_STAT_CONN_LOG_FORCE_WRITE 1112
#define WT_STAT_CONN_LOG_FORCE_WRITE 1118
/*! log: log force write operations skipped */
#define WT_STAT_CONN_LOG_FORCE_WRITE_SKIP 1113
#define WT_STAT_CONN_LOG_FORCE_WRITE_SKIP 1119
/*! log: log records compressed */
#define WT_STAT_CONN_LOG_COMPRESS_WRITES 1114
#define WT_STAT_CONN_LOG_COMPRESS_WRITES 1120
/*! log: log records not compressed */
#define WT_STAT_CONN_LOG_COMPRESS_WRITE_FAILS 1115
#define WT_STAT_CONN_LOG_COMPRESS_WRITE_FAILS 1121
/*! log: log records too small to compress */
#define WT_STAT_CONN_LOG_COMPRESS_SMALL 1116
#define WT_STAT_CONN_LOG_COMPRESS_SMALL 1122
/*! log: log release advances write LSN */
#define WT_STAT_CONN_LOG_RELEASE_WRITE_LSN 1117
#define WT_STAT_CONN_LOG_RELEASE_WRITE_LSN 1123
/*! log: log scan operations */
#define WT_STAT_CONN_LOG_SCANS 1118
#define WT_STAT_CONN_LOG_SCANS 1124
/*! log: log scan records requiring two reads */
#define WT_STAT_CONN_LOG_SCAN_REREADS 1119
#define WT_STAT_CONN_LOG_SCAN_REREADS 1125
/*! log: log server thread advances write LSN */
#define WT_STAT_CONN_LOG_WRITE_LSN 1120
#define WT_STAT_CONN_LOG_WRITE_LSN 1126
/*! log: log server thread write LSN walk skipped */
#define WT_STAT_CONN_LOG_WRITE_LSN_SKIP 1121
#define WT_STAT_CONN_LOG_WRITE_LSN_SKIP 1127
/*! log: log sync operations */
#define WT_STAT_CONN_LOG_SYNC 1122
#define WT_STAT_CONN_LOG_SYNC 1128
/*! log: log sync_dir operations */
#define WT_STAT_CONN_LOG_SYNC_DIR 1123
#define WT_STAT_CONN_LOG_SYNC_DIR 1129
/*! log: log write operations */
#define WT_STAT_CONN_LOG_WRITES 1124
#define WT_STAT_CONN_LOG_WRITES 1130
/*! log: logging bytes consolidated */
#define WT_STAT_CONN_LOG_SLOT_CONSOLIDATED 1125
#define WT_STAT_CONN_LOG_SLOT_CONSOLIDATED 1131
/*! log: maximum log file size */
#define WT_STAT_CONN_LOG_MAX_FILESIZE 1126
#define WT_STAT_CONN_LOG_MAX_FILESIZE 1132
/*! log: number of pre-allocated log files to create */
#define WT_STAT_CONN_LOG_PREALLOC_MAX 1127
#define WT_STAT_CONN_LOG_PREALLOC_MAX 1133
/*! log: pre-allocated log files not ready and missed */
#define WT_STAT_CONN_LOG_PREALLOC_MISSED 1128
#define WT_STAT_CONN_LOG_PREALLOC_MISSED 1134
/*! log: pre-allocated log files prepared */
#define WT_STAT_CONN_LOG_PREALLOC_FILES 1129
#define WT_STAT_CONN_LOG_PREALLOC_FILES 1135
/*! log: pre-allocated log files used */
#define WT_STAT_CONN_LOG_PREALLOC_USED 1130
#define WT_STAT_CONN_LOG_PREALLOC_USED 1136
/*! log: records processed by log scan */
#define WT_STAT_CONN_LOG_SCAN_RECORDS 1131
#define WT_STAT_CONN_LOG_SCAN_RECORDS 1137
/*! log: total in-memory size of compressed records */
#define WT_STAT_CONN_LOG_COMPRESS_MEM 1132
#define WT_STAT_CONN_LOG_COMPRESS_MEM 1138
/*! log: total log buffer size */
#define WT_STAT_CONN_LOG_BUFFER_SIZE 1133
#define WT_STAT_CONN_LOG_BUFFER_SIZE 1139
/*! log: total size of compressed records */
#define WT_STAT_CONN_LOG_COMPRESS_LEN 1134
#define WT_STAT_CONN_LOG_COMPRESS_LEN 1140
/*! log: written slots coalesced */
#define WT_STAT_CONN_LOG_SLOT_COALESCED 1135
#define WT_STAT_CONN_LOG_SLOT_COALESCED 1141
/*! log: yields waiting for previous log file close */
#define WT_STAT_CONN_LOG_CLOSE_YIELDS 1136
#define WT_STAT_CONN_LOG_CLOSE_YIELDS 1142
/*! reconciliation: fast-path pages deleted */
#define WT_STAT_CONN_REC_PAGE_DELETE_FAST 1137
#define WT_STAT_CONN_REC_PAGE_DELETE_FAST 1143
/*! reconciliation: page reconciliation calls */
#define WT_STAT_CONN_REC_PAGES 1138
#define WT_STAT_CONN_REC_PAGES 1144
/*! reconciliation: page reconciliation calls for eviction */
#define WT_STAT_CONN_REC_PAGES_EVICTION 1139
#define WT_STAT_CONN_REC_PAGES_EVICTION 1145
/*! reconciliation: pages deleted */
#define WT_STAT_CONN_REC_PAGE_DELETE 1140
#define WT_STAT_CONN_REC_PAGE_DELETE 1146
/*! reconciliation: split bytes currently awaiting free */
#define WT_STAT_CONN_REC_SPLIT_STASHED_BYTES 1141
#define WT_STAT_CONN_REC_SPLIT_STASHED_BYTES 1147
/*! reconciliation: split objects currently awaiting free */
#define WT_STAT_CONN_REC_SPLIT_STASHED_OBJECTS 1142
#define WT_STAT_CONN_REC_SPLIT_STASHED_OBJECTS 1148
/*! session: open cursor count */
#define WT_STAT_CONN_SESSION_CURSOR_OPEN 1143
#define WT_STAT_CONN_SESSION_CURSOR_OPEN 1149
/*! session: open session count */
#define WT_STAT_CONN_SESSION_OPEN 1144
#define WT_STAT_CONN_SESSION_OPEN 1150
/*! thread-yield: page acquire busy blocked */
#define WT_STAT_CONN_PAGE_BUSY_BLOCKED 1145
#define WT_STAT_CONN_PAGE_BUSY_BLOCKED 1151
/*! thread-yield: page acquire eviction blocked */
#define WT_STAT_CONN_PAGE_FORCIBLE_EVICT_BLOCKED 1146
#define WT_STAT_CONN_PAGE_FORCIBLE_EVICT_BLOCKED 1152
/*! thread-yield: page acquire locked blocked */
#define WT_STAT_CONN_PAGE_LOCKED_BLOCKED 1147
#define WT_STAT_CONN_PAGE_LOCKED_BLOCKED 1153
/*! thread-yield: page acquire read blocked */
#define WT_STAT_CONN_PAGE_READ_BLOCKED 1148
#define WT_STAT_CONN_PAGE_READ_BLOCKED 1154
/*! thread-yield: page acquire time sleeping (usecs) */
#define WT_STAT_CONN_PAGE_SLEEP 1149
#define WT_STAT_CONN_PAGE_SLEEP 1155
/*! transaction: number of named snapshots created */
#define WT_STAT_CONN_TXN_SNAPSHOTS_CREATED 1150
#define WT_STAT_CONN_TXN_SNAPSHOTS_CREATED 1156
/*! transaction: number of named snapshots dropped */
#define WT_STAT_CONN_TXN_SNAPSHOTS_DROPPED 1151
#define WT_STAT_CONN_TXN_SNAPSHOTS_DROPPED 1157
/*! transaction: transaction begins */
#define WT_STAT_CONN_TXN_BEGIN 1152
#define WT_STAT_CONN_TXN_BEGIN 1158
/*! transaction: transaction checkpoint currently running */
#define WT_STAT_CONN_TXN_CHECKPOINT_RUNNING 1153
#define WT_STAT_CONN_TXN_CHECKPOINT_RUNNING 1159
/*! transaction: transaction checkpoint generation */
#define WT_STAT_CONN_TXN_CHECKPOINT_GENERATION 1154
#define WT_STAT_CONN_TXN_CHECKPOINT_GENERATION 1160
/*! transaction: transaction checkpoint max time (msecs) */
#define WT_STAT_CONN_TXN_CHECKPOINT_TIME_MAX 1155
#define WT_STAT_CONN_TXN_CHECKPOINT_TIME_MAX 1161
/*! transaction: transaction checkpoint min time (msecs) */
#define WT_STAT_CONN_TXN_CHECKPOINT_TIME_MIN 1156
#define WT_STAT_CONN_TXN_CHECKPOINT_TIME_MIN 1162
/*! transaction: transaction checkpoint most recent time (msecs) */
#define WT_STAT_CONN_TXN_CHECKPOINT_TIME_RECENT 1157
#define WT_STAT_CONN_TXN_CHECKPOINT_TIME_RECENT 1163
/*! transaction: transaction checkpoint total time (msecs) */
#define WT_STAT_CONN_TXN_CHECKPOINT_TIME_TOTAL 1158
#define WT_STAT_CONN_TXN_CHECKPOINT_TIME_TOTAL 1164
/*! transaction: transaction checkpoints */
#define WT_STAT_CONN_TXN_CHECKPOINT 1159
#define WT_STAT_CONN_TXN_CHECKPOINT 1165
/*! transaction: transaction failures due to cache overflow */
#define WT_STAT_CONN_TXN_FAIL_CACHE 1160
#define WT_STAT_CONN_TXN_FAIL_CACHE 1166
/*! transaction: transaction range of IDs currently pinned */
#define WT_STAT_CONN_TXN_PINNED_RANGE 1161
#define WT_STAT_CONN_TXN_PINNED_RANGE 1167
/*! transaction: transaction range of IDs currently pinned by a checkpoint */
#define WT_STAT_CONN_TXN_PINNED_CHECKPOINT_RANGE 1162
#define WT_STAT_CONN_TXN_PINNED_CHECKPOINT_RANGE 1168
/*! transaction: transaction range of IDs currently pinned by named
* snapshots */
#define WT_STAT_CONN_TXN_PINNED_SNAPSHOT_RANGE 1163
#define WT_STAT_CONN_TXN_PINNED_SNAPSHOT_RANGE 1169
/*! transaction: transaction sync calls */
#define WT_STAT_CONN_TXN_SYNC 1164
#define WT_STAT_CONN_TXN_SYNC 1170
/*! transaction: transactions committed */
#define WT_STAT_CONN_TXN_COMMIT 1165
#define WT_STAT_CONN_TXN_COMMIT 1171
/*! transaction: transactions rolled back */
#define WT_STAT_CONN_TXN_ROLLBACK 1166
#define WT_STAT_CONN_TXN_ROLLBACK 1172
/*!
* @}
@@ -4129,125 +4138,127 @@ extern int wiredtiger_extension_terminate(WT_CONNECTION *connection);
#define WT_STAT_DSRC_BTREE_ROW_INTERNAL 2038
/*! btree: row-store leaf pages */
#define WT_STAT_DSRC_BTREE_ROW_LEAF 2039
/*! cache: bytes currently in the cache */
#define WT_STAT_DSRC_CACHE_BYTES_INUSE 2040
/*! cache: bytes read into cache */
#define WT_STAT_DSRC_CACHE_BYTES_READ 2040
#define WT_STAT_DSRC_CACHE_BYTES_READ 2041
/*! cache: bytes written from cache */
#define WT_STAT_DSRC_CACHE_BYTES_WRITE 2041
#define WT_STAT_DSRC_CACHE_BYTES_WRITE 2042
/*! cache: checkpoint blocked page eviction */
#define WT_STAT_DSRC_CACHE_EVICTION_CHECKPOINT 2042
#define WT_STAT_DSRC_CACHE_EVICTION_CHECKPOINT 2043
/*! cache: data source pages selected for eviction unable to be evicted */
#define WT_STAT_DSRC_CACHE_EVICTION_FAIL 2043
#define WT_STAT_DSRC_CACHE_EVICTION_FAIL 2044
/*! cache: hazard pointer blocked page eviction */
#define WT_STAT_DSRC_CACHE_EVICTION_HAZARD 2044
#define WT_STAT_DSRC_CACHE_EVICTION_HAZARD 2045
/*! cache: in-memory page passed criteria to be split */
#define WT_STAT_DSRC_CACHE_INMEM_SPLITTABLE 2045
#define WT_STAT_DSRC_CACHE_INMEM_SPLITTABLE 2046
/*! cache: in-memory page splits */
#define WT_STAT_DSRC_CACHE_INMEM_SPLIT 2046
#define WT_STAT_DSRC_CACHE_INMEM_SPLIT 2047
/*! cache: internal pages evicted */
#define WT_STAT_DSRC_CACHE_EVICTION_INTERNAL 2047
#define WT_STAT_DSRC_CACHE_EVICTION_INTERNAL 2048
/*! cache: internal pages split during eviction */
#define WT_STAT_DSRC_CACHE_EVICTION_SPLIT_INTERNAL 2048
#define WT_STAT_DSRC_CACHE_EVICTION_SPLIT_INTERNAL 2049
/*! cache: leaf pages split during eviction */
#define WT_STAT_DSRC_CACHE_EVICTION_SPLIT_LEAF 2049
#define WT_STAT_DSRC_CACHE_EVICTION_SPLIT_LEAF 2050
/*! cache: modified pages evicted */
#define WT_STAT_DSRC_CACHE_EVICTION_DIRTY 2050
#define WT_STAT_DSRC_CACHE_EVICTION_DIRTY 2051
/*! cache: overflow pages read into cache */
#define WT_STAT_DSRC_CACHE_READ_OVERFLOW 2051
#define WT_STAT_DSRC_CACHE_READ_OVERFLOW 2052
/*! cache: overflow values cached in memory */
#define WT_STAT_DSRC_CACHE_OVERFLOW_VALUE 2052
#define WT_STAT_DSRC_CACHE_OVERFLOW_VALUE 2053
/*! cache: page split during eviction deepened the tree */
#define WT_STAT_DSRC_CACHE_EVICTION_DEEPEN 2053
#define WT_STAT_DSRC_CACHE_EVICTION_DEEPEN 2054
/*! cache: page written requiring lookaside records */
#define WT_STAT_DSRC_CACHE_WRITE_LOOKASIDE 2054
#define WT_STAT_DSRC_CACHE_WRITE_LOOKASIDE 2055
/*! cache: pages read into cache */
#define WT_STAT_DSRC_CACHE_READ 2055
#define WT_STAT_DSRC_CACHE_READ 2056
/*! cache: pages read into cache requiring lookaside entries */
#define WT_STAT_DSRC_CACHE_READ_LOOKASIDE 2056
#define WT_STAT_DSRC_CACHE_READ_LOOKASIDE 2057
/*! cache: pages written from cache */
#define WT_STAT_DSRC_CACHE_WRITE 2057
#define WT_STAT_DSRC_CACHE_WRITE 2058
/*! cache: pages written requiring in-memory restoration */
#define WT_STAT_DSRC_CACHE_WRITE_RESTORE 2058
#define WT_STAT_DSRC_CACHE_WRITE_RESTORE 2059
/*! cache: unmodified pages evicted */
#define WT_STAT_DSRC_CACHE_EVICTION_CLEAN 2059
#define WT_STAT_DSRC_CACHE_EVICTION_CLEAN 2060
/*! compression: compressed pages read */
#define WT_STAT_DSRC_COMPRESS_READ 2060
#define WT_STAT_DSRC_COMPRESS_READ 2061
/*! compression: compressed pages written */
#define WT_STAT_DSRC_COMPRESS_WRITE 2061
#define WT_STAT_DSRC_COMPRESS_WRITE 2062
/*! compression: page written failed to compress */
#define WT_STAT_DSRC_COMPRESS_WRITE_FAIL 2062
#define WT_STAT_DSRC_COMPRESS_WRITE_FAIL 2063
/*! compression: page written was too small to compress */
#define WT_STAT_DSRC_COMPRESS_WRITE_TOO_SMALL 2063
#define WT_STAT_DSRC_COMPRESS_WRITE_TOO_SMALL 2064
/*! compression: raw compression call failed, additional data available */
#define WT_STAT_DSRC_COMPRESS_RAW_FAIL_TEMPORARY 2064
#define WT_STAT_DSRC_COMPRESS_RAW_FAIL_TEMPORARY 2065
/*! compression: raw compression call failed, no additional data available */
#define WT_STAT_DSRC_COMPRESS_RAW_FAIL 2065
#define WT_STAT_DSRC_COMPRESS_RAW_FAIL 2066
/*! compression: raw compression call succeeded */
#define WT_STAT_DSRC_COMPRESS_RAW_OK 2066
#define WT_STAT_DSRC_COMPRESS_RAW_OK 2067
/*! cursor: bulk-loaded cursor-insert calls */
#define WT_STAT_DSRC_CURSOR_INSERT_BULK 2067
#define WT_STAT_DSRC_CURSOR_INSERT_BULK 2068
/*! cursor: create calls */
#define WT_STAT_DSRC_CURSOR_CREATE 2068
#define WT_STAT_DSRC_CURSOR_CREATE 2069
/*! cursor: cursor-insert key and value bytes inserted */
#define WT_STAT_DSRC_CURSOR_INSERT_BYTES 2069
#define WT_STAT_DSRC_CURSOR_INSERT_BYTES 2070
/*! cursor: cursor-remove key bytes removed */
#define WT_STAT_DSRC_CURSOR_REMOVE_BYTES 2070
#define WT_STAT_DSRC_CURSOR_REMOVE_BYTES 2071
/*! cursor: cursor-update value bytes updated */
#define WT_STAT_DSRC_CURSOR_UPDATE_BYTES 2071
#define WT_STAT_DSRC_CURSOR_UPDATE_BYTES 2072
/*! cursor: insert calls */
#define WT_STAT_DSRC_CURSOR_INSERT 2072
#define WT_STAT_DSRC_CURSOR_INSERT 2073
/*! cursor: next calls */
#define WT_STAT_DSRC_CURSOR_NEXT 2073
#define WT_STAT_DSRC_CURSOR_NEXT 2074
/*! cursor: prev calls */
#define WT_STAT_DSRC_CURSOR_PREV 2074
#define WT_STAT_DSRC_CURSOR_PREV 2075
/*! cursor: remove calls */
#define WT_STAT_DSRC_CURSOR_REMOVE 2075
#define WT_STAT_DSRC_CURSOR_REMOVE 2076
/*! cursor: reset calls */
#define WT_STAT_DSRC_CURSOR_RESET 2076
#define WT_STAT_DSRC_CURSOR_RESET 2077
/*! cursor: restarted searches */
#define WT_STAT_DSRC_CURSOR_RESTART 2077
#define WT_STAT_DSRC_CURSOR_RESTART 2078
/*! cursor: search calls */
#define WT_STAT_DSRC_CURSOR_SEARCH 2078
#define WT_STAT_DSRC_CURSOR_SEARCH 2079
/*! cursor: search near calls */
#define WT_STAT_DSRC_CURSOR_SEARCH_NEAR 2079
#define WT_STAT_DSRC_CURSOR_SEARCH_NEAR 2080
/*! cursor: truncate calls */
#define WT_STAT_DSRC_CURSOR_TRUNCATE 2080
#define WT_STAT_DSRC_CURSOR_TRUNCATE 2081
/*! cursor: update calls */
#define WT_STAT_DSRC_CURSOR_UPDATE 2081
#define WT_STAT_DSRC_CURSOR_UPDATE 2082
/*! reconciliation: dictionary matches */
#define WT_STAT_DSRC_REC_DICTIONARY 2082
#define WT_STAT_DSRC_REC_DICTIONARY 2083
/*! reconciliation: fast-path pages deleted */
#define WT_STAT_DSRC_REC_PAGE_DELETE_FAST 2083
#define WT_STAT_DSRC_REC_PAGE_DELETE_FAST 2084
/*! reconciliation: internal page key bytes discarded using suffix
* compression */
#define WT_STAT_DSRC_REC_SUFFIX_COMPRESSION 2084
#define WT_STAT_DSRC_REC_SUFFIX_COMPRESSION 2085
/*! reconciliation: internal page multi-block writes */
#define WT_STAT_DSRC_REC_MULTIBLOCK_INTERNAL 2085
#define WT_STAT_DSRC_REC_MULTIBLOCK_INTERNAL 2086
/*! reconciliation: internal-page overflow keys */
#define WT_STAT_DSRC_REC_OVERFLOW_KEY_INTERNAL 2086
#define WT_STAT_DSRC_REC_OVERFLOW_KEY_INTERNAL 2087
/*! reconciliation: leaf page key bytes discarded using prefix compression */
#define WT_STAT_DSRC_REC_PREFIX_COMPRESSION 2087
#define WT_STAT_DSRC_REC_PREFIX_COMPRESSION 2088
/*! reconciliation: leaf page multi-block writes */
#define WT_STAT_DSRC_REC_MULTIBLOCK_LEAF 2088
#define WT_STAT_DSRC_REC_MULTIBLOCK_LEAF 2089
/*! reconciliation: leaf-page overflow keys */
#define WT_STAT_DSRC_REC_OVERFLOW_KEY_LEAF 2089
#define WT_STAT_DSRC_REC_OVERFLOW_KEY_LEAF 2090
/*! reconciliation: maximum blocks required for a page */
#define WT_STAT_DSRC_REC_MULTIBLOCK_MAX 2090
#define WT_STAT_DSRC_REC_MULTIBLOCK_MAX 2091
/*! reconciliation: overflow values written */
#define WT_STAT_DSRC_REC_OVERFLOW_VALUE 2091
#define WT_STAT_DSRC_REC_OVERFLOW_VALUE 2092
/*! reconciliation: page checksum matches */
#define WT_STAT_DSRC_REC_PAGE_MATCH 2092
#define WT_STAT_DSRC_REC_PAGE_MATCH 2093
/*! reconciliation: page reconciliation calls */
#define WT_STAT_DSRC_REC_PAGES 2093
#define WT_STAT_DSRC_REC_PAGES 2094
/*! reconciliation: page reconciliation calls for eviction */
#define WT_STAT_DSRC_REC_PAGES_EVICTION 2094
#define WT_STAT_DSRC_REC_PAGES_EVICTION 2095
/*! reconciliation: pages deleted */
#define WT_STAT_DSRC_REC_PAGE_DELETE 2095
#define WT_STAT_DSRC_REC_PAGE_DELETE 2096
/*! session: object compaction */
#define WT_STAT_DSRC_SESSION_COMPACT 2096
#define WT_STAT_DSRC_SESSION_COMPACT 2097
/*! session: open cursor count */
#define WT_STAT_DSRC_SESSION_CURSOR_OPEN 2097
#define WT_STAT_DSRC_SESSION_CURSOR_OPEN 2098
/*! transaction: update conflicts */
#define WT_STAT_DSRC_TXN_UPDATE_CONFLICT 2098
#define WT_STAT_DSRC_TXN_UPDATE_CONFLICT 2099
/*!
* @}

@@ -8,6 +8,8 @@
#include "wt_internal.h"
static int __log_openfile(
WT_SESSION_IMPL *, bool, WT_FH **, const char *, uint32_t);
static int __log_write_internal(
WT_SESSION_IMPL *, WT_ITEM *, WT_LSN *, uint32_t);
@@ -93,8 +95,9 @@ __wt_log_background(WT_SESSION_IMPL *session, WT_LSN *lsn)
int
__wt_log_force_sync(WT_SESSION_IMPL *session, WT_LSN *min_lsn)
{
WT_LOG *log;
WT_DECL_RET;
WT_FH *log_fh;
WT_LOG *log;
log = S2C(session)->log;
@@ -129,12 +132,21 @@ __wt_log_force_sync(WT_SESSION_IMPL *session, WT_LSN *min_lsn)
* Sync the log file if needed.
*/
if (__wt_log_cmp(&log->sync_lsn, min_lsn) < 0) {
/*
* Get our own file handle to the log file. It is possible
* for the file handle in the log structure to change out
* from under us and either be NULL or point to a different
* file than we want.
*/
WT_ERR(__log_openfile(session,
false, &log_fh, WT_LOG_FILENAME, min_lsn->l.file));
WT_ERR(__wt_verbose(session, WT_VERB_LOG,
"log_force_sync: sync %s to LSN %" PRIu32 "/%" PRIu32,
log->log_fh->name, min_lsn->l.file, min_lsn->l.offset));
WT_ERR(__wt_fsync(session, log->log_fh, true));
log_fh->name, min_lsn->l.file, min_lsn->l.offset));
WT_ERR(__wt_fsync(session, log_fh, true));
log->sync_lsn = *min_lsn;
WT_STAT_FAST_CONN_INCR(session, log_sync);
WT_ERR(__wt_close(session, &log_fh));
WT_ERR(__wt_cond_signal(session, log->log_sync_cond));
}
err:
@@ -763,6 +775,7 @@ __log_newfile(WT_SESSION_IMPL *session, bool conn_open, bool *created)
{
WT_CONNECTION_IMPL *conn;
WT_DECL_RET;
WT_FH *log_fh;
WT_LOG *log;
WT_LSN end_lsn;
int yield_cnt;
@@ -835,8 +848,15 @@ __log_newfile(WT_SESSION_IMPL *session, bool conn_open, bool *created)
WT_RET(__wt_log_allocfile(
session, log->fileid, WT_LOG_FILENAME));
}
/*
* Since the file system clears the output file handle pointer before
* searching the handle list and filling in the new file handle,
* we must pass in a local file handle. Otherwise there is a wide
* window where another thread could see a NULL log file handle.
*/
WT_RET(__log_openfile(session,
false, &log->log_fh, WT_LOG_FILENAME, log->fileid));
false, &log_fh, WT_LOG_FILENAME, log->fileid));
WT_PUBLISH(log->log_fh, log_fh);
/*
* We need to set up the LSNs. Set the end LSN and alloc LSN to
* the end of the header.
@@ -1153,6 +1173,8 @@ __wt_log_open(WT_SESSION_IMPL *session)
err: if (logfiles != NULL)
__wt_log_files_free(session, logfiles, logcount);
if (ret == 0)
F_SET(log, WT_LOG_OPENED);
return (ret);
}
@@ -1193,6 +1215,7 @@ __wt_log_close(WT_SESSION_IMPL *session)
WT_RET(__wt_close(session, &log->log_dir_fh));
log->log_dir_fh = NULL;
}
F_CLR(log, WT_LOG_OPENED);
return (0);
}
@@ -1806,9 +1829,10 @@ __wt_log_write(WT_SESSION_IMPL *session, WT_ITEM *record, WT_LSN *lsnp,
/*
* An error during opening the logging subsystem can result in it
* being enabled, but without an open log file. In that case,
* just return.
* just return. We can also have logging opened for reading in a
* read-only database and attempt to write a record on close.
*/
if (log->log_fh == NULL)
if (!F_ISSET(log, WT_LOG_OPENED) || F_ISSET(conn, WT_CONN_READONLY))
return (0);
ip = record;
if ((compressor = conn->log_compressor) != NULL &&
@@ -2128,9 +2152,18 @@ __wt_log_flush(WT_SESSION_IMPL *session, uint32_t flags)
* We need to flush out the current slot first to get the real
* end of log LSN in log->alloc_lsn.
*/
WT_RET(__wt_log_flush_lsn(session, &lsn, 0));
WT_RET(__wt_log_flush_lsn(session, &lsn, false));
last_lsn = log->alloc_lsn;
/*
* If the last write caused a switch to a new log file, we should only
* wait for the last write to be flushed. Otherwise, if the workload
* is single-threaded we could wait here forever because the write LSN
* doesn't switch into the new file until it contains a record.
*/
if (last_lsn.l.offset == WT_LOG_FIRST_RECORD)
last_lsn = log->log_close_lsn;
/*
* Wait until all current outstanding writes have been written
* to the file system.
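
Reviewer note on the `__log_newfile` hunk above: the new comment says the open call clears its output pointer before filling it in, so the change opens into a local handle and only then publishes it with `WT_PUBLISH`. A minimal sketch of that pattern follows. Everything named `sketch_*` is hypothetical, and C11 atomics stand in for `WT_PUBLISH`; this is an illustration of the idea, not the WiredTiger implementation.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the file handle and the log structure. */
struct sketch_fh {
	unsigned fileid;
};
struct sketch_log {
	_Atomic(struct sketch_fh *) log_fh;	/* Read by other threads. */
};

/*
 * sketch_openfile --
 *	Hypothetical opener that clears the output pointer before filling
 *	it in, which is the behavior the comment in the diff describes.
 */
static int
sketch_openfile(struct sketch_fh **fhp, unsigned fileid)
{
	struct sketch_fh *fh;

	*fhp = NULL;		/* A shared pointer would be NULL from here... */
	if ((fh = malloc(sizeof(*fh))) == NULL)
		return (ENOMEM);
	fh->fileid = fileid;
	*fhp = fh;		/* ...until here. */
	return (0);
}

/*
 * sketch_switch_file --
 *	Open into a local handle, then publish it in a single step so that
 *	readers never observe the transient NULL. The previous handle is
 *	returned to the caller through oldp.
 */
static int
sketch_switch_file(
    struct sketch_log *log, unsigned fileid, struct sketch_fh **oldp)
{
	struct sketch_fh *fh;
	int ret;

	if ((ret = sketch_openfile(&fh, fileid)) != 0)
		return (ret);
	*oldp = atomic_exchange_explicit(
	    &log->log_fh, fh, memory_order_acq_rel);
	return (0);
}
```

Passing `&log->log_fh` directly to the opener, as the old code did, would expose the cleared pointer to any concurrent reader for the duration of the open.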

@@ -94,6 +94,17 @@ retry:
if (WT_LOG_SLOT_DONE(new_state))
*releasep = 1;
slot->slot_end_lsn = slot->slot_start_lsn;
/*
* A thread setting the unbuffered flag sets the unbuffered size after
* setting the flag. There could be a delay between a thread setting
* the flag, a thread closing the slot, and the original thread setting
* that value. If the state is unbuffered, wait for the unbuffered
* size to be set.
*/
while (WT_LOG_SLOT_UNBUFFERED_ISSET(old_state) &&
slot->slot_unbuffered == 0)
__wt_yield();
end_offset =
WT_LOG_SLOT_JOINED_BUFFERED(old_state) + slot->slot_unbuffered;
slot->slot_end_lsn.l.offset += (uint32_t)end_offset;

@@ -210,7 +210,7 @@ __clsm_enter(WT_CURSOR_LSM *clsm, bool reset, bool update)
goto open;
if (txn->isolation == WT_ISO_SNAPSHOT)
__wt_txn_cursor_op(session);
WT_RET(__wt_txn_cursor_op(session));
/*
* Figure out how many updates are required for

@@ -289,7 +289,8 @@ __wt_lsm_checkpoint_chunk(WT_SESSION_IMPL *session,
}
/* Stop if a running transaction needs the chunk. */
__wt_txn_update_oldest(session, true);
WT_RET(__wt_txn_update_oldest(
session, WT_TXN_OLDEST_STRICT | WT_TXN_OLDEST_WAIT));
if (chunk->switch_txn == WT_TXN_NONE ||
!__wt_txn_visible_all(session, chunk->switch_txn)) {
WT_RET(__wt_verbose(session, WT_VERB_LSM,

@@ -36,7 +36,7 @@ __wt_posix_directory_list(WT_SESSION_IMPL *session, const char *dir,
dirsz = 0;
entries = NULL;
WT_SYSCALL_RETRY(((dirp = opendir(path)) == NULL ? 1 : 0), ret);
WT_SYSCALL_RETRY(((dirp = opendir(path)) == NULL ? -1 : 0), ret);
if (ret != 0)
WT_ERR_MSG(session, ret, "%s: directory-list: opendir", path);

@@ -52,7 +52,7 @@ __posix_sync(WT_SESSION_IMPL *session,
* "This is currently implemented on HFS, MS-DOS (FAT), and Universal
* Disk Format (UDF) file systems."
*/
WT_SYSCALL_RETRY(fcntl(fd, F_FULLFSYNC, 0), ret);
WT_SYSCALL_RETRY(fcntl(fd, F_FULLFSYNC, 0) == -1 ? -1 : 0, ret);
if (ret == 0)
return (0);
/*
@@ -107,7 +107,7 @@ __posix_directory_sync(WT_SESSION_IMPL *session, const char *path)
}
WT_SYSCALL_RETRY((
(fd = open(path, O_RDONLY, 0444)) == -1 ? 1 : 0), ret);
(fd = open(path, O_RDONLY, 0444)) == -1 ? -1 : 0), ret);
if (ret != 0)
WT_ERR_MSG(session, ret, "%s: directory-sync: open", path);
@@ -172,14 +172,19 @@ __posix_file_remove(WT_SESSION_IMPL *session, const char *name)
#endif
WT_RET(__wt_filename(session, name, &path));
name = path;
WT_SYSCALL_RETRY(remove(name), ret);
if (ret != 0)
__wt_err(session, ret, "%s: file-remove: remove", name);
/*
* ISO C doesn't require remove return -1 on failure or set errno (note
* POSIX 1003.1 extends C with those requirements). Regardless, use the
* unlink system call, instead of remove, to simplify error handling;
* where we're not doing any special checking for standards compliance,
* using unlink may be marginally safer.
*/
WT_SYSCALL_RETRY(unlink(path), ret);
__wt_free(session, path);
return (ret);
if (ret == 0)
return (0);
WT_RET_MSG(session, ret, "%s: file-remove: unlink", name);
}
/*
@@ -203,18 +208,22 @@ __posix_file_rename(WT_SESSION_IMPL *session, const char *from, const char *to)
from_path = to_path = NULL;
WT_ERR(__wt_filename(session, from, &from_path));
from = from_path;
WT_ERR(__wt_filename(session, to, &to_path));
to = to_path;
WT_SYSCALL_RETRY(rename(from, to), ret);
if (ret != 0)
__wt_err(session, ret,
"%s to %s: file-rename: rename", from, to);
/*
* ISO C doesn't require rename return -1 on failure or set errno (note
* POSIX 1003.1 extends C with those requirements). Be cautious, force
* any non-zero return to -1 so we'll check errno. We can still end up
* with the wrong errno (if errno is garbage), or the generic WT_ERROR
* return (if errno is 0), but we've done the best we can.
*/
WT_SYSCALL_RETRY(rename(from_path, to_path) != 0 ? -1 : 0, ret);
err: __wt_free(session, from_path);
__wt_free(session, to_path);
return (ret);
if (ret == 0)
return (0);
WT_RET_MSG(session, ret, "%s to %s: file-rename: rename", from, to);
}
/*
@@ -360,7 +369,7 @@ __posix_handle_lock(WT_SESSION_IMPL *session, WT_FH *fh, bool lock)
fl.l_type = lock ? F_WRLCK : F_UNLCK;
fl.l_whence = SEEK_SET;
WT_SYSCALL_RETRY(fcntl(fh->fd, F_SETLK, &fl), ret);
WT_SYSCALL_RETRY(fcntl(fh->fd, F_SETLK, &fl) == -1 ? -1 : 0, ret);
if (ret == 0)
return (0);
WT_RET_MSG(session, ret, "%s: handle-lock: fcntl", fh->name);
@@ -560,7 +569,7 @@ __posix_handle_open(WT_SESSION_IMPL *session,
f |= O_CLOEXEC;
#endif
WT_SYSCALL_RETRY((
(fd = open(name, f, 0444)) == -1 ? 1 : 0), ret);
(fd = open(name, f, 0444)) == -1 ? -1 : 0), ret);
if (ret != 0)
WT_ERR_MSG(session, ret, "%s: handle-open: open", name);
WT_ERR(__posix_handle_open_cloexec(session, fd, name));
@@ -622,7 +631,7 @@ __posix_handle_open(WT_SESSION_IMPL *session,
#endif
}
WT_SYSCALL_RETRY(((fd = open(name, f, mode)) == -1 ? 1 : 0), ret);
WT_SYSCALL_RETRY(((fd = open(name, f, mode)) == -1 ? -1 : 0), ret);
if (ret != 0)
WT_ERR_MSG(session, ret,
direct_io ?
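
Reviewer note on the POSIX hunks above: the WT-2672 commit message defines success as a 0 return and splits failure into two cases, using `errno` when the call returns -1 and otherwise treating the return value itself as the error. That is why each call site now maps its failure to -1 rather than 1. A minimal sketch of the convention follows, assuming a hypothetical helper `sketch_map_error`, a placeholder generic error value, and a simplified retry policy (EINTR only, bounded attempts); it is not the `WT_SYSCALL_RETRY` macro itself.

```c
#include <errno.h>
#include <unistd.h>

/* Placeholder for a generic error code; the value is arbitrary. */
#define	SKETCH_GENERIC_ERROR	(-1000)

/*
 * sketch_map_error --
 *	Hypothetical helper: map a call's return to an error value.
 *	0 is success, -1 means "look at errno", and any other value is
 *	taken to be the error itself.
 */
static int
sketch_map_error(int call_ret)
{
	if (call_ret == 0)
		return (0);
	if (call_ret == -1)
		return (errno != 0 ? errno : SKETCH_GENERIC_ERROR);
	return (call_ret);
}

/*
 * sketch_unlink_retry --
 *	Remove a file, retrying a bounded number of times if interrupted.
 *	The retry policy is an assumption made for this sketch.
 */
static int
sketch_unlink_retry(const char *path)
{
	int ret, tries;

	ret = 0;
	for (tries = 0; tries < 10; ++tries) {
		errno = 0;
		ret = sketch_map_error(unlink(path));
		if (ret != EINTR)
			break;
	}
	return (ret);
}
```

Under this convention, an expression such as `(fd = open(...)) == -1 ? 1 : 0` would report failure as the literal value 1 instead of consulting `errno`, which appears to be what the `? -1 : 0` changes correct.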

@@ -98,6 +98,7 @@ __posix_map_preload_madvise(
if (size <= (size_t)conn->page_size ||
(ret = posix_madvise(blk, size, POSIX_MADV_WILLNEED)) == 0)
return (0);
WT_RET_MSG(session, ret,
"%s: memory-map preload: posix_madvise: POSIX_MADV_WILLNEED",
fh->name);
@@ -145,6 +146,7 @@ __posix_map_discard_madvise(
if ((ret = posix_madvise(blk, size, POSIX_MADV_DONTNEED)) == 0)
return (0);
WT_RET_MSG(session, ret,
"%s: memory-map discard: posix_madvise: POSIX_MADV_DONTNEED",
fh->name);

@@ -286,11 +286,6 @@ __win_handle_lock(WT_SESSION_IMPL *session, WT_FH *fh, bool lock)
* WiredTiger requires this function be able to acquire locks past
* the end of file.
*
* Note we're using fcntl(2) locking: all fcntl locks associated with a
* file for a given process are removed when any file descriptor for the
* file is closed by the process, even if a lock was never requested for
* that file descriptor.
*
* http://msdn.microsoft.com/
* en-us/library/windows/desktop/aa365202%28v=vs.85%29.aspx
*

@@ -722,18 +722,29 @@ __wt_session_drop(WT_SESSION_IMPL *session, const char *uri, const char *cfg[])
{
WT_DECL_RET;
WT_CONFIG_ITEM cval;
bool lock_wait;
bool checkpoint_wait, lock_wait;
WT_RET(__wt_config_gets_def(session, cfg, "checkpoint_wait", 1, &cval));
checkpoint_wait = cval.val != 0;
WT_RET(__wt_config_gets_def(session, cfg, "lock_wait", 1, &cval));
lock_wait = cval.val != 0 || F_ISSET(session, WT_SESSION_LOCK_NO_WAIT);
if (!lock_wait)
F_SET(session, WT_SESSION_LOCK_NO_WAIT);
WT_WITH_CHECKPOINT_LOCK(session, ret,
WT_WITH_SCHEMA_LOCK(session, ret,
WT_WITH_TABLE_LOCK(session, ret,
ret = __wt_schema_drop(session, uri, cfg))));
/*
* The checkpoint lock is only needed to avoid a spurious EBUSY error
* return.
*/
if (checkpoint_wait)
WT_WITH_CHECKPOINT_LOCK(session, ret,
WT_WITH_SCHEMA_LOCK(session, ret,
WT_WITH_TABLE_LOCK(session, ret,
ret = __wt_schema_drop(session, uri, cfg))));
else
WT_WITH_SCHEMA_LOCK(session, ret,
WT_WITH_TABLE_LOCK(session, ret,
ret = __wt_schema_drop(session, uri, cfg)));
if (!lock_wait)
F_CLR(session, WT_SESSION_LOCK_NO_WAIT);

@@ -769,3 +769,10 @@ FUNC_START(__crc32_vpmsum)
FUNC_END(__crc32_vpmsum)
#endif
/*
* Make sure the stack isn't executable with GCC (regardless of platform).
*/
#ifndef __clang__
.section .note.GNU-stack,"",@progbits
#endif

@@ -43,6 +43,7 @@ static const char * const __stats_dsrc_desc[] = {
"btree: pages rewritten by compaction",
"btree: row-store internal pages",
"btree: row-store leaf pages",
"cache: bytes currently in the cache",
"cache: bytes read into cache",
"cache: bytes written from cache",
"cache: checkpoint blocked page eviction",
@@ -172,6 +173,7 @@ __wt_stat_dsrc_clear_single(WT_DSRC_STATS *stats)
stats->btree_compact_rewrite = 0;
stats->btree_row_internal = 0;
stats->btree_row_leaf = 0;
/* not clearing cache_bytes_inuse */
stats->cache_bytes_read = 0;
stats->cache_bytes_write = 0;
stats->cache_eviction_checkpoint = 0;
@@ -298,6 +300,7 @@ __wt_stat_dsrc_aggregate_single(
to->btree_compact_rewrite += from->btree_compact_rewrite;
to->btree_row_internal += from->btree_row_internal;
to->btree_row_leaf += from->btree_row_leaf;
to->cache_bytes_inuse += from->cache_bytes_inuse;
to->cache_bytes_read += from->cache_bytes_read;
to->cache_bytes_write += from->cache_bytes_write;
to->cache_eviction_checkpoint += from->cache_eviction_checkpoint;
@@ -430,6 +433,7 @@ __wt_stat_dsrc_aggregate(
WT_STAT_READ(from, btree_compact_rewrite);
to->btree_row_internal += WT_STAT_READ(from, btree_row_internal);
to->btree_row_leaf += WT_STAT_READ(from, btree_row_leaf);
to->cache_bytes_inuse += WT_STAT_READ(from, cache_bytes_inuse);
to->cache_bytes_read += WT_STAT_READ(from, cache_bytes_read);
to->cache_bytes_write += WT_STAT_READ(from, cache_bytes_write);
to->cache_eviction_checkpoint +=
@@ -551,6 +555,8 @@ static const char * const __stats_connection_desc[] = {
"cache: eviction server unable to reach eviction goal",
"cache: eviction worker thread evicting pages",
"cache: failed eviction of pages that exceeded the in-memory maximum",
"cache: files with active eviction walks",
"cache: files with new eviction walks started",
"cache: hazard pointer blocked page eviction",
"cache: in-memory page passed criteria to be split",
"cache: in-memory page splits",
@@ -562,14 +568,18 @@ static const char * const __stats_connection_desc[] = {
"cache: maximum bytes configured",
"cache: maximum page size at eviction",
"cache: modified pages evicted",
"cache: modified pages evicted by application threads",
"cache: page split during eviction deepened the tree",
"cache: page written requiring lookaside records",
"cache: pages currently held in the cache",
"cache: pages evicted because they exceeded the in-memory maximum",
"cache: pages evicted because they had chains of deleted items",
"cache: pages evicted by application threads",
"cache: pages queued for eviction",
"cache: pages queued for urgent eviction",
"cache: pages read into cache",
"cache: pages read into cache requiring lookaside entries",
"cache: pages seen by eviction walk",
"cache: pages selected for eviction unable to be evicted",
"cache: pages walked for eviction",
"cache: pages written from cache",
@@ -748,6 +758,8 @@ __wt_stat_connection_clear_single(WT_CONNECTION_STATS *stats)
stats->cache_eviction_slow = 0;
stats->cache_eviction_worker_evicting = 0;
stats->cache_eviction_force_fail = 0;
/* not clearing cache_eviction_walks_active */
stats->cache_eviction_walks_started = 0;
stats->cache_eviction_hazard = 0;
stats->cache_inmem_splittable = 0;
stats->cache_inmem_split = 0;
@@ -759,14 +771,18 @@ __wt_stat_connection_clear_single(WT_CONNECTION_STATS *stats)
/* not clearing cache_bytes_max */
/* not clearing cache_eviction_maximum_page_size */
stats->cache_eviction_dirty = 0;
stats->cache_eviction_app_dirty = 0;
stats->cache_eviction_deepen = 0;
stats->cache_write_lookaside = 0;
/* not clearing cache_pages_inuse */
stats->cache_eviction_force = 0;
stats->cache_eviction_force_delete = 0;
stats->cache_eviction_app = 0;
stats->cache_eviction_pages_queued = 0;
stats->cache_eviction_pages_queued_oldest = 0;
stats->cache_read = 0;
stats->cache_read_lookaside = 0;
stats->cache_eviction_pages_seen = 0;
stats->cache_eviction_fail = 0;
stats->cache_eviction_walk = 0;
stats->cache_write = 0;
@@ -943,6 +959,10 @@ __wt_stat_connection_aggregate(
WT_STAT_READ(from, cache_eviction_worker_evicting);
to->cache_eviction_force_fail +=
WT_STAT_READ(from, cache_eviction_force_fail);
to->cache_eviction_walks_active +=
WT_STAT_READ(from, cache_eviction_walks_active);
to->cache_eviction_walks_started +=
WT_STAT_READ(from, cache_eviction_walks_started);
to->cache_eviction_hazard +=
WT_STAT_READ(from, cache_eviction_hazard);
to->cache_inmem_splittable +=
@@ -962,6 +982,8 @@ __wt_stat_connection_aggregate(
to->cache_eviction_maximum_page_size +=
WT_STAT_READ(from, cache_eviction_maximum_page_size);
to->cache_eviction_dirty += WT_STAT_READ(from, cache_eviction_dirty);
to->cache_eviction_app_dirty +=
WT_STAT_READ(from, cache_eviction_app_dirty);
to->cache_eviction_deepen +=
WT_STAT_READ(from, cache_eviction_deepen);
to->cache_write_lookaside +=
@@ -971,8 +993,14 @@ __wt_stat_connection_aggregate(
to->cache_eviction_force_delete +=
WT_STAT_READ(from, cache_eviction_force_delete);
to->cache_eviction_app += WT_STAT_READ(from, cache_eviction_app);
to->cache_eviction_pages_queued +=
WT_STAT_READ(from, cache_eviction_pages_queued);
to->cache_eviction_pages_queued_oldest +=
WT_STAT_READ(from, cache_eviction_pages_queued_oldest);
to->cache_read += WT_STAT_READ(from, cache_read);
to->cache_read_lookaside += WT_STAT_READ(from, cache_read_lookaside);
to->cache_eviction_pages_seen +=
WT_STAT_READ(from, cache_eviction_pages_seen);
to->cache_eviction_fail += WT_STAT_READ(from, cache_eviction_fail);
to->cache_eviction_walk += WT_STAT_READ(from, cache_eviction_walk);
to->cache_write += WT_STAT_READ(from, cache_write);

@@ -108,17 +108,17 @@ __wt_txn_release_snapshot(WT_SESSION_IMPL *session)
* __wt_txn_get_snapshot --
* Allocate a snapshot.
*/
void
int
__wt_txn_get_snapshot(WT_SESSION_IMPL *session)
{
WT_CONNECTION_IMPL *conn;
WT_DECL_RET;
WT_TXN *txn;
WT_TXN_GLOBAL *txn_global;
WT_TXN_STATE *s, *txn_state;
uint64_t current_id, id;
uint64_t prev_oldest_id, snap_min;
uint32_t i, n, session_cnt;
int32_t count;
conn = S2C(session);
txn = &session->txn;
@@ -126,15 +126,13 @@ __wt_txn_get_snapshot(WT_SESSION_IMPL *session)
txn_state = WT_SESSION_TXN_STATE(session);
/*
* We're going to scan. Increment the count of scanners to prevent the
* oldest ID from moving forwards. Spin if the count is negative,
* which indicates that some thread is moving the oldest ID forwards.
* Spin waiting for the lock: the sleeps in our blocking readlock
* implementation are too slow for scanning the transaction table.
*/
do {
if ((count = txn_global->scan_count) < 0)
WT_PAUSE();
} while (count < 0 ||
!__wt_atomic_casiv32(&txn_global->scan_count, count, count + 1));
while ((ret =
__wt_try_readlock(session, txn_global->scan_rwlock)) == EBUSY)
WT_PAUSE();
WT_RET(ret);
current_id = snap_min = txn_global->current;
prev_oldest_id = txn_global->oldest_id;
@@ -145,11 +143,9 @@ __wt_txn_get_snapshot(WT_SESSION_IMPL *session)
__txn_sort_snapshot(session, 0, current_id);
/* Check that the oldest ID has not moved in the meantime. */
if (prev_oldest_id == txn_global->oldest_id) {
WT_ASSERT(session, txn_global->scan_count > 0);
(void)__wt_atomic_subiv32(&txn_global->scan_count, 1);
return;
}
WT_ASSERT(session, prev_oldest_id == txn_global->oldest_id);
WT_RET(__wt_readunlock(session, txn_global->scan_rwlock));
return (0);
}
/* Walk the array of concurrent transactions. */
@@ -182,67 +178,35 @@ __wt_txn_get_snapshot(WT_SESSION_IMPL *session)
WT_ASSERT(session, prev_oldest_id == txn_global->oldest_id);
txn_state->snap_min = snap_min;
WT_ASSERT(session, txn_global->scan_count > 0);
(void)__wt_atomic_subiv32(&txn_global->scan_count, 1);
WT_RET(__wt_readunlock(session, txn_global->scan_rwlock));
__txn_sort_snapshot(session, n, current_id);
return (0);
}
/*
* __wt_txn_update_oldest --
* Sweep the running transactions to update the oldest ID required.
* !!!
* If a data-source is calling the WT_EXTENSION_API.transaction_oldest
* method (for the oldest transaction ID not yet visible to a running
* transaction), and then comparing that oldest ID against committed
* transactions to see if updates for a committed transaction are still
* visible to running transactions, the oldest transaction ID may be
* the same as the last committed transaction ID, if the transaction
* state wasn't refreshed after the last transaction committed. Push
* past the last committed transaction.
*/
void
__wt_txn_update_oldest(WT_SESSION_IMPL *session, bool force)
* __txn_oldest_scan --
* Sweep the running transactions to calculate the oldest ID required.
*/
static void
__txn_oldest_scan(WT_SESSION_IMPL *session,
uint64_t *oldest_idp, uint64_t *last_runningp,
WT_SESSION_IMPL **oldest_sessionp)
{
WT_CONNECTION_IMPL *conn;
WT_SESSION_IMPL *oldest_session;
WT_TXN_GLOBAL *txn_global;
WT_TXN_STATE *s;
uint64_t current_id, id, last_running, oldest_id, prev_oldest_id;
uint64_t id, last_running, oldest_id, prev_oldest_id;
uint32_t i, session_cnt;
int32_t count;
bool last_running_moved;
conn = S2C(session);
txn_global = &conn->txn_global;
retry:
current_id = last_running = txn_global->current;
oldest_session = NULL;
/* The oldest ID cannot change while we are holding the scan lock. */
prev_oldest_id = txn_global->oldest_id;
/*
* For pure read-only workloads, or if the update isn't forced and the
* oldest ID isn't too far behind, avoid scanning.
*/
if (prev_oldest_id == current_id ||
(!force && WT_TXNID_LT(current_id, prev_oldest_id + 100)))
return;
/*
* We're going to scan. Increment the count of scanners to prevent the
* oldest ID from moving forwards. Spin if the count is negative,
* which indicates that some thread is moving the oldest ID forwards.
*/
do {
if ((count = txn_global->scan_count) < 0)
WT_PAUSE();
} while (count < 0 ||
!__wt_atomic_casiv32(&txn_global->scan_count, count, count + 1));
/* The oldest ID cannot change until the scan count goes to zero. */
prev_oldest_id = txn_global->oldest_id;
current_id = oldest_id = last_running = txn_global->current;
oldest_id = last_running = txn_global->current;
/* Walk the array of concurrent transactions. */
WT_ORDERED_READ(session_cnt, conn->session_cnt);
@@ -264,7 +228,7 @@ retry:
* !!!
* Note: Don't ignore snap_min values older than the previous
* oldest ID. Read-uncommitted operations publish snap_min
* values without incrementing scan_count to protect the global
* values without acquiring the scan lock to protect the global
* table. See the comment in __wt_txn_cursor_op for
* more details.
*/
@@ -283,76 +247,121 @@ retry:
WT_TXNID_LT(id, oldest_id))
oldest_id = id;
/* Update the last running ID. */
last_running_moved =
WT_TXNID_LT(txn_global->last_running, last_running);
*oldest_idp = oldest_id;
*oldest_sessionp = oldest_session;
*last_runningp = last_running;
}
/* Update the oldest ID. */
if (WT_TXNID_LT(prev_oldest_id, oldest_id) || last_running_moved) {
/*
* We know we want to update. Check if we're racing.
*/
if (__wt_atomic_casiv32(&txn_global->scan_count, 1, -1)) {
WT_ORDERED_READ(session_cnt, conn->session_cnt);
for (i = 0, s = txn_global->states;
i < session_cnt; i++, s++) {
if ((id = s->id) != WT_TXN_NONE &&
WT_TXNID_LT(id, last_running))
last_running = id;
if ((id = s->snap_min) != WT_TXN_NONE &&
WT_TXNID_LT(id, oldest_id))
oldest_id = id;
}
/*
* __wt_txn_update_oldest --
* Sweep the running transactions to update the oldest ID required.
*/
int
__wt_txn_update_oldest(WT_SESSION_IMPL *session, uint32_t flags)
{
WT_CONNECTION_IMPL *conn;
WT_DECL_RET;
WT_SESSION_IMPL *oldest_session;
WT_TXN_GLOBAL *txn_global;
uint64_t current_id, last_running, oldest_id;
uint64_t prev_last_running, prev_oldest_id;
bool strict, wait;
if (WT_TXNID_LT(last_running, oldest_id))
oldest_id = last_running;
conn = S2C(session);
txn_global = &conn->txn_global;
strict = LF_ISSET(WT_TXN_OLDEST_STRICT);
wait = LF_ISSET(WT_TXN_OLDEST_WAIT);
current_id = last_running = txn_global->current;
prev_last_running = txn_global->last_running;
prev_oldest_id = txn_global->oldest_id;
/*
* For pure read-only workloads, or if the update isn't forced and the
* oldest ID isn't too far behind, avoid scanning.
*/
if (prev_oldest_id == current_id ||
(!strict && WT_TXNID_LT(current_id, prev_oldest_id + 100)))
return (0);
/* First do a read-only scan. */
if (wait)
WT_RET(__wt_readlock(session, txn_global->scan_rwlock));
else if ((ret =
__wt_try_readlock(session, txn_global->scan_rwlock)) != 0)
return (ret == EBUSY ? 0 : ret);
__txn_oldest_scan(session, &oldest_id, &last_running, &oldest_session);
WT_RET(__wt_readunlock(session, txn_global->scan_rwlock));
/*
* If the state hasn't changed (or hasn't moved far enough for
* non-forced updates), give up.
*/
if ((oldest_id == prev_oldest_id ||
(!strict && WT_TXNID_LT(oldest_id, prev_oldest_id + 100))) &&
((last_running == prev_last_running) ||
(!strict && WT_TXNID_LT(last_running, prev_last_running + 100))))
return (0);
/* It looks like an update is necessary, wait for exclusive access. */
if (wait)
WT_RET(__wt_writelock(session, txn_global->scan_rwlock));
else if ((ret =
__wt_try_writelock(session, txn_global->scan_rwlock)) != 0)
return (ret == EBUSY ? 0 : ret);
/*
* If the oldest ID has been updated while we waited, don't bother
* scanning.
*/
if (WT_TXNID_LE(oldest_id, txn_global->oldest_id) &&
WT_TXNID_LE(last_running, txn_global->last_running))
goto done;
/*
* Re-scan now that we have exclusive access. This is necessary because
* threads get transaction snapshots with read locks, and we have to be
* sure that there isn't a thread that has got a snapshot locally but
* not yet published its snap_min.
*/
__txn_oldest_scan(session, &oldest_id, &last_running, &oldest_session);
#ifdef HAVE_DIAGNOSTIC
/*
* Make sure the ID doesn't move past any named
* snapshots.
*
* Don't include the read/assignment in the assert
* statement. Coverity complains if there are
* assignments only done in diagnostic builds, and
* when the read is from a volatile.
*/
id = txn_global->nsnap_oldest_id;
WT_ASSERT(session,
id == WT_TXN_NONE || !WT_TXNID_LT(id, oldest_id));
{
/*
* Make sure the ID doesn't move past any named snapshots.
*
* Don't include the read/assignment in the assert statement. Coverity
* complains if there are assignments only done in diagnostic builds,
* and when the read is from a volatile.
*/
uint64_t id = txn_global->nsnap_oldest_id;
WT_ASSERT(session,
id == WT_TXN_NONE || !WT_TXNID_LT(id, oldest_id));
}
#endif
if (WT_TXNID_LT(txn_global->last_running, last_running))
txn_global->last_running = last_running;
if (WT_TXNID_LT(txn_global->oldest_id, oldest_id))
txn_global->oldest_id = oldest_id;
WT_ASSERT(session, txn_global->scan_count == -1);
txn_global->scan_count = 0;
} else {
/*
* We wanted to update the oldest ID but we're racing
* another thread. Retry if this is a forced update.
*/
WT_ASSERT(session, txn_global->scan_count > 0);
(void)__wt_atomic_subiv32(&txn_global->scan_count, 1);
if (force) {
__wt_yield();
goto retry;
}
}
} else {
/* Update the oldest ID. */
if (WT_TXNID_LT(txn_global->oldest_id, oldest_id))
txn_global->oldest_id = oldest_id;
if (WT_TXNID_LT(txn_global->last_running, last_running)) {
txn_global->last_running = last_running;
/* Output a verbose message about long-running transactions,
* but only when some progress is being made. */
if (WT_VERBOSE_ISSET(session, WT_VERB_TRANSACTION) &&
current_id - oldest_id > 10000 && oldest_session != NULL) {
(void)__wt_verbose(session, WT_VERB_TRANSACTION,
WT_TRET(__wt_verbose(session, WT_VERB_TRANSACTION,
"old snapshot %" PRIu64
" pinned in session %" PRIu32 " [%s]"
" with snap_min %" PRIu64 "\n",
oldest_id, oldest_session->id,
oldest_session->lastop,
oldest_session->txn.snap_min);
oldest_session->txn.snap_min));
}
WT_ASSERT(session, txn_global->scan_count > 0);
(void)__wt_atomic_subiv32(&txn_global->scan_count, 1);
}
done: WT_TRET(__wt_writeunlock(session, txn_global->scan_rwlock));
return (ret);
}
/*
@@ -735,6 +744,8 @@ __wt_txn_global_init(WT_SESSION_IMPL *session, const char *cfg[])
WT_RET(__wt_spin_init(session,
&txn_global->id_lock, "transaction id lock"));
WT_RET(__wt_rwlock_alloc(session,
&txn_global->scan_rwlock, "transaction scan lock"));
WT_RET(__wt_rwlock_alloc(session,
&txn_global->nsnap_rwlock, "named snapshot lock"));
txn_global->nsnap_oldest_id = WT_TXN_NONE;
@@ -768,6 +779,7 @@ __wt_txn_global_destroy(WT_SESSION_IMPL *session)
return (0);
__wt_spin_destroy(session, &txn_global->id_lock);
WT_TRET(__wt_rwlock_destroy(session, &txn_global->scan_rwlock));
WT_TRET(__wt_rwlock_destroy(session, &txn_global->nsnap_rwlock));
__wt_free(session, txn_global->states);

@@ -404,7 +404,8 @@ __txn_checkpoint(WT_SESSION_IMPL *session, const char *cfg[])
* This is particularly important for compact, so that all dirty pages
* can be fully written.
*/
__wt_txn_update_oldest(session, true);
WT_ERR(__wt_txn_update_oldest(
session, WT_TXN_OLDEST_STRICT | WT_TXN_OLDEST_WAIT));
/* Flush data-sources before we start the checkpoint. */
WT_ERR(__checkpoint_data_source(session, cfg));
@@ -792,6 +793,9 @@ __checkpoint_lock_tree(WT_SESSION_IMPL *session,
hot_backup_locked = false;
name_alloc = NULL;
/* Only referenced in diagnostic builds. */
WT_UNUSED(is_checkpoint);
/*
* Only referenced in diagnostic builds and gcc 5.1 isn't satisfied
* with wrapping the entire assert condition in the unused macro.
@@ -1281,7 +1285,8 @@ __wt_checkpoint_close(WT_SESSION_IMPL *session, bool final)
* for active readers.
*/
if (!btree->modified && !bulk) {
__wt_txn_update_oldest(session, true);
WT_RET(__wt_txn_update_oldest(
session, WT_TXN_OLDEST_STRICT | WT_TXN_OLDEST_WAIT));
return (__wt_txn_visible_all(session, btree->rec_max_txn) ?
__wt_cache_op(session, WT_SYNC_DISCARD) : EBUSY);
}

@@ -36,7 +36,7 @@
#include <unistd.h>
#endif
#include <wiredtiger.h>
#include <wt_internal.h>
#include "test_util.i"
@@ -44,7 +44,11 @@ static char home[512]; /* Program working dir */
static const char *progname; /* Program name */
static const char * const uri = "table:main";
#define RECORDS_FILE "records"
#define MAX_TH 12
#define MIN_TH 5
#define MAX_TIME 40
#define MIN_TIME 10
#define RECORDS_FILE "records-%u"
#define ENV_CONFIG \
"create,log=(file_max=10M,archive=false,enabled)," \
@@ -55,71 +59,66 @@ static const char * const uri = "table:main";
static void
usage(void)
{
fprintf(stderr, "usage: %s [-h dir]\n", progname);
fprintf(stderr, "usage: %s [-h dir] [-T threads]\n", progname);
exit(EXIT_FAILURE);
}
typedef struct {
WT_CONNECTION *conn;
uint64_t start;
uint32_t id;
} WT_THREAD_DATA;
/*
* Child process creates the database and table, and then writes data into
* the table until it is killed by the parent.
*/
static void
fill_db(void)
static void *
thread_run(void *arg)
{
FILE *fp;
WT_CONNECTION *conn;
WT_CURSOR *cursor;
WT_ITEM data;
WT_RAND_STATE rnd;
WT_SESSION *session;
WT_THREAD_DATA *td;
uint64_t i;
int ret;
uint8_t buf[MAX_VAL];
char buf[MAX_VAL], kname[64];
__wt_random_init(&rnd);
memset(buf, 0, sizeof(buf));
/*
* Initialize the first 25% to random values. Leave a bunch of data
* space at the end to emphasize zero data.
*/
for (i = 0; i < MAX_VAL/4; i++)
buf[i] = (uint8_t)__wt_random(&rnd);
memset(kname, 0, sizeof(kname));
td = (WT_THREAD_DATA *)arg;
/*
* Run in the home directory so that the records file is in there too.
* The value is the name of the record file with our id appended.
*/
if (chdir(home) != 0)
testutil_die(errno, "chdir: %s", home);
if ((ret = wiredtiger_open(NULL, NULL, ENV_CONFIG, &conn)) != 0)
testutil_die(ret, "wiredtiger_open");
if ((ret = conn->open_session(conn, NULL, NULL, &session)) != 0)
testutil_die(ret, "WT_CONNECTION:open_session");
if ((ret = session->create(session,
uri, "key_format=Q,value_format=u")) != 0)
testutil_die(ret, "WT_SESSION.create: %s", uri);
if ((ret =
session->open_cursor(session, uri, NULL, NULL, &cursor)) != 0)
testutil_die(ret, "WT_SESSION.open_cursor: %s", uri);
snprintf(buf, sizeof(buf), RECORDS_FILE, td->id);
/*
* Keep a separate file with the records we wrote for checking.
*/
(void)unlink(RECORDS_FILE);
if ((fp = fopen(RECORDS_FILE, "w")) == NULL)
(void)unlink(buf);
if ((fp = fopen(buf, "w")) == NULL)
testutil_die(errno, "fopen");
/*
* Set to no buffering.
*/
__wt_stream_set_no_buffer(fp);
/*
* Write data into the table until we are killed by the parent.
* The data in the buffer is already set to random content.
*/
if ((ret = td->conn->open_session(td->conn, NULL, NULL, &session)) != 0)
testutil_die(ret, "WT_CONNECTION:open_session");
if ((ret =
session->open_cursor(session, uri, NULL, NULL, &cursor)) != 0)
testutil_die(ret, "WT_SESSION.open_cursor: %s", uri);
data.data = buf;
for (i = 0;; ++i) {
data.size = sizeof(buf);
/*
* Write our portion of the key space until we're killed.
*/
for (i = td->start; ; ++i) {
snprintf(kname, sizeof(kname), "%" PRIu64, i);
data.size = __wt_random(&rnd) % MAX_VAL;
cursor->set_key(cursor, i);
cursor->set_key(cursor, kname);
cursor->set_value(cursor, &data);
if ((ret = cursor->insert(cursor)) != 0)
testutil_die(ret, "WT_CURSOR.insert");
@@ -128,9 +127,63 @@ fill_db(void)
*/
if (fprintf(fp, "%" PRIu64 "\n", i) == -1)
testutil_die(errno, "fprintf");
if (i % 5000)
__wt_yield();
}
return (NULL);
}
/*
* Child process creates the database and table, and then creates worker
* threads to add data until it is killed by the parent.
*/
static void fill_db(uint32_t)
WT_GCC_FUNC_DECL_ATTRIBUTE((noreturn));
static void
fill_db(uint32_t nth)
{
pthread_t *thr;
WT_CONNECTION *conn;
WT_SESSION *session;
WT_THREAD_DATA *td;
uint32_t i;
int ret;
thr = calloc(nth, sizeof(pthread_t));
td = calloc(nth, sizeof(WT_THREAD_DATA));
if (chdir(home) != 0)
testutil_die(errno, "Child chdir: %s", home);
if ((ret = wiredtiger_open(NULL, NULL, ENV_CONFIG, &conn)) != 0)
testutil_die(ret, "wiredtiger_open");
if ((ret = conn->open_session(conn, NULL, NULL, &session)) != 0)
testutil_die(ret, "WT_CONNECTION:open_session");
if ((ret = session->create(session,
uri, "key_format=S,value_format=u")) != 0)
testutil_die(ret, "WT_SESSION.create: %s", uri);
if ((ret = session->close(session, NULL)) != 0)
testutil_die(ret, "WT_SESSION:close");
printf("Create %" PRIu32 " writer threads\n", nth);
for (i = 0; i < nth; ++i) {
td[i].conn = conn;
td[i].start = (UINT64_MAX / nth) * i;
td[i].id = i;
if ((ret = pthread_create(
&thr[i], NULL, thread_run, &td[i])) != 0)
testutil_die(ret, "pthread_create");
}
printf("Spawned %" PRIu32 " writer threads\n", nth);
fflush(stdout);
/*
* The threads never exit, so the child will just wait here until
* it is killed.
*/
for (i = 0; i < nth; ++i)
pthread_join(thr[i], NULL);
/*
* NOTREACHED
*/
free(thr);
free(td);
exit(EXIT_SUCCESS);
}
extern int __wt_optind;
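
The `fill_db` change above gives each writer thread its own starting key via `td[i].start = (UINT64_MAX / nth) * i`. A minimal sketch of that partitioning, outside WiredTiger, might look like the following. The helper names `slice_start` and `slice_len` are hypothetical and do not appear in the commit. The sketch assumes `nth` is at least 1, since the division is undefined for zero.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in the commit): first key of the slice owned by
 * writer `id` out of `nth` writers. Mirrors the expression used in fill_db:
 *     td[i].start = (UINT64_MAX / nth) * i;
 */
static uint64_t
slice_start(uint32_t nth, uint32_t id)
{
	assert(nth != 0 && id < nth);
	return ((UINT64_MAX / nth) * id);
}

/*
 * Hypothetical helper: number of keys a writer can insert before it would
 * reach the start of the next writer's slice.
 */
static uint64_t
slice_len(uint32_t nth)
{
	assert(nth != 0);
	return (UINT64_MAX / nth);
}
```

Because integer division rounds down, `nth` slices of this length never exceed `UINT64_MAX`, so consecutive slices do not overlap. In practice a writer is killed long before it could exhaust its slice.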
@@ -147,24 +200,34 @@ main(int argc, char *argv[])
WT_SESSION *session;
WT_RAND_STATE rnd;
uint64_t key;
uint32_t absent, count, timeout;
uint32_t absent, count, i, nth, timeout;
int ch, status, ret;
pid_t pid;
bool rand_th, rand_time;
const char *working_dir;
char fname[64], kname[64];
if ((progname = strrchr(argv[0], DIR_DELIM)) == NULL)
progname = argv[0];
else
++progname;
nth = MIN_TH;
rand_th = rand_time = true;
timeout = MIN_TIME;
working_dir = "WT_TEST.random-abort";
timeout = 10;
while ((ch = __wt_getopt(progname, argc, argv, "h:t:")) != EOF)
while ((ch = __wt_getopt(progname, argc, argv, "h:T:t:")) != EOF)
switch (ch) {
case 'h':
working_dir = __wt_optarg;
break;
case 'T':
rand_th = false;
nth = (uint32_t)atoi(__wt_optarg);
break;
case 't':
rand_time = false;
timeout = (uint32_t)atoi(__wt_optarg);
break;
default:
@@ -178,6 +241,19 @@ main(int argc, char *argv[])
testutil_work_dir_from_path(home, 512, working_dir);
testutil_make_work_dir(home);
__wt_random_init_seed(NULL, &rnd);
if (rand_time) {
timeout = __wt_random(&rnd) % MAX_TIME;
if (timeout < MIN_TIME)
timeout = MIN_TIME;
}
if (rand_th) {
nth = __wt_random(&rnd) % MAX_TH;
if (nth < MIN_TH)
nth = MIN_TH;
}
printf("Parent: Create %u threads; sleep %" PRIu32 " seconds\n",
nth, timeout);
/*
* Fork a child to insert as many items. We will then randomly
* kill the child, run recovery and make sure all items we wrote
@@ -187,14 +263,12 @@ main(int argc, char *argv[])
testutil_die(errno, "fork");
if (pid == 0) { /* child */
fill_db();
fill_db(nth);
return (EXIT_SUCCESS);
}
/* parent */
__wt_random_init(&rnd);
/* Sleep for the configured amount of time before killing the child. */
printf("Parent: sleep %" PRIu32 " seconds, then kill child\n", timeout);
sleep(timeout);
/*
@@ -212,7 +286,7 @@ main(int argc, char *argv[])
* this is the place to do it.
*/
if (chdir(home) != 0)
testutil_die(errno, "chdir: %s", home);
testutil_die(errno, "parent chdir: %s", home);
printf("Open database, run recovery and verify content\n");
if ((ret = wiredtiger_open(NULL, NULL, ENV_CONFIG_REC, &conn)) != 0)
testutil_die(ret, "wiredtiger_open");
@@ -222,30 +296,37 @@ main(int argc, char *argv[])
session->open_cursor(session, uri, NULL, NULL, &cursor)) != 0)
testutil_die(ret, "WT_SESSION.open_cursor: %s", uri);
if ((fp = fopen(RECORDS_FILE, "r")) == NULL)
testutil_die(errno, "fopen");
/*
* For every key in the saved file, verify that the key exists
* in the table after recovery. Since we did write-no-sync, we
* expect every key to have been recovered.
*/
for (absent = count = 0;; ++count) {
ret = fscanf(fp, "%" SCNu64 "\n", &key);
if (ret != EOF && ret != 1)
testutil_die(errno, "fscanf");
if (ret == EOF)
break;
cursor->set_key(cursor, key);
if ((ret = cursor->search(cursor)) != 0) {
if (ret != WT_NOTFOUND)
testutil_die(ret, "search");
printf("no record with key %" PRIu64 "\n", key);
++absent;
absent = count = 0;
for (i = 0; i < nth; ++i) {
snprintf(fname, sizeof(fname), RECORDS_FILE, i);
if ((fp = fopen(fname, "r")) == NULL) {
fprintf(stderr, "Failed to open %s. i %u\n", fname, i);
testutil_die(errno, "fopen");
}
/*
* For every key in the saved file, verify that the key exists
* in the table after recovery. Since we did write-no-sync, we
* expect every key to have been recovered.
*/
for (count = 0;; ++count) {
ret = fscanf(fp, "%" SCNu64 "\n", &key);
if (ret != EOF && ret != 1)
testutil_die(errno, "fscanf");
if (ret == EOF)
break;
snprintf(kname, sizeof(kname), "%" PRIu64, key);
cursor->set_key(cursor, kname);
if ((ret = cursor->search(cursor)) != 0) {
if (ret != WT_NOTFOUND)
testutil_die(ret, "search");
printf("no record with key %" PRIu64 "\n", key);
++absent;
}
}
if (fclose(fp) != 0)
testutil_die(errno, "fclose");
}
if (fclose(fp) != 0)
testutil_die(errno, "fclose");
if ((ret = conn->close(conn, NULL)) != 0)
testutil_die(ret, "WT_CONNECTION:close");
if (absent) {
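
The verification loop above reads one records file per writer thread, building each file name from `RECORDS_FILE` and the thread id. The round trip can be sketched without WiredTiger as shown below. This is only an illustration: the `RECORDS_FILE` definition is not shown in this diff, so the format string here is an assumption, and `write_records` and `count_records` are hypothetical helpers that stand in for the writer and the checker.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed format; the real RECORDS_FILE definition is not part of this diff. */
#define	RECORDS_FILE	"records-%" PRIu32

/* Write keys [start, start + n) to the records file of writer `id`. */
static int
write_records(uint32_t id, uint64_t start, uint64_t n)
{
	FILE *fp;
	uint64_t i;
	char fname[64];

	snprintf(fname, sizeof(fname), RECORDS_FILE, id);
	if ((fp = fopen(fname, "w")) == NULL)
		return (-1);
	for (i = start; i < start + n; ++i)
		if (fprintf(fp, "%" PRIu64 "\n", i) < 0) {
			(void)fclose(fp);
			return (-1);
		}
	return (fclose(fp) == 0 ? 0 : -1);
}

/* Read the file back the way the parent does and count the keys found. */
static int
count_records(uint32_t id, uint64_t *countp)
{
	FILE *fp;
	uint64_t key;
	int ret;
	char fname[64];

	assert(countp != NULL);
	*countp = 0;
	snprintf(fname, sizeof(fname), RECORDS_FILE, id);
	if ((fp = fopen(fname, "r")) == NULL)
		return (-1);
	/* fscanf returns 1 per key read and EOF once the file is exhausted. */
	while ((ret = fscanf(fp, "%" SCNu64 "\n", &key)) == 1)
		++*countp;
	(void)fclose(fp);
	return (ret == EOF ? 0 : -1);
}
```

In the test itself, each key read back is formatted into `kname` and passed to `cursor->search`. The sketch only counts keys, to keep it free of any WiredTiger dependency.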

@@ -5,6 +5,7 @@ no_scale_per_second_list = [
'async: maximum work queue length',
'cache: bytes currently in the cache',
'cache: eviction currently operating in aggressive mode',
'cache: files with active eviction walks',
'cache: maximum bytes configured',
'cache: maximum page size at eviction',
'cache: pages currently held in the cache',
@@ -59,6 +60,7 @@ no_scale_per_second_list = [
'btree: overflow pages',
'btree: row-store internal pages',
'btree: row-store leaf pages',
'cache: bytes currently in the cache',
'cache: overflow values cached in memory',
'LSM: bloom filters in the LSM tree',
'LSM: chunks in the LSM tree',
@@ -71,6 +73,7 @@ no_clear_list = [
'async: maximum work queue length',
'cache: bytes currently in the cache',
'cache: eviction currently operating in aggressive mode',
'cache: files with active eviction walks',
'cache: maximum bytes configured',
'cache: maximum page size at eviction',
'cache: pages currently held in the cache',
@@ -102,6 +105,7 @@ no_clear_list = [
'transaction: transaction range of IDs currently pinned by a checkpoint',
'transaction: transaction range of IDs currently pinned by named snapshots',
'btree: btree checkpoint generation',
'cache: bytes currently in the cache',
'session: open cursor count',
]
prefix_list = [