* Improve two recent assertions, one from WT-2798 relating to writing metadata updates to disk that are part of a running transaction, and another from WT-2802 that checks that we don't try to copy values from a cursor without a transaction pinned. The latter doesn't apply to cursors on checkpoints (including chunk cursors in an LSM tree).
* Copy cursor values before rollback in autocommit.
If an autocommit operation such as WT_CURSOR::update touches multiple trees (e.g., multiple column groups in a table, or index updates, or multiple chunks in an LSM tree), then some cursors may have consumed the application's key/value pair when the operation has to roll back. Take a copy of any such values before attempting to retry the operation.
(cherry picked from commit 41eb2dcaac)
When logging is disabled, a create operation (and potentially other
metadata updates) could write partially completed checkpoint metadata,
leaving on-disk files inconsistent until the checkpoint completes.
(cherry picked from commit 7e1a47dd45)
Change the default remove/rename calls to flush the enclosing directory.
Simplify the pluggable file system API by replacing the directory-sync method
with a "durable" boolean argument to the remove, rename and open-file methods.
* Add "durable" arguments to relevant functions so that each remove or rename
call specifies its durability requirements.
* Switch the WT_FILE_SYSTEM::fs_open_file type enum from WT_OPEN_FILE_TYPE,
with WT_OPEN_XXX names, to the WT_FS_OPEN_FILE_TYPE, with WT_FS_OPEN_XXX
names.
Switch the WT_FILE_SYSTEM::fs_open_file flags from WT_OPEN_XXX names to
WT_FS_OPEN_XXX names.
* Replace the "bool durable" argument to WT_FILE_SYSTEM.fs_remove and
WT_FILE_SYSTEM.fs_rename with a "uint32_t flags" argument, and the
WT_FS_DURABLE flag.
* Remove a stray bracket.
(cherry picked from commit 11f018322c)
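
As a rough illustration of the flags-based durability argument described above, here is a minimal sketch. It does not use the real WT_FILE_SYSTEM types: the sketch_fs structure, sketch_fs_remove() and SKETCH_FS_DURABLE are hypothetical stand-ins, and the flag value is an assumption.

```c
#include <stdint.h>

/* Stand-in for WT_FS_DURABLE; the numeric value is an assumption. */
#define	SKETCH_FS_DURABLE	0x1u

/* Hypothetical file system handle that only counts what it was asked to do. */
struct sketch_fs {
	unsigned removes;	/* Number of remove calls. */
	unsigned dir_syncs;	/* Number of enclosing-directory flushes. */
};

/*
 * sketch_fs_remove --
 *	Remove a file; flush the enclosing directory only if the caller
 *	asked for durability through the flags argument.
 */
static int
sketch_fs_remove(struct sketch_fs *fs, const char *name, uint32_t flags)
{
	(void)name;		/* A real implementation would unlink here. */
	++fs->removes;
	if (flags & SKETCH_FS_DURABLE)
		++fs->dir_syncs;
	return (0);
}
```

A flags word leaves room for further per-call options without changing the method signature again, which a single bool argument would not.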
This problem can occur for both row and column store.
The WT_CURSOR_BTREE.rip_saved field potentially has the same problem
as the cip_saved field: initializing it on point-searches is wrong;
it should be initialized as a cursor moves to a new page.
* Clear cip_saved and rip_saved when starting to iterate from a search
position. This wasn't necessary before because we cleared them in
__cursor_pos_clear(), but I removed that code.
In summary, we now clear them in the iteration code, both when starting
an iteration and when switching to a new page. That's correct because
they have nothing to do with searches, so the clear doesn't belong in
__cursor_pos_clear(), and we have to do the clear when switching to a
new page regardless, because __cursor_pos_clear() isn't called when
switching to a new page.
(cherry picked from commit 1b6a9220c3)
Reset the column-store saved slot information on each new page, otherwise
it's possible for it to match the last page we were traversing.
(cherry picked from commit 51a4e1593d)
* WT-2711 Remove POSIX expanded strftime values and use older C89 values
* Fix issues with s_string
* Add a comment so nobody rewrites the strftime format and reintroduces the bug.
* Fix strings sort order.
(cherry picked from commit 1c67c4e0f0)
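
For illustration only, a timestamp formatter restricted to C89 strftime conversions might look like the sketch below. The function name and the exact format string are assumptions, not the format used in the tree; the point is that shorthands such as %F, %T and %e were added after C89 and are spelled out here with C89 conversions.

```c
#include <stddef.h>
#include <time.h>

/*
 * format_timestamp --
 *	Hypothetical helper: format a broken-down time using C89 conversions
 *	only. Do not shorten the format to "%F %T": those conversions are not
 *	part of C89 and are not supported by every C library.
 */
static size_t
format_timestamp(char *buf, size_t len, const struct tm *tm)
{
	return (strftime(buf, len, "%Y-%m-%d %H:%M:%S", tm));
}
```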
If there's no server running, discard any configuration information so
we don't leak memory during reconfiguration.
(cherry picked from commit e001657e5c)
No longer support setting the statistics_log path in WT_CONNECTION::reconfigure.
No longer support setting a custom name for statistics files, only allow a destination directory.
Be more explicit about which logging configuration options are allowed in WT_CONNECTION::reconfigure.
The aim of these changes is to avoid situations where applications that embed WiredTiger allow their users to overwrite unexpected files on a file system.
This potentially requires an upgrade step for applications that were specifying a non-standard file name component for statistics log file names; the change is not backward compatible.
(cherry picked from commit 9cc5d0f4b1)
Randomize visits to trees that use a tiny fraction of the cache.
Eviction optimizations.
Now that we are queuing more entries (potentially), make sure enough of
them become candidates. Previously, a skewed distribution of read
generations could mean that only 10% of queue entries were considered.
Improve the efficiency of sorting the queue by calculating the score
once when pages are added to the queue.
Take care to bound the maximum eviction slot.
(cherry picked from commit 521270d54c)
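
The "calculate the score once" idea can be sketched as follows. This is not the eviction code itself: the evict_entry structure, the function names, and the scoring rule (score equals read generation) are all hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical queue entry: the score is cached when the entry is queued. */
struct evict_entry {
	uint64_t read_gen;	/* Page read generation (input to the score). */
	uint64_t score;		/* Computed once, at enqueue time. */
};

/* Hypothetical scoring rule: a lower read generation sorts first. */
static uint64_t
evict_score(uint64_t read_gen)
{
	return (read_gen);
}

/* Add an entry to the queue, computing its score exactly once. */
static void
evict_push(struct evict_entry *queue, size_t *countp, uint64_t read_gen)
{
	queue[*countp].read_gen = read_gen;
	queue[*countp].score = evict_score(read_gen);
	++*countp;
}

/* The comparator only reads the cached score; it never recomputes it. */
static int
evict_cmp(const void *a, const void *b)
{
	uint64_t sa = ((const struct evict_entry *)a)->score;
	uint64_t sb = ((const struct evict_entry *)b)->score;

	return ((sa > sb) - (sa < sb));
}

/* Sort the queue in ascending score order. */
static void
evict_sort(struct evict_entry *queue, size_t count)
{
	qsort(queue, count, sizeof(queue[0]), evict_cmp);
}
```

Because qsort calls the comparator many times per element, caching the score turns each comparison into two loads and a compare, whatever the scoring rule costs.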
When splitting the root page and updating the child's WT_REF.addr, reconciliation/eviction can race with us, updating WT_REF.addr after our read and before our update. The update is necessary because the child's
address points into the page being split: if the address changes, then it can no longer point into the page being split and the update is no longer necessary.
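
One way to express "update only if the address is still the one we read" is a compare-and-swap. The sketch below uses C11 atomics as a stand-in; the function name is hypothetical and this is not the tree's own atomic primitive or WT_REF layout.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * update_addr_if_unchanged --
 *	Hypothetical helper: replace the address only if it still equals the
 *	value read earlier. Returns false if another thread changed it first,
 *	in which case the update is no longer needed.
 */
static bool
update_addr_if_unchanged(_Atomic(void *) *addrp, void *seen, void *replacement)
{
	return (atomic_compare_exchange_strong(addrp, &seen, replacement));
}
```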
Define system call success as a 0 return, and split error handling into two parts: if the call returns -1, use errno, otherwise expect the failing return to be an error value.
Replace calls to remove with unlink, so we know errno will be set. Do the best we can with rename; there's no easy workaround.
POSIX requires posix_madvise to return an errno value, but some OS versions return a -1/errno pair instead (at least FreeBSD and OS X). I don't care about retrying posix_madvise calls on failure, but since WT_SYSCALL_RETRY includes the necessary error handling magic, wrap the posix_madvise calls in WT_SYSCALL_RETRY.
(cherry picked from commit ced588aecd)
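
The two-part error handling described above can be sketched as a small helper. This is not the WT_SYSCALL_RETRY macro: the function name is hypothetical, retry logic is omitted, and the EIO fallback for a -1 return with errno unset is an assumption.

```c
#include <errno.h>

/*
 * syscall_error --
 *	Hypothetical helper: map a system call's return to an error value.
 */
static int
syscall_error(int ret)
{
	if (ret == 0)		/* Success is defined as a 0 return. */
		return (0);
	if (ret == -1)		/* -1/errno pair: the error is in errno. */
		return (errno != 0 ? errno : EIO);
	return (ret);		/* Otherwise the return is the error value. */
}
```

With this shape, a call such as posix_madvise is handled the same way whether the platform returns an errno value directly or a -1/errno pair.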
Add more options for callers when updating the oldest ID to control how much they care about the ID being updated.
(cherry picked from commit 116e41e5e1)
When the cache hits eviction triggers, all application threads can
hammer the eviction queue lock, starving each other and server threads.
Also, noticed with the same workload, the eviction server doesn't need
to force updates to the oldest ID (which can starve the eviction server
thread if there are hundreds of application threads getting snapshots).
It is sufficient to update it lazily.
* Clear the eviction walk if we don't find any candidates.
Otherwise, we are keeping a page pinned in what might be an idle file,
and tying up a hazard pointer that could prevent eviction from an active
file (since the eviction server tracks how many hazard pointers it is
using to avoid going over the limit).
(cherry picked from commit 7f9d7aecea)
* Default checkpoint_wait is true. This change is useful because it means concurrent create/drop calls don't generate EBUSY returns.
* Mark lock_wait and checkpoint_wait as undoc
(cherry picked from commit 4b48ad6fb7)
uninitialized in __ref_is_leaf() (based on a call to __wt_ref_info()).
It's not really possible because the path where type isn't set is a path
where we panic because the WT_ADDR structure has an impossible type.
We already ignore the __wt_ref_info() error return in one path, and
there are only two paths that care about the returned type; remove the
error check from __wt_ref_info() and set type to 0 in the failing case
(the same value we use when there's no WT_REF addr to check), the code
that calls this function already checks addr on return.
This simplifies __ref_is_leaf() slightly, it now returns a boolean
instead of an error code with a boolean pointer argument.
# Smoke-test recovery as part of running "make check".
$TEST_WRAPPER ./random-abort -t 10 -T 5
$TEST_WRAPPER ./random-abort -m -t 10 -T 5
$TEST_WRAPPER ./truncated-log