# Exception Architecture

MongoDB code uses the following types of assertions:

- `uassert` and `iassert`
  - Checks for per-operation user errors. Operation-fatal.
- `tassert`
  - Like `uassert` in that it checks for per-operation user errors, but inhibits clean shutdown
    in tests. Operation-fatal, but process-fatal in testing environments during shutdown.
- `massert`
  - Checks per-operation invariants. Operation-fatal.
- `fassert`
  - Checks fatal process invariants. Process-fatal. Use to detect unexpected situations (such
    as a system function returning an unexpected error status).
- `invariant`
  - Checks process invariants. Process-fatal. Use to detect code logic errors ("pointer should
    never be null", "we should always be locked").

**Note**: Calling the C function `assert` is not allowed. Use one of the above instead.

The following types of assertions are deprecated:

- `MONGO_verify`
  - Checks per-operation invariants. A synonym for `massert` that doesn't require an error code.
    Process-fatal in debug mode. Do not use in new code; use `invariant` or `fassert` instead.
- `dassert`
  - Calls `invariant`, but only in debug mode. Do not use!

MongoDB uses a series of `ErrorCodes` (defined in [mongo/base/error_codes.yml][error_codes_yml]) to
identify and categorize error conditions. `ErrorCodes` are defined in a YAML file and converted to
C++ files using [MongoDB's IDL parser][idlc_py] at compile time. We also use error codes to create
`Status` objects, which convey the success or failure of function invocations across the code base.
`Status` objects are also used internally by `DBException`, MongoDB's primary exception class, and
its children (e.g., `AssertionException`) as a means of maintaining metadata for exceptions. The
proper usage of these constructs is described below.

## Assertion Counters

Some assertions increment an assertion counter. The `serverStatus` command generates an
"asserts" section that includes these counters:

- `regular`
  - Incremented by `MONGO_verify`.
- `warning`
  - Always 0. Nothing increments this anymore.
- `msg`
  - Incremented by `massert`.
- `user`
  - Incremented by `uassert`.
- `tripwire`
  - Incremented by `tassert`.
- `rollovers`
  - When any counter reaches a value of `1 << 30`, all of the counters are reset and
    the "rollovers" counter is incremented.

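The rollover rule can be sketched with a small standalone model. This is not the server's
implementation: the `AssertionCounters` struct, its `increment` method, and the choice to check the
threshold immediately after each increment are assumptions made for illustration.

```cpp
#include <cstdint>

// Hypothetical stand-in for the counters reported in the "asserts" section.
struct AssertionCounters {
    static constexpr std::int32_t kRolloverThreshold = 1 << 30;

    std::int32_t regular = 0;
    std::int32_t warning = 0;
    std::int32_t msg = 0;
    std::int32_t user = 0;
    std::int32_t tripwire = 0;
    std::int32_t rollovers = 0;

    // Bumps one counter; if it reaches the threshold, resets every counter
    // and records the event in `rollovers`.
    void increment(std::int32_t AssertionCounters::*counter) {
        if (++(this->*counter) >= kRolloverThreshold) {
            regular = warning = msg = user = tripwire = 0;
            ++rollovers;
        }
    }
};
```

A consumer of these counters should therefore treat a drop in any counter together with a rise in
`rollovers` as a reset, not as a decrease in assertion activity.
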
## Considerations

When per-operation invariant checks fail, the current operation fails, but the process and
connection persist. This means that `massert`, `uassert`, `iassert` and `MONGO_verify` only
terminate the current operation, not the whole process. Be careful not to corrupt process state by
mistakenly using these assertions midway through mutating process state.

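One way to respect this is to run every operation-fatal check before the first mutation. The sketch
below uses a hypothetical `operationAssert` helper (a plain throwing function standing in for an
assertion such as `uassert`) and a hypothetical `Ledger` type; neither exists in the MongoDB code
base.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for an operation-fatal assertion such as uassert:
// it throws, which fails the operation but leaves the process running.
inline void operationAssert(bool condition, const char* message) {
    if (!condition) {
        throw std::runtime_error(message);
    }
}

// Hypothetical piece of process state shared across operations.
struct Ledger {
    std::vector<int> balances;

    void transfer(std::size_t from, std::size_t to, int amount) {
        // Validate everything first, while the state is still untouched.
        operationAssert(from < balances.size() && to < balances.size(), "unknown account");
        operationAssert(amount >= 0 && balances[from] >= amount, "insufficient funds");

        // Nothing below can throw, so both updates happen or neither does.
        balances[from] -= amount;
        balances[to] += amount;
    }
};
```

Had the second check been placed between the two balance updates, a failed transfer would have left
the ledger debited but not credited for every later operation.
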
`fassert` failures will terminate the entire process; this is used for low-level checks where
continuing might lead to corrupt data or loss of data on disk. Additionally, `fassert` will log
the assertion message with fatal severity and add a breakpoint before terminating.

`tassert` will fail the operation like `uassert`, but also triggers a "deferred-fatality tripwire
flag". In testing environments, if the tripwire flag is set during shutdown, the process will
invoke the tripwire fatal assertion. In non-testing environments, there will only be a warning
during shutdown that tripwire assertions have failed.

`tassert` presents more diagnostics than `uassert`. `tassert` will log the assertion as an error,
log scoped debug info (for more info, see `ScopedDebugInfoStack` defined in
[mongo/util/assert_util.h][assert_util_h]), print the stack trace, and add a breakpoint.
The purpose of `tassert` is to ensure that operation failures will cause a test suite to fail
without resorting to different behavior during testing. `tassert` should only be used to check
for unexpected values produced by defined behavior.

Both `massert` and `uassert` take error codes, so that all assertions have codes associated with
them. Currently, programmers are free to provide the error code by either [using a unique location
number](#choosing-a-unique-location-number) or choosing a named code from `ErrorCodes`. Unique
location numbers have no meaning other than a way to associate a log message with a line of code.

`massert` will log the assertion message as an error, while `uassert` will log the message with
debug level of 1 (for more info about log debug level, see [docs/logging.md][logging_md]).

`iassert` provides similar functionality to `uassert`, but it logs at a debug level of 3 and
does not increment user assertion counters. We should always choose `iassert` over `uassert`
when we expect a failure, a failure might be recoverable, or failure accounting is not interesting.

### Choosing a unique location number

The current convention for choosing a unique location number is to use the 5 or 6 digit SERVER
ticket number for the ticket being addressed when the assertion is added, followed by a two digit
counter to distinguish between codes added as part of the same ticket. For example, if you're
working on SERVER-12345, the first error code would be 1234500, the second would be 1234501, etc.
This convention can also be used for LOGV2 logging id numbers.

The only real constraint for unique location numbers is that they must be unique across the
codebase. This is verified at compile time with a [python script][errorcodes_py].

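The arithmetic behind the convention is simply "ticket number times 100, plus the counter". The
helper below is hypothetical and exists only to make that explicit; in practice the resulting number
is written as a literal at the assertion site.

```cpp
#include <cstdint>

// Hypothetical helper: a SERVER ticket number followed by a two-digit counter.
constexpr std::int64_t locationNumber(std::int64_t serverTicket, int counter) {
    return serverTicket * 100 + counter;
}

// SERVER-12345: first and second codes added under that ticket.
static_assert(locationNumber(12345, 0) == 1234500, "first code for SERVER-12345");
static_assert(locationNumber(12345, 1) == 1234501, "second code for SERVER-12345");
```

Because the counter occupies two digits, this scheme leaves room for 100 codes per ticket.
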
## Exception

A failed operation-fatal assertion throws an `AssertionException` or a child of that.
The inheritance hierarchy resembles:

- `std::exception`
  - `mongo::DBException`
    - `mongo::AssertionException`
      - `mongo::UserException`
      - `mongo::MsgAssertionException`

See [mongo/util/assert_util.h][assert_util_h].

Generally, code in the server should be able to tolerate (e.g., catch) a `DBException`. Server
functions must be structured with exception safety in mind, such that `DBException` can propagate
upwards harmlessly. The code should also expect, and properly handle, `UserException`. We use
[Resource Acquisition Is Initialization][raii] heavily.

## ErrorCodes and Status

MongoDB uses `ErrorCodes` both internally and externally: a subset of error codes (e.g.,
`BadValue`) are used externally to pass errors over the wire and to clients. These error codes are
the means for MongoDB processes (e.g., _mongod_ and _mongo_) to communicate errors, and are visible
to client applications. Other error codes are used internally to indicate the underlying reason for
a failed operation. For instance, `PeriodicJobIsStopped` is an internal error code that is passed
to callback functions running inside a [`PeriodicRunner`][periodic_runner_h] once the runner is
stopped. The internal error codes are for internal use only and must never be returned to clients
(i.e., in a network response).

Zero or more error categories can be assigned to `ErrorCodes`, which allows a single handler to
serve a group of `ErrorCodes`. `RetriableError`, for instance, is an `ErrorCategory` that includes
all retriable `ErrorCodes` (e.g., `HostUnreachable` and `HostNotFound`). This implies that an
operation that fails with any error code in this category can be safely retried. We can use
`ErrorCodes::isA<${category}>(${error})` to check if `error` belongs to `category`. Alternatively,
we can use `ErrorCodes::is${category}(${error})` to check error categories. Both methods provide
similar functionality.

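The two spellings of a category check can be illustrated with a simplified standalone model. The
real definitions are generated from `error_codes.yml`; the `Code` and `Category` enums below, and
their contents, are assumptions that mirror only the names mentioned in this section.

```cpp
// Simplified stand-in for the generated ErrorCodes definitions.
enum class Code { OK, BadValue, HostUnreachable, HostNotFound };
enum class Category { RetriableError };

// Primary template: one specialization per category.
template <Category category>
constexpr bool isA(Code code);

template <>
constexpr bool isA<Category::RetriableError>(Code code) {
    return code == Code::HostUnreachable || code == Code::HostNotFound;
}

// Named convenience wrapper, equivalent to the templated form.
constexpr bool isRetriableError(Code code) {
    return isA<Category::RetriableError>(code);
}

static_assert(isA<Category::RetriableError>(Code::HostNotFound), "retriable");
static_assert(!isRetriableError(Code::BadValue), "not retriable");
```

A retry loop can then branch on the category once instead of enumerating every retriable code at
each call site.
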
To represent the status of an executed operation (e.g., a command or a function invocation), we
use `Status` objects, which represent an error state or the absence thereof. A `Status` uses the
standardized `ErrorCodes` to determine the underlying cause of an error. It also allows assigning
a textual description, as well as code-specific extra info, to the error code for further
clarification. The extra info is a subclass of `ErrorExtraInfo` and specific to `ErrorCodes`. For
reference, search for `extra` in [mongo/base/error_codes.yml][error_codes_yml].

MongoDB provides `StatusWith` to enable functions to return an error code or a value without
requiring them to have multiple outputs. This makes exception-free code cleaner by avoiding
functions with multiple out parameters. We can either pass an error code or an actual value to a
`StatusWith` object, indicating failure or success of the operation. For examples of the proper
usage of `StatusWith`, see [mongo/base/status_with.h][status_with_h] and
[mongo/base/status_with_test.cpp][status_with_test_cpp]. It is highly recommended to use `uassert`
or `iassert` over `StatusWith`, and catch exceptions instead of checking `Status` objects
returned from functions. Using `StatusWith` to indicate exceptions, instead of throwing via
`uassert` and `iassert`, makes it very difficult to identify that an error has occurred, and
could lead to the wrong error being propagated.

## Using noexcept

Server code should generally be written to be exception safe. Historically,
we've had bugs due to code being overzealously marked `noexcept`. In such
contexts, throwing an exception crashes the server, which can compromise
availability. However, _just_ removing `noexcept` from such code is not a viable
solution: exception-unsafe code may _need_ to crash in order to avoid causing
an even worse failure. We want to work towards ensuring that functions that
ought to be exception safe in fact are, and to remove `noexcept` usage where
it's not warranted. Here, we outline guidelines for doing so.

`noexcept` is a runtime check that terminates the process rather than allowing
the function to exit because of a throw. `noexcept` may be used when any
uncaught exception being thrown can be thought of as a bug. There is no
compile-time check that exceptions will not be thrown within a `noexcept`
function. Instead, putting `noexcept` on a function may be thought of as similar
to using `invariant` in the following way:

```cpp
// Example noexcept code.
void func() noexcept {
    // ... function body ...
}

// Similar alternative pseudocode.
void func() try {
    // ... function body ...
} catch (...) {
    invariant(!"unexpected exception");
}
```

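The absence of a compile-time check can be seen in a small standalone sketch. The function names
are hypothetical; the `noexcept` operator used in the `static_assert`s only reports what a function
declares, not what its body can actually do.

```cpp
#include <stdexcept>

// Hypothetical function that always throws.
void mayThrow() {
    throw std::runtime_error("boom");
}

// Compiles even though the body can throw. If mayThrow() throws at run time,
// the exception cannot leave this function and std::terminate is called.
void promisesNotToThrow() noexcept {
    mayThrow();
}

// The noexcept operator reflects the declaration only.
static_assert(noexcept(promisesNotToThrow()), "declared noexcept");
static_assert(!noexcept(mayThrow()), "not declared noexcept");
```

Calling `promisesNotToThrow()` would terminate the process, which is the same outcome as the
try/catch/`invariant` pseudocode above.
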
**As with `invariant`, be very careful when putting `noexcept` on a function that
interacts with untrusted input.** This has been the root cause of serious past
bugs.

### Adding or Removing noexcept

When considering removing `noexcept` from a function, the author of that change
must ensure that the function’s implementation and its callsites are not
relying on the function not throwing for correctness. Because of this, **be
careful putting `noexcept` on a function** if there’s a chance it may need to be
removed later. `noexcept` generally **should not be used** solely for reasons of
performance optimization. Aside from the cases listed in the next section, it
should not be assumed to improve performance without solid evidence.

If a part of the implementation would benefit from relying on not throwing, but
`noexcept` is not meant to be a part of the function’s contract, it is acceptable
to use a try/catch/invariant construction similar to the example above or an
internal `noexcept` helper function.

When adding or removing `noexcept`, also consider what types of exceptions are
possible in that context and in our codebase. Refer to the “Where Exceptions
are Possible” section for more details.

If you are uncertain about adding or removing `noexcept` in a given situation,
reach out to #server-programmability on Slack.

### Cases Where noexcept is Encouraged

This list is not exhaustive and there are cases not enumerated here that are
valid uses of `noexcept`.

#### Move operations

Using `noexcept` with move operations allows operations to skip generating
exception handling code. If a type’s move operation will not throw exceptions,
it is strictly worse not to use `noexcept`. For instance, `std::vector<T>` can
use optimized versions of certain operations when `T` has `noexcept` move
operations. In these cases, **`noexcept` can be considered a requirement**. Of
course, if a move operation genuinely needs to throw exceptions, then don’t
mark it `noexcept`. This should be very rare; moves should be non-throwing in
almost all cases.

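The effect on containers can be checked with standard type traits. In this sketch the two structs
are hypothetical; `std::move_if_noexcept` makes the same move-or-copy decision that `std::vector`
has to make when it relocates elements while preserving its exception guarantees.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical type whose move constructor is declared noexcept.
struct NothrowMovable {
    NothrowMovable() = default;
    NothrowMovable(const NothrowMovable&) = default;
    NothrowMovable(NothrowMovable&&) noexcept = default;
    std::string payload;
};

// Same members, but the move constructor is missing the noexcept marker.
struct MaybeThrowingMovable {
    MaybeThrowingMovable() = default;
    MaybeThrowingMovable(const MaybeThrowingMovable&) = default;
    MaybeThrowingMovable(MaybeThrowingMovable&& other) : payload(std::move(other.payload)) {}
    std::string payload;
};

static_assert(std::is_nothrow_move_constructible<NothrowMovable>::value,
              "move is known not to throw");
static_assert(!std::is_nothrow_move_constructible<MaybeThrowingMovable>::value,
              "move might throw as far as the compiler knows");

// An rvalue (move) is produced only when the move cannot throw; otherwise the
// result is a const lvalue, which selects the copy constructor.
static_assert(std::is_same<decltype(std::move_if_noexcept(std::declval<NothrowMovable&>())),
                           NothrowMovable&&>::value,
              "elements are moved");
static_assert(
    std::is_same<decltype(std::move_if_noexcept(std::declval<MaybeThrowingMovable&>())),
                 const MaybeThrowingMovable&>::value,
    "elements are copied");
```

Both types move a `std::string` in exactly the same way; only the missing `noexcept` forces the
copying path.
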
#### Swap operations

Marking swap operations `noexcept` allows callers to optimize for an
exception-free pathway. **Swap operations should follow the same `noexcept`
guidelines as move operations**.

#### Hash functions

Marking hash functions `noexcept` allows some hashing library types to optimize
for an exception-free pathway. This can affect the behavior, performance, and
even layout of certain container types (such as libstdc++’s
[unordered_map](https://gcc.gnu.org/onlinedocs/libstdc++/manual/unordered_associative.html)).
**Hash functions should follow the same `noexcept` guidelines as move operations.**

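A minimal sketch of a type that follows both guidelines is shown below. `Name` and `NameHash` are
hypothetical names chosen for illustration; the `static_assert`s use the `noexcept` operator to
confirm what callers and containers will observe.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Hypothetical value type used as a hash-map key.
class Name {
public:
    explicit Name(std::string value) : _value(std::move(value)) {}

    const std::string& value() const noexcept {
        return _value;
    }

    // Exchanging two strings does not need to throw, so swap is marked noexcept.
    friend void swap(Name& a, Name& b) noexcept {
        a._value.swap(b._value);
    }

    friend bool operator==(const Name& a, const Name& b) noexcept {
        return a._value == b._value;
    }

private:
    std::string _value;
};

// Hypothetical hasher for Name, marked noexcept like the move and swap operations.
struct NameHash {
    std::size_t operator()(const Name& name) const noexcept {
        return std::hash<std::string>{}(name.value());
    }
};

static_assert(noexcept(swap(std::declval<Name&>(), std::declval<Name&>())),
              "swap is declared noexcept");
static_assert(noexcept(NameHash{}(std::declval<const Name&>())), "hash is declared noexcept");
```
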
#### Destructors and “Destructor-Safe” Functions

Destructors are generally implicitly `noexcept`, and are encouraged to remain
implicitly `noexcept`, that is, by not marking them with `noexcept(false)`.
Functions where “destructor safety” is a core part of their functionality **may
be marked `noexcept`**. This is not a requirement: destructors are allowed to
call potentially-throwing functions. It is also not a blanket recommendation to
consider `noexcept` for all functions called from destructors. When calling a
potentially-throwing function from a destructor, think about whether or not it
can indeed throw in that context, and if exceptions need to be handled. If it
can indeed throw in that context, exceptions almost certainly need to be
handled; otherwise the server will crash.

The lambda passed to `ON_BLOCK_EXIT()` and `ScopeGuard()` should be treated
similarly to destructors: it is executed in a `noexcept` context (a destructor)
and marking it as such is discouraged as being noisy. Code intended to be
called from such lambdas, however, can be marked `noexcept`.

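Handling an exception inside a destructor can look like the following sketch. `flushBuffers` and
`BufferedWriter` are hypothetical, and writing the message into a caller-supplied string stands in
for whatever real error handling (for example, logging) the code would perform.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical cleanup step that can fail.
void flushBuffers(bool shouldFail) {
    if (shouldFail) {
        throw std::runtime_error("flush failed");
    }
}

// Hypothetical resource owner whose destructor calls a potentially-throwing function.
class BufferedWriter {
public:
    BufferedWriter(bool failOnFlush, std::string* errorLog)
        : _failOnFlush(failOnFlush), _errorLog(errorLog) {}

    // Implicitly noexcept: an exception escaping from here would call std::terminate.
    ~BufferedWriter() {
        try {
            flushBuffers(_failOnFlush);
        } catch (const std::exception& ex) {
            // Handle the failure locally instead of letting it escape.
            *_errorLog = ex.what();
        }
    }

private:
    bool _failOnFlush;
    std::string* _errorLog;
};
```

Without the `try`/`catch`, the same failure would terminate the process as soon as the object went
out of scope.
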
### Where Exceptions are Possible

In our codebase, generally `DBException` is the only type of exception that
should be crossing API boundaries. If an exception other than a `DBException`
does cross an API boundary, it should be considered a bug. Whichever component
throws the exception should handle it locally, even if only by translating it
to a `DBException`. Generally any caller you would consider to be an external
caller should be able to rely on `DBException` being the only exception type your
function will throw.

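Translating at the boundary can be sketched as follows. `ComponentException` is a hypothetical
stand-in for `DBException` so that the example compiles on its own, and `parsePort` is an invented
entry point; real code would throw through the assertion functions described earlier.

```cpp
#include <exception>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for mongo::DBException.
class ComponentException : public std::exception {
public:
    explicit ComponentException(std::string reason) : _reason(std::move(reason)) {}

    const char* what() const noexcept override {
        return _reason.c_str();
    }

private:
    std::string _reason;
};

// Public entry point of a hypothetical component. std::stoi reports failures
// with std::invalid_argument or std::out_of_range, both of which derive from
// std::logic_error; neither is allowed to cross this boundary.
int parsePort(const std::string& text) {
    try {
        return std::stoi(text);
    } catch (const std::logic_error&) {
        throw ComponentException("invalid port: " + text);
    }
}
```

Callers of `parsePort` then only need a handler for the one agreed-upon exception type.
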
Allocations using the global new allocator or `std::allocator` in our codebase do
not throw, instead terminating the process directly when OOM conditions are
encountered. As such, there is no need to handle exceptions from these sources.

## Gotchas

Gotchas to watch out for:

- Generally, do not throw an `AssertionException` directly. Functions like `uasserted()` do work
  beyond just that. In particular, they make sure that the `getLastError` structures are set up
  properly.
- Think about the location of your asserts in constructors, as the destructor would not be
  called. But at a minimum, use `wassert` a lot therein; we want to know if something is wrong.
- Do **not** throw in destructors or allow exceptions to leak out (if you call a function that
  may throw).

[raii]: https://en.wikipedia.org/wiki/Resource_acquisition_is_initialization
[error_codes_yml]: ../src/mongo/base/error_codes.yml
[periodic_runner_h]: ../src/mongo/util/periodic_runner.h
[status_with_h]: ../src/mongo/base/status_with.h
[idlc_py]: ../buildscripts/idl/idlc.py
[status_with_test_cpp]: ../src/mongo/base/status_with_test.cpp
[errorcodes_py]: ../buildscripts/errorcodes.py
[assert_util_h]: ../src/mongo/util/assert_util.h
[logging_md]: logging.md