wuwt-e02-dchecks.md | Explore in Territory

# What’s Up With DCHECKs

This is a transcript of [What's Up With
That](https://www.youtube.com/playlist?list=PL9ioqAuyl6ULIdZQys3fwRxi3G3ns39Hq)
Episode 2, a 2022 video discussion between [Sharon ([email protected])
and Peter ([email protected])](https://www.youtube.com/watch?v=MpwbWSEDfjM).

The transcript was automatically generated by speech-to-text software. It may
contain minor errors.

---

You've seen DCHECKs around and been asked to use them in code review, but what
are they? What's the difference between a CHECK and a DCHECK? How do you use
them? Here to answer that is special guest is Peter, who works on UI and
improving crash reports.

Notes:

- https://docs.google.com/document/d/146LoJ1E3N3E6fb4zDh92HPQc6yhRpNI7DSKlJjaYlLw/edit

Links:

- [What's Up With Pointers]

---

00:00 SHARON: Hello, and welcome to What's Up With That?, the series that
demystifies all things Chrome. I'm your host, Sharon. And today, we're talking
about DCHECKs. You've seen them around. You've probably been told to add one in
code review before. But what are they? What are they for, and what do they do?
Our guest today is Peter, who works on desktop UI and Core UI. He's also
working on improving Chrome's crash reports, which includes DCHECKs. Today
he'll help us answer, what's up with DCHECKs? Welcome, Peter.

00:30 PETER: Thanks for having me.

00:32 SHARON: Yeah. Thanks for being here. So the most obvious question to
start with, what is a DCHECK?

00:39 PETER: So a CHECK and a DCHECK are both sort of things that make sure
that what you think is true is true. Right? So this should never be called with
an empty vector. You might add a CHECK for it, or you might add a DCHECK for
it. And it's sort of similar to asserts, which you may have hit during earlier
programming outside of Chrome. And what it means is when this line gets hit, we
check and see if it's true. And if it's not true, we crash. DCHECKs differ from
CHECKs in that they are traditionally only in debug builds, or local
development builds, or on our try-bots. So they have zero overhead when Chrome
hits stable, because the CHECK just won't be there.

01:24 SHARON: OK. So I guess the D stands for Debug. That make sense.

01:28 PETER: Yeah. I want debug to turn into developer, because now we have
them by default if you're no longer - if you're doing a release build, and
you're not turning them off, and you're not doing an official build, you get
them.

01:42 SHARON: OK. Well, you heard it here first, or maybe you heard it before.
I heard it here first. So you mentioned asserts. So something that I've seen a
couple times in Chrome, and also is part of the standard library, is
`static_assert`. So how is that similar or different to DCHECKs? And why do we
use or not use them?

02:00 PETER: Right. So `static_assert`s are - and you're going to have to ask
C++ experts, who can probably take some of the sharp edges off of this - but
it's basically, if you can assert something in compile time, then you can use a
`static_assert`, which means that you don't have to hit a code path where it's
wrong. It sort of has to always hold true. And whenever you can use a
`static_assert`, use a `static_assert`, because it's free. And basically, you
can't compile the program if it's not true.

02:31 SHARON: OK. That's good to know, because I definitely thought that was
one of the C++ standard library things we should avoid, because we have a
similar thing in Chromium. But I guess that's not the case.

02:41 PETER: Yeah. Assert is the one that is - OK, so this is a little
complicated, right? `static_assert` is a language feature, not a library
feature. And someone will tell me that I'm wrong about something about this.
Asserts are just sort of a poorer version of DCHECKs. So they won't go through
our crash handling. It won't print the pretty stacks, et cetera.
`static_assert`s, on the other hand, are a compile time feature. And we don't,
as far as I know, have our own wrapper around it. We just use `static_assert`.
So what you would maybe use this for is like if you have a constant - like, say
you have an array, and the code makes an assumption that some constant is the
size of this array, you can assert that in compile time, and that would be a
good use of a `static_assert`.

03:26 SHARON: OK. Cool. So you mentioned that some things have changed with how
DCHECKs work. So can you give us a brief overview of the history of DCHECKs -
what they used to be, people who have been using them for a while, how might
they have changed from the idea of what they have as a DCHECK in their mind?

03:43 PETER: Sure. So this is as best I know. I'm just sort of extrapolating
from what I've seen. And what I think originally was true is that a CHECK used
to be this logging statement, where you essentially compile the file name and
the line number. And if this ever hits, then we'll log some stuff and then
crash. Right? Which comes with a little bit of overhead, especially on size,
that you basically take the file name and line number for every instance, and
that generates a bunch of strings and numbers that essentially add to Chrome's
binary size. I don't know how many steps between that and where we currently
are. But right now, our CHECKs are just, if condition is false, crash, which
means that you won't, out of the CHECK, get file name and line number. We'll
get those out of debugging symbols. And you also won't get any of the logging
messages that you can add to the end of a CHECK, which means that your debug
info will be poorer, but it will be cheaper to use. So they've gotten from
being pretty heavy CHECKs to being really cheap.

05:01 SHARON: OK. So that kind of leads us into the question that I think most
people want to have answered, which is, when should I use a DCHECK? When should
I use a CHECK? When should I use neither?

05:13 PETER: I would say that historically, we've said CHECKs are expensive.
Don't use them unless you sort of have to. And I don't think that holds true
anymore. So basically, unless you are in really performance-critical code, then
use a CHECK. If there's anything that you care about where the program state
will be unpredictable from this point on if it's not true, CHECK it. It's not
that expensive. Right? We have a lot of code where we push a string onto a
vector, and that never gets flagged in code review. And it's probably like 10
times more expensive, if not 100 times more expensive, than adding a CHECK. The
exception to that is if you're in a really hot loop where you don't want to
dereference a pointer, then a CHECK might add some cost. And the other is if
the condition that you're trying to validate is really expensive. It's not the
CHECK itself that's expensive. It's the thing you're evaluating. And if that's
expensive, then you might not afford doing a CHECK. If you don't know that it's
expensive, it's probably not expensive.

06:20 SHARON: Can you give us an example of something expensive to evaluate for
a CHECK?

06:24 PETER: Right. So say that you have something in video code that for every
video frame, for every pixel validates the alpha value is opaque, or something.
That would probably make video conferencing a little bit worse performance.
Another thing would just be if you have to traverse a graph on every frame, and
it will sort of jump all over memory to see if some reachability problem in
your graph is true, that's going to be a lot more expensive. But CHECKing that
index is less than some vector bounds, I think that should fall under cheap.
And -

07:02 SHARON: OK.

07:02 PETER: culturally, we've tried to avoid doing a lot of these. And I think
it's just hurting us.

07:09 SHARON: OK. So since most places we should use CHECKs, are there any
places where a DCHECK would be better then? Or any time you would have normally
previously used a DCHECK, you should just make that a CHECK?

07:23 PETER: So we have a new construct that's called `EXPENSIVE_DCHECK`s, or
if `EXPENSIVE_DCHECKS_ARE_ON`, I think we should add a corresponding macro for
`EXPENSIVE_DCHECK`. And then you should be able to just say, either it's
expensive and has to be a DCHECK, so use `EXPENSIVE_DCHECK`; otherwise, use
CHECK. And my hunch would be like 95% of what we have as DCHECKs would probably
serve us better as CHECKs. But your code owner and reviewer might disagree with
that. And it's not yet documented policy that we say CHECKs are cheap; just add
a billion of them. But I would like to get there eventually.

08:04 SHARON: OK. So if you put in a CHECK, and your reviewer tells them this
should be a DCHECK, the person writing the CL can point them to this video, and
then they can discuss from there.

08:13 PETER: I mean, yeah, you can either say Peter disagrees with you, or I
can get further along this and say we make policy that CHECKs are cheap, so
they are preferable. So a lot of foot-shooters with DCHECKs is that you expect
this property to hold true, but you never effectively CHECK it. And that can
lead to all sorts of bad stuff, right? Like if you're trying to DCHECK that
some origin for some frame makes some assumptions of site iso - I don't know
site isolation well enough to say this. But basically, if you're DCHECKing that
the code that you're running runs under some sort of permissions, then that is
effectively unchecked in stable, right? And we do care about those properties,
and it would be really good if we crashed rather than leaked information
between sites.

09:12 SHARON: Right.

09:14 PETER: Yeah.

09:16 SHARON: So that seems like a good tie-in for the fact that within some
security people, they don't have the most positive impression of DCHECKs, shall
we say? So a couple examples of this, for listeners who maybe aren't familiar
with this, is one person previously on security saying DCHECKs are pronounced
as "code that's not tested". Someone else I told about this episode - I said,
we're going to talk about DCHECKs - they immediately said, is it going to be
about why DCHECKs are bad? So amongst the Chrome security folks, they are not a
huge fan of DCHECKs. Can you tell us maybe why that is?

09:51 PETER: So if we go back a little bit in time, it used to be that DCHECKs
were only built for developers if they do a debug build. And Chrome has gotten
so big that you don't want to do a debug build or the UI is incredibly slow.
Unfortunately, it's sort of not that great an experience to work in a debug
build. So people work in a release build. That doesn't mean that they don't
care about the things they put under DCHECK. It just means they want to go on
with their lives and not wait x minutes for the browser to launch, or however
bad it is nowadays. And that means that they, unfortunately, lose coverage for
the DCHECKs. So this means that if your code is not exercised well under tests,
then this is completely not enforced. But it's slightly better than a comment,
in that you're really expecting this thing to hold true, and that's clearly an
expectation. But how good is the expectation if you don't look at it? So last
year, I believe, we made it so that DCHECKs are on by default if you're not
doing an official build. And this included release builds. So now, it's like at
least if you're doing development and you hit this condition, it's going to
explode, which is really good, because then you can find a lot of issues, and
we can prevent a lot of issues from ever happening in the first place. It is
really hard for you, as a developer, to make the assumption that if this
invariant is ever false, I will find it during development, and it will never
happen in the wild. And DCHECKs are essentially either, I will find this
locally before I submit it, or all bets are off; or it is I don't care that
much if this thing doesn't hold true, which is sort of a weird assertion to
make. So I think we're in this little awkward in-between state. And this
in-between state, remember, mostly exists as a performance optimization from
when CHECKs used to be a lot more expensive, in terms of code size. So did I
cover most of this?

12:06 SHARON: Yeah. I think, based on that, I think it's pretty easy to see why
people who are more concerned about security are not a fan of this.

12:13 PETER: I mean, if you care about it, especially if it causes privacy or
security or user-harm sort of things, just CHECK. Just CHECK, right? If it
makes your code animate a thing slightly weirder, like it will just jump to the
end position instead of going through your fancy little... whatever. Maybe you can
make that a DCHECK. Maybe it doesn't matter. Like it's wrong, but it's not that
bad. But most of the cases, you DCHECK something, where it's like the program
is going to be in some indeterminate state, and we actually care about if it's
ever false. So maybe we can afford to make it a CHECK. Maybe we should look
more about our sort of vector pushbacks than we should look at our CHECKs, and
then just have more CHECKs. More CHECKs. Because it's also like when things
break, it's a lot cheaper to debug a DCHECK than your program is in some
indeterminate state, because it was allowed to pass through a DCHECK that you
thought was - and when you read the code, unless you're used to reading it as
DCHECKs - oh, that just didn't get enforced - it's sort of hard to try to
figure out why the thing was doing the wrong thing in the first place.

13:22 SHARON: OK. How is this as a summary? When in doubt, CHECK it out.

13:27 PETER: I like that. I like that. And you might get pushback by reviewers,
who aren't on my side of the fence yet. And then you can decide on which hill
you want to die on, at least until we've made policy to just not complain about
DCHECKs, or not complain about CHECKs.

13:45 SHARON: All right. That sounds good. So you mentioned stuff failing in
the wild. And for people who might not know, do you want to just briefly
explain what failing in the wild means?

13:54 PETER: OK. So there's two things. Just failing in the wild just means
that when this thing rolls out to Canary, Dev, Beta, Stable, if you have a
CHECK that will crash and generate a crash report as if you had a memory bug,
but it crashes in a deterministic way, at a deterministic spot - so you can
find out exactly what assumption was violated. Say that this should never be
called with a null pointer. Then you can say, look at this line where it
crashed. It clearly got hit with a null pointer. And then you can try to figure
out, from the stack, why that happened, rather than after you post this pointer
to a task, it crashes somewhere completely irrelevant from the actual call
site. Well, so in the wild specifically means it generates a crash report so
you can look at it, or in the wild means it crashes at a user computer rather
than - in the wildness outside of development. And as for the other part of in
the wild, it's that we have started running non-crashy DCHECKs for a percentage
of Windows Canary. And we're looking to expand that. And we're gathering
information, basically, about which assertions or invariants that we have are
violated in practice in the wild, even though we don't think that they should
be. And that will sort of also culturally move the needle so that we do care
about DCHECKs. And when we care about DCHECKs, sort of similarly to how we care
about CHECKs, is it really that important to make the big distinction between
the two? Except for the case where you have really expensive DCHECKs, they
might still be worth keeping separate. And those will be things like, if you do
things for - say that you zero out memory or something for every memory block
that you allocate and free, or you do things for every audio sample, or for
every video frame pixel, those sort of things. And then we can sort of keep
expensive stuff gated out from CHECKs. And then maybe we don't need this
in-between where people don't know whether they can trust a DCHECK or not.

16:04 SHARON: So you mentioned that certain release builds now have DCHECKs
enabled. So for those in the wild versus regular CHECKs in the wild, if those
happen to fail, do the reports for those look the same? Are they in the same
place? Can they be treated the same?

16:20 PETER: Yeah. Well, they are uploaded to the same crash-reporting thing.
They show up under a special branch. And you likely will get bugs filed to you
if they hit very frequently, just like you would with crashes. There's a sort
of slight difference, in that they say DumpWithoutCrashing. And that's just
sort of a rollout strategy for us. Because if we made DCHECK builds incredibly
crashy, because they hit more than CHECKs, then we can never roll this thing
out. Or it gets a lot scarier for us to put this on 5% of a new platform that
we haven't tested. But as it is right now, the first DCHECK that gets hit for
every process gets a crash dump uploaded.

17:07 SHARON: OK. So I've been definitely told to use DumpWithoutCrashing at
certain points in CLs, where it's like, OK, we think that this shouldn't
happen. But if it does, we don't necessarily want to crash the browser because
of it. With the changes you've mentioned to DCHECKs happening, should those
just be CHECKs instead now or should those still be DumpWithoutCrashing?

17:29 PETER: So if you want DumpWithoutCrashing, and you made those a DCHECK,
then you would only have coverage in the Canary channels that we are testing.
Right? So if you want to get dump reports from the platforms that we're not
currently testing, including all the way up to Stable, you probably still want
to keep that a DumpWithoutCrashing. You want to make sure that you're not
using the sort of - you want to make sure that you triage these, because you
don't want to keep these generating crash dumps forever. You should still
treat them as if they were crashes. And I think the same thing should hold true
for DCHECKs. You should only add them for an invariant that you care about
being violated, right? So as it is violated, you should either figure out why
your invariant was wrong, or you should try to fix the breakage. And you can
probably add more information to logging to figure out why that happened.

18:41 SHARON: So when you have a CHECK, and it crashes in the wild, you get a
stack trace. And that's what you have to work on to figure out what went wrong
for debugging. Right? So what are some things that you can do, as a developer,
to make these CHECKs a bit more useful for you - ways to incorporate other
information that you can use to help yourself debug?

19:01 PETER: So some of the stuff that we have is we have something called
crash keys, which are essentially, you can write a piece of string data,
essentially - there's probably some other data types - and if you write those
before you're running DumpWithoutCrashing, or before you hit a CHECK, or
before you hit a DCHECK, then those will be uploaded along the crash dump. And
if you talk to someone who knows where to find them, you can basically go in
under a crash report, and then under Fields, Product data, or something like
that, you should be able to find your key-value pair. And if you have
information in there, you'll be able to look at it. The other thing that I like
to do, which is probably the more obvious thing, is if you have somewhat of a
hypothesis that this thing should only fail if a or b or c is not true, then
you can add CHECKs for those. Like, if a CHECK is failing, you can add more
CHECKs to see why the CHECK was failing. In general, you're not going to get as
much out of a mini-dump that you want. You're not going to have the full heap
available to you, because that would be a mega-dump. You can usually find
whatever is on the stack if you go in with a debugger. And I know that you
wanted to lead me into talking about CHECK\_GT and CHECK\_EQ, which are
essentially, if you want to check that x is greater than y, then you should use
CHECK\_GT(x,y). The problem with those, in this sort of context, is that,
similarly to CHECKs - so CHECK\_GT gets compiled into, basically, if not x is
greater than y, crash. So unfortunately, the values of x and y are optimized
out when you're doing an official build.

21:02 SHARON: So this makes me think of some stuff we mentioned in the last
episode, which was with Dana. Check it out if you haven't. But one of the types
we mentioned there was SafeRef, which enforces a certain condition. And if that
fails - so in the case of a SafeRef, it ensures that the value you have there
is not null. And if that's ever not true, then you do get a crash similar to if
a CHECK fails. So in general, would you say it's better practice to enforce and
make sure your assumptions are held in these other, more structural ways than
relying on CHECKs instead?

21:41 PETER: So let me see if I can get at what you actually want out of that
one. So if we look at - there's a RawRef type, right? So what's good with the
RawRef is that you have a type that annotates that this thing cannot possibly
be null. So if you assign to it, and you're assigning a null pointer, your
program is going to crash, and you don't need to think about whether you throw
a null pointer in or not. If you keep passing a RawRef around, then that's
essentially you passing around a non-null pointer. And therefore, you don't
have to check that it's not `nullptr` in every step of the way. You only
need to do it when you're - I mean, the type will do it for you, but it only
needs to happen when you're converting from a pointer to a ref, essentially, or
a RawRef. And what's so good about that is now you have the - previously, you
might just CHECK that this isn't called with `nullptr` or whatever. But then
you would do that for four or five arguments. And you'd be like, null pointer
CHECKs are this part of the function body. And then it just gets super-noisy.
But if you're using the RawRef types, then the semantics of the type will
enforce that for you. And you don't have to think about that when reading the
code, because usually when you read the code, you're going to be like, it's a
pointer. Can it be null or not? What does it point to? And this thing will at
least tell you, it can't be null. And you still have the question of, what does
it point to? And that's fine. So I like enforcing this through types more than
checking those assumptions, and then checking inside of what happens. If you
were assigned to this RawRef, then it's going to crash in the constructor if
you have a null pointer. And then based on that stack trace, if we have good
stack data, you're going to know at what line you created the RawRef. And
therefore, it's equivalent to checking for not null pointer, because you can
trust the type to do the checking. And since I know Dana made this, I can
probably with 200% certainty say that it's a CHECK and not a DCHECK. But we do
have a couple of other places where you have a WeakPtr that shouldn't be
dereferenced on the wrong sequence. And those are complicated words. And that,
unfortunately, is a DCHECK. So we're hitting some sort of - I don't know if
that CHECK is actually expensive, or if it should be a CHECK, or if it could be
a CHECK. I think, especially, if you're in core types, the size overhead of
adding a CHECK is negligible, because all of the users of it benefit from that
CHECK. So unless it's incredibly -

24:28 SHARON: What do you mean by core types?

24:30 PETER: Say that you make a `scoped_refptr` or something, that ref pointer is
used everywhere. So if you CHECKed in the destructor, then you're validating
all of the clients of your `scoped_refptr`. So for one CHECK, you get the
price of a lot of CHECKing. Whereas if in your client code you're validating
some parameters of an API call that only gets called once, then that's one
CHECK you add for one case. But if you're re-use, then your CHECK gets a lot
more value. And it's also easier to get parameters wrong sometimes if you have
500 clients that are calling your API. You can't trust all of them to get it
right. Whereas if you're just developing your feature, and it's only used by
your feature, then you can be a little bit more certain with how it's being
called. I would say, still add CHECKs, because code evolves over time. It's
sort of like how you can add unit tests to make sure that no one breaks your
code in the future. If you add CHECKs, then no one can break your code in the
future.

25:37 SHARON: Mm-hmm. OK. So you mentioned a few things about how CHECKs and
DCHECKs are changing. [AUDIO OUT] what is currently in the works, and what is
the long-term goal and plan for CHECKs and DCHECKs.

25:53 PETER: So currently what's in the work is we've made sure that some
libraries that we use, like Abseil and WebRTC, which is a first-party
third-party library, that they both use Chrome's crashing report system, which
means that you get more predictable crash stacks because it's using the
IMMEDIATE\_CRASH macro. But also, you get the fatal logging field that I talked
about. That gets logged as part of crash dumps. So you hopefully have more
glanceable, actionable crash reports whenever a CHECK is violated inside of
Abseil, or in WebRTC, as it were. And then upcoming is we want to make sure
that we keep an eye out for our DCHECKs on other platforms, such as Mac. I know
that there's some issues with getting that fatal log field in the GPU process,
and I'm working on fixing that as well. So hopefully, it just means more
reports for the things you care about and easier to action on reports. That's
what we're hoping.

27:03 SHARON: If people think that this sounds really cool, want to have some
more involvement, or want to ask more questions, what's a good place for them
to do that?

27:11 PETER: I like Slack as a thing for this. So the #cxx channel on Slack,
the #base channel on Slack, the #halp channel on Slack is really good. #halp is
really, I think, unintimidating. You can just throw whatever question you have
in there, and I happen to be around there. If you can find out what my last
name is through sheer force of will, you can send me an email to my Chromium
username. What else would we have? I think if they want to get involved, just
add CHECKs to your code. That's a really good way to do it. Just make sure that
your code does what you expect it to in more cases.

27:48 SHARON: Maybe if you have a CL, and you're just doing some drive-by
cleanup, you can turn some DCHECKs into CHECKs also?

27:56 PETER: If your reviewer is cool with that, I'm cool with that. Otherwise,
you can just try to hope for us making that policy that we use CHECKs - if it's
something we care about, we use a CHECK instead of a DCHECK, unless we have a
really good reason to use a DCHECK. And that would be performance.

28:15 SHARON: That sounds good. And one last question is, what do you want
people to take away as their main takeaway from this discussion?

28:26 PETER: I think validating code assumptions is really valuable. So you
think that you're pretty smart when you're writing something, or you remember -
I mean, you're sometimes kind of smart when you're writing something. And
you're like, this can't possibly be wrong. And in practice, looking at crash
reports, these things are wrong all the time. So please validate any
assumptions that you make. It's also, I would say, better than a comment,
because it's a comment that doesn't get outdated without you noticing it. So, I
think, validate your assumptions to make sure that your code is more robust.
And validate properties you care about. And don't be afraid to use CHECKs.

29:13 SHARON: All right. That sounds like a good summary. Thank you very much
for being here, Peter. It was great to learn about DCHECKs.

29:18 PETER: Yeah. Thanks for having me.

29:24 SHARON: Action. Hello.

29:26 PETER: Oh. Take four.

29:29 SHARON: [LAUGHS] Take four. And action.

[What's Up With Pointers]: https://www.youtube.com/watch?v=MpwbWSEDfjM
chromium/docs/transcripts/wuwt-e02-dchecks.md